PQR - A Simple Design Pattern for Multicore Enterprise Applications


InformationWeek Staff, Contributor

April 11, 2008


Chris Waves the Red Flag

Fellow Code Talker Christopher Diggins recently blogged The Concurrent Future is not Multi-Threaded. He raises interesting issues; he raises, in fact, the most interesting issue when he writes:

"There is simply no way that multi-threaded code can remain the defacto method of constructing concurrent applications."

The Problem

The tricks that deliver efficiency in the multithreaded emulation of parallelism, made familiar by sound application programming practice on popular platforms, do not efficiently leverage multiple cores.

Chris offers pointers in theoretical directions such as Erlang. He's probably right, but there also exists a simple, pragmatic, mechanistic (as opposed to theoretical) design pattern that addresses scaling enterprise applications to multicore operating platforms.

The Idea

The idea is to turn your enterprise application into a workflow system which exhibits vector parallelism. Vector parallelism is probably the best model you can aspire to without advanced coding techniques. We're trying here to avoid advanced coding techniques and keep it simple and mechanistic.


If:

  1. All work in the system can be factored so that processing of a real-world work unit becomes, internally:

    • A -> B -> C

  2. And A - B - C are of roughly the same load

  3. And A - B - C are parallelizable processes eligible under your chosen operating system and execution platform to be scheduled for separate processors and can thus run at the same time

Then you have a vector parallelized system that will scale nicely to the kind of multiprocessor server boxes and concurrent operating systems that are easily obtainable these days.
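The A -> B -> C shape above can be sketched with plain threads and queues. This is a hedged sketch only: the stage functions and work units are hypothetical, Python threads stand in for the OS-scheduled processes and queuing middleware a real PQR deployment would use, and the point is just the wiring, with each stage doing One Thing and passing its output downstream.

```python
import queue
import threading

def stage(work, in_q, out_q):
    # A Process in PQR terms: read from the In Queue, do One Thing,
    # write to the Out Queue. A None sentinel shuts the stage down
    # and is passed along so downstream stages stop too.
    while True:
        item = in_q.get()
        if item is None:
            if out_q is not None:
                out_q.put(None)
            break
        if out_q is not None:
            out_q.put(work(item))

# Hypothetical, evenly sized work units for stages A, B, C.
a = lambda x: x + 1
b = lambda x: x * 2
c = lambda x: x - 3

q_in, q_ab, q_bc, q_out = (queue.Queue() for _ in range(4))
stages = [
    threading.Thread(target=stage, args=(a, q_in, q_ab)),
    threading.Thread(target=stage, args=(b, q_ab, q_bc)),
    threading.Thread(target=stage, args=(c, q_bc, q_out)),
]
for t in stages:
    t.start()

for unit in range(5):      # feed real-world work units into the pipeline
    q_in.put(unit)
q_in.put(None)             # sentinel terminates the whole pipeline

results = []
while (r := q_out.get()) is not None:
    results.append(r)
for t in stages:
    t.join()
print(results)             # each unit went A -> B -> C; prints [-1, 1, 3, 5, 7]
```

Note that the stages themselves contain no locks at all: the queues are the only synchronization, which is exactly the property the pattern is after.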

The Pattern

The pattern (ooh, do I get to pick a fancy name?) we'll call PQR, for Process - Queue - Repertory.

That's the mechanistic beauty of it: it's three snap-together parts. Implement them as you wish, as suits you, with whatever tools you are familiar with.

  • Process

    • Does preferably One Thing and One Thing Only

Factor the work into evenly sized units and the operating system scheduler is entirely adequate for realtime.

      • Simplicity is a virtue in coding as in life.

    • Gets its work from an In Queue

      • Which is the Out Queue of another Process or a Queue used for Dispatching to Processes

    • Outputs to its Out Queue

      • Which is, of course, the In Queue of some other Process or the return Queue to a Dispatcher

  • Queue

    • Connects Processes

    • Three kinds

      • In

      • Out

        • Work output of a Process

        • Is effectively the In Queue of another Process

        • In larger systems the PQR pattern can be extended to have Services (PQRS, get it!?) which are channels that dispatch to groups of Processes.

      • Command

        • System commands to individual Processes

  • Repertory

    • Local persistent and transient data that belongs

      • to an individual Process

      • to a group of Processes

    • Implement this as an in-memory cache that behaves like a

      • Database

      • Tuple store
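A Repertory can be as small as this sketch: an in-memory store shared by a Process or a group of Processes, behaving like a tuple store. The class name, the monitoring-flavored example tuples, and the wildcard convention are all my hypothetical choices, not anything prescribed by the pattern; in production you would grab cache middleware rather than roll your own.

```python
import threading

class Repertory:
    """Minimal in-memory tuple store: local persistent/transient data
    owned by one Process or shared by a group of Processes."""

    def __init__(self):
        self._lock = threading.Lock()   # the one lock, hidden inside
        self._tuples = set()

    def put(self, *tup):
        with self._lock:
            self._tuples.add(tup)

    def match(self, *pattern):
        # None acts as a wildcard, as in a classic tuple space.
        with self._lock:
            return [t for t in self._tuples
                    if len(t) == len(pattern)
                    and all(p is None or p == v
                            for p, v in zip(pattern, t))]

rep = Repertory()
rep.put("host", "web01", "up")
rep.put("host", "web02", "down")
print(rep.match("host", None, "down"))   # -> [('host', 'web02', 'down')]
```

Because the lock lives inside the Repertory, the Processes that use it never write synchronization code of their own.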

The Benefits

Again, PQR is mechanistic. We're not looking for fabulous theoretical advantages; instead, we're looking for something:

  • easy to code

  • that runs fast

  • easy to understand

  • easy to maintain

  • distributable across multiple platforms

    • message architecture

    • locality of Repertory

  • offers small scope to typical parallelization errors

Good factoring itself assures that under a decent operating system scheduler the Process loads will be reasonably well balanced. Using Queues as the synchronization method assures correctness: if you can draw a map of your Processes and Queues, you can prove correctness of a PQR design.

The Implementation

Here's how you do PQR:

  • Grab some queuing middleware

  • Grab some in-memory cache middleware

  • Write in a compiler language without its own scheduler

    • Just leverage the operating system support for multiproc scheduling

    • If the operating system isn't sufficient, change operating systems

You see? You parallelize without ever having to write another lock, another spinner, another semaphore. All the nasty, error-prone parallelization code is in the message queue middleware. You've just built a giant chutes-and-ladders system where the little balls (work units) roll down the path and get dispatched to other chutes down other paths, with a small and carefully factored bit of work getting done at each landing.
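The chutes-and-ladders dispatch can be sketched too: one Dispatcher queue feeding a group of identical Processes, which is the "Services" channel idea from the PQRS extension above. Again a hedged sketch with hypothetical work (squaring the unit) and threads standing in for middleware-backed processes.

```python
import queue
import threading

def worker(wid, in_q, out_q):
    # One of a group of identical Processes behind a dispatch channel.
    while True:
        item = in_q.get()
        if item is None:
            in_q.put(None)               # re-post sentinel so siblings stop too
            break
        out_q.put((wid, item * item))    # hypothetical work unit: square it

dispatch_q, return_q = queue.Queue(), queue.Queue()
workers = [threading.Thread(target=worker, args=(i, dispatch_q, return_q))
           for i in range(3)]
for w in workers:
    w.start()

for unit in range(6):                    # balls rolling down the chute
    dispatch_q.put(unit)
dispatch_q.put(None)                     # sentinel, after all real work
for w in workers:
    w.join()

results = sorted(return_q.get()[1] for _ in range(6))
print(results)                           # -> [0, 1, 4, 9, 16, 25]
```

Whichever worker is free grabs the next unit, so load balancing falls out of the queue itself; no scheduling code was written.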

The Case Study

In 2006-2007, Absolute Performance, Inc. (a profitable, privately held vendor of enterprise monitoring software targeted principally at SaaS operations) accepted my recommendation to apply the PQR pattern to Version 5 of their "System Shepherd"™ backend server. This represented a complete recoding of the Version 4 multitasking system, which had been polished over a period of 7 years.

The software development team was expanded to eight (8) coders. The pattern was explained and design tasks shared across the team, with those having the most design experience focussing on workflow and those with junior experience tasked with implementation details and middleware qualification.

The project required that components of the Version 4 server be incrementally replaced by the new PQR-style Version 5 code. Partly because of the amazing skills of the motivated team and partly because of PQR's emphasis on factoring (PQR is all factoring, that's all it is) this design goal was emphatically achieved. Customers were serviced with the transformational code which, while occasionally lacking advanced features which were not yet supported, nonetheless executed reliably with easily predictable performance characteristics.

I won't bore you with benchmarks; let's just say Absolute Performance's now-complete, distributed Version 5 of System Shepherd runs Amazingly Faster than Version 4, and is thus able to carry a vastly greater load on equivalent hardware, with distributability so far limited more by the cost of test lab hardware than by any intrinsic limit to distribution in PQR itself.

A Few Hints

  • It still helps to read up on parallel systems.

    • Dust off those course books from your comp sci classes!

  • Your PQR project will be a successive modelling project

That means you will build the planned system in incremental installments, each increment capable of carrying more load than the last.

  • Careful measurement is required!

    • You must profile your system and determine where the load is as a verification of the correctness of your factoring exercise.

    • Build instrumentation into your PQR system from the start!

  • You will want to design a central control program that:

    • maps, starts and stops Processes

flushes queues

    • reports metrics

  • No design pattern answers all questions

  • No design pattern is a substitute for common sense.
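A central control program along the lines hinted at above could start like this sketch. Everything here is a hypothetical illustration: the Controller class, the per-Process Command queue protocol (a single "stop" command), and the metrics dictionary are my inventions, not part of any product described in this article.

```python
import queue
import threading
import time

def process(name, cmd_q, metrics):
    # A PQR Process that also listens on its Command queue between work units.
    while True:
        try:
            if cmd_q.get(timeout=0.01) == "stop":
                break
        except queue.Empty:
            pass
        metrics[name] = metrics.get(name, 0) + 1   # hypothetical unit of work

class Controller:
    """Sketch of a central control program: maps, starts, and stops
    Processes via their Command queues, and reports simple metrics."""

    def __init__(self):
        self.cmd_qs, self.threads, self.metrics = {}, {}, {}

    def start(self, name):
        q = queue.Queue()
        t = threading.Thread(target=process, args=(name, q, self.metrics))
        self.cmd_qs[name], self.threads[name] = q, t
        t.start()

    def stop_all(self):
        for q in self.cmd_qs.values():
            q.put("stop")
        for t in self.threads.values():
            t.join()

    def report(self):
        return dict(self.metrics)

ctl = Controller()
ctl.start("A"); ctl.start("B")
time.sleep(0.05)
ctl.stop_all()
print(sorted(ctl.report()))   # both Processes ran and reported: ['A', 'B']
```

The instrumentation hint above is why the metrics plumbing is in the sketch from the start rather than bolted on later.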
