Cooking an old puzzle - InformationWeek
Software // Information Management
4/25/2005
09:51 AM
Commentary
50%
50%
RELATED EVENTS
Dun & Bradstreet CAO Discusses Her Roll & The Skills Needed to Do the Job
Jul 26, 2017
LWhat does a chief analytics officer do each day? To help us answer that question we welcome Dun & ...Read More>>

# Cooking an old puzzle

I just got an email about Puzzle #11 in my old book SQL PUZZLES & ANSWER from Rainer Gemulla at TU Dresden, Fak. Informatik, Institut SyA, in Dresden, Germany. It is a very nice cook and it is embarassing to see how needlessly complex the other answers were:

I just got an email about Puzzle #11 in my old book SQL PUZZLES & ANSWER from Rainer Gemulla at TU Dresden, Fak. Informatik, Institut SyA, in Dresden, Germany. It is a very nice cook and it is embarassing to see how needlessly complex the other answers were:WORK ORDER PUZZLE

Cenk Ersoy asked this question on the Gupta Forum on CompuServe. He has a table that looks like this. In a factory, a project is described in a work order, which has a series of steps that it must go through. A step on the workorder is either completed or it awaiting the completion of one or more of the steps that come before it. His table looks like this:

CREATE TABLE Projects (workorder CHAR(5) NOT NULL, step INTEGER NOT NULL CHECK (step BETWEEN 0 AND 1000), status CHAR(1) NOT NULL CHECK (status IN ('Completed', 'Waiting')), PRIMARY KEY (workorder, step));

With some sample data like this:

Projects workorder step status ================================= 'AA100' 0 'Completed' 'AA100' 1 'Waiting' 'AA100' 2 'Waiting' 'AA200' 0 'Waiting' 'AA200' 1 'Waiting' 'AA300' 0 'Completed' 'AA300' 1 'Completed'

He would like to get the work orders where the step is zero and the status is 'Completed', but all other legs for that work order have a status of 'Waiting'. For example, the query should return only 'AA100' in the sample data.

This is really fairly straight forward, but you have to re-word the query specification into the passive voice to see the answer. Instead of saying, "all other legs for that work order have status of Waiting", instead say "Waiting is the status of all the non-zero legs" and the answer falls out aimmediately, thus:

SELECT workorder FROM Projects AS P1 WHERE step = 0 AND status = 'Completed' AND 'Waiting' = ALL (SELECT status FROM Projects AS P2 WHERE step <> 0 AND P1.workorder = P2.workorder);

Another rewording would be to say that we are looking for a work order group (i.e. a group of steps) which has certain properties in the status column for certain steps. Using a characteristic function in a SUM() will let us know if all the elements of the group meet the criteria; if they all do, then the count of the characteristic function is the size of the group.

SELECT workorder FROM Projects AS P1 GROUP BY workorder HAVING SUM(CASE WHEN step <> 0 AND status = 'Waiting' THEN 1 WHEN step = 0 AND status = 'Completed' THEN 1 ELSE 0 END) = COUNT (step);

or if you do not have a CASE expression, you can use some algebra.

SELECT workorder FROM Projects AS P1 GROUP BY workorder HAVING SUM( ((1-ABS(SIGN(step)) * POSITION ('Waiting' IN status)) + ((1-ABS(step)) * POSITION ('Completed' IN status)) ) = COUNT (step);

Since this query involves only one copy of the table and makes a single pass thru it, it should be as fast as possible. There are also a subtle optimization tricks in the CASE expression. The CASE expression's WHEN clauses are tested in the order they appear, so you should arrange the WHEN clauses in order from mostly likely to occur to least likely to occur.

While not required by the standard, the terms of an AND predicate will also often execute in the order of their appearance when they are all join predicates (i.e. involve columns from two tables) or are all search arguments (i.e. a column from one table compared to a constant value -- also called SARGs in the literature). Therefore you can put the SARG with the smallest datatype first to improve performance; INTEGERs compare faster than long CHAR(n).

A version of the second answer which avoided the subquery was ï¼ent by Mr. Francisco Moreno of Columbia, South America. The original version used the Oracle DECODE() function, but his query would translate into SQL-92 like this:

SELECT workorder FROM Projects GROUP BY workorder HAVING COUNT(*) -- total rows in the workorder = COUNT(CASE WHEN leg || status = '0Completed' THEN 1 ELSE NULL END) -- total 0 & 'Completed' rows + COUNT(CASE WHEN status = 'Waiting' THEN 1 ELSE NULL END); -- total 'Waiting' rows

This puts the COUNT(*) on one side of a comparison operator and not in an expression. This can be a help for some optimizers.

one of my students (Stephan Gneist) found a nice solution for the SQL puzzle #11 (work order), similar to: SELECT workorder FROM Projects WHERE status = 'Completed' GROUP BY workorder HAVING SUM(leg) = 0; This solution exploits the not null and check contraints of the table definition and does not require any join. Bye, Rainer. -- Rainer Gemulla TU Dresden, Fak. Informatik, Institut SyA, 01062 Dresden, Germany Phone +49-351-463-38492, Fax +49-351-463-38259 email: gemulla@inf.tu-dresden.de

More Insights