Substantial and accidental forms of a SQL expression by Eduardo Bellani
Multiplicity of parts, variety, and unity of plan which
combines the parts into a coherent whole, –
such are the elements of order found in all beauty.
(De Wulf 2012)
I have been working with data intensive applications for a while and of course that means exposure to a lot of SQL, both personally and training other developers.
In these experiences I have noticed that it is sometimes harder than it should be to grasp what SQL is doing, specially since its accidental form makes understanding its substantial form so hard. In this post I’m going to try to help the reader separate the two and understand SQL better.
But first, some definitions of terms:
… form is the principle of organization of a thing’s matter, or the thing’s intelligible nature, form can be of two kinds.
… (it) can be substantial, organizing the matter into the kind of thing that the substance is.
On the other hand, form can be accidental, organising some part of an already constituted substance.
substantial form always … brings a new substance into existence;
accidental form simply informs an already existing substance, and in doing so it simply modifies some substance. (Kerrl, n.d.)
One of the core problems with SQL and in particular its SELECT
expression is that the way it is written/read (and the usual
expectations of the terms such as SELECT/FROM …) is very different
from what is actually taking place.
Let’s discuss this fact starting from an example (the example and much of the discussion are taken from (Date and Darwen 1997))
SELECT
P.PNO,
'Weight in grams =' AS TEXT1,
P.WEIGHT * 454 AS GMWT,
P.COLOR,
'Max Quantity =' AS TEXT2,
MAX(SQ.QTY) AS MQTY
FROM P, SP
WHERE
P.PNO = SP.PNO
AND (P.COLOR = 'Red' OR P.COLOR = 'Blue')
AND SP.QTY > 200
GROUP BY
P.PNO,
P.WEIGHT,
P.COLOR
HAVING SUM(SP.QTY) > 350
The difficulty here starts right at the begging, since the SELECT
clause is the first to be read and written, but it is the last to be
evaluated. Here is how to interpret this SELECT expression:
-
FROM : The source of the data. An usually overseen point is that the
FROMclause is actually aJOIN(aCROSS JOINspecifically). -
WHERE : The result of the
JOINof the previous step is reduced by elimination of rows (a process calledRESTRICTIONin the Relational Algebra). -
GROUP BY : This is one of the most complex parts of the expression because it goes outside the Relational Algebra. You can think of it as creating, from the restricted table, a Dictionary where the keys are a combination of the values of the defined columns and the referenced values are all the rows where the keys exist.
NOTE HOWEVER that such Dictionary is NOT a proper table. And that is why a
GROUP BYclause will always demand a correspondingSELECTclause that turns such Dictionary into a proper table. -
HAVING : This clause applies to the Dictionary values generated by GROUP BY, filtering all rows that do not match the condition. This is another operator that sits outside the Relational Algebra.
-
SELECT : This is called a
PROJECTIONin Relational Algebra. It is where you pick the colums of the table. It is also where each group resulting from the HAVING filter should now generate a single result row, by this process:- The part number, weight, color and maximum quantity are extracted from the Dictionary
- The weight is converted to grams
- Two literals are added (‘Weight in grams =’, ‘Max Quantity =’).
- All these insertions are ordered. The result looks like this:
| PNO | TEXT1 | GMWT | COLOR | TEXT2 | MQTY |
|---|---|---|---|---|---|
| P1 | Weight in grams = | 5448 | Red | Max Quantity = | 300 |
| P5 | Weight in grams = | 5448 | Blue | Max Quantity = | 400 |
So what
I hope the reader will leave with a better appreciation of what a SQL
SELECT expression actually is, instead of what it looks like. I also
think that understanding the concept of substantial vs accidental form
can help the reader a lot in understanding things like such expression
in the future.
I also think a point that deserves attention is how worse the SELECT
expression becomes by SQL’s introduction of operators that don’t fit the
relational algebra (GROUP BY and HAVING clauses). Besides such, the
SELECT expression is merely a JOIN->RESTRICT->PROJECT sequence.

Figure 1: Santa Maria del Mar destroyed by communist arson, circa 1936