Substantial and accidental forms of a SQL expression by Eduardo Bellani

Multiplicity of parts, variety, and unity of plan which
combines the parts into a coherent whole, –
such are the elements of order found in all beauty.
(De Wulf 2012)

I have been working with data intensive applications for a while and of course that means exposure to a lot of SQL, both personally and training other developers.

In these experiences I have noticed that it is sometimes harder than it should be to grasp what SQL is doing, specially since its accidental form makes understanding its substantial form so hard. In this post I’m going to try to help the reader separate the two and understand SQL better.

But first, some definitions of terms:

form is the principle of organization of a thing’s matter, or the thing’s intelligible nature, form can be of two kinds.

… (it) can be substantial, organizing the matter into the kind of thing that the substance is.

On the other hand, form can be accidental, organising some part of an already constituted substance.

One of the core problems with SQL and in particular its SELECT expression is that the way it is written/read (and the usual expectations of the terms such as SELECT/FROM …) is very different from what is actually taking place.

Let’s discuss this fact starting from an example (the example and much of the discussion are taken from (Date and Darwen 1997))

  SELECT
    P.PNO,
    'Weight in grams =' AS TEXT1,
    P.WEIGHT * 454 AS GMWT,
    P.COLOR,
    'Max Quantity =' AS TEXT2,
    MAX(SQ.QTY) AS MQTY
  FROM P, SP
  WHERE
    P.PNO = SP.PNO
    AND (P.COLOR = 'Red' OR P.COLOR = 'Blue')
    AND SP.QTY > 200
  GROUP BY
    P.PNO,
    P.WEIGHT,
    P.COLOR
  HAVING SUM(SP.QTY) > 350

The difficulty here starts right at the begging, since the SELECT clause is the first to be read and written, but it is the last to be evaluated. Here is how to interpret this SELECT expression:

  1. FROM : The source of the data. An usually overseen point is that the FROM clause is actually a JOIN (a CROSS JOIN specifically).

  2. WHERE : The result of the JOIN of the previous step is reduced by elimination of rows (a process called RESTRICTION in the Relational Algebra).

  3. GROUP BY : This is one of the most complex parts of the expression because it goes outside the Relational Algebra. You can think of it as creating, from the restricted table, a Dictionary where the keys are a combination of the values of the defined columns and the referenced values are all the rows where the keys exist.

    NOTE HOWEVER that such Dictionary is NOT a proper table. And that is why a GROUP BY clause will always demand a corresponding SELECT clause that turns such Dictionary into a proper table.

  4. HAVING : This clause applies to the Dictionary values generated by GROUP BY, filtering all rows that do not match the condition. This is another operator that sits outside the Relational Algebra.

  5. SELECT : This is called a PROJECTION in Relational Algebra. It is where you pick the colums of the table. It is also where each group resulting from the HAVING filter should now generate a single result row, by this process:

    1. The part number, weight, color and maximum quantity are extracted from the Dictionary
    2. The weight is converted to grams
    3. Two literals are added (‘Weight in grams =’, ‘Max Quantity =’).
    4. All these insertions are ordered. The result looks like this:
PNO TEXT1 GMWT COLOR TEXT2 MQTY
P1 Weight in grams = 5448 Red Max Quantity = 300
P5 Weight in grams = 5448 Blue Max Quantity = 400

So what

I hope the reader will leave with a better appreciation of what a SQL SELECT expression actually is, instead of what it looks like. I also think that understanding the concept of substantial vs accidental form can help the reader a lot in understanding things like such expression in the future.

I also think a point that deserves attention is how worse the SELECT expression becomes by SQL’s introduction of operators that don’t fit the relational algebra (GROUP BY and HAVING clauses). Besides such, the SELECT expression is merely a JOIN->RESTRICT->PROJECT sequence.

Figure 1: Santa Maria del Mar destroyed by communist arson, circa 1936

Figure 1: Santa Maria del Mar destroyed by communist arson, circa 1936

References

Date, C. J., and Hugh Darwen. 1997. A Guide to the Sql Standard (4th Ed.): A User’s Guide to the Standard Database Language Sql. USA: Addison-Wesley Longman Publishing Co., Inc.
De Wulf, M. 2012. The System of Thomas Aquinas. Verlag Editiones Scholasticae.
Kerrl, Gaven. n.d. “Aquinas: Metaphysics.” The Internet Encyclopedia of Philosophy. https://web.archive.org/web/20240313230721/https://iep.utm.edu/thomas-aquinas-metaphysics/.