Question

SQL statement evaluation order

I'm learning PostgreSQL and want to understand the exact order in which SQL statements are evaluated. After researching, I found the following evaluation order:

 1. FROM 
 2. ON 
 3. JOIN 
 4. WHERE 
 5. GROUP BY 
 6. HAVING
 7. SELECT 
 8. AS  
 9. DISTINCT 
 10. ORDER BY 
 11. LIMIT / OFFSET   

Based on this order, it seems that when I create an alias, I can't use it in the clauses that are evaluated before the SELECT statement. However, I'm confused about how this works with subqueries. For example:

SELECT first_name, last_name
    FROM customer AS c
    WHERE EXISTS(
    (SELECT * FROM payment AS p
    WHERE p.customer_id = c.customer_id
    AND amount > 11)
    );

How can the subquery use the alias c and, how does the subquery get access to the alias c before it is created?

Could someone provide insights how does this work?

Thank you in advance for the help!

 2  68  2
1 Jan 1970

Solution

 2

This is mentioned in the docs:

4.2.11. Scalar Subqueries

A scalar subquery is an ordinary SELECT query in parentheses that returns exactly one row with one column. .... The SELECT query is executed and the single returned value is used in the surrounding value expression. .... The subquery can refer to variables from the surrounding query, which will act as constants during any one evaluation of the subquery. See also Section 9.23 for other expressions involving subqueries.

Section 9.23.1 EXISTS also says the same thing.

Note also:

7.2.1.5. LATERAL Subqueries

Subqueries appearing in FROM can be preceded by the key word LATERAL. This allows them to reference columns provided by preceding FROM items. (Without LATERAL, each subquery is evaluated independently and so cannot cross-reference any other FROM item.)

This is not the same as a scalar subquery. A scalar subquery is used in place of a single value in an expression, and must only return a single row (or none). Whereas a lateral join can return multiple rows, and joins back to the previous table references as normal.


As a side point, your list of evaluation order isn't accurate. SELECT and AS are essentially the same thing. ON and JOIN are also. You are also missing Window Functions which come after HAVING, and set operators UNION and friends, which come after DISTINCT but before ORDER BY.

2024-07-22
Charlieface

Solution

 1

The logical order of standard ISO SQL clauses for the SQL SELECT statement is :

  1. FROM
  2. WHERE
  3. GROUP BY
  4. HAVING
  5. WINDOW
  6. SELECT
  7. ORDER BY

All other keyword are subclauses (JOIN / ON, DISTINCT, OFFSET / FETCH...) or operators (IN, BETWEEN, EXISTS...) depending on SQL clauses.

Contrary to what @Laurenz_Albe says, the AS keyword introduces an alias, that is to say a new name, for a column or a table, which can be used in the clauses located after it, in the logical order. Thus an AS defining an alias for a column of the SELECT clause can only be used in the ORDER BY clause and in no case in other of the following elements of the SELECT clause.

So :

SELECT COL1 AS X, X + 1

Must throw an exception.

On the other hand, a table alias can be used in any other clause, including following the sequence of elements of the FROM clause in which it is defined.

So :

SELECT *
FROM   Tab1 AS T
       JOIN Tab2 AS U
          ON T.id = U.id

...is correct

The query givent by @Laurenz_Albe :

SELECT t.c AS x
FROM (VALUES (1), (2)) AS t(c)
WHERE EXISTS (SELECT WHERE t.c = 1);

...is syntaxically false in SQL and cannott be accepted by any RDBMS, except PostGreSQL. Because the inner SELECT clause must have almost an element. A correct query can be :

SELECT t.c AS x
FROM (VALUES (1), (2)) AS t(c)
WHERE EXISTS (SELECT * WHERE t.c = 1);

Avoid to write such stupid query that is syntaxically innaceptable...

By the way,

kewords ON and PARTITIONNED depends of a JOIN, that depends of the FROM clause, like LATERAL { CROSS |OUTER } APPLY.

keyword DISTINCT wich is an operator depend of the FROM clause and must preceed any element of this clause

keyword LIMIT is not a part of the ISO SQL standard.

Keyword OFFSET depend of the ORDER BY clause and FETCH depend of OFFSET.

But do not make confusion between the logical order and the physical treatment of the query. A query is just a sentence translated into a mathematical formulae of relational algebra and this formulae can be:

  • simplified
  • rewrited

As a basic example of simplification :

SELECT *
FROM   T
WHERE  Col1 + 3 = 5

would be rewrited as :

SELECT *
FROM   T
WHERE  Col1 = 2

A more sophisticated simplification by rewriting can be :

SELECT COUNT(*), PrimaryKeyCol, Col2, Col3, Col4, Col5 
FROM   T
GROUP  BY PrimaryKeyCol, Col2, Col3, Col4, Col5 

...that would be rewrited as ::

SELECT COUNT(*), PrimaryKeyCol, Col2, Col3, Col4, Col5 
FROM   T
GROUP  BY PrimaryKeyCol 

Depending of the level of "intelligence" of the optimizer (stupidly called planer in PostGreSQL)...

2024-07-23
SQLpro

Solution

 0

c is an alias for a table, so it belongs to the FROM clause, so it makes sense that it is available in a subquery in WHERE.

Consider the following query:

SELECT t.c AS x
FROM (VALUES (1), (2)) AS t(c);

x is a column alias, t and c are table aliases.

This will work:

SELECT t.c AS x
FROM (VALUES (1), (2)) AS t(c)
WHERE EXISTS (SELECT WHERE t.c = 1);

But this will not:

SELECT t.c AS x
FROM (VALUES (1), (2)) AS t(c)
WHERE EXISTS (SELECT WHERE x = 1);
2024-07-23
Laurenz Albe