Optimizing Subqueries in the FROM
Clause
As of MariaDB 5.6.3, the optimizer more efficiently handles subqueries in the FROM
clause (that is, derived tables):
- Materialization of subqueries in the
FROM
clause is postponed until their contents are needed during query execution, which improves performance:- Previously, subqueries in the
FROM
clause were materialized forEXPLAIN SELECT
statements. This resulted in partialSELECT
execution, even though the purpose ofEXPLAIN
is to obtain query plan information, not to execute the query. This materialization no longer occurs, soEXPLAIN
is faster for such queries. - For non-
EXPLAIN
queries, delay of materialization may result in not having to do it at all. Consider a query that joins the result of a subquery in theFROM
clause to another table: If the optimizer processes that other table first and finds that it returns no rows, the join need not be carried out further and the optimizer can completely skip materializing the subquery.
- Previously, subqueries in the
- During query execution, the optimizer may add an index to a derived table to speed up row retrieval from it.
Consider the following EXPLAIN
statement, for which a subquery appears in the FROM
clause of a SELECT
query:
EXPLAIN SELECT * FROM (SELECT * FROM t1);
The optimizer avoids materializing the subquery by delaying it until the result is needed during SELECT
execution. In this case, the query is not executed, so the result is never needed.
Even for queries that are executed, delay of subquery materialization may permit the optimizer to avoid materialization entirely. Consider the following query, which joins the result of a subquery in the FROM
clause to another table:
SELECT * FROM t1 JOIN (SELECT t2.f1 FROM t2) AS derived_t2 ON t1.f2=derived_t2.f1 WHERE t1.f1 > 0;
If the optimization processes t1
first and the WHERE
clause produces an empty result, the join must necessarily be empty and the subquery need not be materialized.
In the worst case (derived tables are materialized), query execution will take the same time as before MariaDB 5.6.3 because no additional work is done. In the best case (derived tables are not materialized), query execution will be quicker by the time needed for materialization.
For cases when materialization is required for a subquery in the FROM
clause, the optimizer may speed up access to the result by adding an index to the materialized table. If such an index would permit ref
access to the table, it can greatly reduce amount of data that must be read during query execution. Consider the following query:
SELECT * FROM t1 JOIN (SELECT * FROM t2) AS derived_t2 ON t1.f1=derived_t2.f1;
The optimizer constructs an index over column f1
from derived_t2
if doing so would permit the use of ref
access for the lowest cost execution plan. After adding the index, the optimizer can treat the materialized derived table the same as a usual table with an index, and it benefits similarly from the generated index. The overhead of index creation is negligible compared to the cost of query execution without the index. If ref
access would result in higher cost than some other access method, no index is created and the optimizer loses nothing.