Optimizing Subqueries in the FROM Clause
As of MariaDB 5.6.3, the optimizer more efficiently handles subqueries in the FROM clause (that is, derived tables):
- Materialization of subqueries in the
FROMclause is postponed until their contents are needed during query execution, which improves performance:- Previously, subqueries in the
FROMclause were materialized forEXPLAIN SELECTstatements. This resulted in partialSELECTexecution, even though the purpose ofEXPLAINis to obtain query plan information, not to execute the query. This materialization no longer occurs, soEXPLAINis faster for such queries. - For non-
EXPLAINqueries, delay of materialization may result in not having to do it at all. Consider a query that joins the result of a subquery in theFROMclause to another table: If the optimizer processes that other table first and finds that it returns no rows, the join need not be carried out further and the optimizer can completely skip materializing the subquery.
- Previously, subqueries in the
- During query execution, the optimizer may add an index to a derived table to speed up row retrieval from it.
Consider the following EXPLAIN statement, for which a subquery appears in the FROM clause of a SELECT query:
EXPLAIN SELECT * FROM (SELECT * FROM t1);
The optimizer avoids materializing the subquery by delaying it until the result is needed during SELECT execution. In this case, the query is not executed, so the result is never needed.
Even for queries that are executed, delay of subquery materialization may permit the optimizer to avoid materialization entirely. Consider the following query, which joins the result of a subquery in the FROM clause to another table:
SELECT * FROM t1 JOIN (SELECT t2.f1 FROM t2) AS derived_t2 ON t1.f2=derived_t2.f1 WHERE t1.f1 > 0;
If the optimization processes t1 first and the WHERE clause produces an empty result, the join must necessarily be empty and the subquery need not be materialized.
In the worst case (derived tables are materialized), query execution will take the same time as before MariaDB 5.6.3 because no additional work is done. In the best case (derived tables are not materialized), query execution will be quicker by the time needed for materialization.
For cases when materialization is required for a subquery in the FROM clause, the optimizer may speed up access to the result by adding an index to the materialized table. If such an index would permit ref access to the table, it can greatly reduce amount of data that must be read during query execution. Consider the following query:
SELECT * FROM t1 JOIN (SELECT * FROM t2) AS derived_t2 ON t1.f1=derived_t2.f1;
The optimizer constructs an index over column f1 from derived_t2 if doing so would permit the use of ref access for the lowest cost execution plan. After adding the index, the optimizer can treat the materialized derived table the same as a usual table with an index, and it benefits similarly from the generated index. The overhead of index creation is negligible compared to the cost of query execution without the index. If ref access would result in higher cost than some other access method, no index is created and the optimizer loses nothing.