Quote: Do not put your faith in what statistics say until you have carefully considered what they do not say. ~William W. Watt.
I started this blog with that quote because I find that database execution plans can sometimes behave very much like statistics – you need to question what they really say. For example many DBA’s and SQL developers live by the rule “the lower the cost of an execution plan the better that plan must be”. That sounds quite reasonable, but is it really true?
Let’s take a very simple ANSI SQL statement, a select that joins nine tables with no restrictions, no aggregate functions, no group by, and no order by. It’s so simple that there’s really no obvious SQL modification that might improve performance.
INNER JOIN BIG_SCHEMA1.TABLE_2 ON (TABLE_1.COL1 = TABLE_2.COL5)
INNER JOIN BIG_SCHEMA1.TABLE_3 ON (TABLE_2.COL1 = TABLE_3.COL5)
INNER JOIN BIG_SCHEMA1.TABLE_4 ON (TABLE_3.COL1 = TABLE_4.COL5)
INNER JOIN BIG_SCHEMA1.TABLE_5 ON (TABLE_4.COL1 = TABLE_5.COL5)
INNER JOIN BIG_SCHEMA1.TABLE_6 ON (TABLE_5.COL1 = TABLE_6.COL5)
INNER JOIN BIG_SCHEMA1.TABLE_7 ON (TABLE_6.COL1 = TABLE_7.COL5)
INNER JOIN BIG_SCHEMA1.TABLE_8 ON (TABLE_7.COL1 = TABLE_8.COL5)
INNER JOIN BIG_SCHEMA1.TABLE_9 ON (TABLE_8.COL1 = TABLE_9.COL5)
If I load this SQL into tuning session within IDERA’s DB Optimizer product and ask it to provide any SQL rewrites it can that return the same rows of data, it finds 13 rewrites in just a few seconds that are good candidates as improvements. By default it displays the execution plan cost. If however I instruct the tool to also run each rewrite, then I also see the elapsed time as well as many other run time statistics such as physical IO, logical IO, etc. as shown here in Figure 1.
Figure 1: IDERA DB Optimizer Finds SQL Rewrites
But let’s examine the results to see if our premise holds up – that the lowest cost is always best. The original SQL had a cost of 27 and ran in .01 seconds. The lowest cost was row #2 at 25, but that SQL rewrite ran .02 seconds, or double the time. Now look at the last two rows with a cost of 46 and 803. Their execution time shows up as the lowest even though these execution plan costs are the highest.
The point is “Do your homework!” Don’t just look at execution plan costs alone. Make sure that what you care about most, such as execution time or physical IO are also low. Don't let the statistics conundrum lead you astray.