First of all, I haven't seen such a best practices document.
Second, you are probably better off enabling smart scan by default and letting the database decide. You can make it use smart scans more than it otherwise would by removing some indexes (thus probably letting the CBO decide on more full table scans), but that depends a lot on your application.
We cover more in our book, but here are the key points:
1) Smart scans require a full segment scan to happen (full table scan, fast full index scan or fast full bitmap index scan)
2) Additionally, smart scans require a direct path read to happen (reading directly into the PGA, bypassing the buffer cache) - this happens automatically for all parallel scans (unless parallel_degree_policy has been changed to AUTO). For serial sessions, the decision to do a serial direct path read depends on the segment size, the _small_table_threshold parameter value (which is derived from the buffer cache size) and how many blocks of the segment are already cached. If you want to force serial direct path reads for your serial sessions, you can set _serial_direct_read = always.
3) Thanks to the above requirements, smart scans are not used for index range scans, index unique scans and any single row/single block lookups. So if migrating an old DW/reporting application to Exadata, then you probably want to get rid of all the old hints and hacks in there, as you don't care about indexes for DW/reporting that much anymore (in some cases not at all). Note that OLTP databases still absolutely require indexes as usual - smart scans are for large bulk processing ops (reporting, analytics etc, not OLTP style single/a few row lookups).
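To illustrate the above, here's a rough sketch (table name and hint usage are made up for the demo; _serial_direct_read is a hidden parameter, so use it for testing only) of forcing a serial direct path read and then checking the session statistics to see whether a smart scan actually happened:

```sql
-- Force serial direct path reads for this session (testing only -
-- normally you'd let Oracle decide based on segment size etc.):
ALTER SESSION SET "_serial_direct_read" = ALWAYS;

-- Run a full scan on a hypothetical table (FULL hint just for the demo):
SELECT /*+ FULL(s) */ COUNT(*) FROM sales s WHERE amount > 1000;

-- If the scan was offloaded, the smart scan statistics should be non-zero:
SELECT n.name, s.value
  FROM v$mystat s JOIN v$statname n ON n.statistic# = s.statistic#
 WHERE n.name IN ('cell physical IO bytes eligible for predicate offload',
                  'cell physical IO interconnect bytes returned by smart scan');
```

If the second statistic stays at zero, the scan fell back to buffered reads and no offloading took place.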
Ideal execution plan for taking advantage of smart scans for reporting would be:
1) accessing only required partitions thanks to partition pruning (partitioning key column choices must come from how the application code will query the data)
2) full scan the partitions (which allows smart scans to kick in)
2.1) no index range scans (single block reads!) and ...
3) joins all the data with hash joins, propagating results up the plan tree to next hash join etc
3.1) This allows bloom filter predicate pushdown to cell to pre-filter rows fetched from probe row-source in hash join.
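As a sketch of the plan shape described above (table, column and partition scheme are hypothetical, and the hint is only there to make the point explicit):

```sql
-- Fact table range-partitioned on order_date; no indexes used for reporting.
SELECT /*+ USE_HASH(c o) */ c.region, SUM(o.amount)
  FROM customers c
  JOIN orders o ON o.cust_id = c.cust_id          -- hash join, bloom filter
 WHERE o.order_date >= DATE '2011-01-01'          -- partition pruning
   AND o.order_date <  DATE '2011-04-01'
 GROUP BY c.region;

-- In the execution plan you'd want to see something like:
--   HASH JOIN (with a SYS_OP_BLOOM_FILTER storage predicate on orders)
--     TABLE ACCESS STORAGE FULL      CUSTOMERS
--     PARTITION RANGE ITERATOR       (pruning to the required partitions)
--       TABLE ACCESS STORAGE FULL    ORDERS
```

TABLE ACCESS STORAGE FULL is the smart-scan-capable operation, and the bloom filter built from the customers row source being pushed down to the cells is what pre-filters the orders rows before they leave storage.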
So, simple stuff really - and many of your everyday optimizer problems just disappear when there's no trouble deciding between a full scan and a nested loop with some index. Of course this is a broad generalization; your mileage may vary.
Even though DWs and reporting apps benefit greatly from smart scans and some well-partitioned databases don't need any indexes at all for reporting workloads, the design advice does not change for OLTP at all. It's just RAC with faster single block reads thanks to flash cache. All your OLTP workloads, ERP databases etc still need all their indexes as before Exadata (with the exception of any special indexes which were created for speeding up only some reports, which can take better advantage of smart scans now).
Note that there are many DW databases out there which are not used only for brute-force reporting and analytics, but also for frequent single-row lookups (golden trade warehouses and other reference-data stores being one example). These would likely still need indexes to support fast single-row (or a-few-row) lookups. So it all comes down to the nature of your workload: how many rows you're fetching and how frequently you'll be doing it.
And note that smart scans only make data access faster - not sorts, joins, PL/SQL functions coded into the select column list or where clause, or application loops doing single-row processing... These still work as usual (with the exception of the bloom filter pushdown optimization for hash joins)... Of course, when moving to Exadata from your old E25k you'll see a speedup anyway, as the Xeons with their large caches are just fast :-)
Blog - http://blog.tanelpoder.com
Book - http://apress.com/book/view/9781430233923
gsalem wrote: Is it a good idea to change filesystemio_options from none to setall for exadata machines?

Yes - filesystemio_options=setall is supposed to give you direct I/O and async I/O, but since Exadata uses ASM, it already does both. There is no buffering of ASM diskgroup physical reads in a filesystem cache, and ASM inherently performs async I/O by default, regardless of the filesystemio_options setting. Have a read of the following MOS Notes:
ASM Inherently Performs Asynchronous I/O Regardless of filesystemio_options Parameter [ID 751463.1]
Init.ora Parameter "FILESYSTEMIO_OPTIONS" Reference Note [ID 120697.1]
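If you do want to set it explicitly anyway (harmless for ASM-managed datafiles, and it can matter for any filesystem-based files the instance touches), a minimal sketch - note the parameter is static, so it takes effect only after a restart:

```sql
-- Static parameter: written to the spfile, applied on next instance startup.
ALTER SYSTEM SET filesystemio_options = 'SETALL' SCOPE = SPFILE;
```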
Edited by: Vishal Gupta on Jul 1, 2011 1:16 AM