While I was reading about "smart scan" in Exadata I noticed a mention that while processing a query, only the columns needed by the query are actually returned from the storage server to the database server.
I'm wondering how this is achieved. I know that Exadata is not a columnar database and stores the data as rows in tabular format on disk. Given this, does the storage software retrieve the full row (all columns), discard the unwanted columns, and return the rest to the database? Is that how it works?
I'm evaluating a columnar database and wanted to see how Exadata is able to offer the same benefit as a columnar database while using row storage on disk.
It is the job of the storage server to discard the unnecessary columns from the retrieved rows and return only the targeted rows and columns during a smart scan. Note that this only happens during a smart scan, and several other criteria have to be met for a smart scan to occur, such as the scan using direct path reads, which is influenced by the hidden _small_table_threshold parameter.
So the saving, when it occurs, is in the traffic between the storage grid and the compute nodes, since the full rows are still read from disk.
I'm not sure whether HCC has any impact on this, however. I was not considering the case where the table is HCC compressed.
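To make the point above concrete, here is a minimal Python sketch of the idea. It is only a toy model, not Exadata's actual implementation: the function name storage_cell_scan, the sample rows, and the crude size estimate are all made up for illustration. It shows that every full row is still read, but only the filtered rows and the requested columns are sent back.

```python
from typing import Callable, Dict, Iterable, List, Sequence, Tuple

Row = Dict[str, object]


def storage_cell_scan(
    rows: Iterable[Row],
    needed_columns: Sequence[str],
    predicate: Callable[[Row], bool],
) -> List[Tuple[object, ...]]:
    """Hypothetical stand-in for a storage cell doing a smart scan."""
    result = []
    for row in rows:  # every full row is still read from "disk"
        if predicate(row):  # row filtering happens on the storage side
            # column projection: keep only the columns the query needs
            result.append(tuple(row[c] for c in needed_columns))
    return result


def approx_size(values: Iterable[object]) -> int:
    """Crude size estimate: total length of the values rendered as text."""
    return sum(len(str(v)) for v in values)


# Made-up sample table with one wide column the query does not need.
ROWS: List[Row] = [
    {"id": 1, "name": "alice", "region": "EU", "notes": "x" * 200},
    {"id": 2, "name": "bob", "region": "US", "notes": "y" * 200},
    {"id": 3, "name": "carol", "region": "EU", "notes": "z" * 200},
]

# Roughly: SELECT id, name FROM t WHERE region = 'EU'
returned = storage_cell_scan(ROWS, ["id", "name"], lambda r: r["region"] == "EU")
read_from_disk = sum(approx_size(r.values()) for r in ROWS)
sent_to_db = sum(approx_size(t) for t in returned)

print(returned)
print("read from disk:", read_from_disk, "sent to database:", sent_to_db)
```

The amount read from disk is the same as for a conventional scan; only the amount shipped to the database server shrinks.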
HCC stores rows column-wise within a compression unit. Lots of doc on this. Most recently http://jonathanlewis.wordpress.com/2012/07/20/compression_units/
In technical speak, Exadata HCC is not a true column store, where each column is stored in a different file, but it does leverage column storage/compression (organization) as the data is stored in column-major format.
Greg Rahn
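As a rough illustration of what "column-major within a compression unit" means, here is a small Python sketch. It is an assumption-laden toy, not the real HCC on-disk format: JSON and zlib are stand-ins for whatever encoding and compression Oracle actually uses, and the names build_compression_unit and read_columns are hypothetical. A batch of rows is pivoted so that each column's values sit together, and each column is compressed on its own.

```python
import json
import zlib
from typing import Dict, List, Sequence, Tuple


def build_compression_unit(
    rows: List[Dict[str, object]], columns: Sequence[str]
) -> Dict[str, bytes]:
    """Pivot a batch of rows to column-major order, compressing each column."""
    unit = {}
    for col in columns:
        column_values = [row[col] for row in rows]  # all values of one column
        unit[col] = zlib.compress(json.dumps(column_values).encode("utf-8"))
    return unit


def read_columns(
    unit: Dict[str, bytes], wanted: Sequence[str]
) -> List[Tuple[object, ...]]:
    """Decompress only the wanted columns and stitch them back into rows."""
    decoded = [json.loads(zlib.decompress(unit[c]).decode("utf-8")) for c in wanted]
    return [tuple(values) for values in zip(*decoded)]


# Made-up batch of rows standing in for the contents of one compression unit.
rows = [
    {"id": i, "region": "EU" if i % 2 else "US", "status": "OPEN"}
    for i in range(1000)
]
unit = build_compression_unit(rows, ["id", "region", "status"])

# Only "id" and "status" are decompressed; "region" stays untouched.
subset = read_columns(unit, ["id", "status"])
print(len(subset), subset[:2])
print({col: len(blob) for col, blob in unit.items()})
```

The whole unit still lives in one place rather than one file per column, which matches the distinction drawn above between HCC and a true column store.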