Discussions
Categories
- 17.9K All Categories
- 3.4K Industry Applications
- 3.3K Intelligent Advisor
- 62 Insurance
- 536K On-Premises Infrastructure
- 138.2K Analytics Software
- 38.6K Application Development Software
- 5.7K Cloud Platform
- 109.4K Database Software
- 17.5K Enterprise Manager
- 8.8K Hardware
- 71.1K Infrastructure Software
- 105.2K Integration
- 41.5K Security Software
Joins to utilizes outer tables min/max values of join columns

If one has select statement of type
select
T1.C,
T2.B
from
T1
inner join T2 on T1.A = T2.A
where
T1.B = 'something'
;
The table T1 typically full scanned (T1.B does not have index) and table T2 is full scanned or accessed by index.
If table T2 is big, then join operation using full scan can take a quite a lot of time; temporary areas used for joining can be large.
If during scanning of table T1 the max and min values of T1.A where stored, and when accessing table T2 this information would be utilized,
there might be possibility to utilize partition pruning or discarding of not applicable rows for temporary areas.
The max and min values of T1.A could be gotten easily during scanning of rows, no extra sorting would be required.
Then joining of these tables might utilize less resources.
Comments
-
This sounds like the already available Heat Map feature.
Heat Maps require the. Adv. Compression add-on. I don't know if it requires ExaData also.
MK
-
I think your idea is already implemented - in a certain more complex way.
It is called a bloom filter (hash) join. Bloom filter joins can take advantage from partition pruning.
See also this thread:Bloom filter joins
And this paper: https://blogs.oracle.com/datawarehousing/big-data-sql-quick-start-joins-bloom-filter-and-other-features-part5
The difference to your idea is that bloom filters do not only consider min/max values, but almost all the values.
-
Hi
Bloom filters do have similiar effects.
But could having min/max values for search criteria for inner table enable usage of storage indexes in Exadata environments?
-
Hi
Also in original idea there was assumption that both of the joined tables would be large, hundreds of millions rows or more.
(e.g. in style of ORDER ja ORDER_LINE).
select
ORDER.C,
ORDER_LINE.B
from
ORDER
inner join ORDER_LINE on T1.ORDER_NUMBER_SEQ = T2.ORDER_NUMBER_SEQ
where
T1.B = 'something'
;