Skip to Main Content

Oracle Database Discussions

Announcement

For appeals, questions and feedback about Oracle Forums, please email oracle-forums-moderators_us@oracle.com. Technical questions should be asked in the appropriate category. Thank you!

Best practice query

Nic PilotNov 21 2021

Hi,
For a while I'm questioning myself about a best practice query for performance.
I've 2 tables.
EMPLOYEES and DEPARTMENTS
The EMPLOYEES table can contains millions of records.

Which query is better for performance ?

Query 1

Select t1.emp_id, t1.emp_name, t2.dept_name
from EMPLOYEES t1, DEPARTMENTS t2
where t1.dept_id = t2.dept_id
order by t1.emp_name

Query 2

Select emp_id, emp_name, (select mypackage.get_dept_name(dept_id) from dual) as dept_name
from EMPLOYEES
order by emp_name

Comments

Greybird-Oracle
Hi,

We don't often get questions asking why JE is so fast. :-)

When inserting in key order, the work that is performed by JE is:

1) A Btree lookup is performed to find the proper insertion location.
2) Btree bottom level internal nodes are logged when they are split, which is typically once per 128 records.
3) Higher level Btree nodes are also split when they fill, but this is very infrequent compared to other operations.
4) The Btree leaf node records are logged.

Because of the log structured storage system, all logging is appending to a file. You are not reading, so there is no other I/O. (If the database becomes large and does not fit in cache, then reading records into cache is expensive of course.)

As the tree becomes deeper (more levels) over time, step 1, 2 and 3 become slightly more expensive, but perhaps not enough to measure in your test.

Does this answer your question?

--mark
666757
Mark,
Thanks again for input. It's always appreciated.

Unfortunately, what it seemed you were leading to was that my database isn't big enough so that it spills over the cache and makes things expensive enough to see any difference.

The test database I've created has gotten to 50 GB on 100s of millions of records and I've STILL seen the same insert times throughout the database lifetime - roughly 75 writes/ms on non transactional writing. The database should have been too large to work with in memory a long time ago. When you say it should be slightly more expensive as time goes on, I agree with you theoretically. Though even a small change should show in my results. I measure the insert times of every 10 million writes, and its been fairly constant (75 +/- 1 writes per ms).

Julian
Greybird-Oracle
I suspect that you're not seeing an increase over time because the I/O cost doesn't go up, and the I/O cost is probably the limiting factor. This is certainly not a problem, is it?
666757
In the Berkeley DB presentation, it lists 4 types of indexing schemes: btree, recno, queue, and hash. I've seen in the documentation that JE uses btrees as standard. However, due to the constant insert times, could it be possible that I'm actually using something like hashing, which has constant insert times? How can I check to verify that I am indeed using btrees? If I am not using btrees, how can I set my program to use them?

Please let me know.

Julian

Edited by: user10464001 on Oct 31, 2008 9:49 PM
Charles Lamb
JE only provides a B-Tree access methods. The other AM's that you mention are only available on the "Core" BDB.

Charles Lamb
666757
Mark,

That's an interesting reply you left. It's very difficult for me to get a grasp of what's going on since I don't really know the internals of berkeley db. From briefly scanning the JE architecture whitepaper, it seems that you are probably right. The keys are searched in memory data is stored at the leaves of the B+ trees.

Julian
1 - 6

Post Details

Added on Nov 21 2021
6 comments
280 views