I was asked this question in an interview. It goes like this:
When a certain user fires a specific query, CPU utilization shoots up, but I/O and memory are normal.
What, as a DBA, would you suggest the app team change in the query?
Explain plan is always there, but without an explain plan can we pigeonhole the problem in the query that is causing this high CPU? For example, in the case of high I/O we can predict that there might be a lot of hard parses happening, and in the case of high memory utilization (RAM) we can say that a lot of sorting must be happening in the query. Going down the same road, can we predict something when mainly the CPU is high for a specific query?
As a DBA, you shouldn't make any recommendations at this point -- the trial-and-error method is very messy and inefficient. You should determine the cause of the excessive CPU usage first, and then make a recommendation based on that. Typical causes are:
- high number of logical reads
- expensive in-memory sorts
- expensive in-memory hash joins
- abuse of PL/SQL subroutines in a query (e.g. use of a user-defined function in a join condition)
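To narrow down which of these causes applies, one place to start is V$SQL. This is only a sketch, assuming you can identify the statement's SQL_ID (e.g. from V$SESSION.SQL_ID while the query is running); the bind variable :sql_id is a placeholder you supply yourself:

```sql
-- High BUFFER_GETS relative to ROWS_PROCESSED points at excessive
-- logical reads; high CPU_TIME with modest gets hints at sorts,
-- hash joins, or PL/SQL function calls instead.
SELECT sql_id,
       executions,
       cpu_time     / 1000000 AS cpu_secs,
       elapsed_time / 1000000 AS elapsed_secs,
       buffer_gets,
       disk_reads,
       rows_processed
FROM   v$sql
WHERE  sql_id = :sql_id;
```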
An important thing to consider is that unless your query is running in parallel, it cannot take more than one CPU at a time. So if your system has multiple CPUs and CPU utilization approaches 100%, you cannot blame a serially executed query for that (unless, of course, multiple sessions are firing it simultaneously).
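Whether multiple sessions really are firing the same statement simultaneously is easy to check. A minimal sketch against V$SESSION:

```sql
-- Count sessions currently executing each statement; a serial query
-- can only saturate one CPU per session, so a near-100% multi-CPU
-- box implies many active sessions on the same SQL_ID (or parallelism).
SELECT sql_id, COUNT(*) AS active_sessions
FROM   v$session
WHERE  status = 'ACTIVE'
AND    sql_id IS NOT NULL
GROUP  BY sql_id
ORDER  BY active_sessions DESC;
```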
Another thing to consider is that queries are supposed to consume database resources (including CPU), and one only should worry if resource consumption is unreasonable. So it's not possible to tell whether a query is taking too much CPU or not without knowing how much work it has to do -- first of all, how much data it has to process. E.g. if a query is not using aggregation and returns just 1 row, and it takes 1 CPU minute to do so, it's suspicious. If a process loads a billion rows into a table and takes a few CPU hours to do it, it's probably fine.
Logical Reads translate to CPU Cycles instead of I/O cycles.
Logical reads are preferred over physical reads, true. But many SQL statements also do too many logical reads. An example is a query that executes a nested loop a million times, with the same buffers (blocks) being read repeatedly. Since they are the same blocks, they stay in the buffer cache, but each read still costs CPU.
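You can watch this happen with SQL*Plus autotrace. The table and column names below are hypothetical; the point is the "consistent gets" figure autotrace reports, which is pure buffer-cache (i.e. CPU) work once the blocks are cached:

```sql
SET AUTOTRACE TRACEONLY STATISTICS

-- Force a nested loop join between two hypothetical tables.
-- Each probe of the inner table revisits the same cached blocks,
-- so "consistent gets" climbs very high while "physical reads"
-- stays near zero -- all of that difference is CPU.
SELECT /*+ USE_NL(o l) */ o.order_id, l.line_id
FROM   orders o
JOIN   order_lines l ON l.order_id = o.order_id;
```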
Hemant K Chitale
Confusingly, "logical read" has two meanings in Oracle: "a read from the buffer cache not preceded by a read from disk (i.e. the data were already in the buffer cache)" or "a read from the buffer cache, regardless of whether or not it was preceded by a disk read". When people say "do as few logical reads as possible" they use the latter meaning, i.e. they mean that you should minimize the amount of data processed per returned byte, not that you should read data from disk rather than from cache (that would be crazy).
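In the V$ statistics the latter meaning corresponds to "session logical reads", which Oracle counts as "db block gets" plus "consistent gets". A sketch for inspecting your own session:

```sql
-- "session logical reads" = "db block gets" + "consistent gets";
-- compare it with "physical reads" to see how much work was
-- buffer-cache (CPU) rather than disk.
SELECT n.name, s.value
FROM   v$mystat   s
JOIN   v$statname n ON n.statistic# = s.statistic#
WHERE  n.name IN ('session logical reads',
                  'db block gets',
                  'consistent gets',
                  'physical reads');
```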
From real life:
I have a query that was running quickly but suddenly changed to running slowly. Investigating it, I found the plan had changed to a full index scan. This index is repeatedly fully scanned in a large SGA on a system that isn't particularly busy otherwise. So Oracle keeps it in memory, and spikes CPU while grinding through it to find a few rownums at the end. I would want to reduce these logical reads, because it is just plain unnecessary work. I could hint the query, or better, fix the statistics so Oracle doesn't decide to do the silly full scan.
That is what reducing logical reads means, both in the above example and the very common nested loops example Hemant gave. Reducing unnecessary work.
Now, that is why I answered "insufficient information" to the original question. I would hope the interviewer was looking for the thought processes you would bring to bear on such a question, answering along the lines of Nikolay's first response above. But if he was looking for something specific, it could simply be a bad question, as you find in some online interview lists, and even some books.
There could even be no change needed to the Oracle query in the original question: a busy system may be close to fully loaded, but the particular query pushes it over the edge, the run queues start lengthening, and everything goes to hell. This is common when letting DSS queries loose on OLTP systems. The answer then might be "don't run the stupid query" (i.e., run it on a different system dedicated to DSS queries, or have time windows for running some queries).