
OBIEE Performance Analytics: Analysing the Impact of Suboptimal Report Design

Bob Rhubart-Oracle, Feb 3 2016 (edited Mar 30 2016)

OBIEE can do complex processing on data that you request from a data source that may not support it (e.g., analytical windowing functions against data in Hive, or MS Access). Where OBIEE can, it will generally push all the complex work down to the database it’s querying from. If it can't, it will pull the data back in raw form and then do the work itself. And this is where the problems can start. Oracle ACE Robin Moffatt's article will help you avoid the headaches.


by Oracle ACE Robin Moffatt

Oracle Business Intelligence Enterprise Edition (OBIEE) is a fantastically powerful analytical and reporting tool. One of the great things about OBIEE is its ability to run a report and get the data you've asked for, whatever it takes. It could be running a single query against a dedicated data warehouse; it could be federating the results of multiple queries across a mixture of transactional and analytical relational sources, or even Hadoop. OBIEE can do complex processing on data that you request from a data source that may not support it (e.g., analytical windowing functions against data in Hive, or MS Access). Where OBIEE can, it will generally push all the complex work down to the database it’s querying from. If it can’t, it will pull the data back in raw form and then do the work itself. And this is where the problems can start….

Two of the common symptoms of the performance problem that I’m going to discuss here are:

  • Variance in response times: When a report is usually fast, but there are hotspots of slow performance
  • Disk space: OBIEE uses lots of disk space, sometimes running out

The underlying cause of these problems can often be traced to suboptimal design at one (or several) points in the OBIEE implementation. It's easy to understand how this comes about. Reports are often built in a development environment with little or no data, and so long as they work functionally, the "non-functional" aspect of performance doesn't come into play. It's only once reports start to be run against proper data sets, and with concurrent users, that serious issues start to arise.

In this article we're going to have a look at what happens to performance when the BI Server is doing more than its fair share of the work. This will give us an understanding of the stresses that a badly designed report or RPD can put on the system, and what to look for when we encounter this kind of performance problem. What we're going to see is the following:

  • Suboptimal design can cause the I/O subsystem on the application server to become overwhelmed. Because I/O throughput is frequently overlooked in favor of the more obvious CPU and memory metrics when assessing a server’s capacity, it’s easy to be given the "all clear" even when it’s actually flat on its back.

  • Low User CPU does not mean that the system is not under pressure; you must also look at CPU Wait and Idle, as well as the load average.

  • Reports with large amounts of data returned to the end user require lots of memory and temp disk space for Presentation Services, and temp disk space for the BI Server.

  • Reports such as these, when run concurrently, can easily exhaust the disk space available to the temp folder, resulting in aborted queries with errors such as:

    Query Status: [nQSError: 46073] Operation ‘write()’ on file
    ‘/u01/app/oracle/product/instances/instance1/tmp/OracleBIServerComponent/
    coreapplication_obis1/obis_temp/nQS_1229_557_52279963.TMP’ failed with error: (2) (.

    [nQSError: 46017] Sort has no work space in the work directory.
    [nQSError: 43119] Query Failed:

  • Lots of data moving throughout the stack is generally going to be inefficient and means increased network and disk I/O throughput.
    If you see this behavior with the execution of a single report, you need to consider whether you have the necessary bandwidth on the server when concurrent requests of a similar nature run, and also what this resource utilization means for other functions on the server (e.g. co-located ETL).

To analyze the performance, we’re going to use examples run in SampleApp v406, but with the database hosted on a separate machine.

What Happens When an OBIEE Query Runs?

Before diving into the nitty-gritty of this, let’s take a moment to examine how an OBIEE analysis executes. Understanding this enables us to accurately describe a query’s progress through the stack and explain observations made. It also enables us to build a time profile of the overall response. A time profile is something that Cary Millsap discusses in detail in his excellent paper, Thinking Clearly About Performance, a lot of which inspired my talk, No Silver Bullets - OBIEE Performance in the Real World (you can find the slides and video online). A time profile is built based on the components of execution, which for OBIEE are broadly as follows. Note that we’re assuming a single analysis here; for multiple analyses on a dashboard much of this would run concurrently (which in itself can also trigger problems):

  1. User opens a dashboard/hits refresh.
  2. Request passes through WebLogic Server to Presentation Services (Oracle Business Intelligence Presentation Server: OBIPS).
  3. Presentation Services (OBIPS) generates the Logical SQL for the analysis and sends it via ODBC to the Oracle Business Intelligence Server (OBIS).
  4. BI Server (OBIS) parses the Logical SQL through the RPD and generates one or more Physical SQL statements in order to retrieve the data needed.
  5. OBIS receives all the data it needs from the data sources and does any additional processing required. This could be stitching multiple data sets together (federation), combining them (union/fragmentation), or processing a single data set if that couldn’t be done in the database (sort/aggregate/filter, etc.).
  6. OBIS sends data back to OBIPS, where the report is generated and formatted and fed back to the client browser.


Understanding this basic flow is key to identifying bottlenecks in performance and understanding how to address any issues observed.

Pulling Large Data Volumes from Source Database

We’re going to have a look at how BI Server behaves when pulling back lots of data from the source database. This can come about for the obvious reason of building a very large report, or in the less obvious case when the BI Server can’t push all the processing to the data source (maybe it doesn’t support it, or there’s federation involved) and so the BI Server pulls back the data prior to aggregation and/or filtering and does it itself.

Level 01

To make this a simple example, I’m using a Direct Database Request (DDR) as it’s a nice, easy way to force OBIEE to run exactly the database query that you want it to. As an aside, this is exactly why DDR gets so frequently abused and you should note that I’m using it here simply to illustrate a point, not because it’s A Good Thing. Reports built with DDR are inflexible, they’re difficult to support, difficult to scale … the RPD is provided for a reason – use it!

The query we’re going to start by running is a pretty small one. It pulls back a bunch of data from one of the Oracle system catalog tables, all_objects. What the data contains is irrelevant; we just want a bunch of rows to pull back from the database to illustrate the point.

select * from all_objects a


In the results page, expand the table view to show all rows, which forces all the data to be pulled back through the stack (instead of just the first 25).

This basic query ran in 7 seconds, as we can see from the nqquery.log:

[2015–12–19T21:58:54.000+00:00]
SQL Request, logical request hash:
58a2d9dc
SET VARIABLE QUERY_SRC_CD=’Report‘,PREFERRED_CURRENCY=’USD’;EXECUTE PHYSICAL CONNECTION POOL
“Sample Relational Connection” select * from all_objects a [2015–12–19T21:58:54.000+00:00]
Sending query to database named Sample Relational Connection (id: ExecutePhysical Gateway),
connection pool named Sample Relational Connection, logical request hash 58a2d9dc,
physical request hash 9d76179b: [[
select * from all_objects a [2015–12–19T21:59:01.000+00:00]
Query Status: Successful Completion
Rows 74763, bytes
186010344 retrieved from database query id: ExecutePhysical Gateway [[
Physical query response time 4
(seconds), id ExecutePhysical Gateway
Physical Query Summary Stats: Number of physical queries 1,
Cumulative time 4, DB-connect time 0 (seconds)
Rows returned to Client 74763 [[
Logical Query Summary Stats:
Elapsed time 7, Response time 0, Compilation time 0 (seconds) [[
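If you want to pull these headline figures out of nqquery.log without scrolling through the whole file, a quick grep does the job. This is just a sketch; the log path assumes the default 11g instance layout used on this SampleApp machine, so adjust it for your own install:

grep -E "Rows [0-9]+, bytes|Physical query response time|Logical Query Summary Stats" \
  /app/oracle/biee/instances/instance1/diagnostics/logs/OracleBIServerComponent/coreapplication_obis1/nqquery.log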

Using the Rittman Mead Performance Analytics Dashboard we can easily analyze in one place the behavior of the OS and OBIEE during execution.

Some particular points to note:

  • The heap size (memory) and use of temp on disk for Presentation Services (but not BI Server) both increase:

    The memory figures are taken from the DMS counter heapSize, and are matched in the Linux proc stats.

    Presentation Services is responsible for generating and formatting the page that’s returned to the web browser, hence the increased memory and disk usage in working with these 75,000 rows of data.

  • There’s a burst of inbound network traffic at 21:59, marking the transfer of data from the database to the BI Server. Around the same time, there’s disk write activity, which is the temp files being written by OBIEE as the data is received from the database.


In this first example, there was no BI Server temp written to disk at all, but there was sawserver temp (48MB at peak). After the query has run and the temp falls back down, there is still a notable chunk in the querycache folder (which is presumably the Presentation Services query cache):

$ pwd

/app/oracle/biee/instances/instance1/tmp/OracleBIPresentationServicesComponent/
coreapplication_obips1

$ du -sh ./*
0 ./D_22sJ6EmumPlysgoC8jSWn_o8p3ci+z
4.0K ./dxe
4.0K ./J5h+oq80au7HtXDcOrlDhghash.tmp
4.0K ./mobile
4.0K ./multipartdoc
4.0K ./obis_temp
4.0K ./preview
896K ./qmcache
6.7M ./querycache_0
4.0K ./querycache_1
4.0K ./querycache_2
4.0K ./sawcharts
4.0K ./sawmaps
912K ./sawrptcache
904K ./sawvc
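To watch this temp usage building up for yourself during a test run, something along the following lines works; it's a sketch that assumes the same default instance paths shown above:

cd /app/oracle/biee/instances/instance1/tmp
watch -n 5 'du -sh OracleBIServerComponent/coreapplication_obis1/obis_temp OracleBIPresentationServicesComponent/coreapplication_obips1'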

Level Up

Now, let’s double the number of rows pulled back by adding a Cartesian into our DDR:

select *
from all_objects,
(select 1
from dual
connect by level <= 2) ;

From nqquery.log we can see the data has, as expected, doubled: nearly 150,000 rows and over 350MB being pulled back from the database, through the BI Server and Presentation Services:

[2015–12–19T22:56:24.000+00:00] Sending query to database named Sample Relational
Connection (id: ExecutePhysical Gateway), connection pool named Sample Relational Connection,
logical request hash 5fff717a, physical request hash e4f82db3: [[
select * from all_objects a , (select 1 from dual connect by level <= 2) ;

[2015–12–19T22:56:33.000+00:00]
Rows 149526, bytes 374413104 retrieved from database query id: ExecutePhysical Gateway [[
Physical query response time 5 (seconds), id ExecutePhysical Gateway [[


As before, the use of memory by Presentation Services (sawserver) goes up (by 550MB), as does its temp on disk. This time, we also see the BI Server (nqsserver) writing some temp to disk too (28MB at max).

By running the Linux tool iotop whilst the report is running, we can observe what the processes are up to in terms of disk throughput:

TIME TID PRIO  USER  DISK READ  DISK WRITE  SWAPIN IO COMMAND

22:56:27 14269 be/4 oracle 0.00 B/s 5.03 M/s 0.00 % 20.49 % nqsserver -quiet
22:56:27 14368 be/4 oracle 0.00 B/s 2.91 M/s 0.00 % 13.22 % nqsserver -quiet
22:56:27 14396 be/4 oracle 0.00 B/s 10.15 M/s 0.00 % 0.00 % sawserver

So both the BI Server and Presentation Services are writing to disk at around 10MB/sec. We will see the implications of this shortly.
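If you'd rather capture this for the whole life of a report than watch the interactive display, iotop's batch mode can be logged to a file and filtered down to the OBIEE processes. A sketch (iotop generally needs to run as root; -o shows only processes actually doing I/O, -b is batch mode, -t adds a timestamp, and -qqq suppresses the repeated headers):

iotop -obtqqq | grep --line-buffered -E 'nqsserver|sawserver' >> obiee_io.log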

End of Level Boss

We’ll stop beating around the bush now, and really chuck some work at the server. It’s the same query style as before, just with an even bigger Cartesian. In total we’re going to pull back 7.4 million rows from the database, totalling 1.4GB of data. It's important to note here that I've had to construct a DDR specifically to generate this load. Typically, the amount of data that OBIEE pulls back from the database in a single query is limited by good RPD design, good report design, and the row limits set per database object in the RPD.

Here’s the query launching:

[2015–12–19T23:15:19.000+00:00] Sending query to database named Sample Relational Connection
(id: ExecutePhysical Gateway), connection pool named Sample Relational Connection, logical request hash
e4bfd857, physical request hash 6f8227e2: [[
select * from all_objects a , (select 1 from dual connect
by level <= 100) ;

By default, you only get the first 25 rows of the table showing when you first run the report.


Note that at this point (before expanding the table to view all rows), there’s just the one BI Server (nqsserver) process writing to disk, presumably as it is still receiving data from the database:

TIME TID PRIO  USER  DISK READ  DISK WRITE  SWAPIN IO COMMAND

23:20:38 14269 be/4 oracle 0.00 B/s 8.42 M/s 0.00 % 32.68 % nqsserver -quiet

There’s also just a single BI Server TMP file on disk (growing rapidly):

Sat Dec 19 23:21:31 GMT 2015
total 3.1G
-rw------- 1 oracle oinstall 3.0G Dec 19 23:21 nQS_14111_7_83719644.TMP
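The listing above is just a timestamped directory listing taken repeatedly while the query runs; if you want to watch the temp file grow on your own system, something like this does it (a sketch, again assuming the default SampleApp paths):

watch -n 10 'date; ls -lh /app/oracle/biee/instances/instance1/tmp/OracleBIServerComponent/coreapplication_obis1/obis_temp'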

Once you click the option to expand the table view and show the maximum number of rows, you'll see additional BI Server processes writing to disk, as well as Presentation Services (sawserver) appearing:

TIME TID PRIO  USER  DISK READ  DISK WRITE  SWAPIN IO COMMAND

23:21:53 14272 be/4 oracle 0.00 B/s 1976.32 K/s 0.00 % 12.11 % nqsserver -quiet
23:21:53 14269 be/4 oracle 3.86 K/s 1976.32 K/s 0.00 % 8.73 % nqsserver -quiet
23:21:53 14360 be/4 oracle 0.00 B/s 1111.68 K/s 0.00 % 7.86 % nqsserver -quiet
23:21:53 14396 be/4 oracle 0.00 B/s 18.96 M/s 0.00 % 0.00 % sawserver

There’s also a second BI Server temp file that starts to get written to on disk as well:

Sat Dec 19 23:22:03 GMT 2015
total 3.2G
-rw------- 1 oracle oinstall 3.1G Dec 19 23:21 nQS_14111_7_83719644.TMP
-rw------- 1 oracle oinstall 15M Dec 19 23:22 nQS_14111_8_84112206.TMP

The system metric profile now becomes rather interesting:


Because the query takes many minutes to run, it gives us the chance to gather some "steady state" figures, which we can assume is the system running at full pelt, pulling in all the millions of rows of data. Points of particular note are as follows:

  • The CPU is busy on c.10% user work, but also c.10% wait, which is usually indicative of it waiting for disk or network operations to complete.


  • The network is at a constant c.200 Kb/s input, and disk c.12Mb/s write.


The database is on the same subnet as the application server with a fast physical connection, so the maximum 200Kb/s input rate suggests that the constraint is at the other end (i.e., the database reading and sending the data).

  • The disk subsystem is notably busy, hitting 100% capacity at points.
  • Presentation Services (sawserver) memory use rose by 6GB.

Also interesting is that a lot of the heavy activity takes place after the logical query completes at 23:22:

[2015–12–19T23:22:05.000+00:00] Rows 7476300, bytes 1540786016 retrieved from
database query id: ExecutePhysical Gateway [[
Physical query response time 172 (seconds), id ExecutePhysical Gateway [[
Logical Query Summary Stats: Elapsed time 406, Response time 0, Compilation time 0 (seconds) [[

Looking at the temp disk behavior, it seems the DB finished returning its rows at c.23:22, which matches the nqquery.log time of the logical query completing. However, temp and heap continue to grow for several more minutes, including disk I/O hitting 100% usage, and CPU Wait at c.25%. This is presumably Presentation Services taking the data from BI Server and preparing it to send back to the web browser.


Game Over

Above, we saw how one “heavy” query behaves on the OBIEE application server. In real life, there’ll be multiple concurrent users on OBIEE, along with potentially other workloads (e.g., TimesTen on Exalytics), meaning that this level of resource consumption can have serious implications. If we now run two of these queries (modified slightly so that they don’t get a shared cursor in Presentation Services) as an example of concurrent usage on OBIEE, we completely flatten the server: the 1 minute load average is 76!

Note that User CPU is extremely low, but so is Idle CPU, because it’s all on Wait.


What’s causing the CPU Wait? IO!


So even if User CPU (the metric most commonly checked when deciding whether a server is "slow") is not at 100%, it doesn't mean that the server is not starved of resources.
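A quick way to check this on a Linux server is to look at I/O wait and idle alongside the load average, rather than just user CPU. These are standard tools (sar needs the sysstat package installed); the sampling intervals are arbitrary:

vmstat 5 3   # watch the "wa" (I/O wait) and "id" (idle) columns, plus the "r"/"b" run and blocked queues
sar -u 5 3   # %user, %iowait and %idle over three 5-second samples
uptime       # 1-, 5- and 15-minute load averages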

How Do I Know If My OBIEE System Has This Problem?

It’s possible to get a rough idea of which queries might warrant a bit of investigation by analyzing the data in the Usage Tracking table S_NQ_ACCT. For example:

SELECT START_TS,SAW_SRC_PATH, CUM_NUM_DB_ROW, TOTAL_TIME_SEC
FROM S_NQ_ACCT
WHERE CUM_NUM_DB_ROW > 1000000;
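If you'd rather run that check from the shell, for example as part of a scheduled health check, a minimal sqlplus wrapper along these lines works; the credentials and connect string here are placeholders for your own usage tracking schema:

sqlplus -s usage_tracking/password@bidb <<'EOF'
SET PAGESIZE 200 LINESIZE 250
SELECT start_ts, saw_src_path, cum_num_db_row, total_time_sec
FROM   s_nq_acct
WHERE  cum_num_db_row > 1000000
ORDER  BY cum_num_db_row DESC;
EOF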

And if you don’t have Usage Tracking enabled? Go and enable it now. Seriously. It’s invaluable.

Can We Fix It? Yes, We Can!

Fixing this kind of performance problem requires careful and thorough examination of the report(s) in question and an understanding of what is causing the excess movement of data. From this, it should become clear what needs doing to optimize it and make it more efficient. The causes will generally fall into one of two categories:

  • The report shows only a fraction of the rows being pulled back from the database, but because of the RPD design and/or the report design, the BI Server can’t push the required summarization down to the database and instead has to pull back a greater amount of data to process itself on the BI Server and/or Presentation Services.
  • The user is deliberately trying to return large volumes of data (often to then just export to Excel).

Quite often, once the evidence of the cause for poor performance is put in front of developers, they’ll rapidly find a way to fix it. They may have had a sneaking suspicion all along that what they’d built wasn’t as performant as it could be, or maybe realized that now’s the time to do that RPD modelling optimization that they’d been putting off for a future date.

KIWI?

You may wonder, given that I’ve clearly demonstrated a system bottlenecked on I/O to the local disk, why we don’t just get a bigger pipe for the disk. That is, a faster disk controller and/or faster disks (maybe SSDs), or maybe even RAM Disk, which I discuss below. The thing is, this misses the point. Known as K.I.W.I. (Kill It With Iron), throwing more server hardware at capacity issues is usually not a long-term solution if you want to build a system that’s scalable and economical. If you have inefficient queries running, running the same inefficient queries on bigger hardware leaves you with the same problem: inefficiency. As time goes by and people continue to build even bigger inefficient queries, or start to run them concurrently, the problems will come straight back. But, if you fix these problems at source, you regain system stability and you increase your available capacity without spending a dime!

Restricting Database Executions by Row Count or Execution Time

Using the BI Server query governor, it is possible to restrict the number of rows and/or execution time on the database. This is a good "fail-safe" to have in place if you're concerned that reports may be written that pull excessive data back to the point where it is going to impact the performance of the BI Server or source database.

When the query governor kicks in, you'll see your OBIEE query abort with a message such as:

[nQSError: 43113] Message returned from OBIS.

[nQSError: 43121] Max Row Limit Exceeded

[nQSError: 43119] Query Failed:

[nQSError: 43093] An error occurred while processing the EXECUTE PHYSICAL statement.

[nQSError: 60008] The query for user 'weblogic' exceeded the maximum query governing
rows 42 from the database 'Sample Relational Connection'.

The query governor can be configured per application role (or user, but this is not such a good idea from a maintainability point of view) in the RPD:


As of 12c (12.2.1), the query governor is also configurable as a global default in the NQSConfig.ini file under DEFAULT_DB_MAX_ROWS and DEFAULT_DB_MAX_EXEC_TIME. You also have the option of setting it per session using the system session variables OBIS_DB_MAXROWS and OBIS_DB_MAXEXECTIME, which are present in 11.1.1.7 and which in 12c can also be set as request variables.
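To check whether those global defaults are actually set on a 12c system, a quick grep of NQSConfig.INI will tell you; the path below assumes a standard 12.2.1 BI domain layout, so adjust for your environment:

grep -iE "DEFAULT_DB_MAX_ROWS|DEFAULT_DB_MAX_EXEC_TIME" \
  $DOMAIN_HOME/config/fmwconfig/biconfig/OBIS/NQSConfig.INI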

Note that the query governor limits apply per physical query executed, so one logical query could still return cumulatively more rows from the database than the governor limit, if that logical query spawned multiple physical queries that each returned fewer rows than the limit.

Footnote: Moving Work Files to Memory with RAMDisk

If excess data must pass through OBIEE, we want to help it happen as efficiently as possible. One avenue of investigation is to use faster storage for the temporary files, such as SSD or even RAM Disk so that they’re stored directly in memory. This is discussed in detail in Oracle support document 1569118.1.

This is easy to do on Linux, courtesy of /dev/shm. You can configure WORK_DIRECTORY_PATHS in the BI Server NQSConfig.ini, or just create a symlink for the whole OBIEE temp folder, at $FMW_HOME/instances/instance1/tmp (assuming default instance name). After making this change, restart the stack.
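Before pointing OBIEE temp at /dev/shm, it's worth checking how much RAM-backed space is actually available there: on most Linux distributions tmpfs defaults to half of physical RAM, and once temp lives there a runaway query will consume memory rather than disk. A quick sanity check:

df -h /dev/shm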

I compared the before and after figures of making this change. First I ran a baseline test, the 1.5M row request from the DB:

[2015–12–21T09:52:04.000+00:00]
Sending query to database named Sample Relational Connection (id: ExecutePhysical Gateway)
connection pool named Sample Relational Connection
select * from all_objects a , (select 1 from dual connect by level <= 20) ;

[2015–12–21T09:53:18.000+00:00]
Query Status: Successful Completion [[
Rows 1495260, bytes 3744131040 retrieved from database query id: ExecutePhysical Gateway [[
Physical query response time 38 (seconds), id ExecutePhysical Gateway [[
Physical Query Summary Stats: Number of physical queries 1, Cumulative time 38,
DB-connect time 0 (seconds) [[
Rows returned to Client 1495260 [[
Logical Query Summary Stats: Elapsed time 74, Response time 35, Compilation time 0 (seconds) [[

The OS metrics show, as expected, a machine busy with disk writes and inbound network traffic:


Now we move the BI Server temp folder to RAMDisk (/dev/shm):

oracle@demo:tmp () $ ll
total 32
drwxr-x--- 3 oracle oinstall 4096 Dec 3 2013 Essbase
drwxr----- 2 oracle oinstall 4096 Jan 27 2014 FinancialReporting
drwxr-x--- 3 oracle oinstall 4096 Dec 3 2013 OracleBIClusterControllerComponent
drwxr-x--- 3 oracle oinstall 4096 Dec 3 2013 OracleBIJavaHostComponent
drwxr-x--- 4 oracle oinstall 4096 Mar 28 2014 OracleBIODBCComponent
drwxr-x--- 3 oracle oinstall 4096 Dec 3 2013 OracleBIPresentationServicesComponent
drwxr-x--- 3 oracle oinstall 4096 Dec 3 2013 OracleBISchedulerComponent
drwxr-x--- 3 oracle oinstall 4096 Dec 3 2013 OracleBIServerComponent
oracle@demo:tmp () $ mv OracleBIServerComponent/ /dev/shm/
oracle@demo:tmp () $ ln -s /dev/shm/OracleBIServerComponent/ OracleBIServerComponent
oracle@demo:tmp () $ ll
total 28
drwxr-x--- 3 oracle oinstall 4096 Dec 3 2013 Essbase
drwxr----- 2 oracle oinstall 4096 Jan 27 2014 FinancialReporting
drwxr-x--- 3 oracle oinstall 4096 Dec 3 2013 OracleBIClusterControllerComponent
drwxr-x--- 3 oracle oinstall 4096 Dec 3 2013 OracleBIJavaHostComponent
drwxr-x--- 4 oracle oinstall 4096 Mar 28 2014 OracleBIODBCComponent
drwxr-x--- 3 oracle oinstall 4096 Dec 3 2013 OracleBIPresentationServicesComponent
drwxr-x--- 3 oracle oinstall 4096 Dec 3 2013 OracleBISchedulerComponent
lrwxrwxrwx 1 oracle oinstall 33 Dec 21 10:05 OracleBIServerComponent -> /dev/shm/OracleBIServerComponent/

We run the same test as before:

[2015–12–21T10:11:28.000–00:00]
Sending query to database named Sample Relational Connection (id: ExecutePhysical Gateway),
connection pool named Sample Relational Connection, logical request hash 5312c84d, physical
request hash 6b3bd34e: [[select * from all_objects a , (select 1 from dual connect by level <= 20) ;
Query Status: Successful Completion [[
Rows 1495260, bytes 3744131040 retrieved from database query id: ExecutePhysical Gateway [[
Physical query response time 37 (seconds), id ExecutePhysical Gateway [[
Physical Query Summary Stats: Number of physical queries 1, Cumulative time 37, DB-connect time 0 (seconds) [[
Rows returned to Client 1495260 [[
Logical Query Summary Stats: Elapsed time 43, Response time 8, Compilation time 0 (seconds) [[

Interestingly, the network throughput maximum goes up from c.300Kb/s to c.500Kb/s, which could be because the BI Server is able to write its temp files faster, and thus consume the data from the source database over the network more quickly.

There’s still a heavy hit of disk activity, because Presentation Services (sawserver) is still using disk for its temp (at this point we’ve only moved the BI Server temp to RAM Disk). This is confirmed by iotop, which shows sawserver writing to disk at a greater rate than before, either because the BI Server is able to feed it the data faster, or because there’s less contention on disk writes now that the BI Server is using memory instead:

TIME TID PRIO USER DISK READ DISK WRITE SWAPIN IO COMMAND
10:21:24 16348 be/4 oracle 0.00 B/s 24.27 M/s 0.00 % 0.00 % sawserver

BI Server (nqsserver) writes a temp file to /dev/shm:

Mon Dec 21 10:21:46 GMT 2015
total 110M
-rw------- 1 oracle oinstall 110M Dec 21 10:21 nQS_16039_2_37267954.TMP

Of note is that the maximum size seen for the NQS temp file is 110MB, yet the data returned from the DB is 3.4GB; either NQS is compressing it, or it’s not storing the whole dataset before it passes it on to Presentation Services (i.e., the temp file is just a transient staging point for the data, not a resting place):


By using RAMDisk, the runtime is reduced from c.75 seconds to c.45 seconds, because the BI Server is in effect able to use memory instead of disk as a landing point for data returned from the database.

What about if we move Presentation Services temp onto RAMDisk, too?

oracle@demo:tmp () $ ll
total 28
drwxr-x--- 3 oracle oinstall 4096 Dec 3 2013 Essbase
drwxr----- 2 oracle oinstall 4096 Jan 27 2014 FinancialReporting
drwxr-x--- 3 oracle oinstall 4096 Dec 3 2013 OracleBIClusterControllerComponent
drwxr-x--- 3 oracle oinstall 4096 Dec 3 2013 OracleBIJavaHostComponent
drwxr-x--- 4 oracle oinstall 4096 Mar 28 2014 OracleBIODBCComponent
drwxr-x--- 3 oracle oinstall 4096 Dec 3 2013 OracleBIPresentationServicesComponent
drwxr-x--- 3 oracle oinstall 4096 Dec 3 2013 OracleBISchedulerComponent
lrwxrwxrwx 1 oracle oinstall 33 Dec 21 10:05 OracleBIServerComponent -> /dev/shm/OracleBIServerComponent/
oracle@demo:tmp () $ mv OracleBIPresentationServicesComponent/ /dev/shm
oracle@demo:tmp () $ ln -s /dev/shm/OracleBIPresentationServicesComponent/ OracleBIPresentationServicesComponent
oracle@demo:tmp () $ ll
total 24
drwxr-x--- 3 oracle oinstall 4096 Dec 3 2013 Essbase
drwxr----- 2 oracle oinstall 4096 Jan 27 2014 FinancialReporting
drwxr-x--- 3 oracle oinstall 4096 Dec 3 2013 OracleBIClusterControllerComponent
drwxr-x--- 3 oracle oinstall 4096 Dec 3 2013 OracleBIJavaHostComponent
drwxr-x--- 4 oracle oinstall 4096 Mar 28 2014 OracleBIODBCComponent
lrwxrwxrwx 1 oracle oinstall 47 Dec 21 10:35 OracleBIPresentationServicesComponent -> /dev/shm/OracleBIPresentationServicesComponent/
drwxr-x--- 3 oracle oinstall 4096 Dec 3 2013 OracleBISchedulerComponent
lrwxrwxrwx 1 oracle oinstall 33 Dec 21 10:05 OracleBIServerComponent -> /dev/shm/OracleBIServerComponent/

Restart the stack and then run the same report:

[2015–12–21T10:37:21.000+00:00]
Sending query to database named Sample Relational Connection (id: ExecutePhysical Gateway),
connection pool named Sample Relational Connection, logical request hash 5312c84d,
physical request hash 6b3bd34e: [[
select * from all_objects a , (select 1 from dual connect by level <= 20) ;[2015–12–21T10:38:02.000+00:00]
Query Status: Successful Completion [[
Rows 1495260, bytes 3744131040 retrieved from database query id: ExecutePhysical Gateway [[
Physical query response time 37 (seconds), id ExecutePhysical Gateway [[
Physical Query Summary Stats: Number of physical queries 1, Cumulative time 37, DB-connect time 0 (seconds) [[
Rows returned to Client 1495260 [[
Logical Query Summary Stats: Elapsed time 41, Response time 8, Compilation time 0 (seconds) [[

Notice that Disk Busy is almost non-existent, and the disk I/O is an order of magnitude smaller (KB, not MB).

sawserver still does some recorded I/O but at a much lower rate:

TIME TID PRIO USER DISK READ DISK WRITE SWAPIN IO COMMAND  

10:37:46 259 be/3 root 0.00 B/s 4.94 M/s 0.00 % 0.23 % [jbd2/xvda3–8]
10:37:46 8677 be/4 oracle 0.00 B/s 1277.92 K/s 0.00 % 0.00 % sawserver

A last point to note on the use of a RAMDisk is that while the alternative is "disk," there will usually be a level of memory involved regardless, by means of the operating system's buffer cache. This is present on both Linux and Windows, as well as other OSs.
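On Linux you can see the buffer cache at work with free; depending on your procps version the relevant columns are either "buffers"/"cached" or a combined "buff/cache", and after a heavy report run a good chunk of that temp I/O may still be sitting there:

free -m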

Summary

In conclusion:

  • Suboptimal design at one or several points in the OBIEE implementation can cause the I/O subsystem on the application server to become overwhelmed. Because I/O throughput is frequently overlooked in assessments of a server’s capacity, it’s easy to be given the "all clear" even when it's unwarranted.

  • Low User CPU does not mean that the system is not under pressure; you must also look at CPU Wait and Idle, as well as the load average.

  • Large reports returned to the end user require lots of memory and disk space for Presentation Services.

  • Lots of data returned from the database requires temp disk space for the BI Server.

  • “Heavy” reports such as these, when run concurrently, can easily exhaust the disk space available to the temp folder, resulting in aborted queries with errors such as:

    Query Status: [nQSError: 46073] Operation ‘write()’ on file ‘/u01/app/oracle/product/
    instances/instance1/tmp/OracleBIServerComponent/coreapplication_obis1/obis_temp/
    nQS_1229_557_52279963.TMP’ failed with error: (2) (.
    [nQSError: 46017] Sort has no work space in the work directory.
    [nQSError: 43119] Query Failed:

  • Lots of data moving throughout the stack means increased network and disk I/O throughput. If you see this behavior with the execution of a single report, you must consider whether you have the necessary bandwidth on the server when concurrent requests of a similar nature run, and also what this resource utilization means for other functions on the server (e.g., co-located ETL).

If you have reports that are behaving like this, the solution is to get them to move less data around. This could mean returning fewer rows to the end user, through better use of report predicates and through discussing with the user how the data is actually consumed, to understand whether the requirement could be met in a more optimal way. Alternatively, the solution is to present the same data, but to do so more efficiently. Efficiency improvements are usually to be found in the design of the report (for example, look for complex calculated columns that would be better pushed to the RPD or database), as well as more generally in the RPD and database, by storing and modelling the data in a way that better supports the types of queries being run. A final suggestion: where appropriate, schedule bigger reports to run out of hours through BI Delivers, reducing the risk of them impacting online users during the day.

Further Reading and References

The documentation for OBIEE includes very useful sections on OBIEE performance.

Oracle has published white papers, available through My Oracle Support, that detail further options within the OBIEE stack for tuning performance.

The following article on the Rittman Mead blog describes further optimization possibilities for OBIEE, including use of a Web Tier: Optimising OBIEE performance. An associated presentation is available: No Silver Bullets - OBIEE Performance in the Real World

Last, but very definitely not least, a most important paper for the understanding of performance optimization work in general is Cary Millsap's Thinking Clearly About Performance.

About the Author

Oracle ACE Robin Moffatt is the head of R&D (Europe) at Rittman Mead.


This article has been reviewed by the relevant Oracle product team and found to be in compliance with standards and practices for the use of Oracle products.


Comments

Dude!

A1) Multithreading employs the concept of parallel process execution. You can find it in both hardware and software. SMT is about hardware multithreading.

A2) Multithreading concepts are used on all modern computer platforms.

A3) Multitasking is about processes sharing common resources, such as a CPU. The most common form is preemptive multitasking, where the OS decides how much CPU time gets allocated to individual processes. Multithreading extends the idea of multitasking into applications, the OS and hardware: it allows a process to subdivide specific operations into individual sub-processes or threads that can run in parallel.

A4) Multitasking and multithreading are concepts that can run on a single CPU. However, a process can only run on one CPU at a time; a single process cannot execute on multiple CPUs simultaneously. To take advantage of multiple CPUs, multiple processes or threaded operations are required.

A5) The difference between, e.g., a dual CPU and a dual core is hardware. Dual core means two CPUs in one physical CPU package, whereas dual CPUs mean duplicated hardware, such as CPU sockets and the CPU bus. Multi-core architecture is cheaper because it requires less hardware and less power, and has other advantages in terms of resource sharing, but it also has its disadvantages, such as less hardware redundancy.

A6) Multithreading is not a feature or concept that only applies to CPU architecture, but also to software. Multithreading can be used on a single CPU as well as to spread work across multiple CPUs.

Jimbo

Thanks.

Q1. So will modern processors automatically multithread processes or do you have to specifically set or configure the processor to do this ?

Q2. I presume if your multi threading is at the hardware level, this is really a combination of the processor and OS, i.e. it is not transparent to the OS? And do Unix, Linux and Windows OS's natively use multi threading, or do you have to specifically enable it, depending on whether the underlying processor can support multi threading?

Q3. So is there a difference between OS level multi threading and application /coding level multi threading ?

Q4. Are OS level and Application / Code level multi threading independent of each other ? i.e. if you are going to use Application / Code level multi threading must you have a processor / OS that is capable of multi threading ? Is it possible to have multi threading at your application / coding level on a system that has a processor that itself does not multi thread - if so how is this different from the concept of parallel processing on any multi user / process system ?

Q5. Does your code / application have to use specific libraries / procedures etc in order to multi thread i.e. is it a specific type of coding that facilitates application / code level multi threading ?

thanks,

Jim

Dude!

I'm afraid your questions are too complex to be answered individually. As far as I understand, processes are a logical software construct that have no meaning for the CPU as such. At the CPU level, multithreading is typically achieved by multiplexing and context switching, which gives only the impression of running parallel tasks. However, I'm not a CPU designer and I suggest looking for more information on the web.

Multithreading implemented by the OS or application does not necessarily depend on the CPU or the number of CPUs, as I previously mentioned, except that the system needs to be powerful enough to handle parallel tasks. Hardware can provide support that benefits threaded tasks and may require appropriate software and drivers. Perhaps synchronous vs. asynchronous non-blocking I/O is a good example.

To implement multithreading, the application and OS must support the concept of running parallel tasks and provide the appropriate code: for example, copying one file at a time versus being able to submit several parallel file copy tasks at once. In all cases, the application and OS need to be able to manage the individual tasks or processes, maintain file system integrity, and recover from and report errors.

From the perspective of an application programmer, multithreading is a concept that requires appropriate software or programming library support. Some of these libraries are cross-platform: http://www.gnu.org/software/pth/related.html

Btw, do you need this information for something specific or is this just general interest?

Jimbo

Thanks Dude,

I was really trying to discern if both Unix and Windows as OS's are multi thread capable out of the box, or if there is something special you have to set up to make them multi thread.

It then struck me that I didn't quite understand multi threading as well as I thought I did !

I was confused because there seemed to be different references about multi threading by a processor, multi processing by the OS and multi threading by code / applications, and the three seemed to be presented as different concepts. I am still not sure if processor and OS level multi threading are effectively the same concept?

I was also confused about what the difference was between multi threading at the processor level and the old CPU concepts I learned about multi tasking i.e. round robin time slicing etc ?

thanks,

Jim

Dude!

Multi-processing and multi-threading are all about parallel processing. Multi-processing or multi-threading does not, however, mean you need multiple processors or CPUs. You can also do multi-processing or multi-threading on a single CPU or single core, in which case the CPU switches between the processes or threads quickly enough to make it seem like parallel processing. This is called context switching, and most modern systems schedule the CPU in a round-robin fashion.

Multi-process means separate processes in the system process table. Multi-threaded means threads associated with a process, which share the same address space. To see threads in Unix or Linux, you can use the "ps" command with the "-L" option.

There is nothing you have to do to make Windows or Linux multi-threaded; every modern OS provides the necessary support. Whether an application is multi-threaded depends on the application. Oracle Database introduced a new multi-process multi-threaded architecture for Unix and Linux in 12c: https://docs.oracle.com/database/121/CNCPT/process.htm#CNCPT1245


In some tests, memory usage was almost halved when using the multi-process multi-threaded feature. The default in Oracle 12c is not multi-threaded, but you can enable it:

SQL> ALTER SYSTEM SET threaded_execution=true SCOPE=SPFILE;


Perhaps what you might also want to keep in mind is that a single process or single thread can only run on a single CPU or core at a time. Or, said differently, a single process or single thread cannot take advantage of multiple CPUs or a multi-core architecture.

Jimbo

Thanks Dude that is all very interesting.

I guess I need to know more about the evolution of CPUs. Multi processing has been around for as long as there have been multi-user operating systems - right back to the days of mainframe technology. This is why I was aware of the concepts of round robin time slicing etc. I presume Context Switching refers to how the CPU deals with multi processing and multi threading, as opposed to just how it deals with threads?

So I guess the old multiprocessing CPU scheduling systems such as Round Robin are also referred to as Context Switching ?

I guess some where along the line, the various processors introduced the concept of Multi Threading - I know Intel introduced it as SMT back in 2002 for their Pentium 4 processor, I am not sure if the big RISC based processors from the likes of Sun, IBM, HP etc introduced it before or after Intel ?

I am also not sure if the arrival of multi threading was firstly just for OS processes and then later expanded to allow end user applications / code to multi thread, or if end user applications were able to avail of it from day 1 of its introduction - again, more history!

Oracle is also an interesting story - prior to 12c, I knew their RDBMS software did not make use of any underlying multi threading capability of the host OS; instead they had their own sort of multi threading through their Multi Threaded Server (MTS), deployed via the Oracle Net layer, where basically a Dispatcher process allowed multiple end user processes to share a single backend database server process. However, from 12c, as you point out, the RDBMS can now make use of native OS level multi threading. I must check if this means that MTS is dead from 12c onwards or if it could still be used (for whatever obscure reason).

thanks,

Jim


Dude!

I would not necessarily draw a direct connection between multi-threading, multi-processing, multi-user, multi-core and the way CPUs work. There is a lot of "multi", and some of the features benefit from each other, but the concepts do not depend on each other and were based on individual development and what was technically feasible or required at a certain time.
