Forum Stats

  • 3,759,955 Users
  • 2,251,621 Discussions
  • 7,870,882 Comments

Discussions

The expected read speed of SQL*PLUS

mamsds
mamsds Member Posts: 16 Green Ribbon
edited Sep 13, 2021 2:17AM in General Database Discussions

Edit:

  1. Several people mentioned the case of the table and columns...Well this is not something I can control, so please stop asking me to change them...
  2. Now I can get a read speed at around 40MBytes/s with a local connection, following the answers I found here: https://dba.stackexchange.com/questions/299426/the-expected-read-speed-of-sqlplus
  3. I have another issue about cx_Oracle. You may take a look if you are interested: https://dba.stackexchange.com/questions/299444/possible-memory-leak-of-cx-oracle-pyodbc-oracle-instant-client

The issue is many-fold but it boils down to this simple scenario: I have a Oracle Database Version 19.4.0.0.0 and SQL*PLUS Version 21.3.0.0.0 running on the same Red Hat Linux (I actually tried both local and remote connections, here let's just consider the local case to separate concerns...). I run this SQL statement:

SELECT "Date", "ID", "Name", "Value"
FROM "A_Table"
WHERE ("A_Table"."Date" >= '11-SEP-20' AND "A_Table"."Date" <= '11-SEP-21');

So basically I just want to fetch all the rows based on Date and the resultant file should be around 2GB in size. The statement works, but after using a wide range of techniques (ICYW, techniques including checking filesystem IO, network IO and file size regularly), I am sure that this SELECT statement can only fetch data at a speed of 2 MByte/s maximum.

Some extra observation and tuning:

  1. An index is correctly built and this speed limit is NOT from Oracle finding the right data to send, but rather, all data are there ready to be sent, just somehow they cannot be sent at a speed higher than 2 MBytes/s--I observe that, at first few seconds, Oracle service will read hard drive at 300-500MB/s, after a short while (i.e., a few seconds), it stops reading and then the SELECT command starts showing results. So I believe that Oracle is done reading and organizing;
  2. I set arraysize to 5,000, which, according to Oracle's manual, seems to be highest possible value (but sure I tried other values such as 100 as well)--this is particularly odd, I tried tuning this parameter when using cx_Oracle, it is very effective.
  3. I also tried setting SDU to 65535.

Nothing seems working.

Is this the normal speed that I can expect from SQL*PLUS? (But I believe the answer is no, seems I tried using Python's cx_Oracle package which in turn, utilizing Oracle Instant Client to do the fetching, I can reach a speed of around 40MBytes/s, but that one has other issues which prevent us from using it at the moment...)

Thanks!

Tagged:
«13

Answers

  • mamsds
    mamsds Member Posts: 16 Green Ribbon

    Hi John, first the question should be revised. I actually have more discussion with other people on this link:

    After revising my SQL statement and the SQL*PLUS script, I can get 50 MBytes/s.

    Going back to your points:

    1. What is the data type of the "A_Table"."Date" column: It is "Date". Regarding why I compare it with a string, I am actually not sure, but by construct the SQL this way it works and the results are correct...I guess I will revisit this issue later;
    2. are you certain that it is being used: I am pretty certain. As I said, when I execute the SQL statement in SQL*PLUS, I can see the Oracle DB instance immediately read from hard drive at a speed of 300-500 MBytes/s. The Oracle DB instance also uses a lot of CPU resources. After a few seconds, both hard drive I/O and CPU usage drops to nearly zero. So my conclusion is, Oracle DB can finish reading and organizing the data needed to be sent very quickly, meaning that index is being used;
    3. You have table and column names in mixed case: LOL I would say this is a mistake. Do you mean that in Oracle, usually we use uppercase only?


  • mamsds
    mamsds Member Posts: 16 Green Ribbon
    edited Sep 12, 2021 9:53AM

    If you dont like my question please skip it.

    If you think I am ridiculous please skip the post.

  • Jonathan Lewis
    Jonathan Lewis Member Posts: 9,756 Gold Crown
    edited Sep 12, 2021 11:24AM

    I see Andy Sayer suggested the "-fast" option for SQL*Plus to improve performance, and that improved things considerably.

    To a very large extent the way that SQL*Plus handles the data - particularly for formatting - makes a big difference to performce. You said you'd set the arraysize to 5,000, how many of the other settings did you adjust. Historically people had to include several others for best effect, e.g.

    set heading off
    set linesize 140
    set pagesize 0
    set termout off
    set trimspool on
    set tab on
    set trimout on
    

    There are a couple of others that were also used fairly frequently, but I can't remember what they were. Nowadays if you want a flat text, space saving (usually), output it is more common simply to do:

    set markup on csv
    

    to get comma separated, quote delimited text with none of the excess fix-width spacing. (I wrote a note about this a few months ago.)


    Given the significant improvement you saw, though, I suspect that your table has columns defined (not necessarily populated) as character strings of lengths greater than 4000 (or 2,000 if you're using a multi-byte character set for your database), and that means Oracle would be storing them as LOBs. If that's the case then the improvement comes from the way that "fast" includes the changing the option:

    set lobprefetch 16384
    

    The presence of LOBs in the select list means Oracle will be doing single row fetches - so one round trip per row; but it also means the volume of traffic can be MUCH larger than the actual content of the LOB columns. (In your version of Oracle the number of round trips doesn't seem to reflect the extra messages due to fetching the LOBS after fetching each row - if they're short values that didn't need to be defined as length 4000/2000).

    Simple test:

    run the script (twice) but "set autotrace traceonly statistics" before doing so and you'll get output like:

    Statistics
    ----------------------------------------------------------                     
             8 recursive calls                                                   
             0 db block gets                                                     
         19166 consistent gets                                                   
             0 physical reads                                                    
             0 redo size                                                         
       2754679 bytes sent via SQL*Net to client                                  
       2905564 bytes received via SQL*Net from client                            
          5002 SQL*Net roundtrips to/from client                                 
             0 sorts (memory)                                                    
             0 sorts (disk)                                                      
          5000 rows processed                                                    
    

    The set above is when I selected 2 lobs with the default lobprefetch = 0

    The set below is setting it to 128 (which was enough to hold both LOB values in my case):

    Statistics
    ----------------------------------------------------------                     
             0 recursive calls                                                   
             0 db block gets                                                     
          5023 consistent gets                                                   
             0 physical reads                                                    
             0 redo size                                                         
       2794679 bytes sent via SQL*Net to client                                  
         55564 bytes received via SQL*Net from client                            
          5002 SQL*Net roundtrips to/from client                                 
             0 sorts (memory)                                                    
             0 sorts (disk)                                                      
          5000 rows processed 
    

    Note that the round trips doesn't change but the consistent gets does, as does the bytes received from client.

    Regards

    Jonathan Lewis

    Mohamed Houri
  • mamsds
    mamsds Member Posts: 16 Green Ribbon
    edited Sep 12, 2021 10:36AM

    Hi Jonathan,

    First I think formatting is not an issue, since I already turned off all stdout. This is my commands:

    SQL> set arraysize 10000
    
    SQL> set colsep ,
    
    SQL> set headsep off
    
    SQL> set pagesize 0
    
    SQL> set trimspool on
    
    SQL> set termout off
    
    SQL> set echo off
    
    SQL> @test_script.sql
    

    the content of test_script.sql is:

    spool /tmp/data.csv
    
    SELECT "Date", "ID", "KeyName", "ValueName"
    
    FROM "A_Table"
    
    WHERE ("A_Table"."Date" >= '07-SEP-20' AND "A_Table"."Date" <= '07-SEP-21');
    
    spool off
    

    I think the definition of my table is fine, the following is the excerpt from its definition

    "Date" DATE NOT NULL ENABLE, 
    "ID" VARCHAR2(9 CHAR) NOT NULL ENABLE, 
    "KeyName" VARCHAR2(53 CHAR) NOT NULL ENABLE, 
    "ValueName" FLOAT(126) NOT NULL ENABLE, 
    

    But getting the `statistics` you showed looks very useful. May I ask how exactly I can get them? By adding this line:

    set autotrace traceonly statistics
    

    I get an error:

    SQL> set autotrace traceonly statistics
    SP2-0618: Cannot find the Session Identifier.  Check PLUSTRACE role is enabled
    SP2-0611: Error enabling STATISTICS report
    

    I found this link: https://www.toolbox.com/tech/oracle/question/sp2-0618-cannot-find-the-session-identifier-check-plustrace-role-is-enabled-051204/, does it mean that I need to ask the DBA of my firm to GRANT the privilege to my current account to make this command work?

    Thanks!

    Also, actually using SQL*PLUS is just the tip of the iceberg ... our real problem starts here...you may also take a look if you are interested...

    https://dba.stackexchange.com/questions/299444/possible-memory-leak-of-cx-oracle-pyodbc-oracle-instant-client#299444

  • mamsds
    mamsds Member Posts: 16 Green Ribbon

    Hi Jonathan,

    First I think formatting is not an issue, since I already turned off all stdout. These are my commands:

    SQL> set arraysize 10000
    
    SQL> set colsep ,
    
    SQL> set headsep off
    
    SQL> set pagesize 0
    
    SQL> set trimspool on
    
    SQL> set termout off
    
    SQL> set echo off
    
    SQL> @test_script.sql
    

    the content of test_script.sql is:

    spool /tmp/exp-plus.csv
    
    SELECT "Date", "ID", "KeyName", "ValueName"
    
    FROM "A_Table"
    
    WHERE ("A_Table"."Date" >= '07-SEP-20' AND "A_Table"."Date" <= '07-SEP-21');
    
    spool off
    

    I think the definition of my table is fine, the following is the excerpt from its definition

    "Date" DATE NOT NULL ENABLE, 
    "ID" VARCHAR2(9 CHAR) NOT NULL ENABLE, 
    "KeyName" VARCHAR2(53 CHAR) NOT NULL ENABLE, 
    "ValueName" FLOAT(126) NOT NULL ENABLE, 
    

    But getting the `statistics` you showed looks very useful. May I ask how exactly I can get them? By adding this line:

    set autotrace traceonly statistics
    

    I get an error:

    SQL> set autotrace traceonly statistics
    SP2-0618: Cannot find the Session Identifier.  Check PLUSTRACE role is enabled
    SP2-0611: Error enabling STATISTICS report
    

    I found this link: https://www.toolbox.com/tech/oracle/question/sp2-0618-cannot-find-the-session-identifier-check-plustrace-role-is-enabled-051204/, does it mean that I need to ask the DBA of my firm to GRANT the privilege to my current account to make this command work?

    Thanks!

    Also, actually using SQL*PLUS is just the tip of the iceberg ... our real problem starts here...you may also take a look if you are interested...

    https://dba.stackexchange.com/questions/299444/possible-memory-leak-of-cx-oracle-pyodbc-oracle-instant-client#299444

  • mamsds
    mamsds Member Posts: 16 Green Ribbon

    Hi Jonathan,

    First I think formatting is not an issue, since I already turned off all stdout. These are my commands:

    SQL> set arraysize 10000
    
    SQL> set colsep ,
    
    SQL> set headsep off
    
    SQL> set pagesize 0
    
    SQL> set trimspool on
    
    SQL> set termout off
    
    SQL> set echo off
    
    SQL> @test_script.sql
    

    the content of test_script.sql is:

    spool /tmp/exp-plus.csv
    
    SELECT "Date", "ID", "KeyName", "ValueName"
    
    FROM "A_Table"
    
    WHERE ("A_Table"."Date" >= '07-SEP-20' AND "A_Table"."Date" <= '07-SEP-21');
    
    spool off
    

    I think the definition of my table is fine, the following is the excerpt from its definition

    "Date" DATE NOT NULL ENABLE, 
    "ID" VARCHAR2(9 CHAR) NOT NULL ENABLE, 
    "KeyName" VARCHAR2(53 CHAR) NOT NULL ENABLE, 
    "ValueName" FLOAT(126) NOT NULL ENABLE, 
    

    But getting the `statistics` you showed looks very useful. May I ask how exactly I can get them? By adding this line:

    set autotrace traceonly statistics
    

    I get an error:

    SQL> set autotrace traceonly statistics
    SP2-0618: Cannot find the Session Identifier.  Check PLUSTRACE role is enabled
    SP2-0611: Error enabling STATISTICS report
    

    I found this link: https://www.toolbox.com/tech/oracle/question/sp2-0618-cannot-find-the-session-identifier-check-plustrace-role-is-enabled-051204/, does it mean that I need to ask the DBA of my firm to GRANT the privilege to my current account to make this command work?

    Thanks!

    Also, actually using SQL*PLUS is just the tip of the iceberg ... our real problem starts here...you may also take a look if you are interested...

    https://dba.stackexchange.com/questions/299444/possible-memory-leak-of-cx-oracle-pyodbc-oracle-instant-client#299444

  • Jonathan Lewis
    Jonathan Lewis Member Posts: 9,756 Gold Crown
    edited Sep 12, 2021 3:42PM

    A couple of general points:

    As far as I can tell from sql*plus 21.3, the "-fast" option makes only the following changes with a SET command:

    linesize 100
    lobprefetch 16384
    pagesize 50000
    rowprefetch 2
    statementcache is 20
    
    

    The linesize setting might help in your case since you don't explicitly set a linesize (according to your last note) and a single row of your output is likely to be (by default) 9 + 9 + 53 + 10 + 3(commas) = 84 characters which exceeds a single line. But I wouldn't expect that to make a big difference to your timing

    You code overrides the pagesize, your not using lobs, so that leaves only rowprefetch and statementcache. The latter should be irrelevant (unless you're using a view, perhaps and one of your columns is actually the result of a function call with constructed SQL) - so that looks like rowprefetch is critical.

    I would check each setting in turn to see what it is that makes the big difference. Based on your description so far it's possible that there's some incompatibility between the 19.4 server and the 21.3 client (perhaps in the connection mechanism) that gets bypassed or overridden by the prefetch

    You will have to get the DBA to grant the plustrace role to you, unless you have access to v$statname and one of v$mystat or v$sesstat in which you can log on, run your query, then execute:

    select 
           sn.name, ss.value 
    from 
           v$statname sn, v$sesstat ss
    where
           ss.sid = sys_context('userenv','sid') 
    and    sn.statistic# = ss.statistic#
    and    ss.value != 0
    /
    

    if you can access v$mystat instead of v$sesstat you don't need the predicate on SID.

    Regards

    Jonathan Lewis

  • Jonathan Lewis
    Jonathan Lewis Member Posts: 9,756 Gold Crown

    If you have access to v$session_Event you could also run the following (again after logging on and running your script):column event format a32

    set linesize 156
    
    select
            event, total_waits, total_timeouts, time_waited
    from
            v$session_event se
    where
            se.sid = sys_context('userenv','sid')
    order by
            time_waited desc
    /
    
    

    Regards

    Jonathan Lewis

  • EdStevens
    EdStevens Member Posts: 28,484 Gold Crown

    What is the data type of the "A_Table"."Date" column: It is "Date". Regarding why I compare it with a string, I am actually not sure, but by construct the SQL this way it works and the results are correct...I guess I will revisit this issue later;

    It "works" by pure accident. It is forcing an implied TO_DATE function to convert those date literal strings into actual DATE (an internal, binary structure) to compare to the column that is (correctly) defined as a DATE. The only reason it "works" is because the controlling setting of NLS_DATE_FORMAT just happens to match the format of the string literals.

  • EdStevens
    EdStevens Member Posts: 28,484 Gold Crown

    You have table and column names in mixed case: LOL I would say this is a mistake. Do you mean that in Oracle, usually we use uppercase only?

    In oracle, the default is to treat object names as case INsensitive. They are stored in the data dictionary in upper-case, but as long as you do not enclose the names in double-quotes, all references are treated case INsensitive. If you create an object by enclosing its name in double quotes:

    create table "MyTestTable"
        ("FirstName" varchar2(10),
         "LastName" varchar2(10),
         dob  date
         )
    

    Then they will be created in the data dictionary in exactly the case specified, and all future references will have to also be enclosed in double-quotes and exactly the case specified at creation.

    In the above example, the table name and first two column names are case sensitive. The column 'dob' is not enlcolsed in double-quotes, so is case INsensitive, and may be referred to as dob , DOB, Dob, etc.

    Because of this, in oracle the normal practice is to NOT enclose names in double-quotes, and instead of using mixed-case for readability, use underscores:


    create table my_test_table
        (first_name varchar2(10),
         last_name varchar2(10),
         dob date
         )