1 2 3 Previous Next 30 Replies Latest reply: Aug 30, 2010 8:43 AM by Charles Hooper RSS

    network performance

    734766
      We have a little problem with the connection to the Oracle database we have in production. It turns out that in any transactions through the default Oracle port (1521) and over the internet (I mean, the server and the client are always in two different networks) gives a performance well below the expected connection (about 1Mb, symmetrical), but there is no problem if we transact from within the same network. Query seems to be ok, proccessing time is normal and the problem happens when any transaction is done: querying, compiling, retrieving, uploading code, etc.
      For example, a query on a heavy records table that retrieves about 700 KB of information in 500 records takes us 3:31 minutes, which gives an average speed of downloading about 3-4 KB/s, though we have upload bandwidth in abundance to the server and more than enough download bandwith in the customer's network. With this query specifically happens that the reception / transmission of data takes place at regular intervals, receiving a chunk of around 20 KB and waiting about 5 seconds to receive new data. It does not seem to happen in other queries but we continue to detect average speeds of 4-5 KB/s. We could not verify whether the traffic managed through the graphical application developed in Oracle Forms has the same problem, I have been told that the protocol is not the same (http in this case, I guess) so it can not be compared. Does this matter? We use Toad and SQL developer as clients, both get the same results and the same performance, the query time is nailed to the second in both. For ODBC connections seems to take even longer, but that I do understand, It is an obsolete way.

      We do not know if this might be due to an incorrectly configured parameter in client or server, or is a limitation of software, hardware or line, but in the housing where the server we insist that they do not limit any traffic or anything, and our DBA insists that this is normal, we can not ask more to the system. I think it should go faster, I know for sure that there is a lot of idle bandwith. Any help or idea will be appreciated.

      Oracle Database 10g Release 10.2.0.3.0 - 64bit Production, installed on a Windows Server 2003 R2

      Thank you very much in advance!
        • 1. Re: network performance
          Satish Kandi
          IMO, the first step should be take a level 12 10046 event trace for the query under consideration and confirm that the network is the problem.

          Then you can take necessary steps in consultation with your system/network administrator.
          • 2. Re: network performance
            sybrand_b
            Apart from the 10046 level 12 trace, which should establish whether you are using array fetch, the results of a dos ping, and especially a dos tracert are required.
            You probably need to travel many hops, and some of them are slow.

            20K is the default Session Data Unit or SDU of sqlnet.

            ------------
            Sybrand Bakker
            Senior Oracle DBA
            • 3. Re: network performance
              734766
              Hi! Thanks for your interest.

              The tracert seems normal:

              Traza a la dirección www.75.101.212.in-addr.arpa [212.101.75.188]
              sobre un máximo de 30 saltos:

              1 * * * Tiempo de espera agotado para esta solicitud.
              2 * * * Tiempo de espera agotado para esta solicitud.
              3 69 ms 69 ms 69 ms 193-238-52-2.static.voztelecom.net [193.238.52.2]
              4 69 ms 70 ms 70 ms gi-0-0-0-10.rt-es-07.voztelecom.net [193.22.119.51]
              5 70 ms 72 ms 70 ms ae0-35.mad44.ip4.tinet.net [213.200.71.133]
              6 69 ms 70 ms 73 ms as174.ip4.tinet.net [77.67.72.126]
              7 70 ms 70 ms 70 ms te2-4.mpd02.mad05.atlas.cogentco.com [130.117.3.117]
              8 82 ms 88 ms 83 ms te3-1.mpd01.mad04.atlas.cogentco.com [130.117.2.70]
              9 97 ms 102 ms 99 ms 149.6.151.22
              10 98 ms 97 ms 101 ms 212.101.79.89
              11 97 ms 98 ms 98 ms 212.101.79.77
              12 99 ms 98 ms 98 ms 192.168.35.3
              13 102 ms 98 ms 97 ms 192.168.36.3
              14 98 ms 97 ms 97 ms www.75.101.212.in-addr.arpa [212.101.75.188]

              Traza completa.

              And I'm sorry but, ¿What exactly do you mean with a level 12 10046 event trace? I did an autotrace, ¿is this something like that, but shorter?

              show parameter user_dump_dest

              show parameter optimizer

              show parameter db_file_multi

              show parameter db_block_size

              show parameter cursor_sharing

              column sname format a20
              column pname format a20
              column pval2 format a20

              select
              sname
              , pname
              , pval1
              , pval2
              from
              sys.aux_stats$;

              13 filas seleccionadas.

              Transcurrido: 00:00:00.96

              Estadísticas
              ----------------------------------------------------------
              270 recursive calls
              0 db block gets
              48 consistent gets
              17 physical reads
              0 redo size
              764 bytes sent via SQL*Net to client
              333 bytes received via SQL*Net from client
              2 SQL*Net roundtrips to/from client
              5 sorts (memory)
              0 sorts (disk)
              13 rows processed


              explain plan for
              -- put your statement here
              select * from cat_m_catalogo where rownum < 501;

              Explicado.

              Transcurrido: 00:00:00.31

              select * from table(dbms_xplan.display);

              14 filas seleccionadas.

              Transcurrido: 00:00:01.48

              Estadísticas
              ----------------------------------------------------------
              2749 recursive calls
              22 db block gets
              1697 consistent gets
              144 physical reads
              0 redo size
              1209 bytes sent via SQL*Net to client
              333 bytes received via SQL*Net from client
              2 SQL*Net roundtrips to/from client
              118 sorts (memory)
              0 sorts (disk)
              14 rows processed


              rollback;

              Rollback terminado.

              Transcurrido: 00:00:00.20

              rem Set the ARRAYSIZE according to your application
              set autotrace traceonly arraysize 100

              alter session set tracefile_identifier = 'mytrace1';

              Sesión modificada.

              Transcurrido: 00:00:00.18

              rem if you're using bind variables
              rem define them here

              rem variable b_var1 number
              rem variable b_var2 varchar2(20)

              rem and initialize them

              rem exec :b_var1 := 1
              rem exec :b_var2 := 'DIAG'

              alter session set events '10046 trace name context forever, level 8';

              Sesión modificada.

              Transcurrido: 00:00:00.20

              -- put your statement here
              select * from cat_m_catalogo where rownum < 501;

              500 filas seleccionadas.

              Transcurrido: 00:03:16.71

              Plan de Ejecución
              ----------------------------------------------------------
              Plan hash value: 217263990

              -------------------------------------------------------------------------------------
              | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
              -------------------------------------------------------------------------------------
              | 0 | SELECT STATEMENT | | 500 | 498K| 19 (0)| 00:00:01 |
              |* 1 | COUNT STOPKEY | | | | | |
              | 2 | TABLE ACCESS FULL| CAT_M_CATALOGO | 500 | 498K| 19 (0)| 00:00:01 |
              -------------------------------------------------------------------------------------

              Predicate Information (identified by operation id):
              ---------------------------------------------------

              1 - filter(ROWNUM<501)


              Estadísticas
              ----------------------------------------------------------
              17 recursive calls
              0 db block gets
              129 consistent gets
              85 physical reads
              0 redo size
              751757 bytes sent via SQL*Net to client
              377 bytes received via SQL*Net from client
              6 SQL*Net roundtrips to/from client
              0 sorts (memory)
              0 sorts (disk)
              500 rows processed


              disconnect
              Desconectado de Oracle Database 10g Release 10.2.0.3.0 - 64bit Production

              spool off

              Thanks!
              • 4. Re: network performance
                Richard Harrison .
                Hi,
                I've experienced exactly the same issue a while back using a CRM application where access from network was appalingly slow compared to another. In the end it turned out it was the way the network switches were configured.

                in the working network everything was set to 100MB full duplex

                in the 'bad' network one end was set to 100MB and the other was set to autonegotiate and this was the cause of all the problems.

                I'd get your netowork guys to double check all the network settings and make sure they are set explicitly and not autonegotiate.

                Hope this helps!

                Cheers,
                Harry
                • 5. Re: network performance
                  sybrand_b
                  The tracert results definitely aren't normal.
                  You have timeouts in the first two hops!!!!
                  The other hops take too long (beneath 30ms is still acceptable)
                  There is something seriously wrong in your network.

                  You need issue the following

                  alter session set events='10046 trace name context forever, level 12';

                  This will result in a trace file with extra information with respect to 'what it is waiting for'
                  You will see whether those sqlnet roundtrips are too long.
                  Also you will see whether sqlnet fragments your output ('sqlnet more data to client' messages will appear)

                  The trace file will be in user_dump_dest.

                  This disables it

                  alter session set events='10046 trace name context off';

                  Hth

                  --------------------
                  Sybrand Bakker
                  Senior Oracle DBA
                  • 6. Re: network performance
                    734766
                    Hi sybrand_b!

                    Ok, the two first hops are normal because our switches and firewalls are configured to not respond to that requests, shouldn't be a problem.

                    And the trace file in user dump directory is big, around 79 KB, ¿what should I do with it? I have the spool from the execution but it is pretty much the same info I posted earlier.

                    I spoke with network guys because I saw autonegotiated in the switch we have in our network, waiting answer from the server network guys. I hope that'd solve the problem!!!

                    Thank you all, I will put helpful an correct answers soon!
                    • 7. Re: network performance
                      sybrand_b
                      You should both tkprof the trace file and read the raw trace file.
                      It will show you how long individual roundtrips take
                      (sqlnet message to client followed by sqlnet message from client)
                      It will show you whether you are using array fetches (as each array feth is followed by a round trip this is extreemly important)
                      You can see array fetch is not used when you have 'r=1' in the raw trace file everywhere.

                      etc,etc

                      -------------
                      Sybrand Bakker
                      Senior Oracle DBA
                      • 8. Re: network performance
                        734766
                        Ok, I've got the results:
                        select * 
                        from
                         cat_m_catalogo where rownum < 501
                        
                        
                        call     count       cpu    elapsed       disk      query    current        rows
                        ------- ------  -------- ---------- ---------- ---------- ----------  ----------
                        Parse        1      0.00       0.00          0          0          0           0
                        Execute      1      0.00       0.00          0          0          0           0
                        Fetch        6      0.01     185.38        114        125          0         500
                        ------- ------  -------- ---------- ---------- ---------- ----------  ----------
                        total        8      0.01     185.38        114        125          0         500
                        
                        Misses in library cache during parse: 0
                        Optimizer mode: ALL_ROWS
                        Parsing user id: 861  
                        
                        Rows     Row Source Operation
                        -------  ---------------------------------------------------
                            500  COUNT STOPKEY (cr=125 pr=114 pw=0 time=13437 us)
                            500   TABLE ACCESS FULL CAT_M_CATALOGO (cr=125 pr=114 pw=0 time=12931 us)
                        
                        
                        Elapsed times include waiting on following events:
                          Event waited on                             Times   Max. Wait  Total Waited
                          ----------------------------------------   Waited  ----------  ------------
                          SQL*Net message to client                       6        0.00          0.00
                          db file sequential read                         1        0.00          0.00
                          db file scattered read                         11        0.01          0.03
                          SQL*Net more data to client                   373        5.13        185.32
                          SQL*Net message from client                     6        5.02         10.50
                        ********************************************************************************
                        ¿WTF is SQL*Net more data to client? ¿What do you think of the results? To me looks normal except for the SQL*Net more data to client problem...

                        Thanks!
                        • 9. Re: network performance
                          user503699
                          user11098377 wrote:
                          ¿WTF is SQL*Net more data to client? ¿What do you think of the results? To me looks normal except for the SQL*Net more data to client problem...

                          Thanks!
                          You may want to read this to see if it helps. Basically, it is inline with what others have said above.
                          SQL*Net message to client vs SQL*Net more data to client
                          • 10. Re: network performance
                            734766
                            Thanks for the info. I've taken this from that page:
                            I’ll reiterate that both SQL*Net message to client and SQL*Net more data to client waits only record the time it took to write the return data from Oracle’s userland SDU buffer to OS kernel-land TCP socket buffer. Thus the wait times of only microseconds. Thanks to that, all of the time a TCP packet spent “flying” towards the client is actually accounted in SQL*Net message from client wait statistic. The problem here is though, that we don’t know how much of this time was spent on the wire and how much of it was application think time.
                            I think the 90% of that wait time (or even more) is "flying time", I doubt it takes so long in writing data to the socket. ¿Am I right? That is my problem, I cannot demonstrate to my DBA (I know this sounds odd, bu trust me, my DBA is a stubborn man) that the problem has a solution and which one. He wants to develop an online application or so, but I think that is a config problem. Another guy who develops DB code for us years ago says that this didn't happen (moths or years) before, so there has to be a solution.

                            Thank you very much!
                            • 11. Re: network performance
                              user503699
                              user11098377 wrote:
                              Thanks for the info. I've taken this from that page:
                              Did you read the comments on that blog post? One of the person actually discusses a similar problem and concludes by mentioning how the problem was with "network" and not database. Yours may not be the same situation but it is probably sufficient to prove that the problem is "outside database", if that is what you are looking for.
                              • 12. Re: network performance
                                sb92075
                                so there has to be a solution.
                                Start with a "big" (1GB) file
                                How long does it take to do network copy to DB Server system from system inside firewall?
                                How long does it take to do network copy to DB Server system from system outside firewall?

                                This "test" network capacity & excludes Oracle totally.
                                • 13. Re: network performance
                                  Charles Hooper
                                  user11098377 wrote:
                                  Ok, I've got the results:
                                  select * 
                                  from
                                  cat_m_catalogo where rownum < 501
                                  
                                  
                                  call     count       cpu    elapsed       disk      query    current        rows
                                  ------- ------  -------- ---------- ---------- ---------- ----------  ----------
                                  Parse        1      0.00       0.00          0          0          0           0
                                  Execute      1      0.00       0.00          0          0          0           0
                                  Fetch        6      0.01     185.38        114        125          0         500
                                  ------- ------  -------- ---------- ---------- ---------- ----------  ----------
                                  total        8      0.01     185.38        114        125          0         500
                                  
                                  Misses in library cache during parse: 0
                                  Optimizer mode: ALL_ROWS
                                  Parsing user id: 861  
                                  
                                  Rows     Row Source Operation
                                  -------  ---------------------------------------------------
                                  500  COUNT STOPKEY (cr=125 pr=114 pw=0 time=13437 us)
                                  500   TABLE ACCESS FULL CAT_M_CATALOGO (cr=125 pr=114 pw=0 time=12931 us)
                                  
                                  
                                  Elapsed times include waiting on following events:
                                  Event waited on                             Times   Max. Wait  Total Waited
                                  ----------------------------------------   Waited  ----------  ------------
                                  SQL*Net message to client                       6        0.00          0.00
                                  db file sequential read                         1        0.00          0.00
                                  db file scattered read                         11        0.01          0.03
                                  SQL*Net more data to client                   373        5.13        185.32
                                  SQL*Net message from client                     6        5.02         10.50
                                  ********************************************************************************
                                  ¿WTF is SQL*Net more data to client? ¿What do you think of the results? To me looks normal except for the SQL*Net more data to client problem...

                                  Thanks!
                                  You are headed down the right path - you found that the database server is probably not responsible for the slow performance. Next, on the client side start analyzing the network traffic using a program such as the free Wireshark program:
                                  http://www.wireshark.org/download.html

                                  You can read a six part series on how to use Wireshark and other utilities to analyze Oracle data transfers here:
                                  http://hoopercharles.wordpress.com/2009/12/15/network-monitoring-experimentations-1/

                                  The relatively high latency of the connection should not be a problem as long as the row lengths are reasonably short - it looks like the array fetch is set to 100. I would check the Wireshark log for dropped and/or retransmitted packets. But also pay attention to the relative time delta between packets.

                                  Charles Hooper
                                  Co-author of "Expert Oracle Practices: Oracle Database Administration from the Oak Table"
                                  http://hoopercharles.wordpress.com/
                                  IT Manager/Oracle DBA
                                  K&M Machine-Fabricating, Inc.
                                  • 14. Re: network performance
                                    734766
                                    Hi!

                                    I think that the problem should be outside database because transactions between servers in the same network is perfectly normal. That's because I found helpful the answer from harry76, it fits. But I have to demonstrate that its a network problem and that it has a solution. The not Oracle related requests to this server use the full bandwith.

                                    Thanks!
                                    1 2 3 Previous Next