2 Replies. Latest reply: Feb 5, 2014 6:30 AM by tez

    IO performance

    tez

      Dear Oracle devs,

      could you please advise on the following issue:

       

      There is a BTREE database setup: one primary database and 4 secondaries.

      Each inserted record is 112 bytes.

      I apply the following configuration:

      1) env_flags:

      DB_CREATE     |  // Create the environment if it does not exist
          DB_RECOVER    |  // Run normal recovery
          DB_INIT_LOCK  |  // Initialize the locking subsystem
          DB_INIT_LOG   |  // Initialize the logging subsystem
          DB_INIT_TXN   |  // Initialize the transactional subsystem
                           // (this also turns on logging)
          DB_PRIVATE    |  // Back region memory with the heap, not the filesystem
          DB_INIT_MPOOL |  // Initialize the memory pool (in-memory cache)
          DB_THREAD;       // Make the environment handle free-threaded

       

      2) environment configuration:

      DbEnv* mEnv = new DbEnv(0);

      mEnv->set_flags(DB_TXN_NOSYNC, 1);        // do not flush the log on commit
          mEnv->set_flags(DB_TXN_WRITE_NOSYNC, 1);  // write, but do not sync, the log on commit
          mEnv->set_lg_bsize(2048000);              // ~2 MB in-memory log buffer
          mEnv->set_lk_max_lockers(10000);
          mEnv->set_lk_max_locks(30000);
          mEnv->set_lk_max_objects(30000);
          mEnv->set_tx_max(5000);                   // maximum concurrent transactions
          mEnv->set_cachesize(0, 512 * 1024 * 1024, 1);  // 512 MB cache in one region

       

       

      3) When I open any database I set the page size to 16 * 1024 bytes.
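      For context, here is a minimal sketch of how this environment and a database could be opened with the settings above; the home directory, file name, and open flags are illustrative assumptions, not taken from the original setup:

      #include <db_cxx.h>

      // Hypothetical open sequence using the flags and settings from 1) and 2).
      u_int32_t env_flags = DB_CREATE | DB_RECOVER | DB_INIT_LOCK | DB_INIT_LOG |
                            DB_INIT_TXN | DB_PRIVATE | DB_INIT_MPOOL | DB_THREAD;

      DbEnv* mEnv = new DbEnv(0);
      // ... set_flags / set_lg_bsize / set_cachesize calls as in 2) ...
      mEnv->open("/path/to/env_home", env_flags, 0);    // path is hypothetical

      Db* primary = new Db(mEnv, 0);
      primary->set_pagesize(16 * 1024);                 // must be set before Db::open
      primary->open(NULL, "messages.db", NULL, DB_BTREE,
                    DB_CREATE | DB_AUTO_COMMIT | DB_THREAD, 0);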

       

      When the application comes under load (~700-800 messages per second), I observe the following output:

       

      163302 records flushed in 10 seconds
      449035 records flushed in 35 seconds
      439752 records flushed in 34 seconds
      437240 records flushed in 31 seconds
      441989 records flushed in 35 seconds
      437614 records flushed in 43 seconds
      441424 records flushed in 47 seconds
      430009 records flushed in 77 seconds
      735798 records flushed in 247 seconds
      1203144 records flushed in 1093 seconds

       

      After the last line of output, the application drives the HDD at 100% load.

      Records are collected in an application-side cache and flushed into the DB once per minute. Starting from the penultimate line above, a flush cycle effectively never finishes.

       

      Could you please advise how to overcome this issue?

      Best regards, tez.

       

        • 1. Re: IO performance
          Bogdan Coman

          Hi Tez,

           

          It would be nice to know how many records you have, what the record size is, and whether you sort them in any way. Also, do you insert multiple records per transaction? Do you have readers too while performing this test?

           

          As for keeping the insertion rate constant, I suggest you look into using memp_trickle: DB_ENV->memp_trickle()
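          For illustration, a background trickle thread might look roughly like this; the 20% clean-page target and the 1-second interval are placeholder values, not tuned recommendations:

          #include <db_cxx.h>
          #include <chrono>
          #include <thread>

          // Hypothetical loop: ask the memory pool to keep at least 20% of
          // cache pages clean, so page eviction rarely blocks on writes.
          void trickle_loop(DbEnv* env, volatile bool* running)
          {
              while (*running) {
                  int nwrote = 0;
                  env->memp_trickle(20, &nwrote);  // 20% is a placeholder target
                  std::this_thread::sleep_for(std::chrono::seconds(1));
              }
          }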

           

          Thanks,

          Bogdan

          • 2. Re: IO performance
            tez

            Hi Bogdan, thank you for your questions.

             

            > It would be nice to know how many records you have

            Considering the following output:

            163302 records flushed in 10 seconds

            449035 records flushed in 35 seconds

            439752 records flushed in 34 seconds

            437240 records flushed in 31 seconds

            441989 records flushed in 35 seconds

            437614 records flushed in 43 seconds

            441424 records flushed in 47 seconds

            430009 records flushed in 77 seconds

             

            the total number of records is 3,240,365 in 8 minutes.

             

            > what's the record size

            const size_t ID_MAX_LEN = 24;

            struct Message
            {
                off_t    start;
                off_t    end;
                uint64_t id;
                char     m_id[ ID_MAX_LEN ];
                time_t   timestamp;
                uint32_t fhash;
                uint8_t  msgType;
                char     from[ ID_MAX_LEN ];
                char     to[ ID_MAX_LEN ];
            };

             

            So sizeof(Message)=112 bytes.
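            As a sanity check on that figure (my arithmetic, assuming a typical LP64 build with 8-byte off_t and time_t):

            // 2 * off_t (8) + uint64_t (8) + 3 * char[24] + time_t (8)
            //   + uint32_t (4) + uint8_t (1) = 109 bytes of fields,
            // padded up to 112 by the compiler's 8-byte struct alignment.
            static_assert(sizeof(Message) == 112, "expected record size");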

             

            >and if you sort them in any way

            I use the following DB schema:

            - one primary DB with the unique key Message::id

            - 4 secondary databases, indexed by m_id, timestamp, from, and to (the association is sketched below)

             

            DBTYPE is DB_BTREE
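            For reference, a rough sketch of how one of those secondaries might be associated with the primary; the key-extraction callback, file name, and open flags are hypothetical, and from_compare is the comparator shown further down:

            #include <db_cxx.h>

            // Hypothetical secondary-key callback: index by Message::from.
            int get_from_key(Db*, const Dbt*, const Dbt* data, Dbt* result)
            {
                const Message* msg = static_cast<const Message*>(data->get_data());
                result->set_data(const_cast<char*>(msg->from));
                result->set_size(ID_MAX_LEN);
                return 0;
            }

            Db* fromDb = new Db(mEnv, 0);
            fromDb->set_flags(DB_DUPSORT);          // many messages can share one sender
            fromDb->set_bt_compare(from_compare);   // must be set before open
            fromDb->open(NULL, "from.idx", NULL, DB_BTREE, DB_CREATE | DB_AUTO_COMMIT, 0);
            primary->associate(NULL, fromDb, get_from_key, 0);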

             

            Also, I use the following comparators on the secondary DBs (installed via set_bt_compare):

            m_id:

            int id_compare( Db*, const Dbt* appKey, const Dbt* dbKey )
            {
                // Copy the keys into NUL-terminated buffers; the extra byte guards
                // against keys that fill all ID_MAX_LEN bytes without a terminator.
                char appId[ ID_MAX_LEN + 1 ] = { 0 };
                memcpy( appId, appKey->get_data(), appKey->get_size() );

                char dbId[ ID_MAX_LEN + 1 ] = { 0 };
                memcpy( dbId, dbKey->get_data(), dbKey->get_size() );

                return strcmp( appId, dbId );
            }

            timestamp:

            int timestamp_compare( Db*, const Dbt* appKey, const Dbt* dbKey )
            {
                // Numeric (not lexicographic) comparison of time_t keys.
                time_t appTimestamp, dbTimestamp;
                memcpy( &appTimestamp, appKey->get_data(), appKey->get_size() );
                memcpy( &dbTimestamp, dbKey->get_data(), dbKey->get_size() );
                return appTimestamp > dbTimestamp ? 1 : ( appTimestamp == dbTimestamp ? 0 : -1 );
            }

             

            The from and to comparators are identical:

            int from_compare( Db*, const Dbt* appKey, const Dbt* dbKey )
            {
                // Same pattern as id_compare: NUL-terminated copies, then strcmp.
                char appFrom[ ID_MAX_LEN + 1 ] = { 0 };
                memcpy( appFrom, appKey->get_data(), appKey->get_size() );

                char dbFrom[ ID_MAX_LEN + 1 ] = { 0 };
                memcpy( dbFrom, dbKey->get_data(), dbKey->get_size() );

                return strcmp( appFrom, dbFrom );
            }

             

            > Also, do you insert multiple records per transaction?

            I create a transaction, open a cursor under that transaction, and put 1000 records per transaction (roughly as sketched below).

            During one flush iteration I have to put about 400,000-450,000 messages, i.e. about 400-450 transactions per flush.
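            A rough sketch of that flush loop as described above (the function shape and the absence of error handling are simplifying assumptions):

            #include <db_cxx.h>
            #include <algorithm>
            #include <vector>

            // Hypothetical flush: 1000 cursor puts per transaction.
            void flush_batch(DbEnv* env, Db* primary, const std::vector<Message>& batch)
            {
                for (size_t i = 0; i < batch.size(); i += 1000) {
                    DbTxn* txn = NULL;
                    env->txn_begin(NULL, &txn, 0);

                    Dbc* cursor = NULL;
                    primary->cursor(txn, &cursor, 0);

                    const size_t end = std::min(i + 1000, batch.size());
                    for (size_t j = i; j < end; ++j) {
                        const Message& m = batch[j];
                        Dbt key(const_cast<uint64_t*>(&m.id), sizeof(m.id));
                        Dbt data(const_cast<Message*>(&m), sizeof(Message));
                        cursor->put(&key, &data, DB_KEYFIRST);
                    }

                    cursor->close();  // cursors must be closed before commit
                    txn->commit(0);
                }
            }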

             

            > Do you have readers too while performing this test?

            No.

             

            A gdb backtrace shows that, once the 100% HDD load appears, the flushing thread spends 99% of its time blocked inside an fdatasync call.

            I tried varying the page size from 16 KB up to 64 KB, but it had no effect.

            Bulk insert also did not help.

            Increasing the cache size (DbEnv::set_cachesize) solved the issue for some time. Once the cache filled up, the HDD load came back.

             

            I would appreciate any advice.

            Sincerely, tez.