4 Replies Latest reply: Jun 6, 2013 9:31 AM by ThAn RSS

    Question regarding External Table Feature

    ThAn
      Hi,

      i am playing around with external table feature and i managed to modify the example to match one of our data model's table.
      My modified example works, i can query the data from within sqlplus or sql developer.

      I populated my store with about 2 Million records.

      The results of a query - without a where clause - are returned relatively fast. But when it comes to a where clause, the
      performance - especially when i query records which have one part of the major key in common (there are around 1 mio
      "records" which share the first part of the major key, second part is different) performance degrades. In this case the
      result of the query is presented after about 45 minutes.

      In addition i wrote a small jsp app running on tomcat, which queries the same kvstore - without using the external table
      feature. Running the same query as mentioned in external table sample, it takes "only" let's say around 5 minutes to return
      the same amount of data.

      So i am asking myself some questions:
      - is there a way to make the external table feature faster (i already tried to improve performance by using additional location files,
      but without performance improvement)?
      - can i pass in some parameters into the Formatter class? If so, how? (I'm thinking about setting additional filters for major key)
      - is the the external table - in general - feasible for querying large amounts of data?

      I know that i cannot expect "miracles" when talking about performance, when having a look at my configuration (i run a
      kvstore on 2 storage nodes (physically different machines) with limited amount of RAM (assigned 2GB to each storage node
      when running makebootconfig), but i am wondering about the different results when using either external table feature or
      direct access to kvstore.

      Any hints are very welcome.

      Best regards
      Thomas
        • 1. Re: Question regarding External Table Feature
          Charles Lamb
          Hi Thomas,

          The query predicate (i.e. the WHERE clause) does not get passed to NoSQL Database. That's a limitation of Oracle RDBMS external table feature, not Oracle NoSQL Database. You might consider using the subkey facility in the NoSQL Database External Table integration.

          In terms of more parallelism, the parameters are finicky so you need to poke at it a bit. Take a look at

          http://docs.oracle.com/cd/NOSQL/html/javadoc/oracle/kv/exttab/package-summary.html

          Be sure to enable parallel query processing using these three commands:

          ALTER SESSION ENABLE PARALLEL DDL;
          ALTER SESSION ENABLE PARALLEL DML;
          ALTER SESSION ENABLE PARALLEL QUERY;

          as well as adding a PARALLEL nn directive in the External Table CREATE TABLE declaration, where nn is the number of LOCATION files you specify. We have observed very good scaling when the parallelism is invoked. You could do something as simple as top to see if there are parallel threads running on the ORA side of the house when you do the SELECT.

          In terms of passing in parameters to a formatter class, it's not pretty but you could do it with -Dfoo=bar on the java invocation in the nosql_stream script.

          And yes, the external table feature is feasible for querying large amounts of data, but of course it's only taking a read-only snapshot.

          I hope this is helpful.

          Charles
          • 2. Re: Question regarding External Table Feature
            ThAn
            Hi Charles,

            many thanks for your detailed reply. But i think i did everything right when defining multiple location files - in fact i added
            the parallel statement in the create table script as well.
            I verified if it's working by having a look at the explain plan in toad (which shows if the query is running in parallel) and
            it told me that the query was running with 2 slave processes (i attached 2 location files to the table definition):

            IO Cost     CPU Cost     Cost     Plan
            +          16     SELECT STATEMENT ALL_ROWS                    +
            +4      4      4           4 PX COORDINATOR                +
            +3 16     3 913.933     3 16               3 PX SEND QC (RANDOM) PARALLEL_TO_SERIAL SYS.:TQ10000 :Q1000Bytes: 498.232 Cardinality: 82           +
            +2 16     2 913.933     2 16                    2 PX BLOCK ITERATOR PARALLEL_COMBINED_WITH_CHILD :Q1000Bytes: 498.232 Cardinality: 82      +
            +1 16     1 913.933     1 16                         1 EXTERNAL TABLE ACCESS FULL TABLE PARALLEL_COMBINED_WITH_PARENT SCHEMA.TABLE_NOSQL :Q1000 Filter Predicates: "TABLE_ID"='14' Bytes: 498.232 Cardinality: 82+

            Still the query takes quite some time to return...

            From your reply i think that the difference - in terms of performance - between the query on the external table and the jsp app comes
            from the direct access to the keys which the jsp app does (i use the table_id to build a part of the key and iterate through the results)
            whereas the query on the external table does not get passed the predicate and does a kind of "full table scan" on the store, right?

            Thanks and regards
            Thomas
            • 3. Re: Question regarding External Table Feature
              Charles Lamb
              You are correct that an external table query will do a full table scan. Does your jsp use storeIterator or is it doing gets? If so, you can pass the same parameters that you are passing to storeIterator in your jsp to the external table query. You would do this at publish-time.

              Charles
              • 4. Re: Question regarding External Table Feature
                ThAn
                Hi Charles,

                thanks again! Yes, i'm using storeIterator to loop through my results when querying my key value store. And i already defined two
                additional external tables which work on the same store, by creating separate config.xml files (whose Property oracle.kv.parentKey
                was extended by the part of the major key) and publishing those external table configurations.

                It works fine. I think I'll have to find out more by adding more storage nodes, regarding scaling...

                Regards
                Thomas

                Edited by: 907108 on Jun 6, 2013 4:31 PM