3 Replies Latest reply on Sep 11, 2015 5:55 AM by dmcmahon-Oracle

    Performance implications for IS JSON constraint Oracle 12?

    Carol E.

      I have learned there is JSON support in Oracle 12.  Are there any performance implications (overhead) to using the IS JSON constraint?

        • 1. Re: Performance implications for IS JSON constraint Oracle 12?
          Kiran Pawar

Hi Carol E.,

Carol E. wrote:


          I have learned there is JSON support in Oracle 12.  Are there any performance implications (overhead) to using the IS JSON constraint?

     In Oracle DB 12c, the IS JSON constraint can be applied to columns of type VARCHAR2 / CLOB, so the usual performance considerations for those data types also apply to a column with an IS JSON constraint.

     That said, I would say the IS JSON constraint helps you get better performance out of VARCHAR2 / CLOB columns in practice, as there are ample SQL / PL/SQL APIs provided with Oracle DB 12c, other Oracle products (ORDS / Oracle APEX), and third-party PL/SQL libraries (viz. PL/JSON) that can query JSON data faster than is possible against plain VARCHAR2 / CLOB columns.
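     For instance, a minimal sketch (the table and column names here are illustrative) of the 12c SQL/JSON functions pulling values straight out of such a column:

       -- ORDERS is a hypothetical table whose DOC column carries an
       -- IS JSON check constraint.
       SELECT JSON_VALUE(o.doc, '$.customer.name') AS customer_name
         FROM orders o
        WHERE JSON_EXISTS(o.doc, '$.customer');

       -- The simplified dot notation also becomes available, but only
       -- when the column has an IS JSON check constraint:
       SELECT o.doc.customer.name FROM orders o;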


          Regards,

          Kiran

          • 2. Re: Performance implications for IS JSON constraint Oracle 12?
            Marco Gralike

            ???...

             

            Current storage options for JSON:

• (N)VARCHAR2
• (N)CLOB
            • BLOB
            • BFILE

             

            For example:

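A minimal sketch (table and column names are illustrative), applying IS JSON across the character and binary storage types:

  CREATE TABLE json_storage_demo
    ( v_doc VARCHAR2(4000) CONSTRAINT v_is_json CHECK (v_doc IS JSON),
      c_doc CLOB           CONSTRAINT c_is_json CHECK (c_doc IS JSON),
      b_doc BLOB           CONSTRAINT b_is_json CHECK (b_doc IS JSON FORMAT JSON)
    );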

             

            Regarding performance:

             

There will be a performance penalty when using a CHECK (... IS (NOT) JSON) constraint. By default the content will be checked using the LAX method (see the documentation), but it can also be checked with STRICT JSON validation.

             

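The syntax, sketched on a hypothetical table (see the documentation for the full grammar):

  CREATE TABLE json_strict_demo
    ( doc VARCHAR2(4000),
      CONSTRAINT doc_is_json_strict CHECK (doc IS JSON STRICT)
    );

  -- LAX (the default) tolerates relaxed syntax such as unquoted
  -- member names; STRICT requires the formal JSON grammar:
  --   {a : 1}     passes IS JSON, fails IS JSON STRICT
  --   {"a" : 1}   passes both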

STRICT-validated JSON content will of course be more performance intensive than the (default) LAX method. ORDS and the native JSON database functionality extend each other. The native JSON database functionality is built into the database kernel (implemented in native C) and follows current and future JSON standards, including the new JSON path expressions for databases. The next DB version (and patches) will have even more functionality.

             

  --------------------------------------------------------
  -- DDL for Table JSON_SCRAPED_CONTENT
  --------------------------------------------------------
  DROP TABLE "JSON_SCRAPED_CONTENT" PURGE;
  --
  CREATE TABLE "JSON_SCRAPED_CONTENT"
    ( "ID"   NUMBER GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
      "JSON" BLOB,
      CONSTRAINT "VALIDATE_JSON" CHECK ("JSON" IS JSON FORMAT JSON)
    )
    LOB ("JSON") STORE AS SECUREFILE "SEG_JSON_SCRAPED_CONTENT";
  --------------------------------------------------------
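A quick sanity check against the table above (a sketch; UTL_RAW.CAST_TO_RAW is used here only to turn a small literal into a RAW that implicitly converts to BLOB):

  INSERT INTO "JSON_SCRAPED_CONTENT" ("JSON")
    VALUES (UTL_RAW.CAST_TO_RAW('{"source":"example","items":[1,2,3]}'));
  --
  SELECT COUNT(*)
    FROM "JSON_SCRAPED_CONTENT"
   WHERE "JSON" IS JSON FORMAT JSON;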

             


            • 3. Re: Performance implications for IS JSON constraint Oracle 12?
              dmcmahon-Oracle

              IS JSON will definitely add some cost. You can get a sense of the cost by putting your data into a table without the constraint and then doing:

               

  set timing on
  select count(*) from YOUR_TABLE where YOUR_JSON_COLUMN IS JSON;

               

This will give you a feel for the per-row cost on your data. The constraint will invoke the JSON parser on every row of content, and run the parse to completion. There's not much difference between LAX and STRICT; the cost is mainly driven by the need to go completely through every character of the input. STRICT is slightly more expensive because it has to maintain a hash table of object member names at every nesting level in order to detect duplicates, which it disallows.
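To compare the two modes on your own data, run the same probe with an explicit STRICT (a sketch, reusing the hypothetical YOUR_TABLE / YOUR_JSON_COLUMN above):

  set timing on
  select count(*) from YOUR_TABLE where YOUR_JSON_COLUMN IS JSON;         -- LAX (default)
  select count(*) from YOUR_TABLE where YOUR_JSON_COLUMN IS JSON STRICT;  -- STRICT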

               

              Note that the cost of parsing may be dwarfed by the cost of I/O on the data itself, especially if you use one of the LOB types to store the JSON. And there may be character-set conversion costs to consider as well.

               

There are performance implications to your choice of storage column type, as well as database character set. JSON itself must be Unicode, and the JSON parser requires Unicode input.

If you run with the recommended AL32UTF8 character set, then VARCHAR2 columns offer the best performance if you can live with the relatively small size limit, because they're stored inline in the row data. BLOB columns are a good choice if you need to store larger JSON documents, since you can store them in the native UTF-8 character set. CLOB columns tend to use more space: if you're running any multibyte character set (such as AL32UTF8), CLOB columns store data in AL16UTF16 (essentially a big-endian UTF-16), so they typically take 2x the storage and are correspondingly a bit slower to parse as well.

If your database is not based on UTF-8 (for example, WE8MSWIN1252), then you'll pay a performance cost to convert from that character set to Unicode whenever you invoke the JSON parser. In such a database, the VARCHAR2 column would incur this cost. The CLOB column would also incur character-set conversion cost if you're not running multibyte (because in such cases CLOB columns use the same character set as VARCHAR2, rather than AL16UTF16).
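If you're not sure which character set a given database runs with, a standard dictionary query (not JSON-specific) will tell you:

  select parameter, value
    from nls_database_parameters
   where parameter in ('NLS_CHARACTERSET', 'NLS_NCHAR_CHARACTERSET');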

               

              Finally, note that the nature of your JSON data also impacts performance. The most common unnecessary cost is excess whitespace in the JSON, e.g. newlines between fields of an object, spaces before and after colons separating fields and values, space characters used to indent the JSON, etc. Common pretty-printed JSON has enough whitespace to add 10% to the parse cost.
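As a rough illustration (a hypothetical measurement, using LENGTHB to count bytes), here is the same document compact versus pretty-printed:

  select lengthb('{"id":1,"name":"Carol"}') as compact_bytes,
         lengthb('{'                 || chr(10) ||
                 '  "id": 1,'        || chr(10) ||
                 '  "name": "Carol"' || chr(10) ||
                 '}')                as pretty_bytes
    from dual;
  -- The pretty-printed form carries the same data but more
  -- characters for the parser to walk.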