14 Replies Latest reply: Feb 26, 2009 4:35 AM by user3963943

    PDF's storing in Compressed BLOB Securefile doesn't save space

    user3963943
      Hello, I have a test table in 10g with one LOB segment of 1700 MB, holding 1743 records with PDFs.

      In 11g (11.1.0.7) I created a table with a SECUREFILE LOB in an ASSM tablespace; the table is partitioned.
      When I insert the 1743 records with PDFs, the total segment size is even larger: 1900 MB...???

      Why are my PDFs not compressed more?

      CREATE TABLE SNL_SCAN.DOCUMENTEN_LGE
      (
      OWNER VARCHAR2(50 BYTE),
      COMPANY VARCHAR2(50 BYTE),
      SCAN_ID VARCHAR2(50 BYTE) NOT NULL,
      DOCUMENT BLOB,
      FILENAME VARCHAR2(255 BYTE),
      CONTENT_TYPE VARCHAR2(50 BYTE),
      ORDER_DATE DATE,
      SCAN_DATE DATE,
      STATUS VARCHAR2(1 BYTE),
      CUSTOM_01 VARCHAR2(50 BYTE),
      CUSTOM_02 VARCHAR2(50 BYTE),
      CUSTOM_03 VARCHAR2(50 BYTE),
      CUSTOM_04 VARCHAR2(50 BYTE),
      CUSTOM_05 VARCHAR2(50 BYTE),
      CUSTOM_06 VARCHAR2(50 BYTE),
      OWNER_ID NUMBER(9) NOT NULL,
      SCAN_PLACE_DATE DATE DEFAULT sysdate
      )
      TABLESPACE SCAN_DATA
      PCTUSED 0
      PCTFREE 10
      INITRANS 1
      MAXTRANS 255
      PARTITION BY RANGE (SCAN_DATE)
      INTERVAL( NUMTOYMINTERVAL(1,'MONTH'))
      (
      PARTITION DOCUMENTEN_LGE_200605 VALUES LESS THAN (TO_DATE(' 2006-06-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIAN'))
      LOGGING
      COMPRESS FOR ALL OPERATIONS
      TABLESPACE SCAN_DATA
      LOB (DOCUMENT) STORE AS SECUREFILE
      ( TABLESPACE SCAN_DATA
      ENABLE STORAGE IN ROW
      CHUNK 8192
      RETENTION
      NOCACHE
      COMPRESS HIGH
      STORAGE (
      INITIAL 64K
      NEXT 1M
      MINEXTENTS 1
      MAXEXTENTS UNLIMITED
      PCTINCREASE 0
      FREELISTS 1
      FREELIST GROUPS 1
      BUFFER_POOL DEFAULT
      )
      )
      PCTFREE 10
      INITRANS 1
      MAXTRANS 255
      STORAGE (
      INITIAL 64K
      NEXT 1M
      MINEXTENTS 1
      MAXEXTENTS UNLIMITED
      BUFFER_POOL DEFAULT
      )
      )
      COMPRESS FOR ALL OPERATIONS
      NOCACHE
      NOPARALLEL
      MONITORING
      ENABLE ROW MOVEMENT;
        • 1. Re: PDF's storing in Compressed BLOB Securefile doesn't save space
          damorgan
          What size is the PDF? If you can, send it to me as an email attachment.
          • 2. Re: PDF's storing in Compressed BLOB Securefile doesn't save space
            user3963943
            Hello,

            I will send you my example PDF in a separate mail.

            This is my test:

            The PDF is 572 KB, and with WinZip I can zip it down to 462 KB, i.e. 19% compression.

            If I insert this PDF 110 times into my compressed LOB, the size of the LOB segment is 72,2 MB.

            If I insert this PDF 110 times into my non-compressed LOB, the size of the LOB segment is 63,2 MB.

            This is not what I want to see!! The segment is even bigger with compression on!!


            Best regards,

            Rob Tousain
            • 3. Re: PDF's storing in Compressed BLOB Securefile doesn't save space
              sxkumar
              This behavior is not expected. Can you please send me a copy of the PDF file as well via email?

              Sushil Kumar
              Oracle Database Development
              • 4. Re: PDF's storing in Compressed BLOB Securefile doesn't save space
                user3963943
                Sushil,

                Which email address?
                Rob
                • 6. Re: PDF's storing in Compressed BLOB Securefile doesn't save space
                  sxkumar
                  Rob,

                  We have filed a bug to investigate this issue further. In the meantime, can you repeat your experiment with another PDF, maybe one of the PDF files posted on OTN, and let us know if you see the same behavior? How about the following?

                  http://www.oracle.com/technology/products/database/securefiles/pdf/securefilesperformancepaper.pdf

                  Thanks,

                  Sushil
                  • 7. Re: PDF's storing in Compressed BLOB Securefile doesn't save space
                    damorgan
                    If you are with Bill Hodak's team I will leave this to you to investigate. If not let me know and I will pursue it.
                    • 8. Re: PDF's storing in Compressed BLOB Securefile doesn't save space
                      sxkumar
                      Thanks Dan! Yes, Bill and I work together on Advanced Compression.
                      • 9. Re: PDF's storing in Compressed BLOB Securefile doesn't save space
                        user3963943
                        Hello Sushil,

                        I have used securefilesperformancepaper.pdf as the test document, and I also simplified the test:

                        1.
                        Inserted the pdf into a table:
                        ---
                        DECLARE
                          dest_loc BLOB;   -- locator of the target BLOB column
                          src_loc  BFILE;  -- locator of the source file on the database server
                        BEGIN
                          -- Row 1: insert an empty BLOB and get its locator back
                          INSERT INTO SNL_SCAN.DOCUMENTEN (scan_id, owner_id, document)
                          VALUES (1, 1, EMPTY_BLOB())
                          RETURNING document INTO dest_loc;
                          -- Load sec.pdf from directory object DIR_TESTCASE into the BLOB
                          src_loc := BFILENAME('DIR_TESTCASE', 'sec.pdf');
                          DBMS_LOB.FILEOPEN(src_loc, DBMS_LOB.LOB_READONLY);
                          DBMS_LOB.LOADFROMFILE(dest_loc, src_loc, DBMS_LOB.GETLENGTH(src_loc));
                          DBMS_LOB.FILECLOSE(src_loc);

                          -- Row 2: same file again, under a different scan_id
                          INSERT INTO SNL_SCAN.DOCUMENTEN (scan_id, owner_id, document)
                          VALUES (2, 2, EMPTY_BLOB())
                          RETURNING document INTO dest_loc;
                          src_loc := BFILENAME('DIR_TESTCASE', 'sec.pdf');
                          DBMS_LOB.FILEOPEN(src_loc, DBMS_LOB.LOB_READONLY);
                          DBMS_LOB.LOADFROMFILE(dest_loc, src_loc, DBMS_LOB.GETLENGTH(src_loc));
                          DBMS_LOB.FILECLOSE(src_loc);
                        END;
                        /
                        ----

                        2.
                        created table
                        ----
                        CREATE TABLE NOCOMP ( a BLOB)
                        LOB(a) STORE AS SECUREFILE
                        ( CACHE ) ;
                        -----
                        inserted 10 documents
                        10x-----
                        insert into nocomp
                        (select document from snl_scan.documenten where scan_id= '1');
                        commit;
                        -----
                        the lobsegment is now 4,19 MB
                        3.
                        inserted another 100 rows
                        the lobsegment is now 42,2 MB


                        4.
                        Created table with compressed LOB
                        ------
                        CREATE TABLE COMP ( a BLOB)
                        LOB(a) STORE AS SECUREFILE
                        ( COMPRESS HIGH
                        CACHE ) ;
                        ----------

                        -----
                        inserted 10 documents
                        10x-----
                        insert into comp
                        (select document from snl_scan.documenten where scan_id= '1');
                        commit;
                        -----
                        the lobsegment is now 4,25 MB
                        inserted another 100 rows
                        the lobsegment is now 42,2 MB

                        5.
                        so there is no compression.

                        6.
                        Created table with compressed LOB and DEDUPLICATION
                        ---------------
                        CREATE TABLE COMP_DEDUP ( a BLOB)
                        LOB(a) STORE AS SECUREFILE
                        ( COMPRESS HIGH
                        DEDUPLICATE
                        CACHE ) ;
                        -----------

                        7.
                        inserted 10 documents
                        10x-----
                        insert into comp_dedup
                        (select document from snl_scan.documenten where scan_id= '1');
                        commit;
                        -----
                        the lobsegment is now 1,25 MB
                        inserted another 100 rows
                        the lobsegment is now 1,25 MB


                        Deduplication is working; compression with the Oracle PDF isn't saving any space.


                        8.
                        VIEW DBA_LOBS
                        TABLE_NAME  COL  SEGMENT_NAME               CACHE  LOGGING  ENCRYPT  COMPRESSION  DEDUPLICATION  IN_ROW  FORMAT  PARTITIONED  SECUREFILE

                        COMP_DEDUP  A    SYS_LOB0000107972C00001$$  YES    YES      NO       HIGH         LOB            YES     N/A     NO           YES
                        COMP        A    SYS_LOB0000107981C00001$$  YES    YES      NO       HIGH         NO             YES     N/A     NO           YES
                        NOCOMP      A    SYS_LOB0000107984C00001$$  YES    YES      NO       NO           NO             YES     N/A     NO           YES
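
                        The LOB segment sizes quoted in the steps above can be read from the data dictionary. What follows is only a minimal sketch: the post does not say how the sizes were measured, so the query itself is an assumption. It joins DBA_LOBS to DBA_SEGMENTS and reports allocated space (extents), which can be larger than the space the LOB data actually uses.

                        ```sql
                        -- Allocated size of each test LOB segment, next to its SecureFiles settings.
                        -- Assumption: the three test tables are visible to a user with access to the DBA_ views.
                        SELECT l.table_name,
                               l.column_name,
                               l.compression,
                               l.deduplication,
                               ROUND(SUM(s.bytes) / 1024 / 1024, 2) AS lob_mb   -- allocated, not used, space
                          FROM dba_lobs l
                          JOIN dba_segments s
                            ON s.owner = l.owner
                           AND s.segment_name = l.segment_name
                         WHERE l.table_name IN ('NOCOMP', 'COMP', 'COMP_DEDUP')
                         GROUP BY l.table_name, l.column_name, l.compression, l.deduplication
                         ORDER BY l.table_name;
                        ```

                        Because extents are allocated in whole units, small differences between two segments (such as 4,19 MB versus 4,25 MB) may reflect allocation granularity rather than the stored data.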


                        Hope this helps.

                        Rob Tousain

                        00 31 6 28660287
                        • 10. Re: PDF's storing in Compressed BLOB Securefile doesn't save space
                          sxkumar
                          Rob,

                          Thanks for running the tests.

                          The reason you are not seeing any compression is that SecureFiles tries to estimate how much gain you are going to get from compression, and if it is below a given threshold (20%), we don't bother compressing the file, on the assumption that the storage gains do not justify the CPU cost. For this particular file, GZIP-level compression gives you about 7%, which is lower than the minimum threshold of 20%.

                          Please note, however, that if this file were to get larger during subsequent steps, you might see compression suddenly kick in.
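
                          A rough way to see which stored documents would clear such a threshold is to compress them with UTL_COMPRESS and compare lengths. This is a sketch under assumptions: UTL_COMPRESS.LZ_COMPRESS only approximates what SecureFiles does internally, the estimate SecureFiles makes may differ from a full compression pass, and the 10-row sample is arbitrary. The table and column names are the ones used earlier in this thread.

                          ```sql
                          -- SQL*Plus setting so that DBMS_OUTPUT lines are displayed
                          SET SERVEROUTPUT ON

                          DECLARE
                            l_compressed BLOB;    -- temporary LOB returned by LZ_COMPRESS
                            l_orig_len   NUMBER;
                            l_comp_len   NUMBER;
                          BEGIN
                            -- Sample a few documents (assumption: 10 rows is enough for a first impression)
                            FOR r IN (SELECT scan_id, document
                                        FROM snl_scan.documenten
                                       WHERE document IS NOT NULL
                                         AND ROWNUM <= 10) LOOP
                              l_orig_len := DBMS_LOB.GETLENGTH(r.document);
                              IF l_orig_len > 0 THEN
                                -- Quality 9 is the strongest (and slowest) setting of LZ_COMPRESS
                                l_compressed := UTL_COMPRESS.LZ_COMPRESS(r.document, 9);
                                l_comp_len   := DBMS_LOB.GETLENGTH(l_compressed);
                                DBMS_OUTPUT.PUT_LINE(
                                  r.scan_id || ': ' ||
                                  ROUND(100 * (1 - l_comp_len / l_orig_len), 1) || '% saved');
                                -- The caller is responsible for freeing the temporary LOB
                                DBMS_LOB.FREETEMPORARY(l_compressed);
                              END IF;
                            END LOOP;
                          END;
                          /
                          ```

                          Documents that report well under 20% saved here are the ones likely to be stored uncompressed, so the figure is mainly useful for deciding whether compression is worth testing on a given document set.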

                          Hope this helps.

                          Sushil
                          • 11. Re: PDF's storing in Compressed BLOB Securefile doesn't save space
                            user3963943
                            Hello Sushil,

                            Thank you for answering, but I am left with a not-so-nice feeling about the answer.

                            1. Advanced Compression is an extra-cost option, in my situation about 12.000 euros.
                            2. So Advanced Compression is as simple as GZIP???
                            3. I was hoping that Oracle really did some work for the "Advanced" option.

                            4. I have 1 table with 3 million documents (scans of passports etc.), all PDF documents. Total table size is now 1,6 TB.
                            5. I have to make a business case because our documents will grow to 10 million next year (5 TB), and I was thinking of using Advanced Compression so we can save storage....

                            6. What will be the best way to store my documents?
                            7. Where can I read the official Oracle documentation about compression, PDFs and the "less than 20%" rule?

                            Rob Tousain
                            31 6 28660287
                            • 12. Re: PDF's storing in Compressed BLOB Securefile doesn't save space
                              user3963943
                              Hello Sushil,

                              In my previous answers you can see that the other PDF was zipped by WinZip by 19%, and that one didn't compress either.

                              Can you provide me with a PDF which can be shrunk by more than 20% so I can do some testing?

                              Best regards,

                              Rob Tousain

                              rob.tousain@ibridge.nl
                              • 13. Re: PDF's storing in Compressed BLOB Securefile doesn't save space
                                sxkumar
                                Rob,

                                Advanced Compression consists of a whole range of technologies, including compression for structured/relational data, compression for unstructured data, backup compression, and network compression. It seems your primary interest is in the compression of PDF files, and we use the best industry-standard algorithms to make sure that we provide you with good compression. One of the design objectives for this feature was to make sure that we can intelligently detect where compression actually benefits the use case. That's the reason we have put in an internal threshold. We don't document it extensively since it is one of those internal optimizations, but maybe we should explain it a bit better to users like you.

                                As far as your situation is concerned, we will be happy to work with you and help make a case for Advanced Compression. There are ways we can fiddle with the 20% threshold. Why don't you drop me a private email and we will take it up from there? I will also try to find a PDF file that provides good compression even with the default 20% threshold.

                                Thanks,

                                Sushil
                                • 14. Re: PDF's storing in Compressed BLOB Securefile doesn't save space
                                  user3963943
                                  Hello Sushil,

                                  my email = rob.tousain@ibridge.nl