Fetching CLOB column Faster

Hi, we are using Oracle Exadata, database version 11.2.0.4. The query below is executed from Informatica to fetch data, and it is just a "SELECT * FROM TAB1". It was taking ~1 hour to fetch ~135K rows, and from the SQL monitor we found that all of the time was spent on the client side fetching the data. We then discovered that a single CLOB column is the cause: if we comment out the CLOB column (C10), the fetch finishes in a few seconds. As an alternative we used the SUBSTR option below to fetch column C10, and that let the query finish in a few seconds too. But we have suddenly started seeing this query fail with "ORA-06502: PL/SQL: numeric or value error: character string buffer too small", and it turns out it fails because a few of the values that arrived in column C10 are larger than 4000 bytes. So we want to understand: is there any alternative way to fetch the CLOB column here that will not fail, even for large values (>4000 bytes)?
Query:-
SELECT c1,c2,c3...c39 FROM TAB1;
Alternate option to fetch column C10:-
DBMS_LOB.SUBSTR (C10,(SELECT MAX (LENGTH (C10)) FROM TAB1)) C10
Error:-
ORA-06502: PL/SQL: numeric or value error: character string buffer too small
ORA-06512: at line 1
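(For context: DBMS_LOB.SUBSTR returns a VARCHAR2, which in a SQL statement on this release is limited to 4000 bytes, so the call errors as soon as the requested amount exceeds 4000. A hard-capped call like the sketch below avoids the ORA-06502 but silently truncates the longer values, assuming single-byte characters - which is why we are looking for a different approach.)

DBMS_LOB.SUBSTR (C10, 4000, 1) C10   -- capped: no ORA-06502, but values > 4000 bytes are truncated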
Below is the SQL monitor for one of the slow executions we used to see when the CLOB column was fetched in full:-
Query:- SELECT c1,c2,c3...c39 FROM TAB1;

Global Information
------------------------------
 Status              :  EXECUTING
 Instance ID         :  4
 SQL Execution ID    :  67108864
 Execution Started   :  04/09/2018 06:02:49
 First Refresh Time  :  04/09/2018 06:02:49
 Last Refresh Time   :  04/09/2018 06:40:45
 Duration            :  2277s
 Module/Action       :  SQL*Plus/-
 Program             :  sqlplus.exe
 Fetch Calls         :  26415

Global Stats
=================================================
| Elapsed |   Cpu   | Cluster  | Fetch | Buffer |
| Time(s) | Time(s) | Waits(s) | Calls |  Gets  |
=================================================
|    0.69 |    0.69 |     0.01 | 26415 |  27031 |
=================================================

SQL Plan Monitoring Details (Plan Hash Value=2531190874)
=============================================================================================================================================
|   Id | Operation                 | Name | Rows    | Cost | Time      | Start  | Execs | Rows     | Activity | Activity Detail | Progress |
|      |                           |      | (Estim) |      | Active(s) | Active |       | (Actual) | (%)      | (# samples)     |          |
=============================================================================================================================================
| -> 0 | SELECT STATEMENT          |      |         |      |      2278 |     +0 |     1 |    26417 |          |                 |          |
| -> 1 | TABLE ACCESS STORAGE FULL | TAB1 |    135K | 7212 |      2278 |     +0 |     1 |    26417 |          |                 |       6% |
=============================================================================================================================================
Best Answer
-
933257 wrote:
Initially I thought you were suggesting a UNION, but then I realized that the SUBSTR will be a VARCHAR data type while the other part of the query will be a CLOB, so it would have to be fetched twice by Informatica. Correct me if wrong. Also, I tested the query with high values for longchunksize and long, but that didn't affect the query performance. The other thing I was thinking: should I test this by setting the CACHE option for the CLOB storage, and would that benefit us?
Exactly - a UNION would need both branches to use the same data type, so it won't work too well. Although if you can make it project the CLOB as NULL for the short rows it might be okay; it would probably confuse Informatica, though!
I meant run:
select ..dbms_lob.substr(clobcol, 4000, 1) my_clob from my_table where dbms_lob.getlength(clobcol) <= 4000
and
select ..clobcol my_clob from my_table where dbms_lob.getLength(clobcol)>4000
Assuming 1 byte characters.
I don't think CACHE will help; the problem is in the network round trips, from what I can tell.
Answers
-
Hello,
Check if the following link works for you:
Regards,
Adi
-
if we comment out the CLOB column (C10), the fetch finishes in a few seconds.
Is this surprising?
A CLOB is a large object.
Exactly how long it takes to fetch will depend on how large the object is. The larger the object, the longer it will take.
And it will depend on client/network settings - you will basically be roundtripping on the CLOB.
If you traced this you would see exactly how many round trips etc.
Some client software has different options in how to deal with CLOBs.
There might also be some influence from whether the CLOB on the table is internally set to CACHE / NOCACHE etc.
As an alternative we used the SUBSTR option below to fetch column C10, and that let the query finish in a few seconds too
Really? Why would this be a valid way of returning an entire CLOB?
If it really is effective in limited circumstances, that is only because it works for CLOBs < 4000 characters: you are then returning a VARCHAR2, which the client software probably fetches differently from large objects - those have to be handled via LOB locators once they are actually "large". And it will predictably fail for anything > 4000 characters.
Investigate what options informatica has for dealing with clobs and what network settings you are currently using.
Do some tracing with current setup so you can see exactly what the problem is (e.g. network round tripping mainly with some physical IO probably) and then you can see what effect any change you make has.
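For example, a minimal tracing sketch (assuming you can reproduce the fetch from SQL*Plus and can read the trace file; the identifier is arbitrary):

ALTER SESSION SET tracefile_identifier = 'clob_fetch_test';
ALTER SESSION SET events '10046 trace name context forever, level 8';   -- level 8 includes wait events
SELECT c1, c2, c10 FROM tab1;                                            -- the problem fetch (column list abridged)
ALTER SESSION SET events '10046 trace name context off';

The raw trace (or a tkprof summary of it) will show the SQL*Net waits and round trips per fetch call.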
-
One of the features of selecting LOBs is that the client code probably has to fetch the LOB Locator and then fetch the LOB data one item at a time. Then there's probably a fixed size buffer the client uses for fetching the LOB so a single LOB may require multiple round-trips before the whole thing is acquired.
At some point in the past SQL*Plus used to fetch 80 bytes of a LOB at a time, and if the LOB was declared "nocache" each 80 byte chunk would require the block to be re-read. The "workaround" to this was to "set long" and "set longchunksize" in SQL*Plus to larger values - possibly there's a similar configuration issue that you need to review in Informatica.
As a test of hypothesis you could start the query running and check how many "SQL*Net message to client" and "SQL*Net more data to client" it takes to move a small subset of your larger LOBs.
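A sketch of that test, assuming the fetch is run from a SQL*Plus session whose SID you know (:sid below is a placeholder):

-- in the fetching SQL*Plus session (both settings default to 80)
set long 200000
set longchunksize 200000

-- from a second session, watch the waits accumulate while the query runs
SELECT event, total_waits
FROM   v$session_event
WHERE  sid = :sid
AND    event IN ('SQL*Net message to client', 'SQL*Net more data to client');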
Regards
Jonathan Lewis
-
Looking at the CLOB column definition, it says "ENABLE STORAGE IN ROW CHUNK 8192 RETENTION NOCACHE", so in the table it is defined as NOCACHE.
In the last posted SQL monitor, the number of fetch calls was 26415 for 26417 rows in the Rows (Actual) column of the SQL monitor.
I checked how many rows really have a length > 4000 bytes, and it turns out to be very few (81 out of ~124K).
SELECT max(length(C10)), count(*) FROM TAB1 where LENGTH (C10)<4000;

MAX(LENGTH(C10))   COUNT(*)
            3823     124876

SELECT max(length(C10)), count(*) FROM TAB1;

MAX(LENGTH(C10))   COUNT(*)
           18799     124957
Now, comparing fetching the CLOB column vs a VARCHAR2(4000), I tried to test/compare the performance for rows with < 4000-byte length. The CLOB column fetch is still significantly slower - why? Isn't it true that in the two scenarios below we are fetching the same amount of data from the CLOB column? Then why is the CLOB fetch so much slower? I see the same effect when I run it from the SQL*Plus client.
Fetching CLOB column C10 as it is:-

SQL Text
------------------------------
SELECT /*+monitor*/ c1,c2,c3...TAB1.C10 FROM TAB1 where LENGTH (C10)<4000

Global Information
------------------------------
 Status              :  EXECUTING
 Instance ID         :  4
 SQL ID              :  bwyw0v1h3pgxq
 SQL Execution ID    :  67108864
 Execution Started   :  07/30/2019 12:45:54
 First Refresh Time  :  07/30/2019 12:45:54
 Last Refresh Time   :  07/30/2019 12:57:25
 Duration            :  691s
 Module/Action       :  SQL*Plus/-
 Program             :  sqlplus.exe
 Fetch Calls         :  8526

Global Stats
=================================================
| Elapsed |   Cpu   |  Other   | Fetch | Buffer |
| Time(s) | Time(s) | Waits(s) | Calls |  Gets  |
=================================================
|    0.30 |    0.03 |     0.27 |  8526 |   8794 |
=================================================

SQL Plan Monitoring Details (Plan Hash Value=2531190874)
=============================================================================================================================================
|   Id | Operation                 | Name | Rows    | Cost | Time      | Start  | Execs | Rows     | Activity | Activity Detail | Progress |
|      |                           |      | (Estim) |      | Active(s) | Active |       | (Actual) | (%)      | (# samples)     |          |
=============================================================================================================================================
| -> 0 | SELECT STATEMENT          |      |         |      |       693 |     +0 |     1 |     8534 |          |                 |          |
| -> 1 | TABLE ACCESS STORAGE FULL | TAB1 |    6248 | 8594 |       693 |     +0 |     1 |     8534 |          |                 |       2% |
=============================================================================================================================================

Fetching CLOB column C10 using the SUBSTR function:-

SQL Monitoring Report

SQL Text
------------------------------
SELECT /*+monitor*/ c1,c2,..DBMS_LOB.SUBSTR (C10,(SELECT MAX (LENGTH (C10)) FROM TAB1)) C10 FROM TAB1 where LENGTH (C10)<4000

Global Information
------------------------------
 Status              :  DONE (ALL ROWS)
 Instance ID         :  4
 SQL ID              :  g0ya736m3sgdu
 SQL Execution ID    :  67108864
 Execution Started   :  07/30/2019 12:53:38
 First Refresh Time  :  07/30/2019 12:53:38
 Last Refresh Time   :  07/30/2019 12:53:45
 Duration            :  7s
 Module/Action       :  SQL*Plus/-
 Program             :  sqlplus.exe
 Fetch Calls         :  26

Global Stats
===================================================================================================
| Elapsed |   Cpu   |    IO    | Concurrency | PL/SQL  |  Other   | Fetch | Buffer | Read | Read  |
| Time(s) | Time(s) | Waits(s) |  Waits(s)   | Time(s) | Waits(s) | Calls |  Gets  | Reqs | Bytes |
===================================================================================================
|    5.11 |    1.58 |     0.78 |        0.00 |    1.22 |     2.76 |    26 |  81960 |  151 |   1MB |
===================================================================================================

SQL Plan Monitoring Details (Plan Hash Value=3132348012)
=============================================================================================================================================
| Id | Operation                 | Name | Rows    | Cost | Time      | Start  | Execs | Rows     | Activity | Activity Detail                 |
|    |                           |      | (Estim) |      | Active(s) | Active |       | (Actual) | (%)      | (# samples)                     |
=============================================================================================================================================
|  0 | SELECT STATEMENT          |      |         |      |         6 |     +1 |     1 |        1 |    83.33 | Cpu (1)                         |
|    |                           |      |         |      |           |        |       |          |          | SQL*Net more data to client (3) |
|    |                           |      |         |      |           |        |       |          |          | direct path read (1)            |
|  1 | SORT AGGREGATE            |      |       1 |      |         1 |     +1 |     1 |        1 |          |                                 |
|  2 | TABLE ACCESS STORAGE FULL | TAB1 |    125K | 7212 |         1 |     +1 |     1 |     125K |          |                                 |
|  3 | TABLE ACCESS STORAGE FULL | TAB1 |    6248 | 8594 |         7 |     +1 |     1 |     125K |    16.67 | Cpu (1)                         |
=============================================================================================================================================

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter(LENGTH("C10")<4000)

Statistics
----------------------------------------------------------
         47  recursive calls
          0  db block gets
      81999  consistent gets
        163  physical reads
        572  redo size
   32115212  bytes sent via SQL*Net to client
        736  bytes received via SQL*Net from client
         26  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
     124876  rows processed
-
One thing I noticed: irrespective of the size of the LOB values, the fetch takes the same amount of time. That is, if I run the select for the ~81 rows that hold the larger CLOB values (length > 4000 bytes) and compare the time with a select for ~81 other rows having length < 4000 bytes, both consume the same amount of time.
I tried running the query for ~1000 rows with length < 4000 bytes after setting "set longchunksize 200000" and "set long 200000", but I see similar timing to the run below (i.e. without either of these set in SQL*Plus).
I fetched the CLOB column as-is for ~1000 rows with an arraysize of 5000 in SQL*Plus; below is the SQL monitor for that run.
Edited:- I am not sure if changing the storage option to CACHE will help us here, but I will test it on Dev (a sketch of that change follows the monitor below).
Global Information
------------------------------
 Status              :  DONE (ALL ROWS)
 Instance ID         :  4
 SQL ID              :  5mr5nx1x81dfg
 SQL Execution ID    :  67108864
 Execution Started   :  07/30/2019 14:57:58
 First Refresh Time  :  07/30/2019 14:57:58
 Last Refresh Time   :  07/30/2019 14:59:23
 Duration            :  85s
 Module/Action       :  SQL*Plus/-
 Program             :  sqlplus.exe
 Fetch Calls         :  1000

Global Stats
=================================================
| Elapsed |   Cpu   |  Other   | Fetch | Buffer |
| Time(s) | Time(s) | Waits(s) | Calls |  Gets  |
=================================================
|    0.03 |    0.00 |     0.03 |  1000 |   1065 |
=================================================

SQL Plan Monitoring Details (Plan Hash Value=4129443724)
=========================================================================================================================================
| Id | Operation                            | Name | Rows    | Cost | Time      | Start  | Execs | Rows     | Activity | Activity Detail |
|    |                                      |      | (Estim) |      | Active(s) | Active |       | (Actual) | (%)      | (# samples)     |
=========================================================================================================================================
|  0 | SELECT STATEMENT                     |      |         |      |        86 |     +0 |     1 |      999 |          |                 |
|  1 | COUNT STOPKEY                        |      |         |      |        86 |     +0 |     1 |      999 |          |                 |
|  2 | TABLE ACCESS STORAGE FULL FIRST ROWS | TAB1 |     999 | 2534 |        86 |     +0 |     1 |      999 |          |                 |
=========================================================================================================================================

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(ROWNUM<1000)
   2 - filter(LENGTH("C10")<4000)

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
       1068  consistent gets
          3  physical reads
          0  redo size
     775428  bytes sent via SQL*Net to client
     306930  bytes received via SQL*Net from client
       2000  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
        999  rows processed
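The storage change I plan to test on Dev would be along these lines (just a sketch, using the TAB1/C10 names from above):

ALTER TABLE tab1 MODIFY LOB (c10) (CACHE);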
-
We have run into this before, with JDBC and also with SSIS.
If a column is CLOB/NCLOB and it is not null in a particular row, then (as Jonathan pointed out) there will be multiple trips to fetch the column value for that row. You can't avoid that.
If the CLOB/NCLOB column is null in a particular row, then this does not happen.
On the other hand, if a column is VARCHAR/NVARCHAR, you will fetch many rows in a single trip.
You can see this behavior in a SQL trace.
What to do? If you have a rather small number of rows with the actual CLOB data > 4000 bytes (i.e. too large for VARCHAR) then you can do this:
SELECT
    CASE WHEN LENGTH(MY_CLOB_COLUMN) <= 4000 THEN CAST(MY_CLOB_COLUMN AS VARCHAR2(4000)) ELSE NULL END AS MY_CLOB_COLUMN_AS_VARCHAR2,
    CASE WHEN LENGTH(MY_CLOB_COLUMN) <= 4000 THEN NULL ELSE MY_CLOB_COLUMN END AS MY_CLOB_COLUMN_AS_CLOB,
    ...            -- plus your other columns
FROM MY_TABLE;     -- MY_TABLE / MY_CLOB_COLUMN are placeholders for your table and CLOB column
(note: if NCLOB then use 2000 instead of 4000)
And the client application (Informatica) would have to marry these two columns (MY_CLOB_COLUMN_AS_VARCHAR2 and MY_CLOB_COLUMN_AS_CLOB) together.
So the result is that the vast majority of your rows would have nulls in their CLOB columns, so the multi-trip behavior drops dramatically.
But now your client has to deal with the two columns to get the single value you want. Don't try to put them back together within a SQL statement, you will just be putting yourself back where you started.
If most of your CLOBs are too big to fit in a VARCHAR - don't bother.
-
Just to clarify one important detail:
If you're running this hacked select from SQL*Plus then, whatever you set the arraysize to, Oracle will use single-row processing because the query includes a CLOB column in the select list. The reduction in round trips occurs because it always takes (at least) two round trips to fetch a row with a non-null CLOB: one to get the row with the LOB locator and one to fetch the actual LOB content. If the locator is null (or, maybe, says that the LOB is empty, but I'd have to check that) the second trip isn't needed.
There may be further roundtrips in SQL*Plus if you've done
set long N
set longchunksize M
where N and M are numeric (the defaults are 80) and N is larger than M and the size of some of your LOBs is larger than M.
Moreover, if your lobs are stored out of line and NOCACHE and it takes several chunks to read and display one lob then each of those reads will be a direct path read. There may be something about this on my blog (or old website, or in Practical Oracle 8i somewhere), but if not I'll write something up in the next few days.
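As a quick check of how the LOB is currently stored, something like this against the data dictionary should do (a sketch, assuming the column is C10 on TAB1 as described earlier in the thread):

SELECT column_name, cache, in_row, chunk
FROM   user_lobs
WHERE  table_name = 'TAB1';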
Regards
Jonathan Lewis
-
I would like to add something not directly related to the OP's question: a CLOB of size 4000 will be stored in the LOB segment, not in the table. To stay in-row the value must be less than 4000 bytes (3964 to be precise); if your data length is 4000 it will be stored in the LOB segment and will require the additional fetch, as Jonathan said.
Here are some examples about this: CLOB size matters!
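A rough way to count the rows affected by this (a sketch, assuming a single-byte character set so that LENGTH approximates bytes):

SELECT COUNT(*) FROM tab1 WHERE DBMS_LOB.GETLENGTH(c10) >= 3964;   -- roughly: rows stored out of line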
-
Jonathan Lewis wrote:
Just to clarify one important detail: If you're running this hacked select from SQL*Plus then, whatever you set the arraysize to, Oracle will use single-row processing because the query includes a CLOB column in the select list. The reduction in round trips occurs because it always takes (at least) two round trips to fetch a row with a non-null CLOB: one to get the row with the LOB locator and one to fetch the actual LOB content. If the locator is null (or, maybe, says that the LOB is empty, but I'd have to check that) the second trip isn't needed. There may be further round trips in SQL*Plus if you've done "set long N" and "set longchunksize M", where N and M are numeric (the defaults are 80), N is larger than M, and the size of some of your LOBs is larger than M. Moreover, if your LOBs are stored out of line and NOCACHE and it takes several chunks to read and display one LOB then each of those reads will be a direct path read. There may be something about this on my blog (or old website, or in Practical Oracle 8i somewhere), but if not I'll write something up in the next few days. Regards, Jonathan Lewis
I thought that the new lobPrefetch option in sql*plus would prevent the extra fetches by including lob values in the standard fetch but it doesn’t seem to behave like that for me - it’s probably for chunks.
If you use odp.net, then you can use the initialLobFetchSize parameter to decide how much of a lob to pull through in the same fetch as the rest of the rows https://docs.oracle.com/en/database/oracle/oracle-database/12.2/odpnt/DataReaderInitialLOBFetchSize.html#GUID-E015AF50-1…
I've used it in my own big fetches and it makes a huge difference.
I believe Informatica uses the ODBC driver which doesn’t have such an option unfortunately. Do you have to use Informatica for this?
-
I tried modifying the query as you mentioned, but I am not seeing any improvement. Below are the outputs for fetching ~1000 rows whose LOB length is < 4000 bytes, using both the CASE expressions and the SUBSTR function.
Using the CASE expressions takes the same time as fetching the whole LOB:-
SQL Monitoring Report

SQL Text
------------------------------
SELECT c1,c2...,
       CASE WHEN LENGTH(C10) <= 4000 then CAST(C10 AS VARCHAR2(4000)) ELSE NULL END AS MY_CLOB_COLUMN_AS_VARCHAR2,
       CASE WHEN LENGTH(C10) > 4000 then NULL ELSE C10 END AS MY_CLOB_COLUMN_AS_CLOB
FROM TAB1 where LENGTH (C10)<4000 and rownum<1000

Global Information
------------------------------
 Status              :  DONE (ALL ROWS)
 Instance ID         :  4
 SQL ID              :  3m0j31tb3rd8x
 SQL Execution ID    :  67108864
 Execution Started   :  07/31/2019 05:41:21
 First Refresh Time  :  07/31/2019 05:41:21
 Last Refresh Time   :  07/31/2019 05:42:46
 Duration            :  85s
 Module/Action       :  SQL*Plus/-
 Program             :  sqlplus.exe
 Fetch Calls         :  1000

Global Stats
=================================================
| Elapsed |   Cpu   |  Other   | Fetch | Buffer |
| Time(s) | Time(s) | Waits(s) | Calls |  Gets  |
=================================================
|    0.02 |    0.00 |     0.02 |  1000 |   1065 |
=================================================

SQL Plan Monitoring Details (Plan Hash Value=4129443724)
=========================================================================================================================================
| Id | Operation                            | Name | Rows    | Cost | Time      | Start  | Execs | Rows     | Activity | Activity Detail |
|    |                                      |      | (Estim) |      | Active(s) | Active |       | (Actual) | (%)      | (# samples)     |
=========================================================================================================================================
|  0 | SELECT STATEMENT                     |      |         |      |        86 |     +0 |     1 |      999 |          |                 |
|  1 | COUNT STOPKEY                        |      |         |      |        86 |     +0 |     1 |      999 |          |                 |
|  2 | TABLE ACCESS STORAGE FULL FIRST ROWS | TAB1 |     999 | 2335 |        86 |     +0 |     1 |      999 |          |                 |
=========================================================================================================================================

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(ROWNUM<1000)
   2 - filter(LENGTH("C10")<4000)

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
       1068  consistent gets
          3  physical reads
          0  redo size
     608700  bytes sent via SQL*Net to client
     306929  bytes received via SQL*Net from client
       2000  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
        999  rows processed
Using the SUBSTR function finishes in 3 seconds:-
SELECT c1,c2....,DBMS_LOB.SUBSTR (C10,(SELECT MAX (LENGTH (C10)) FROM TAB1)) C10
FROM TAB1 where LENGTH (C10)<4000 and rownum<1000

999 rows selected.

Elapsed: 00:00:03.59

Execution Plan
----------------------------------------------------------
Plan hash value: 2117026829

------------------------------------------------------------------------------------------------------------
| Id  | Operation                              | Name | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                       |      |   999 |  195K |  2335   (1)| 00:00:29 |
|   1 |  SORT AGGREGATE                        |      |     1 |   164 |            |          |
|   2 |   TABLE ACCESS STORAGE FULL            | TAB1 |  151K |   23M |  7212   (1)| 00:01:27 |
|*  3 |  COUNT STOPKEY                         |      |       |       |            |          |
|*  4 |   TABLE ACCESS STORAGE FULL FIRST ROWS | TAB1 |   999 |  195K |  2335   (1)| 00:00:29 |
------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter(ROWNUM<1000)
   4 - filter(LENGTH("C10")<4000)

Statistics
----------------------------------------------------------
         47  recursive calls
          0  db block gets
      41131  consistent gets
         14  physical reads
        572  redo size
      25154  bytes sent via SQL*Net to client
       1197  bytes received via SQL*Net from client
         68  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
        999  rows processed