Probably due to the pipelining and the use of PARALLEL with BULK COLLECT, instead of appending to a LOB. It suits your use case better (the source is relational tables).
I use this:
ALTER SESSION ENABLE PARALLEL QUERY;

SELECT nt.*
FROM table(
  toolsdwh.p_cursor2file(
    cursor(
      select /*+ PARALLEL(s,8) */
        xmlelement("r",
          xmlforest(
            ROUND(AR_BHVR_KEY,4) as "AR_BHVR_KEY", ROUND(AR_KEY,4) as "AR_KEY",
            ROUND(AR_ACTVN_ST_KEY,4) as "AR_ACTVN_ST_KEY", AR_ACTVN_ST_CD as "AR_ACTVN_ST_CD",
            ROUND(AR_CMPLN_ST_KEY,4) as "AR_CMPLN_ST_KEY", AR_CMPLN_ST_CD as "AR_CMPLN_ST_CD",
            ROUND(AR_MKT_SEG_KEY,4) as "AR_MKT_SEG_KEY", AR_MKT_SEG_CD as "AR_MKT_SEG_CD",
            ROUND(AR_RTLR_KEY,4) as "AR_RTLR_KEY", AR_RTLR_CD as "AR_RTLR_CD",
            ROUND(AR_ST_DT_KEY,4) as "AR_ST_DT_KEY",
            TO_CHAR(AR_ST_DT,'YYYY-MM-DD"T"HH24:MI:SS"Z"') as "AR_ST_DT",
            ROUND(AR_ST_KEY,4) as "AR_ST_KEY", AR_ST_CD as "AR_ST_CD",
            ROUND(CNTRCT_BILL_CYCL_KEY,4) as "CNTRCT_BILL_CYCL_KEY", CNTRCT_BILL_CYCL_CD as "CNTRCT_BILL_CYCL_CD",
            ROUND(CNTRCT_EFF_DT_KEY,4) as "CNTRCT_EFF_DT_KEY",
            TO_CHAR(CNTRCT_EFF_DT,'YYYY-MM-DD"T"HH24:MI:SS"Z"') as "CNTRCT_EFF_DT",
            ROUND(CNTRCT_PYMNT_TP_KEY,4) as "CNTRCT_PYMNT_TP_KEY", CNTRCT_PYMNT_TP_CD as "CNTRCT_PYMNT_TP_CD",
            ROUND(CNTRCT_MKT_SEG_KEY,4) as "CNTRCT_MKT_SEG_KEY", CNTRCT_MKT_SEG_CD as "CNTRCT_MKT_SEG_CD",
            ROUND(ACTVN_OFFCR_KEY,4) as "ACTVN_OFFCR_KEY", ACTVN_OFFCR_CD as "ACTVN_OFFCR_CD",
            ROUND(EMPE_KEY,4) as "EMPE_KEY", EMPE_CD as "EMPE_CD",
            ROUND(RLN_OFFCR_KEY,4) as "RLN_OFFCR_KEY", RLN_OFFCR_CD as "RLN_OFFCR_CD",
            ROUND(NTW_TCHNLGY_KEY,4) as "NTW_TCHNLGY_KEY", NTW_TCHNLGY_CD as "NTW_TCHNLGY_CD",
            ROUND(PD_KEY,4) as "PD_KEY", PD_CD as "PD_CD",
            ROUND(PD_ALTRNTV_1_KEY,4) as "PD_ALTRNTV_1_KEY", PD_ALTRNTV_1_CD as "PD_ALTRNTV_1_CD",
            ROUND(PD_ALTRNTV_2_KEY,4) as "PD_ALTRNTV_2_KEY", PD_ALTRNTV_2_CD as "PD_ALTRNTV_2_CD",
            ROUND(PD_ALTRNTV_3_KEY,4) as "PD_ALTRNTV_3_KEY", PD_ALTRNTV_3_CD as "PD_ALTRNTV_3_CD",
            ROUND(PYMNT_ENTTY_KEY,4) as "PYMNT_ENTTY_KEY", PYMNT_ENTTY_CD as "PYMNT_ENTTY_CD",
            ROUND(PYMNT_PRCSSR_KEY,4) as "PYMNT_PRCSSR_KEY", PYMNT_PRCSSR_CD as "PYMNT_PRCSSR_CD",
            ROUND(RI_OWN_ST_KEY,4) as "RI_OWN_ST_KEY", RI_OWN_ST_CD as "RI_OWN_ST_CD",
            CNTRCT_DD as "CNTRCT_DD", SIM_DD as "SIM_DD",
            TO_CHAR(EFF_DT,'YYYY-MM-DD"T"HH24:MI:SS"Z"') as "EFF_DT",
            TO_CHAR(PPN_DT,'YYYY-MM-DD"T"HH24:MI:SS"Z"') as "PPN_DT",
            CNTRL_MD5_AR as "CNTRL_MD5_AR", CNTRL_MD5 as "CNTRL_MD5",
            ROUND(PRTBLTY_OPR_KEY,4) as "PRTBLTY_OPR_KEY", PRTBLTY_OPR_CD as "PRTBLTY_OPR_CD",
            ROUND(PRTBLTY_TP_KEY,4) as "PRTBLTY_TP_KEY", PRTBLTY_TP_CD as "PRTBLTY_TP_CD",
            ROUND(PRTBLTY_DT_KEY,4) as "PRTBLTY_DT_KEY",
            TO_CHAR(PRTBLTY_DT,'YYYY-MM-DD"T"HH24:MI:SS"Z"') as "PRTBLTY_DT",
            ROUND(AR_BILL_ST_KEY,4) as "AR_BILL_ST_KEY", AR_BILL_ST_CD as "AR_BILL_ST_CD",
            ROUND(CNTRCT_END_DT_KEY,4) as "CNTRCT_END_DT_KEY",
            TO_CHAR(CNTRCT_END_DT,'YYYY-MM-DD"T"HH24:MI:SS"Z"') as "CNTRCT_END_DT",
            ROUND(CSTMR_KEY,4) as "CSTMR_KEY", CSTMR_CD as "CSTMR_CD",
            ROUND(PRN_AR_KEY,4) as "PRN_AR_KEY", PRN_AR_CD as "PRN_AR_CD",
            ROUND(IP_USR_KEY,4) as "IP_USR_KEY", IP_USR_CD as "IP_USR_CD",
            MSISDN_DD as "MSISDN_DD"
          )
        ) XMLROWS
      from DTL.AR_BHVR_DIM s
    ),
    'testfile',
    'TMP'
  )
) nt;
I also know that if I call the function with named parameters, it does NOT use the parallel option.
If I call the function the normal way, with positional parameters separated by commas, it does use parallelism.
SELECT nt.*
from table(
  toolsdwh.p_cursor2file(
    P_FILENAME  => 'prueba.xml',
    P_DIRECTORY => 'TMP',
    P_PREFIX    => '<?xml version="1.0" encoding="UTF-8"?><root>',
    P_POSTFIX   => '</root>',
    --P_MAXSIZE_FILE => 100000,
    P_CURSOR    => CURSOR( ....
    ....
SELECT nt.*
from table(
  toolsdwh.p_cursor2file(
    CURSOR( .... ),
    'prueba.xml', 'TMP',
    '<?xml version="1.0" encoding="UTF-8"?><root>', '</root>',
    100000
  )
)
One thing to bear in mind is that CLOB2FILE has to convert from UCS2 to the target character set before writing the output as a BLOB....
I just want to clarify what I wanted to say in the previous two comments.
When I use positional parameters, Oracle parallelism works.
When I use named parameters, parallelism does not work, for whatever reason.
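To make the comparison concrete, here is a minimal sketch of the two call styles against a trivial PARALLEL_ENABLE pipelined function. The function name `echo_rows` and the source table are illustrative, not part of the original post; the point is only that the two queries differ solely in positional vs. named notation for the cursor argument.

```sql
-- Trivial PARALLEL_ENABLE pipelined function (illustrative names)
CREATE OR REPLACE FUNCTION echo_rows (p_cursor IN SYS_REFCURSOR)
  RETURN sys.odcivarchar2list PIPELINED
  PARALLEL_ENABLE (PARTITION p_cursor BY ANY)
IS
  v_val VARCHAR2(4000);
BEGIN
  LOOP
    FETCH p_cursor INTO v_val;
    EXIT WHEN p_cursor%NOTFOUND;
    PIPE ROW (v_val);
  END LOOP;
  CLOSE p_cursor;
  RETURN;
END;
/

-- Positional notation: reported above to run in parallel
SELECT *
FROM table(echo_rows(
  CURSOR(SELECT /*+ PARALLEL(s,8) */ owner FROM all_objects s)));

-- Named notation: reported above to serialize
SELECT *
FROM table(echo_rows(
  p_cursor => CURSOR(SELECT /*+ PARALLEL(s,8) */ owner FROM all_objects s)));
```

Comparing the execution plans (or the number of slave sessions) of these two otherwise identical queries should isolate whether named notation really is what disables the parallel distribution.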
Might that parameter be essential for the parallel part?
--,P_MAXSIZE_FILE => 100000
...PLUS... with your prefix and postfix parameters, you have (guessing here) significantly altered Adrian Billington's code by adding to the original example at http://www.oracle-developer.net/display.php?id=425, so our reference would be...?
The reference is:
CREATE TYPE TOOLSDWH.DUMP_FILE_LOG AS OBJECT (
  END_TIMESTAMP TIMESTAMP,
  SESSION_ID    NUMBER,
  OLD_FILENAME  VARCHAR2(512),
  NEW_FILENAME  VARCHAR2(512),
  NO_RECORDS    NUMBER,
  FILESIZE      NUMBER
)
/
CREATE TYPE TOOLSDWH.DUMP_FILE_LOG_ROW AS TABLE OF DUMP_FILE_LOG
/
create or replace FUNCTION P_CURSOR2FILE (
  P_CURSOR       IN SYS_REFCURSOR,
  P_FILENAME     IN VARCHAR2,
  P_DIRECTORY    IN VARCHAR2,
  P_PREFIX       IN VARCHAR2 DEFAULT '',
  P_POSTFIX      IN VARCHAR2 DEFAULT '',
  P_MAXSIZE_FILE IN NUMBER   DEFAULT -1
)
RETURN TOOLSDWH.DUMP_FILE_LOG_ROW
PIPELINED
PARALLEL_ENABLE (PARTITION p_cursor BY ANY)
AUTHID CURRENT_USER
IS
  /* Version Control
     20130204 - First version of the cursor-to-file exporter */
  VC_VERSION    CONSTANT VARCHAR2(30) := '1.0.0';
  vc_job_name   constant varchar2(50) := 'TOOLSDWH.P_CURSOR2FILE';
  VC_USUARIO_BD CONSTANT VARCHAR2(30) := 'TOOLSDWH';
  vc_job_owner  constant varchar2(30) := 'CRISTIAN SAAVEDRA';
  VC_JOB_TYPE   CONSTANT VARCHAR2(30) := 'PL-SQL';

  V_LINES   PLS_INTEGER := 0;
  V_COUNTER PLS_INTEGER := 1;

  C_EOL     CONSTANT VARCHAR2(1) := CHR(10);
  C_EOLLEN  CONSTANT PLS_INTEGER := LENGTH(C_EOL);
  C_MAXLINE CONSTANT PLS_INTEGER := 32767;

  TYPE V_ROW IS TABLE OF VARCHAR2(32767);
  V_ROWS   V_ROW;
  V_FILE   UTL_FILE.FILE_TYPE;
  V_BUFFER VARCHAR2(32767);
  V_SIZE   NUMBER;
  V_SID    NUMBER;
  V_NAME   VARCHAR2(1024);
BEGIN
  -- Get the session id
  SELECT sid INTO v_sid FROM v$mystat WHERE ROWNUM = 1;

  -- File name with session id prefix
  v_name := 'id' || TO_CHAR(v_sid) || '_' || TO_CHAR(V_COUNTER) || '_' || p_filename;
  V_FILE := UTL_FILE.FOPEN(P_DIRECTORY, V_NAME, 'w', 32767);
  V_SIZE := 0;

  -- Add prefix to the file
  IF (P_PREFIX IS NOT NULL) THEN
    V_SIZE := V_SIZE + LENGTH(P_PREFIX) + 1;
    UTL_FILE.PUT_LINE(V_FILE, P_PREFIX);
  END IF;

  LOOP
    -- Bulk collect to manage the memory
    FETCH p_cursor BULK COLLECT INTO v_rows LIMIT 100;
    -- Buffer cache to optimize the write
    FOR i IN 1 .. V_ROWS.COUNT LOOP
      IF LENGTH(v_buffer) + c_eollen + LENGTH(v_rows(i)) <= c_maxline THEN
        V_BUFFER := V_BUFFER || C_EOL || V_ROWS(i);
      ELSE
        IF V_BUFFER IS NOT NULL THEN
          V_SIZE := V_SIZE + LENGTH(V_BUFFER) + 1;
          UTL_FILE.PUT_LINE(V_FILE, V_BUFFER);
        END IF;
        V_BUFFER := V_ROWS(i);
      END IF;
    END LOOP;
    V_LINES := V_LINES + V_ROWS.COUNT;
    EXIT WHEN p_cursor%NOTFOUND;
  END LOOP;
  CLOSE p_cursor;

  V_SIZE := V_SIZE + LENGTH(v_buffer) + 1;
  UTL_FILE.PUT_LINE(V_FILE, V_BUFFER);

  -- Add postfix to the file
  IF (P_POSTFIX IS NOT NULL) THEN
    V_SIZE := V_SIZE + LENGTH(P_POSTFIX) + 1;
    UTL_FILE.PUT_LINE(V_FILE, P_POSTFIX);
  END IF;

  UTL_FILE.FCLOSE(v_file);

  PIPE ROW (DUMP_FILE_LOG(LOCALTIMESTAMP, v_sid, P_FILENAME, v_name, v_lines, V_SIZE));
  RETURN;
END;
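Since the function prefixes each output file with the session id, one quick way to confirm parallel execution is simply counting the files produced. While the statement runs, you can also look at the parallel slave sessions from another session; this is a generic sketch against the standard dynamic performance view, not something from the original post:

```sql
-- Slave sessions and their query coordinator while the query runs
SELECT qcsid, sid, serial#, degree
FROM   v$px_session;
```

If parallelism is working you should see one row per slave (plus the coordinator), and a matching number of `id<sid>_1_...` files in the directory.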
984830 wrote: What do redo logs have to do with this?
--Oracle Pipeline to optimize the memory and redolog use
Are you performing any transaction in the process?
It's a comment to remind myself that a pipelined load generates less redo. It does not apply to this procedure.
It is faster primarily because of the switch to set-based SQL. Set-based DML (such as the INSERT...SELECT I used in my pipelined load) is almost always considerably faster than a row-based, procedural solution. In this particular case, I have benefited directly from the Oracle database's internal optimization of set-based inserts. Specifically, the database writes considerably less redo information for set-based inserts (INSERT...SELECT) than it does for singleton inserts (INSERT...VALUES). That is to say, if I insert 100 rows in a single statement, it will generate less redo than if I inserted 100 rows one-by-one. My original legacy load of 1 million tickertable rows generated over 270Mb of redo information. This was reduced to just over 37Mb when using the pipelined function-based load, contributing to a significant proportion of the time savings.
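The redo difference described in that quote is easy to measure on your own system. The sketch below is mine, not from the article: the staging table name `tickertable_stg` is a placeholder, and the idea is to run the `redo size` query before and after each load style and compare the deltas.

```sql
-- Redo generated so far in this session (run before and after each load)
SELECT ms.value AS redo_bytes
FROM   v$mystat   ms
JOIN   v$statname sn ON sn.statistic# = ms.statistic#
WHERE  sn.name = 'redo size';

-- Row-based load: one singleton INSERT per row
BEGIN
  FOR r IN (SELECT * FROM tickertable_stg) LOOP
    INSERT INTO tickertable VALUES r;
  END LOOP;
  COMMIT;
END;
/

-- Set-based load: the same rows in one statement,
-- typically generating far less redo
INSERT INTO tickertable
SELECT * FROM tickertable_stg;
COMMIT;
```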
Maybe it's just me but I usually tend to avoid writing comments that don't apply to the actual code being commented :)
You have to admit it could get very misleading for someone reading your code and trying to understand why this particular choice was made, and what its implications and benefits are.
I always thought that a CLOB uses the database character set. Does clob2file expect a CLOB to be in UCS2?
I guess I always assume the database character set is AL32UTF8, in which case the CLOB is stored in UCS2...
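For reference, DBMS_XSLPROCESSOR.CLOB2FILE takes a character set id that controls the conversion target for the output file; passing 0 (the default) writes in the database character set. A minimal sketch, assuming a directory object named TMP exists and is writable:

```sql
DECLARE
  v_doc CLOB := '<root><r>example</r></root>';
BEGIN
  -- csid = 0 (default) writes in the database character set;
  -- an explicit id forces conversion to that character set
  DBMS_XSLPROCESSOR.CLOB2FILE(
    cl        => v_doc,
    flocation => 'TMP',
    fname     => 'example.xml',
    csid      => NLS_CHARSET_ID('AL32UTF8')
  );
END;
/
```

So whatever internal encoding the CLOB is held in, the conversion cost on the way out depends on the target character set you ask for.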
Odie. You are completely right. It was a copy and paste error.
Do you know why it does not use parallelism with named parameters?