Oracle fast parallel data unload into ASCII file(s)


I was always jealous watching Informix and MS SQL users unload large tables or queries from the database to ASCII with a few simple commands and/or clicks, and I never really understood why Oracle did not implement an easy and fast unload solution. Yes, Oracle 10.2 has the excellent Data Pump, and unloading for backup purposes is fast and easy, but that is not ASCII format.

I’m sure the most common solution is (or better said, was until you started reading this blog) to use SPOOL in SQL*Plus. Although that might be a good solution for some, I don’t like it for two reasons: first, it’s not fast enough; second, you have to run the command in SQL*Plus, which really limits automation (you have to leave PL/SQL just to spool data?).
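For reference, the SQL*Plus approach I’m replacing looks something like this (a sketch only – the exact SET options and path depend on your data and environment):

```sql
SET HEADING OFF FEEDBACK OFF PAGESIZE 0 TERMOUT OFF
SET LINESIZE 32767 TRIMSPOOL ON
SPOOL /work/users/jiri/sample_spool.txt
SELECT table_name || '|' || column_name || '|' || data_type FROM mytable;
SPOOL OFF
```

It works, but it is slow and it lives outside PL/SQL – exactly the two problems above.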

The core of the code below comes from Adrian Billington – you can read his excellent blog here. I added a little bit of logging and changed some of the calls so you don’t need DBA grants.

The function (yes, it’s a simple function you can call from SQL) uses UTL_FILE for the data spool and a PIPELINED function for optional parallel spooling (yes, that’s correct – you can spool in parallel thanks to pipelined function features). Pipelined functions are an amazing thing in general (I used one for very complex forecasting calculations presented in a BI tool that is not aware of stored procedures or functions and needs data back fast). The other great feature of pipelined functions is parallelism, which is utilized here.

Why do I love this solution?

  • it’s very simple SQL – no messy SQL*Plus commands, no special line sizes or ON/OFF switches
  • because it’s SQL, you can execute it from almost anywhere; you can even plug it into PL/SQL
  • it has very robust logging – both internal (the result set of the SQL) and optional external (a unix log file)
  • it’s fast – if you run it in parallel, you can achieve very high speed (I really did get a 4x faster dump at parallel 4 with no speed degradation)
 
CREATE OR REPLACE TYPE dump_ot AS OBJECT    
( file_name       VARCHAR2(128) 
,  directory_name VARCHAR2(128)    
,  no_records     NUMBER              
,  session_id     NUMBER 
,  start_dttm     TIMESTAMP WITH TIME ZONE
,  end_dttm       TIMESTAMP WITH TIME ZONE          
);

CREATE OR REPLACE TYPE dump_ntt AS TABLE OF dump_ot;

CREATE OR REPLACE FUNCTION DATA_UNLOAD         ( p_source             IN SYS_REFCURSOR,                                    
                                                 p_filename           IN VARCHAR2,                                         
                                                 p_directory          IN VARCHAR2,
                                                 p_unique_filename    IN VARCHAR2 DEFAULT 'N',
                                                 p_create_log_file    IN VARCHAR2 DEFAULT 'N'                   
                                               ) RETURN dump_ntt PIPELINED PARALLEL_ENABLE (PARTITION p_source BY ANY) AS   
/*
p_source           SQL query you spool
p_filename         target file name
p_directory        target Oracle directory
p_unique_filename  create unique file name? Y/N (appends a unique SID to p_filename) - useful only for parallel unload
p_create_log_file  create log file? Y/N (creates a separate log file and logs every 1 million rows) - has a very small performance hit on the spool
*/               

   TYPE row_ntt IS TABLE OF VARCHAR2(32767);                                     
   v_rows       row_ntt;                                                            
   v_file       UTL_FILE.FILE_TYPE;                         
   v_log_file   UTL_FILE.FILE_TYPE;                       
   v_buffer     VARCHAR2(32767);                                                    
   v_sid        VARCHAR2(255);
   v_name       VARCHAR2(255);                                                    
   v_lines      PLS_INTEGER := 0;                                                   
   v_start_dttm TIMESTAMP WITH TIME ZONE:= SYSTIMESTAMP;
   v_end_dttm   TIMESTAMP WITH TIME ZONE;
   v_create_log BOOLEAN := FALSE;

   c_eol        CONSTANT VARCHAR2(1) := CHR(10);                                    
   c_eollen     CONSTANT PLS_INTEGER := LENGTH(c_eol);                              
   c_maxline    CONSTANT PLS_INTEGER := 32767;        
   c_log_limit  CONSTANT PLS_INTEGER := 1000000;                                

BEGIN                

   v_sid  := LPAD(sys_context ('USERENV', 'sid'), 10, '0') ;                                                                              
   v_name := p_filename ;

   IF TRIM(UPPER(p_create_log_file)) = 'Y' THEN
     v_create_log := TRUE;
   END IF;

   -- append SID (required for parallel spooling; serial spooling or spooling across a db-link always creates one file)
   IF upper(p_unique_filename) = 'Y' THEN
     v_name := v_name || '_' || TO_CHAR(v_sid) ;  
   END IF;

   v_file := UTL_FILE.FOPEN(p_directory, v_name, 'w', c_maxline);                    

   IF v_create_log THEN
     v_log_file := UTL_FILE.FOPEN(p_directory, v_name || '.LOG', 'w', c_maxline);
     UTL_FILE.PUT_LINE(v_log_file, TO_CHAR(v_start_dttm, 'DD-MON-YYYY HH24:MI:SS:FF3') || ' --> START'  ); 
     UTL_FILE.FFLUSH(v_log_file);            
   END IF;

   LOOP                                                                          
     FETCH p_source BULK COLLECT INTO v_rows LIMIT 1000;                          

      FOR i IN 1 .. v_rows.COUNT LOOP                                            

         IF LENGTH(v_buffer) + c_eollen + LENGTH(v_rows(i)) <= c_maxline THEN    
            v_buffer := v_buffer || c_eol || v_rows(i);                          
         ELSE                                                                    
            IF v_buffer IS NOT NULL THEN                                         
               UTL_FILE.PUT_LINE(v_file, v_buffer);                              
            END IF;                                                              
            v_buffer := v_rows(i);                                               
         END IF;                                                                 

      END LOOP;                                                                  

      v_lines := v_lines + v_rows.COUNT;                                         

      IF v_create_log and MOD(v_lines, c_log_limit) = 0 THEN
        UTL_FILE.PUT_LINE(v_log_file, TO_CHAR(SYSTIMESTAMP, 'DD-MON-YYYY HH24:MI:SS:FF3') || ' --> ' || v_lines  );
        UTL_FILE.FFLUSH(v_log_file);
      END IF;     

      EXIT WHEN p_source%NOTFOUND;                                               
   END LOOP;                                                                     
   CLOSE p_source;                                                             

   IF v_buffer IS NOT NULL THEN
      UTL_FILE.PUT_LINE(v_file, v_buffer);
   END IF;
   UTL_FILE.FCLOSE(v_file);      
   v_end_dttm := SYSTIMESTAMP; 

   IF v_create_log THEN
     UTL_FILE.PUT_LINE(v_log_file, TO_CHAR(v_end_dttm, 'DD-MON-YYYY HH24:MI:SS:FF3') || ' --> ' || v_lines);
     UTL_FILE.PUT_LINE(v_log_file, TO_CHAR(v_end_dttm, 'DD-MON-YYYY HH24:MI:SS:FF3') || ' --> END');                                          
     UTL_FILE.FCLOSE(v_log_file);
   END IF;                                                   

   PIPE ROW (dump_ot(v_name, p_directory, v_lines, v_sid, v_start_dttm, v_end_dttm));                                   
   RETURN;                                                                       

END;

All samples below use the Oracle directory DIR_USERS_JIRI; in my case it maps to the unix folder /work/users/jiri.

CREATE OR REPLACE DIRECTORY "DIR_USERS_JIRI" AS '/work/users/jiri/';
GRANT READ, WRITE ON DIRECTORY DIR_USERS_JIRI TO JIRI;
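Note that CREATE DIRECTORY only registers a name-to-path mapping inside the database – it does not create the OS folder, which must already exist and be writable by the Oracle software owner. A minimal sketch at the OS level (using a scratch path here as a stand-in for /work/users/jiri):

```shell
# The directory object points at an OS path; create that path first.
SPOOL_DIR=/tmp/work/users/jiri            # stand-in for /work/users/jiri
mkdir -p "$SPOOL_DIR"
# In real life also make sure the database owner can write there, e.g.
#   chown oracle:oinstall "$SPOOL_DIR"    # as root
test -w "$SPOOL_DIR" && echo "spool directory is ready"
```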


and a sample table MYTABLE (this sample table is based on the ALL_TAB_COLUMNS view).

CREATE TABLE MYTABLE AS
SELECT TABLE_NAME,
       COLUMN_NAME,
       DATA_TYPE
  FROM ALL_TAB_COLUMNS;


sample1

pipe delimited file with no header, no log, no parallel mode

SELECT *                                       
  FROM TABLE(                                  
                 DATA_UNLOAD(                       
                             CURSOR(
                                     SELECT /*+ NO_PARALLEL(A) */
                                            TABLE_NAME                   || '|' || 
                                            COLUMN_NAME                  || '|' ||
                                            DATA_TYPE     
                                       FROM MYTABLE A
                                   ),     
                            'SAMPLE_SPOOL.TXT',    
                            'DIR_USERS_JIRI',
                            'N',
                            'N' )
                            ) ;  

FILE_NAME         DIRECTORY_NAME  NO_RECORDS  SESSION_ID  START_DTTM                           END_DTTM
SAMPLE_SPOOL.TXT  DIR_USERS_JIRI  20796              555  03/18/2009 4:26:39.515437 PM -05:00  03/18/2009 4:26:47.512584 PM -05:00


As you can see, it is a simple SQL statement; it returns the number of rows and total run time – execute it like any other SQL.

sample2

pipe delimited file with no header, unix log created, no parallel mode

SELECT *                                       
  FROM TABLE(                                  
                 DATA_UNLOAD(                       
                             CURSOR(
                                     SELECT /*+ NO_PARALLEL(A) */
                                            TABLE_NAME                   || '|' || 
                                            COLUMN_NAME                  || '|' ||
                                            DATA_TYPE     
                                       FROM MYTABLE A
                                   ),     
                            'SAMPLE_SPOOL.TXT',    
                            'DIR_USERS_JIRI',
                            'N',
                            'Y' )
                            ) ;  

FILE_NAME         DIRECTORY_NAME  NO_RECORDS  SESSION_ID  START_DTTM                           END_DTTM
SAMPLE_SPOOL.TXT  DIR_USERS_JIRI  20796              555  03/18/2009 4:26:39.515437 PM -05:00  03/18/2009 4:26:47.512584 PM -05:00


This is exactly the same as sample1 except that an additional text file is created in the same directory as the main file; it has an identical name with the extension .LOG. The log file shows start time and end time and logs every 1 million records. It is useful during very large data spools – sometimes it’s handy to see the progress and how long 1 million records take. I did not notice any significant speed degradation compared to sample1 (writing a row for every 1 million records can take anywhere from a few seconds to minutes depending on how wide the spool is and how fast your file system is).
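With the code above, the log file contents look roughly like this (timestamps invented for illustration; one line per 1 million rows, plus the final count and END marker):

```
18-MAR-2009 16:26:39:515 --> START
18-MAR-2009 16:29:41:207 --> 1000000
18-MAR-2009 16:32:44:830 --> 2000000
18-MAR-2009 16:34:18:092 --> 2412345
18-MAR-2009 16:34:18:092 --> END
```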

sample3

pipe delimited file with no header, unix log created, in parallel mode – this is where it gets interesting and where you really gain the power of the pipelined function.

 
SELECT *                                       
  FROM TABLE(                                  
                 DATA_UNLOAD(                       
                             CURSOR(
                                     SELECT /*+ PARALLEL(A, 2, 1) */
                                            TABLE_NAME                   || '|' || 
                                            COLUMN_NAME                  || '|' ||
                                            DATA_TYPE     
                                       FROM MYTABLE A
                                   ),     
                            'SAMPLE_SPOOL.TXT',    
                            'DIR_USERS_JIRI',
                            'Y',
                            'Y' )
                            ) ;  

FILE_NAME                    DIRECTORY_NAME  NO_RECORDS  SESSION_ID  START_DTTM                           END_DTTM
SAMPLE_SPOOL.TXT_0000000555  DIR_USERS_JIRI  10796              555  03/18/2009 4:44:57.141247 PM -05:00  03/18/2009 4:44:58.034521 PM -05:00
SAMPLE_SPOOL.TXT_0000000666  DIR_USERS_JIRI  10000              666  03/18/2009 4:44:57.141264 PM -05:00  03/18/2009 4:44:58.047624 PM -05:00


Parallel mode creates X files (in this case 2); the files can be merged at the unix level (or simply left separate). The files will not be exactly the same size – the split is based on Oracle’s internal "best guess", and yes, it really helps to analyze all underlying tables. The files will, however, preserve full records (one row will not span two files).

The SQL will return X rows (in this case 2) showing file names, number of records and start/end times.

There will be 2 unix log files created (optional).

Please note that if you do not specify p_unique_filename = 'Y' and run the query in parallel, all slaves will spool into one file and overwrite each other (you will not get a correct result)!
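If you do want one file at the end, the per-SID files can simply be concatenated after the unload finishes (a sketch; the file names below mimic what p_unique_filename = 'Y' produces, with made-up SIDs):

```shell
# Simulate two per-SID spool files and merge them into one file.
mkdir -p /tmp/spool_demo
printf 'row1\nrow2\n' > /tmp/spool_demo/SAMPLE_SPOOL.TXT_0000000555
printf 'row3\n'       > /tmp/spool_demo/SAMPLE_SPOOL.TXT_0000000666
cat /tmp/spool_demo/SAMPLE_SPOOL.TXT_* > /tmp/spool_demo/SAMPLE_SPOOL.TXT
wc -l < /tmp/spool_demo/SAMPLE_SPOOL.TXT   # all 3 rows survive the merge
```

Because each spool file holds only complete records, concatenation order does not matter unless you need sorted output.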

sample4

Other formats – you can easily spool to comma-delimited, add a header, or even add a DOS end-of-line character; it is really up to your SQL statement. Here is a comma-delimited file with a header and DOS end-of-line.

 

SELECT *                                      
  FROM TABLE(                                 
                 DATA_UNLOAD(                      
                             CURSOR(
                                     SELECT 'TABLE_NAME,COLUMN_NAME,DATA_TYPE' FROM DUAL
                                     UNION ALL
                                     SELECT /*+ NO_PARALLEL(A) */
                                            TABLE_NAME                   || ',' ||
                                            COLUMN_NAME                  || ',' ||
                                            DATA_TYPE                    || CHR(13)
                                       FROM MYTABLE A
                                   ),    
                            'SAMPLE_SPOOL.TXT',   
                            'DIR_USERS_JIRI',
                            'N',
                            'N' )
                            ) ;
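Why does this give DOS line endings? The function itself terminates each line with CHR(10) (LF), so concatenating CHR(13) (CR) to every row yields the CR+LF pair DOS expects. The same effect sketched at the unix level (the file path is just an example):

```shell
# One row written the way the unload does it: a trailing CR on the data
# plus the LF line terminator gives CR+LF (DOS end-of-line).
printf 'TABLE_NAME,COLUMN_NAME,DATA_TYPE\r\n' > /tmp/crlf_demo.txt
od -c /tmp/crlf_demo.txt | tail -2   # shows the trailing \r \n pair
```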

27 thoughts on “Oracle fast parallel data unload into ASCII file(s)”

  1. Igor Cabrera says:

    I have implemented your function and the demos you put in this article (the only difference is my directory name that is different) , they are usefull but probably I’m doing something wrong because when I run :

    SELECT *
    FROM TABLE(
    DATA_UNLOAD(
    CURSOR(
    SELECT/*+ PARALLEL (A, 2,1) */
    TABLE_NAME || '|' ||
    COLUMN_NAME || '|' ||
    DATA_TYPE
    FROM MYTABLE A
    ),
    'SAMPLE_SPOOL.TXT',
    'DIR_SIGECOM_AE',
    'Y',
    'Y' )
    ) ;

    I only get one file and one record in pipelined function :

    FILE_NAME DIRECTORY_NAME NO_RECORDS SESSION_ID START_DTTM END_DTTM
    SAMPLE_SPOOL.TXT_0000000717 DIR_SIGECOM_AE 92393 717 01/02/2011 04:59:27.825075 PM -04:00 01/02/2011 04:59:28.625349 PM -04:00

    Why is that?

    Isn’t parallel working?

    Please answer me I’m trying to do fast data unload.

  2. jiri says:

    Can you do parallel in general? In another words is PARALLEL enabled on your server/session?

    I’ve used this many times, the only time it runs in serial is when I run it over db-link or use custom function inside the SQL

  3. Igor Cabrera says:

    Well, this is the first time I have tried parallel in my database (I’m a developer, not a DBA).

    How could I know if it is enabled on my server?

    If it is not, what should my DBA do to enable it?

    Does enabling parallel in our DB have any disadvantages?

  4. jiri says:

    I would recommend to read this article http://www.dba-oracle.com/art_builder_opq_cpu.htm

    in general, parallel processing makes sense on a somewhat larger server; a single-processor server will not gain anything.

  5. Daniel Nelson says:

    Hello,

    This function works great, I unloaded 22 million records in about 10 minutes. However, since I have 10 tables I need to unload, I wanted to put it into a procedure. The problem I am having is the RETURN from the function. How can I call the function you provided in a procedure? What do I need to setup in the declare section to use in the select …. into clause? Can you provide an example?

    Thanks,
    Dan.

  6. jiri says:

    Assuming you want keep the ability to run it in parallel which means that the function can return more than one row, I would use weakly typed reference cursor.

    I did not test this code (I don’t have oracle database in front of me), but in general it should work. Just replace … with the rest of your SQL

    DECLARE
    c_cursor SYS_REFCURSOR;
    l_row cursor_variable_test%ROWTYPE;
    BEGIN
    OPEN c_cursor FOR 'SELECT * FROM TABLE( ...';

    LOOP
    FETCH c_cursor
    INTO l_row;

    EXIT WHEN c_cursor%NOTFOUND;

    DBMS_OUTPUT.put_line(l_row.id || ' : ' || l_row.description);
    END LOOP;

    CLOSE c_cursor;
    END;

  7. Daniel Nelson says:

    Hello,
    Thanks for your prompt response.

    Here is my test. I don’t know what to put as the %rowtype (i.e. cursor_variable_test%rowtype). How do I do a rowtype for a pipelined cursor.

    Thanks.

    DECLARE
    c_cursor SYS_REFCURSOR;
    l_row cursor_variable_test%ROWTYPE;
    BEGIN
    OPEN c_cursor
    FOR ‘
    select *
    from table(DATA_UNLOAD(CURSOR(
    select /*+ PARALLEL(AS_LY, 12) */

  8. jiri says:

    either define type for it or if you are lazy, try something like this instead… again I dont have oracle here, it’s kinda hard to guess the code 😉

    -- replace … with actual types

    DECLARE
    c_cursor SYS_REFCURSOR;
    field1 ...;
    field2 ...;
    field3 ...;
    field4 ...;

    BEGIN
    OPEN c_cursor FOR 'SELECT * FROM TABLE( ...';

    LOOP
    FETCH c_cursor
    INTO field1,field2,field3,field4;

    EXIT WHEN c_cursor%NOTFOUND;

    DBMS_OUTPUT.put_line(field1 || ' : ' || field2 || ' : ' || field3 || ' : ' || field4);
    END LOOP;

    CLOSE c_cursor;
    END;

  9. Daniel Nelson says:

    Hello,
    Maybe we can take this offline, but a few things. I was able to get it working by explicitly defining the variables as well as defining the type within the PLSQL block

    (i.e.
    type l_row is record
    (l_name VARCHAR2(255),
    l_directory VARCHAR2(255),
    l_lines PLS_INTEGER,
    l_sid VARCHAR(255),
    l_start_dttm TIMESTAMP WITH TIME ZONE,
    l_end_dttm TIMESTAMP WITH TIME ZONE);
    lr l_row;

    ).

    However, I am still not able to take advantage of the objects you create, for example dump_ot and dump_ntt. Maybe PL/SQL is not robust enough to handle that, but I think it’s improbable and I’m just missing something. Below is my latest stab at it. Any suggestion you have would be most welcomed – if or when you get back to a computer.

    Thanks much.
    Dan.

    DECLARE
    type l_row is table of dump_ot;
    l_row_table l_row := l_row();

    l_name VARCHAR2(255);
    l_directory VARCHAR2(255);
    l_lines PLS_INTEGER;
    l_sid VARCHAR(255);
    l_start_dttm TIMESTAMP WITH TIME ZONE;
    l_end_dttm TIMESTAMP WITH TIME ZONE;

    cursor c0 is
    select *
    from table(DATA_UNLOAD(CURSOR(
    select /*+ PARALLEL(AS_LY, 12) */
    ID_LY || '|' ||ID_STR_RT || '|' ||ID_WS_ORGL || '|' ||DC_DY_BSN_ORGL || '|' ||
    AI_TRN_ORGL || '|' ||ID_CT || '|' ||ST_LY || '|' ||ST_LY_PREV || '|' ||
    TS_MDF_ST || '|' ||DT_LY_EP || '|' ||DT_LY_GPD || '|' ||MO_PYM_LY_MNM_DWN || '|' ||
    NM_LCN_STR || '|' ||MO_LY_TOT || '|' ||MO_LY_FE_CRT || '|' ||MO_LY_FE_DLT || '|' ||
    CP_PYM_LY_CLT_TOT || '|' ||QU_PYM_LY_CLT || '|' ||MO_BLNC_LY || '|' ||FL_MOD_TRG || '|' ||
    TS_CRT_RCRD || '|' ||TS_MDF_RCRD
    from POSDBADMIN.AS_LY
    ),
    'as_ly.dat',
    'UNLOADER',
    'Y',
    'N' )
    );

    BEGIN
    Open c0;
    Loop
    Fetch c0 bulk collect Into l_row_table limit 1000;
    Exit When c0%notfound;
    End Loop;
    Close c0;
    END;

  10. jiri says:

    I probably don’t fully get what is missing… here are several different options to bulk collect SQL (I was assuming you want to do something with the output records and thus the loop is easiest)

    http://www.adp-gmbh.ch/ora/plsql/bc/index.html

    at the end, you will probably never get more than 16 records (even parallel 16 is a bit stretch) and all of the solutions will run reasonably fast. I’d say use whatever you are comfortable with…

  11. Daniel Nelson says:

    Here is what I was trying to do. Run it from a PL/SQL block or take it one step further and put it in a procedure. What I was missing was getting the pipelined results back and how to handle it. Note it required the following:

    select dump_ot(a.file_name,
    a.directory_name,
    a.no_records,
    a.session_id,
    a.start_dttm,
    a.end_dttm)

    — Full Code Set.
    DECLARE
    l_data dump_ntt;

    cursor c0 is
    select dump_ot(a.file_name,
    a.directory_name,
    a.no_records,
    a.session_id,
    a.start_dttm,
    a.end_dttm)
    from table(DATA_UNLOAD(CURSOR(
    select /*+ PARALLEL(AS_LY, 12) */
    ID_LY || '|' ||ID_STR_RT || '|' ||ID_WS_ORGL || '|' ||DC_DY_BSN_ORGL || '|' ||
    AI_TRN_ORGL || '|' ||ID_CT || '|' ||ST_LY || '|' ||ST_LY_PREV || '|' ||
    TS_MDF_ST || '|' ||DT_LY_EP || '|' ||DT_LY_GPD || '|' ||MO_PYM_LY_MNM_DWN || '|' ||
    NM_LCN_STR || '|' ||MO_LY_TOT || '|' ||MO_LY_FE_CRT || '|' ||MO_LY_FE_DLT || '|' ||
    CP_PYM_LY_CLT_TOT || '|' ||QU_PYM_LY_CLT || '|' ||MO_BLNC_LY || '|' ||FL_MOD_TRG || '|' ||
    TS_CRT_RCRD || '|' ||TS_MDF_RCRD
    from POSDBADMIN.AS_LY
    ),
    'as_ly.dat',
    'UNLOADER',
    'Y',
    'N' )
    ) a;

    BEGIN
    Open c0;
    Loop
    Fetch c0 bulk collect Into l_data;

    dbms_output.put_line(l_data.count||' Array Elements');
    For i in 1..l_data.count Loop
    dbms_output.put_line('Values: '||l_data(i).file_name ||' ~ '||
    l_data(i).directory_name||' ~ '||
    l_data(i).no_records ||' ~ '||
    l_data(i).session_id ||' ~ '||
    l_data(i).start_dttm ||' ~ '||
    l_data(i).end_dttm);
    End Loop;

    Exit When c0%notfound;
    End Loop;
    Close c0;
    END;
    /

  12. Anonymous says:

    Thanks a lot , this is an excellent article to make use of the pipe function, A small change, On sample 3 the hint should read “parallel” instead of “paralle”, thats what causing the query not to write multiple files.

  13. jiri says:

    thank you, I will fix the typo

  14. Phil says:

    Hi, I ran your code against a 4 million row table in a RAC environment. However, it took around 12 minutes to finish; our sqlplus version, which is a straight select, took around 1 minute 30 seconds. Both were running with parallel 8. Do you have any suggestions about the difference?

  15. jiri says:

    Did you run it on a single node or distributed? In other words, PARALLEL(NAME, 8, 1) or PARALLEL(NAME, 8, x)?

    it should not run that slow for sure

  16. Wilco says:

    Excellent piece of work, unloads really fast.
    Speed, however, will be compromised if functions are used in the select (e.g. formatting date values with to_char(); instead, try setting nls_date_format or altering the session to format dates if possible – this will improve performance dramatically).

  17. Rahul Saxena says:

    Hi Jiri,

    Is there a way I can load data from an XML File into Oracle Tables , without having the Input file in some Oracle Server Directory. My XML File resides on UNIX Application server. And I need to directly load the data into Database tables. Without loading them into the Database Directory.

  18. Marco Coletti says:

    If you want to unload data as SQL insert statements and/or do not want to cope with required permissions to write to the server filesystem, then my solution could be useful:
    http://pastebin.com/6NtadR95

  19. մովսես says:

    Hey friend. Thanks for this brilliant code. But I have an issue that maybe you could help me with. In parallel mode I wanted to load all the data into one file, so I changed the parameter 'w' in utl_file.fopen to 'a'. But I don't understand why some of the data is lost. Maybe you could suggest something better for my goal?
    Thanks in advance

  20. մովսես says:

    Okay, just changed the code a bit and it worked, so if anyone needs to put all the data into one file using the parallel hint, they can do this trick. The key is to open the file in append mode inside the outer loop, not outside of the loops. Worked for me.
    LOOP
    if utl_file.is_open(v_file)
    then
    utl_file.fclose(v_file);
    end if;
    v_file := UTL_FILE.FOPEN(p_directory, v_name, 'a', c_maxline);
    FETCH p_source BULK COLLECT INTO v_rows LIMIT 1000;

    FOR i IN 1 .. v_rows.COUNT LOOP

    IF LENGTH(v_buffer) + c_eollen + LENGTH(v_rows(i)) <= c_maxline THEN
    v_buffer := v_buffer || c_eol || v_rows(i);
    ELSE
    IF v_buffer IS NOT NULL THEN
    UTL_FILE.PUT_LINE(v_file, v_buffer);
    END IF;
    v_buffer := v_rows(i);
    END IF;

    END LOOP;

  21. Anonymous says:

    I used this Function on an ExaData DB and it worked great.
    15 GB unloaded in 1 minute and 10 seconds at Parallel 75.

    I found a little bug when it is used on an Exa environment (or a "simple" RAC configuration).
    The uniqueness of the file name is not assured by only placing the SID in the file name, because the same SID may be (and will be) used on different nodes of the RAC.

    I modified the file name, adding the ID of the instance this way:

    ..
    v_insta_name VARCHAR(255);
    v_instance VARCHAR(255);
    ..

    BEGIN

    ..
    v_insta_name := sys_context('USERENV','INSTANCE_NAME');
    v_instance := sys_context('USERENV','INSTANCE');
    ..
    IF upper(p_unique_filename) = 'Y' THEN
    v_name := v_name || '_' || TO_CHAR(v_insta_name) || '_' || TO_CHAR(v_instance) || '_' || TO_CHAR(v_sid);
    END IF;
    ..

    Hope this will be useful for somebody.
    Thank you a lot for the source code of the Function.

    Stefano A.


  23. jiri says:

    thank you for the fix!!!

  24. k@z@k says:

    When I use the PARALLEL hint on a plain select clause, such as "select /*+ PARALLEL (A,4,1) */ column1, column2, …. columnX from table A;", it works just as expected.
    But when I use some custom function on a column, such as “select /*+ PARALLEL (A,4,1) */ column1, FUNC(column2)…. columnX from table A;”, then it looks like the parallel is not working, and the code produces only one file, so the time for processing is like no parallel. In my case this custom function is a function for replacing some characters into the column field.
    When use system REPLACE function, then parallel is working. As I want to replace multiple characters, I’m not using system REPLACE function.
    Is this behavior normal or there could be something in my function keeping it from parallel?

  25. jiri says:

    it's been a really long time since I wrote this post…

    Did you enable the function for parallel execution with PARALLEL_ENABLE and PRAGMA RESTRICT_REFERENCES ? http://docs.oracle.com/cd/A87860_01/doc/server.817/a76965/c22paral.htm#22181

  26. k@z@k says:

    PARALLEL_ENABLE did the job
    Thanks a lot for help and great code! It unloads data really fast!
