INSERT ALL INTO SQL statement and normalization in Oracle


It is very surprising to me how much people don’t know about Oracle INSERT ALL INTO multiple tables statement and over complicate some of the inserts more than they should. Here is one interesting aggregate query and how it can be resolved using simple INSERT statement in one pass.

Let’s assume I have customer table and want to create transposed aggregate table.

this is my source table

CUSTOMER_ID LOCATION_NAME TOTAL_BALANCE AVERAGE_BALANCE
234 Nevada 444.54 445.55
345 California 11.2 55.4
123 California 1000.5 800.2
CREATE TABLE JIRI.SOURCE_TABLE
(
  CUSTOMER_ID      NUMBER(10)                   NOT NULL,
  LOCATION_NAME    VARCHAR2(60 BYTE)            NOT NULL,
  TOTAL_BALANCE    NUMBER(10,2)                 NOT NULL,
  AVERAGE_BALANCE  NUMBER(10,2)                 NOT NULL
)

this is my summarized target table

LOCATION_NAME MEASURE_NAME MEASURE_VALUE
Nevada TOTAL_BALANCE 444.54
Nevada AVERAGE_BALANCE 445.55
California TOTAL_BALANCE 1011.7
California AVERAGE_BALANCE 855.6
CREATE TABLE JIRI.TARGET_TABLE
(
  LOCATION_NAME  VARCHAR2(60 BYTE)              NOT NULL,
  MEASURE_NAME   VARCHAR2(60 BYTE)              NOT NULL,
  MEASURE_VALUE  NUMBER(10,2)                   NOT NULL
)

Transformation from source to target is very simple SUM of individual measures with GROUP BY location_name. The only small tricky part is that we are normalizing this table by converting X columns to X rows. Let’s ignore the reasons for this normalization and let’s focus on the solution.

.

1. PLSQL

I want to briefly mention this solution, but won’t go into details (this is not SQL). You can use for example cursor, execute the query one and loop X times (based on number of columns you want to transform)

.

2. two SQL inserts (total cost 32) – can be also replaced with one SQL statement with UNION ALL (it can be either subquery or better WITH clause)

This is the easiest solution. I don’t want to say that it’s the solution you should never use simply because there might be situations where it fits the best. In general this reads source table 2x and thus creates a lot of overhead.

You can either run two inserts statements or one insert statement with UNION ALL between two queries. Either way it reads source table 2x (union all might run slower if the source table is large, I would probably run two statements).

  INSERT INTO JIRI.TARGET_TABLE (LOCATION_NAME, MEASURE_NAME, MEASURE_VALUE)
  SELECT LOCATION_NAME, 'TOTAL_BALANCE', SUM(TOTAL_BALANCE)
    FROM JIRI.SOURCE_TABLE
GROUP BY LOCATION_NAME;
COMMIT;

  INSERT INTO JIRI.TARGET_TABLE (LOCATION_NAME, MEASURE_NAME, MEASURE_VALUE)
  SELECT LOCATION_NAME, 'AVERAGE_BALANCE', SUM(AVERAGE_BALANCE)
    FROM JIRI.SOURCE_TABLE
GROUP BY LOCATION_NAME;
COMMIT;

INSERT STATEMENT  ALL_ROWS Cost: 16  Bytes: 30  Cardinality: 2
 2 SORT GROUP BY  Cost: 16  Bytes: 30  Cardinality: 2
 1 TABLE ACCESS FULL TABLE JIRI.SOURCE_TABLE Cost: 15  Bytes: 45  Cardinality: 3 

INSERT STATEMENT  ALL_ROWS Cost: 16  Bytes: 30  Cardinality: 2
 2 SORT GROUP BY  Cost: 16  Bytes: 30  Cardinality: 2
 1 TABLE ACCESS FULL TABLE JIRI.SOURCE_TABLE Cost: 15  Bytes: 45  Cardinality: 3

 

3. dummy cartesian JOIN (total cost 28)

This is a technique I used in the past in cases when I did not want INSERT statement but SELECT statement only. It works however extra join creates way more overhead than you need. Very simply say, you use cartesian join to X dual tables (in my case 2).

  INSERT INTO JIRI.TARGET_TABLE (LOCATION_NAME, MEASURE_NAME, MEASURE_VALUE)
  SELECT LOCATION_NAME,
         MEASURE_NAME,
         SUM(CASE WHEN MEASURE_NAME = 'TOTAL_BALANCE'   THEN TOTAL_BALANCE
                  WHEN MEASURE_NAME = 'AVERAGE_BALANCE' THEN AVERAGE_BALANCE
                  ELSE 0 END) MEASURE_VALUE
   FROM JIRI.SOURCE_TABLE A,
        (SELECT 'TOTAL_BALANCE' MEASURE_NAME FROM DUAL
          UNION ALL
         SELECT 'AVERAGE_BALANCE' MEASURE_NAME FROM DUAL) B
GROUP BY LOCATION_NAME, MEASURE_NAME;
COMMIT;

INSERT STATEMENT  ALL_ROWS Cost: 28  Bytes: 58  Cardinality: 2
 8 SORT GROUP BY  Cost: 28  Bytes: 58  Cardinality: 2
 7 MERGE JOIN CARTESIAN  Cost: 27  Bytes: 174  Cardinality: 6
 1 TABLE ACCESS FULL TABLE JIRI.SOURCE_TABLE Cost: 15  Bytes: 60  Cardinality: 3
 6 BUFFER SORT  Cost: 13  Bytes: 18  Cardinality: 2
 5 VIEW P_FINPROD. Cost: 4  Bytes: 18  Cardinality: 2
 4 UNION-ALL
 2 FAST DUAL  Cost: 2  Cardinality: 1
 3 FAST DUAL  Cost: 2  Cardinality: 1

 

4. one simple INSERT ALL statement (total cost 16)

This is the easiest yet not so widely used solution. I insert each record into 2 tables (in my case the same table).

  INSERT ALL
    INTO JIRI.TARGET_TABLE VALUES (LOCATION_NAME, LABEL1, TOTAL_BALANCE)
    INTO JIRI.TARGET_TABLE VALUES (LOCATION_NAME, LABEL2, AVERAGE_BALANCE)
  SELECT LOCATION_NAME,
         'TOTAL_BALANCE'      LABEL1,
         'AVERAGE_BALANCE'    LABEL2,
         SUM(TOTAL_BALANCE)   TOTAL_BALANCE,
         SUM(AVERAGE_BALANCE) AVERAGE_BALANCE
    FROM JIRI.SOURCE_TABLE
GROUP BY LOCATION_NAME;
COMMIT;

 INSERT STATEMENT  ALL_ROWS Cost: 16  Bytes: 174  Cardinality: 3
 5 MULTI-TABLE INSERT
 1 INTO JXNOVAK.TARGET_TABLE
 4 INTO JXNOVAK.TARGET_TABLE
 3 SORT GROUP BY  Cost: 16  Bytes: 174  Cardinality: 3
 2 TABLE ACCESS FULL TABLE JIRI.SOURCE_TABLE Cost: 15  Bytes: 174  Cardinality: 3

 

As you can see, solution (4) the easiest, fastest and most elegant solution with lowest cost.

Advertisements
Tagged , ,

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: