PL/SQL Tuning for Batch Systems


A History Lesson

Where were you in 1990? Nelson Mandela was being freed from Victor Verster Prison after 27 years behind bars, Saddam Hussein was starting the Gulf War by invading Kuwait, and Tim Berners-Lee was inventing the World Wide Web at CERN in Geneva. Me? In 1990, I was writing an insurance system in Oracle SQL*Forms v2.3.

Forms 2.3 - the last version before the introduction of PL/SQL - was frustrating and irritating in equal measure. It had only one control structure: GOTO. Clearly something had to give. Oracle provided the ability to link C modules (Pro*C, actually) into the runform.exe executable; these were called User Exits and could be called directly from that abominable Forms trigger language. Although their intended use was just to "go do something outside Oracle", programmers were writing entire procedural languages in Pro*C and linking them into Forms. Urban legend has it that PL/SQL began life as one of these languages, written by a frustrated techo in the 1980s, and that it was so successful that Oracle bought it.

Forms v3.0 did not improve the situation - it just legitimised it by making PL/SQL the native trigger language of Forms. But something far more important was happening at about the same time: Oracle v6.0 was shipped with PL/SQL on the server. For the first time, it was possible to integrate SQL and procedural logic without the overheads of a client-server architecture. This was seen as a great leap forward; the big problem with client-server was all that nasty context switching. PL/SQL - ProceduraL SQL - would combine procedural control structures with the SQL language and eliminate context switching from batch jobs. Right? Well... umm... not exactly. PL/SQL inherited more from its Forms roots than just syntax - it used the same two-engine architecture. Seventeen years later, the diagram in Chapter 1 of the Oracle 10g PL/SQL User's Guide and Reference shows that it still does!

Context Switching in PL/SQL

So what is context switching anyway? I Googled it and found the following concise definition:

Quote:
The process of saving an executing thread or process and transferring control to another thread or process.

But how much does all that saving and transferring cost in performance? Run the following test script in SQL*Plus to see for yourself:

CREATE TABLE context_switch_test 
AS 
SELECT * 
FROM dual 
WHERE 1=0;

SET TIMING ON

/* Process 100,000 rows, switching context every 1000 rows */
DECLARE
  CURSOR rec_cur IS
    SELECT level
    FROM   dual
    CONNECT BY level <= 100000;

  TYPE rec_tab_type IS TABLE OF NUMBER(10);
  rec_tab REC_TAB_TYPE;
BEGIN
  OPEN rec_cur;
  LOOP
    FETCH rec_cur BULK COLLECT INTO rec_tab LIMIT 1000;
    EXIT WHEN rec_tab.COUNT() = 0;
    
    FORALL i IN rec_tab.FIRST .. rec_tab.LAST
      DELETE 
      FROM context_switch_test 
      WHERE rec_tab(i) IS NOT NULL;
  END LOOP;
  CLOSE rec_cur;
END;
/

/* Process 100,000 rows, switching context every row */
BEGIN
  FOR rec IN (
    SELECT level
    FROM   dual
    CONNECT BY level <= 100000
  ) LOOP
    DELETE 
    FROM context_switch_test
    WHERE rec.level IS NOT NULL;
  END LOOP;
END;
/

On my database, the first example took 1.43 seconds, and the second one - which performed 1000 times more context switching - took 7.62 seconds.

"So," I hear you say, "6 more seconds to perform 99,000 context switches: I can live with that". But consider the ETL of a large modern data warehouse: it may process 1 million or more rows (1 million context switches), open half-a-dozen more cursors nested inside the main loop, each of which fetch 5 or more rows (30 million), and perform a million inserts (1 million more). Then consider there might be 50 such jobs running every night; at 6 seconds per 100,000 switches, that's 26 hours of unnecessary processing time. And the most common solution? Buy bigger boxes! Yeah!

Set Processing

The absolute best way to tune PL/SQL batch jobs is to stop using PL/SQL (well, almost). When we're applying a data-patch, or just cutting code free-hand in SQL*Plus (or TOAD, or SQL Navigator, or whatever's your pleasure), we're only too happy to perform multi-row UPDATE, DELETE, and INSERT statements that push all of the iteration logic (and - with WHERE clauses - the conditional logic as well) onto the SQL engine. So what is with the pathological, lemming-like urge to write cursor-loops in Production application code? There are a number of reasons, but most of them crumble under close scrutiny:

  • "But for each row returned by the cursor, we need to lookup values from several other tables."

    Join those lookup tables into the main cursor.

  • "But we perform Upserts. Try to INSERT, and then UPDATE the existing row if the INSERT fails with key-violation."

    Right on! Double your context switches! Use the MERGE statement instead to process the entire data set in a single statement (see the sketch after this list). Still using v8i? Load all of the rows you need into a Global Temporary Table, and then perform separate bulk UPDATE and INSERT statements from the GTT.

    Intuition tells us this should be slower because the data is being written twice; it's not. Inserts into GTTs generate less IO than regular tables because they don’t need to be recoverable in the event of a system crash. The additional time it takes to write the GTT is still much faster than the context switches and the single-row lookups performed by the cursor-loop.

  • "But we have to write to a log file every N rows. Prod Support can't tell where it’s up to otherwise."

    Rubbish! You wouldn't need to check where it was up to if it was already finished. In any case, long-running SQL statements with Full Table Scans, sorts, and hash joins can be monitored with the V$SESSION_LONGOPS view (see the query after this list).

  • "But we do business rule and integrity checking. Invalid rows need to go to another table."

    Use the multi-table INSERT FIRST statement (errors to one table, the rest to the main table; see the sketch after this list). Or in 8i, insert everything into a Global Temporary Table, then INSERT valid rows into the main table and errors into the error table with separate queries on the GTT.

  • "But we have to commit every N rows, otherwise we'll blow the rollback segments."

    In an OLTP system, perhaps; rollback segments there are configured for many concurrent small transactions, and the needs of the overnight batch are often overlooked. A work-around is to process the data in manageable chunks; one clever technique is the Rowid Range - search AskTom.oracle.com for the details.

    A Data Warehouse is a different story; large transactions are its bread-and-butter and it needs to be able to handle them. The DBA should set up rollback segments (UNDO) to handle transactions of several Gigabytes - more if necessary.
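
To make some of these alternatives concrete, here are minimal sketches. The table and column names (STAGE_EMP, EMP_ERRORS, and so on) are invented for illustration only; substitute your own.

An Upsert as a single MERGE statement:

MERGE INTO emp e
USING stage_emp s
ON (e.empno = s.empno)
WHEN MATCHED THEN
    UPDATE SET e.sal = s.sal, e.deptno = s.deptno
WHEN NOT MATCHED THEN
    INSERT (empno, ename, sal, deptno)
    VALUES (s.empno, s.ename, s.sal, s.deptno);

Monitoring a long-running scan from another session:

SELECT sid, opname, sofar, totalwork, time_remaining
FROM   v$session_longops
WHERE  sofar < totalwork;

Routing invalid rows with a multi-table INSERT FIRST:

INSERT FIRST
    WHEN sal IS NULL OR sal < 0 THEN
        INTO emp_errors (empno, ename, sal)
        VALUES (empno, ename, sal)
    ELSE
        INTO emp (empno, ename, sal)
        VALUES (empno, ename, sal)
SELECT empno, ename, sal
FROM   stage_emp;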

The power of in-line views, user-defined functions, and analytic functions allows us to resolve queries in SQL that were previously impossible; however, this should not be seen as a challenge to code every query in SQL at the expense of concise, intuitive, and maintainable code. There are situations where Set Processing is inappropriate, including complex iterative logic, complex conditional logic and branching, N:M cardinality joins, and mixed-cardinality multi-table INSERTs.

When common sense dictates that procedural logic is necessary, it can still be achieved with a minimal drop in performance by minimising context switching. Context switching occurs whenever a SQL operation is performed. This includes:

  • SELECT .. INTO statements.
  • INSERT, DELETE, and UPDATE statements.
  • OPEN, FETCH, and CLOSE a cursor.
  • FOR rec IN (cursor) loops.
  • Any DDL statements executed via Dynamic SQL.

BULK COLLECT

BULK COLLECT reduces FETCH operations by fetching many rows at a time into a collection. It should always be used in conjunction with the LIMIT clause where there is a risk of fetching too many rows; without it, the resultant collection could exceed memory limitations and start paging to disk. Anywhere from 100 to 1000 is a practical LIMIT; it is not important to find the "perfect" number, as there is a law of diminishing returns.

BULK COLLECT with the LIMIT clause is only available with explicit FETCH statements, which is annoying because it precludes the tidy and intuitive Implicit Cursor FOR loops.

Implicit Cursor FOR Loop: tidy, intuitive, and very, very, slow

FOR empl IN (
    SELECT ename
    FROM   emp
) LOOP
    <... some statements ...>
END LOOP;

Implicit Cursor BULK COLLECT: efficient, but not scalable

SELECT ename
BULK COLLECT INTO empl_tab
FROM   emp;

FOR i IN empl_tab.FIRST .. empl_tab.LAST LOOP
    <... some statements ...>
END LOOP;

Explicit Cursor BULK COLLECT with LIMIT: fast and scalable, but wordy and complex

DECLARE
    CURSOR empl_cur IS
    SELECT ename
    FROM   emp;

    TYPE empl_tab_type IS TABLE OF EMPL_CUR%ROWTYPE;
    empl_tab EMPL_TAB_TYPE;
BEGIN
    OPEN empl_cur;
    LOOP
        FETCH empl_cur BULK COLLECT INTO empl_tab LIMIT 1000;
        EXIT WHEN empl_tab.COUNT = 0;

        FOR i IN empl_tab.FIRST .. empl_tab.LAST LOOP
            <... some statements ...>
        END LOOP;

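        -- %NOTFOUND goes TRUE when the last FETCH returns fewer than LIMIT
        -- rows; those rows have already been processed above, so it is safe
        -- to exit here and save one final (empty) fetch.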
        EXIT WHEN empl_cur%NOTFOUND;
    END LOOP;
    CLOSE empl_cur;
END;

The FORALL Statement

Unlike the FOR loop, FORALL is more truly a statement than a loop; it can only repeat a single DML statement. Every INSERT, UPDATE, and DELETE statement in a PL/SQL program performs a context switch from the PL/SQL engine to the SQL engine. But by loading all of the rows to be processed into a collection and performing the DML inside a FORALL statement, the DML and the collection are passed to the SQL engine and run with a single context switch. Exceptions can be captured using the SAVE EXCEPTIONS clause and are returned in the SQL%BULK_EXCEPTIONS collection.

DECLARE
    TYPE empl_tab_type  IS TABLE OF emp%ROWTYPE;      -- collection of records
    TYPE empno_tab_type IS TABLE OF emp.empno%TYPE;   -- collection of scalars

    empl_tab  EMPL_TAB_TYPE;
    empno_tab EMPNO_TAB_TYPE;

    -- create an exception handler for ORA-24381
    dml_errors EXCEPTION;
    PRAGMA EXCEPTION_INIT(dml_errors, -24381);

BEGIN

    :     :

    FORALL i IN empl_tab.FIRST .. empl_tab.LAST SAVE EXCEPTIONS
        UPDATE emp
        SET ROW = empl_tab(i)
        WHERE empno = empno_tab(i);
    :     :

EXCEPTION
    WHEN dml_errors THEN
        FOR i IN 1 .. SQL%BULK_EXCEPTIONS.COUNT LOOP
            <... handle the errors ...>
        END LOOP;
END;

As demonstrated in this example, it is currently a limitation of PL/SQL that SQL statements can reference only entire rows of a collection, not individual data items within a record. Entire rows are useful when updating with SET ROW (as above) or record-wise inserting, but you must maintain separate scalar collections to reference individual values in the WHERE clause. Many find it easier to use only scalar collections and ignore the attractions of record-wise inserts and updates.
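
To illustrate the scalar-collection style, here is a minimal sketch; it assumes the standard EMP demo table and applies an arbitrary 10% pay rise:

DECLARE
    TYPE empno_tab_type IS TABLE OF emp.empno%TYPE;
    TYPE sal_tab_type   IS TABLE OF emp.sal%TYPE;

    empno_tab EMPNO_TAB_TYPE;
    sal_tab   SAL_TAB_TYPE;
BEGIN
    -- Fetch the keys and the new values into parallel scalar collections.
    SELECT empno, sal * 1.1
    BULK COLLECT INTO empno_tab, sal_tab
    FROM   emp;

    -- Each scalar collection can be referenced anywhere in the DML:
    -- in the SET clause, the WHERE clause, or both.
    FORALL i IN empno_tab.FIRST .. empno_tab.LAST
        UPDATE emp
        SET    sal   = sal_tab(i)
        WHERE  empno = empno_tab(i);
END;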

Associative Arrays

Single row lookups in PL/SQL should never be performed as SQL statements (or cursors) inside a cursor loop. Instead, they should be merged into the main cursor using a table join; use an outer-join if there is a chance the row may not exist in the lookup table. This is still an efficient method even if the conditional PL/SQL logic does not need the joined data for every row returned.

Some expert systems may have complex or iterative branching logic that makes it impractical to join every lookup permutation into the main cursor. In these situations, the lookup tables can be cached in PL/SQL using an associative array (Index-By Table) so that lookups do not need a context switch.

DECLARE
    TYPE currency_lkp_type IS TABLE OF currency%ROWTYPE INDEX BY CURRENCY.CURRENCY_CODE%TYPE;
    currency_lkp CURRENCY_LKP_TYPE;
BEGIN
    FOR curr_rec IN (SELECT * FROM currency) LOOP
        currency_lkp(curr_rec.currency_code) := curr_rec;
    END LOOP;

    :     :

    FOR i IN txn_tab.FIRST .. txn_tab.LAST LOOP
        :     :
        -- Lookup the currency table without using SQL
        txn_tab(i).currency_desc :=
            currency_lkp( txn_tab(i).currency_code ).currency_desc;
        :     :
    END LOOP;
END;

Note that this example breaks a cardinal rule by loading the associative array using an implicit-cursor FOR loop. It would be possible to BULK COLLECT the rows into a nested table, and then transfer the nested table row-by-row into the associative array, but it seems like overkill for downloading a small table just once for the entire job.
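
For completeness, here is a minimal sketch of that alternative, using the same illustrative CURRENCY table: BULK COLLECT into a nested table with a single context switch, then re-key the rows into the associative array in PL/SQL.

DECLARE
    TYPE currency_nt_type  IS TABLE OF currency%ROWTYPE;
    TYPE currency_lkp_type IS TABLE OF currency%ROWTYPE INDEX BY CURRENCY.CURRENCY_CODE%TYPE;

    currency_nt  CURRENCY_NT_TYPE;
    currency_lkp CURRENCY_LKP_TYPE;
BEGIN
    -- One FETCH (context switch) for the whole table...
    SELECT *
    BULK COLLECT INTO currency_nt
    FROM   currency;

    -- ...then transfer it row-by-row into the associative array.
    FOR i IN 1 .. currency_nt.COUNT LOOP
        currency_lkp(currency_nt(i).currency_code) := currency_nt(i);
    END LOOP;
END;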

Concurrent Cursors

Almost every data retrieval requirement for a batch process can be satisfied either by merging a lookup into the main cursor, or by using associative arrays. In a very small number of cases, there is a need to process rows from two or more very large tables which have N:M join-cardinality.

For example: consider a billing run where an invoice must be generated for each Customer, but in order to generate that invoice the program needs to process all of the Customer's Services, plus all of the Customer's Billing Instructions. If there were only one Billing Instruction per customer (say, the latest one), then the main cursor could join the Customer table to Customer Service (returning a row per Service per Customer) and join in the latest Billing Instruction to each row. However, when multiple Billing Instructions are required, such a join would destroy the cardinality of the cursor: one Customer joined to 5 Services and 4 Billing Instructions would explode to 5*4=20 rows.

The most efficient solution to such a problem is to use Concurrent Cursors. For this example, use 3 explicit cursors - one over Customers, one over Customer Services, and one over Customer Billing Instructions - each ordered by customer number. Open all of the cursors once only at the beginning of the program, and then fetch the rows customer by customer.

To demonstrate the concept more clearly, the sample code below does not use BULK COLLECT, although it could be adapted to do so.

DECLARE
-- Declare 3 cursors
CURSOR cust_cur IS
SELECT *
FROM   customer
ORDER BY cust_num;

CURSOR custserv_cur IS
SELECT *
FROM   customer_service
ORDER BY cust_num;

CURSOR custinst_cur IS
SELECT *
FROM   customer_billing_instruction
ORDER BY cust_num;

-- Declare a record type to FETCH each cursor, and collections to load
-- the services and billing instructions for each customer as they are
-- processed.
curr_cust_rec     CUST_CUR%ROWTYPE;
curr_custserv_rec CUSTSERV_CUR%ROWTYPE;
curr_custinst_rec CUSTINST_CUR%ROWTYPE;

TYPE custserv_tab_type IS TABLE OF CUSTSERV_CUR%ROWTYPE;
TYPE custinst_tab_type IS TABLE OF CUSTINST_CUR%ROWTYPE;

-- Initialise the nested tables so that DELETE and EXTEND can be called.
custserv_tab CUSTSERV_TAB_TYPE := CUSTSERV_TAB_TYPE();
custinst_tab CUSTINST_TAB_TYPE := CUSTINST_TAB_TYPE();

BEGIN
    -- Open all the cursors and pre-fetch the first row.
    OPEN cust_cur;
    OPEN custserv_cur;
    OPEN custinst_cur;
    
    FETCH cust_cur INTO curr_cust_rec;
    FETCH custserv_cur INTO curr_custserv_rec;
    FETCH custinst_cur INTO curr_custinst_rec;

    WHILE cust_cur%FOUND LOOP

        -- Load up the services for the customer
        custserv_tab.DELETE;
        WHILE custserv_cur%FOUND 
        AND curr_custserv_rec.cust_num <= curr_cust_rec.cust_num LOOP
            IF curr_custserv_rec.cust_num = curr_cust_rec.cust_num THEN
                custserv_tab.EXTEND;
                custserv_tab(custserv_tab.LAST) := curr_custserv_rec;
            END IF;
            FETCH custserv_cur INTO curr_custserv_rec;
        END LOOP;

        -- Load up the billing instructions for the customer
        custinst_tab.DELETE;
        WHILE custinst_cur%FOUND 
        AND curr_custinst_rec.cust_num <= curr_cust_rec.cust_num LOOP
            IF curr_custinst_rec.cust_num = curr_cust_rec.cust_num THEN
                custinst_tab.EXTEND;
                custinst_tab(custinst_tab.LAST) := curr_custinst_rec;
            END IF;
            FETCH custinst_cur INTO curr_custinst_rec;
        END LOOP;

        <... Process the invoice using services in the custserv_tab
             collection and billing instructions in the custinst_tab
             collection 
         ...>

        FETCH cust_cur INTO curr_cust_rec;
    END LOOP;
END;

Even though the above example - without BULK COLLECT - does not significantly reduce context switching, it still delivers a profound improvement in performance: usually several orders of magnitude. The reason is that the Customer Service and Customer Billing Instruction tables are each processed in a single fast Full Table Scan, rather than the death-of-a-thousand-cuts method of reading via an index in a separate cursor per customer.

Summary

PL/SQL performance tuning is easy: you need to remember only 2 things:

  • Do it in pure SQL if you can, and
  • If you can't, never code SQL inside a cursor loop.

Followup


In response to a query on the forum, I did some more investigating. In the demo script above, the bulk of the performance gain is from the FORALL, not from the BULK COLLECT. Tracing this script, I found that the DML within the FORALL statement both parses and executes once only, whereas the DML inside the cursor loop executes once for each iteration.
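
For anyone who wants to reproduce the trace, here is a minimal recipe; the tracefile identifier is arbitrary, and you will need read access to the server's trace (udump) directory:

ALTER SESSION SET tracefile_identifier = 'ctx_test';
ALTER SESSION SET sql_trace = TRUE;

-- ... run the two test blocks from the demo script ...

ALTER SESSION SET sql_trace = FALSE;

-- Format the raw trace with tkprof and compare the parse and execute
-- counts of the two DELETE statements:
--   tkprof <trace_file>.trc ctx_test.prf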

In fact, if you compare BULK COLLECT to a cursor loop without any SQL inside the loop, performance improvements are difficult to find.

Conclusion: whilst the techniques above are all still effective, it is primarily because they reduce cursor parse and execute activity, and only secondarily because they reduce context switching.

Comments

It's really a good document, interesting to read.

Hi rleishman, in the Concurrent cursor script I guess you mean:

    WHILE cust_cur%FOUND LOOP

        -- Load up the services for the customer
        custserv_tab.DELETE;
        WHILE custserv_cur%FOUND 
        AND curr_custserv_rec.cust_num <= curr_cust_rec.cust_num LOOP
            IF curr_custserv_rec.cust_num = curr_cust_rec.cust_num THEN
                custserv_tab.EXTEND;
                custserv_tab.LAST := curr_custserv_rec;        -- add custserv record
                FETCH custserv_cur INTO curr_custserv_rec;     -- get next record
            END IF;
        END LOOP;
    ...

This article helps a great deal. Thanks for that!

Thanks for that. Fixed in the main article


Great article. It really gives what one needs to know about pl/sql DML performance. Smart, straight to the point.
With many thanks, --Alice

Your first example roughly parallels an example from Oracle whereby context switching is avoided. But I need a further elaboration. Suppose an application wants to return several tens of thousands of rows. One could code up an OCI app, prepare and execute a SELECT statement, and grab rows as fast as the server can send them. But I was wondering if it is possible to write some PL/SQL to:

1. get a cursor
2. perform the select
3. grab rows 1000 at a time BULK collecting them into some storage
4. return those 1000 rows in one shot to the client app
5. goto (3) until no more rows

Shane

You're right, context switching is even more important for external languages that interface to Oracle.

I don't have much experience with them, except for Pro*C and a little indirect knowledge of OCI. I know Pro*C does support array processing: you define your cursor in Pro*C and then FETCH fixed numbers of rows INTO C-style arrays. This is all done without PL/SQL.

I believe the same thing can be done with OCI - but again the FETCHing is all done in host-SQL calls, not PL/SQL.

I don't think it would be easy to do what you are describing (fetching in PL/SQL) because it would mean binding PL/SQL collections (Nested Tables) to host arrays. I don't know how to do that, or even whether it can be done.