
Joining Temporal Intervals


From time to time a customer asks me how to join multiple tables with temporal intervals (e.g. defined by two columns such as valid_from and valid_to per row). The solution is quite simple if you can limit your query to a certain point in time such as now or yesterday: the time point becomes just an additional filter criterion per temporal table (e.g. t1.some_date BETWEEN t2.valid_from AND t2.valid_to). But in some cases this approach is not feasible, e.g. if you have to provide all relevant time intervals. In this post I explain a solution approach based on the following data model.

[Figure: TemporalJoinModel]

This model is based on the famous EMP and DEPT tables. I added a reference table for jobs just to get an additional table to join. You find the SQL script to create and populate the model here.
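
To illustrate the simple point-in-time case first, here is a sketch (not part of the original scripts) that pins every temporal table of this model to SYSDATE; the time point is just one additional filter predicate per table:

-- point-in-time join: one filter predicate per temporal table (sketch)
SELECT e.empno, e.ename, j.job, d.dname
  FROM empv e
 INNER JOIN jobv j
    ON j.jobno = e.jobno
       AND SYSDATE BETWEEN j.valid_from AND j.valid_to
 INNER JOIN deptv d
    ON d.deptno = e.deptno
       AND SYSDATE BETWEEN d.valid_from AND d.valid_to
 WHERE SYSDATE BETWEEN e.valid_from AND e.valid_to;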

I’d like to query data from the tables EMPV, DEPTV, JOBV and EMPV (manager). Here is the content of these tables, reduced to the data relevant for empno 7788 (SCOTT).

SQL> SELECT * FROM empv WHERE empno = 7788 ORDER BY valid_from;

EMPVID EMPNO ENAME JOBNO  MGR HIREDATE     SAL COMM DEPTNO VALID_FROM  VALID_TO
------ ----- ----- ----- ---- ----------- ---- ---- ------ ----------- -----------
     8  7788 SCOTT     5 7566 19-APR-1987 3000          20 19-APR-1987 31-DEC-1989
    22  7788 Scott     5 7566 19-APR-1987 3000          20 01-JAN-1990 31-MAR-1991
    36  7788 Scott     5 7566 19-APR-1987 3300          20 01-APR-1991 31-DEC-9999

SQL> SELECT * FROM jobv WHERE jobno = 5 ORDER BY valid_from;

JOBVID JOBNO JOB     VALID_FROM  VALID_TO
------ ----- ------- ----------- -----------
     5     5 ANALYST 01-JAN-1980 20-JAN-1990
    10     5 Analyst 21-JAN-1990 31-DEC-9999

SQL> SELECT * FROM deptv WHERE deptno = 20 ORDER BY valid_from;

DEPTVID DEPTNO DNAME    LOC    VALID_FROM  VALID_TO
------- ------ -------- ------ ----------- -----------
      2     20 RESEARCH DALLAS 01-JAN-1980 28-FEB-1990
      6     20 Research DALLAS 01-MAR-1990 31-MAR-1990
     10     20 Research Dallas 01-APR-1990 31-DEC-9999

SQL> SELECT * FROM empv WHERE empno = 7566 ORDER BY valid_from;

EMPVID EMPNO ENAME JOBNO  MGR HIREDATE       SAL COMM DEPTNO VALID_FROM  VALID_TO
------ ----- ----- ----- ---- ----------- ------ ---- ------ ----------- -----------
     4  7566 JONES     4 7839 02-APR-1981   2975          20 02-APR-1981 31-DEC-1989
    18  7566 Jones     4 7839 02-APR-1981   2975          20 01-JAN-1990 31-MAR-1991
    32  7566 Jones     4 7839 02-APR-1981 3272.5          20 01-APR-1991 31-DEC-9999

The following figure visualizes the expected result of a temporal join using the data queried previously.

[Figure: TemporalJoinOverview]

In this case six result records (intervals) are expected. As you can see, the result depends on the number of distinct intervals, i.e. on the distinct VALID_FROM values. The driving object is valid from 19-APR-1987 until 31-DEC-9999; VALID_FROM values outside of this validity are irrelevant (e.g. 01-JAN-1980 and 02-APR-1981).

Based on this information we can write the query. The inline view g produces a list of all distinct VALID_FROM values, which is used as an additional join criterion for all temporal tables.

SELECT e.empno,
       MIN(g.valid_from) AS valid_from,
       LEAD(MIN(g.valid_from) - 1, 1, DATE '9999-12-31') OVER(
          PARTITION BY e.empno ORDER BY MIN(g.valid_from)
       ) AS valid_to,
       e.ename,
       j.job,
       e.mgr,
       m.ename AS mgr_ename,
       e.hiredate,
       e.sal,
       e.comm,
       e.deptno,
       d.dname,
       d.loc
  FROM empv e
 INNER JOIN (SELECT valid_from FROM empv
             UNION
             SELECT valid_from FROM deptv
             UNION
             SELECT valid_from FROM jobv) g
    ON g.valid_from BETWEEN e.valid_from AND e.valid_to
 INNER JOIN deptv d
    ON d.deptno = e.deptno
       AND g.valid_from BETWEEN d.valid_from AND d.valid_to
 INNER JOIN jobv j
    ON j.jobno = e.jobno
       AND g.valid_from BETWEEN j.valid_from AND j.valid_to
  LEFT JOIN empv m
    ON m.empno = e.mgr
       AND g.valid_from BETWEEN m.valid_from AND m.valid_to
 WHERE e.empno = 7788
 GROUP BY e.empno,
          e.ename,
          j.job,
          e.mgr,
          m.ename,
          e.hiredate,
          e.sal,
          e.comm,
          e.deptno,
          d.dname,
          d.loc
 ORDER BY empno, valid_from;

EMPNO VALID_FROM  VALID_TO    ENAME JOB      MGR MGR_ENAME HIREDATE     SAL COMM DEPTNO DNAME    LOC
----- ----------- ----------- ----- ------- ---- --------- ----------- ---- ---- ------ -------- ------
 7788 19-APR-1987 31-DEC-1989 SCOTT ANALYST 7566 JONES     19-APR-1987 3000          20 RESEARCH DALLAS
 7788 01-JAN-1990 20-JAN-1990 Scott ANALYST 7566 Jones     19-APR-1987 3000          20 RESEARCH DALLAS
 7788 21-JAN-1990 28-FEB-1990 Scott Analyst 7566 Jones     19-APR-1987 3000          20 RESEARCH DALLAS
 7788 01-MAR-1990 31-MAR-1990 Scott Analyst 7566 Jones     19-APR-1987 3000          20 Research DALLAS
 7788 01-APR-1990 31-MAR-1991 Scott Analyst 7566 Jones     19-APR-1987 3000          20 Research Dallas
 7788 01-APR-1991 31-DEC-9999 Scott Analyst 7566 Jones     19-APR-1987 3300          20 Research Dallas

The beauty of this approach is that it works with any granularity and automatically merges identical intervals. In this example I use a granularity of a day, but the approach also works for a granularity of seconds or even fractions of a second, e.g. if you are using a TIMESTAMP data type to define the interval boundaries.
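
With TIMESTAMP boundaries only the offset arithmetic in the VALID_TO calculation changes. A sketch, assuming a granularity of one second (not part of the original post):

-- VALID_TO calculation with TIMESTAMP boundaries and a granularity of one second
LEAD(MIN(g.valid_from) - INTERVAL '1' SECOND, 1,
     TIMESTAMP '9999-12-31 23:59:59') OVER(
   PARTITION BY e.empno ORDER BY MIN(g.valid_from)
) AS valid_to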

It’s important to notice that I’ve used inclusive semantics for VALID_TO in this example. If you use exclusive semantics (VALID_TO = VALID_FROM of the subsequent interval) you have to amend the calculation of VALID_TO and the join criteria (BETWEEN is not feasible with exclusive semantics). Furthermore, this example does not cover gaps in the historization. If you have gaps you need to amend the calculation of the VALID_TO column and ensure that you do not merge gaps; merging intervals with a simple GROUP BY will produce wrong results if “disconnected” intervals have the same content. These issues are addressed in part 2 of this post.
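
For illustration, here is a sketch (my assumption of the necessary changes, not from the original post) of how the predicates change with exclusive semantics:

-- join predicate with exclusive VALID_TO (BETWEEN is not feasible)
ON d.deptno = e.deptno
   AND g.valid_from >= d.valid_from
   AND g.valid_from < d.valid_to
-- VALID_TO calculation without the "- 1" offset
LEAD(MIN(g.valid_from), 1, DATE '9999-12-31') OVER(
   PARTITION BY e.empno ORDER BY MIN(g.valid_from)
) AS valid_to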

Updated on 28-DEC-2012, emphasized possibility of wrong results and added link to part 2 of this post.


Merging Temporal Intervals with Gaps


In Joining Temporal Intervals I explained how to join multiple temporal tables. The provided solution also merges temporal intervals but – as pointed out in that post – may produce wrong results if the underlying driving table is not gaplessly historized. In this post I’ll explain how to merge temporal intervals correctly for various data constellations.

You find the scripts to create and populate the tables for scenario A and B here.

Scenario A – No overlapping intervals

This scenario handles consistent data: no overlapping intervals, no duplicate intervals, no enclosed intervals, no negative intervals (valid_to < valid_from). Here is the content of the example table t1:

SQL> SELECT * FROM t1;

       VID        OID VALID_FROM  VALID_TO    C1         C2
---------- ---------- ----------- ----------- ---------- ----------
         1          1 01-JAN-2010 31-DEC-2010 A          B1
         2          1 01-JAN-2011 31-MAR-2011 A          B2
         3          1 01-JUN-2011 31-JAN-2012 A          B2
         4          1 01-APR-2012 31-DEC-9999 A          B4
         5          2 01-JAN-2010 31-JUL-2012 B          B1
         6          2 01-AUG-2012 31-DEC-9999 B          B2
        18          4 01-JAN-2010 30-SEP-2011 D          D1
        19          4 01-OCT-2011 30-SEP-2012            D2
        20          4 01-OCT-2012 31-DEC-9999 D          D3

I’d like to write a query which produces all intervals for the columns OID and C1, honoring gaps in the historization. For OID 1 I expect records 1 and 2 to be merged, but not records 3 and 4, because their intervals are not “connected”. For OID 2 I expect a single merged interval. For OID 4 I expect 3 records, since the records with C1=’D’ are not connected.

So, the following query result is expected:

OID VALID_FROM  VALID_TO    C1
---------- ----------- ----------- ----------
         1 01-JAN-2010 31-MAR-2011 A
         1 01-JUN-2011 31-JAN-2012 A
         1 01-APR-2012 31-DEC-9999 A
         2 01-JAN-2010 31-DEC-9999 B
         4 01-JAN-2010 30-SEP-2011 D
         4 01-OCT-2011 30-SEP-2012
         4 01-OCT-2012 31-DEC-9999 D

The next query produces exactly this result.

WITH
   calc_various AS (
      -- produces column has_gap with the following meaning:
      -- 1: offset > 0 between current and previous record (gap)
      -- 0: offset = 0 between current and previous record (no gap)
      -- produces column new_group with the following meaning:
      -- 1: group-by-columns differ in current and previous record
      -- 0: same group-by-columns in current and previous record
      SELECT oid,
             valid_from,
             valid_to,
             c1,
             c2,
             CASE
                WHEN LAG(valid_to, 1, valid_from - 1) OVER(
                        PARTITION BY oid ORDER BY valid_from
                     ) = valid_from - 1 THEN
                   0
                ELSE
                   1
             END AS has_gap,
             CASE
                WHEN LAG(c1, 1, c1) OVER(
                        PARTITION BY oid ORDER BY valid_from
                     ) = c1 THEN
                   0
                ELSE
                   1
             END AS new_group
        FROM t1
   ),
   calc_group AS (
      -- produces column group_no, records with the same group_no
      -- are mergeable, group_no is calculated per oid 
      SELECT oid,
             valid_from,
             valid_to,
             c1,
             c2,
             SUM(has_gap + new_group) OVER(
                PARTITION BY oid ORDER BY oid, valid_from
             ) AS group_no
        FROM calc_various
   ),
   merged AS (
      -- produces the final merged result
      -- grouping by group_no ensures that gaps are honored
      SELECT oid,
             MIN(valid_from) AS valid_from,
             MAX(valid_to) AS valid_to,
             c1
        FROM calc_group
       GROUP BY oid, c1, group_no
       ORDER BY oid, valid_from
   )
-- main 
SELECT * FROM merged;

The usability of the WITH clause aka subquery factoring clause improved significantly with 11gR2. Since then it is no longer necessary to reference all named queries. The named queries become real transient views, which simplifies debugging a lot. If you replace the main query (the final “SELECT * FROM merged;”) with “SELECT * FROM calc_various ORDER BY oid, valid_from;” the query produces the following result:

OID VALID_FROM  VALID_TO    C1 C2 HAS_GAP NEW_GROUP
--- ----------- ----------- -- -- ------- ---------
  1 01-JAN-2010 31-DEC-2010 A  B1       0         0
  1 01-JAN-2011 31-MAR-2011 A  B2       0         0
  1 01-JUN-2011 31-JAN-2012 A  B2       1         0
  1 01-APR-2012 31-DEC-9999 A  B4       1         0
  2 01-JAN-2010 31-JUL-2012 B  B1       0         0
  2 01-AUG-2012 31-DEC-9999 B  B2       0         0
  4 01-JAN-2010 30-SEP-2011 D  D1       0         0
  4 01-OCT-2011 30-SEP-2012    D2       0         1
  4 01-OCT-2012 31-DEC-9999 D  D3       0         1

You see that the value 1 for HAS_GAP indicates that the record is not “connected” to the previous record. Additionally, the value 1 for the column NEW_GROUP indicates that records must not be merged even if they are connected.

To simplify the calculation of NEW_GROUP for multiple group-by columns (used in the named query “merged”), build a concatenated string of all relevant columns so you can deal with a single column, similar to column C1 in this example, as sketched below.
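
For example, the LAG comparison could operate on a delimited concatenation of C1 and C2 (a sketch; the delimiter must not occur in the data):

-- new_group over the combination of c1 and c2 in a single comparison
CASE
   WHEN LAG(c1 || '|' || c2, 1, c1 || '|' || c2) OVER(
           PARTITION BY oid ORDER BY valid_from
        ) = c1 || '|' || c2 THEN
      0
   ELSE
      1
END AS new_group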

HAS_GAP and NEW_GROUP are used in the subsequent named query calc_group which produces the following result:

OID VALID_FROM  VALID_TO    C1 C2 GROUP_NO
--- ----------- ----------- -- -- --------
  1 01-JAN-2010 31-DEC-2010 A  B1        0
  1 01-JAN-2011 31-MAR-2011 A  B2        0
  1 01-JUN-2011 31-JAN-2012 A  B2        1
  1 01-APR-2012 31-DEC-9999 A  B4        2
  2 01-JAN-2010 31-JUL-2012 B  B1        0
  2 01-AUG-2012 31-DEC-9999 B  B2        0
  4 01-JAN-2010 30-SEP-2011 D  D1        0
  4 01-OCT-2011 30-SEP-2012    D2        1
  4 01-OCT-2012 31-DEC-9999 D  D3        2

The GROUP_NO is calculated per OID. It’s technically a running total of HAS_GAP + NEW_GROUP. All intervals with the same GROUP_NO are mergeable. E.g. records 2, 3 and 4 of OID 1 each have a different GROUP_NO, which ensures that every single gap is honored.

Scenario B – Overlapping intervals

In reality we sometimes have to deal with inconsistent data, e.g. duplicate intervals, overlapping intervals or even negative intervals (valid_to < valid_from). I’ve created an example table t2, which is a copy of t1 plus an additional messy OID 3 with the following data:

SQL> SELECT * FROM t2 WHERE oid >= 3 ORDER BY vid;

       VID        OID VALID_FROM  VALID_TO    C1         C2
---------- ---------- ----------- ----------- ---------- ----------
         7          3 01-JAN-2010 31-DEC-2010 C          B1
         8          3 01-JAN-2010 31-MAR-2010 C          B2
         9          3 01-JUN-2010 31-AUG-2010 C          B3
        10          3 01-OCT-2010 31-DEC-2010 C          B4
        11          3 01-FEB-2011 30-JUN-2011 C          B5
        12          3 01-FEB-2011 30-JUN-2011 C          B6
        13          3 01-JUN-2011 31-AUG-2011 C          B7
        14          3 31-AUG-2011 30-SEP-2011 C          B8
        15          3 01-DEC-2011 31-MAY-2012 C          B9
        16          3 01-DEC-2011 31-MAY-2012 C          B9
        17          3 01-JUN-2012 31-DEC-2012 C          B10

The following figure visualizes the intervals of OID 3. The green patterns are expected to be used and the red patterns are expected to be ignored. The rationale is that the VID (version ID) is typically based on an Oracle sequence, and it is therefore assumed that higher VIDs are newer and thus more adequate. In real cases you may have additional information such as created_at or modified_on timestamps which helps you identify the record to be used in conflicting situations. Since the data is considered inconsistent in this scenario, the cleansing strategy might be very different in real cases. However, cleansing is always the first step, and in this example the highest VID has the highest priority in case of conflicts.

[Figure: MergingOverviewScenarioB]

In cleansing step 1 we gather all potential interval starting points. We do not need all of these intervals, but since we will merge intervals in the last processing step, we do not have to care about some unnecessary intervals right now.

SQL> WITH
  2     o AS (
  3        -- object identifier and their valid_from values
  4        SELECT oid, valid_from FROM t2
  5        UNION
  6        SELECT oid, valid_to + 1 FROM t2
  7         WHERE valid_to != DATE '9999-12-31'
  8     )
  9  -- main
 10  SELECT * FROM o WHERE oid = 3;

       OID VALID_FROM
---------- -----------
         3 01-JAN-2010
         3 01-APR-2010
         3 01-JUN-2010
         3 01-SEP-2010
         3 01-OCT-2010
         3 01-JAN-2011
         3 01-FEB-2011
         3 01-JUN-2011
         3 01-JUL-2011
         3 31-AUG-2011
         3 01-SEP-2011
         3 01-OCT-2011
         3 01-DEC-2011
         3 01-JUN-2012
         3 01-JAN-2013

15 rows selected.

In cleansing step 2 we calculate the relevant VID for every interval of the previous result. Additionally, we may inexpensively calculate whether an interval is a gap.

SQL> WITH
  2     o AS (
  3        -- object identifier and their valid_from values
  4        SELECT oid, valid_from FROM t2
  5        UNION
  6        SELECT oid, valid_to + 1 FROM t2
  7         WHERE valid_to != DATE '9999-12-31'
  8     ),
  9     v AS (
 10        -- relevant version identifier per valid_from
 11        -- produces column is_gap
 12        SELECT o.oid,
 13               MAX(vid) AS vid,
 14               o.valid_from,
 15               NVL2(MAX(vid), 0, 1) AS is_gap
 16          FROM o
 17          LEFT JOIN t2
 18            ON t2.oid = o.oid
 19               AND o.valid_from BETWEEN t2.valid_from AND t2.valid_to
 20         GROUP BY o.oid, o.valid_from
 21     )
 22  -- main
 23  SELECT * FROM v WHERE oid = 3 ORDER BY valid_from;

       OID        VID VALID_FROM      IS_GAP
---------- ---------- ----------- ----------
         3          8 01-JAN-2010          0
         3          7 01-APR-2010          0
         3          9 01-JUN-2010          0
         3          7 01-SEP-2010          0
         3         10 01-OCT-2010          0
         3            01-JAN-2011          1
         3         12 01-FEB-2011          0
         3         13 01-JUN-2011          0
         3         13 01-JUL-2011          0
         3         14 31-AUG-2011          0
         3         14 01-SEP-2011          0
         3            01-OCT-2011          1
         3         16 01-DEC-2011          0
         3         17 01-JUN-2012          0
         3            01-JAN-2013          1

15 rows selected.

In cleansing step 3 we extend the previous result by the missing columns from table t2 and calculate the NEW_GROUP column with the same logic as in scenario A.

SQL> WITH
  2     o AS (
  3        -- object identifier and their valid_from values
  4        SELECT oid, valid_from FROM t2
  5        UNION
  6        SELECT oid, valid_to + 1 FROM t2
  7         WHERE valid_to != DATE '9999-12-31'
  8     ),
  9     v AS (
 10        -- relevant version identifier per valid_from
 11        -- produces column is_gap
 12        SELECT o.oid,
 13               MAX(vid) AS vid,
 14               o.valid_from,
 15               NVL2(MAX(vid), 0, 1) AS is_gap
 16          FROM o
 17          LEFT JOIN t2
 18            ON t2.oid = o.oid
 19               AND o.valid_from BETWEEN t2.valid_from AND t2.valid_to
 20         GROUP BY o.oid, o.valid_from
 21     ),
 22     combined AS (
 23        -- combines previous intermediate result v with t2
 24        -- produces the valid_to and new_group columns
 25        SELECT t2.vid,
 26               v.oid,
 27               v.valid_from,
 28               LEAD(v.valid_from - 1, 1, DATE '9999-12-31') OVER (
 29                  PARTITION BY v.oid ORDER BY v.valid_from
 30               ) AS valid_to,
 31               t2.c1,
 32               t2.c2,
 33               v.is_gap,
 34               CASE
 35                  WHEN LAG(t2.c1, 1, t2.c1) OVER(
 36                          PARTITION BY t2.oid ORDER BY t2.valid_from
 37                       ) = c1 THEN
 38                     0
 39                  ELSE
 40                     1
 41               END AS new_group
 42          FROM v
 43          LEFT JOIN t2
 44            ON t2.oid = v.oid
 45               AND t2.vid = v.vid
 46               AND v.valid_from BETWEEN t2.valid_from AND t2.valid_to
 47     )
 48  -- main
 49  SELECT * FROM combined WHERE oid = 3 ORDER BY valid_from;

VID OID VALID_FROM  VALID_TO    C1 C2  IS_GAP NEW_GROUP
--- --- ----------- ----------- -- --- ------ ---------
  8   3 01-JAN-2010 31-MAR-2010 C  B2       0         0
  7   3 01-APR-2010 31-MAY-2010 C  B1       0         0
  9   3 01-JUN-2010 31-AUG-2010 C  B3       0         0
  7   3 01-SEP-2010 30-SEP-2010 C  B1       0         0
 10   3 01-OCT-2010 31-DEC-2010 C  B4       0         0
      3 01-JAN-2011 31-JAN-2011             1         1
 12   3 01-FEB-2011 31-MAY-2011 C  B6       0         0
 13   3 01-JUN-2011 30-JUN-2011 C  B7       0         0
 13   3 01-JUL-2011 30-AUG-2011 C  B7       0         0
 14   3 31-AUG-2011 31-AUG-2011 C  B8       0         0
 14   3 01-SEP-2011 30-SEP-2011 C  B8       0         0
      3 01-OCT-2011 30-NOV-2011             1         1
 16   3 01-DEC-2011 31-MAY-2012 C  B9       0         0
 17   3 01-JUN-2012 31-DEC-2012 C  B10      0         0
      3 01-JAN-2013 31-DEC-9999             1         1

15 rows selected.

Now we have cleansed the data and are ready for the final steps “calc_group” and “merged”, which are very similar to scenario A. The relevant difference is the filter “WHERE is_gap = 0” in the named query “merged”, which removes gap records. Here is the complete statement and the query result:

WITH
   o AS (
      -- object identifier and their valid_from values
      SELECT oid, valid_from FROM t2
      UNION
      SELECT oid, valid_to + 1 FROM t2 
       WHERE valid_to != DATE '9999-12-31'
   ),
   v AS (
      -- relevant version identifier per valid_from 
      -- produces column is_gap
      SELECT o.oid,
             MAX(vid) AS vid,
             o.valid_from,
             NVL2(MAX(vid), 0, 1) AS is_gap
        FROM o
        LEFT JOIN t2
          ON t2.oid = o.oid
             AND o.valid_from BETWEEN t2.valid_from AND t2.valid_to
       GROUP BY o.oid, o.valid_from
   ),
   combined AS (
      -- combines previous intermediate result v with t2
      -- produces the valid_to and new_group columns
      SELECT t2.vid,
             v.oid,
             v.valid_from,
             LEAD(v.valid_from - 1, 1, DATE '9999-12-31') OVER (
                PARTITION BY v.oid ORDER BY v.valid_from
             ) AS valid_to,
             t2.c1,
             t2.c2,
             v.is_gap,
             CASE
                WHEN LAG(t2.c1, 1, t2.c1) OVER(
                        PARTITION BY t2.oid ORDER BY t2.valid_from
                     ) = c1 THEN
                   0
                ELSE
                   1
             END AS new_group
        FROM v
        LEFT JOIN t2
          ON t2.oid = v.oid
             AND t2.vid = v.vid
             AND v.valid_from BETWEEN t2.valid_from AND t2.valid_to 
   ),
   calc_group AS (
      -- produces column group_no, records with the same group_no
      -- are mergeable, group_no is calculated per oid 
      SELECT oid,
             valid_from,
             valid_to,
             c1,
             c2,
             is_gap,
             SUM(is_gap + new_group) OVER(
                PARTITION BY oid ORDER BY oid, valid_from
             ) AS group_no
        FROM combined
   ),
   merged AS (
      -- produces the final merged result
      -- grouping by group_no ensures that gaps are honored
      SELECT oid,
             MIN(valid_from) AS valid_from,
             MAX(valid_to) AS valid_to,
             c1
        FROM calc_group
       WHERE is_gap = 0
       GROUP BY OID, c1, group_no
       ORDER BY OID, valid_from
   )
-- main 
SELECT * FROM merged;

OID VALID_FROM  VALID_TO    C1
---------- ----------- ----------- ----------
         1 01-JAN-2010 31-MAR-2011 A
         1 01-JUN-2011 31-JAN-2012 A
         1 01-APR-2012 31-DEC-9999 A
         2 01-JAN-2010 31-DEC-9999 B
         3 01-JAN-2010 31-DEC-2010 C
         3 01-FEB-2011 30-SEP-2011 C
         3 01-DEC-2011 31-DEC-2012 C
         4 01-JAN-2010 30-SEP-2011 D
         4 01-OCT-2011 30-SEP-2012
         4 01-OCT-2012 31-DEC-9999 D

If you change the filter to “WHERE is_gap = 1” you’ll get all gap records – a way to query non-existing intervals.

OID VALID_FROM  VALID_TO    C1
---------- ----------- ----------- ----------
         1 01-APR-2011 31-MAY-2011
         1 01-FEB-2012 31-MAR-2012
         3 01-JAN-2011 31-JAN-2011
         3 01-OCT-2011 30-NOV-2011
         3 01-JAN-2013 31-DEC-9999

Conclusion

Merging temporal intervals is challenging, especially if the history has gaps and the data is inconsistent as in scenario B. However, the SQL engine is a powerful tool to clean up data and merge the temporal intervals efficiently in a single SQL statement.

Joining Temporal Intervals Part 2


The solution I provided in Joining Temporal Intervals produces wrong results if one or more temporal tables have gaps in their history or if disconnected intervals have the same content. In this post I’ll address both problems.

Test Data

The example queries are based on the same model as described in Joining Temporal Intervals. For the join of the tables EMPV, DEPTV, JOBV and EMPV (manager) I’ve amended the history to contain some gaps, visible in the following listings.

SQL> SELECT * FROM empv WHERE empno = 7788 ORDER BY valid_from;

EMPVID EMPNO ENAME JOBNO  MGR HIREDATE       SAL COMM DEPTNO VALID_FROM  VALID_TO
------ ----- ----- ----- ---- ----------- ------ ---- ------ ----------- -----------
     8  7788 SCOTT     5 7566 19-APR-1987 3000.0          20 19-APR-1987 31-DEC-1989
    22  7788 Scott     5 7566 19-APR-1987 3000.0          20 01-JAN-1990 31-MAR-1991
    36  7788 Scott     5 7566 19-APR-1987 3300.0          20 01-APR-1991 31-JUL-1991
    43  7788 Scott     5 7566 01-JAN-1992 3500.0          20 01-JAN-1992 31-DEC-9999

SQL> SELECT * FROM jobv WHERE jobno = 5 ORDER BY valid_from;

    JOBVID JOBNO JOB       VALID_FROM  VALID_TO
---------- ----- --------- ----------- -----------
         5     5 ANALYST   01-JAN-1980 20-JAN-1990
        10     5 Analyst   22-JAN-1990 31-DEC-9999

SQL> SELECT * FROM deptv WHERE deptno = 20 ORDER BY valid_from;

   DEPTVID DEPTNO DNAME          LOC           VALID_FROM  VALID_TO
---------- ------ -------------- ------------- ----------- -----------
         2     20 RESEARCH       DALLAS        01-JAN-1980 28-FEB-1990
         6     20 Research       DALLAS        01-MAR-1990 31-MAR-1990
        10     20 Research       Dallas        01-APR-1990 31-DEC-9999

SQL> SELECT * FROM empv WHERE empno = 7566 ORDER BY valid_from;

EMPVID EMPNO ENAME JOBNO  MGR HIREDATE       SAL COMM DEPTNO VALID_FROM  VALID_TO
------ ----- ----- ----- ---- ----------- ------ ---- ------ ----------- -----------
     4  7566 JONES     4 7839 02-APR-1981 2975.0          20 02-APR-1981 31-DEC-1989
    18  7566 Jones     4 7839 02-APR-1981 2975.0          20 01-JAN-1990 31-MAR-1991
    32  7566 Jones     4 7839 02-APR-1981 3272.5          20 01-APR-1991 31-DEC-9999

SQL> SELECT * FROM empv WHERE mgr = 7788 ORDER BY valid_from;

EMPVID EMPNO ENAME JOBNO  MGR HIREDATE       SAL COMM DEPTNO VALID_FROM  VALID_TO
------ ----- ----- ----- ---- ----------- ------ ---- ------ ----------- -----------
    11  7876 ADAMS     1 7788 23-MAY-1987 1100.0          20 23-MAY-1987 31-DEC-1989
    25  7876 Adams     1 7788 23-MAY-1987 1100.0          20 01-JAN-1990 31-MAR-1991
    39  7876 Adams     1 7788 23-MAY-1987 1210.0          20 01-APR-1991 31-DEC-9999

From a business point of view Scott left the company on 31-JUL-1991 and came back on 01-JAN-1992 with a better salary. It’s important to notice that Scott is Adams’s manager; Adams is therefore leaderless from 01-AUG-1991 until 31-DEC-1991. Additionally, I fabricated a gap for JOBNO 5 on 21-JAN-1990.

You find the SQL script to create and populate the model here.

Gap-Aware Temporal Join

The following figure visualizes the expected result of the temporal join. The raw data intervals queried previously are represented in blue and the join result in red. The yellow bars highlight the gaps in the source and result data sets.

[Figure: OverviewJoiningTemporalIntervalsPart2]

Here is the query and the join result for EMPNO = 7788. Please note that the column LOC from table DEPTV is not queried, which reduces the number of final result intervals from 7 to 6.

SELECT e.empno,
       g.valid_from,
       LEAST(
          e.valid_to, 
          d.valid_to, 
          j.valid_to, 
          NVL(m.valid_to, e.valid_to),
          LEAD(g.valid_from - 1, 1, e.valid_to) OVER(
             PARTITION BY e.empno ORDER BY g.valid_from
          )
       ) AS valid_to,
       e.ename,
       j.job,
       e.mgr,
       m.ename AS mgr_ename,
       e.hiredate,
       e.sal,
       e.comm,
       e.deptno,
       d.dname
  FROM empv e
 INNER JOIN (SELECT valid_from FROM empv
             UNION
             SELECT valid_from FROM deptv
             UNION
             SELECT valid_from FROM jobv
             UNION
             SELECT valid_to + 1 FROM empv 
              WHERE valid_to != DATE '9999-12-31'
             UNION
             SELECT valid_to + 1 FROM deptv 
              WHERE valid_to != DATE '9999-12-31'
             UNION
             SELECT valid_to + 1 FROM jobv 
              WHERE valid_to != DATE '9999-12-31') g
    ON g.valid_from BETWEEN e.valid_from AND e.valid_to
 INNER JOIN deptv d
    ON d.deptno = e.deptno
       AND g.valid_from BETWEEN d.valid_from AND d.valid_to
 INNER JOIN jobv j
    ON j.jobno = e.jobno
       AND g.valid_from BETWEEN j.valid_from AND j.valid_to
  LEFT JOIN empv m
    ON m.empno = e.mgr
       AND g.valid_from BETWEEN m.valid_from AND m.valid_to
WHERE e.empno = 7788
ORDER BY 1, 2;

EMPNO VALID_FROM  VALID_TO    ENAME JOB     MGR MGR_E HIREDATE       SAL COMM DEPTNO DNAME
----- ----------- ----------- ----- ------- ---- ----- ----------- ------ ---- ------ --------
 7788 19-APR-1987 22-MAY-1987 SCOTT ANALYST 7566 JONES 19-APR-1987 3000.0          20 RESEARCH
 7788 23-MAY-1987 31-DEC-1989 SCOTT ANALYST 7566 JONES 19-APR-1987 3000.0          20 RESEARCH
 7788 01-JAN-1990 20-JAN-1990 Scott ANALYST 7566 Jones 19-APR-1987 3000.0          20 RESEARCH
 7788 22-JAN-1990 28-FEB-1990 Scott Analyst 7566 Jones 19-APR-1987 3000.0          20 RESEARCH
 7788 01-MAR-1990 31-MAR-1990 Scott Analyst 7566 Jones 19-APR-1987 3000.0          20 Research
 7788 01-APR-1990 31-MAR-1991 Scott Analyst 7566 Jones 19-APR-1987 3000.0          20 Research
 7788 01-APR-1991 31-JUL-1991 Scott Analyst 7566 Jones 19-APR-1987 3300.0          20 Research
 7788 01-JAN-1992 31-DEC-9999 Scott Analyst 7566 Jones 01-JAN-1992 3500.0          20 Research

The inline view g produces a list of all distinct VALID_FROM values, which is used as an additional join criterion for all temporal tables. Unlike in Joining Temporal Intervals, all interval end points must also be considered to identify gaps.

The calculation of the VALID_TO column is a bit laborious (see the LEAST expression in the SELECT list). You need to get the lowest VALID_TO value of all involved intervals, including outer-joined intervals such as m. The subsequent VALID_FROM also has to be considered, since the inline view g provides all VALID_FROM values to be probed and they may be completely independent of the involved intervals.

The remaining part of the query is quite simple. Note the rows in the result set which should be merged in a subsequent step (rows 1/2 and rows 5/6).

If you change the WHERE clause to “WHERE e.mgr = 7788” you get the following result:

EMPNO VALID_FROM  VALID_TO    ENAME JOB      MGR MGR_E HIREDATE       SAL COMM DEPTNO DNAME
----- ----------- ----------- ----- ------- ---- ----- ----------- ------ ---- ------ --------
 7876 23-MAY-1987 31-DEC-1989 ADAMS CLERK   7788 SCOTT 23-MAY-1987 1100.0          20 RESEARCH
 7876 01-JAN-1990 20-JAN-1990 Adams CLERK   7788 Scott 23-MAY-1987 1100.0          20 RESEARCH
 7876 21-JAN-1990 21-JAN-1990 Adams Clerk   7788 Scott 23-MAY-1987 1100.0          20 RESEARCH
 7876 22-JAN-1990 28-FEB-1990 Adams Clerk   7788 Scott 23-MAY-1987 1100.0          20 RESEARCH
 7876 01-MAR-1990 31-MAR-1990 Adams Clerk   7788 Scott 23-MAY-1987 1100.0          20 Research
 7876 01-APR-1990 31-MAR-1991 Adams Clerk   7788 Scott 23-MAY-1987 1100.0          20 Research
 7876 01-APR-1991 31-JUL-1991 Adams Clerk   7788 Scott 23-MAY-1987 1210.0          20 Research
 7876 01-AUG-1991 31-DEC-1991 Adams Clerk   7788       23-MAY-1987 1210.0          20 Research
 7876 01-JAN-1992 31-DEC-9999 Adams Clerk   7788 Scott 23-MAY-1987 1210.0          20 Research

This result is interesting for several reasons. First, there are two pairs of records which should be merged in a subsequent step (rows 3/4 and rows 5/6). Second, row 8 represents the time when Scott was not employed by this company. Third, rows 7 and 9 must not be merged in a subsequent step: they are identical (besides VALID_FROM and VALID_TO) but their intervals are not connected.

Merging Temporal Intervals

If you look at the result of the previous query you might be tempted to skip the merging step, since there are just a few intervals which need merging. However, in real-life scenarios you might easily end up with daily intervals for large tables, since the inline view g considers the valid_from and valid_to columns of all involved tables. Sooner or later you will think about merging temporal intervals or about other ways to reduce the result set. If you’re skeptical, querying for “e.empno = 7369” might give you an idea of what I’m talking about (21 intervals before the merge, 6 intervals after the merge).

Since I covered this topic in Merging Temporal Intervals with Gaps, I’ll provide the query to produce the final result and explain only the specifics.

WITH 
   joined AS (
      -- gap-aware temporal join
      -- produces result_cols to calculate new_group in the subsequent query
      SELECT e.empno,
             g.valid_from,
             LEAST(
                e.valid_to, 
                d.valid_to, 
                j.valid_to, 
                NVL(m.valid_to, e.valid_to),
                LEAD(g.valid_from - 1, 1, e.valid_to) OVER(
                   PARTITION BY e.empno ORDER BY g.valid_from
                )
             ) AS valid_to,
             (
                e.ename 
                || ',' || j.job 
                || ',' || e.mgr 
                || ',' || m.ename 
                || ',' || TO_CHAR(e.hiredate,'YYYY-MM-DD') 
                || ',' || e.sal 
                || ',' || e.comm 
                || ',' || e.deptno 
                || ',' || d.dname 
             ) AS result_cols,
             e.ename,
             j.job,
             e.mgr,
             m.ename AS mgr_ename,
             e.hiredate,
             e.sal,
             e.comm,
             e.deptno,
             d.dname             
        FROM empv e
       INNER JOIN (SELECT valid_from FROM empv
                   UNION
                   SELECT valid_from FROM deptv
                   UNION
                   SELECT valid_from FROM jobv
                   UNION
                   SELECT valid_to + 1 FROM empv 
                    WHERE valid_to != DATE '9999-12-31'
                   UNION
                   SELECT valid_to + 1 FROM deptv 
                    WHERE valid_to != DATE '9999-12-31'
                   UNION
                   SELECT valid_to + 1 FROM jobv 
                    WHERE valid_to != DATE '9999-12-31') g
          ON g.valid_from BETWEEN e.valid_from AND e.valid_to
       INNER JOIN deptv d
          ON d.deptno = e.deptno
             AND g.valid_from BETWEEN d.valid_from AND d.valid_to
       INNER JOIN jobv j
          ON j.jobno = e.jobno
             AND g.valid_from BETWEEN j.valid_from AND j.valid_to
        LEFT JOIN empv m
          ON m.empno = e.mgr
             AND g.valid_from BETWEEN m.valid_from AND m.valid_to
   ),
   calc_various AS (
      -- produces columns has_gap, new_group
      SELECT empno,
             valid_from,
             valid_to,
             result_cols,
             ename,
             job,
             mgr,
             mgr_ename,
             hiredate,
             sal,
             comm,
             deptno,
             dname,
             CASE
                WHEN LAG(valid_to, 1, valid_from - 1) OVER(
                        PARTITION BY empno ORDER BY valid_from
                     ) = valid_from - 1 THEN
                   0
                ELSE
                   1
             END AS has_gap,
             CASE 
                WHEN LAG(result_cols, 1, result_cols) OVER (
                        PARTITION BY empno ORDER BY valid_from
                     ) = result_cols THEN
                   0
                ELSE
                   1
             END AS new_group
        FROM joined
   ),
   calc_group AS (
      -- produces column group_no
      SELECT empno,
             valid_from,
             valid_to,
             ename,
             job,
             mgr,
             mgr_ename,
             hiredate,
             sal,
             comm,
             deptno,
             dname,
              SUM(has_gap + new_group) OVER(
                PARTITION BY empno ORDER BY valid_from
             ) AS group_no
        FROM calc_various
   ),
   merged AS (
      -- produces the final merged result
      -- grouping by group_no ensures that gaps are honored
      SELECT empno,
             MIN(valid_from) AS valid_from,
             MAX(valid_to) AS valid_to,
             ename,
             job,
             mgr,
             mgr_ename,
             hiredate,
             sal,
             comm,
             deptno,
             dname
        FROM calc_group
       GROUP BY empno,
                group_no,
                ename,
                job,
                mgr,
                mgr_ename,
                hiredate,
                sal,
                comm,
                deptno,
                dname
       ORDER BY empno,
                valid_from
   )   
-- main
SELECT * FROM merged WHERE empno = 7788;

EMPNO VALID_FROM  VALID_TO    ENAME JOB      MGR MGR_E HIREDATE       SAL COMM DEPTNO DNAME
----- ----------- ----------- ----- ------- ---- ----- ----------- ------ ---- ------ --------
 7788 19-APR-1987 31-DEC-1989 SCOTT ANALYST 7566 JONES 19-APR-1987 3000.0          20 RESEARCH
 7788 01-JAN-1990 20-JAN-1990 Scott ANALYST 7566 Jones 19-APR-1987 3000.0          20 RESEARCH
 7788 22-JAN-1990 28-FEB-1990 Scott Analyst 7566 Jones 19-APR-1987 3000.0          20 RESEARCH
 7788 01-MAR-1990 31-MAR-1991 Scott Analyst 7566 Jones 19-APR-1987 3000.0          20 Research
 7788 01-APR-1991 31-JUL-1991 Scott Analyst 7566 Jones 19-APR-1987 3300.0          20 Research
 7788 01-JAN-1992 31-DEC-9999 Scott Analyst 7566 Jones 01-JAN-1992 3500.0          20 Research

The named query “joined” produces an additional column RESULT_COLS. It’s simply a concatenation of all columns used in the GROUP BY clause of the named query “merged”. RESULT_COLS is used in the named query “calc_various” to calculate the column NEW_GROUP. NEW_GROUP is set to 1 if the value of RESULT_COLS differs between the current and the previous row. NEW_GROUP ensures that Adams’s intervals valid from 01-APR-1991 and valid from 01-JAN-1992 are not merged; see rows 5 and 7 in the following listing.

EMPNO VALID_FROM  VALID_TO    ENAME JOB      MGR MGR_E HIREDATE       SAL COMM DEPTNO DNAME
----- ----------- ----------- ----- ------- ---- ----- ----------- ------ ---- ------ --------
 7876 23-MAY-1987 31-DEC-1989 ADAMS CLERK   7788 SCOTT 23-MAY-1987 1100.0          20 RESEARCH
 7876 01-JAN-1990 20-JAN-1990 Adams CLERK   7788 Scott 23-MAY-1987 1100.0          20 RESEARCH
 7876 21-JAN-1990 28-FEB-1990 Adams Clerk   7788 Scott 23-MAY-1987 1100.0          20 RESEARCH
 7876 01-MAR-1990 31-MAR-1991 Adams Clerk   7788 Scott 23-MAY-1987 1100.0          20 Research
 7876 01-APR-1991 31-JUL-1991 Adams Clerk   7788 Scott 23-MAY-1987 1210.0          20 Research
 7876 01-AUG-1991 31-DEC-1991 Adams Clerk   7788       23-MAY-1987 1210.0          20 Research
 7876 01-JAN-1992 31-DEC-9999 Adams Clerk   7788 Scott 23-MAY-1987 1210.0          20 Research

The ORA_HASH function applied to RESULT_COLS could give a shorter representation of all result columns. But since hash collisions would not be detected and would lead to wrong results in some rare data constellations, I decided not to use a hash function.
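
For reference, the rejected variant would have compared hashes instead of the full string, along these lines (sketch):

-- shorter representation of result_cols, but collisions would go undetected
ORA_HASH(result_cols) AS result_cols_hash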

The named queries “calc_various”, “calc_group” and “merged” are based on the query in scenario A of Merging Temporal Intervals with Gaps. The columns HAS_GAP, NEW_GROUP and GROUP_NO are explained in that post.

Conclusion

Joining and merging temporal intervals is indeed very challenging. Even though I showed in this post that it is doable, I recommend choosing a simpler solution whenever feasible, e.g. limiting the query to a certain point in time for all involved temporal tables, since this eliminates the need to merge temporal intervals and even simplifies the gap-aware temporal join.

Loading Historical Data Into Flashback Archive Enabled Tables


Oracle provides via OTN an import solution for FBA (Flashback Data Archive, also known as Total Recall). The solution extends the SCN-to-TIMESTAMP mapping and provides a wrapper around existing APIs to populate the history. However, issues like using a customized mapping period/precision or ORA-1466 when using the AS OF TIMESTAMP clause are not addressed. In this post I show how to load historical data into flashback-archive-enabled tables using the standard API. Unfortunately, some experimental actions are still necessary to get a fully functional result, at least with version 11.2.0.3.4.

Test Scenario

There are various reasons why you may want to load historical data into an FBA-enabled table, e.g. for testing purposes, to move FBA-enabled tables from one database instance to another, or to migrate conventionally historized tables. This example is based on a migration scenario. Table T1 has the following content and shall be migrated to FBA. You find the script to create and populate the table T1 here.

SQL> SELECT * FROM t1;

VID OID CREATED_AT          OUTDATED_AT         C1 C2
--- --- ------------------- ------------------- -- --
  1   1 2012-12-19 13:00:57 2012-12-21 08:31:01 A  A1
  2   1 2012-12-21 08:31:01 2012-12-23 20:58:05 A  A2
  3   1 2012-12-23 20:58:05 2012-12-27 11:40:41 A  A3
  4   1 2012-12-27 11:40:41 9999-12-31 23:59:59 A  A4
  5   2 2012-12-20 13:51:55 9999-12-31 23:59:59 B  B1
  6   4 2012-12-22 11:03:22 2012-12-23 19:36:08 C  C1
  7   4 2012-12-28 14:25:50 2012-12-30 17:10:39 C  C1
  8   4 2012-12-30 17:10:39 2012-12-31 12:05:40 C  C2

The column VID is the version identifier and the primary key. OID is the object identifier, which is unique at every point in time. CREATED_AT and OUTDATED_AT define the interval boundaries. Columns C1 and C2 are the payload columns, which may change over time.
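
For reference, T1 could be defined along these lines (a sketch derived from the listing above; the actual script is linked earlier):

-- sketch of t1; column types assumed from the listing above
CREATE TABLE t1 (
   vid         NUMBER       NOT NULL PRIMARY KEY, -- version identifier
   oid         NUMBER       NOT NULL,             -- object identifier
   created_at  DATE         NOT NULL,             -- inclusive lower interval boundary
   outdated_at DATE         NOT NULL,             -- exclusive upper interval boundary
   c1          VARCHAR2(10),                      -- payload
   c2          VARCHAR2(10)                       -- payload
);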

The following queries return data valid at 2012-12-23 19:36:08 and now.

SQL> SELECT * FROM t1
  2   WHERE created_at <= TIMESTAMP '2012-12-23 19:36:08'
  3     AND outdated_at > TIMESTAMP '2012-12-23 19:36:08';

VID OID CREATED_AT          OUTDATED_AT         C1 C2
--- --- ------------------- ------------------- -- --
  2   1 2012-12-21 08:31:01 2012-12-23 20:58:05 A  A2
  5   2 2012-12-20 13:51:55 9999-12-31 23:59:59 B  B1

SQL> SELECT * FROM t1 WHERE outdated_at > SYSDATE;

VID OID CREATED_AT          OUTDATED_AT         C1 C2
--- --- ------------------- ------------------- -- --
  4   1 2012-12-27 11:40:41 9999-12-31 23:59:59 A  A4
  5   2 2012-12-20 13:51:55 9999-12-31 23:59:59 B  B1

It’s important to notice that OID 4 (VID 6, outdated exactly at the queried point in time) is not part of the first query result, since OUTDATED_AT has exclusive semantics. I mention this fact because the column ENDSCN in the table SYS_FBA_HIST_<object_id> also uses exclusive semantics, which simplifies the migration process, at least in this area.

From UTC to SCN

Oracle uses its own time standard, the SCN (System Change Number), for FBA. The SCN is initialized during database creation and is valid for a single Oracle instance, even if synchronization mechanisms among database instances exist (see also MOS note 1376995.1). An SCN may not represent a date-time value before 1988-01-01 00:00:00. The time between two SCNs varies: it may be shorter when the database instance is executing a lot of transactions, and longer in more idle times or when the database instance is shut down. Oracle uses the table SYS.SMON_SCN_TIME to map SCNs to TIMESTAMPs and vice versa, and provides the functions SCN_TO_TIMESTAMP and TIMESTAMP_TO_SCN for that purpose.

So what is the first date-time value which may be converted into a SCN?

SQL> SELECT MIN(time_dp) AS min_time_dp,
  2         MIN(scn) AS min_scn,
  3         CAST(scn_to_timestamp(MIN(scn)) AS DATE) AS min_scn_to_ts,
  4         MAX(dbtimezone) AS dbtimezone,
  5         sessiontimezone
  6    FROM sys.smon_scn_time;

MIN_TIME_DP            MIN_SCN MIN_SCN_TO_TS       DBTIMEZONE SESSIONTIMEZONE
------------------- ---------- ------------------- ---------- ---------------
2010-09-05 22:40:05      18669 2010-09-06 00:40:05 +00:00     +01:00

The first value is 2010-09-06 00:40:05. You may notice the two-hour difference to MIN_TIME_DP. Oracle stores the date values in this table in UTC (DBTIMEZONE) and my database server’s time zone is CET (Central European Time), which is UTC+01:00 at the time of the query (see SESSIONTIMEZONE). However, in September daylight saving time was active, and back then CET (more precisely, CEST) was UTC+02:00. That explains the two-hour difference.
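
The offset is easy to verify (a sketch, assuming the Europe/Zurich time zone region for CET/CEST):

-- 2010-09-05 22:40:05 UTC corresponds to 2010-09-06 00:40:05 CEST (UTC+02:00)
SELECT TIMESTAMP '2010-09-05 22:40:05 +00:00'
          AT TIME ZONE 'Europe/Zurich' AS local_time
  FROM dual;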

Let’s test the boundary values.

SQL> SELECT timestamp_to_scn(TIMESTAMP '2010-09-06 00:40:05') AS SCN FROM dual;

       SCN
----------
     18669

SQL> SELECT timestamp_to_scn(TIMESTAMP '2010-09-06 00:40:04') AS SCN FROM dual;
SELECT timestamp_to_scn(TIMESTAMP '2010-09-06 00:40:04') AS SCN FROM dual
       *
ERROR at line 1:
ORA-08180: no snapshot found based on specified time
ORA-06512: at "SYS.TIMESTAMP_TO_SCN", line 1

As expected, 2010-09-06 00:40:05 works and 2010-09-06 00:40:04 raises an ORA-08180 error.

I expect that you get completely different values in your database, since the SMON process deletes “old” values in SYS.SMON_SCN_TIME based on the UNDO and FBA configuration. Since this table is solely used for timestamp-to-SCN conversion, I assume it is safe to extend it manually. In fact, the PL/SQL package DBMS_FDA_MAPPINGS provided by Oracle does exactly that, but distributes the remaining SCNs uniformly between 1988-01-01 and the MIN(time_dp) in SYS.SMON_SCN_TIME. In my case there are only 18668 SCNs remaining to be assigned to timestamps before 2010-09-06 00:40:05. So if I know that I won’t need timestamps mapped to SCNs before, say, 2010-01-01, I may use the remaining values to improve the precision of the timestamp-to-SCN mapping for this reduced period.

In my case I do not need to extend the mapping in SYS.SMON_SCN_TIME, but here is an example of how it can be done:

SQL> INSERT INTO smon_scn_time (
  2     thread,
  3     orig_thread,
  4     time_mp,
  5     time_dp,
  6     scn_wrp,
  7     scn_bas,
  8     scn,
  9     num_mappings
 10  )
 11  SELECT 0 AS thread,
 12         0 AS orig_thread,
 13         (
 14            CAST(
 15               to_timestamp_tz(
 16                  '2010-09-06 00:40:04 CET',
 17                  'YYYY-MM-DD HH24:MI:SS TZR'
 18               ) AT TIME ZONE 'UTC' AS DATE
 19            ) - DATE '1970-01-01'
 20         ) * 60 * 60 * 24 AS time_mp,
 21         CAST(
 22            to_timestamp_tz(
 23               '2010-09-06 00:40:04 CET',
 24               'YYYY-MM-DD HH24:MI:SS TZR'
 25            ) at TIME ZONE 'UTC' AS DATE
 26         ) AS time_dp,
 27         FLOOR((MIN(scn) - 1) / POWER(2, 32)) AS scn_wrp,
 28         MOD(MIN(scn) - 1, POWER(2, 32)) AS scn_bas,
 29         MIN(scn) - 1 AS scn,
 30         0 AS num_mappings
 31    FROM sys.smon_scn_time;

1 row created.

SQL> SELECT timestamp_to_scn(TIMESTAMP '2010-09-06 00:40:04') AS scn FROM dual;

       SCN
----------
     18668

TIME_MP is the number of seconds since 1970-01-01. TIME_DP is the date value of TIME_MP. The calculation of these columns includes a time zone conversion from CET to UTC, which makes it a bit verbose. SCN_WRP counts the number of times SCN_BAS has wrapped around its 32-bit maximum of 4294967295.
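
As a worked example (sketch), the SCN 161382016632 that shows up later in T2_DDL_COLMAP decomposes as follows:

-- decompose an SCN into its wrap and base parts
SELECT FLOOR(161382016632 / POWER(2, 32)) AS scn_wrp, -- 37
       MOD(161382016632, POWER(2, 32)) AS scn_bas     -- 2468226680
  FROM dual;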

Alternatively, you may simply extend SYS.SMON_SCN_TIME based on SYS.V$LOG_HISTORY.

INSERT INTO smon_scn_time
   (thread,
    orig_thread,
    time_mp,
    time_dp,
    scn_wrp,
    scn_bas,
    scn,
    num_mappings)
   SELECT 0 AS thread,
          0 AS orig_thread,
          (first_time - DATE '1970-01-01') * 60 * 60 * 24 AS time_mp,
          first_time AS time_dp,
          floor(first_change# / power(2, 32)) AS scn_wrp,
          MOD(first_change#, power(2, 32)) AS scn_bas,
          first_change# AS scn,
          0 AS num_mappings
     FROM v$log_history
    WHERE first_time < (SELECT MIN(time_dp)
                          FROM smon_scn_time);

BTW: the TIMESTAMP_TO_SCN function uses its own kind of result cache. You may need to restart the database if you undo changes in SYS.SMON_SCN_TIME.

Next, we should check whether the mapping from timestamp to SCN is precise enough, meaning that no SCN should be shared by multiple timestamps. Here’s the query for T1:

SQL> SELECT COUNT(DISTINCT ts) AS cnt_ts,
  2         COUNT(DISTINCT timestamp_to_scn(ts)) AS cnt_ts_to_scn
  3    FROM (SELECT created_at ts FROM t1
  4          UNION
  5          SELECT outdated_at FROM t1
  6          MINUS
  7          SELECT TIMESTAMP '9999-12-31 23:59:59' FROM dual);

    CNT_TS CNT_TS_TO_SCN
---------- -------------
        10            10

T1 is ready to be migrated to an FBA-enabled table if CNT_TS and CNT_TS_TO_SCN are equal. If the values differ you basically have two options: a) amend T1 (e.g. change timestamps or merge intervals) or b) amend SYS.SMON_SCN_TIME as explained above.

Migration

The following script creates a flashback archive and the FBA-enabled table T2. At the end a small PL/SQL block is executed to create views for the three SYS_FBA_…_<object_id> tables.

-- create FBA
CREATE FLASHBACK ARCHIVE fba TABLESPACE USERS QUOTA 10M RETENTION 10 YEAR;

-- create FBA enabled table t2
CREATE TABLE t2 (
   oid         NUMBER(4,0)  NOT NULL PRIMARY KEY,
   c1          VARCHAR2(10) NOT NULL,
   c2          VARCHAR2(10) NOT NULL
) FLASHBACK ARCHIVE fba;

-- enforce visibility of SYS_FBA tables
BEGIN
   dbms_flashback_archive.disassociate_fba(owner_name => USER, table_name => 'T2');
   dbms_flashback_archive.reassociate_fba(owner_name => USER, table_name => 'T2');
END;
/

-- create views on SYS_FBA tables
DECLARE
   PROCEDURE create_view(in_view_name  IN VARCHAR2,
                         in_table_name IN VARCHAR2) IS
   BEGIN
      EXECUTE IMMEDIATE 'CREATE OR REPLACE VIEW ' || in_view_name ||
                        ' AS SELECT * FROM ' || in_table_name;
   END create_view;
BEGIN
   FOR l_rec IN (SELECT object_id
                   FROM user_objects
                  WHERE object_name = 'T2')
   LOOP
      create_view('T2_DDL_COLMAP','SYS_FBA_DDL_COLMAP_' || l_rec.object_id);
      create_view('T2_HIST', 'SYS_FBA_HIST_' || l_rec.object_id);
      create_view('T2_TCRV', 'SYS_FBA_TCRV_' || l_rec.object_id);
   END LOOP;
END;
/

Flashback Query (since Oracle9i) and Flashback Data Archive (since Oracle 11g) are tightly coupled. The SYS_FBA_…_<object_id> tables are created with a delay by the FBDA background process; DML on and queries against table T2 are possible anyway with the help of UNDO and Flashback Query. The PL/SQL block calling DISASSOCIATE_FBA and REASSOCIATE_FBA enforces the immediate creation of the SYS_FBA_…_<object_id> tables. This step is necessary to create the views T2_DDL_COLMAP, T2_HIST and T2_TCRV, which simplify the access to the underlying tables in my SQL scripts.

The next script copies data from table T1 to T2. Basically that’s what the PL/SQL package DBMS_FDA_IMPORT provided by Oracle does.

-- migrate current rows
INSERT INTO t2 (OID, c1, c2)
   SELECT OID, c1, c2
     FROM t1
    WHERE outdated_at > SYSDATE;
COMMIT;

-- enable DML on FBA tables
BEGIN
   dbms_flashback_archive.disassociate_fba(owner_name => USER, table_name => 'T2');
END;
/

-- migrate T1 rows into T2
INSERT INTO t2_hist (RID, STARTSCN, ENDSCN, XID, OPERATION, OID, C1, C2)
-- outdated INSERTs (simulating INSERT/DELETE logic)
SELECT NULL AS rid,
       timestamp_to_scn(created_at) AS startscn,
       timestamp_to_scn(outdated_at) AS endscn,
       NULL AS XID,
       'I' AS operation,
       OID,
       c1,
       c2
  FROM t1
 WHERE outdated_at < SYSDATE
-- current INSERTs (workaround for ORA-55622 on insert into T2_TRCV)
UNION ALL
SELECT t2.rowid AS rid,
       timestamp_to_scn(t1.created_at) AS startscn,
       h.startscn AS endscn,
       NULL AS XID,
       'I' AS operation,
       t2.OID,
       t2.c1,
       t2.c2
  FROM t1 t1
 INNER JOIN t2
    ON t2.oid = t1.oid
 INNER JOIN t2_tcrv h
    ON h.RID = t2.rowid
 WHERE t1.outdated_at > SYSDATE;

COMMIT;

-- disable DML on FBA tables
BEGIN
   dbms_flashback_archive.reassociate_fba(owner_name => USER, table_name => 't2');
END;
/

The following queries return data valid at 2012-12-23 19:36:08 and now (as for T1 above).

SQL> SELECT * FROM t2 AS OF SCN timestamp_to_scn(TIMESTAMP '2012-12-23 19:36:08');

OID C1 C2
--- -- --
  1 A  A2
  2 B  B1

SQL> SELECT * FROM t2;

OID C1 C2
--- -- --
  1 A  A4
  2 B  B1

As you can see, the results are identical to the ones of T1. The script here compares the result for every single point in time in T1 with T2; I’ve run it without detecting differences.
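
Such a comparison boils down to probing both tables at the same points in time, e.g. (a sketch, not the linked script):

-- rows in T1 that are missing in T2 at this point in time;
-- repeat with the operands swapped for a full comparison
SELECT oid, c1, c2
  FROM t1
 WHERE created_at <= TIMESTAMP '2012-12-23 19:36:08'
   AND outdated_at > TIMESTAMP '2012-12-23 19:36:08'
MINUS
SELECT oid, c1, c2
  FROM t2 AS OF SCN timestamp_to_scn(TIMESTAMP '2012-12-23 19:36:08');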

Migration Issue 1 – ORA-1466 When Using AS OF TIMESTAMP Clause

In the examples above I avoided the use of the AS OF TIMESTAMP clause and used the AS OF SCN clause instead. The reason becomes apparent when executing the following query:

SQL> SELECT * FROM t2 AS OF TIMESTAMP TIMESTAMP '2012-12-23 19:36:08';
SELECT * FROM t2 AS OF TIMESTAMP TIMESTAMP '2012-12-23 19:36:08'
              *
ERROR at line 1:
ORA-01466: unable to read data - table definition has changed

OK, we have not changed the table definition. But was the table definition of T2 valid on 2012-12-23 19:36:08?

SQL> SELECT column_name,
  2         startscn,
  3         endscn,
  4         CAST(scn_to_timestamp(startscn) AS DATE) AS start_ts
  5    FROM t2_ddl_colmap;

COLUMN_NAME       STARTSCN     ENDSCN START_TS
----------- -------------- ---------- -------------------
OID           161382016632            2013-01-03 00:17:42
C1            161382016632            2013-01-03 00:17:42
C2            161382016632            2013-01-03 00:17:42

Since I created the table on 2013-01-03 00:17:42, Oracle assumes that no columns exist before this point in time. The AS OF SCN clause, however, is not that picky.

To fix the problem we need to update T2_DDL_COLMAP, but unfortunately DBMS_FLASHBACK_ARCHIVE.DISASSOCIATE_FBA does not allow us to change the content of the DDL_COLMAP directly (it is possible indirectly by altering the HIST table, but that is not helpful in this case).

I’ve written a PL/SQL package TVD_FBA_HELPER which updates SYS.TAB$ behind the scenes to overcome this restriction. Please consult Oracle Support on how to proceed if you plan to use it in production environments. The package is provided as is, and of course you use it at your own risk.

Here is the script to fix the validity of the DDL_COLMAP:

-- enable DML on FBA table
BEGIN
   tvd_fba_helper.disassociate_col_map(in_owner_name => USER, in_table_name => 'T2');
END;
/

-- enforce DDL colmap consistency (valid for first entries)
UPDATE t2_ddl_colmap
   SET startscn =
       (SELECT MIN(startscn)
          FROM t2_hist);
COMMIT;

-- disable DML on FBA tables
BEGIN
   tvd_fba_helper.reassociate_col_map(in_owner_name => USER, in_table_name => 'T2');
END;
/

Now the AS OF TIMESTAMP clause works as well:

SQL> SELECT * FROM t2 AS OF TIMESTAMP TIMESTAMP '2012-12-23 19:36:08';

OID C1 C2
--- -- --
  1 A  A2
  2 B  B1

Migration Issue 2 – Different Number of Intervals

If you query the full history of T2 you get 10 rows, but T1 contains only 8 rows.

SQL> SELECT ROWNUM AS vid, OID, created_at, outdated_at, c1, c2
  2    FROM (SELECT OID,
  3                 TO_DATE(TO_CHAR(versions_starttime, 'YYYY-MM-DD HH24:MI:SS'),
  4                         'YYYY-MM-DD HH24:MI:SS') AS created_at,
  5                 NVL(TO_DATE(TO_CHAR(versions_endtime, 'YYYY-MM-DD HH24:MI:SS'),
  6                             'YYYY-MM-DD HH24:MI:SS'),
  7                     TIMESTAMP '9999-12-31 23:59:59') AS outdated_at,
  8                 c1,
  9                 c2
 10            FROM t2 VERSIONS BETWEEN TIMESTAMP TIMESTAMP '2012-12-19 13:00:57'
 11                                     AND SYSTIMESTAMP
 12           ORDER BY 1, 2);

VID OID CREATED_AT          OUTDATED_AT         C1 C2
--- --- ------------------- ------------------- -- --
  1   1 2012-12-19 13:00:53 2012-12-21 08:20:19 A  A1
  2   1 2012-12-21 08:20:19 2012-12-23 20:58:03 A  A2
  3   1 2012-12-23 20:58:03 2012-12-27 11:40:40 A  A3
  4   1 2012-12-27 11:40:40 2013-01-03 00:50:19 A  A4
  5   1 2013-01-03 00:50:19 9999-12-31 23:59:59 A  A4
  6   2 2012-12-20 13:51:54 2013-01-03 00:50:19 B  B1
  7   2 2013-01-03 00:50:19 9999-12-31 23:59:59 B  B1
  8   4 2012-12-22 10:26:35 2012-12-23 19:36:07 C  C1
  9   4 2012-12-28 14:25:50 2012-12-30 17:10:38 C  C1
 10   4 2012-12-30 17:10:38 2012-12-31 12:05:39 C  C2

The rows for OID 1 and 2 split at 2013-01-03 00:50:19 could be merged. The content is not really wrong, it’s just different from T1. The reason is that DBMS_FLASHBACK_ARCHIVE.DISASSOCIATE_FBA does not allow us to modify T2_TCRV (the SYS_FBA_TCRV_<object_id> table). This table contains validity information for the current rows in T2.

SQL> SELECT rid,
  2         startscn,
  3         CAST(scn_to_timestamp(startscn) AS DATE) AS start_ts,
  4         endscn,
  5         op
  6    FROM t2_tcrv;

RID                      STARTSCN START_TS                ENDSCN O
------------------ -------------- ------------------- ---------- -
AAAZ1PAAIAAAAb0AAA   161382018095 2013-01-03 00:50:19            I
AAAZ1PAAIAAAAb0AAB   161382018095 2013-01-03 00:50:19            I

That’s why I had to insert the current rows into T2_HIST as well.

But with the help of the PL/SQL package TVD_FBA_HELPER the data may be fixed as follows:

-- enable DML on FBA tables
BEGIN
   dbms_flashback_archive.disassociate_fba(owner_name => USER, table_name => 'T2');
   tvd_fba_helper.disassociate_tcrv(in_owner_name => USER, in_table_name => 'T2');
END;
/

-- extend begin of validity in TCRV table and fix HIST table accordingly
MERGE INTO t2_tcrv t
USING (SELECT rid, startscn FROM t2_hist b) s
   ON (s.rid = t.rid)
 WHEN MATCHED THEN
    UPDATE SET t.startscn = s.startscn;
DELETE FROM t2_hist WHERE rid IN (SELECT rid FROM t2_tcrv);
COMMIT;

-- disable DML on FBA tables
BEGIN
   tvd_fba_helper.reassociate_tcrv(in_owner_name => USER, in_table_name => 'T2');
   dbms_flashback_archive.reassociate_fba(owner_name => USER, table_name => 'T2');
END;
/

Now the query returns 8 rows:

SQL> SELECT ROWNUM AS vid, OID, created_at, outdated_at, c1, c2
  2    FROM (SELECT OID,
  3                 TO_DATE(TO_CHAR(versions_starttime, 'YYYY-MM-DD HH24:MI:SS'),
  4                         'YYYY-MM-DD HH24:MI:SS') AS created_at,
  5                 NVL(TO_DATE(TO_CHAR(versions_endtime, 'YYYY-MM-DD HH24:MI:SS'),
  6                             'YYYY-MM-DD HH24:MI:SS'),
  7                     TIMESTAMP '9999-12-31 23:59:59') AS outdated_at,
  8                 c1,
  9                 c2
 10            FROM t2 VERSIONS BETWEEN TIMESTAMP TIMESTAMP '2012-12-19 13:00:57'
 11                                     AND SYSTIMESTAMP
 12           ORDER BY 1, 2);

VID OID CREATED_AT          OUTDATED_AT         C1 C2
--- --- ------------------- ------------------- -- --
  1   1 2012-12-19 13:00:53 2012-12-21 08:20:19 A  A1
  2   1 2012-12-21 08:20:19 2012-12-23 20:58:03 A  A2
  3   1 2012-12-23 20:58:03 2012-12-27 11:40:40 A  A3
  4   1 2012-12-27 11:40:40 9999-12-31 23:59:59 A  A4
  5   2 2012-12-20 13:51:54 9999-12-31 23:59:59 B  B1
  6   4 2012-12-22 10:26:35 2012-12-23 19:36:07 C  C1
  7   4 2012-12-28 14:25:50 2012-12-30 17:10:38 C  C1
  8   4 2012-12-30 17:10:38 2012-12-31 12:05:39 C  C2

The timestamps are slightly different from the ones in T1, but that’s expected behavior since the precision of the TIMESTAMP_TO_SCN conversion is limited to around 3 seconds.

Conclusion

Loading historical data into FBA-enabled tables requires a strategy to populate historical mappings in SYS.SMON_SCN_TIME. Afterwards you may load the associated SYS_FBA_HIST_<object_id> table with the help of the Oracle-supplied PL/SQL package DBMS_FLASHBACK_ARCHIVE and its procedures DISASSOCIATE_FBA and REASSOCIATE_FBA. I recommend this approach for 11gR2 database environments.

The solutions to fix migration issue 1 (ORA-01466 when using the AS OF TIMESTAMP clause) and migration issue 2 (different number of intervals) are considered experimental. I suggest contacting Oracle Support to discuss how to proceed if you need a solution in this area.

Updated on 10-APR-2014, changed calculation of SCN_WRP and SCN_BAS in Extend SMON_SCN_TIME (1) and Extend SMON_SCN_TIME (2), changed link to new version of TVD_FBA_HELPER.

Trivadis PL/SQL & SQL CodeAnalyzer Released

A month ago I gave a talk about “Extending the Oracle Data Dictionary for Fine-Grained PL/SQL and SQL Analysis” at the ODTUG Kscope13 conference in New Orleans. Oracle data dictionary views such as DBA_IDENTIFIERS or DBA_DEPENDENCIES are in many cases sufficient to analyze static PL/SQL and SQL code within the Oracle database. But what if more detailed analyses are required, such as the use of tables or columns in PL/SQL package units, in SQL statements or in SQL statement clauses? Wouldn’t a DBA_OBJECT_USAGE view – showing DML and query operations on tables/views per database object – be a helpful tool?

TVDCA – the Trivadis PL/SQL and SQL CodeAnalyzer – is such a tool and helps you to overcome several analysis restrictions in an Oracle 10g, 11g or 12c database. At Kscope13 some of my attentive session attendees got a USB stick with TVDCA 0.4.1 Beta. In the meantime I was busy fixing bugs, so I can now proudly present an updated trial/preview version free of charge in the download section of this blog.

The following query might give you an idea of the functionality of tvdca:

SQL> SELECT object_name, procedure_name, operation, table_name, column_name
  2    FROM tvd_object_col_usage_v
  3   WHERE owner = 'TVDCA'
  4         AND object_type = 'PACKAGE BODY';

OBJECT_NAME    PROCEDURE_NAME OPERATION TABLE_NAME           COLUMN_NAME
-------------- -------------- --------- -------------------- ----------------
TVD_COLDEP_PKG GET_DEP        SELECT    DBA_DEPENDENCIES     NAME
TVD_COLDEP_PKG GET_DEP        SELECT    DBA_DEPENDENCIES     OWNER
TVD_COLDEP_PKG GET_DEP        SELECT    DBA_DEPENDENCIES     REFERENCED_NAME
TVD_COLDEP_PKG GET_DEP        SELECT    DBA_DEPENDENCIES     REFERENCED_OWNER
TVD_COLDEP_PKG GET_DEP        SELECT    TVD_PARSED_OBJECTS_V OBJECT_NAME
TVD_COLDEP_PKG GET_DEP        SELECT    TVD_PARSED_OBJECTS_V OBJECT_TYPE
TVD_COLDEP_PKG GET_DEP        SELECT    TVD_PARSED_OBJECTS_V OWNER
TVD_COLDEP_PKG PROCESS_VIEW   SELECT    DBA_TAB_COLUMNS      COLUMN_ID
TVD_COLDEP_PKG PROCESS_VIEW   SELECT    DBA_TAB_COLUMNS      OWNER
TVD_COLDEP_PKG PROCESS_VIEW   SELECT    DBA_TAB_COLUMNS      TABLE_NAME

Please have a look at my slides or the information in the download section if you are interested to learn more about tvdca.

Trivadis PL/SQL & SQL CodeChecker Released

In August 2009 Trivadis – the company I work for – released the first version of their PL/SQL & SQL Coding Guidelines. Back then we made our PL/SQL assessments based on interviews and checked the code against our guidelines using Code Xpert, SQL*Plus scripts and some manual/visual checks. You may imagine that this approach had some shortcomings, especially if you think about repeating the process after some corrections.

Back then the idea was born to build a tool which allows running the checks fully automated and making them part of our continuous integration environment.

Today I’m proud to release the first public beta version of TVDCC – the Trivadis PL/SQL & SQL CodeChecker. TVDCC is a file-based command line utility and does not require a connection to an Oracle database at any time. Simply run

tvdcc path=.

to scan the current directory including all subdirectories for SQL*Plus files and to create HTML and Excel reports.

See my download area for more information about TVDCC and to grab your copy of TVDCC. Any feedback is highly appreciated.

Column-less Table Access

While writing some JUnit tests after fixing bugs in dependency analysis views, I came up with the following query:

SELECT owner, object_type, object_name, operation, table_name
  FROM tvd_object_usage_v
MINUS
SELECT owner, object_type, object_name, operation, table_name
  FROM tvd_object_col_usage_v

The first view tvd_object_usage_v contains all table/view usages per object. The second view tvd_object_col_usage_v contains all column usages per object.

The idea was to check the completeness of the second view tvd_object_col_usage_v. I believed that there cannot be an object usage without one or more corresponding column usages. Therefore I assumed the query above should return no rows, but obviously I was plain wrong.

Here are some examples of column-less table accesses:

SELECT sys_guid() 
  FROM dual;

SELECT COUNT(*) 
  FROM bonus;

SELECT rownum AS row_num
  FROM dual
CONNECT BY rownum <= 1000;

SELECT e.empno, e.ename
  FROM emp e, dept d;

Based on that I’ve built the test case as follows:

INSERT INTO tvd_captured_sql_t
   (cap_id, cap_source)
VALUES
   (-1007,
    'SELECT sys_guid() FROM dual;
     SELECT COUNT(*) FROM bonus;
     SELECT rownum AS row_num FROM dual CONNECT BY rownum <= 1000;  
     SELECT e.empno, e.ename FROM emp e, dept d;');
COMMIT;

tvdca.sh user=tvdca password=tvdca host=groemitz sid=phs112

SQL> SELECT operation, table_name
  2    FROM tvd_sql_usage_v
  3   WHERE cap_id = -1007;

OPERAT TABLE_NAME
------ ------------------------------
SELECT DUAL
SELECT BONUS
SELECT DUAL
SELECT EMP
SELECT DEPT

SQL> SELECT operation, table_name, column_name
  2    FROM tvd_sql_col_usage_v
  3   WHERE cap_id = -1007;

OPERAT TABLE_NAME                     COLUMN_NAME
------ ------------------------------ ------------------------------
SELECT EMP                            EMPNO
SELECT EMP                            ENAME

SQL> SELECT operation, table_name
  2    FROM tvd_sql_usage_v
  3   WHERE cap_id = -1007
  4  MINUS
  5  SELECT operation, table_name
  6    FROM tvd_sql_col_usage_v
  7   WHERE cap_id = -1007;

OPERAT TABLE_NAME
------ ------------------------------
SELECT BONUS
SELECT DEPT
SELECT DUAL

These tests are now part of my TVDCA test suite to ensure column-less table access is handled appropriately ;-) 

BTW, here is an excerpt of my JUnit test:

@Test
public void testColumnLessTableAccess() {
	String tabSql = "SELECT COUNT(*) FROM tvd_sql_usage_v WHERE cap_id = -1007 AND table_name LIKE :table_name";
	String colSql = "SELECT COUNT(*) FROM tvd_sql_col_usage_v WHERE cap_id = -1007 AND table_name LIKE :table_name and column_name LIKE :column_name";
	int count;
	Map<String, String> namedParameters = new HashMap<String, String>();
	// all tables
	namedParameters.put("table_name", "%");
	namedParameters.put("column_name", "%");
	count = jdbcTemplate.queryForObject(tabSql, namedParameters,
			Integer.class);
	Assert.assertEquals(5, count);
	count = jdbcTemplate.queryForObject(colSql, namedParameters,
			Integer.class);
	Assert.assertEquals(2, count);
}

Multi-temporal Features in Oracle 12c

Oracle 12c has a feature called Temporal Validity. With Temporal Validity you can add one or more valid time dimensions to a table using existing columns, or using columns automatically created by the database. This means that Oracle offers combined with Flashback Data Archive native bi-temporal and even multi-temporal historization features. This blog post explains the different types of historization, when and how to use them and positions the most recent Oracle 12c database features.

Semantics and Granularity of Periods

In Flashback Data Archive Oracle defines periods with a half-open interval. This means that a point in time x is part of a period if x >= the start of the period and x < the end of the period. It is no surprise that Oracle also uses half-open intervals for Temporal Validity. The following figure visualizes the principle:

Period, Semantics and Granularity

Fig. 1: Semantics and Granularity of Periods

The advantage of a half-open interval is that the end of a preceding period is identical with the start of the subsequent period. Thus there is no gap and the granularity of a period (year, month, day, second, millisecond, nanosecond, etc.) is irrelevant. The disadvantage is that querying data at a point in time using a traditional WHERE clause is a bit more verbose compared to closed intervals, since BETWEEN conditions are not applicable.
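
The following sketch contrasts the two styles for a point-in-time filter (assuming period columns vt_start and vt_end as used below; the date is arbitrary):

-- closed intervals: BETWEEN is applicable
WHERE DATE '2015-01-01' BETWEEN vt_start AND vt_end

-- half-open intervals: explicit comparisons are required
WHERE vt_start <= DATE '2015-01-01'
  AND vt_end   >  DATE '2015-01-01'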

Furthermore, Oracle uses NULL for -∞ and +∞. Considering this information the WHERE clause to filter the currently valid periods looks as follows:

WHERE (vt_start IS NULL OR vt_start <= SYSTIMESTAMP)
  AND (vt_end IS NULL OR vt_end > SYSTIMESTAMP)

Use of Temporal Periods

In an entity-relationship model temporal periods may be used for master or reference data. For transactions or positions we do not need temporal periods since the data itself contains one or more timestamps. Corrections may be done through a reversal or difference posting logic, similar to bookkeeping transactions.

The situation is similar in a dimensional model. Dimensions correspond to master and reference data and may have a temporal period (e.g. slowly changing dimensions type 2). Facts do not have temporal periods. Instead they are modeled with one or more relationships to the time dimension. A fact is immutable. Changes are applied through new facts using a reversal or difference posting logic.

Transaction Time – TT

A flight data recorder collects and records various metrics during a flight to allow the reconstruction of the past. The transaction or system time in a data model is comparable to the functionality of such a flight data recorder. A table with a transaction time axis allows to query the current and the past state, but changes in the past or in the future are not possible.

Example: Scott becomes a manager. The change of the job description from “Analyst” to “Manager” is entered into the system on April 15, 2013 at 15:42:42. The previous description Analyst is terminated at this point in time and the new description Manager becomes current at exactly the same point in time.

Oracle supports the transaction time with Flashback Data Archive (formerly known as Total Recall). Using Flashback Data Archive you may query a consistent state of the past.

SCN  Session A                                   Session B
---  ------------------------------------------  ------------------------
  1  INSERT INTO emp
        (empno, ename, job, sal, deptno)
     VALUES
        (4242, 'CARTER', 'CLERK', '2400', 20);
  2  SELECT COUNT(*)
       FROM emp; -- 15 rows
  3                                               SELECT COUNT(*)
                                                    FROM emp; -- 14 rows
  4  COMMIT;

Tab. 1: Consistent View of the Past

What is the result of the query “SELECT COUNT(*) FROM emp AS OF SCN 3” based on Table 1 above? – 14 rows. This is a good and reasonable representation of the past. However, it also shows that the consistent representation of the past is a matter of definition; in this case it does not represent the situation of session A.

Valid Time – VT

The valid time describes the period during which something in the real world is considered valid. This period is independent of the entry into the system and therefore needs to be maintained explicitly. Changes and queries are supported in the past as well as in the future.

Example: Scott becomes a manager. The change of the job description from “Analyst” to “Manager” is valid from January 1, 2014. The previous description Analyst is terminated at this point in time and the new description Manager becomes valid at exactly the same point in time. It is irrelevant when this change is entered into the system.

Decision Time – DT

The decision time describes the date and time a decision has been made. This point in time is independent of an entry into the system and is not directly related to the valid time period. Future changes are not possible.

Example: Scott becomes a manager. The decision to change the job description from “Analyst” to “Manager” was made on March 24, 2013. The previous job description Analyst is terminated on the decision time axis at this point in time and the new description Manager becomes current at exactly the same point in time on the decision time axis. It is irrelevant when this change is entered into the system and it is irrelevant when Scott may officially call himself a manager.

Historization Types

The historization types are based on the time dimensions visualized in figure 2 and are categorized by the combination of these time dimensions. In this post only the most popular and generic time periods are covered. However, depending on the requirements additional, specific time periods are conceivable.

Historization Types

Fig. 2: Historization Types

Non-temporal models do not have any time dimensions (e.g. EMP and DEPT in Schema SCOTT).

Uni-temporal models use just one time dimension (e.g. transaction time or valid time).

Bi-temporal models use exactly two time dimensions (e.g. transaction time and valid time).

Multi-temporal models use at least three time dimensions.

Tri-temporal models are based on exactly three time dimensions.

Temporal Validity

The feature Temporal Validity covers the DDL and DML enhancements in Oracle 12c concerning temporal data management. The statements CREATE TABLE, ALTER TABLE and DROP TABLE have been extended by a new PERIOD FOR clause. Here is an example:

SQL> ALTER TABLE dept ADD (
  2     vt_start DATE,
  3     vt_end   DATE,
  4     PERIOD FOR vt (vt_start, vt_end)
  5  );

SQL> SELECT * FROM dept;	

    DEPTNO DNAME          LOC           VT_START   VT_END
---------- -------------- ------------- ---------- ----------
        10 ACCOUNTING     NEW YORK
        20 RESEARCH       DALLAS
        30 SALES          CHICAGO
        40 OPERATIONS     BOSTON

VT names the period and is a hidden column. The association of the VT period with the VT_START and VT_END columns is stored in the Oracle Data Dictionary in the table SYS_FBA_PERIOD. You need a dedicated ALTER TABLE call for every additional period.

For every period a constraint is created to enforce positive time periods (VT_START < VT_END). But it is not possible to define temporal constraints, e.g. to prohibit overlapping periods, gaps, or orphaned parent/child periods.

Oracle 12c does not deliver support for temporal DML. Desirable would be, for example:

  • insert, update, delete for a given period
  • update a subset of columns for a given period
  • merge of connected and identical periods

Hence temporal changes have to be implemented as a series of conventional DML. Here is an example:

SQL> UPDATE dept SET vt_end = DATE '2014-01-01' WHERE deptno = 30;

SQL> INSERT INTO dept (deptno, dname, loc, vt_start)
  2       VALUES (30, 'SALES', 'SAN FRANCISCO', DATE '2014-01-01');

SQL> SELECT * FROM dept WHERE deptno = 30 ORDER BY vt_start NULLS FIRST;

 DEPTNO    DNAME          LOC           VT_START   VT_END
---------- -------------- ------------- ---------- ----------
        30 SALES          CHICAGO                  2014-01-01
        30 SALES          SAN FRANCISCO 2014-01-01

Temporal Flashback Query

The feature Temporal Flashback Query covers query enhancements in Oracle 12c concerning temporal data. Oracle extended the existing Flashback Query interfaces. The FLASHBACK_QUERY_CLAUSE of the SELECT statement has been extended by a PERIOD FOR clause. Here is an example:

SQL> SELECT *
  2    FROM dept AS OF PERIOD FOR vt DATE '2015-01-01'
  3   ORDER BY deptno;

    DEPTNO DNAME          LOC           VT_START   VT_END
---------- -------------- ------------- ---------- ----------
        10 ACCOUNTING     NEW YORK
        20 RESEARCH       DALLAS
        30 SALES          SAN FRANCISCO 2014-01-01
        40 OPERATIONS     BOSTON

Instead of “AS OF PERIOD FOR” you may also use “VERSIONS PERIOD FOR”. However, it is important to note that you may not define multiple PERIOD FOR clauses. Hence you need to filter additional temporal periods in the WHERE clause.

The PERIOD FOR clause is not applicable to views. For views the enhancements in the PL/SQL package DBMS_FLASHBACK_ARCHIVE are interesting, especially the procedures ENABLE_AT_VALID_TIME and DISABLE_ASOF_VALID_TIME to manage a temporal context. Here is an example:

SQL> BEGIN
  2     dbms_flashback_archive.enable_at_valid_time(
  3        level      => 'ASOF', 
  4        query_time => DATE '2015-01-01'
  5     );
  6  END;
  7  /

SQL> SELECT * FROM dept ORDER BY deptno;

    DEPTNO DNAME          LOC           VT_START    VT_END
---------- -------------- ------------- ---------- ----------
        10 ACCOUNTING     NEW YORK
        20 RESEARCH       DALLAS
        30 SALES          SAN FRANCISCO 2014-01-01
        40 OPERATIONS     BOSTON
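
To reset such a temporal context afterwards, DISABLE_ASOF_VALID_TIME may be used – a minimal sketch (assuming the procedure takes no arguments):

SQL> BEGIN
  2     dbms_flashback_archive.disable_asof_valid_time;
  3  END;
  4  /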

Currently it is not possible to restrict the context to a specific temporal period; the context is applied to every temporal period. In these cases you have to filter the other periods via the WHERE clause instead.

A limitation of Oracle 12c is that Temporal Flashback Query predicates are not applied in a multitenant configuration. The PERIOD FOR clause in the SELECT statement and the DBMS_FLASHBACK_ARCHIVE.ENABLE_AT_VALID_TIME calls are simply ignored.

Another limitation is that Oracle 12c does not provide support for temporal joins and temporal aggregations.

Tri-temporal Data Model

The following data model is based on the EMP/DEPT model in the schema SCOTT. The table EMPV implements three temporal dimensions:

  • Transaction time (TT) with Flashback Data Archive
  • Valid time (VT) with Temporal Validity
  • Decision time (DT) with Temporal Validity
Tri-temporal Data Model

Fig. 3: Tri-temporal Data Model

The table EMP is reduced to the primary key (EMPNO), which is not temporal. This allows defining and enabling the foreign key constraint EMPV_EMP_MGR_FK.

The following six events will be represented with this model.

No  Transaction Time (TT)  Valid Time (VT)  Decision Time (DT)  Action
--  ---------------------  ---------------  ------------------  ----------------------------------------
#1  1                                                           Initial load from SCOTT.EMP table
#2  2                      1990-01-01                           Change name from SCOTT to Scott
#3  3                      1991-04-01                           Scott leaves the company
#4  4                      1991-10-01                           Scott rejoins
#5  5                      1989-01-01                           Change job from ANALYST to Analyst
#6  6                      2014-01-01       2013-03-24          Change job to Manager and double salary

Tab. 2: Events

After processing all 6 events, the periods for employee 7788 (Scott) in the table EMPV may be queried as follows. The transaction time is represented by the System Change Number (SCN).

SQL> SELECT dense_rank() OVER(ORDER BY versions_startscn) event_no, empno, ename, job,
  2         sal, versions_startscn tt_start, versions_endscn tt_end,
  3         to_char(vt_start,'YYYY-MM-DD') vt_start, to_char(vt_end,'YYYY-MM-DD') vt_end,
  4         to_CHAR(dt_start,'YYYY-MM-DD') dt_start, to_char(dt_end,'YYYY-MM-DD') dt_end
  5    FROM empv VERSIONS BETWEEN SCN MINVALUE AND MAXVALUE
  6   WHERE empno = 7788 AND versions_operation IN ('I','U')
  7   ORDER BY tt_start, vt_start NULLS FIRST, dt_start NULLS FIRST;

# EMPNO ENAME JOB       SAL TT_START   TT_END VT_START   VT_END     DT_START   DT_END
-- ----- ----- ------- ----- -------- -------- ---------- ---------- ---------- ----------
 1  7788 SCOTT ANALYST  3000  2366310  2366356
 2  7788 SCOTT ANALYST  3000  2366356  2366559            1990-01-01
 2  7788 Scott ANALYST  3000  2366356  2366408 1990-01-01
 3  7788 Scott ANALYST  3000  2366408  2366559 1990-01-01 1991-04-01
 4  7788 Scott ANALYST  3000  2366424  2366559 1991-10-01
 5  7788 SCOTT ANALYST  3000  2366559                     1989-01-01
 5  7788 SCOTT Analyst  3000  2366559          1989-01-01 1990-01-01
 5  7788 Scott Analyst  3000  2366559          1990-01-01 1991-04-01
 5  7788 Scott Analyst  3000  2366559  2366670 1991-10-01
 6  7788 Scott Analyst  3000  2366670          1991-10-01                       2013-03-24
 6  7788 Scott Analyst  3000  2366670          1991-10-01 2014-01-01 2013-03-24
 6  7788 Scott Manager  6000  2366670          2014-01-01            2013-03-24

Seven rows have been changed or added based on event #5 at transaction time 2366559. It clearly shows that DML operations in a temporal model are not trivial, which makes the missing DML support for VT and DT all the more regrettable.

The next query filters the data for Scott on the transaction time (SYSDATE = default), valid time (2014-01-01) and decision time (2013-04-01). This way the result is reduced to exactly one row.

SQL> SELECT empno, ename, job, sal,
  2         to_char(vt_start,'YYYY-MM-DD') AS vt_start,
  3         to_char(vt_end,'YYYY-MM-DD') AS vt_end,
  4         to_CHAR(dt_start,'YYYY-MM-DD') AS dt_start,
  5         to_char(dt_end,'YYYY-MM-DD') AS dt_end
  6    FROM empv AS OF period FOR dt DATE '2013-04-01'
  7   WHERE empno = 7788 AND
  8         (vt_start <= DATE '2014-01-01' OR vt_start IS NULL) AND
  9         (vt_end > DATE '2014-01-01' OR vt_end IS NULL)
 10   ORDER BY vt_start NULLS FIRST, dt_start NULLS FIRST;

EMPNO ENAME JOB       SAL VT_START   VT_END     DT_START   DT_END
----- ----- ------- ----- ---------- ---------- ---------- ----------
 7788 Scott Manager  6000 2014-01-01            2013-03-24

Queries on multi-temporal data are relatively simple if all time periods are filtered at a point in time. The AS OF PERIOD clause (for DT) simplifies the query, but the complexity of a traditional WHERE condition (for VT) is not much higher.

Conclusion

The support for temporal data management in Oracle 12c is based on sound concepts, but the implementation is currently incomplete. I mainly miss a temporal DML API, temporal integrity constraints, temporal joins and temporal aggregations. I recommend using Oracle’s semantics for periods (half-open intervals, NULL for +/- infinity) in existing models to simplify the migration to Temporal Validity.

In the real world we use a lot of temporal dimensions at the same time, consciously or unconsciously. However, in data models every additional temporal dimension increases the complexity significantly. Data models are simplifications of the real world, based on requirements and a limited budget. I do not recommend using bi-temporality or even multi-temporality as a universal design pattern. Quite the contrary: I recommend determining and documenting the reason for a temporal dimension per entity to ensure that temporal dimensions are used consciously and not modeled unnecessarily.

Oracle’s Flashback Data Archive is a good, transparent and – since Oracle 11.2.0.4 – also cost-free option to implement requirements regarding the transaction time. For all other time dimensions, such as the valid time and the decision time, I recommend using standardized tooling to apply DML on temporal data.


Trivadis PL/SQL & SQL CodeChecker for SQL Developer Released

Half a year ago Trivadis released a command line utility to scan code within a directory tree for violations of the Trivadis PL/SQL & SQL Coding Guidelines Version 2.0. This tool is perfectly suited to process millions of lines of code, but an integration into Oracle SQL Developer was missing until now.

This SQL Developer extension checks the editor content per mouse click or keyboard shortcut. Simply navigate through the issues using the cursor keys to highlight the linked code sections in the corresponding editor.

tvdcc_sqldev_issues

Additionally a detailed HTML report tab is populated containing all metrics you know from the command line tool, such as McCabe’s cyclomatic complexity, Halstead’s volume, the maintainability index or the number of statements.

If you do not like all guideline checks you may configure a whitelist and blacklist in the SQL Developer preferences to shape the output according to your needs.

Trivadis PL/SQL & SQL CodeChecker for SQL Developer is available for free and licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported License. The full functionality is provided and is not limited in time or volume.

See Download for more information or simply register the TVDCC update center http://www.salvis.com/update/tvdcc in SQL Developer.

Ready for Oracle 12c

The Oracle 12c grammar is now supported in the new versions of the Trivadis CodeChecker, CodeChecker for SQL Developer and CodeAnalyzer. The following example code, copied from a colleague at Trivadis, shows how to insert rows while querying a view. This might not be the most appropriate way to implement auditing, but it shows in a few lines of code the power of plsql_declarations within a 12c SELECT statement.

tvdcc_sqldev_12c
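
Since a screenshot is not copy-paste friendly, here is a minimal sketch of the same idea (hypothetical – the audit_log table and the audit_access function are made up for illustration and are not the original example): a function declared in the plsql_declarations (WITH) clause inserts a row in an autonomous transaction for each row queried.

WITH
   FUNCTION audit_access (in_ename IN VARCHAR2) RETURN VARCHAR2 IS
      PRAGMA AUTONOMOUS_TRANSACTION;
   BEGIN
      -- log the access independently of the querying transaction
      INSERT INTO audit_log (logged_at, ename)
      VALUES (SYSTIMESTAMP, in_ename);
      COMMIT; -- an autonomous transaction must be ended explicitly
      RETURN in_ename;
   END;
SELECT audit_access(e.ename) AS ename
  FROM emp e
/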

The Trivadis CodeChecker for SQL Developer processes the example from the screenshot flawlessly. It also works with the new row_pattern_clause, row_limiting_clause, cross_outer_apply_clause and LATERAL clauses. However, TVDCC might be a bit picky about the coding style.

Get ready for 12c and grab your copy of the command line tools or the SQL Developer extension from the download area.

Cannot Install Extensions in SQL Developer 4 on Mac OS X

Today I could not install any SQL Developer extension on my Mac OS X machine. I did not get an error message during the installation. After a restart of SQL Developer the extension simply was missing. When I tried to re-install it – selecting “Check for updates…” in the “Help” menu – I got the following message:

sqldev4.1.19.07

Restarting SQL Developer did not help. The message was shown again and no extension was installed. I tried removing the $HOME/.sqldeveloper directory and reinstalling SQL Developer, but the problem persisted. I tried SQL Developer version 4.0.3.16.84 and the brand new version 4.1.0.19.07. Same result.

What was the problem?

After some analysis I found the root cause. SQL Developer creates a file named jdeveloper-deferred-updates.txt in the directory $HOME/.sqldeveloper (e.g. /Users/phs/.sqldeveloper). This file is read and copied into a temporary directory as part of the installation process. On non-Windows platforms the name of the temporary directory is $TMPDIR/$USER (e.g. /var/folders/lf/8g3r0ts900gfdfn2xxkn9yz00000gn/T/phs). If a file with such a name already exists, the directory cannot be created and the whole installation of the extension fails.

What is the solution (workaround)?

Open a terminal window (e.g. type terminal in the spotlight window) and execute the following command to delete the existing temporary file, which is causing the name conflict:

rm $TMPDIR/$USER

Afterwards restart SQL Developer and install the extension. Restart SQL Developer once again to complete the installation.

Introducing PL/SQL Unwrapper for SQL Developer

From time to time I use the free service Unwrap it! or Niels Teusink’s Python script unwrap.py to unwrap PL/SQL code. Recently I’ve been confronted with wrapped code more often, since a customer is about to migrate to a new banking platform which uses wrapped PL/SQL code extensively. While investigating migration errors we experienced that unwrapping the called PL/SQL packages helped us a lot to identify the root cause faster. But since the unwrapping and debugging process is still a bit cumbersome for a series of PL/SQL packages, a colleague asked me: “Wouldn’t it be nice if we could unwrap PL/SQL packages directly in SQL Developer?” and I answered “This should be simple. I’ve already written an extension for SQL Developer and the code in unwrap.py does not look too complicated.”

And on a rainy weekend I analyzed Niels Teusink’s public domain Python script unwrap.py and used it as a starting point for the development of a PL/SQL Unwrapper for SQL Developer.

#!/usr/bin/python
#
# This script unwraps Oracle wrapped plb packages, does not support 9g
# Contact: niels at teusink net / blog.teusink.net
#
# License: Public domain
#
import re
import base64
import zlib
import sys

# simple substitution table
charmap = [0x3d, 0x65, 0x85, 0xb3, 0x18, 0xdb, 0xe2, 0x87, 0xf1, 0x52, 0xab, 0x63, 0x4b, 0xb5, 0xa0, 0x5f, 0x7d, 0x68, 0x7b, 0x9b, 0x24, 0xc2, 0x28, 0x67, 0x8a, 0xde, 0xa4, 0x26, 0x1e, 0x03, 0xeb, 0x17, 0x6f, 0x34, 0x3e, 0x7a, 0x3f, 0xd2, 0xa9, 0x6a, 0x0f, 0xe9, 0x35, 0x56, 0x1f, 0xb1, 0x4d, 0x10, 0x78, 0xd9, 0x75, 0xf6, 0xbc, 0x41, 0x04, 0x81, 0x61, 0x06, 0xf9, 0xad, 0xd6, 0xd5, 0x29, 0x7e, 0x86, 0x9e, 0x79, 0xe5, 0x05, 0xba, 0x84, 0xcc, 0x6e, 0x27, 0x8e, 0xb0, 0x5d, 0xa8, 0xf3, 0x9f, 0xd0, 0xa2, 0x71, 0xb8, 0x58, 0xdd, 0x2c, 0x38, 0x99, 0x4c, 0x48, 0x07, 0x55, 0xe4, 0x53, 0x8c, 0x46, 0xb6, 0x2d, 0xa5, 0xaf, 0x32, 0x22, 0x40, 0xdc, 0x50, 0xc3, 0xa1, 0x25, 0x8b, 0x9c, 0x16, 0x60, 0x5c, 0xcf, 0xfd, 0x0c, 0x98, 0x1c, 0xd4, 0x37, 0x6d, 0x3c, 0x3a, 0x30, 0xe8, 0x6c, 0x31, 0x47, 0xf5, 0x33, 0xda, 0x43, 0xc8, 0xe3, 0x5e, 0x19, 0x94, 0xec, 0xe6, 0xa3, 0x95, 0x14, 0xe0, 0x9d, 0x64, 0xfa, 0x59, 0x15, 0xc5, 0x2f, 0xca, 0xbb, 0x0b, 0xdf, 0xf2, 0x97, 0xbf, 0x0a, 0x76, 0xb4, 0x49, 0x44, 0x5a, 0x1d, 0xf0, 0x00, 0x96, 0x21, 0x80, 0x7f, 0x1a, 0x82, 0x39, 0x4f, 0xc1, 0xa7, 0xd7, 0x0d, 0xd1, 0xd8, 0xff, 0x13, 0x93, 0x70, 0xee, 0x5b, 0xef, 0xbe, 0x09, 0xb9, 0x77, 0x72, 0xe7, 0xb2, 0x54, 0xb7, 0x2a, 0xc7, 0x73, 0x90, 0x66, 0x20, 0x0e, 0x51, 0xed, 0xf8, 0x7c, 0x8f, 0x2e, 0xf4, 0x12, 0xc6, 0x2b, 0x83, 0xcd, 0xac, 0xcb, 0x3b, 0xc4, 0x4e, 0xc0, 0x69, 0x36, 0x62, 0x02, 0xae, 0x88, 0xfc, 0xaa, 0x42, 0x08, 0xa6, 0x45, 0x57, 0xd3, 0x9a, 0xbd, 0xe1, 0x23, 0x8d, 0x92, 0x4a, 0x11, 0x89, 0x74, 0x6b, 0x91, 0xfb, 0xfe, 0xc9, 0x01, 0xea, 0x1b, 0xf7, 0xce]

def decode_base64_package(base64str):
	base64dec = base64.decodestring(base64str)[20:] # we strip the first 20 chars (SHA1 hash, I don't bother checking it at the moment)
	decoded = ''
	for byte in range(0, len(base64dec)):
		decoded += chr(charmap[ord(base64dec[byte])])
	return zlib.decompress(decoded)
	

sys.stderr.write("=== Oracle 10g/11g PL/SQL unwrapper 0.2 - by Niels Teusink - blog.teusink.net ===\n\n" )
if len(sys.argv) < 2:
	sys.stderr.write("Usage: %s infile.plb [outfile]\n" % sys.argv[0])
	sys.exit(1)

infile = open(sys.argv[1])
outfile = None
if len(sys.argv) == 3:
	outfile = open(sys.argv[2], 'w')

lines = infile.readlines()
for i in range(0, len(lines)):
	# this is really naive parsing, but works on every package I've thrown at it
	matches = re.compile(r"^[0-9a-f]+ ([0-9a-f]+)$").match(lines[i])
	if matches:
		base64len = int(matches.groups()[0], 16)
		base64str = ''
		j = 0
		while len(base64str) < base64len:
			j+=1
			base64str += lines[i+j]
		base64str = base64str.replace("\n","")
		if outfile:
			outfile.write(decode_base64_package(base64str) + "\n")
		else:
			print decode_base64_package(base64str)

Even though this code looked straightforward at first sight, it took me a moment or two to understand it. In fact I googled and found the following information helpful:

After flipping through all these pages I had some second thoughts about publishing an unwrapper, especially since David, Pete and Anton were a bit secretive about certain details such as the substitution table. Obviously I decided to publish it nonetheless. Is this really harmful? There are already a couple of other 10g unwrappers available, such as:

In the end this is just another PL/SQL Unwrapper. However, I believe it delivers some additional value if Oracle’s SQL Developer is the IDE of your choice. This is how it looks on Windows:

unwrapper-windows

The wrapped code will be replaced in the editor by the unwrapped code…

unwrapper-windows-2

…you have to pay attention not to save the unwrapped code by accident.

Grab your copy of Trivadis PL/SQL Unwrapper from the download area. I hope it is useful.

Update for PL/SQL Cop and PL/SQL Analyzer

Some people asked me to announce the availability of new versions of products on my web site. I guess a blog entry and a Twitter announcement should do the job. Today I’ve released the following three updates:

These products are affected by every grammar change to SQL*Plus, SQL or PL/SQL. The goal is to process all valid SQL*Plus, SQL and PL/SQL code; however, some limitations are documented here (e.g. a table alias named “inner” is not supported).

The links on the products above will show the associated changelog. The latest entries are mostly about bug fixing. If you are using the trial/preview version of PL/SQL Cop or PL/SQL Analyzer you might be glad to hear that the included license is valid through April 30, 2016.

Download the newest version from here.

Outer Join Operator (+) Restrictions in 12.1.0.2?

I’m currently reviewing a draft of Roger Troller’s updated PL/SQL and SQL Coding Guidelines version 3.0. One guideline recommends using ANSI join syntax. The reasons mentioned are:

ANSI join syntax does not have as many restrictions as the ORACLE join syntax has. Furthermore ANSI join syntax supports the full outer join. A third advantage of the ANSI join syntax is the separation of the join condition from the query filters.

While reading this I wondered which restrictions still exist for the ORACLE join syntax nowadays, searched for “(+)” in the current Error Messages documentation (E49325-06) and found the following error messages:

  • ORA-01417: a table may be outer joined to at most one other table
  • ORA-01719: outer join operator (+) not allowed in operand of OR or IN
  • ORA-01799: a column may not be outer-joined to a subquery
  • ORA-25156: old style outer join (+) cannot be used with ANSI joins
  • ORA-30563: outer join operator (+) is not allowed here

In the 9.2 documentation (A96525-01) I found the following additional messages:

  • ORA-01416: two tables cannot be outer-joined to each other
  • ORA-01468: a predicate may reference only one outer-joined table

I’ve written SQL statements to produce the error messages listed above on a 9.2.0.8 Oracle database and ran them on a 12.1.0.2 database as well, to see which restrictions still exist for the outer join operator (+) as a basis for my feedback to Roger. While writing the queries I thought this might be an interesting topic to blog about.

Examples

SELECT s.*, p.*
  FROM sh.sales s, sh.products p
 WHERE p.prod_id = s.prod_id(+)
       AND p.supplier_id(+) = s.channel_id;

An ORA-01416 is thrown in 9.2.0.8 and in 12.1.0.2. You cannot formulate such a query using ANSI join. Doing something like that does not make sense, so it is not a relevant restriction. But it is interesting to see that an ORA-01416 is thrown in Oracle 12.1.0.2, even though this error message is not documented anymore.

SELECT s.*, c.*, p.*
  FROM sh.sales s, sh.customers c, sh.products p
 WHERE p.prod_id = s.prod_id(+)
       AND c.cust_id = s.cust_id(+);

An ORA-01417 is thrown in 9.2.0.8 but not in 12.1.0.2.

SELECT s.*, p.*
  FROM sh.sales s, sh.products p
 WHERE p.prod_id(+) = s.prod_id(+);

An ORA-01468 is thrown in 9.2.0.8 and in 12.1.0.2. You cannot formulate such a query using ANSI join. It could have been a way to formulate a full outer join, but something like that is not supported with the Oracle join syntax. ORA-01468 is not documented in Oracle 12.1.0.2, but nonetheless this error is thrown. I do not consider this a relevant restriction for the Oracle join syntax.

SELECT s.*, p.*
  FROM sh.sales s, sh.products p
 WHERE p.prod_id(+) = s.prod_id
   AND p.prod_category(+) IN ('Boys', 'Girls');

An ORA-01719 is thrown in 9.2.0.8 but not in 12.1.0.2.

SELECT s.*
  FROM sh.sales s
 WHERE s.time_id(+) = (SELECT MAX(t.time_id)
                         FROM sh.times t);

An ORA-01799 is thrown in 9.2.0.8 and in 12.1.0.2. You cannot formulate such a query using ANSI join. Of course you may rewrite this to a valid Oracle join or ANSI join query. Here’s an example:

SELECT s.*, t.max_time_id
  FROM sh.sales s,
       (SELECT MAX(t.time_id) AS max_time_id
          FROM sh.times t) t
 WHERE s.time_id(+) = t.max_time_id;

Because the restriction applies to ANSI join as well, I do not consider this a relevant restriction for Oracle join syntax.

SELECT s.*, c.*, p.*
  FROM sh.sales s, sh.customers c
  JOIN sh.products p
    ON (p.prod_id = s.prod_id)
 WHERE c.cust_id = s.cust_id(+);

An ORA-25156 is thrown in 9.2.0.8 and in 12.1.0.2. This is not really a restriction of the Oracle join syntax. The grammar simply does not support mixing join syntax variants.

SELECT lpad(' ', (LEVEL - 1) * 3) || to_char(e.empno) || ' ' ||
       e.ename(+) ||
       ' ' || d.dname AS emp_name
  FROM scott.emp e, scott.dept d
 WHERE e.deptno(+) = d.deptno
CONNECT BY PRIOR e.empno(+) = e.mgr
 START WITH e.ename(+) = 'KING'
 ORDER BY rownum, e.empno(+);

An ORA-30563 is thrown in 9.2.0.8 and 12.1.0.2. Interestingly, if you remove the (+) on line 2 of the query, it works on 9.2.0.8 but not on 12.1.0.2. Using the (+) in a CONNECT BY clause, START WITH clause, or ORDER BY clause does not make sense. It is not possible using ANSI join either. The important part is the join itself on line 5, and this works in conjunction with a CONNECT BY. Therefore I consider this an irrelevant restriction for the Oracle join syntax.

Summary

The results of the example relevant statements are summarized in the following table.

Error message by test SQL                                               Relevant outer join restriction?  Result in 9.2.0.8  Result in 12.1.0.2
----------------------------------------------------------------------  --------------------------------  -----------------  ------------------
ORA-01416: two tables cannot be outer-joined to each other              No                                Error              Error
ORA-01417: a table may be outer joined to at most one other table       Yes                               Error              OK
ORA-01468: a predicate may reference only one outer-joined table        No                                Error              Error
ORA-01719: outer join operator (+) not allowed in operand of OR or IN   Yes                               Error              OK
ORA-01799: a column may not be outer-joined to a subquery               No                                Error              Error
ORA-25156: old style outer join (+) cannot be used with ANSI joins      No                                Error              Error
ORA-30563: outer join operator (+) is not allowed here                  No                                Error              Error

Table 1: Outer join operator (+) restrictions in 9.2.0.8 and 12.1.0.2


In the most current Oracle version no relevant limitations exist regarding the Oracle join syntax. Hence choosing ANSI join syntax just because some limitations existed in the past is doing the right thing for the wrong reasons… I favor the ANSI join syntax because filter and join conditions are clearly separated. And for full outer joins, there is simply no better performing option than ANSI join syntax. See also Chris Antognini’s post about the native full outer join.
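
For illustration, a full outer join in ANSI join syntax – something the (+) operator cannot express – looks like this:

SELECT e.ename, d.dname
  FROM scott.emp e
  FULL OUTER JOIN scott.dept d
    ON d.deptno = e.deptno;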

Monitoring PL/SQL Code Evolution With PL/SQL Cop for SonarQube

Last week I presented the PL/SQL Cop tool suite to a customer in Germany. While preparing the demo I took my first deeper look at the PL/SQL Cop SonarQube plugin, written by Peter Rohner, a fellow Trivadian. I was impressed by how well the additional PL/SQL Cop metrics integrate into SonarQube and how easy it is to monitor the code evolution.

Before I show the code evolution I will go through the metric definitions based on a fairly simple example. If you are not interested in the math, feel free to skip the metric sections.

Password_Check, Version 0.1

As a starting point I use the following simplified password verification procedure, which ensures that every password contains a digit. I know this procedure is not a candidate for “good PL/SQL code”, but nonetheless it is based on a real-life example. The goal of this piece of code is to explain some metrics before starting to improve the code.

CREATE OR REPLACE PROCEDURE PASSWORD_CHECK (in_password IN VARCHAR2) IS -- NOSONAR
   co_digitarray CONSTANT STRING(10)     := '0123456789';
   co_one        CONSTANT SIMPLE_INTEGER := 1;
   co_errno      CONSTANT SIMPLE_INTEGER := -20501;
   co_errmsg     CONSTANT STRING(100)    := 'Password must contain a digit.';
   l_isdigit     BOOLEAN;
   l_len_pw      PLS_INTEGER;
   l_len_array   PLS_INTEGER;
BEGIN
   -- initialize variables
   l_isdigit := FALSE;
   l_len_pw := LENGTH(in_password);
   l_len_array := LENGTH(co_digitarray);
   <<check_digit>>
   FOR i IN co_one .. l_len_array
   LOOP
      <<check_pw_char>>
      FOR j IN co_one .. l_len_pw
      LOOP
         IF SUBSTR(in_password, j, co_one) = SUBSTR(co_digitarray, i, co_one) THEN
            l_isdigit := TRUE;
            GOTO check_other_things;
         END IF;
      END LOOP check_pw_char;
   END LOOP check_digit;
   <<check_other_things>>
   NULL;

   IF NOT l_isdigit THEN
      raise_application_error(co_errno, co_errmsg);
   END IF;
END password_check;
/

After running this code through PL/SQL Cop I get the following metrics. I show here just the SonarQube output, but the results are the same for the command line utility and the SQL Developer extension of PL/SQL Cop.

password_check_v0.1

Simple Metrics

Here are the definitions of the simple metrics shown above.

  • Bytes – the number of bytes (1039)
  • Lines – the number of physical lines – lines separated by OS specific line separator (33)
  • Comment Lines – the number of comment lines – see line 10 (1)
  • Blank Lines – the number of empty lines – see line 28 (1)
  • Lines Of Code – Lines minus comment lines minus blank lines (31)
  • Commands – the number of commands from a SQL*Plus point of view – see CREATE OR REPLACE PROCEDURE (1)
  • Functions – the number of program units – the password_check procedure (1)
  • Statements – the number of PL/SQL statements – 4 assignments, 2 FOR loops, 2 IF statements, 1 GOTO statement, 1 NULL statement, 1 procedure call (11)
  • Files – the number of files processed (1)
  • Directories – the number of directories processed (1)
  • Issues – the number of Trivadis PL/SQL & SQL Coding Guideline violations – Guideline 39: Never use GOTO statements in your code (1)

Simple metrics such as Lines of Code are an easy way to categorise the programs in a project. But the program with the most lines of code does not necessarily have to be the most complex one. Other metrics are better suited to identify the complex parts of a project. But it is important to have a good idea of how such metrics are calculated, because no single metric is perfect. I typically identify programs worth a closer look by a combination of metrics such as lines of code, statements, cyclomatic complexity and the number of severe issues.

See the SonarQube documentation for the further metric definitions. Please note that PL/SQL Cop does not calculate all metrics and some metrics are calculated a bit differently, e.g. Comment Lines.

SQALE Rating

SonarQube rates a project using the SQALE Rating, which is based on the Technical Debt Ratio and calculated as follows:

$$\text'tdr' = 100⋅{\text'Technical Debt'}/{\text'Development Cost'}$$

where

  • $\text'tdr'$ is defined as the technical debt ratio (1.6%)
  • $\text'Technical Debt'$ is defined as the estimated time to fix the issues, PL/SQL Cop defines the time to fix per issue type (0.25 hours)
  • $\text'Development Cost'$ is defined as the estimated time to develop the source code from scratch; the SonarQube default configuration is 30 minutes per Line of Code, you may amend the value on the Technical Debt page (15.5 hours)
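
Plugging the example values from above into the formula gives

$$\text'tdr' = 100⋅{0.25}/{15.5} ≈ 1.6%$$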

The ranges for the SQALE Rating values A (very good) to E (very bad) are based on the SQALE Method Definition Document. SonarQube uses the default rating scheme “0.1,0.2,0.5,1” which may be amended on the Technical Debt page. The rating scheme defines the rating thresholds for A, B, C and D. Higher values lead to an E rating. Here is another way to represent the default rating scheme:

  • A: $\text'tdr'$ <= 10%
  • B: $\text'tdr'$ > 10% and $\text'tdr'$ <= 20%
  • C: $\text'tdr'$ > 20% and $\text'tdr'$ <= 50%
  • D: $\text'tdr'$ > 50% and $\text'tdr'$ <= 100%
  • E: $\text'tdr'$ > 100%

Based on the default SQALE Rating scheme, a project rated as “E” should be rewritten from scratch, since it would take more time to fix all issues.

McCabe’s Cyclomatic Complexity

Thomas J. McCabe introduced the metric Cyclomatic Complexity in 1976; it counts the number of linearly independent paths through the source code. SonarQube uses this metric to represent the complexity of a program. PL/SQL Cop calculates the cyclomatic complexity as follows:

$$M=E-N+2P$$

where

  • $M$ is defined as the cyclomatic complexity (6)
  • $E$ is defined as the number of edges (15)
  • $N$ is defined as the number of nodes (11)
  • $P$ is defined as the number of connected components/programs (1)
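
Plugging the example values into the formula gives

$$M = 15 - 11 + 2⋅1 = 6$$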

The higher the cyclomatic complexity, the more difficult it is to maintain the code.

Please note that PL/SQL Cop V1.0.16 adds an additional edge for ELSE branches in IF/CASE statements, for PL/SQL blocks and for GOTO statements. I consider this a bug. However, Toad Code Analysis (Xpert) calculates the Cyclomatic Complexity the very same way.

PL/SQL Cop calculates the Cyclomatic Complexity per program unit and provides the aggregated Max. Cyclomatic Complexity on file level.

Halstead Volume

Maurice H. Halstead introduced the metric Halstead Volume in 1977; it defines the complexity based on the vocabulary and the total number of words/elements used within a program. In his work Halstead also showed how to express the complexity of academic abstracts using his metrics. PL/SQL Cop calculates the Halstead Volume as follows:

$$V=N⋅log_2n$$

where

  • $V$ is defined as the Halstead Volume. (489.7)
  • $N$ is defined as the program length. $N=N_1+N_2$ (94)
  • $n$ is defined as the program vocabulary. $n=n_1+n_2$ (37)
  • $N_1$ is defined as the total number of operators (42)
  • $N_2$ is defined as the total number of operands (52)
  • $n_1$ is defined as the number of distinct operators (11)
  • $n_2$ is defined as the number of distinct operands (26)

using

  • the following operators: if, then, elsif, case, when, else, loop, for-loop, forall-loop, while-loop, exit, exit-when, goto, return, close, fetch, open, open-for, open-for-using, pragma, exception, procedure-call, assignment, function-call, sub-block, parenthesis, and, or, not, eq, ne, gt, lt, ge, le, semicolon, comma, colon, dot, like, between, minus, plus, star, slash, percent
  • the following operands: identifier, string, number
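
Plugging the example values into the formula gives

$$V = 94⋅log_2 37 ≈ 94⋅5.21 ≈ 489.7$$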

The higher the Halstead volume, the more difficult it is to maintain the code.

PL/SQL Cop calculates the Halstead Volume per program unit and provides the aggregated Max. Halstead Volume on file level.

Maintainability Index

Paul Oman and Jack Hagemeister introduced the metric Maintainability Index in 1991; it weights comments and combines them with the Halstead Volume and the Cyclomatic Complexity. PL/SQL Cop calculates the maintainability index as follows:

$$\text'MI'=\text'MI'woc+\text'MI'cw$$

where

  • $\text'MI'$ is defined as the Maintainability Index (102.2)
  • $\text'MI'woc$ is defined as the $\text'MI'$ without comments. $\text'MI'woc=171−5.2⋅log_eaveV−0.23⋅aveM−16.2⋅log_eaveLOC$ (86.617)
  • $\text'MI'cw$ is defined as the $\text'MI'$ comment weight. $\text'MI'cw=50⋅sin(√{{2.4⋅aveC}/{aveLOC}})$ (15.549)
  • $aveV$ is defined as the average Halstead volume. $aveV={∑unitLOC⋅V}/{fileLOC}$ (489.7)
  • $aveM$ is defined as the average cyclomatic complexity. $aveM={∑unitLOC⋅M}/{fileLOC}$ (6)
  • $aveLOC$ is defined as the average lines of code including comments. $aveLOC={∑unitLOC}/{units}$ (24)
  • $aveC$ is defined as the average lines of comment. $aveC={∑unitC}/{units}$ (1)
  • $unitLOC$ is defined as the number of lines in a PL/SQL unit, without declare section (24)
  • $fileLOC$ is defined as the number of lines in source file (33)
  • $units$ is defined as the number of PL/SQL units in a file (1)
  • $unitC$ is defined as the number of comment lines in a PL/SQL unit (1)
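
Plugging the example values into the formula gives

$$\text'MI' = 86.617 + 15.549 ≈ 102.2$$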

The lower the maintainability index, the more difficult it is to maintain the code.

PL/SQL Cop calculates the Maintainability Index per program unit and provides the aggregated Min. Maintainability Index on file level.

Password_Check, Version 0.2 – Better

To get rid of the GOTO I’ve rewritten the procedure to use regular expressions to look for digits within the password. The code now looks as follows:

CREATE OR REPLACE PROCEDURE PASSWORD_CHECK (in_password IN VARCHAR2) IS
BEGIN
   IF NOT REGEXP_LIKE(in_password, '\d') THEN
      raise_application_error(-20501, 'Password must contain a digit.');
   END IF;
END;
/

After loading the new version into SonarQube, the dashboard looks as follows:

password_check_v0.2

Almost all metrics look better now. But instead of 1 major issue I now have 5 minor ones. This leads to a higher Technical Debt Ratio and a bad trend in this area. So let’s see what these minor issues are.

issues_password_check_v0.2

I consider all these guideline violations not worth fixing and marked them as “won’t fix”. After reloading the unchanged password_check.sql the SonarQube dashboard looks as follows:

password_check_v0.2e

The differences/improvements compared to the previous version are shown in parentheses.

Password_Check, Version 0.3 – Even Better?

The version 0.2 code looks really good. No technical debt, no issues. A complexity of 2 and just 7 lines of code. But is it possible to improve this code further? Technically yes, especially since we know how the Maintainability Index is calculated. We could simply reduce the Lines of Code as follows:

CREATE OR REPLACE PROCEDURE PASSWORD_CHECK(in_password IN VARCHAR2)IS BEGIN IF NOT REGEXP_LIKE(in_password,'\d')THEN raise_application_error(-20501,'Password must contain a digit.');END IF;END;
/

And after loading the new version into SonarQube the dashboard looks as follows:

password_check_v0.3

Reducing the number of lines from 7 to 2 leads to a better Maintainability Index, but the number of statements, the Cyclomatic Complexity and the Halstead Volume stay the same. The change from version 0.2 to 0.3 reduces the readability of the code and adds negative value. That clearly shows that the Maintainability Index has its flaws (see also https://avandeursen.com/2014/08/29/think-twice-before-using-the-maintainability-index/). There are various ways to discourage such kinds of changes in a project. Using a formatter/beautifier with agreed settings is my favourite.

Code Evolution

SonarQube shows the metrics of the latest two versions in the dashboard. Use the Time Machine page to show metrics of more than two versions of a project.

timemachine2

Or use the Compare page to compare metrics between versions or projects.

compare

Conclusion

Every metric has its flaws, for example

  • Lines of Code does not account for the code complexity
  • Cyclomatic Complexity does not account for the length of a program and the complexity of a statement
  • Halstead Volume does not account for the number of paths in the program
  • Maintainability index cannot distinguish between useful and useless comments and does not account for code formatting

But these metrics are still useful to identify complex programs, to measure code evolution (improvements, degradations) and to help you write better PL/SQL – as long as you do not trust metrics blindly.


PL/SQL Cop Meets oddgen

Until August 2015 it never occurred to me that one could use non-PL/SQL code within conditional compilation blocks. Back then we discussed various template engine options as a foundation for oddgen – the Oracle community’s dictionary-driven code generator.

oddgen nowadays supports the in-database template engines FTLDB and tePLSQL. Both tools may access templates stored in PL/SQL packages using a selection directive. Here’s the package body of a generator using FTLDB:

CREATE OR REPLACE PACKAGE BODY ftldb_hello_world IS

$IF FALSE $THEN
--%begin generate_ftl
<#assign object_type = template_args[0]/>
<#assign object_name = template_args[1]/>
BEGIN
   sys.dbms_output.put_line('Hello ${object_type} ${object_name}!');
END;
${"/"}
--%end generate_ftl
$END

   FUNCTION generate(in_object_type IN VARCHAR2,
                     in_object_name IN VARCHAR2) RETURN CLOB IS
      l_result CLOB;
      l_args varchar2_nt;
   BEGIN
      l_args := NEW varchar2_nt(in_object_type, in_object_name);
      l_result := ftldb_api.process_to_clob(in_templ_name => $$PLSQL_UNIT || '%generate_ftl',
                                            in_templ_args => l_args);
      RETURN l_result;
   END generate;
END ftldb_hello_world;
/

The template is stored within lines 4 to 11. It’s easy to see that the target code is PL/SQL, but the template itself contains various parts which do not comply with the PL/SQL language. The $IF on line 3 ensures that the template is compiled only when the condition is met – never, in this case. You may be surprised, but yes, this trick really works.

However, if I check this code with PL/SQL Cop for SQL Developer 1.0.12 I get the following result:

plsqlcop1.12

Bad. This version of PL/SQL Cop cannot parse this code successfully, since it expects valid PL/SQL code within the conditional compilation blocks. While including conditional PL/SQL code in a code analysis has some advantages, it is simply worthless if the code cannot be parsed at all.

Therefore I released new versions of all PL/SQL parser based products today, supporting non-PL/SQL code within conditional compilation blocks. And the result in PL/SQL Cop for SQL Developer 1.0.13 is:

plsqlcop1.13

Good. You see, this version parses such code without problems. There are still some limitations regarding the support of conditional compilation in DECLARE sections, but I’m glad that the parser is becoming more and more complete.

So it is time to update PL/SQL Analyzer, PL/SQL Cop and PL/SQL Cop for SQL Developer.

Thanks oddgen for driving this improvement.

How to Integrate Your PL/SQL Generators in SQL Developer

About three weeks ago Steven Feuerstein tweeted in his tip #501 a link to a generator for the WHEN clause in DML triggers on Oracle Live SQL. Back then I refactored the generator for oddgen – the Oracle community’s dictionary-driven code generator – and published the result on Oracle Live SQL as well. Some days ago Steven tweeted in tip #514 about generating a standardised table DDL, and I thought for a short moment about refactoring this generator as well, but decided against it. There are a lot of generators around which write their result to DBMS_OUTPUT or into intermediate/helper tables, and I believe it is more helpful to show how such generators can be integrated into oddgen for SQL Developer. If you are overwhelmed by the length of this blog post (as I was) then I suggest that you scroll down to the bottom and look at the 39 seconds of audio-less video to see a generator in action.

1. Install oddgen for SQL Developer

I assume that you are already using SQL Developer 4.x. If not, it is about time to grab the latest version from here and install it. Note that oddgen requires version 4 of SQL Developer and won’t run on the older versions 3.x, 2.x and 1.x.

SQL Developer comes with a lot of “internal” extensions, but third party extensions need to be installed explicitly. To install oddgen for SQL Developer I recommend following the steps in installation via update center on oddgen.org. If this is not feasible because your company’s network restricts internet access, then download the latest version and install it from file.

To enable the oddgen window, select “Generators” from the “View” menu as shown in the following picture:

menu_view_generators

You are ready for the next steps when the Generators window appears in the lower left corner of SQL Developer.

oddgen_generators_window

2. Install the Original Generator

If you are going to integrate your existing generator into SQL Developer this step sounds irrelevant. However, I find it useful to install and try generators in a fresh environment to ensure I have not missed any dependencies. I’ve got Steven Feuerstein’s permission to use his generator for this post. It’s a standalone PL/SQL procedure without dependencies. We install the generator “as is” in our database. I will use a schema named oddgen, but you may of course use another user/schema. See the create user DDL on GitHub if you are interested in how I’ve set up the oddgen user.

-- 1:1 from https://livesql.oracle.com/apex/livesql/file/content_DBEO1MIGH5ZQOILUVV85I1UQC.html
CREATE OR REPLACE PROCEDURE gen_table_ddl (
   entity_in     IN VARCHAR2,
   entities_in   IN VARCHAR2 DEFAULT NULL,
   add_fky_in    IN BOOLEAN DEFAULT TRUE,
   prefix_in     IN VARCHAR2 DEFAULT NULL,
   in_apex_in    IN BOOLEAN DEFAULT FALSE)
IS
   c_table_name    CONSTANT VARCHAR2 (100)
      := prefix_in || NVL (entities_in, entity_in || 's') ;

   c_pkycol_name   CONSTANT VARCHAR2 (100) := entity_in || '_ID';

   c_user_code     CONSTANT VARCHAR2 (100)
      := CASE
            WHEN in_apex_in THEN 'NVL (v (''APP_USER''), USER)'
            ELSE 'USER'
         END ;

   PROCEDURE pl (str_in                   IN VARCHAR2,
                 indent_in                IN INTEGER DEFAULT 3,
                 num_newlines_before_in   IN INTEGER DEFAULT 0)
   IS
   BEGIN
      FOR indx IN 1 .. num_newlines_before_in
      LOOP
         DBMS_OUTPUT.put_line ('');
      END LOOP;

      DBMS_OUTPUT.put_line (LPAD (' ', indent_in) || str_in);
   END;
BEGIN
   pl ('CREATE TABLE ' || c_table_name || '(', 0);
   pl (c_pkycol_name || ' INTEGER NOT NULL,');
   pl ('created_by VARCHAR2 (132 BYTE) NOT NULL,');
   pl ('changed_by VARCHAR2 (132 BYTE) NOT NULL,');
   pl ('created_on DATE NOT NULL,');
   pl ('changed_on DATE NOT NULL');
   pl (');');

   pl ('CREATE SEQUENCE ' || c_table_name || '_SEQ;', 0, 1);
   pl (
         'CREATE UNIQUE INDEX '
      || c_table_name
      || ' ON '
      || c_table_name
      || '('
      || c_pkycol_name
      || ');',
      0,
      1);
   pl (
         'CREATE OR REPLACE TRIGGER '
      || c_table_name
      || '_bir
      BEFORE INSERT ON '
      || c_table_name,
      0,
      1);
   pl ('FOR EACH ROW DECLARE', 3);
   pl ('BEGIN', 3);
   pl ('IF :new.' || c_pkycol_name || ' IS NULL', 6);
   pl (
         'THEN :new.'
      || c_pkycol_name
      || ' := '
      || c_table_name
      || '_seq.NEXTVAL; END IF;',
      6);

   pl (':new.created_on := SYSDATE;', 6);
   pl (':new.created_by := ' || c_user_code || ';', 6);
   pl (':new.changed_on := SYSDATE;', 6);
   pl (':new.changed_by := ' || c_user_code || ';', 6);
   pl ('END ' || c_table_name || '_bir;', 3);

   pl ('CREATE OR REPLACE TRIGGER ' || c_table_name || '_bur', 0, 1);
   pl ('BEFORE UPDATE ON ' || c_table_name || ' FOR EACH ROW', 3);
   pl ('DECLARE', 3);
   pl ('BEGIN', 3);
   pl (':new.changed_on := SYSDATE;', 6);
   pl (':new.changed_by := ' || c_user_code || ';', 6);
   pl ('END ' || c_table_name || '_bur;', 3);

   pl ('ALTER TABLE ' || c_table_name || ' ADD
      (CONSTRAINT ' || c_table_name,
       0,
       1);
   pl (
         'PRIMARY KEY ('
      || c_pkycol_name
      || ')
       USING INDEX '
      || c_table_name
      || ' ENABLE VALIDATE);',
      3);

   IF add_fky_in
   THEN
      pl (
            'ALTER TABLE '
         || c_table_name
         || ' ADD (CONSTRAINT fk_'
         || c_table_name,
         0,
         1);
      pl ('FOREIGN KEY (REPLACE_id)
     REFERENCES qdb_REPLACE (REPLACE_id)', 3);
      pl ('ON DELETE CASCADE ENABLE VALIDATE);', 3);
   END IF;
END;
/

3. Understand the Input and Output of the Original Generator

Before we start writing a wrapper for the original generator we need to understand its API. The purpose of the generator is described on Oracle Live SQL by Steven Feuerstein as follows:

I follow a few standards for table definitions, including: table name is plural; four standard audit columns (created by/when, updated by/when) with associated triggers; primary key name is [entity]_id, and more. This procedure (refactored from PL/SQL Challenge, the quiz website plsqlchallenge.oracle.com) gives me a consisting start point, from which I then add entity-specific columns, additional foreign keys, etc. Hopefully you will find it useful, too!

The procedure gen_table_ddl expects the following five input parameters (see highlighted lines 3 to 7 above):

Parameter Name  Datatype  Optional?  Default           Comments
--------------  --------  ---------  ----------------  ---------------------------------------------------
entity_in       varchar2  No         -                 used to name the primary key column
entities_in     varchar2  Yes        entity_in || 's'  used to name table, sequence, index, triggers
                                                       and constraints
add_fky_in      boolean   Yes        true              true: generates a template for a foreign key
                                                       constraint
                                                       false: does not generate a foreign key
                                                       constraint template
prefix_in       varchar2  Yes        null              prefix for all object names (named by entities_in)
in_apex_in      boolean   Yes        false             true: uses the APEX built-in variable APP_USER to
                                                       populate created_by and changed_by; falls back to
                                                       the pseudo column USER if APP_USER is empty
                                                       false: always uses the pseudo column USER to
                                                       populate created_by and changed_by

By now we should have a decent understanding of the procedure input. But how is the output generated? It’s a procedure after all and there are no output parameters defined. Line 30 reveals the output mechanism. Every line is produced by the nested procedure pl which writes the result to the server output using the DBMS_OUTPUT package.
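For example, a quick test of mine (not part of the original post) is to enable the server output in SQL*Plus or SQLcl and call the procedure with its mandatory parameter only:

SET SERVEROUTPUT ON SIZE UNLIMITED
BEGIN
   gen_table_ddl(entity_in => 'employee');
END;
/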

4. Understand the Basics of the oddgen PL/SQL Interface

When selecting a database connection, the oddgen extension searches the data dictionary for PL/SQL packages implementing the oddgen PL/SQL interface. Basically it looks for package functions with the following signature:

FUNCTION generate(in_object_type IN VARCHAR2,
                  in_object_name IN VARCHAR2,
                  in_params      IN t_param) RETURN CLOB;

The interface is designed for generators based on existing database object types such as tables. Therefore it expects the object_type and the object_name as parameters one and two. For our generator the third parameter is the most interesting one. It allows us to pass additional parameters to the generator. The data type t_param is an associative array based on the following definition:

SUBTYPE string_type IS VARCHAR2(1000 CHAR);
SUBTYPE param_type IS VARCHAR2(60 CHAR);
TYPE t_param IS TABLE OF string_type INDEX BY param_type;

Through in_params we may pass an unlimited number of key-value pairs to an oddgen generator.
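As a sketch, a client could populate t_param and call such a generator as follows. The package name my_gen and its parameter labels are hypothetical; only the generate signature is given by the interface:

DECLARE
   l_params my_gen.t_param; -- t_param is declared in the generator package itself
   l_result CLOB;
BEGIN
   l_params('Generate comments?') := 'Yes';   -- hypothetical parameter
   l_params('Tablespace')         := 'USERS'; -- hypothetical parameter
   l_result := my_gen.generate(
                  in_object_type => 'TABLE',
                  in_object_name => 'EMP',
                  in_params      => l_params
               );
   -- print the first 4000 characters of the generated code
   sys.dbms_output.put_line(sys.dbms_lob.substr(l_result, 4000, 1));
END;
/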

But the oddgen interface is also responsible for defining the representation in the GUI. Let’s look at an example of another generator named “Dropall”:

oddgen_dropall_generator

The “Dropall” node is selected and its description is displayed in the status bar. Under this node you find the object types “Indexes” and “Tables”, but also an artificial object type named “All”. Under the object type nodes you find the list of all associated object names. This structure supports the following features:

  1. Generate code through a simple double-click on an object name node
  2. Select multiple object name nodes of the same object type to generate code via context-menu
  3. Show a dialog via context menu for selected object name nodes to change generator parameters

When a generator is called, the selected object name and its associated object type are passed to the generator. Always, without exception. However, for artificial object types and object names it might be okay to ignore these parameters in the generator implementation.

See the oddgen PL/SQL interface documentation on oddgen.org if you are interested in the details.

For the next steps it’s just important to know that we have to define the default behaviour of a generator, and that a generator provides some information solely for the GUI.

5. Write the Wrapper

The following screenshot shows our generator in SQL Developer after selecting “Generate…” from the context menu on the node “Snippet”:

oddgen_table_ddl_generator

The package specification for this generator looks as follows:

CREATE OR REPLACE PACKAGE gen_table_ddl_oddgen_wrapper IS
   SUBTYPE string_type IS VARCHAR2(1000 CHAR);
   SUBTYPE param_type IS VARCHAR2(60 CHAR);
   TYPE t_string IS TABLE OF string_type;
   TYPE t_param IS TABLE OF string_type INDEX BY param_type;
   TYPE t_lov IS TABLE OF t_string INDEX BY param_type;

   FUNCTION get_name RETURN VARCHAR2;

   FUNCTION get_description RETURN VARCHAR2;

   FUNCTION get_object_types RETURN t_string;

   FUNCTION get_object_names(in_object_type IN VARCHAR2) RETURN t_string;

   FUNCTION get_params RETURN t_param;

   FUNCTION get_ordered_params RETURN t_string;

   FUNCTION get_lov RETURN t_lov;

   FUNCTION generate(in_object_type IN VARCHAR2,
                     in_object_name IN VARCHAR2,
                     in_params      IN t_param) RETURN CLOB;
END gen_table_ddl_oddgen_wrapper;
/

I’m going to explain some parts of the wrapper implementation based on the package body for this oddgen wrapper:

CREATE OR REPLACE PACKAGE BODY gen_table_ddl_oddgen_wrapper IS
   co_entity   CONSTANT param_type := 'Entity name (singular, for PK column)';
   co_entities CONSTANT param_type := 'Entity name (plural, for object names)';
   co_add_fky  CONSTANT param_type := 'Add foreign key?';
   co_prefix   CONSTANT param_type := 'Object prefix';
   co_in_apex  CONSTANT param_type := 'Data populated through APEX?';

   FUNCTION get_name RETURN VARCHAR2 IS
   BEGIN
      RETURN 'Table DDL snippet';
   END get_name;

   FUNCTION get_description RETURN VARCHAR2 IS
   BEGIN
      RETURN 'Steven Feuerstein''s starting point, from which he adds entity-specific columns, additional foreign keys, etc.';
   END get_description;

   FUNCTION get_object_types RETURN t_string IS
   BEGIN
      RETURN NEW t_string('TABLE');
   END get_object_types;

   FUNCTION get_object_names(in_object_type IN VARCHAR2) RETURN t_string IS
   BEGIN
      RETURN NEW t_string('Snippet');
   END get_object_names;

   FUNCTION get_params RETURN t_param IS
      l_params t_param;
   BEGIN
      l_params(co_entity) := 'employee';
      l_params(co_entities) := NULL;
      l_params(co_add_fky) := 'Yes';
      l_params(co_prefix) := NULL;
      l_params(co_in_apex) := 'No';
      RETURN l_params;
   END get_params;

   FUNCTION get_ordered_params RETURN t_string IS
   BEGIN
      RETURN NEW t_string(co_entity, co_entities, co_add_fky, co_prefix);
   END get_ordered_params;

   FUNCTION get_lov RETURN t_lov IS
      l_lov t_lov;
   BEGIN
      l_lov(co_add_fky) := NEW t_string('Yes', 'No');
      l_lov(co_in_apex) := NEW t_string('Yes', 'No');
      RETURN l_lov;
   END get_lov;

   FUNCTION generate(in_object_type IN VARCHAR2,
                     in_object_name IN VARCHAR2,
                     in_params      IN t_param) RETURN CLOB IS
      l_lines    sys.dbms_output.chararr;
      l_numlines INTEGER := 10; -- buffer size
      l_result   CLOB;

      PROCEDURE enable_output IS
      BEGIN
         sys.dbms_output.enable(buffer_size => NULL); -- unlimited size
      END enable_output;

      PROCEDURE disable_output IS
      BEGIN
         sys.dbms_output.disable;
      END disable_output;

      PROCEDURE call_generator IS
      BEGIN
         gen_table_ddl(entity_in   => in_params(co_entity),
                       entities_in => in_params(co_entities),
                       add_fky_in  => CASE
                                         WHEN in_params(co_add_fky) = 'Yes' THEN
                                          TRUE
                                         ELSE
                                          FALSE
                                      END,
                       prefix_in   => in_params(co_prefix),
                       in_apex_in  => CASE
                                         WHEN in_params(co_in_apex) = 'Yes' THEN
                                          TRUE
                                         ELSE
                                          FALSE
                                      END);
      END call_generator;

      PROCEDURE copy_dbms_output_to_result IS
      BEGIN
         sys.dbms_lob.createtemporary(l_result, TRUE);
         <<read_dbms_output_into_buffer>>
         WHILE l_numlines > 0
         LOOP
            sys.dbms_output.get_lines(l_lines, l_numlines);
            <<copy_buffer_to_clob>>
            FOR i IN 1 .. l_numlines
            LOOP
               sys.dbms_lob.append(l_result, l_lines(i) || chr(10));
            END LOOP copy_buffer_to_clob;
         END LOOP read_dbms_output_into_buffer;
      END copy_dbms_output_to_result;
   BEGIN
      enable_output;
      call_generator;
      copy_dbms_output_to_result;
      disable_output;
      RETURN l_result;
   END generate;
END gen_table_ddl_oddgen_wrapper;
/

On lines 2 to 6 constants for every parameter are defined. The values are used as labels in the GUI.

The function get_name (line 10) defines the name used in the GUI for this generator.

The function get_description (line 15) returns a description of the generator. The description is shown in the status bar, as tool-tip and in the generator dialog.

The function get_object_types (line 20) defines the valid object types. We’ve chosen the object type “TABLE” because it represents the target code quite well. Using a known object type also leads to a nice icon representation.

The function get_object_names (line 25) defines the valid object names for an object type. The parameter in_object_type is not used since our list is static and contains just the one value “Snippet”.

The function get_params (line 31-35) defines the list of input parameters for the generator (besides object type and object name). Here we define our five parameters with their default values. The default values are important to generate meaningful code when double-clicking on the object name node “Snippet”. So, by default a table named “employees” with a foreign key template and triggers for non-APEX usage is generated.

The function get_ordered_params (line 41) defines the order of the parameters in the generator dialog. Such a definition is necessary since the original order is lost. That’s expected behaviour for an associative array indexed by string. The default order by name is not very intuitive in this case.

The function get_lov (line 47-48) defines the list-of-values per input parameter. We use “Yes” and “No” for the boolean parameters co_add_fky and co_in_apex since oddgen supports string parameters only. However, in the GUI typical boolean value pairs such as “Yes-No”, “1-0” and “true-false” are recognised based on the list-of-values definition and are represented as checkboxes. Hence, for the user it does not matter that technically no boolean parameters are used.

The function generate (line 103-107) defines the steps to produce the generated code. Each step is represented by a nested procedure call:

  • enable_output – enables dbms_output with an unlimited buffer size
  • call_generator – calls the original generator code using the parameters passed by oddgen
  • copy_dbms_output_to_result – copies the output of the original generator into the result CLOB
  • disable_output – disables dbms_output

Finally the generated code is returned as CLOB.
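A quick way to smoke-test the wrapper outside of SQL Developer might look as follows (my own sketch, not part of the original post; note that the parameter key must match the label constant co_entity defined in the package body):

VARIABLE result CLOB

DECLARE
   l_params gen_table_ddl_oddgen_wrapper.t_param;
BEGIN
   -- start from the defaults and override the entity name only
   l_params := gen_table_ddl_oddgen_wrapper.get_params;
   l_params('Entity name (singular, for PK column)') := 'department';
   :result := gen_table_ddl_oddgen_wrapper.generate(
                 in_object_type => 'TABLE',
                 in_object_name => 'Snippet',
                 in_params      => l_params
              );
END;
/

PRINT result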

6. Grant Access

To ensure that the generator is available for every user connecting to the instance, we have to grant execute rights on the PL/SQL wrapper package. Granting the package to PUBLIC is probably the easiest way.

GRANT EXECUTE ON gen_table_ddl_oddgen_wrapper TO PUBLIC;

7. Run in SQL Developer

Now we may run the generator in SQL Developer. The following video is 39 seconds long, contains no audio and shows how to generate the DDLs with default parameters and how to run the generator with amended parameters to generate the DDLs for a table to be used in an APEX application.

8. Conclusion

Every PL/SQL based code generator producing a document (CLOB, XMLTYPE, JSON), messages via DBMS_OUTPUT or records in tables can be integrated into SQL Developer using the oddgen extension. The effort depends on the number of parameters and their valid values. For simple generators this won’t take more than a few minutes, especially if you are an experienced oddgen user.

I hope you found this post useful. Your comments and feedback are very much appreciated.

PL/SQL Bulk Unwrap


406 days ago I released PL/SQL Unwrapper for SQL Developer version 0.1.1 and blogged about it. With this extension you can unwrap the content of a SQL Developer window. Time for an update. With the new version 1.0 you can unwrap multiple selected objects with a few mouse clicks. In this blog post I show how.

1. Install Extensions

I assume that you are already using SQL Developer 4.0.2 or higher. If not, it is about time to grab the latest version from here and install it. Note that the extensions won’t run in older versions of SQL Developer.

Configure the update centers http://update.salvis.com/ and http://update.oddgen.org/ to install the extensions for SQL Developer:

updates_oddgen_unwrapper

If you cannot use the update centers because your company’s network restricts internet access, then download the latest versions and install them from file.

Why download oddgen for SQL Developer? Because the bulk unwrap feature is implemented as an oddgen plugin. Unwrapping editor content works without oddgen, but for bulk unwrap you need oddgen.

2. Setup Test Environment

If you have a schema in your Oracle database with wrapped code you may skip this step and use this schema for bulk unwrap.

For the test environment I’ve used Morten Braten’s Alexandria PL/SQL Utility Library. Clone or download the library from GitHub. To install the library you need a dedicated user. Create such a user as SYS on your Oracle database instance as follows:

CREATE USER ax IDENTIFIED BY ax
DEFAULT TABLESPACE users
TEMPORARY TABLESPACE temp;

ALTER USER ax QUOTA UNLIMITED ON users;

GRANT connect, resource TO ax;
GRANT execute ON dbms_crypto TO ax;

Then run the install.sql script in the setup directory of the Alexandria PL/SQL Utility Library as user AX.

@install.sql

Wrap the PL/SQL code in schema AX, except package and type specifications, by running the script wrap_schema.sql.
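The original wrap_schema.sql is not reproduced here. A rough sketch of such a script might look like this, assuming every package body fits into a VARCHAR2 (dbms_ddl.create_wrapped has no CLOB overload) and ignoring type bodies and standalone program units:

DECLARE
   l_ddl VARCHAR2(32767);
BEGIN
   FOR r IN (SELECT object_name
               FROM user_objects
              WHERE object_type = 'PACKAGE BODY') LOOP
      -- fetch the DDL and recreate the package body in wrapped form
      l_ddl := sys.dbms_metadata.get_ddl('PACKAGE_BODY', r.object_name);
      sys.dbms_ddl.create_wrapped(ddl => l_ddl);
   END LOOP;
END;
/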

3. Bulk Unwrap

Start SQL Developer and open a connection as user AX on your database.

If the oddgen window is not visible then select “Generators” from the “View” menu as shown in the following picture:

menu_view_generators

Afterwards the Generators window appears in the lower left corner within SQL Developer.

generators

Select the open connection in the combo box of the Generators window. Open the “PL/SQL Unwrapper” node and the “Package Bodies” node to show all wrapped package body names.

generators2

Select some or all package body nodes and press Return to generate the unwrapped code in a new worksheet. Afterwards you may just execute the generated code. Add “SET DEFINE OFF” at the start of the script to ensure that unwrapped code containing ampersand (&) characters is processed correctly. Another option is to configure a connection startup script (login.sql) to change the default behaviour, as sketched below.
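For the second option, a minimal login.sql might contain nothing more than this (a sketch; adjust it to your environment):

-- login.sql: runs when a connection is opened, provided SQL Developer
-- is configured to use a connection startup script
SET DEFINE OFF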

The following audioless video shows the whole bulk unwrapping process in just 56 seconds.

I hope you find this new feature useful.

Trivadis PL/SQL & SQL Coding Guidelines Version 3.1


The latest version 3.1 of the Trivadis PL/SQL & SQL Coding Guidelines has 150 pages – more than 90 additional pages compared to version 2.0. Roger Troller did a tremendous job in updating and extending an already comprehensive document while making it simpler to read and easier to understand. In this post I will emphasise some changes I consider relevant.

New Guideline Categorisation Scheme

In version 2.0 coding guidelines are categorised by icons for information, caution, performance relevance, maintainability and readability. A guideline is associated with exactly one icon. Here’s an example:

guideline_12

In version 3.1 the characteristics changeability, efficiency, maintainability, portability, reliability, reusability, security and testability, as defined by the Software Quality Assessment based on Lifecycle Expectations (SQALE) methodology, are used to categorise guidelines. A guideline is associated with one or more SQALE characteristics. Additionally a guideline is assigned a severity (blocker, critical, major, minor, info). So guidelines are categorised in two dimensions: SQALE characteristics and severity. These categorisations are used to enable or disable guidelines in SonarQube or PL/SQL Cop. It’s not by chance that SonarQube uses exactly these categorisations.

Here’s the same example as above using this new guideline categorisation scheme:

guideline_2150

In this excerpt you see other changes as well. The reference to the CodeXpert rule is gone, guideline 12 got the new identifier 2150, and there is a good and a bad example.

Good and Bad Examples for Every Guideline

In version 2.0 some guidelines had no examples, some just an excerpt of an example, some just a good and some just a bad example. Now in version 3.1 almost every guideline has a complete bad and a complete good example. By complete I mean that they are executable in SQL*Plus, SQLcl or within your IDE of choice. Why “almost”? For example, guideline 65/7210 says “Try to keep your packages small. Include only few procedures and functions that are used in the same context”. In some cases it is just not feasible or helpful to include a complete example.

For me, as the guy responsible for writing rules to check compliance with the guidelines, good and bad examples are essential for unit testing. Such examples also help developers understand the guidelines. That’s why we include these examples in PL/SQL Cop.
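To give a flavour of this style, here is a hypothetical bad/good pair of my own for guideline 3190 (avoid using NATURAL JOIN, see the new guidelines below); it is not taken from the document:

-- bad
SELECT ename, dname
  FROM emp
NATURAL JOIN dept;

-- good
SELECT e.ename, d.dname
  FROM emp e
  JOIN dept d ON d.deptno = e.deptno;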

New Guidelines

Besides some changes in the categorisation and presentation of the guidelines, there are some new guidelines which I’d like to mention here:

ID    Guideline                                                                 Severity  SQALE Characteristics
----  ------------------------------------------------------------------------  --------  ------------------------------------------
2230  Try to use SIMPLE_INTEGER datatype when appropriate.                      Minor     Efficiency
3150  Try to use identity columns for surrogate keys.                           Minor     Maintainability, Reliability
3160  Avoid virtual columns to be visible.                                      Major     Maintainability, Reliability
3170  Always use DEFAULT ON NULL declarations to assign default values to      Major     Reliability
      table columns if you refuse to store NULL values.
3180  Always specify column names instead of positional references in          Major     Changeability, Reliability
      ORDER BY clauses.
3190  Avoid using NATURAL JOIN.                                                 Major     Changeability, Reliability
5010  Try to use an error/logging framework for your application.              Critical  Reliability, Reusability, Testability
7460  Try to define your packaged/standalone function to be deterministic      Major     Efficiency
      if appropriate.
7810  Do not use SQL inside PL/SQL to read sequence numbers (or SYSDATE).      Major     Efficiency, Maintainability
8120  Never check existence of a row to decide whether to create it or not.    Major     Efficiency, Reliability
8310  Always validate input parameter size by assigning the parameter to a     Minor     Maintainability, Reliability, Reusability,
      size limited variable in the declaration section of the program unit.              Testability
8410  Always use application locks to ensure a program unit is only running    Minor     Efficiency, Reliability
      once at a given time.
8510  Always use dbms_application_info to track program process transiently.   Minor     Efficiency, Reliability

Deprecated Guidelines

Guideline 54 “Avoid use of EXCEPTION_INIT pragma for a -20,NNN error” is no longer part of the document.

New Guideline Identifiers

All guidelines got a new identifier. The first digit identifies the chapter of the document, e.g. “1” for “4.1 General”, “2” for “4.2 Variables & Types”, etc. The second digit is reserved for sub-chapters and the remaining digits are just for ordering purposes. The gaps in the numbering scheme should make it possible to add future guidelines at the right place without renumbering everything (again).

There is an appendix mapping old guideline identifiers to new ones. This should simplify the transition to version 3.1. Here’s an excerpt:

appendix_a

Tool Support

PL/SQL Cop is mentioned in the guidelines. However, currently only the Trivadis PL/SQL & SQL Guidelines Version 2.0 are supported. But sometime in Q4 of 2016 an update supporting version 3.1 should be available.

Download

Get your copy of the Trivadis PL/SQL & SQL Guidelines Version 3.1 from here.

Bitemp Remodeler v0.1.0 Released


I’ve been working on a flexible table API generator for Oracle Databases for several months. A TAPI generator doesn’t sound like a real innovation. But this one contains some features you probably have not seen before in a TAPI generator, and hopefully you will like them as much as I do.

In this post I will not explain the feature set thoroughly. Instead I will more or less focus on one of my favourite features.

Four models

The generator knows the following four data models.

four_models

If your table is based on one of these four models you may

  1. simply generate a table API for it or
  2. switch to another model and optionally generate a table API as well.

Option 2) is extraordinary, since it will preserve the existing data. E.g. it will preserve the content of the flashback data archive when you switch your model from uni-temporal transaction-time to a bi-temporal model, even if the flashback archive tables need to be moved to another table. Furthermore it will keep the interface for the latest table the same. No application change required. Everything with just a few mouse clicks. If this sounds interesting to you, then have a look at https://github.com/oddgen/bitemp/blob/master/README.md where the concept is briefly explained, or join my session “oddgen – Bi-temporal Table API in Action” at the More than just – Performance Days 2016. Remote participation is still possible.

Option 1) is what we have had for years. It was part of Oracle Designer, it’s part of SQL Developer in a simplified way, and there are some more or less simple table API generators around. So no big deal. However, when you choose option 1), there is one part which is really cool: the hook API package concept.

The Hook API

The problem with a lot of table API solutions is that there is typically no developer-friendly way to include the business logic. I’ve seen the following:

  • Manual changes of the generated code, which is for various reasons not a good solution.
  • External hooks, e.g. in XML files, in INI files, relational tables, etc. and merged at generation time into the final code. Oracle Designer worked that way.
  • Code which is dynamically executed by the generator at runtime, e.g. code snippets stored in a pre-defined way in relational tables.

But what I’ve never seen is business logic implemented in manually crafted PL/SQL packages, separated from the generated PL/SQL code. That’s strange, because this is a common practice in Java based projects.

In Java you typically define an interface for that and configure the right implementation at runtime. In PL/SQL we may do something similar. A PL/SQL package specification is an interface definition. The fact that just one implementation may exist per interface is not a limiting factor in this case.

Bitemp Remodeler generates the following hook API package specification for the famous EMP table in schema SCOTT:

CREATE OR REPLACE PACKAGE emp_hook AS
   /**
   * Hooks called by non-temporal API for table emp_lt (see package body of emp_api)
   * generated by Bitemp Remodeler for SQL Developer.
   * The body of this package is not generated. It has to be crafted and maintained manually.
   * Since the API for table emp_lt ignores errors caused by a missing hook package body, the implementation is optional.
   *
   * @headcom
   */

   /**
   * Hook called before insert into non-temporal table emp_lt.
   *
   * @param io_new_row new Row to be inserted
   */
   PROCEDURE pre_ins (
      io_new_row IN OUT emp_ot
   );

   /**
   * Hook called after insert into non-temporal table emp_lt.
   *
   * @param in_new_row new Row to be inserted
   */
   PROCEDURE post_ins (
      in_new_row IN emp_ot
   );

   /**
   * Hook called before update non-temporal table emp_lt.
   *
   * @param io_new_row Row with updated column values
   * @param in_old_row Row with original column values
   */
   PROCEDURE pre_upd (
      io_new_row IN OUT emp_ot,
      in_old_row IN emp_ot
   );

   /**
   * Hook called after update non-temporal table emp_lt.
   *
   * @param in_new_row Row with updated column values
   * @param in_old_row Row with original column values
   */
   PROCEDURE post_upd (
      in_new_row IN emp_ot,
      in_old_row IN emp_ot
   );

   /**
   * Hook called before delete from non-temporal table emp_lt.
   *
   * @param in_old_row Row with original column values
   */
   PROCEDURE pre_del (
      in_old_row IN emp_ot
   );

   /**
   * Hook called after delete from non-temporal table emp_lt.
   *
   * @param in_old_row Row with original column values
   */
   PROCEDURE post_del (
      in_old_row IN emp_ot
   );

END emp_hook;
/

The generated table API calls the pre_ins procedure before an INSERT and the post_ins procedure after it. For UPDATE and DELETE this works the same way. The highlighted lines 5 and 6 point out two interesting things: the body is not generated, and the body does not need to be implemented, since the API ignores errors caused by a missing PL/SQL hook package body.

Technically this is solved as follows in the API package body:

CREATE OR REPLACE PACKAGE BODY emp_api AS
   --
   -- Note: SQL Developer 4.1.3 cannot produce a complete outline of this package body, because it cannot handle
   --       the complete flashback_query_clause. The following expression breaks SQL Developer:
   --
   --          VERSIONS PERIOD FOR vt$ BETWEEN MINVALUE AND MAXVALUE
   --
   --       It's expected that future versions will be able to handle the flashback_query_clause accordingly.
   --       See "Bug 24608738 - OUTLINE OF PL/SQL PACKAGE BODY BREAKS WHEN USING PERIOD FOR OF FLASHBACK_QUERY_"
   --       on MOS for details.
   --

   --
   -- Declarations to handle 'ORA-06508: PL/SQL: could not find program unit being called: "SCOTT.EMP_HOOK"'
   --
   e_hook_body_missing EXCEPTION;
   PRAGMA exception_init(e_hook_body_missing, -6508);

   --
   -- Debugging output level
   --
   g_debug_output_level dbms_output_level_type := co_off;

   --
   -- print_line
   --
   PROCEDURE print_line (
      in_proc  IN VARCHAR2,
      in_level IN dbms_output_level_type,
      in_line  IN VARCHAR2
   ) IS
   BEGIN
      IF in_level <= g_debug_output_level THEN
         sys.dbms_output.put(to_char(systimestamp, 'HH24:MI:SS.FF6'));
         CASE in_level
            WHEN co_info THEN
               sys.dbms_output.put(' INFO  ');
            WHEN co_debug THEN
               sys.dbms_output.put(' DEBUG ');
            ELSE
               sys.dbms_output.put(' TRACE ');
         END CASE;
         sys.dbms_output.put(substr(rpad(in_proc,27), 1, 27) || ' ');
         sys.dbms_output.put_line(substr(in_line, 1, 250));
      END IF;
   END print_line;

   --
   -- print_lines
   --
   PROCEDURE print_lines (
      in_proc  IN VARCHAR2,
      in_level IN dbms_output_level_type,
      in_lines IN CLOB
   ) IS
   BEGIN
      IF in_level <= g_debug_output_level THEN
         <<all_lines>>
         FOR r_line IN (
            SELECT regexp_substr(in_lines, '[^' || chr(10) || ']+', 1, level) AS line
              FROM dual
           CONNECT BY instr(in_lines, chr(10), 1, level - 1) BETWEEN 1 AND length(in_lines) - 1
         ) LOOP
            print_line(in_proc => in_proc, in_level => in_level, in_line => r_line.line);
         END LOOP all_lines;
      END IF;
   END print_lines;


   --
   -- do_ins
   --
   PROCEDURE do_ins (
      io_row IN OUT emp_ot
   ) IS
   BEGIN
      INSERT INTO emp_lt (
                     empno,
                     ename,
                     job,
                     mgr,
                     hiredate,
                     sal,
                     comm,
                     deptno
                  )
           VALUES (
                     io_row.empno,
                     io_row.ename,
                     io_row.job,
                     io_row.mgr,
                     io_row.hiredate,
                     io_row.sal,
                     io_row.comm,
                     io_row.deptno
                  )
        RETURNING empno
             INTO io_row.empno;
      print_line(
         in_proc  => 'do_ins',
         in_level => co_debug,
         in_line  => SQL%ROWCOUNT || ' rows inserted.'
      );
   END do_ins;

   --
   -- do_upd
   --
   PROCEDURE do_upd (
      io_new_row IN OUT emp_ot,
      in_old_row IN emp_ot
   ) IS
   BEGIN
      UPDATE emp_lt
         SET empno = io_new_row.empno,
             ename = io_new_row.ename,
             job = io_new_row.job,
             mgr = io_new_row.mgr,
             hiredate = io_new_row.hiredate,
             sal = io_new_row.sal,
             comm = io_new_row.comm,
             deptno = io_new_row.deptno
       WHERE empno = in_old_row.empno
         AND (
                 (ename != io_new_row.ename OR ename IS NULL AND io_new_row.ename IS NOT NULL OR ename IS NOT NULL AND io_new_row.ename IS NULL) OR
                 (job != io_new_row.job OR job IS NULL AND io_new_row.job IS NOT NULL OR job IS NOT NULL AND io_new_row.job IS NULL) OR
                 (mgr != io_new_row.mgr OR mgr IS NULL AND io_new_row.mgr IS NOT NULL OR mgr IS NOT NULL AND io_new_row.mgr IS NULL) OR
                 (hiredate != io_new_row.hiredate OR hiredate IS NULL AND io_new_row.hiredate IS NOT NULL OR hiredate IS NOT NULL AND io_new_row.hiredate IS NULL) OR
                 (sal != io_new_row.sal OR sal IS NULL AND io_new_row.sal IS NOT NULL OR sal IS NOT NULL AND io_new_row.sal IS NULL) OR
                 (comm != io_new_row.comm OR comm IS NULL AND io_new_row.comm IS NOT NULL OR comm IS NOT NULL AND io_new_row.comm IS NULL) OR
                 (deptno != io_new_row.deptno OR deptno IS NULL AND io_new_row.deptno IS NOT NULL OR deptno IS NOT NULL AND io_new_row.deptno IS NULL)
             );
      print_line(
         in_proc  => 'do_upd',
         in_level => co_debug,
         in_line  => SQL%ROWCOUNT || ' rows updated.'
      );
   END do_upd;

   --
   -- do_del
   --
   PROCEDURE do_del (
      in_row IN emp_ot
   ) IS
   BEGIN
      DELETE
        FROM emp_lt
       WHERE empno = in_row.empno;
      print_line(
         in_proc  => 'do_del',
         in_level => co_debug,
         in_line  => SQL%ROWCOUNT || ' rows deleted.'
      );
   END do_del;

   --
   -- ins
   --
   PROCEDURE ins (
      in_new_row IN emp_ot
   ) IS
      l_new_row emp_ot;
   BEGIN
      print_line(in_proc => 'ins', in_level => co_info, in_line => 'started.');
      l_new_row := in_new_row;
      <<pre_ins>>
      BEGIN
         emp_hook.pre_ins(io_new_row => l_new_row);
      EXCEPTION
         WHEN e_hook_body_missing THEN
            NULL;
      END pre_ins;
      do_ins(io_row => l_new_row);
      <<post_ins>>
      BEGIN
         emp_hook.post_ins(in_new_row => l_new_row);
      EXCEPTION
         WHEN e_hook_body_missing THEN
            NULL;
      END post_ins;
      print_line(in_proc => 'ins', in_level => co_info, in_line => 'completed.');
   END ins;

   --
   -- upd
   --
   PROCEDURE upd (
      in_new_row IN emp_ot,
      in_old_row IN emp_ot
   ) IS
      l_new_row emp_ot;
   BEGIN
      print_line(in_proc => 'upd', in_level => co_info, in_line => 'started.');
      l_new_row := in_new_row;
      <<pre_upd>>
      BEGIN
         emp_hook.pre_upd(io_new_row => l_new_row, in_old_row => in_new_row);
      EXCEPTION
         WHEN e_hook_body_missing THEN
            NULL;
      END pre_upd;
      do_upd(io_new_row => l_new_row, in_old_row => in_old_row);
      <<post_upd>>
      BEGIN
         emp_hook.post_upd(in_new_row => l_new_row, in_old_row => in_old_row);
      EXCEPTION
         WHEN e_hook_body_missing THEN
            NULL;
      END post_upd;
      print_line(in_proc => 'upd', in_level => co_info, in_line => 'completed.');
   END upd;

   --
   -- del
   --
   PROCEDURE del (
      in_old_row IN emp_ot
   ) IS
   BEGIN
      print_line(in_proc => 'del', in_level => co_info, in_line => 'started.');
      <<pre_del>>
      BEGIN
         emp_hook.pre_del(in_old_row => in_old_row);
      EXCEPTION
         WHEN e_hook_body_missing THEN
            NULL;
      END pre_del;
      do_del(in_row => in_old_row);
      <<post_del>>
      BEGIN
         emp_hook.post_del(in_old_row => in_old_row);
      EXCEPTION
         WHEN e_hook_body_missing THEN
            NULL;
      END post_del;
      print_line(in_proc => 'del', in_level => co_info, in_line => 'completed.');
   END del;

   --
   -- set_debug_output
   --
   PROCEDURE set_debug_output (
      in_level IN dbms_output_level_type DEFAULT co_off
   ) IS
   BEGIN
      g_debug_output_level := in_level;
   END set_debug_output;

END emp_api;
/

Now you may ask what the performance impact of these e_hook_body_missing exceptions is. I’ve done a small test and called a procedure 1 million times without and 1 million times with an implemented body. The overhead of the missing body exception is about 7 microseconds per call. Here’s the test output from SQL Developer; the relevant lines 51 and 89 are highlighted.

SQL> SET FEEDBACK ON
SQL> SET ECHO ON
SQL> SET TIMING ON
SQL> DROP PACKAGE dummy_api;

Package DUMMY_API dropped.

Elapsed: 00:00:00.027
SQL> DROP PACKAGE dummy_hook;

Package DUMMY_HOOK dropped.

Elapsed: 00:00:00.030
SQL> CREATE OR REPLACE PACKAGE dummy_hook AS
   PROCEDURE pre_ins;
END dummy_hook;
/

Package DUMMY_HOOK compiled

Elapsed: 00:00:00.023
SQL> CREATE OR REPLACE PACKAGE dummy_api AS
   PROCEDURE ins;
END dummy_api;
/

Package DUMMY_API compiled

Elapsed: 00:00:00.034
SQL> CREATE OR REPLACE PACKAGE BODY dummy_api AS
   e_hook_body_missing EXCEPTION;
   PRAGMA exception_init(e_hook_body_missing, -6508);
   PROCEDURE ins IS
   BEGIN
      BEGIN
         dummy_hook.pre_ins;
      EXCEPTION
         WHEN e_hook_body_missing THEN
            NULL;
      END pre_ins;
      dbms_output.put('.');
   END ins;
END dummy_api;
/

Package body DUMMY_API compiled

Elapsed: 00:00:00.040
SQL> -- without hook body
SQL> BEGIN
   FOR i IN 1..1E6 LOOP
      dummy_api.ins;
   END LOOP;
END;
/

PL/SQL procedure successfully completed.

Elapsed: 00:00:07.878
SQL> CREATE OR REPLACE PACKAGE BODY dummy_hook AS
   PROCEDURE pre_ins IS
   BEGIN
      dbms_output.put('-');
   END pre_ins;
END dummy_hook;
/

Package body DUMMY_HOOK compiled

Elapsed: 00:00:00.029
SQL> -- with hook body
SQL> BEGIN
   FOR i IN 1..1E6 LOOP
      dummy_api.ins;
   END LOOP;
END;
/

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.632

It makes sense to provide a body with NULL implementations to avoid the small overhead of handling the missing body exception.
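Such a body is quickly written, since only the hooks you actually need require real logic. Here is a sketch of mine (not generated by Bitemp Remodeler); the upper-casing of ename in pre_ins is just an example business rule:

CREATE OR REPLACE PACKAGE BODY emp_hook AS
   PROCEDURE pre_ins (io_new_row IN OUT emp_ot) IS
   BEGIN
      io_new_row.ename := upper(io_new_row.ename); -- example business rule
   END pre_ins;

   PROCEDURE post_ins (in_new_row IN emp_ot) IS
   BEGIN
      NULL; -- intentionally empty
   END post_ins;

   PROCEDURE pre_upd (io_new_row IN OUT emp_ot, in_old_row IN emp_ot) IS
   BEGIN
      NULL; -- intentionally empty
   END pre_upd;

   PROCEDURE post_upd (in_new_row IN emp_ot, in_old_row IN emp_ot) IS
   BEGIN
      NULL; -- intentionally empty
   END post_upd;

   PROCEDURE pre_del (in_old_row IN emp_ot) IS
   BEGIN
      NULL; -- intentionally empty
   END pre_del;

   PROCEDURE post_del (in_old_row IN emp_ot) IS
   BEGIN
      NULL; -- intentionally empty
   END post_del;
END emp_hook;
/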

Nonetheless, the way the business logic is separated from the generated code is one of the many things I like about Bitemp Remodeler.

Download Bitemp Remodeler from the Download section on my blog or install it directly via the SQL Developer update site http://update.oddgen.org/
