Last week I had a talk at Oracle’s OpenWorld 2011 titled Modern PL/SQL Code Checking and Dependency Analysis.
The problem I described in chapter 4 was to find all view columns using the column UNIT_COST of the table COSTS in the SH schema. Other usages of this column (e.g. in where or order by clauses) have to be ignored. To solve this problem within the Oracle Database Server 11.2 a parser is necessary (at least I’m not aware of another solution). Even a DBA_DEPENDENCY_COLUMNS view as described in Rob van Wijk’s post is not enough to solve this problem.
However, in this particular case no custom or 3rd party parser is necessary. Oracle provides a procedure named PARSEQUERY in the PL/SQL package UTL_XML which is in fact well suited to solve this problem as I will show later. First, I’d like explain which columns should be found by a dependency analysis procedure based on some sample views.
Oracle’s sales history demo schema SH provides view named PROFITS, which is defined as follows:
CREATE OR REPLACE VIEW PROFITS AS SELECT s.channel_id, s.cust_id, s.prod_id, s.promo_id, s.time_id, c.unit_cost, c.unit_price, s.amount_sold, s.quantity_sold, c.unit_cost * s.quantity_sold TOTAL_COST FROM costs c, sales s WHERE c.prod_id = s.prod_id AND c.time_id = s.time_id AND c.channel_id = s.channel_id AND c.promo_id = s.promo_id;
The columns using COSTS.UNIT_COSTS are highlighted.
The following view uses the column TOTAL_COSTS in GROSS_MARGIN (line 14) and GROSS_MARGIN_PERCENT (lines 14 and 15). The usage is not evident at the first glance since it is based on the column GROSS_MARGIN (line 4) of the named query GM and the column COST (line 8) in GM’s subquery. This kind of dependencies need to be identified.
CREATE OR REPLACE VIEW GROSS_MARGINS AS WITH gm AS ( SELECT time_id, revenue, revenue - cost AS gross_margin FROM ( SELECT time_id, unit_price * quantity_sold AS revenue, total_cost AS cost FROM profits ) ) SELECT t.fiscal_year, SUM(revenue) AS revenue, SUM(gross_margin) AS gross_margin, round(100 * SUM(gross_margin) / SUM(revenue), 2) AS gross_margin_percent FROM gm INNER JOIN times t ON t.time_id = gm.time_id GROUP BY t.fiscal_year ORDER BY t.fiscal_year;
The next view does not present the data of COSTS.UNIT_COST as a column, even if the view depends on the table COSTS
CREATE OR REPLACE VIEW REVENUES AS SELECT fiscal_year, revenue FROM gross_margins;
The last view uses COSTS.UNIT_COST but not as part of a column expression and therefore has not to be reported. The usage in the order by clause is considered save.
CREATE OR REPLACE VIEW SALES_ORDERED_BY_GM AS SELECT channel_id, cust_id, prod_id, promo_id, time_id, amount_sold, quantity_sold FROM profits ORDER BY (unit_price - unit_cost) DESC;
So, the following result of the dependency analysis is expected:
SCHEMA | VIEW | COLUMN |
---|---|---|
SH | PROFITS | UNIT_COST |
SH | PROFITS | TOTAL_COST |
SH | GROSS_MARGINS | GROSS_MARGIN |
SH | GROSS_MARGINS | GROSS_MARGIN_PERCENT |
Exactly this result is created by the following query
SELECT * FROM TABLE(coldep_pkg.get_dep('sh', 'costs', 'unit_cost'));
Now I just list all the code snippets I’ve written to create this result. Please note, that this is considered just a proof-of-concept code to show how UTL_XML.PARSEQUERY could be used for SQL dependency analysis in conjunction with Oracle dictionary views. This means that this is not a complete implementation. For example wild cards (*) is not handled which may lead to missing dependencies. Additionally table/view sources are not checked which may lead to false positives (in case a column is used in multiple view/table sources). – Please feel free to complete the code. However, an update is highly appreciated ;-)
GRANT EXECUTE ON SYS.UTL_XML TO SH;
CREATE OR REPLACE TYPE "SH"."COLDEP_TYP" AS OBJECT (schema_name VARCHAR2(30), view_name varchar2(30), column_name VARCHAR2(30)) / CREATE OR REPLACE TYPE "SH"."COLDEP_L" IS TABLE OF coldep_typ /
CREATE OR REPLACE PACKAGE "SH"."COLDEP_PKG" IS FUNCTION parse_query(p_query IN VARCHAR2) RETURN xmltype; FUNCTION get_dep(p_schema_name IN VARCHAR2, p_object_name IN VARCHAR2, p_column_name IN VARCHAR2) RETURN coldep_l PIPELINED; FUNCTION process_view(p_schema_name IN VARCHAR2, p_view_name IN VARCHAR2, p_column_name IN VARCHAR2, p_query IN CLOB) RETURN coldep_l; END coldep_pkg; / CREATE OR REPLACE PACKAGE BODY "SH"."COLDEP_PKG" IS FUNCTION parse_query(p_query IN VARCHAR2) RETURN xmltype IS v_clob CLOB; v_xml xmltype; BEGIN dbms_lob.createtemporary(v_clob, TRUE); -- parse query and get XML as CLOB sys.utl_xml.parsequery(USER, p_query, v_clob); -- create XMLTYPE from CLOB v_xml := xmltype.createxml(v_clob); dbms_lob.freetemporary(v_clob); RETURN v_xml; END parse_query; FUNCTION get_dep(p_schema_name IN VARCHAR2, p_object_name IN VARCHAR2, p_column_name IN VARCHAR2) RETURN coldep_l PIPELINED IS BEGIN -- query dictionary dependencies FOR v_dep IN (SELECT d.owner AS schema_name, d.name AS view_name, v.text AS query_text FROM all_dependencies d INNER JOIN all_views v ON v.owner = d.owner AND v.view_name = d.name WHERE d.referenced_owner = upper(p_schema_name) AND d.referenced_name = upper(p_object_name) AND d.type = 'VIEW') LOOP -- process every fetched view FOR v_views IN ( SELECT VALUE(pv) coldep FROM TABLE(process_view(v_dep.schema_name, v_dep.view_name, p_column_name, v_dep.query_text)) pv) LOOP -- return column usages in v_dep.view_name PIPE ROW(v_views.coldep); -- get column usages of views using v_dep.view_name (recursive calls) FOR v_recursive IN ( SELECT VALUE(dep) coldep FROM TABLE(get_dep(v_views.coldep.schema_name, v_views.coldep.view_name, v_views.coldep.column_name)) dep) LOOP -- return column usages of recursive call PIPE ROW(v_recursive.coldep); END LOOP; END LOOP; END LOOP; END get_dep; FUNCTION process_view(p_schema_name IN VARCHAR2, p_view_name IN VARCHAR2, p_column_name IN VARCHAR2, p_query IN CLOB) RETURN coldep_l IS v_search_l coldep_l := coldep_l(coldep_typ(NULL, NULL, p_column_name)); v_xml xmltype; v_previous_count INTEGER := 0; v_coldep_l coldep_l := coldep_l(); BEGIN -- parse view query v_xml := parse_query(p_query); -- get inline dependencies from secondary select lists -- TODO: handle table/view source and wildcard properly WHILE v_previous_count < v_search_l.count LOOP v_previous_count := v_search_l.count; FOR v_secondary IN ( SELECT nvl(x.alias_name, x.column_reference) AS alias_name FROM (SELECT t.select_list_item, t.alias_name, extractvalue(VALUE(c), 'COLUMN') AS column_reference FROM xmltable('//SELECT_LIST_ITEM[ancestor::FROM or ancestor::WITH]' passing v_xml columns select_list_item xmltype path '//SELECT_LIST_ITEM', alias_name VARCHAR2(30) path '//COLUMN_ALIAS') t, TABLE(xmlsequence(extract(select_list_item, '//COLUMN'))) c) x WHERE upper(x.column_reference) IN (SELECT upper(column_name) FROM TABLE(v_search_l)) AND upper(alias_name) NOT IN (SELECT upper(column_name) FROM TABLE(v_search_l))) LOOP -- add internal column usage v_search_l.extend; v_search_l(v_search_l.count) := coldep_typ(NULL, NULL, v_secondary.alias_name); END LOOP; END LOOP; -- analyze primary select list -- TODO: handle table/view source and wildcard properly FOR v_primary IN ( SELECT x.column_id, atc.column_name FROM (SELECT t.select_list_item, t.column_id, extractvalue(VALUE(c), 'COLUMN') AS column_reference FROM xmltable('//SELECT_LIST_ITEM[not (ancestor::FROM) and not (ancestor::WITH)]' passing v_xml columns column_id FOR ordinality, select_list_item xmltype path '//SELECT_LIST_ITEM') t, TABLE(xmlsequence(extract(select_list_item, '//COLUMN'))) c) x INNER JOIN all_tab_columns atc ON atc.owner = p_schema_name AND atc.table_name = p_view_name AND atc.column_id = x.column_id WHERE upper(x.column_reference) IN (SELECT upper(column_name) FROM TABLE(v_search_l)) ORDER BY x.column_id) LOOP -- add external column usage v_coldep_l.extend; v_coldep_l(v_coldep_l.count) := coldep_typ(p_schema_name, p_view_name, v_primary.column_name); END LOOP; -- return column dependencies RETURN v_coldep_l; END process_view; END coldep_pkg; /
Below you find the XML parser output of the query defined in the view GROSS_MARGINS. The model becomes quite clear, even if I could not find a schema description.
<QUERY> <WITH> <WITH_ITEM> <QUERY_ALIAS>GM</QUERY_ALIAS> <QUERY> <SELECT> <SELECT_LIST> <SELECT_LIST_ITEM> <COLUMN_REF> <COLUMN>TIME_ID</COLUMN> </COLUMN_REF> </SELECT_LIST_ITEM> <SELECT_LIST_ITEM> <COLUMN_REF> <COLUMN>REVENUE</COLUMN> </COLUMN_REF> </SELECT_LIST_ITEM> <SELECT_LIST_ITEM> <SUB> <COLUMN_REF> <COLUMN>REVENUE</COLUMN> </COLUMN_REF> <COLUMN_REF> <COLUMN>COST</COLUMN> </COLUMN_REF> </SUB> <COLUMN_ALIAS>GROSS_MARGIN</COLUMN_ALIAS> </SELECT_LIST_ITEM> </SELECT_LIST> </SELECT> <FROM> <FROM_ITEM> <QUERY> <SELECT> <SELECT_LIST> <SELECT_LIST_ITEM> <COLUMN_REF> <TABLE>PROFITS</TABLE> <COLUMN>TIME_ID</COLUMN> </COLUMN_REF> </SELECT_LIST_ITEM> <SELECT_LIST_ITEM> <MUL> <COLUMN_REF> <TABLE>PROFITS</TABLE> <COLUMN>UNIT_PRICE</COLUMN> </COLUMN_REF> <COLUMN_REF> <TABLE>PROFITS</TABLE> <COLUMN>QUANTITY_SOLD</COLUMN> </COLUMN_REF> </MUL> <COLUMN_ALIAS>REVENUE</COLUMN_ALIAS> </SELECT_LIST_ITEM> <SELECT_LIST_ITEM> <COLUMN_REF> <TABLE>PROFITS</TABLE> <COLUMN>TOTAL_COST</COLUMN> </COLUMN_REF> <COLUMN_ALIAS>COST</COLUMN_ALIAS> </SELECT_LIST_ITEM> </SELECT_LIST> </SELECT> <FROM> <FROM_ITEM> <TABLE>PROFITS</TABLE> </FROM_ITEM> </FROM> </QUERY> </FROM_ITEM> </FROM> </QUERY> </WITH_ITEM> </WITH> <SELECT> <SELECT_LIST> <SELECT_LIST_ITEM> <COLUMN_REF> <TABLE_ALIAS>T</TABLE_ALIAS> <COLUMN>FISCAL_YEAR</COLUMN> </COLUMN_REF> </SELECT_LIST_ITEM> <SELECT_LIST_ITEM> <SUM> <COLUMN_REF> <COLUMN>REVENUE</COLUMN> </COLUMN_REF> </SUM> <COLUMN_ALIAS>REVENUE</COLUMN_ALIAS> </SELECT_LIST_ITEM> <SELECT_LIST_ITEM> <SUM> <COLUMN_REF> <COLUMN>GROSS_MARGIN</COLUMN> </COLUMN_REF> </SUM> <COLUMN_ALIAS>GROSS_MARGIN</COLUMN_ALIAS> </SELECT_LIST_ITEM> <SELECT_LIST_ITEM> <ROUND> <DIV> <MUL> <LITERAL>100</LITERAL> <SUM> <COLUMN_REF> <COLUMN>GROSS_MARGIN</COLUMN> </COLUMN_REF> </SUM> </MUL> <SUM> <COLUMN_REF> <COLUMN>REVENUE</COLUMN> </COLUMN_REF> </SUM> </DIV> <LITERAL>2</LITERAL> </ROUND> <COLUMN_ALIAS>GROSS_MARGIN_PERCENT</COLUMN_ALIAS> </SELECT_LIST_ITEM> </SELECT_LIST> </SELECT> <FROM> <FROM_ITEM> <JOIN> <INNER/> <JOIN_TABLE_1> <QUERY_ALIAS>GM</QUERY_ALIAS> </JOIN_TABLE_1> <JOIN_TABLE_2> <TABLE>TIMES</TABLE> <TABLE_ALIAS>T</TABLE_ALIAS> </JOIN_TABLE_2> <ON> <EQ> <COLUMN_REF> <TABLE>TIMES</TABLE> <TABLE_ALIAS>T</TABLE_ALIAS> <COLUMN>TIME_ID</COLUMN> </COLUMN_REF> <COLUMN_REF> <TABLE_ALIAS>GM</TABLE_ALIAS> <COLUMN>TIME_ID</COLUMN> </COLUMN_REF> </EQ> </ON> </JOIN> </FROM_ITEM> </FROM> <GROUP_BY> <EXPRESSION_LIST> <EXPRESSION_LIST_ITEM> <COLUMN_REF> <TABLE_ALIAS>T</TABLE_ALIAS> <COLUMN>FISCAL_YEAR</COLUMN> </COLUMN_REF> </EXPRESSION_LIST_ITEM> </EXPRESSION_LIST> </GROUP_BY> <ORDER_BY> <ORDER_BY_LIST> <ORDER_BY_LIST_ITEM> <COLUMN_REF> <TABLE_ALIAS>T</TABLE_ALIAS> <COLUMN>FISCAL_YEAR</COLUMN> </COLUMN_REF> </ORDER_BY_LIST_ITEM> </ORDER_BY_LIST> </ORDER_BY> </QUERY>
Please note that UTL_XML.PARSEQUERY is suited for extended query dependency analysis only. DML may be parsed, but the resulting model is incomplete with 11.2.0.2 (e.g. clauses missing in the select statement are not included in the model, like the SET clause in an update statement). If you need to analyze PL/SQL beyond PL/Scope you still may need a 3rd party parser.