Introduction
I started using SQL Developer in 2013. Back then version 4.0 was the latest and greatest. But the capabilities of the formatter were disappointing. In 2017 Oracle released version 4.2 with a new formatter and has been improving it ever since. Version 19.2 brought us dynamic JavaScript actions within the parse-tree query language Arbori. And now I must admit that I’m really impressed with the formatting capabilities of the latest versions of SQL Developer. Arbori is a hidden gem.
In this blog post I explain how the formatter works and how the output can be tweaked using two simple SQL queries.
If you only want to activate the coding styles suggested by the Trivadis PL/SQL & SQL Coding Guidelines, install the settings as described here.
How Does Formatting Work?
Formatting is all about adding (or removing) whitespaces (line breaks, spaces or tabs) between significant tokens. That sounds easy. Well, it’s not, because formatting requirements differ widely. Ultimately, it’s all about beautifying the code, and almost every developer has their own views on what makes code look good. Furthermore, it is technically demanding to provide a tool suite that is able to handle different coding styles via configuration.
The following figure illustrates the formatting process in SQL Developer.
I will explain each step and component in the next chapters.
Please note that these are conceptual components. The actual implementation might look different.
1. Parser
The parser reads the unformatted plain SQL or PL/SQL input and generates a parse-tree. The parse-tree is a hierarchical representation of the significant tokens of the input. In other words, there are neither whitespaces nor comments in a parse-tree.
Each node in the parse-tree includes the start and end position within the plain SQL input.
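To make the idea of positions more concrete, here is a minimal JavaScript sketch of such a tree. It is not SQL Developer's parser. The `tokenize` and `node` helpers are hypothetical, the grammar symbol names are only illustrative, and I assume that positions are token offsets written in the half-open form `[from,to)`.

```javascript
// Conceptual sketch only: a tiny hand-built parse-tree, not SQL Developer's parser.
// Assumption: positions are token offsets in half-open form [from, to).
const input = "select   dummy /* comment */ from dual";

// Hypothetical tokenizer: drops whitespace and block comments, keeps significant tokens.
function tokenize(sql) {
  return sql.replace(/\/\*[\s\S]*?\*\//g, " ").trim().split(/\s+/);
}

// Hypothetical node factory; 'symbol' is a grammar symbol or a quoted keyword.
function node(symbol, from, to, children = []) {
  return { symbol, from, to, children };
}

const tokens = tokenize(input); // ["select", "dummy", "from", "dual"]

// Illustrative symbol names; the real grammar may name these nodes differently.
const tree = node("query_block", 0, 4, [
  node("select_clause", 0, 2, [node("'SELECT'", 0, 1), node("identifier", 1, 2)]),
  node("from_clause", 2, 4, [node("'FROM'", 2, 3), node("identifier", 3, 4)]),
]);

// Text covered by a node, recovered from its positions alone.
function textOf(n) {
  return tokens.slice(n.from, n.to).join(" ");
}

console.log(textOf(tree.children[1])); // from dual
```

Neither the extra spaces nor the comment survive in the tree. Everything a later step needs to find a token again is the pair of positions.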
2. Formatter
The formatter needs the parse-tree and the code formatting configuration as input.
SQL Developer stores the configuration in the preferences.
- Under Code Editor -> Format -> Advanced Format for configuration properties such as Line breaks on comma (after, before, none).
- And under Code Editor -> Format -> Advanced Format -> Custom Format for the Arbori program used to handle whitespaces.
2.1 Provided Java Callback Functions
The formatter provides the following Java callback functions (in the order in which they are expected to be called):
- indentedNodes1
- indentedNodes2
- skipWhiteSpaceBeforeNode
- skipWhiteSpaceAfterNode
- identifiers
- extraBrkBefore
- extraBrkAfter
- brkX2
- rightAlignments
- paddedIdsInScope
- incrementalAlignments
- pairwiseAlignments
- ignoreLineBreaksBeforeNode
- ignoreLineBreaksAfterNode
- dontFormatNode
Each callback function gets the parameters target (the parse-tree) and tuple (node to be processed). As an Arbori developer you do not have to care about how to populate these parameters. It’s done automatically. target is a global variable and tuple is the result row of an Arbori query. Basically, you only need to query the nodes and call the callback functions. The position in an Arbori program defines the execution order.
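The interplay of queries and callbacks can be sketched in a few lines of JavaScript. This is a conceptual model and not the real engine: the `program` array, the `run` function and the simplified callbacks are assumptions made for illustration. Only the callback names and the parameters target and tuple come from the description above.

```javascript
// Conceptual sketch, not the real engine: an Arbori program modeled as an
// ordered list of named queries, each dispatching to a callback of the same name.
const log = [];

// Hypothetical stand-ins for two of the provided Java callbacks.
const callbacks = {
  extraBrkBefore: (target, tuple) => log.push(`extraBrkBefore@${tuple.node.from}`),
  rightAlignments: (target, tuple) => log.push(`rightAlignments@${tuple.node.from}`),
};

// 'target' stands for the parse-tree; here simply a flat list of nodes.
const target = [
  { symbol: "'SELECT'", from: 0, to: 1 },
  { symbol: "'FROM'", from: 2, to: 3 },
];

// Each query returns the matching nodes; every match becomes one result row (tuple).
const program = [
  { name: "extraBrkBefore", query: (t) => t.filter((n) => n.symbol === "'FROM'") },
  { name: "rightAlignments", query: (t) => t },
];

function run(queries, tree) {
  for (const { name, query } of queries) {   // program order = execution order
    for (const node of query(tree)) {
      callbacks[name](tree, { node });       // target and tuple are passed automatically
    }
  }
}

run(program, target);
console.log(log.join(", "));
// extraBrkBefore@2, rightAlignments@0, rightAlignments@2
```

Swapping the two entries in `program` changes the order of the log, which is the point of the last sentence above.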
These provided Java callback functions have two issues.
First of all, you don’t know what they do. Granted, there are some comments in the provided Arbori program, and also a description in the SQL Developer User’s Guide, but this will only give you a rough idea. For example, it leaves you in the dark why indentedNodes has two callback functions and both must be called.
Second, you cannot process selected nodes differently. You must write an enhancement request so that the SQL development team can provide the necessary callback functionality in a future release. This is cumbersome.
2.2 JavaScript Callback Functions
Thankfully, the SQL development team has added a JavaScript callback feature in version 19.2. This allows you to embed callback functions directly into your Arbori program. Now you can really add and remove whitespaces wherever you want. The global variable struct gives you access to the instance of the formatter and the configuration properties. As a result, you can manage the whitespaces before a position of a node through the methods getNewline and putNewline.
2.3 The Result
Basically, the result of this process is a list of whitespaces per position.
3. Serializer
The serializer loops through the leaf nodes of the parse-tree. It retrieves the leading whitespaces for a node’s start position and extracts the token text from the plain SQL input using the node’s start and end position. Then the serializer writes the whitespaces and the token text to the final result: the formatted SQL.
In fact, the process is actually a bit more complicated. It adds whitespaces to mandatory nodes, for instance.
Moreover, the serializer performs some “formatting” without Arbori. For example, it converts the case of identifiers and keywords according to the configuration (properties). Therefore, it is not possible to change the case of a token with an Arbori program. It might be possible by configuring a custom Java formatter class, but that’s another story.
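The serializer step can be sketched as follows. Again, this is a simplified model under my own assumptions and not SQL Developer's implementation: the `whitespaceBefore` map stands for the result of the formatter step, and `convertCase` is a hypothetical stand-in for the case conversion driven by the configuration properties.

```javascript
// Conceptual sketch of the serializer step; not SQL Developer's implementation.
const tokens = ["select", "dummy", "from", "dual", ";"];

// Leaf nodes with half-open token positions [from, to).
const leaves = tokens.map((_, i) => ({ from: i, to: i + 1 }));

// Result of the formatter step: whitespace to emit before a position.
const whitespaceBefore = new Map([
  [1, " "],
  [2, "\n  "],
  [3, " "],
]);

// Hypothetical case conversion, done by the serializer rather than by Arbori.
const keywords = new Set(["select", "from"]);
const convertCase = (t) => (keywords.has(t) ? t.toUpperCase() : t);

function serialize(leafNodes) {
  let out = "";
  for (const leaf of leafNodes) {
    out += whitespaceBefore.get(leaf.from) || "";                    // leading whitespace
    out += convertCase(tokens.slice(leaf.from, leaf.to).join(""));   // token text
  }
  return out;
}

console.log(serialize(leaves));
// SELECT dummy
//   FROM dual;
```

The sketch also shows why an Arbori program cannot change the case of a token: in this model it only ever fills the whitespace map, while the token text is written by the serializer.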
Example Using Provided Java Callback Function
Setup
For this example I use the Advanced Format according to the trivadis_advanced_format.xml file. Here’s a screenshot of the configuration settings of my SQL Developer 19.4.0 installation:
The default is used for the Custom Format.
Default Formatter Result
    SELECT e.ename,
           e.deptno,
           d.dname
      FROM dept d
      LEFT JOIN emp e
    ON d.deptno = e.deptno
     ORDER BY e.ename NULLS FIRST;
The result looks good, apart from the missing indentation on line 6.
Expected Formatter Result
What we expect is this:
    SELECT e.ename,
           e.deptno,
           d.dname
      FROM dept d
      LEFT JOIN emp e
        ON d.deptno = e.deptno
     ORDER BY e.ename NULLS FIRST;
The ON keyword is right-aligned like SELECT, FROM, LEFT and ORDER.
Code Outline
SQL Developer’s code outline is in fact a representation of the full parse-tree. Disable all filters to show all nodes.
The highlighted information is important for the next step.
Arbori Editor
Type arbori in the search field and press enter as shown below:
This will open the Arbori Editor. Type the following query in the editor window:
query: [node) 'ON' & [node^) on_using_condition ;
Press Run to display the query result:
What have we done? We query the parse-tree (outline) for all ON nodes where the parent node is an on_using_condition. A node is represented as [node). And a parent node is represented as [node^). A boolean AND is represented as &. See these links for more information about the Arbori grammar.
Click on the query result cell [19,20) 'ON' to highlight the node in the Code Outline window and the corresponding text in the worksheet. You can do the same with the cell [19,27) on_using_condition.
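If you are more at home in JavaScript than in Arbori, it may help to read the query as a filter over tree nodes. The following sketch expresses the same predicate as a plain tree walk. It is not how Arbori evaluates queries. Only the two positions from the query result are taken from this post; the helper `n`, the surrounding tree and the symbol names join_clause, table_reference and condition are assumptions.

```javascript
// Conceptual sketch: what the Arbori predicate
//   [node) 'ON' & [node^) on_using_condition
// selects, expressed as a plain tree walk. Not how Arbori evaluates queries.
function n(symbol, from, to, children = []) {
  const self = { symbol, from, to, children, parent: null };
  children.forEach((c) => (c.parent = self));
  return self;
}

// Hand-built fragment for "... LEFT JOIN emp e ON d.deptno = e.deptno".
// Only [19,20) 'ON' and [19,27) on_using_condition come from the query result;
// the other nodes and their names are assumed for illustration.
const tree = n("join_clause", 15, 27, [
  n("'LEFT'", 15, 16),
  n("'JOIN'", 16, 17),
  n("table_reference", 17, 19),
  n("on_using_condition", 19, 27, [n("'ON'", 19, 20), n("condition", 20, 27)]),
]);

// Depth-first traversal over all nodes of the tree.
function* walk(node) {
  yield node;
  for (const c of node.children) yield* walk(c);
}

const rows = [...walk(tree)]
  .filter((node) => node.symbol === "'ON'")                                       // [node) 'ON'
  .filter((node) => node.parent && node.parent.symbol === "on_using_condition")   // [node^) on_using_condition
  .map((node) => ({ node, parent: node.parent }));

const show = (x) => `[${x.from},${x.to}) ${x.symbol}`;
console.log(rows.map((r) => `${show(r.node)} | ${show(r.parent)}`).join("\n"));
// [19,20) 'ON' | [19,27) on_using_condition
```

Each element of `rows` plays the role of a tuple: one result row with an attribute per node referenced in the query.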
Change in Arbori Program
Now open the Preferences for Custom Format and search for the query named rightAlignments (it’s usually easier to change the Arbori program in a separate editor). It looks like this:
Here is some explanation of the query:
- The predicate :alignRight means that the option Right-Align Query Keywords must be checked (true).
- We know the boolean AND &, the current node [node) and the parent node [node^) from the previous query.
- The parentheses ( and ) are part of the boolean expression.
- The | is a boolean OR.
- The -> at the end means the callback function named as the query (rightAlignments) is called for matching nodes.
- -- is used for single-line comments as in SQL and PL/SQL.
We extend the query by the predicate | [node) 'ON' & [node^) on_using_condition to right-align the ON token.
Here’s the amended query:
Press OK to save the preferences. Now, the query is formatted correctly.
Example Using JavaScript Callback Function
Default Formatter Result
We use the same setup as for the previous example.
    SELECT *
      FROM dept d
     WHERE EXISTS (
       SELECT *
         FROM emp e
        WHERE e.deptno = d.deptno
          AND e.sal > 2900
    )
     ORDER BY d.deptno;
The result does not look too bad. But the indentation feels wrong, especially when I look at the missing indentation of the ) on line 8. Therefore, I’d like to increase the indentation of the highlighted lines (lines 4 to 8) by 7.
Expected Formatter Result
What we expect is this:
    SELECT *
      FROM dept d
     WHERE EXISTS (
              SELECT *
                FROM emp e
               WHERE e.deptno = d.deptno
                 AND e.sal > 2900
           )
     ORDER BY d.deptno;
Look at the indentation on line 8. The ) now matches the indentation of EXISTS (.
Change in Arbori Program
The highlighted code block is already indented. Therefore we cannot use the same mechanism as previously. We want an additional indentation. We can achieve that with an additional query and a JavaScript callback function.
Add the following query at the end of the existing Arbori program in Custom Format of the Preferences:
    indentExistsSubqueries:
        :breakOnSubqueries & (
            [node) subquery & [node-1) '(' & [node+1) ')' & [node^) exists_condition   -- the subquery
          | [node-1) subquery & [node-2) '(' & [node) ')' & [node^) exists_condition   -- close parenthesis
        )
    -> {
        var parentNode = tuple.get("node");
        var descendants = parentNode.descendants();
        var prevPos = 0
        var indentSpaces = struct.options.get("identSpaces")  // read preferences for "Indent spaces"
        var alignRight = struct.options.get("alignRight")     // read preferences for "Right-align query keywords"
        var baseIndent
        if (alignRight) {
            baseIndent = "SELECT ".length;  // align to SELECT keyword
        } else {
            baseIndent = "WHERE ".length;   // align to WHERE keyword
        }
        // result of addIndent varies based on number of "Indent spaces"
        var addIndent = ""
        for (j = indentSpaces - baseIndent; j < indentSpaces; j++) {
            addIndent = addIndent + " ";
        }
        // addIndent to all nodes with existing indentation
        for (i = 0, len = descendants.length; i < len; i++) {
            var node = descendants.get(i);
            var pos = node.from;
            var nodeIndent = struct.getNewline(pos);
            if (nodeIndent != null && pos > prevPos) {
                struct.putNewline(pos, nodeIndent + addIndent);
                prevPos = pos
            }
        }
    }
    ;
Here are some explanations:
- On lines 3 and 4 the predicates are defined, for the subquery and the closing parenthesis ) of an exists_condition.
- The JavaScript callback starts on line 6 and ends on line 33.
- The current indentation of a node (position) is read on line 27 and updated on line 29.
Save the preferences to enable the new formatting rules. This is a reduced example. See the PL/SQL & SQL Formatter Settings repository on GitHub for a more complete Arbori program.
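To try the callback logic outside of SQL Developer, you can run it against stubs. The following sketch does that in plain JavaScript. The `struct` object, the `newlines` map and the list of positions are stand-ins I made up for illustration; in particular I assume that getNewline returns the whitespace stored for a position as a string starting with a line break, or null if there is none. The `extraIndent` function is a hypothetical helper that wraps the loop from the Arbori program.

```javascript
// Conceptual sketch: the callback logic run against stubs. 'struct' and the
// positions below are stand-ins, not SQL Developer's objects.
const newlines = new Map([[8, "\n   "], [10, "\n     "], [20, "\n"]]);
const struct = {
  options: new Map([["identSpaces", 3], ["alignRight", true]]),
  getNewline: (pos) => (newlines.has(pos) ? newlines.get(pos) : null),
  putNewline: (pos, ws) => newlines.set(pos, ws),
};

function extraIndent(indentSpaces, alignRight) {
  const baseIndent = (alignRight ? "SELECT " : "WHERE ").length;
  let addIndent = "";
  // Same loop bounds as in the Arbori program: the body runs baseIndent times.
  for (let j = indentSpaces - baseIndent; j < indentSpaces; j++) addIndent += " ";
  return addIndent;
}

const addIndent = extraIndent(struct.options.get("identSpaces"), struct.options.get("alignRight"));

// Start positions of descendant nodes. A node and its first child can share
// the same start position, hence the duplicate 8.
let prevPos = 0;
for (const pos of [8, 8, 9, 10, 20]) {
  const nodeIndent = struct.getNewline(pos);
  if (nodeIndent != null && pos > prevPos) {   // skip positions without a line break or already handled
    struct.putNewline(pos, nodeIndent + addIndent);
    prevPos = pos;
  }
}

console.log(addIndent.length);        // 7
console.log(newlines.get(8).length);  // 11 (line break + 3 + 7 spaces)
```

In this model the check `pos > prevPos` keeps a position from being indented twice when several nodes start at the same place, and the loop bounds always yield as many spaces as `baseIndent`, whatever the value of "Indent spaces".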
Summary
Arbori is the flux capacitor of SQL Developer’s Formatter. Arbori is what makes highly customized code formatting possible.
The Arbori Editor and Code Outline are very useful tools for developing code snippets for an Arbori program. However, it is not easy to get started with Arbori. The information in Vadim Tropashko’s blog is extensive, but it is a challenging and time-consuming read. For me, it was definitely worth it. I hope this blog post helps others to understand Arbori and its potential a bit better.
Any feedback is welcome, whether it concerns this blog post or the PL/SQL & SQL Formatter Settings on GitHub. Thank you.
The post Formatting Code With SQL Developer appeared first on Philipp Salvisberg's Blog.