Formatting Code With SQL Developer

Introduction

I started using SQL Developer in 2013. Back then version 4.0 was the latest and greatest. But the capabilities of the formatter were disappointing. In 2017 Oracle released version 4.2 with a new formatter and has been improving it ever since. Version 19.2 brought us dynamic JavaScript actions within the parse-tree query language Arbori. And now I must admit that I’m really impressed with the formatting capabilities of the latest versions of SQL Developer. Arbori is a hidden gem.

In this blog post I explain how the formatter works and how the output can be tweaked using two simple SQL queries.

If you only want to activate the coding styles suggested by the Trivadis PL/SQL & SQL Coding Guidelines, install the settings as described here.

How Does Formatting Work?

Formatting is all about adding (or removing) whitespaces (line breaks, spaces or tabs) between significant tokens. That sounds easy. Well, it’s not, because formatting requirements differ widely. Ultimately, it’s all about beautifying the code, and almost every developer has their own views on what makes code look good. Furthermore, it is technically demanding to provide a tool suite that can handle different coding styles via configuration.

The following figure illustrates the formatting process in SQL Developer.

I will explain each step and component in the next chapters.

Please note that these are conceptual components; the actual implementation might look different.

1. Parser

The parser reads the unformatted plain SQL or PL/SQL input and generates a parse-tree. The parse-tree is a hierarchical representation of the significant tokens of the input. In other words, there are neither whitespaces nor comments in a parse-tree.

Each node in the parse-tree includes the start and end position within the plain SQL input.
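
To make the idea of positions concrete, here is a minimal JavaScript sketch. It is not SQL Developer's parser: the tokenize function, its regular expression and the begin/end character offsets are my own assumptions for illustration. It keeps only significant tokens and numbers them, so that every leaf covers a half-open interval [from,to) of token positions.

```javascript
// Hypothetical tokenizer: NOT SQL Developer's lexer, just enough for simple queries.
function tokenize(sql) {
  var pattern = /--[^\n]*|\/\*[\s\S]*?\*\/|\s+|[A-Za-z_][A-Za-z0-9_$#]*|\d+(?:\.\d+)?|'(?:[^']|'')*'|[^\s]/g;
  var tokens = [];
  var match;
  while ((match = pattern.exec(sql)) !== null) {
    var text = match[0];
    var isWhitespace = /^\s/.test(text);
    var isComment = text.indexOf("--") === 0 || text.indexOf("/*") === 0;
    if (isWhitespace || isComment) {
      continue; // not significant, so not part of the parse-tree
    }
    tokens.push({
      from: tokens.length,             // token position, start of the interval
      to: tokens.length + 1,           // token position, end of the interval (exclusive)
      begin: match.index,              // character offset in the input (assumed field)
      end: match.index + text.length,  // character offset after the token (assumed field)
      text: text
    });
  }
  return tokens;
}

var sql = "SELECT e.ename, e.deptno, d.dname -- columns\n" +
          "FROM dept d LEFT JOIN emp e ON d.deptno = e.deptno\n" +
          "ORDER BY e.ename NULLS FIRST;";
var tokens = tokenize(sql);
var on = tokens.filter(function (t) { return t.text === "ON"; })[0];
console.log("[" + on.from + "," + on.to + ") '" + on.text + "'"); // prints [19,20) 'ON'
```

The comment and all whitespace are gone from the token list, and the ON keyword ends up at token position 19. That is the same interval SQL Developer reports for this query later in this post.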

2. Formatter

The formatter needs the parse-tree and the code formatting configuration as input.

SQL Developer stores the configuration in the preferences.

Code Editor -> Format -> Advanced Format holds configuration properties such as Line breaks on comma (after, before, none).
Code Editor -> Format -> Advanced Format -> Custom Format holds the Arbori program used to handle whitespaces.

2.1 Provided Java Callback Functions

The formatter provides the following Java callback functions (in the order in which they are expected to be called):

indentedNodes1
indentedNodes2
skipWhiteSpaceBeforeNode
skipWhiteSpaceAfterNode
identifiers
extraBrkBefore
extraBrkAfter
brkX2
rightAlignments
paddedIdsInScope
incrementalAlignments
pairwiseAlignments
ignoreLineBreaksBeforeNode
ignoreLineBreaksAfterNode
dontFormatNode

Each callback function gets the parameters target (the parse-tree) and tuple (the node to be processed). As an Arbori developer you do not have to care about how to populate these parameters. It’s done automatically. target is a global variable and tuple is the result row of an Arbori query. Basically, you only need to query the nodes and call the callback functions. The position in an Arbori program defines the execution order.
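
The calling convention can be pictured with a small JavaScript sketch. Everything in it is hypothetical: runProgram, the shape of target and the two predicates are made up for illustration. Only the callback names and the tuple.get("node") access are taken from this post.

```javascript
// Hypothetical runner, only to illustrate the calling convention described above.
function runProgram(program, callbacks, target) {
  program.forEach(function (query) {             // program order = execution order
    target.nodes.filter(query.predicate).forEach(function (n) {
      var tuple = { get: function (name) { return name === "node" ? n : null; } };
      callbacks[query.name](target, tuple);      // the callback is looked up by query name
    });
  });
}

var log = [];
var target = { nodes: [{ symbol: "'ON'" }, { symbol: "'FROM'" }] }; // assumed shape
var program = [
  { name: "extraBrkBefore",  predicate: function (n) { return n.symbol === "'FROM'"; } },
  { name: "rightAlignments", predicate: function (n) { return n.symbol === "'ON'" || n.symbol === "'FROM'"; } }
];
var callbacks = {
  extraBrkBefore:  function (target, tuple) { log.push("extraBrkBefore "  + tuple.get("node").symbol); },
  rightAlignments: function (target, tuple) { log.push("rightAlignments " + tuple.get("node").symbol); }
};
runProgram(program, callbacks, target);
console.log(log.join("\n"));
```

The log shows that all matches of the first query are processed before any match of the second query, which is the reason why the order of the queries in an Arbori program matters.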

These provided Java callback functions have two issues.

First of all, you don’t know what they do. Granted, there are some comments in the provided Arbori program, and also a description in the SQL Developer User’s Guide, but this will only give you a rough idea. For example, it leaves you in the dark as to why indentedNodes has two callback functions and why both must be called.

Second, you cannot process selected nodes differently. You must file an enhancement request so that the SQL Developer team can provide the necessary callback functionality in a future release. This is cumbersome.

2.2 JavaScript Callback Functions

Thankfully, the SQL Developer team added a JavaScript callback feature in version 19.2. This allows you to embed callback functions directly into your Arbori program. Now you can really add and remove whitespaces wherever you want. The global variable struct gives you access to the instance of the formatter and the configuration properties. As a result, you can manage the whitespaces before the position of a node through the methods getNewline and putNewline.

2.3 The Result

Basically, the result of this process is a list of whitespaces per position.
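
A simple way to think about this result is a map from position to whitespace. The following JavaScript sketch models it with a hypothetical WhitespaceStore. Only the method names getNewline and putNewline come from SQL Developer; how they really behave internally is an assumption on my part, including the null return value for positions without whitespace.

```javascript
// Hypothetical stand-in for the formatter instance exposed as "struct".
function WhitespaceStore() {
  this.byPosition = {}; // position of a node -> whitespace to emit before it
}
WhitespaceStore.prototype.getNewline = function (pos) {
  return Object.prototype.hasOwnProperty.call(this.byPosition, pos) ? this.byPosition[pos] : null;
};
WhitespaceStore.prototype.putNewline = function (pos, whitespace) {
  this.byPosition[pos] = whitespace;
};

var store = new WhitespaceStore();
store.putNewline(12, "\n  ");                         // line break plus two spaces before FROM
store.putNewline(19, "\n");                           // line break before ON, no indentation yet
store.putNewline(19, store.getNewline(19) + "    ");  // a later rule adds four spaces
console.log(JSON.stringify(store.getNewline(19)));    // whitespace before position 19
console.log(store.getNewline(13));                    // nothing registered for position 13
```

Because a later putNewline simply overwrites the entry for a position, a rule further down in the Arbori program can refine what an earlier rule has produced.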

3. Serializer

The serializer loops through the leaf nodes of the parse-tree. It retrieves the leading whitespaces for a node’s start position and extracts the token text from the plain SQL input using the node’s start and end position. The serializer then writes the whitespaces and the token text to the final result: the formatted SQL.

In fact, the process is a bit more complicated. For instance, it adds whitespaces to mandatory nodes.

Moreover, the serializer performs some “formatting” without Arbori. For example, it converts the case of identifiers and keywords according to the configuration (properties). Therefore, it is not possible to change the case of a token with an Arbori program. It might be possible by configuring a custom Java formatter class, but that’s another story.
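
The basic loop of the serializer can be sketched in a few lines of JavaScript. This is a simplified model under my own assumptions, not the real implementation: the leaves carry hand-written character offsets (begin, end), the single space used when no whitespace is registered is merely a stand-in for the mandatory whitespace mentioned above, and the case conversion is left out.

```javascript
// Hypothetical serializer; leaf nodes and the whitespace map are hand-written here.
function serialize(input, leaves, whitespaceByPosition) {
  var out = "";
  for (var i = 0; i < leaves.length; i++) {
    var leaf = leaves[i];
    var ws = whitespaceByPosition[leaf.from];
    if (ws === undefined) {
      ws = i === 0 ? "" : " "; // assumed fallback: one space keeps tokens apart
    }
    out += ws + input.substring(leaf.begin, leaf.end);
  }
  return out;
}

var input = "SELECT   *   FROM dual";
var leaves = [
  { from: 0, begin: 0,  end: 6 },   // SELECT
  { from: 1, begin: 9,  end: 10 },  // *
  { from: 2, begin: 13, end: 17 },  // FROM
  { from: 3, begin: 18, end: 22 }   // dual
];
var formatted = serialize(input, leaves, { 2: "\n  " }); // line break and two spaces before FROM
console.log(formatted);
```

The original whitespaces of the input play no role in the output. Only the whitespaces registered per position and the token texts are written.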

Example Using Provided Java Callback Function

Setup

For this example I use the Advanced Format according to the trivadis_advanced_format.xml file. Here’s a screenshot of the configuration settings of my SQL Developer 19.4.0 installation:

The default is used for the Custom Format.

Default Formatter Result

SELECT e.ename,
       e.deptno,
       d.dname
  FROM dept d
  LEFT JOIN emp e
ON d.deptno = e.deptno
 ORDER BY e.ename NULLS FIRST;

The result looks good, apart from the missing indentation on line 6.

Expected Formatter Result

What we expect is this:

SELECT e.ename,
       e.deptno,
       d.dname
  FROM dept d
  LEFT JOIN emp e
    ON d.deptno = e.deptno
 ORDER BY e.ename NULLS FIRST;

The ON keyword is right-aligned like SELECT, FROM, LEFT and ORDER.

Code Outline

SQL Developer’s code outline is in fact a representation of the full parse-tree. Disable all filters to show all nodes.

The highlighted information is important for the next step.

Arbori Editor

Type arbori in the search field and press enter as shown below:

This will open the Arbori Editor. Type the following query in the editor window:

query:
   [node) 'ON' & [node^) on_using_condition
;

Press Run to display the query result:

What have we done? We query the parse-tree (outline) for all ON nodes where the parent node is an on_using_condition. A node is represented as [node). And a parent node is represented as [node^). A boolean AND is represented as &. See these links for more information about the Arbori grammar.

Click on the query result cell [19,20) 'ON' to highlight the node in the Code Outline window and the corresponding text in the worksheet. You can do the same with the cell [19,27) on_using_condition.
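
If you are more familiar with JavaScript than with Arbori, the following sketch may help to read the query. It evaluates the same condition on a tiny, hand-built tree. The node helper, the tree itself and the symbols root_placeholder and condition_placeholder are made up for illustration. Only 'ON', on_using_condition and the intervals [19,20) and [19,27) are taken from the query result above.

```javascript
// Hypothetical tree node; SQL Developer's own node class looks different.
function node(from, to, symbol, children) {
  var n = { from: from, to: to, symbol: symbol, parent: null, children: children || [] };
  n.children.forEach(function (c) { c.parent = n; });
  return n;
}

var root = node(0, 27, "root_placeholder", [
  node(3, 4, "'ON'"),                       // decoy: an ON whose parent is something else
  node(19, 27, "on_using_condition", [
    node(19, 20, "'ON'"),
    node(20, 27, "condition_placeholder")
  ])
]);

// Collect a node and all nodes below it.
function collect(n, result) {
  result.push(n);
  n.children.forEach(function (c) { collect(c, result); });
  return result;
}

// JavaScript rendering of: [node) 'ON' & [node^) on_using_condition
var matches = collect(root, []).filter(function (n) {
  return n.symbol === "'ON'"                          // [node) 'ON'
      && n.parent !== null
      && n.parent.symbol === "on_using_condition";    // [node^) on_using_condition
});
matches.forEach(function (n) { console.log("[" + n.from + "," + n.to + ") " + n.symbol); });
```

Only the ON node below on_using_condition is reported. The decoy is filtered out by the condition on the parent node.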

Change in Arbori Program

Now open the Preferences for Custom Format and search for the query named rightAlignments (it’s usually easier to change the Arbori program in a separate editor). It looks like this:

Here is some explanation of the query:

The predicate :alignRight means that the option Right-Align Query Keywords must be checked (true).
We know the boolean AND &, the current node [node) and the parent node [node^) from the previous query.
The parentheses ( and ) are part of the boolean expression.
The | is a boolean OR.
The -> at the end means the callback function named as the query (rightAlignments) is called for matching nodes.
-- is used for single-line comments as in SQL and PL/SQL.

We extend the query by the predicate | [node) 'ON' & [node^) on_using_condition to right-align the ON token.

Here’s the amended query:

Press OK to save the preferences. Now, the query is formatted correctly.
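
What right-alignment means for the keywords of this query can be shown with a few lines of JavaScript. This sketch only reproduces the visible effect. The rightAlign function is hypothetical and says nothing about how the rightAlignments callback is implemented in SQL Developer.

```javascript
// Pad a keyword on the left so that it ends in the same column as the widest keyword.
function rightAlign(keyword, width) {
  var padding = "";
  for (var i = keyword.length; i < width; i++) {
    padding += " ";
  }
  return padding + keyword;
}

var width = "SELECT".length; // the widest keyword at the start of a line in this query
["SELECT", "FROM", "LEFT", "ON", "ORDER"].forEach(function (keyword) {
  console.log(rightAlign(keyword, width) + " ...");
});
```

ON gets four leading spaces, FROM and LEFT two, and ORDER one. These are the indentations of the expected formatter result.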

Example Using JavaScript Callback Function

Default Formatter Result

We use the same setup as for the previous example.

SELECT *
  FROM dept d
 WHERE EXISTS (
   SELECT *
     FROM emp e
    WHERE e.deptno = d.deptno
      AND e.sal > 2900
)
 ORDER BY d.deptno;

The result does not look too bad, but the indentation feels wrong, especially the missing indentation of the ) on line 8. Therefore, I’d like to increase the indentation of the highlighted lines (4 to 8) by 7 spaces.

Expected Formatter Result

What we expect is this:

SELECT *
  FROM dept d
 WHERE EXISTS (
          SELECT *
            FROM emp e
           WHERE e.deptno = d.deptno
             AND e.sal > 2900
       )
 ORDER BY d.deptno;

Look at the indentation on line 8. The ) now matches the indentation of EXISTS (.

Change in Arbori Program

The highlighted code block is already indented. Therefore we cannot use the same mechanism as previously. We want an additional indentation. We can achieve that with an additional query and a JavaScript callback function.

Add the following query at the end of the existing Arbori program in Custom Format of the Preferences:

indentExistsSubqueries:
  :breakOnSubqueries & (
      [node)   subquery & [node-1) '(' & [node+1) ')' & [node^)  exists_condition -- the subquery
    | [node-1) subquery & [node-2) '(' & [node)   ')' & [node^)  exists_condition -- close parenthesis
  )
  -> {
    var parentNode = tuple.get("node");
    var descendants = parentNode.descendants();
    var prevPos = 0;
    var indentSpaces = struct.options.get("identSpaces");  // read preferences for "Indent spaces"
    var alignRight = struct.options.get("alignRight");     // read preferences for "Right-align query keywords"
    var baseIndent;
    if (alignRight) {
      baseIndent = "SELECT ".length;  // align to SELECT keyword
    } else {
      baseIndent = "WHERE ".length;   // align to WHERE keyword
    }
    // the loop runs baseIndent times, so addIndent consists of baseIndent spaces
    var addIndent = "";
    for (var j = indentSpaces - baseIndent; j < indentSpaces; j++) {
      addIndent = addIndent + " ";
    }
    // addIndent to all nodes with existing indentation
    for (var i = 0, len = descendants.length; i < len; i++) {
      var node = descendants.get(i);
      var pos = node.from;
      var nodeIndent = struct.getNewline(pos);
      if (nodeIndent != null && pos > prevPos) {
        struct.putNewline(pos, nodeIndent + addIndent);
        prevPos = pos;
      }
    }
  }
;

Here is some explanation:

Lines 3 and 4 define the predicates for the subquery and the closing parenthesis ) of an exists_condition.
The JavaScript callback starts on line 6 and ends on line 33.
The current indentation of a node (position) is read on line 27 and updated on line 29.
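
To see what the loop in the callback does, you can run it outside of SQL Developer against a mock. The following JavaScript sketch makes several assumptions: the struct mock, the list of positions (token positions of the subquery's descendants, top-down) and the value 3 for Indent spaces are all mine. The whitespaces per position correspond to the default formatter result shown above.

```javascript
// Mock of the two struct methods used by the callback (behaviour assumed, not taken from SQL Developer).
var whitespace = { 8: "\n   ", 10: "\n     ", 13: "\n    ", 21: "\n      " };
var struct = {
  getNewline: function (pos) {
    return Object.prototype.hasOwnProperty.call(whitespace, pos) ? whitespace[pos] : null;
  },
  putNewline: function (pos, ws) { whitespace[pos] = ws; }
};

// Start positions of the subquery's descendants in an assumed top-down order.
// Inner nodes share the start position of their first leaf, hence the duplicates.
var positions = [8, 8, 8, 9, 10, 10, 11, 12, 13, 13, 14, 21, 22];

var indentSpaces = 3;               // assumed value of "Indent spaces"
var baseIndent = "SELECT ".length;  // 7, the branch for right-aligned keywords
var addIndent = "";
for (var j = indentSpaces - baseIndent; j < indentSpaces; j++) {
  addIndent = addIndent + " ";
}

var prevPos = 0;
for (var i = 0; i < positions.length; i++) {
  var pos = positions[i];
  var nodeIndent = struct.getNewline(pos);
  if (nodeIndent != null && pos > prevPos) {
    struct.putNewline(pos, nodeIndent + addIndent);
    prevPos = pos; // a later node at the same position is skipped
  }
}

[8, 10, 13, 21].forEach(function (p) {
  console.log(p + ": " + (whitespace[p].length - 1) + " spaces");
});
```

The indentation grows from 3, 5, 4 and 6 spaces to 10, 12, 11 and 13 spaces, which matches the expected formatter result. Without the check pos > prevPos the indentation would be added once for every node starting at the same position.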

Save the preferences to enable the new formatting rules. This is a reduced example. See the PL/SQL & SQL Formatter Settings repository on GitHub for a more complete Arbori program.

Summary

Arbori is the flux capacitor of SQL Developer’s Formatter. Arbori is what makes highly customized code formatting possible.

The Arbori Editor and Code Outline are very useful tools for developing code snippets for an Arbori program. However, it is not easy to get started with Arbori. The information in Vadim Tropashko’s blog is extensive, but it is a challenging and time-consuming read. For me, it was definitely worth it. I hope this blog post helps others to understand Arbori and its potential a bit better.

Any feedback is welcome, whether on this blog post or on the PL/SQL & SQL Formatter Settings on GitHub. Thank you.

The post Formatting Code With SQL Developer appeared first on Philipp Salvisberg's Blog.
