The Formatted Query
Applying the formatter to the input above produces:
SELECT
u.id,
u.name,
u.email,
o.total,
o.created_at
FROM users u
INNER JOIN orders o
ON u.id = o.user_id
WHERE
o.created_at >= '2024-01-01'
AND o.status = 'completed'
GROUP BY
u.id,
u.name,
u.email,
o.total,
o.created_at
HAVING COUNT(o.id) > 5
ORDER BY o.total DESC
LIMIT 100;
Each clause starts on its own line. Column lists are one item per line. The ON condition aligns under the JOIN keyword. This layout makes every clause scannable at a glance.
Why SQL Formatting Matters
A single-line query works fine when you write it. It becomes a maintenance problem the moment someone else needs to read it, or when you return to debug it three months later.
Formatted SQL makes several things faster:
- Code review. A reviewer can check JOIN conditions, WHERE predicates, and GROUP BY columns independently when each is on its own line. A single-line query requires parsing the entire string to verify any part of it.
- Debugging. When a query returns wrong results, you usually isolate the problem by commenting out clauses one at a time. Properly indented SQL makes that process straightforward.
- Spotting logic errors. Misaligned GROUP BY columns and SELECT columns are a common source of aggregation bugs. When both lists are formatted vertically, mismatches are visible immediately.
- Pull request diffs. A change to one WHERE condition should show as a single changed line in a diff. In a single-line query, any change touches the entire line, making the diff useless for review.
SQL Formatting Conventions
The conventions that appear across most SQL style guides:
Keywords in uppercase: SELECT, FROM, WHERE, JOIN, ON, GROUP BY, HAVING, ORDER BY, LIMIT. This separates the SQL structure from identifiers.
One column per line in SELECT lists. This holds even for short queries if the query is checked into version control, because future additions to the SELECT list do not require reformatting.
JOIN conditions on a new line, indented under the JOIN keyword. The ON clause belongs visually to the JOIN, not to the WHERE clause.
WHERE conditions each on their own line, with AND or OR at the start of the continuation line rather than the end. Starting continuation lines with AND makes it easy to comment out one condition without editing adjacent lines.
CTE Formatting
CTEs replace subqueries for queries that reference the same derived dataset more than once, or where naming intermediate results improves readability. The standard formatting pattern:
WITH
active_users AS (
SELECT id, name, email
FROM users
WHERE status = 'active'
),
recent_orders AS (
SELECT user_id, SUM(total) AS total_spend
FROM orders
WHERE created_at >= '2024-01-01'
GROUP BY user_id
)
SELECT
u.name,
u.email,
r.total_spend
FROM active_users u
INNER JOIN recent_orders r
ON u.id = r.user_id
ORDER BY r.total_spend DESC;
Each CTE is indented one level inside the WITH block. The final SELECT reads as a normal query against named datasets rather than a wall of nested subqueries.
Window Functions
Window functions require their own formatting treatment because OVER clauses can be long:
SELECT
user_id,
order_date,
total,
SUM(total) OVER (
PARTITION BY user_id
ORDER BY order_date
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
) AS running_total,
RANK() OVER (
PARTITION BY user_id
ORDER BY total DESC
) AS order_rank
FROM orders;
The OVER clause opens on the same line as the function. PARTITION BY, ORDER BY, and frame clauses go on separate lines, indented inside the parentheses.
Command Line and IDE Alternatives
For automated formatting in CI pipelines, SQLFluff is the most capable option. It lints and fixes SQL files, supports multiple dialects, and integrates with pre-commit hooks:
pip install sqlfluff
sqlfluff fix --dialect postgres query.sql
For PostgreSQL specifically, pg_format formats SQL with PostgreSQL syntax awareness:
brew install pgformatter
pg_format query.sql
For IDE integration, the SQL Tools extension in VS Code and DataGrip’s built-in formatter both handle standard SQL well. The practical tradeoff: IDE formatters are convenient for ad hoc queries, but a CI linter is the only way to enforce consistent formatting across a team. Use both: format locally as you write, enforce in CI to catch anything that slips through.