Subquery & CTE: When Your Queries Need a Sidekick!

Because even SQL queries need a little teamwork!

Introduction: Why Use Subqueries & CTEs?

Imagine trying to cook a multi-layered cake.
You need to:

  1. Bake the layers first (gather data in steps).
  2. Assemble the cake (combine results).
  3. Decorate it beautifully (optimize readability).

Now, if you try to do everything at once, it’s gonna be a big, messy disaster. 

That’s why we have Subqueries and Common Table Expressions (CTEs) in PostgreSQL!

 Subquery: A query inside another query (a mini chef working inside a bigger kitchen).
 CTE (Common Table Expression): A temporary, reusable query (like prepping ingredients separately).

Both help simplify complex queries, make them readable, and boost performance.

What is a Subquery?

A subquery is a SQL query nested inside another SQL query.
It can be used in:
  SELECT (to calculate values dynamically)
  FROM (to create a temporary table)
  WHERE (to filter results dynamically)

Example Scenario: Finding Customers Who Placed Orders

You have two tables:

  customers (customer details)
  orders (order details)

SELECT name FROM customers
WHERE id IN (SELECT customer_id FROM orders);

How It Works:

  The subquery inside () runs first → gets customer_id from orders.
  The outer query then finds customer names that match those IDs.

Analogy:
A subquery is like checking "Which friends attended my last party?" before inviting them again. 

Types of Subqueries in PostgreSQL

Subquery Type What It Does
Scalar Subquery Returns a single value (used in SELECT)
Multi-Row Subquery Returns multiple values (used in IN, ANY, ALL)
Correlated Subquery Uses values from the outer query (runs once per row)
EXISTS Subquery Checks if rows exist (for filtering data)

Scalar Subquery: The One-Value Wonder

A Scalar Subquery returns only one value.
Let’s find the latest order date for a specific customer:

SELECT name, 
       (SELECT MAX(order_date) FROM orders WHERE customer_id = customers.id) AS last_order
FROM customers;

Tip:
Scalar subqueries should only return one value, or PostgreSQL will throw an error!

Multi-Row Subquery: Handling More Than One Result

A Multi-Row Subquery returns multiple values.
Example: Finding customers who placed at least one order:

SELECT name FROM customers
WHERE id IN (SELECT customer_id FROM orders);

Analogy:
Like asking: “Which students are on the honor roll?” → You get a list, not just one name.

Correlated Subquery: The Query That Thinks

A Correlated Subquery runs once per row in the outer query.
Example: Finding customers with orders above their average spend:

SELECT name, order_amount
FROM orders o1
WHERE order_amount > 
      (SELECT AVG(order_amount) FROM orders o2 WHERE o1.customer_id = o2.customer_id);

 Downside:
Correlated subqueries run for each row, making them slow on large datasets.

EXISTS Subquery: The Boolean Bouncer

The EXISTS subquery checks if at least one row exists in the subquery.
Example: Finding customers who placed orders:

SELECT name FROM customers c
WHERE EXISTS (SELECT 1 FROM orders o WHERE c.id = o.customer_id);

 Faster than IN for large datasets

What is a Common Table Expression (CTE)?

A CTE (Common Table Expression) is like a temporary, named subquery that makes queries easier to read and reuse.

CTE Syntax:

WITH cte_name AS (
    -- Your subquery here
)
SELECT * FROM cte_name;

Example: Using CTE to Get Customers & Total Orders

WITH customer_orders AS (
    SELECT customer_id, COUNT(*) AS total_orders
    FROM orders
    GROUP BY customer_id
)
SELECT c.name, co.total_orders
FROM customers c
JOIN customer_orders co ON c.id = co.customer_id;

Why Use CTEs?
Improves readability
Reusable multiple times
Faster than correlated subqueries

Analogy:
A CTE is like preparing all your ingredients before cooking. Way easier than juggling everything at once!

When to Use Subquery vs CTE? 

Feature Subquery CTE
Readability  Hard to read  Easy to read
Reusability  One-time use  Can be used multiple times
Performance  Can be slow  Often optimized by PostgreSQL
Query Complexity  Good for simple queries  Best for complex queries

General Rule:
Use Subqueries for quick, one-time operations.
Use CTEs when you need clean, reusable logic.

Recursive CTE: When You Need to Loop!

PostgreSQL CTEs can be recursive—useful for hierarchical data like company org charts!

Example: Finding Employee Hierarchies 

WITH RECURSIVE employee_hierarchy AS (
    SELECT id, name, manager_id FROM employees WHERE manager_id IS NULL
    UNION ALL
    SELECT e.id, e.name, e.manager_id
    FROM employees e
    JOIN employee_hierarchy eh ON e.manager_id = eh.id
)
SELECT * FROM employee_hierarchy;

 Finds all employees reporting to a given manager
 Great for handling tree-like structures

Conclusion: Subquery vs CTE—Which One Should You Use?

 Subquery is great for:

  •   Simple calculations (SELECT, WHERE)
  •   Filtering (IN, EXISTS)
  •   When you need a one-time calculation

 CTE is better when:

  •   Query is complex and needs to be readable
  •   You want to reuse the results multiple times
  •   You need recursive queries


Post a Comment

0 Comments