Because even SQL queries need a little teamwork!
Introduction: Why Use Subqueries & CTEs?
Imagine trying to cook a multi-layered cake.
You need to:
- Bake the layers first (gather data in steps).
- Assemble the cake (combine results).
- Decorate it beautifully (optimize readability).
Now, if you try to do everything at once, it’s gonna be a big, messy disaster.
That’s why we have Subqueries and Common Table Expressions (CTEs) in PostgreSQL!
Subquery: A query inside another query (a mini chef working inside a bigger kitchen).
CTE (Common Table Expression): A temporary, reusable query (like prepping ingredients separately).
Both help simplify complex queries, make them readable, and boost performance.
What is a Subquery?
A subquery is a SQL query nested inside another SQL query.
It can be used in:
SELECT (to calculate values dynamically)
FROM (to create a temporary table)
WHERE (to filter results dynamically)
Example Scenario: Finding Customers Who Placed Orders
You have two tables:
customers (customer details)
orders (order details)
SELECT name FROM customers
WHERE id IN (SELECT customer_id FROM orders);
How It Works:
The subquery inside ()
runs first → gets customer_id
from orders.
The outer query then finds customer names that match those IDs.
Analogy:
A subquery is like checking "Which friends attended my last party?" before inviting them again.
Types of Subqueries in PostgreSQL
Subquery Type | What It Does |
---|---|
Scalar Subquery | Returns a single value (used in SELECT ) |
Multi-Row Subquery | Returns multiple values (used in IN , ANY , ALL ) |
Correlated Subquery | Uses values from the outer query (runs once per row) |
EXISTS Subquery | Checks if rows exist (for filtering data) |
Scalar Subquery: The One-Value Wonder
A Scalar Subquery returns only one value.
Let’s find the latest order date for a specific customer:
SELECT name,
(SELECT MAX(order_date) FROM orders WHERE customer_id = customers.id) AS last_order
FROM customers;
Tip:
Scalar subqueries should only return one value, or PostgreSQL will throw an error!
Multi-Row Subquery: Handling More Than One Result
A Multi-Row Subquery returns multiple values.
Example: Finding customers who placed at least one order:
SELECT name FROM customers
WHERE id IN (SELECT customer_id FROM orders);
Analogy:
Like asking: “Which students are on the honor roll?” → You get a list, not just one name.
Correlated Subquery: The Query That Thinks
A Correlated Subquery runs once per row in the outer query.
Example: Finding customers with orders above their average spend:
SELECT name, order_amount
FROM orders o1
WHERE order_amount >
(SELECT AVG(order_amount) FROM orders o2 WHERE o1.customer_id = o2.customer_id);
Downside:
Correlated subqueries run for each row, making them slow on large datasets.
EXISTS Subquery: The Boolean Bouncer
The EXISTS
subquery checks if at least one row exists in the subquery.
Example: Finding customers who placed orders:
SELECT name FROM customers c
WHERE EXISTS (SELECT 1 FROM orders o WHERE c.id = o.customer_id);
Faster than IN
for large datasets
What is a Common Table Expression (CTE)?
A CTE (Common Table Expression) is like a temporary, named subquery that makes queries easier to read and reuse.
CTE Syntax:
WITH cte_name AS (
-- Your subquery here
)
SELECT * FROM cte_name;
Example: Using CTE to Get Customers & Total Orders
WITH customer_orders AS (
SELECT customer_id, COUNT(*) AS total_orders
FROM orders
GROUP BY customer_id
)
SELECT c.name, co.total_orders
FROM customers c
JOIN customer_orders co ON c.id = co.customer_id;
Why Use CTEs?
Improves readability
Reusable multiple times
Faster than correlated subqueries
Analogy:
A CTE is like preparing all your ingredients before cooking. Way easier than juggling everything at once!
When to Use Subquery vs CTE?
Feature | Subquery | CTE |
---|---|---|
Readability | Hard to read | Easy to read |
Reusability | One-time use | Can be used multiple times |
Performance | Can be slow | Often optimized by PostgreSQL |
Query Complexity | Good for simple queries | Best for complex queries |
General Rule:
Use Subqueries for quick, one-time operations.
Use CTEs when you need clean, reusable logic.
Recursive CTE: When You Need to Loop!
PostgreSQL CTEs can be recursive—useful for hierarchical data like company org charts!
Example: Finding Employee Hierarchies
WITH RECURSIVE employee_hierarchy AS (
SELECT id, name, manager_id FROM employees WHERE manager_id IS NULL
UNION ALL
SELECT e.id, e.name, e.manager_id
FROM employees e
JOIN employee_hierarchy eh ON e.manager_id = eh.id
)
SELECT * FROM employee_hierarchy;
Finds all employees reporting to a given manager
Great for handling tree-like structures
Conclusion: Subquery vs CTE—Which One Should You Use?
Subquery is great for:
- Simple calculations (
SELECT
,WHERE
) - Filtering (
IN
,EXISTS
) - When you need a one-time calculation
CTE is better when:
- Query is complex and needs to be readable
- You want to reuse the results multiple times
- You need recursive queries
0 Comments