Mastering the SQL UNION ALL Operator: Combining Query Results with Speed
The SQL UNION ALL operator is a fast and efficient tool for combining the results of two or more SELECT queries into a single result set, keeping all rows, including duplicates. It’s ideal for scenarios where you need to merge data from multiple sources, like appending sales records from different regions or consolidating logs, without the overhead of deduplication. As part of SQL’s data manipulation language (DML), UNION ALL is a must-know for anyone working with relational databases. In this blog, we’ll explore the UNION ALL operator in depth, covering its syntax, use cases, and practical applications with clear examples. By the end, you’ll be using UNION ALL confidently to unify your query results with optimal performance.
What Is the SQL UNION ALL Operator?
The UNION ALL operator combines the result sets of two or more SELECT statements into a single result set, preserving all rows, including duplicates. Unlike UNION, which removes duplicates and can be slower, UNION ALL is faster because it skips deduplication, making it perfect when duplicates are acceptable or guaranteed not to exist. Each SELECT statement must return the same number of columns with compatible data types, and columns are matched by position, not name.
For example, if you have separate tables for online_sales and store_sales, UNION ALL can merge them into a single list of all sales, keeping every record. Unlike joins like INNER JOIN or LEFT JOIN, which combine columns based on a condition, UNION ALL stacks rows vertically—see INNER JOIN and LEFT JOIN for comparisons. Supported across major databases like MySQL, PostgreSQL, SQL Server, and Oracle, UNION ALL is a go-to for high-performance data merging. Let’s dive into how it works.
Basic Syntax of the UNION ALL Operator
The UNION ALL operator combines multiple SELECT statements. Here’s the basic syntax:
SELECT column1, column2, ...
FROM table1
WHERE condition
UNION ALL
SELECT column1, column2, ...
FROM table2
WHERE condition;
- SELECT column1, column2, ...: The columns to retrieve from each table or query. Each SELECT must have the same number of columns.
- FROM table1, table2, ...: The tables or subqueries providing the data.
- WHERE condition: Optional filters for each SELECT.
- UNION ALL: Combines the results, keeping all rows, including duplicates.
For example, to combine sales from two tables:
SELECT sale_id, amount
FROM online_sales
UNION ALL
SELECT sale_id, amount
FROM store_sales;
This returns all sales from both tables, including any duplicates. For more on querying basics, see SELECT Statement.
How UNION ALL Works
UNION ALL executes each SELECT statement independently, then appends their results into a single set, preserving all rows without checking for duplicates. The column names in the final result come from the first SELECT statement, and data types must be compatible across corresponding columns (e.g., VARCHAR with VARCHAR, DECIMAL with DECIMAL).
Example: Combining Sales Records
Consider two tables:
online_sales:
sale_id | amount |
---|---|
1 | 100.00 |
2 | 200.00 |
3 | 150.00 |
store_sales:
sale_id | amount |
---|---|
4 | 200.00 |
5 | 300.00 |
6 | 150.00 |
To merge all sales:
SELECT sale_id, amount
FROM online_sales
UNION ALL
SELECT sale_id, amount
FROM store_sales;
Result:
sale_id | amount |
---|---|
1 | 100.00 |
2 | 200.00 |
3 | 150.00 |
4 | 200.00 |
5 | 300.00 |
6 | 150.00 |
All 6 rows are included, with duplicates like 200.00 and 150.00 preserved, unlike UNION, which would remove them—see UNION Operator.
UNION ALL vs. UNION
The key difference between UNION ALL and UNION is duplicate handling:
- UNION: Removes duplicate rows, requiring sorting or hashing, which can slow performance, especially for large datasets.
- UNION ALL: Keeps all rows, including duplicates, making it faster since it avoids deduplication.
Use UNION ALL when:
- Duplicates are acceptable (e.g., log data where duplicates are valid).
- You know the data has no duplicates (e.g., disjoint datasets).
- Performance is critical, and deduplication isn’t needed.
Example: UNION vs. UNION ALL
Using the sales tables:
SELECT amount
FROM online_sales
UNION
SELECT amount
FROM store_sales;
Result:
amount |
---|
100.00 |
150.00 |
200.00 |
300.00 |
With UNION, duplicates like 150.00 and 200.00 are removed, yielding 4 rows. UNION ALL keeps all 6 rows, as shown earlier.
UNION ALL with Multiple Columns
UNION ALL can combine queries with multiple columns, as long as the number and data types match. This is useful for merging complex datasets.
Example: Combining Customer Orders
Suppose you have online_orders and store_orders:
online_orders:
order_id | customer_name | total |
---|---|---|
101 | John | 500.00 |
102 | Jane | 300.00 |
store_orders:
order_id | customer_name | total |
---|---|---|
201 | Jane | 300.00 |
202 | Bob | 400.00 |
Query:
SELECT order_id, customer_name, total
FROM online_orders
UNION ALL
SELECT order_id, customer_name, total
FROM store_orders;
Result:
order_id | customer_name | total |
---|---|---|
101 | John | 500.00 |
102 | Jane | 300.00 |
201 | Jane | 300.00 |
202 | Bob | 400.00 |
Jane’s identical order (300.00) appears twice, as UNION ALL keeps duplicates.
UNION ALL with Different Tables and Conditions
UNION ALL can merge data from different tables or apply different conditions to the same table, as long as the column structure matches.
Example: Combining Logs from Different Sources
Suppose you have web_logs and app_logs:
web_logs:
log_id | event |
---|---|
1 | Page View |
2 | Click |
app_logs:
log_id | event |
---|---|
3 | App Open |
4 | Click |
Query:
SELECT log_id, event
FROM web_logs
UNION ALL
SELECT log_id, event
FROM app_logs;
Result:
log_id | event |
---|---|
1 | Page View |
2 | Click |
3 | App Open |
4 | Click |
The Click event appears twice, as UNION ALL preserves all rows.
Example: Different Conditions on One Table
In an orders table, to combine orders from different time periods:
SELECT order_id, total
FROM orders
WHERE order_date < '2025-01-01'
UNION ALL
SELECT order_id, total
FROM orders
WHERE order_date >= '2025-01-01';
This stacks orders from before and after 2025, keeping all records. For more on filtering, see WHERE Clause.
UNION ALL with ORDER BY and Other Clauses
UNION ALL can be paired with ORDER BY, WHERE, or LIMIT/FETCH to sort or limit the combined result set. The ORDER BY applies to the final result and is placed after the last SELECT.
Example: Sorting Combined Results
Using the orders example:
SELECT order_id, total
FROM online_orders
UNION ALL
SELECT order_id, total
FROM store_orders
ORDER BY total DESC;
Result:
order_id | total |
---|---|
101 | 500.00 |
202 | 400.00 |
102 | 300.00 |
201 | 300.00 |
ORDER BY sorts the unified list by total in descending order. See ORDER BY Clause.
Example: Limiting Results
To get the top 3 orders by value:
SELECT order_id, total
FROM online_orders
UNION ALL
SELECT order_id, total
FROM store_orders
ORDER BY total DESC
FETCH FIRST 3 ROWS ONLY;
Result:
order_id | total |
---|---|
101 | 500.00 |
202 | 400.00 |
102 | 300.00 |
For row limiting, see FETCH Clause.
UNION ALL with Subqueries and Joins
UNION ALL can combine complex queries, including subqueries or joins, as long as the column structure matches.
Example: UNION ALL with Joins
To combine orders from different regions with customer details:
SELECT
o.order_id,
c.first_name
FROM online_orders AS o
INNER JOIN customers AS c
ON o.customer_id = c.customer_id
UNION ALL
SELECT
o.order_id,
c.first_name
FROM store_orders AS o
INNER JOIN customers AS c
ON o.customer_id = c.customer_id;
This merges orders from both channels, including customer names, keeping all rows. For more, see INNER JOIN.
Example: UNION ALL with Subqueries
To combine high-value orders from two periods:
SELECT order_id, total
FROM (
SELECT order_id, total
FROM orders
WHERE order_date < '2025-01-01' AND total > 500
) AS old_orders
UNION ALL
SELECT order_id, total
FROM (
SELECT order_id, total
FROM orders
WHERE order_date >= '2025-01-01' AND total > 500
) AS new_orders;
This uses subqueries to filter high-value orders, then combines them. See Subqueries.
Practical Example: Managing a Logistics Database
Let’s apply UNION ALL to a real-world scenario. Suppose you’re managing a logistics database with truck_deliveries, rail_deliveries, truck_logs, and rail_logs tables. Here’s how you’d use UNION ALL:
- Unified Delivery List: Combine all deliveries from trucks and rails:
SELECT delivery_id, weight
FROM truck_deliveries
UNION ALL
SELECT delivery_id, weight
FROM rail_deliveries;
- All Logs: Merge activity logs from both transport modes:
SELECT log_id, event
FROM truck_logs
UNION ALL
SELECT log_id, event
FROM rail_logs;
- High-Weight Deliveries: Find deliveries with weight over 1000 from either mode:
SELECT
d.delivery_id,
d.weight
FROM truck_deliveries AS d
WHERE d.weight > 1000
UNION ALL
SELECT
d.delivery_id,
d.weight
FROM rail_deliveries AS d
WHERE d.weight > 1000;
- Top 5 Deliveries by Weight: List the top 5 heaviest deliveries:
SELECT delivery_id, weight
FROM truck_deliveries
UNION ALL
SELECT delivery_id, weight
FROM rail_deliveries
ORDER BY weight DESC
FETCH FIRST 5 ROWS ONLY;
This example shows UNION ALL’s utility for logistics data aggregation. For querying basics, see SELECT Statement.
Performance Considerations
While we’re not covering best practices, a few performance notes can help you use UNION ALL effectively:
- Speed Advantage: UNION ALL is faster than UNION because it skips deduplication, making it ideal for large datasets—see UNION Operator.
- Indexes: Indexes on columns in WHERE or ORDER BY clauses can speed up individual SELECT statements. See Creating Indexes.
- Filter Early: Apply WHERE conditions in each SELECT to reduce rows before combining. See WHERE Clause.
- Query Plans: Use EXPLAIN to analyze performance, especially with complex queries or joins. See EXPLAIN Plan.
For large datasets, optimizing each SELECT is crucial—check out SQL Best Practices for general tips. According to W3Schools, UNION ALL is preferred for performance-critical tasks.
Common Pitfalls and How to Avoid Them
UNION ALL is straightforward but can trip you up. Here are some common issues:
- Mismatched Columns: Each SELECT must have the same number of columns with compatible data types. Verify column counts and types (e.g., DECIMAL vs. INT) before running.
- Column Name Confusion: The final column names come from the first SELECT. Use aliases to clarify output—see Aliases with AS.
- Unexpected Duplicates: If duplicates are unwanted, consider UNION instead, but weigh the performance cost. Check data for overlap before choosing UNION ALL.
- ORDER BY Placement: ORDER BY applies to the final result and must appear after the last SELECT. Placing it in individual SELECTs can cause errors.
- Large Result Sets: Combining large datasets can produce massive results. Test with smaller subsets and use FETCH or LIMIT to manage output—see FETCH Clause.
Testing each SELECT independently can help verify data and ensure compatibility.
Wrapping Up
The SQL UNION ALL operator is a fast and versatile tool for combining query results, preserving all rows for maximum performance. By mastering its syntax, using it with ORDER BY, joins, or subqueries, and applying it in scenarios like our logistics database, you’ll streamline data aggregation. Just watch out for pitfalls like mismatched columns or large result sets, and you’ll be using UNION ALL like a pro.
For more SQL fundamentals, explore related topics like UNION Operator or SELECT Statement. Ready for advanced techniques? Check out INTERSECT Operator or Subqueries for more ways to manipulate data.