Mastering the SQL UNION Operator: Combining Query Results Seamlessly
The SQL UNION operator is a powerful tool for combining the results of two or more SELECT queries into a single result set, eliminating duplicates by default. It’s perfect for scenarios where you need to merge data from different tables or queries, like combining customer lists from multiple sources or aggregating sales data across regions. As part of SQL’s data manipulation language (DML), UNION is a must-know for anyone working with relational databases. In this blog, we’ll explore the UNION operator in depth, covering its syntax, use cases, and practical applications with clear examples. By the end, you’ll be using UNION confidently to unify your query results.
What Is the SQL UNION Operator?
The UNION operator combines the result sets of two or more SELECT statements, removing duplicate rows to produce a single, unique set of results. Each SELECT statement must return the same number of columns with compatible data types, and the columns are matched by position, not name. UNION is distinct from joins like INNER JOIN or LEFT JOIN, which combine columns from tables based on a condition—see INNER JOIN and LEFT JOIN for comparisons.
For example, if you have separate tables for active_customers and inactive_customers, UNION can merge them into a single list of unique customers. Supported across major databases like MySQL, PostgreSQL, SQL Server, and Oracle, UNION is a versatile tool for data consolidation. Let’s dive into how it works.
Basic Syntax of the UNION Operator
The UNION operator is used to combine multiple SELECT statements. Here’s the basic syntax:
SELECT column1, column2, ...
FROM table1
WHERE condition
UNION
SELECT column1, column2, ...
FROM table2
WHERE condition;
- SELECT column1, column2, ...: The columns to retrieve from each table or query. Each SELECT must have the same number of columns.
- FROM table1, table2, ...: The tables or subqueries providing the data.
- WHERE condition: Optional filters for each SELECT.
- UNION: Combines the results, removing duplicates.
For example, to combine customer names from two tables:
SELECT first_name
FROM active_customers
UNION
SELECT first_name
FROM inactive_customers;
This returns a single list of unique first_name values from both tables. For more on querying basics, see SELECT Statement.
How UNION Works
UNION executes each SELECT statement independently, then combines their results into a single set, automatically removing duplicate rows based on all selected columns. The column names in the final result come from the first SELECT statement, and data types must be compatible across corresponding columns (e.g., VARCHAR with VARCHAR, INT with INT).
Example: Combining Customer Lists
Consider two tables:
active_customers:
customer_id | first_name |
---|---|
101 | John |
102 | Jane |
103 | Alice |
inactive_customers:
customer_id | first_name |
---|---|
201 | Bob |
202 | Alice |
203 | Clara |
To create a unified list of customer names:
SELECT first_name
FROM active_customers
UNION
SELECT first_name
FROM inactive_customers;
Result:
first_name |
---|
John |
Jane |
Alice |
Bob |
Clara |
“Alice” appears only once, as UNION removes duplicates. The result has 5 rows instead of 6, showing UNION’s deduplication.
UNION vs. UNION ALL
By default, UNION removes duplicates, which can be computationally expensive for large datasets. The UNION ALL operator, in contrast, keeps all rows, including duplicates, and is faster because it skips deduplication. Use UNION ALL when duplicates are acceptable or guaranteed not to exist—see UNION ALL Operator.
Example: UNION ALL
Using the same tables:
SELECT first_name
FROM active_customers
UNION ALL
SELECT first_name
FROM inactive_customers;
Result:
first_name |
---|
John |
Jane |
Alice |
Bob |
Alice |
Clara |
“Alice” appears twice, as UNION ALL retains duplicates, producing 6 rows.
UNION with Multiple Columns
UNION can combine queries with multiple columns, as long as the number and data types match. This is useful for merging complex datasets.
Example: Combining Customer Details
Suppose you want to include email alongside first_name:
active_customers:
customer_id | first_name | |
---|---|---|
101 | John | john@example.com |
102 | Jane | jane@example.com |
inactive_customers:
customer_id | first_name | |
---|---|---|
201 | Bob | bob@example.com |
202 | Jane | jane@example.com |
Query:
SELECT first_name, email
FROM active_customers
UNION
SELECT first_name, email
FROM inactive_customers;
Result:
first_name | |
---|---|
John | john@example.com |
Jane | jane@example.com |
Bob | bob@example.com |
“Jane” appears once, as UNION considers the entire row (first_name, email) for deduplication.
UNION with Different Tables and Conditions
UNION can combine data from different tables or apply different conditions to the same table, as long as the column structure matches.
Example: Combining Sales by Region
Suppose you have north_sales and south_sales tables:
north_sales:
sale_id | amount |
---|---|
1 | 100.00 |
2 | 200.00 |
south_sales:
sale_id | amount |
---|---|
3 | 150.00 |
4 | 200.00 |
To combine all sales:
SELECT sale_id, amount
FROM north_sales
UNION
SELECT sale_id, amount
FROM south_sales;
Result:
sale_id | amount |
---|---|
1 | 100.00 |
2 | 200.00 |
3 | 150.00 |
4 | 200.00 |
The amount 200.00 appears twice, as the sale_id differs, making the rows unique.
Example: Different Conditions on One Table
In an orders table, to combine high-value and recent orders:
SELECT order_id, total
FROM orders
WHERE total > 500
UNION
SELECT order_id, total
FROM orders
WHERE order_date >= '2025-05-01';
This merges orders meeting either condition, removing duplicates. For more on filtering, see WHERE Clause.
UNION with ORDER BY and Other Clauses
UNION can be paired with ORDER BY, WHERE, or LIMIT/FETCH to sort or limit the combined result set. The ORDER BY applies to the final result and is placed after the last SELECT.
Example: Sorting Combined Results
Using the customer example:
SELECT first_name
FROM active_customers
UNION
SELECT first_name
FROM inactive_customers
ORDER BY first_name;
Result:
first_name |
---|
Alice |
Bob |
Clara |
Jane |
John |
ORDER BY sorts the unified list alphabetically. See ORDER BY Clause.
Example: Limiting Results
To get the top 3 names:
SELECT first_name
FROM active_customers
UNION
SELECT first_name
FROM inactive_customers
ORDER BY first_name
FETCH FIRST 3 ROWS ONLY;
Result:
first_name |
---|
Alice |
Bob |
Clara |
For row limiting, see FETCH Clause.
UNION with Subqueries and Joins
UNION can combine complex queries, including subqueries or joins, as long as the column structure matches.
Example: UNION with Joins
To combine customer names from orders in different regions:
SELECT
c.first_name
FROM customers AS c
INNER JOIN north_orders AS no
ON c.customer_id = no.customer_id
UNION
SELECT
c.first_name
FROM customers AS c
INNER JOIN south_orders AS so
ON c.customer_id = so.customer_id;
This merges customers who placed orders in either region, removing duplicates. For more, see INNER JOIN.
Example: UNION with Subqueries
To combine high-spending customers from two periods:
SELECT first_name
FROM (
SELECT c.first_name
FROM customers AS c
JOIN orders AS o
ON c.customer_id = o.customer_id
WHERE o.order_date < '2025-01-01'
GROUP BY c.first_name
HAVING SUM(o.total) > 1000
) AS high_spenders_2024
UNION
SELECT first_name
FROM (
SELECT c.first_name
FROM customers AS c
JOIN orders AS o
ON c.customer_id = o.customer_id
WHERE o.order_date >= '2025-01-01'
GROUP BY c.first_name
HAVING SUM(o.total) > 1000
) AS high_spenders_2025;
This uses subqueries to identify high spenders, then combines them. See Subqueries and HAVING Clause.
Practical Example: Managing a Retail Database
Let’s apply UNION to a real-world scenario. Suppose you’re managing a retail database with online_customers, store_customers, online_orders, and store_orders tables. Here’s how you’d use UNION:
- Unified Customer List: Combine unique customers from online and store channels:
SELECT first_name, email
FROM online_customers
UNION
SELECT first_name, email
FROM store_customers;
- All Orders: Merge online and store orders:
SELECT order_id, total
FROM online_orders
UNION
SELECT order_id, total
FROM store_orders;
- High-Value Customers: Find customers with high-value orders from either channel:
SELECT
c.first_name
FROM online_customers AS c
JOIN online_orders AS o
ON c.customer_id = o.customer_id
WHERE o.total > 500
UNION
SELECT
c.first_name
FROM store_customers AS c
JOIN store_orders AS o
ON c.customer_id = o.customer_id
WHERE o.total > 500;
- Top 5 Orders by Value: List the top 5 orders from both channels:
SELECT order_id, total
FROM online_orders
UNION
SELECT order_id, total
FROM store_orders
ORDER BY total DESC
FETCH FIRST 5 ROWS ONLY;
This example shows UNION’s utility for retail data consolidation. For querying basics, see SELECT Statement.
Performance Considerations
While we’re not covering best practices, a few performance notes can help you use UNION effectively:
- Deduplication Overhead: UNION removes duplicates, which can be slow for large datasets. Use UNION ALL if duplicates are okay—see UNION ALL Operator.
- Indexes: Indexes on columns in WHERE or ORDER BY clauses can speed up individual SELECT statements. See Creating Indexes.
- Filter Early: Apply WHERE conditions in each SELECT to reduce rows before combining. See WHERE Clause.
- Query Plans: Use EXPLAIN to analyze performance, especially with complex queries or joins. See EXPLAIN Plan.
For large datasets, optimizing each SELECT is key—check out SQL Best Practices for general tips. According to W3Schools, UNION is widely used for merging datasets.
Common Pitfalls and How to Avoid Them
UNION is straightforward but can trip you up. Here are some common issues:
- Mismatched Columns: Each SELECT must have the same number of columns with compatible data types. Check column counts and types (e.g., VARCHAR vs. INT) before running.
- Column Name Confusion: The final column names come from the first SELECT. Use aliases to clarify output—see Aliases with AS.
- Performance with UNION: Deduplication can be slow for large datasets. Use UNION ALL if duplicates aren’t a concern.
- Unintended Duplicates: If you expect unique rows but get duplicates, check if UNION ALL was used or if column selections allow duplicates (e.g., excluding unique IDs).
- ORDER BY Placement: ORDER BY applies to the final result and must appear after the last SELECT. Placing it in individual SELECTs can cause errors.
Testing each SELECT independently can help verify data and ensure compatibility.
Wrapping Up
The SQL UNION operator is a versatile tool for combining query results, producing a single, unique set of rows from multiple sources. By mastering its syntax, using it with ORDER BY, joins, or subqueries, and applying it in scenarios like our retail database, you’ll streamline data consolidation. Just watch out for pitfalls like mismatched columns or performance issues, and you’ll be using UNION like a pro.
For more SQL fundamentals, explore related topics like SELECT Statement or UNION ALL Operator. Ready for advanced techniques? Check out INTERSECT Operator or Subqueries for more ways to manipulate data.