Understanding PostgreSQL Compound Indexes: A Deep Dive
Introduction
When it comes to database performance optimization, proper indexing is crucial. While single-column indexes are straightforward, compound indexes (also known as composite or multi-column indexes) require a deeper understanding. In this post, we’ll explore how compound indexes work in PostgreSQL, when to use them, and how to optimize your queries.
What is a Compound Index?
A compound index is an index on multiple columns of a table. Think of it like a phone book organized by last name and then first name. The order of columns in the index definition is crucial for how PostgreSQL can use it.
Let’s create a sample table and index to demonstrate:
1
2
3
4
5
6
7
8
9
CREATE TABLE users (
userid INTEGER,
email VARCHAR(255),
age INTEGER,
name VARCHAR(255),
created_at TIMESTAMP
);
CREATE INDEX idx_users_composite ON users (userid, email, age);
How PostgreSQL Uses Compound Indexes
The key principle is that PostgreSQL can only use index columns from left to right. Let’s examine different query patterns and their effectiveness:
1. Perfect Match (Using All Columns)
1
2
3
4
SELECT * FROM users
WHERE userid = 123
AND email = '[email protected]'
AND age = 25;
This query can fully utilize the index because it uses all columns in the same order as the index.
2. Left-Most Columns Only
1
2
3
SELECT * FROM users
WHERE userid = 123
AND email = '[email protected]';
This works efficiently because it uses the leftmost columns of the index.
3. Skipping Middle Column
1
2
3
SELECT * FROM users
WHERE userid = 123
AND age = 25;
This can only use the userid part of the index. The age condition must be checked separately because we skipped the email column.
4. Using Only Later Columns
1
2
3
SELECT * FROM users
WHERE email = '[email protected]'
AND age = 25;
This cannot use the index effectively because it doesn’t include the leftmost column (userid).
Order Matters: Column Arrangement Strategies
When creating compound indexes, consider these factors for column ordering:
- Selectivity First: Put the most selective columns (those with more unique values) first
- Common Query Patterns: Match your most frequent query patterns
- Range Queries Last: Put columns used in range conditions at the end
Example of Order Impact
Poor index for this query:
1
2
3
4
5
6
CREATE INDEX idx_poor_order ON users (age, email, userid);
-- Query performance suffers
SELECT * FROM users
WHERE userid = 123
AND email LIKE 'john%';
Better index:
1
2
3
4
5
6
CREATE INDEX idx_better_order ON users (userid, email, age);
-- Query performs well
SELECT * FROM users
WHERE userid = 123
AND email LIKE 'john%';
Index Only Scans
One powerful feature of compound indexes is their ability to support index-only scans. If all columns needed by a query are in the index, PostgreSQL can retrieve data directly from the index without touching the table.
1
2
3
4
5
6
7
8
9
-- Can be satisfied by index alone
SELECT userid, email, age
FROM users
WHERE userid = 123;
-- Requires table access
SELECT userid, email, age, name
FROM users
WHERE userid = 123;
Performance Monitoring
To verify index usage, use EXPLAIN ANALYZE:
1
2
3
4
EXPLAIN ANALYZE
SELECT * FROM users
WHERE userid = 123
AND email = '[email protected]';
Common Gotchas and Tips
- Size Considerations: Each additional column increases index size
- Maintenance Overhead: More indexes mean slower INSERT/UPDATE operations
- Column Order Flexibility: Consider creating multiple indexes for different query patterns
- NULL Values: Index can include NULL values, but consider their impact on selectivity
When to Use Compound Indexes
Use compound indexes when:
- You frequently query multiple columns together
- You need to support range queries on the last column
- Your queries can benefit from index-only scans
- The selectivity of combined columns is significantly better than single columns