A 40-Line Fix That Eliminated a 400x Performance Gap

• by Petabite
Tags: performance, optimization, debugging, postgresql


Last month, I found a query that took 2.8 seconds to run. After a 40-line fix, it completed in 7 milliseconds. Same result, 400x faster. Here’s how.

The Problem

We had a dashboard showing user activity. Simple enough:

SELECT
  users.name,
  COUNT(events.id) as event_count,
  MAX(events.created_at) as last_active
FROM users
LEFT JOIN events ON events.user_id = users.id
WHERE users.created_at > NOW() - INTERVAL '30 days'
GROUP BY users.id
ORDER BY event_count DESC
LIMIT 100;

For 50,000 users and 10 million events: 2,800ms.

The kicker? Only 100 rows needed. Yet we were processing millions.

The Investigation

EXPLAIN ANALYZE ...
Limit  (cost=45678.23..45678.48 rows=100 width=52)
       (actual time=2847.234..2847.289 rows=100 loops=1)
  ->  Sort  (cost=45678.23..45803.41 rows=50072 width=52)
            (actual time=2847.231..2847.258 rows=100 loops=1)
        Sort Key: (count(events.id)) DESC
        ->  HashAggregate  (cost=41234.56..41735.28 rows=50072 width=52)
                          (actual time=2789.123..2820.456 rows=50072 loops=1)
              ->  Hash Left Join  (cost=1234.45..38901.23 rows=9827456 width=28)
                                 (actual time=45.678..2234.567 rows=9827456 loops=1)

The problem: Hash Left Join of 10 million event rows.

We’re joining EVERYTHING, then filtering, then sorting, then limiting. Backwards.

The Fix

Flip the logic: get the top users first, THEN join their events.

WITH top_users AS (
  SELECT id, name
  FROM users
  WHERE created_at > NOW() - INTERVAL '30 days'
  LIMIT 100
),
user_events AS (
  SELECT
    user_id,
    COUNT(*) as event_count,
    MAX(created_at) as last_active
  FROM events
  WHERE user_id IN (SELECT id FROM top_users)
  GROUP BY user_id
)
SELECT
  u.name,
  COALESCE(e.event_count, 0) as event_count,
  e.last_active
FROM top_users u
LEFT JOIN user_events e ON e.user_id = u.id
ORDER BY event_count DESC;

Wait, this is wrong. We're limiting users BEFORE counting events, and with no ORDER BY that LIMIT 100 grabs an arbitrary 100 users. We need the users with the MOST events, but we don't know who those are until after counting.

Let me try again:

WITH user_event_counts AS (
  SELECT
    user_id,
    COUNT(*) as event_count,
    MAX(created_at) as last_active
  FROM events
  WHERE user_id IN (
    SELECT id
    FROM users
    WHERE created_at > NOW() - INTERVAL '30 days'
  )
  GROUP BY user_id
  ORDER BY event_count DESC
  LIMIT 100
)
SELECT
  u.name,
  e.event_count,
  e.last_active
FROM user_event_counts e
JOIN users u ON u.id = e.user_id
ORDER BY e.event_count DESC;

Still wrong. The CTE still aggregates all 10 million events; the LIMIT only applies after the counting is done.

The REAL fix:

-- Step 1: Add a composite index covering the per-user aggregate
CREATE INDEX idx_events_user_created
  ON events(user_id, created_at);

-- Step 2: Rewrite query to use index
SELECT
  users.name,
  event_counts.count as event_count,
  event_counts.last_active
FROM users
JOIN LATERAL (
  SELECT
    COUNT(*) as count,
    MAX(created_at) as last_active
  FROM events
  WHERE events.user_id = users.id
) event_counts ON true
WHERE users.created_at > NOW() - INTERVAL '30 days'
ORDER BY event_counts.count DESC
LIMIT 100;

Result: 7ms.
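A quick way to sanity-check the shape of this query outside Postgres: SQLite has no LATERAL, but a correlated scalar subquery expresses the same per-user aggregate. A minimal sketch, with invented table contents:

```python
# Sketch: the LATERAL pattern as correlated scalar subqueries in SQLite.
# Each user's COUNT/MAX is computed on demand against the composite index.
# Table names and rows here are made up for illustration.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users  (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE events (id INTEGER PRIMARY KEY, user_id INTEGER, created_at TEXT);
    CREATE INDEX idx_events_user_created ON events(user_id, created_at);
    INSERT INTO users VALUES (1, 'ada'), (2, 'bob'), (3, 'cy');
    INSERT INTO events (user_id, created_at) VALUES
      (1, '2024-01-01'), (1, '2024-01-05'), (2, '2024-01-03');
""")

rows = conn.execute("""
    SELECT u.name,
           (SELECT COUNT(*)        FROM events e WHERE e.user_id = u.id) AS event_count,
           (SELECT MAX(created_at) FROM events e WHERE e.user_id = u.id) AS last_active
    FROM users u
    ORDER BY event_count DESC
    LIMIT 100;
""").fetchall()

print(rows)  # → [('ada', 2, '2024-01-05'), ('bob', 1, '2024-01-03'), ('cy', 0, None)]
```

The correlated subqueries make the per-user work explicit: each aggregate is a narrow index probe rather than a share of one giant join.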

What Changed?

Before:

  1. Load ALL users (50k rows)
  2. Join with ALL events (10M rows)
  3. Aggregate (process 10M rows)
  4. Sort (50k results)
  5. Limit to 100

After:

  1. Scan users with index
  2. For each user, count their events using index (LATERAL join)
  3. Keep running top-100 in memory
  4. Stop after finding top 100

The LATERAL join plus smart ordering meant Postgres only had to process ~500 users to find the top 100, not all 50k.
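The "keep a running top-100 in memory" step is worth seeing on its own. A bounded min-heap holds at most N candidates, and most users are discarded after a single comparison. A sketch with made-up counts (the 50k/100 sizes mirror the article, nothing else is from the real system):

```python
# Running top-N with a bounded min-heap: heap[0] is always the weakest
# survivor, so most candidates are rejected with one comparison.
import heapq
import random

def top_n(pairs, n):
    """Return the n (count, user_id) tuples with the highest counts, descending."""
    heap = []  # min-heap of at most n entries
    for user_id, count in pairs:
        item = (count, user_id)
        if len(heap) < n:
            heapq.heappush(heap, item)
        elif item > heap[0]:
            heapq.heapreplace(heap, item)  # evict the current weakest
    return sorted(heap, reverse=True)

random.seed(42)
counts = [(uid, random.randrange(1000)) for uid in range(50_000)]
top = top_n(counts, 100)
print(top[0], top[-1])  # best and 100th-best (count, user_id)
```

This is the same idea the planner exploits: once you have 100 candidates, a new user only matters if it can beat the current minimum.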

The Lesson

Don’t process what you don’t need.

Common antipatterns:

  • SELECT * when you need 3 columns
  • Loading all rows when you need top N
  • Joining all tables when you can filter first
  • Sorting in application when database can do it

How to Find These

1. Profile Everything

EXPLAIN (ANALYZE, BUFFERS) your_query;

Look for:

  • Actual time far above the planner's cost estimate
  • Seq scans on big tables
  • Sorts with many rows
  • Nested loops with high loop counts

2. Question Joins

Ask: “Do I need ALL rows from this join?”

  • Often you can filter before joining
  • Sometimes you can aggregate before joining
  • Occasionally you can avoid the join entirely
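"Aggregate before joining" in particular is easy to verify for equivalence. A minimal sketch using SQLite with invented data: the two queries return identical rows, but the second hands the join one pre-aggregated row per user instead of every event row.

```python
# Equivalence check: join-then-group vs group-then-join.
# Tables and rows are invented for illustration.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users  (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE events (user_id INTEGER, created_at TEXT);
    INSERT INTO users VALUES (1, 'ada'), (2, 'bob');
    INSERT INTO events VALUES (1, '2024-01-01'), (1, '2024-01-02'), (2, '2024-01-03');
""")

# Join everything, then aggregate (the original shape).
join_then_group = conn.execute("""
    SELECT u.name, COUNT(e.user_id)
    FROM users u LEFT JOIN events e ON e.user_id = u.id
    GROUP BY u.id ORDER BY u.id
""").fetchall()

# Aggregate first, then join one row per user.
group_then_join = conn.execute("""
    SELECT u.name, COALESCE(c.n, 0)
    FROM users u
    LEFT JOIN (SELECT user_id, COUNT(*) AS n FROM events GROUP BY user_id) c
           ON c.user_id = u.id
    ORDER BY u.id
""").fetchall()

assert join_then_group == group_then_join  # same answer, far fewer joined rows
```

Planners sometimes make this transformation themselves, but writing it explicitly makes the row counts visible in EXPLAIN.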

3. Add Indexes (But Smartly)

-- Bad: Index every column
CREATE INDEX ON events(user_id);
CREATE INDEX ON events(created_at);

-- Good: Composite index for the query pattern
CREATE INDEX ON events(user_id, created_at);
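A hedged illustration of why the composite index wins, using SQLite's EXPLAIN QUERY PLAN as a lightweight stand-in for Postgres's EXPLAIN (schema invented): with only `events(user_id)`, the `MAX(created_at)` still needs table lookups; with `events(user_id, created_at)` the whole aggregate is answered from the index alone, which SQLite reports as a COVERING INDEX.

```python
# Single-column vs composite index for the per-user aggregate.
# EXPLAIN QUERY PLAN rows are (id, parent, notused, detail).
import sqlite3

QUERY = "SELECT COUNT(*), MAX(created_at) FROM events WHERE user_id = 1"

def plan_for(index_sql):
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (user_id INTEGER, created_at TEXT)")
    conn.execute(index_sql)
    return conn.execute("EXPLAIN QUERY PLAN " + QUERY).fetchall()[0][3]

single = plan_for("CREATE INDEX idx_user ON events(user_id)")
composite = plan_for("CREATE INDEX idx_user_created ON events(user_id, created_at)")
print(single)     # e.g. 'SEARCH events USING INDEX idx_user (user_id=?)'
print(composite)  # e.g. 'SEARCH events USING COVERING INDEX idx_user_created (user_id=?)'
```

The same distinction shows up in Postgres as a plain index scan versus an index-only scan.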

4. Benchmark Edge Cases

Test with:

  • Empty tables
  • Maximum data
  • Worst-case distributions

Our query was fast with 100 users. At 50k? Disaster.
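A benchmarking loop for this doesn't need much machinery. A sketch (SQLite as a stand-in, sizes and schema invented) that times the same query at several table sizes, including the empty-table edge case:

```python
# Minimal scaling benchmark: the same query at different table sizes.
# A plan that looks fine at 100 rows can fall over at 50k.
import sqlite3
import time

def time_query(n_events):
    """Return the elapsed seconds for a top-100 aggregate over n_events rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (user_id INTEGER, created_at INTEGER)")
    conn.executemany(
        "INSERT INTO events VALUES (?, ?)",
        ((i % 1000, i) for i in range(n_events)),
    )
    start = time.perf_counter()
    conn.execute("""
        SELECT user_id, COUNT(*) FROM events
        GROUP BY user_id ORDER BY COUNT(*) DESC LIMIT 100
    """).fetchall()
    return time.perf_counter() - start

for n in (0, 1_000, 100_000):  # empty table, typical load, stress case
    print(f"{n:>7} events: {time_query(n) * 1000:.2f} ms")
```

Plotting (or just eyeballing) how the times grow with N tells you whether the query is linear, log-ish, or quietly quadratic.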

The Code

Before (40 lines of SQL):

-- Original query

After (40 lines of SQL):

-- Optimized query with LATERAL join

Same line count. 400x faster. The best optimizations are usually just better algorithms.

Takeaway

Performance work is often not about making code faster—it’s about making code do less work.

Measure, question assumptions, and optimize the algorithm before optimizing the code.