Sidekiq & Redis Optimization: Reducing Overhead and Scaling Worker Jobs

When you run thousands of background jobs through Sidekiq, Redis becomes the bottleneck. Every job enqueue adds Redis writes, network round-trips, and memory pressure. This post covers a real-world optimization we applied and a broader toolkit for keeping Sidekiq lean.


The Problem: One Job Per Item

Imagine sending weekly emails to 10,000 users. The naive approach:

# ❌ Bad: 10,000 Redis writes, 10,000 scheduled entries
user_ids.each do |id|
  WeeklyEmailWorker.perform_async(id)
end

Each perform_async does:

  • A Redis LPUSH (or ZADD for scheduled jobs)
  • Serialization of job payload
  • Network round-trip

At 10,000 users, that’s 10,000 Redis operations and 10,000 scheduled entries. At 1M users, that’s 1M scheduled jobs in Redis. That’s expensive and slow.


The Fix: Batch + Staggered Scheduling

Instead of one job per user, we batch users and schedule each batch with a small delay:

# ✅ Good: 100 Redis writes, 100 scheduled entries
BATCH_SIZE = 100
BATCH_DELAY = 0.2 # seconds

pending_user_ids.each_slice(BATCH_SIZE).with_index do |batch_ids, batch_index|
  delay_seconds = batch_index * BATCH_DELAY
  WeeklyEmailBatchWorker.perform_in(delay_seconds, batch_ids)
end

What this achieves:

Metric                        Before (1 per user)   After (batched)
Redis ops                     10,000                100
Scheduled jobs                10,000                100
Scheduled jobs at 1M users    1,000,000             10,000

Each worker still processes one user at a time internally, but we only enqueue one job per batch. Redis overhead drops by roughly 100x.
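The arithmetic behind that claim can be sketched in plain Ruby. No Sidekiq required; `batch_schedule` is a hypothetical helper that just mirrors the `each_slice` + `with_index` pattern above:

```ruby
# Sketch of the batching arithmetic: how many scheduled jobs, and how
# long the staggered schedule spans. Pure Ruby; `batch_schedule` is a
# hypothetical helper, not part of Sidekiq.
BATCH_SIZE  = 100
BATCH_DELAY = 0.2 # seconds between batches

def batch_schedule(user_ids, batch_size: BATCH_SIZE, delay: BATCH_DELAY)
  user_ids.each_slice(batch_size).with_index.map do |batch_ids, i|
    { delay_seconds: (i * delay).round(2), ids: batch_ids }
  end
end

schedule = batch_schedule((1..10_000).to_a)
schedule.size                  # 100 scheduled jobs instead of 10,000
schedule.last[:delay_seconds]  # 19.8 — the whole fan-out spans under 20 seconds
```

At 1M users the same math yields 10,000 scheduled jobs spread over about 33 minutes, which is where you would start tuning batch size and delay against your deadline.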

Why perform_in instead of chaining?

  • perform_in(delay, batch_ids) — all jobs are scheduled immediately with their future timestamps. Sidekiq moves them into the ready queue at the right time regardless of other queue traffic.
  • Chaining (each job enqueues the next) — the next batch only enters the queue after the current one finishes. If other jobs are busy, your email chain sits behind them and can be delayed significantly.

For time-sensitive jobs like “send at 8:46 AM local time,” upfront scheduling is the right choice.


Other Sidekiq Optimization Strategies

1. Bulk Enqueue (push_bulk)

Sidekiq::Client.push_bulk is available in open-source Sidekiq and pushes many jobs in one Redis call:

# Single Redis call instead of N
Sidekiq::Client.push_bulk(
  'class' => WeeklyEmailWorker,
  'args'  => user_ids.map { |id| [id] }
)

Useful when you don’t need per-job delays and want to minimize Redis round-trips.
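Sidekiq's docs recommend keeping each bulk call to roughly 1,000 jobs so a single huge push doesn't block Redis. A sketch of the slicing, with the actual push stubbed out so the snippet stays self-contained:

```ruby
# Build each job's positional args, then push in slices of 1,000.
# The Sidekiq::Client call is commented out; this shows only the
# slicing, not a live enqueue.
user_ids = (1..5_000).to_a
args     = user_ids.map { |id| [id] } # one args array per job

slices = args.each_slice(1_000).to_a
slices.each do |slice|
  # Sidekiq::Client.push_bulk('class' => 'WeeklyEmailWorker', 'args' => slice)
end
slices.size # 5 bulk calls instead of 5,000 individual pushes
```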

2. Adjust Concurrency

The default is 10 threads per process (lowered to 5 in Sidekiq 7). More threads mean more concurrency, but also more memory and more database connections:

# config/sidekiq.yml
:concurrency: 25 # Tune based on CPU/memory

Higher concurrency helps if jobs are I/O-bound (HTTP, DB, email). For CPU-bound jobs, lower concurrency is usually better.

3. Use Dedicated Queues

Separate heavy jobs from light ones:

# config/sidekiq.yml
:queues:
  - [critical, 3] # 3x weight
  - [default, 2]
  - [low, 1]

Critical jobs get more CPU time. Low-priority jobs don’t block the rest.

4. Rate Limiting (Sidekiq Enterprise or third-party gems)

Throttle jobs that hit external APIs. Sidekiq Enterprise ships Sidekiq::Limiter for this; the declarative sidekiq_options style below comes from third-party gems such as sidekiq-throttler:

class EmailWorker
  include Sidekiq::Worker
  sidekiq_options throttle: { threshold: 100, period: 1.minute }
end

Prevents hitting rate limits and keeps Redis usage predictable.
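The threshold/period idea can be shown with a minimal in-memory sketch. The WindowLimiter class here is illustrative only; a production limiter (like Sidekiq Enterprise's) keeps the window in Redis so all worker processes share it:

```ruby
# Fixed-window limiter: allow `threshold` calls per `period` seconds.
# In-memory and per-process; a real limiter stores the window in Redis
# so every worker counts against the same budget.
class WindowLimiter
  def initialize(threshold:, period:)
    @threshold = threshold
    @period = period
    @window_start = nil
    @count = 0
  end

  # Returns true and runs the block if under the limit, false otherwise.
  def within_limit(now = Time.now)
    if @window_start.nil? || now - @window_start >= @period
      @window_start = now # start a fresh window
      @count = 0
    end
    return false if @count >= @threshold
    @count += 1
    yield if block_given?
    true
  end
end

limiter = WindowLimiter.new(threshold: 2, period: 60)
limiter.within_limit { } # true
limiter.within_limit { } # true
limiter.within_limit { } # false — over the limit for this window
```

A worker that gets `false` back would typically reschedule itself with perform_in rather than busy-wait.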

5. Unique Jobs (sidekiq-unique-jobs)

Avoid duplicate jobs for the same work:

sidekiq_options lock: :until_executed, on_conflict: :log

Reduces redundant work and Redis load when jobs are retried or triggered multiple times.

6. Dead Job Cleanup

Dead jobs accumulate in Redis. Set retention and cleanup:

# config/initializers/sidekiq.rb
Sidekiq.configure_server do |config|
  config.death_handlers << ->(job, ex) {
    # Log, alert, or move to a dead-letter queue
  }
end

Tune dead_max_jobs (default 10,000) and dead_timeout_in_seconds (default roughly six months) so the dead set doesn’t grow unbounded in Redis.

7. Job Size Limits

Large payloads increase Redis memory and serialization cost:

# Keep payloads small; pass IDs, not full objects
WeeklyEmailWorker.perform_async(user_id) # ✅
WeeklyEmailWorker.perform_async(user.to_json) # ❌
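The difference is easy to measure: Sidekiq serializes job arguments as JSON, so the payload cost is roughly the JSON size of the args. The field values below are made up for illustration:

```ruby
require 'json'

# What perform_async(user_id) stores vs. a full object dump.
small = JSON.generate([123])
big   = JSON.generate([{ id: 123, name: 'Ada', email: 'ada@example.com',
                         bio: 'x' * 2_000 }])

small.bytesize # 5 bytes per job
big.bytesize   # ~2 KB per job — paid again on every enqueue and retry
```

Passing an ID also means the worker reloads fresh data at execution time, instead of acting on a snapshot that may have gone stale in the queue.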

8. Connection Pooling

Ensure each worker process has a bounded Redis connection pool:

# config/initializers/sidekiq.rb
Sidekiq.configure_server do |config|
  config.redis = { url: ENV['REDIS_URL'], size: 25 }
end

Prevents connection exhaustion under load.

9. Scheduled Job Limits

Scheduled jobs live in Redis. If you schedule millions of jobs, you may need to cap or paginate:

# Avoid scheduling 1M jobs at once
# Use batch + perform_in with reasonable batch sizes

10. Redis Memory and Eviction

Configure Redis for Sidekiq:

maxmemory 2gb
maxmemory-policy noeviction

Sidekiq’s job data must never be evicted, so use noeviction on the Redis instance backing Sidekiq; keep cache data (where volatile-lru or allkeys-lru fits) on a separate instance. Monitor memory so you never hit maxmemory unexpectedly.


Summary

Strategy               When to Use
Batch + perform_in     Many similar jobs at a specific time; cuts Redis ops ~100x
push_bulk              Large batches of jobs without per-job delays
Dedicated queues       Different priority levels for job types
Rate limiting          External APIs or rate-limited services
Unique jobs            Idempotent or duplicate-prone jobs
Small payloads         Always; pass IDs instead of full objects
Connection pooling     High concurrency or many processes

The batch + perform_in pattern is especially effective for time-sensitive jobs that must run in a narrow window while keeping Redis overhead low.

Happy Coding with Sidekiq!


Part 4 – Redis, cache invalidation, testing, pitfalls, and checklist

10) Optional: Redis caching in a Rails API app (why and how)

Even in API-only apps, application-level caching is useful to reduce DB load for expensive queries or aggregated endpoints.

Common patterns

  • Low-level caching: Rails.cache.fetch
  • Fragment caching: less relevant for API-only (used for views), but you can cache JSON blobs
  • Keyed caching with expiration for computed results

Example — caching an expensive query

class Api::V1::ReportsController < ApplicationController
  def sales_summary
    key = "sales_summary:#{Time.current.utc.strftime('%Y-%m-%d-%H')}"
    data = Rails.cache.fetch(key, expires_in: 5.minutes) do
      # expensive computation
      compute_sales_summary
    end
    render json: data
  end
end

Why Redis?

  • Redis is fast, supports expirations, and can be used as Rails.cache store (config.cache_store = :redis_cache_store, { url: ENV['REDIS_URL'] }).
  • Works well for ephemeral caches that you want to expire automatically.
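Conceptually, Rails.cache.fetch is a read-through cache with a TTL. An in-memory sketch of that contract (Redis adds the shared storage and automatic key expiry; this TtlCache class is illustrative only):

```ruby
# Minimal read-through cache with TTL, mirroring Rails.cache.fetch's
# contract: return the cached value if still fresh, otherwise run the
# block and store its result.
class TtlCache
  def initialize
    @store = {}
  end

  def fetch(key, expires_in:)
    entry = @store[key]
    return entry[:value] if entry && Time.now < entry[:expires_at]
    value = yield # cache miss or expired: recompute
    @store[key] = { value: value, expires_at: Time.now + expires_in }
    value
  end
end

cache = TtlCache.new
cache.fetch('sales', expires_in: 300) { 42 } # runs the block, returns 42
cache.fetch('sales', expires_in: 300) { 99 } # fresh hit, still returns 42
```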

Invalidation strategies

  • Time-based (TTL) — simplest.
  • Key-based — when related data changes, evict related keys.
    • Example: after a product update, call Rails.cache.delete("top_offers").
  • Versioned keys — embed a version or timestamp in the key (e.g., products:v3:top) and bump the version on deploy/major change.
  • Tagging / Key sets — maintain a set of keys per resource to delete them on change (more manual).
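Versioned keys need only a tiny helper. This one is hypothetical: bumping the version constant abandons the old keys rather than deleting them, and they simply age out via TTL:

```ruby
# Hypothetical versioned-key helper: invalidation by version bump
# rather than deletion. Old keys are never touched; their TTLs
# expire them on their own.
PRODUCTS_CACHE_VERSION = 3

def products_key(suffix, version: PRODUCTS_CACHE_VERSION)
  "products:v#{version}:#{suffix}"
end

products_key('top')             # "products:v3:top"
products_key('top', version: 4) # "products:v4:top" — bump abandons all v3 keys
```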

Caveat: Don’t rely solely on Redis caching for user-specific sensitive data. Use private caches when needed.

11) Purging caches and CDN invalidation

When hashed assets are used you rarely need to purge, because new filenames mean new URLs. For non-hashed assets or CDN caches you may need explicit purges:

  • CDN invalidation: Cloudflare / Fastly / CloudFront offer purge by URL or cache key. Use CDN APIs or surrogate-key headers to do group purges.
  • Surrogate-Control / Surrogate-Key (Fastly): set headers that help map objects to tags for efficient purging.
  • Nginx cache purge: if you configure proxy_cache, you must implement purge endpoints or TTLs.

Recommended approach:

  • Prefer hashed filenames for assets so you rarely need invalidation.
  • For dynamic content, prefer short TTLs or implement programmatic CDN purges as part of deploy/administration flows.

12) Testing and verifying caching behavior (practical commands)

Check response headers

curl -I https://www.mydomain.com/vite/index-B34XebCm.js
# expect: Cache-Control: public, max-age=31536000, immutable

Conditional request test (ETag)

  1. Get the ETag:
curl -I https://api.mydomain.com/api/v1/products/123
# Look for ETag: "..."

  2. Re-request with that ETag:
curl -I -H 'If-None-Match: "the-etag-value"' https://api.mydomain.com/api/v1/products/123
# expect: 304 Not Modified (if unchanged)

Check s-maxage / CDN effect

  • Use curl -I against the CDN domain (if applicable) to inspect Age header (shows time cached at edge) and X-Cache headers from CDN.

Chrome DevTools

  • Open Network tab, reload page, inspect a cached resource:
    • Status might show (from disk cache) or (from memory cache) if cached.
    • For resources with max-age but no immutable, you might see 200 with from disk cache or network requests with 304.

13) Common pitfalls and how to avoid them

  1. Caching HTML
    • Problem: If your index.html or vite.html is cached, users can get old asset references and see broken UI.
    • Avoid: Always no-cache your HTML entry file.
  2. Caching non-hashed assets long-term
    • Problem: Logo or content images may not update for users.
    • Avoid: Short TTL or rename files when updating (versioning).
  3. Not using ETag/Last-Modified
    • Problem: Clients re-download entire payloads when unchanged — wasted bandwidth.
    • Avoid: Use ETag or Last-Modified so clients can get 304.
  4. Caching user-specific responses publicly
    • Problem: Data leak (private data served to other users).
    • Avoid: Use private/no-store for user-specific responses.
  5. Relying solely on Nginx for dynamic caching
    • Problem: Hard to maintain invalidations and complex to configure with Passenger.
    • Avoid: Use Rails headers + CDN; or a caching reverse proxy only if necessary and you know how to invalidate.

14) Commands and operational notes for Passenger

Test nginx config

sudo nginx -t

Reload nginx gracefully

sudo systemctl reload nginx

Restart nginx if reload fails

sudo systemctl restart nginx

Restart Passenger (app-level)

  • Passenger allows restarting an app without touching the systemd service:
# restart specific app by path
sudo passenger-config restart-app /apps/mydomain/current

  • Or restart all apps under a path prefix (be careful):
sudo passenger-config restart-app / --ignore-app-not-running

Check Passenger status

passenger-status

Always run nginx -t before reloading, and verify caching headers with curl after the reload before rolling out.

15) Final checklist before/after deploy (practical)

Before deploy:

  • Ensure build pipeline fingerprints Vite output (hash in filenames).
  • Ensure /apps/mydomain/current/public/vite/ contains hashed assets.
  • Confirm vite.html is correct and references the hashed file names.
  • Confirm Nginx snippet for /vite/ long cache is present and not overridden.

After deploy:

  • Run sudo nginx -t and sudo systemctl reload nginx.
  • Test hashed asset: curl -I https://www.mydomain.com/vite/index-...js → expect Cache-Control: public, max-age=31536000
  • Test HTML: curl -I https://www.mydomain.com/vite.html → expect Cache-Control: no-cache
  • Test sample API endpoint headers: curl -I https://api.mydomain.com/api/v1/products → verify Cache-Control and ETag/Last-Modified where applicable.
  • Run smoke tests in browser (Chrome DevTools) to verify resources are cached as expected.

16) Appendix — example Rails snippets (summary)

Set header manually

response.set_header('Cache-Control', 'public, max-age=60, s-maxage=300')

Return 304 using conditional GET

def show
  resource = Resource.find(params[:id])
  if stale?(etag: resource, last_modified: resource.updated_at)
    render json: resource
  end
end
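Under the hood, stale? compares a validator derived from the resource against the client’s If-None-Match header. A simplified sketch of how such an ETag can be built (Rails’ real algorithm digests the object’s cache_key, so this etag_for helper is illustrative, not byte-for-byte identical):

```ruby
require 'digest'

# Illustrative ETag: a digest of whatever identifies this representation.
# Any change to id or updated_at yields a different ETag, so clients
# holding the old one get a fresh 200 instead of a 304.
def etag_for(id, updated_at)
  %("#{Digest::MD5.hexdigest("#{id}-#{updated_at}")}")
end

a = etag_for(123, '2024-05-01 10:00:00')
b = etag_for(123, '2024-05-01 10:05:00') # record updated => new ETag
a == b # false
```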

Redis caching (Rails.cache)

data = Rails.cache.fetch("top_products_page_#{params[:page]}", expires_in: 5.minutes) do
  Product.top.limit(20).to_a
end
render json: data

Conclusion (Part 4)

This part explained where caching belongs in an API-only Rails + Vue application, how Passenger fits into the stack, how to set cache headers for safe API caching, optional Redis caching strategies, and practical testing/operational steps.