
3/20/2026

Building a Rate Limiter from Scratch in Node.js (2026)


Every API you ship without rate limiting is a ticking time bomb. One aggressive client, one bot loop, one bad actor — and your server is toast. I learned this the hard way on a side project that got hammered by a scraper at 3 AM.

The good news? You don't need Redis or a third-party service to get started. You can build a solid rate limiter in pure Node.js, plug it into Express, and have it running in under an hour.

Why Rate Limiting Matters

Rate limiting isn't just about stopping attacks. It's about protecting your server from accidental overload, controlling costs when you call paid APIs downstream, ensuring fair usage across clients, and providing a first line of defense against DDoS. If your API is public-facing in 2026, rate limiting isn't optional. It's table stakes.

Algorithm 1: Token Bucket

The token bucket is the classic approach. You have a bucket that holds tokens. Each request costs one token. Tokens refill at a fixed rate. If the bucket is empty, the request is rejected. It allows short bursts while enforcing an average rate — perfect for APIs where users might send a few rapid requests legitimately.

Here's the implementation. The key insight is that tokens refill lazily — we calculate how many tokens should have been added since the last request, avoiding unnecessary timers and keeping memory usage flat:

```javascript
class TokenBucket {
  constructor(maxTokens, refillRate) {
    this.maxTokens = maxTokens;
    this.tokens = maxTokens;
    this.refillRate = refillRate; // tokens added per second
    this.lastRefill = Date.now();
  }

  refill() {
    const now = Date.now();
    const elapsed = (now - this.lastRefill) / 1000;
    this.tokens = Math.min(this.maxTokens, this.tokens + elapsed * this.refillRate);
    this.lastRefill = now;
  }

  consume() {
    this.refill();
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}
```

Algorithm 2: Sliding Window

The sliding window tracks actual timestamps of requests. Instead of tokens, you count: how many requests has this client made in the last N seconds? It gives precise control — 100 requests per minute means exactly that. No burst allowance.

```javascript
class SlidingWindow {
  constructor(windowMs, maxRequests) {
    this.windowMs = windowMs;
    this.maxRequests = maxRequests;
    this.requests = []; // timestamps of requests inside the window
  }

  isAllowed() {
    const now = Date.now();
    // Drop timestamps that have aged out of the window
    this.requests = this.requests.filter(ts => now - ts < this.windowMs);
    if (this.requests.length < this.maxRequests) {
      this.requests.push(now);
      return true;
    }
    return false;
  }
}
```

One thing to watch: this stores a timestamp per request, so memory grows with traffic. For high-volume APIs, switch to a counter-based sliding window. For most side projects and medium-traffic APIs, this works perfectly.
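If you do outgrow the timestamp list, the counter-based variant keeps one counter for the current fixed window and one for the previous, then weights the previous count by how much of it still overlaps the rolling window. Here's a sketch of that idea — the class name and the injectable `now` parameter are my own choices for testability, not a standard API:

```javascript
class SlidingWindowCounter {
  constructor(windowMs, maxRequests) {
    this.windowMs = windowMs;
    this.maxRequests = maxRequests;
    this.currentWindowStart = 0; // start of the current fixed window
    this.currentCount = 0;
    this.previousCount = 0;
  }

  isAllowed(now = Date.now()) {
    const windowStart = Math.floor(now / this.windowMs) * this.windowMs;
    if (windowStart !== this.currentWindowStart) {
      // Roll forward; if more than one full window passed,
      // the previous count is stale and resets to zero.
      this.previousCount =
        windowStart - this.currentWindowStart === this.windowMs
          ? this.currentCount
          : 0;
      this.currentCount = 0;
      this.currentWindowStart = windowStart;
    }
    // Weight the previous window by how much of it still overlaps
    const overlap = 1 - (now - windowStart) / this.windowMs;
    const estimated = this.previousCount * overlap + this.currentCount;
    if (estimated < this.maxRequests) {
      this.currentCount += 1;
      return true;
    }
    return false;
  }
}
```

Memory stays O(1) per client no matter how much traffic arrives, at the cost of the count being an estimate rather than an exact tally.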

Plugging It into Express

Here's an Express middleware using the token bucket. Each IP gets its own bucket, stored in a Map:

```javascript
const buckets = new Map();

function rateLimiter(maxTokens, refillRate) {
  return (req, res, next) => {
    const key = req.ip;
    if (!buckets.has(key)) buckets.set(key, new TokenBucket(maxTokens, refillRate));
    const bucket = buckets.get(key);
    if (bucket.consume()) {
      next();
    } else {
      res.status(429).json({
        error: 'Too many requests',
        retryAfter: Math.ceil(1 / refillRate) // seconds until one token refills
      });
    }
  };
}

// 10-token burst per IP, refilling at 2 tokens per second
app.use('/api', rateLimiter(10, 2));
```

A few things to add in production: clean up stale buckets periodically, use a better key than raw req.ip behind proxies (enable Express's trust proxy setting so req.ip reflects the real client, or key on a user ID from auth — don't trust a raw x-forwarded-for header, since clients can spoof it), and always return a Retry-After header on 429 responses — it's good API citizenship.
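For the cleanup point, a periodic sweep is usually enough. Here's one way to sketch it — the function names and the 5/10-minute numbers are arbitrary picks of mine, and it assumes each bucket records its last activity in lastRefill, as the TokenBucket class above does:

```javascript
// Remove buckets that haven't seen a request recently, so the Map
// doesn't grow forever as one-off IPs come and go.
function sweepStaleBuckets(buckets, maxIdleMs, now = Date.now()) {
  for (const [key, bucket] of buckets) {
    if (now - bucket.lastRefill > maxIdleMs) buckets.delete(key);
  }
}

// Wire-up: sweep every 5 minutes, evicting buckets idle for 10+ minutes.
// unref() keeps the timer from holding the process open on shutdown.
function startSweeper(buckets) {
  return setInterval(
    () => sweepStaleBuckets(buckets, 10 * 60 * 1000),
    5 * 60 * 1000
  ).unref();
}
```

Deleting entries while iterating a Map is safe in JavaScript, so the sweep needs no intermediate list.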

Token Bucket vs Sliding Window

Use token bucket when you want burst tolerance, care more about average rate, or need memory efficiency. Use sliding window when you need strict per-window enforcement, are building billing or quota systems, or precision matters. For most Express APIs, start with token bucket — it's more forgiving for legitimate users while still blocking abuse.
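To make the burst-tolerance difference concrete, here's a small deterministic simulation — minimal closure-based versions of both limiters (my own condensed rewrites, not the classes above) with the clock injected so the outcome is repeatable:

```javascript
// Token bucket with an injectable clock (now in milliseconds)
function makeTokenBucket(maxTokens, refillRate) {
  let tokens = maxTokens;
  let last = 0;
  return (now) => {
    tokens = Math.min(maxTokens, tokens + ((now - last) / 1000) * refillRate);
    last = now;
    if (tokens >= 1) { tokens -= 1; return true; }
    return false;
  };
}

// Sliding window with the same injectable clock
function makeSlidingWindow(windowMs, maxRequests) {
  let stamps = [];
  return (now) => {
    stamps = stamps.filter(ts => now - ts < windowMs);
    if (stamps.length < maxRequests) { stamps.push(now); return true; }
    return false;
  };
}

// Both limiters average 2 requests/second, but a burst of five
// requests inside 40 ms plays out very differently.
const bucket = makeTokenBucket(5, 2);        // 5-token burst, 2 tokens/s
const sliding = makeSlidingWindow(1000, 2);  // strict 2 per rolling second

const burst = [0, 10, 20, 30, 40];
const bucketResults = burst.map(bucket);    // all five pass
const slidingResults = burst.map(sliding);  // only the first two pass
```

The bucket drains its burst allowance and lets the whole run through; the sliding window holds the line at two, full stop. That's the whole tradeoff in five requests.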

When to Graduate to Redis

This in-memory approach has limits: single process only (multiple Node.js instances each maintain separate bucket maps), no persistence across restarts, and memory grows at scale. When you hit these walls, reach for Redis — same algorithms, shared state, atomic operations. But don't start there. Build in-memory first, understand the algorithms, and upgrade when traffic demands it.

Wrapping Up

Rate limiting feels complex until you build it. Two classes, one middleware, and your API actually enforces boundaries. Start in-memory, test it, upgrade to Redis when you need to. Check out my projects or more posts on the blog for more Node.js and React guides.