Vulnerabilities · 9 min read · January 8, 2026
Rate Limiting · API Security · DoS Prevention · Throttling

Rate Limiting for AI-Generated APIs: Stop Abuse Before It Starts

APIs without rate limiting invite abuse. Learn how to implement proper rate limiting in your Next.js and Node.js applications.


Why Rate Limiting Matters

Without rate limiting, your API is vulnerable to:

  • Brute force attacks: Millions of password guesses
  • Credential stuffing: Testing stolen credentials
  • Data scraping: Extracting your entire database
  • Resource exhaustion: Expensive queries overloading servers
  • Cost exploitation: Running up your API/compute bills
AI coding tools almost never add rate limiting on their own. You have to.

Rate Limiting Strategies

1. Fixed Window

Limit: 100 requests per minute
Window: 00:00 to 00:59, 01:00 to 01:59, ...

00:00 - 00:30: 90 requests ← OK
00:30 - 00:59: 10 requests ← Hits limit
00:59: Request blocked
01:00: Counter resets, allowed again

Pros: Simple, memory-efficient
Cons: Bursts at window boundaries
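The fixed-window counter above can be sketched in a few lines. This is an illustrative in-memory version, not tied to any library; `createFixedWindowLimiter` is a name chosen here for illustration:

```javascript
// Minimal fixed-window counter (illustrative sketch, not production code).
// Each key gets one counter per window; the counter resets when a new
// window starts, which is what allows bursts at window boundaries.
function createFixedWindowLimiter(limit, windowMs) {
  const counters = new Map() // key -> { windowStart, count }

  return function allow(key, now = Date.now()) {
    // Align the window to fixed boundaries (00:00, 01:00, ...)
    const windowStart = Math.floor(now / windowMs) * windowMs
    const entry = counters.get(key)

    if (!entry || entry.windowStart !== windowStart) {
      // First request in a fresh window: start a new counter
      counters.set(key, { windowStart, count: 1 })
      return true
    }
    if (entry.count >= limit) return false
    entry.count++
    return true
  }
}
```

Note the boundary problem in action: a client can spend its full budget in the last second of one window and again in the first second of the next, doubling the effective burst rate.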

2. Sliding Window

Limit: 100 requests per minute
Check: Last 60 seconds from current time

00:30: Check the previous 60 seconds, 50 requests → Allowed
00:45: Check the previous 60 seconds, 95 requests → Allowed
00:50: Check the previous 60 seconds, 100 requests → Blocked

Pros: Smoother rate limiting
Cons: More memory and computation

3. Token Bucket

Bucket: 10 tokens
Refill: 1 token per second

Request 1: 9 tokens remaining
Request 2: 8 tokens remaining
...
Request 10: 0 tokens, wait for refill
(1 second passes): 1 token available

Pros: Allows bursts, smooth average rate
Cons: More complex implementation
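A token bucket can also be sketched compactly. Again an illustrative in-memory version (`createTokenBucket` is a name chosen here), with continuous refill based on elapsed time:

```javascript
// Minimal token bucket (illustrative sketch, not production code).
// Short bursts up to `capacity` are allowed, while the long-run rate
// is capped at `refillPerSec` tokens per second.
function createTokenBucket(capacity, refillPerSec) {
  const buckets = new Map() // key -> { tokens, lastRefill }

  return function allow(key, now = Date.now()) {
    const bucket = buckets.get(key) ?? { tokens: capacity, lastRefill: now }

    // Refill for the time elapsed since the last request, capped at capacity
    const elapsedSec = (now - bucket.lastRefill) / 1000
    bucket.tokens = Math.min(capacity, bucket.tokens + elapsedSec * refillPerSec)
    bucket.lastRefill = now
    buckets.set(key, bucket)

    if (bucket.tokens < 1) return false
    bucket.tokens -= 1
    return true
  }
}
```

Refilling lazily on each request, rather than on a timer, keeps the implementation stateless between requests and avoids a background job per key.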

Implementing Rate Limiting

Next.js with Upstash

bash
npm install @upstash/ratelimit @upstash/redis
javascript
// lib/ratelimit.ts
import { Ratelimit } from '@upstash/ratelimit'
import { Redis } from '@upstash/redis'

export const ratelimit = new Ratelimit({
  redis: Redis.fromEnv(),
  limiter: Ratelimit.slidingWindow(10, '10 s'),
  analytics: true,
})

// middleware.ts
import { ratelimit } from '@/lib/ratelimit'
import { NextResponse } from 'next/server'

export async function middleware(request) {
  const ip = request.ip ?? '127.0.0.1'
  const { success, limit, reset, remaining } = await ratelimit.limit(ip)

  if (!success) {
    return new NextResponse('Too Many Requests', {
      status: 429,
      headers: {
        'X-RateLimit-Limit': limit.toString(),
        'X-RateLimit-Remaining': remaining.toString(),
        'X-RateLimit-Reset': reset.toString(),
      },
    })
  }

  return NextResponse.next()
}

export const config = {
  matcher: '/api/:path*',
}

Express with express-rate-limit

javascript
import rateLimit from 'express-rate-limit'

const limiter = rateLimit({
  windowMs: 15 * 60 * 1000, // 15 minutes
  max: 100, // Limit each IP to 100 requests per window
  standardHeaders: true,
  legacyHeaders: false,
  message: { error: 'Too many requests, please try again later.' },
})

app.use('/api/', limiter)

// Stricter limit for auth endpoints
const authLimiter = rateLimit({
  windowMs: 15 * 60 * 1000,
  max: 5, // Only 5 login attempts per 15 minutes
  skipSuccessfulRequests: true, // Don't count successful logins
})

app.use('/api/auth/', authLimiter)

In-Memory (Development/Simple Cases)

javascript
// Simple in-memory rate limiter
const requests = new Map()

function rateLimit(key, limit, windowMs) {
  const now = Date.now()
  const windowStart = now - windowMs

  // Get or initialize request log
  let log = requests.get(key) || []

  // Remove old entries
  log = log.filter(timestamp => timestamp > windowStart)

  // Check limit
  if (log.length >= limit) {
    return { allowed: false, remaining: 0 }
  }

  // Add current request
  log.push(now)
  requests.set(key, log)

  return { allowed: true, remaining: limit - log.length }
}

// Usage
export async function POST(req) {
  const ip = req.headers.get('x-forwarded-for') || 'unknown'
  const { allowed, remaining } = rateLimit(ip, 10, 60000)

  if (!allowed) {
    return Response.json({ error: 'Rate limit exceeded' }, { status: 429 })
  }

  // Process request
}
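One caveat with the in-memory approach: the `requests` Map keeps an entry for every client ever seen, so it grows without bound. A periodic sweep keeps memory in check; `pruneStaleKeys` below is a hypothetical helper sketched for illustration:

```javascript
// Prune keys whose entire request log has aged out of the window.
// Without this, every IP ever seen stays in memory for the process lifetime.
function pruneStaleKeys(requests, windowMs) {
  const cutoff = Date.now() - windowMs
  for (const [key, log] of requests) {
    const fresh = log.filter(timestamp => timestamp > cutoff)
    if (fresh.length === 0) {
      requests.delete(key) // no recent activity: drop the key entirely
    } else {
      requests.set(key, fresh)
    }
  }
}

// Run periodically, e.g. once per window:
// setInterval(() => pruneStaleKeys(requests, 60000), 60000)
```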

Endpoint-Specific Limits

Different endpoints need different limits:

javascript
const limits = {
  // Authentication - very strict
  '/api/auth/login': { requests: 5, window: '15m' },
  '/api/auth/register': { requests: 3, window: '1h' },
  '/api/auth/reset-password': { requests: 3, window: '1h' },

  // Standard API - moderate
  '/api/users': { requests: 100, window: '1m' },
  '/api/posts': { requests: 100, window: '1m' },

  // Expensive operations - strict
  '/api/export': { requests: 5, window: '1h' },
  '/api/search': { requests: 30, window: '1m' },
  '/api/ai/generate': { requests: 10, window: '1m' },

  // Public endpoints - lenient
  '/api/health': { requests: 1000, window: '1m' },
}
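A table like this needs a small lookup step to turn a path into concrete numbers. One possible shape, assuming a hypothetical `limitFor` helper and an illustrative default fallback of 60 requests per minute:

```javascript
// Hypothetical helper: resolve a path to its limit config, falling back
// to a default, and parse window strings like '15m' into milliseconds.
const DEFAULT_LIMIT = { requests: 60, window: '1m' } // illustrative default

function parseWindow(window) {
  const units = { s: 1000, m: 60000, h: 3600000 }
  const match = /^(\d+)([smh])$/.exec(window)
  if (!match) throw new Error(`Unrecognized window: ${window}`)
  return Number(match[1]) * units[match[2]]
}

function limitFor(path, limits) {
  const config = limits[path] ?? DEFAULT_LIMIT
  return { requests: config.requests, windowMs: parseWindow(config.window) }
}
```

With this in place, a middleware can call `limitFor(request.pathname, limits)` and feed the result into whichever limiter implementation you chose above.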

Rate Limiting by User vs IP

IP-Based (Default)

javascript
const identifier = request.ip

Good for: Unauthenticated endpoints, login pages
Problem: Shared IPs (offices, schools) get rate limited together
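One wrinkle with IP-based limiting: behind a proxy or load balancer, the socket address you see is the proxy's, not the client's. A sketch of resolving the client IP from `X-Forwarded-For` (the function name and fallbacks are illustrative; only trust the header when your own proxy sets it, since clients can spoof it):

```javascript
// Sketch: pick a rate-limit key for the client behind a trusted proxy.
function clientIp(headers, socketAddress) {
  const forwarded = headers['x-forwarded-for']
  if (forwarded) {
    // The header is a comma-separated chain; the left-most entry is the
    // original client as reported by the first proxy in the chain.
    return forwarded.split(',')[0].trim()
  }
  // No proxy header: fall back to the direct socket address
  return socketAddress ?? 'unknown'
}
```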

User-Based

javascript
const session = await getServerSession()
const identifier = session?.user.id || request.ip

Good for: Authenticated endpoints, per-user quotas
Problem: Attackers can create multiple accounts

Combined

javascript
// Separate limits for IP and user
const ipLimit = await ratelimit.limit(`ip:${ip}`)
const userLimit = session
  ? await ratelimit.limit(`user:${session.user.id}`)
  : { success: true }

if (!ipLimit.success || !userLimit.success) {
  return new Response('Rate limited', { status: 429 })
}

Response Headers

Always include rate limit headers:

javascript
return new Response(data, {
  headers: {
    'X-RateLimit-Limit': '100',
    'X-RateLimit-Remaining': '95',
    'X-RateLimit-Reset': '1640000000',
    'Retry-After': '60', // Seconds until reset
  },
})

Handling Rate Limit Errors

Client-Side

javascript
async function fetchWithRetry(url, options, maxRetries = 3) {
  for (let i = 0; i < maxRetries; i++) {
    const response = await fetch(url, options)

    if (response.status === 429) {
      // Retry-After arrives as a string (or null); default to 60 seconds
      const retryAfter = Number(response.headers.get('Retry-After')) || 60
      await new Promise(resolve => setTimeout(resolve, retryAfter * 1000))
      continue
    }

    return response
  }

  throw new Error('Rate limit exceeded after retries')
}

Server-Side Error Response

javascript
if (!rateLimitResult.success) {
  return Response.json(
    {
      error: 'Rate limit exceeded',
      message: 'Too many requests. Please try again later.',
      retryAfter: Math.ceil((rateLimitResult.reset - Date.now()) / 1000),
    },
    {
      status: 429,
      headers: {
        'Retry-After': Math.ceil((rateLimitResult.reset - Date.now()) / 1000).toString(),
      },
    }
  )
}

Rate Limiting Checklist

CRITICAL ENDPOINTS
==================
[ ] Login: 5 attempts per 15 minutes
[ ] Registration: 3 per hour
[ ] Password reset: 3 per hour
[ ] 2FA verification: 5 per 15 minutes

STANDARD API
============
[ ] Default limit on all endpoints
[ ] Per-endpoint customization where needed
[ ] Both IP and user-based limits

EXPENSIVE OPERATIONS
====================
[ ] Export/download: Strict limits
[ ] Search: Moderate limits
[ ] AI/LLM calls: Per-user quotas

IMPLEMENTATION
==============
[ ] Rate limit headers in responses
[ ] Proper 429 status code
[ ] Retry-After header
[ ] Client-side retry logic

The Bottom Line

Rate limiting is insurance against abuse. Every API needs it, and AI-generated code almost never includes it.

No rate limiting = unlimited attack attempts. Add limits on day one.
