Vulnerabilities · 9 min read · January 8, 2026
Rate Limiting · API Security · DoS Prevention · Throttling

Rate Limiting for AI-Generated APIs: Stop Abuse Before It Starts

APIs without rate limiting invite abuse. Learn how to implement proper rate limiting in your Next.js and Node.js applications.


Why Rate Limiting Matters

Without rate limiting, your API is vulnerable to:

  • Brute force attacks: Millions of password guesses
  • Credential stuffing: Testing stolen credentials
  • Data scraping: Extracting your entire database
  • Resource exhaustion: Expensive queries overloading servers
  • Cost exploitation: Running up your API/compute bills
AI coding tools almost never add rate limiting on their own. You have to.

Rate Limiting Strategies

1. Fixed Window

Limit: 100 requests per minute
Window: 00:00 to 00:59, 01:00 to 01:59, ...

00:00 - 00:30: 90 requests ← OK
00:30 - 00:59: 10 requests ← Hits limit
00:59: Request blocked
01:00: Counter resets, allowed again

Pros: Simple, memory-efficient
Cons: Bursts at window boundaries
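The fixed-window counter above can be sketched in a few lines. This is an illustrative in-memory version, not tied to any library; `createFixedWindowLimiter` is a name chosen here for illustration:

```javascript
// Minimal fixed-window counter (illustrative sketch, not production code).
// Each key gets one counter per window; the counter resets when a new
// window starts, which is what allows bursts at window boundaries.
function createFixedWindowLimiter(limit, windowMs) {
  const counters = new Map() // key -> { windowStart, count }

  return function allow(key, now = Date.now()) {
    // Align the window to fixed boundaries (00:00, 01:00, ...)
    const windowStart = Math.floor(now / windowMs) * windowMs
    const entry = counters.get(key)

    if (!entry || entry.windowStart !== windowStart) {
      // First request in a fresh window: start a new counter
      counters.set(key, { windowStart, count: 1 })
      return true
    }
    if (entry.count >= limit) return false
    entry.count++
    return true
  }
}
```

Note the boundary problem in action: a client can spend its full budget in the last second of one window and again in the first second of the next, doubling the effective burst rate.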

2. Sliding Window

Limit: 100 requests per minute
Check: Last 60 seconds from current time

00:30: Check the previous 60 seconds, 50 requests → Allowed
00:45: Check the previous 60 seconds, 95 requests → Allowed
00:50: Check the previous 60 seconds, 100 requests → Blocked

Pros: Smoother rate limiting
Cons: More memory and computation

3. Token Bucket

Bucket: 10 tokens
Refill: 1 token per second

Request 1: 9 tokens remaining
Request 2: 8 tokens remaining
...
Request 10: 0 tokens, wait for refill
(1 second passes): 1 token available

Pros: Allows bursts, smooth average rate
Cons: More complex implementation
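A token bucket can also be sketched compactly. Again an illustrative in-memory version (`createTokenBucket` is a name chosen here), with continuous refill based on elapsed time:

```javascript
// Minimal token bucket (illustrative sketch, not production code).
// Short bursts up to `capacity` are allowed, while the long-run rate
// is capped at `refillPerSec` tokens per second.
function createTokenBucket(capacity, refillPerSec) {
  const buckets = new Map() // key -> { tokens, lastRefill }

  return function allow(key, now = Date.now()) {
    const bucket = buckets.get(key) ?? { tokens: capacity, lastRefill: now }

    // Refill for the time elapsed since the last request, capped at capacity
    const elapsedSec = (now - bucket.lastRefill) / 1000
    bucket.tokens = Math.min(capacity, bucket.tokens + elapsedSec * refillPerSec)
    bucket.lastRefill = now
    buckets.set(key, bucket)

    if (bucket.tokens < 1) return false
    bucket.tokens -= 1
    return true
  }
}
```

Refilling lazily on each request, rather than on a timer, keeps the implementation stateless between requests and avoids a background job per key.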

Implementing Rate Limiting

Next.js with Upstash

bash
npm install @upstash/ratelimit @upstash/redis
javascript
// lib/ratelimit.ts
import { Ratelimit } from '@upstash/ratelimit'
import { Redis } from '@upstash/redis'

export const ratelimit = new Ratelimit({
  redis: Redis.fromEnv(),
  limiter: Ratelimit.slidingWindow(10, '10 s'),
  analytics: true,
})

// middleware.ts
import { ratelimit } from '@/lib/ratelimit'
import { NextResponse } from 'next/server'

export async function middleware(request) {
  const ip = request.ip ?? '127.0.0.1'
  const { success, limit, reset, remaining } = await ratelimit.limit(ip)

  if (!success) {
    return new NextResponse('Too Many Requests', {
      status: 429,
      headers: {
        'X-RateLimit-Limit': limit.toString(),
        'X-RateLimit-Remaining': remaining.toString(),
        'X-RateLimit-Reset': reset.toString(),
      },
    })
  }

  return NextResponse.next()
}

export const config = {
  matcher: '/api/:path*',
}

Express with express-rate-limit

javascript
import rateLimit from 'express-rate-limit'

const limiter = rateLimit({
  windowMs: 15 * 60 * 1000, // 15 minutes
  max: 100, // Limit each IP to 100 requests per window
  standardHeaders: true,
  legacyHeaders: false,
  message: { error: 'Too many requests, please try again later.' },
})

app.use('/api/', limiter)

// Stricter limit for auth endpoints
const authLimiter = rateLimit({
  windowMs: 15 * 60 * 1000,
  max: 5, // Only 5 login attempts per 15 minutes
  skipSuccessfulRequests: true, // Don't count successful logins
})

app.use('/api/auth/', authLimiter)

In-Memory (Development/Simple Cases)

javascript
// Simple in-memory rate limiter
const requests = new Map()

function rateLimit(key, limit, windowMs) {
  const now = Date.now()
  const windowStart = now - windowMs

  // Get or initialize request log
  let log = requests.get(key) || []

  // Remove old entries
  log = log.filter(timestamp => timestamp > windowStart)

  // Check limit
  if (log.length >= limit) {
    return { allowed: false, remaining: 0 }
  }

  // Add current request
  log.push(now)
  requests.set(key, log)

  return { allowed: true, remaining: limit - log.length }
}

// Usage
export async function POST(req) {
  const ip = req.headers.get('x-forwarded-for') || 'unknown'
  const { allowed, remaining } = rateLimit(ip, 10, 60000)

  if (!allowed) {
    return Response.json({ error: 'Rate limit exceeded' }, { status: 429 })
  }

  // Process request
}
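One caveat with the in-memory approach: the `requests` Map keeps an entry for every client ever seen, so it grows without bound. A periodic sweep keeps memory in check; `pruneStaleKeys` below is a hypothetical helper sketched for illustration:

```javascript
// Prune keys whose entire request log has aged out of the window.
// Without this, every IP ever seen stays in memory for the process lifetime.
function pruneStaleKeys(requests, windowMs) {
  const cutoff = Date.now() - windowMs
  for (const [key, log] of requests) {
    const fresh = log.filter(timestamp => timestamp > cutoff)
    if (fresh.length === 0) {
      requests.delete(key) // no recent activity: drop the key entirely
    } else {
      requests.set(key, fresh)
    }
  }
}

// Run periodically, e.g. once per window:
// setInterval(() => pruneStaleKeys(requests, 60000), 60000)
```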

Endpoint-Specific Limits

Different endpoints need different limits:

javascript
const limits = {
  // Authentication - very strict
  '/api/auth/login': { requests: 5, window: '15m' },
  '/api/auth/register': { requests: 3, window: '1h' },
  '/api/auth/reset-password': { requests: 3, window: '1h' },

  // Standard API - moderate
  '/api/users': { requests: 100, window: '1m' },
  '/api/posts': { requests: 100, window: '1m' },

  // Expensive operations - strict
  '/api/export': { requests: 5, window: '1h' },
  '/api/search': { requests: 30, window: '1m' },
  '/api/ai/generate': { requests: 10, window: '1m' },

  // Public endpoints - lenient
  '/api/health': { requests: 1000, window: '1m' },
}
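A table like this needs a small lookup step to turn a path into concrete numbers. One possible shape, assuming a hypothetical `limitFor` helper and an illustrative default fallback of 60 requests per minute:

```javascript
// Hypothetical helper: resolve a path to its limit config, falling back
// to a default, and parse window strings like '15m' into milliseconds.
const DEFAULT_LIMIT = { requests: 60, window: '1m' } // illustrative default

function parseWindow(window) {
  const units = { s: 1000, m: 60000, h: 3600000 }
  const match = /^(\d+)([smh])$/.exec(window)
  if (!match) throw new Error(`Unrecognized window: ${window}`)
  return Number(match[1]) * units[match[2]]
}

function limitFor(path, limits) {
  const config = limits[path] ?? DEFAULT_LIMIT
  return { requests: config.requests, windowMs: parseWindow(config.window) }
}
```

With this in place, a middleware can call `limitFor(request.pathname, limits)` and feed the result into whichever limiter implementation you chose above.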

Rate Limiting by User vs IP

IP-Based (Default)

javascript
const identifier = request.ip

Good for: Unauthenticated endpoints, login pages
Problem: Shared IPs (offices, schools) get rate limited together
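One wrinkle with IP-based limiting: behind a proxy or load balancer, the socket address you see is the proxy's, not the client's. A sketch of resolving the client IP from `X-Forwarded-For` (the function name and fallbacks are illustrative; only trust the header when your own proxy sets it, since clients can spoof it):

```javascript
// Sketch: pick a rate-limit key for the client behind a trusted proxy.
function clientIp(headers, socketAddress) {
  const forwarded = headers['x-forwarded-for']
  if (forwarded) {
    // The header is a comma-separated chain; the left-most entry is the
    // original client as reported by the first proxy in the chain.
    return forwarded.split(',')[0].trim()
  }
  // No proxy header: fall back to the direct socket address
  return socketAddress ?? 'unknown'
}
```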

User-Based

javascript
const session = await getServerSession()
const identifier = session?.user.id || request.ip

Good for: Authenticated endpoints, per-user quotas
Problem: Attackers can create multiple accounts

Combined

javascript
// Separate limits for IP and user
const ipLimit = await ratelimit.limit(`ip:${ip}`)
const userLimit = session
  ? await ratelimit.limit(`user:${session.user.id}`)
  : { success: true }

if (!ipLimit.success || !userLimit.success) {
  return new Response('Rate limited', { status: 429 })
}

Response Headers

Always include rate limit headers:

javascript
return new Response(data, {
  headers: {
    'X-RateLimit-Limit': '100',
    'X-RateLimit-Remaining': '95',
    'X-RateLimit-Reset': '1640000000',
    'Retry-After': '60', // Seconds until reset
  },
})

Handling Rate Limit Errors

Client-Side

javascript
async function fetchWithRetry(url, options, maxRetries = 3) {
  for (let i = 0; i < maxRetries; i++) {
    const response = await fetch(url, options)

    if (response.status === 429) {
      // Retry-After arrives as a string (or null); default to 60 seconds
      const retryAfter = Number(response.headers.get('Retry-After')) || 60
      await new Promise(resolve => setTimeout(resolve, retryAfter * 1000))
      continue
    }

    return response
  }

  throw new Error('Rate limit exceeded after retries')
}

Server-Side Error Response

javascript
if (!rateLimitResult.success) {
  return Response.json(
    {
      error: 'Rate limit exceeded',
      message: 'Too many requests. Please try again later.',
      retryAfter: Math.ceil((rateLimitResult.reset - Date.now()) / 1000),
    },
    {
      status: 429,
      headers: {
        'Retry-After': Math.ceil((rateLimitResult.reset - Date.now()) / 1000).toString(),
      },
    }
  )
}

Rate Limiting Checklist

CRITICAL ENDPOINTS
==================
[ ] Login: 5 attempts per 15 minutes
[ ] Registration: 3 per hour
[ ] Password reset: 3 per hour
[ ] 2FA verification: 5 per 15 minutes

STANDARD API
============
[ ] Default limit on all endpoints
[ ] Per-endpoint customization where needed
[ ] Both IP and user-based limits

EXPENSIVE OPERATIONS
====================
[ ] Export/download: Strict limits
[ ] Search: Moderate limits
[ ] AI/LLM calls: Per-user quotas

IMPLEMENTATION
==============
[ ] Rate limit headers in responses
[ ] Proper 429 status code
[ ] Retry-After header
[ ] Client-side retry logic

The Bottom Line

Rate limiting is insurance against abuse. Every API needs it, and AI-generated code almost never includes it.

No rate limiting = unlimited attack attempts. Add limits on day one.
