// HTTP methods · status codes · resource design · versioning · pagination · senior → principal
Good: GET /orders/{id}, POST /orders, DELETE /orders/{id}/items/{itemId}. Bad: POST /getOrder, GET /deleteUser. Use plural nouns for collections (/users, /products). Nest sub-resources only when the child cannot exist without the parent and nesting depth stays ≤ 2 levels; deeper paths become brittle. For actions that don't map cleanly to CRUD, use a sub-resource noun: POST /payments/{id}/refunds rather than POST /refundPayment.
URI versioning (/v1/orders): explicit and easy to route, but pollutes URLs and arguably breaks REST's uniform interface, because the same resource gets a new URI in each version. Header versioning (Accept: application/vnd.myapi.v2+json): clean URLs, but harder to test in a browser. Query param (?version=2): often ignored by proxies and not RESTfully clean. URI versioning wins in practice for public APIs because it's visible and cacheable. Key rule: never break a published version. Additive changes (new fields, new endpoints) are backward-compatible; removing fields or changing types requires a new version.
Offset/Limit (?offset=40&limit=20): simple, supports random access, but unstable when rows are inserted or deleted mid-page (items skipped or duplicated). Page/Size (?page=3&size=20): same trade-offs as offset. Cursor-based (opaque token encoding the last-seen position): stable and efficient for large datasets and infinite scroll, but no random page jump; preferred at scale. Keyset (?after_id=1234): similar to cursor but uses a sortable key; very fast with an index. Include a total count (X-Total-Count header or response envelope) where it is cheap to compute, and next/prev links (HATEOAS) so clients don't hardcode offsets.
Cache-Control (max-age, no-store, no-cache, private/public), ETag (entity tag — server fingerprint of resource), Last-Modified. Conditional requests: If-None-Match: <etag> or If-Modified-Since → 304 if unchanged. GET and HEAD are cacheable by default; POST can be if explicitly declared. PUT/DELETE must invalidate caches. Vary header tells CDNs to cache separately per Accept-Language, Accept-Encoding, etc. ETags enable optimistic concurrency on updates: send If-Match: <etag> with a PUT; server returns 412 Precondition Failed if the resource changed since the client read it.
"_links": {"cancel": {"href": "/orders/42/cancellations", "method": "POST"}}. In practice, few public APIs implement full HATEOAS; most stop at Level 2 of the Richardson Maturity Model (resources + verbs). But including next/prev links in paginated responses and a Location header on 201 are lightweight HATEOAS wins that pay off immediately.
Return 429 Too Many Requests with a Retry-After header. Common informational headers (a de facto convention rather than a formal standard): X-RateLimit-Limit (max requests), X-RateLimit-Remaining, X-RateLimit-Reset (epoch timestamp). Algorithms: Token Bucket (bursty traffic allowed, smooth average), Leaky Bucket (constant output rate), Fixed Window (simple but boundary spike), Sliding Window (smoother, no boundary spike). Apply limits per API key, per user, or per IP depending on the API's trust model.
Using GET for mutations such as GET /deleteUser?id=5 means those actions can be replayed unexpectedly, logged in access logs, and cached. Always use POST/PUT/DELETE for mutations.
Accept an idempotency key (Idempotency-Key: <uuid> header) on non-idempotent operations. The server deduplicates within a time window and returns the original response for duplicate keys.
Anti-pattern: returning 200 OK with {"status": "error", "message": "Not found"}. This breaks HTTP clients, monitoring tools, and caches that rely on status codes. Always return the correct HTTP status. Include a structured error body with a machine-readable code field and a human-readable message; following RFC 9457 (Problem Details, which obsoletes RFC 7807) is good practice.
Omitting Content-Type: application/json on requests with a body causes many frameworks to ignore or misparse the body. On responses, a missing Content-Type prevents correct client parsing. Always set both Content-Type and Accept explicitly. For file uploads use multipart/form-data, not application/json.
Announce the change with a Deprecation header and a Sunset date before removal.

| Method | Semantics |
| --- | --- |
| GET | Retrieve resource. Safe + idempotent. Response cacheable. No body. |
| POST | Create resource or trigger action. Not safe, not idempotent. 201 Created + Location on success. |
| PUT | Full replace. Idempotent. 200 OK or 204 No Content. Client sends complete representation. |
| PATCH | Partial update. Not inherently idempotent. Use JSON Patch (RFC 6902) or Merge Patch (RFC 7396). |
| DELETE | Remove resource. Idempotent. 204 No Content. Repeated deletes: 404 or 204 both acceptable. |
| HEAD | Same as GET but no body. Use to check existence or get headers (Content-Length, ETag) cheaply. |
| OPTIONS | Returns allowed methods. Used for CORS preflight. Response includes Allow and Access-Control-* headers. |

| Status | Meaning |
| --- | --- |
| 200 OK | Generic success for GET, PUT, PATCH with body. |
| 201 Created | POST that created a resource. Must include Location header pointing to new resource. |
| 202 Accepted | Request accepted but processing async. Include a job/status URL in response. |
| 204 No Content | Success with no response body. Common for DELETE and PATCH. |
| 304 Not Modified | Conditional GET hit cache. ETag/Last-Modified still fresh. No body sent. |
| 400 Bad Request | Malformed request syntax, invalid parameters. Include field-level validation errors. |
| 401 Unauthorized | No valid authentication provided. Include WWW-Authenticate header. |
| 403 Forbidden | Authenticated but not authorized. Don't expose resource existence here. |
| 404 Not Found | Resource doesn't exist. Can also use 404 to hide 403 for sensitive resources. |
| 409 Conflict | State conflict — duplicate create, optimistic lock failure, business rule violation. |
| 412 Precondition Failed | If-Match or If-None-Match header condition failed. Used with ETags for optimistic concurrency. |
| 422 Unprocessable Entity | Syntactically valid but semantically invalid (e.g., end date before start date). |
| 429 Too Many Requests | Rate limit exceeded. Include Retry-After and X-RateLimit-* headers. |
| 500 Internal Server Error | Unhandled server error. Never expose stack traces. Log internally, return generic message. |
| 503 Service Unavailable | Temporarily overloaded or down. Include Retry-After. Used during maintenance/deploys. |

| Header | Purpose |
| --- | --- |
| Content-Type | Media type of the body. Request: what I'm sending. Response: what I'm returning. Always set. |
| Accept | Client's preferred response media type. Enables content negotiation. |
| Authorization | Credentials — Bearer |
| ETag | Response fingerprint. Client sends back as If-Match (update) or If-None-Match (cache check). |
| Cache-Control | Caching directives: max-age= |
| Location | URI of newly created resource (201) or redirect target (3xx). |
| Retry-After | Seconds (or HTTP date) until client may retry. Use with 429 and 503. |
| Idempotency-Key | Client-generated UUID for deduplication of non-idempotent requests (payments, emails). |
| X-Request-ID | Correlation ID for distributed tracing. Echo it in response for log correlation. |
| Deprecation / Sunset | RFC 8594: Deprecation: |

| Pagination strategy | Characteristics |
| --- | --- |
| Offset / Limit | Simple. ?offset=40&limit=20. Supports random page jump. Unstable: inserts/deletes skew results. Bad on large offsets (DB scans all preceding rows). |
| Page / Size | Same as offset but user-facing. ?page=3&size=20. Same trade-offs. Avoid for large tables. |
| Cursor-based | Opaque encoded pointer (base64 timestamp+id). Stable across mutations. Fast. No random page jump. Preferred for infinite scroll and event feeds. |
| Keyset | ?after_id=1234. Requires sortable unique key. Very fast with index. Transparent (not opaque). Good for time-series data. |
| Response envelope | Always include: { "data": [...], "pagination": { "next_cursor": "...", "has_more": true } } or use the Link header (RFC 8288, which obsoletes RFC 5988) for HATEOAS alignment. |

| Versioning topic | Guidance |
| --- | --- |
| URI versioning (/v1/) | Most common. Explicit, easy to route, test, and cache. Pollutes URL namespace. Breaking: resource URIs change between versions. Best for public APIs. |
| Header versioning | Accept: application/vnd.api.v2+json or custom X-API-Version header. Clean URLs. Hard to test in browser. Proxy/CDN routing is complex. Best for internal or partner APIs. |
| Query param (?version=2) | Easy to add. Often ignored by proxies. Not cache-friendly. Suitable for simple tools where header control is hard. |
| Additive changes (no bump) | New optional fields in response, new optional request params, new endpoints — always backward-compatible. No version bump needed. |
| Breaking changes | Removing fields, changing types, renaming fields, making optional required. Always bump major version. Maintain old version for deprecation period (6–12 months min). |

| Aspect | REST | GraphQL | gRPC |
| --- | --- | --- | --- |
| Protocol | HTTP/1.1 or HTTP/2 | HTTP/1.1 or HTTP/2 | HTTP/2 only |
| Data format | JSON (usually) | JSON | Protobuf (binary) |
| Schema / contract | OpenAPI (optional) | Strongly typed schema (SDL) | Strongly typed .proto file |
| Query flexibility | Fixed endpoints per resource | Client specifies exact fields | Fixed RPCs per method |
| Over-fetching | Common | Eliminated by design | Minimal (schema-driven) |
| Streaming | SSE / chunked | Subscriptions | Native bi-directional streaming |
| Caching | HTTP cache built-in | Hard (POST by default) | No HTTP caching; app-level only |
| Browser support | Native | Native | Needs grpc-web proxy |
| Error handling | HTTP status codes | 200 + errors[] array | gRPC status codes |
| Best for | Public APIs, CRUD, mobile | Complex frontends, BFF layer | Internal microservices, low latency |
| Tooling maturity | Excellent (Postman, curl) | Good (Apollo, Altair) | Good (grpcurl, BloomRPC) |
increment counter by 1 is a PATCH that isn't idempotent), though most PATCH implementations are.
Choose PUT when: the client always sends the complete resource (e.g., replacing a configuration object). Choose PATCH when: you want to update specific fields without knowing or sending the full resource (e.g., updating a user's email only).
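A minimal sketch of the partial-update case, assuming JSON Merge Patch (RFC 7396) semantics: keys in the patch overlay the target, and a null value removes the field. The `merge_patch` name and the sample user record are illustrative, not from any specific framework.

```python
def merge_patch(target, patch):
    """Apply a JSON Merge Patch (RFC 7396 semantics) and return the result."""
    if not isinstance(patch, dict):
        # A non-object patch replaces the target wholesale.
        return patch
    result = dict(target) if isinstance(target, dict) else {}
    for key, value in patch.items():
        if value is None:
            result.pop(key, None)  # null means "delete this field"
        else:
            result[key] = merge_patch(result.get(key), value)
    return result


# Hypothetical stored resource and a PATCH body that touches two fields only.
user = {"name": "Ada", "email": "ada@old.example", "phone": "555-0100"}
updated = merge_patch(user, {"email": "ada@new.example", "phone": None})
print(updated)
```

Applying the same merge patch twice yields the same document, which is why this format is idempotent in practice even though PATCH in general is not guaranteed to be.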
Two PATCH formats: JSON Merge Patch (RFC 7396) is a simple key-value overlay in which null means delete the field. JSON Patch (RFC 6902) is an operation array (add, remove, replace, move, copy, test) and is more expressive.
401 Unauthorized: no valid authentication was provided. The response includes a WWW-Authenticate header indicating the auth scheme.
403 Forbidden — the client is authenticated (you know who they are) but they don't have permission to access the resource.
Decision tree:
- No/invalid credentials → 401
- Valid credentials, wrong role/scope → 403 (or 404 to hide existence)
- Valid credentials, resource doesn't exist → 404
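The decision tree can be sketched as a single function. This is an illustration only: the `auth_status` name and its boolean parameters are assumptions, and a real service would derive them from its auth middleware and data layer.

```python
def auth_status(credentials_valid, resource_exists, has_permission,
                hide_existence=False):
    """Pick the status code for a request to a protected resource."""
    if not credentials_valid:
        return 401  # caller is unknown: ask them to authenticate
    if not resource_exists:
        return 404
    if not has_permission:
        # For sensitive resources, answer 404 so the caller cannot
        # tell "exists but forbidden" apart from "does not exist".
        return 404 if hide_existence else 403
    return 200


print(auth_status(False, True, True))                       # no credentials
print(auth_status(True, True, False))                       # wrong role/scope
print(auth_status(True, True, False, hide_existence=True))  # hidden resource
```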
Security note: returning 404 instead of 403 for sensitive resources prevents information leakage, because an attacker learns neither that the resource exists nor that they lack permission. Use this pattern for confidential resources.
If POST /payments isn't idempotent, a retry after a timeout creates a duplicate charge.
Making POST idempotent with idempotency keys:
1. Client generates a UUID and sends it as Idempotency-Key: <uuid> header
2. Server stores {key → response} in a fast store (Redis) with a TTL (24 hours)
3. On receipt: if key seen before, return stored response immediately; if not, process and store result atomically
4. Return the same response (including status code) for duplicate requests
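The steps above can be sketched with an in-memory store. This is a simplified stand-in, not a production design: the `IdempotencyStore` class and `charge` handler are hypothetical, a dict replaces Redis, and the check-then-store sequence here is not atomic across threads or processes.

```python
import time
import uuid


class IdempotencyStore:
    """Maps Idempotency-Key -> stored response, with a TTL per entry."""

    def __init__(self, ttl_seconds=24 * 3600, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (expires_at, response)

    def execute(self, key, handler):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # duplicate: replay the original response
        response = handler()
        self._entries[key] = (now + self._ttl, response)
        return response


charges = []


def charge():
    # Hypothetical side effect standing in for a payment processor call.
    charges.append(100)
    return (201, {"charge_id": len(charges)})


store = IdempotencyStore()
key = str(uuid.uuid4())
first = store.execute(key, charge)
retry = store.execute(key, charge)  # e.g. client retried after a timeout
print(first, retry, len(charges))
```

The retry receives the stored status code and body, and the side effect runs once.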
The storage must be atomic: use Redis SET key value NX EX <ttl> to prevent races where two simultaneous requests with the same key both proceed.
Offset/Limit (?offset=40&limit=20):
- Pro: random page access, simple to implement
- Con: unstable (inserts/deletes shift rows between pages), slow at high offsets (DB must scan all preceding rows), inconsistent results for real-time data
Cursor-based pagination:
- Server returns an opaque cursor (base64 of timestamp + last ID)
- Client passes ?cursor=<token> on next request
- Pro: stable, consistent, O(log n) with index, works for infinite scroll
- Con: no random page jump, cursor becomes invalid if data is deleted
Keyset pagination (?after_id=1234&after_created=2024-01-01T12:00:00Z):
- Similar to cursor but uses readable columns; very fast with composite index
- Good for time-ordered data (logs, events)
My recommendation: use cursor-based for anything > 10k rows or with real-time updates. Always return has_more, total count where cheap, and next/prev links.
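A small sketch of cursor-based pagination over a keyset, assuming rows are ordered by (created_at, id). The in-memory `ROWS` list, the helper names, and the JSON-in-base64 cursor encoding are illustrative choices; a real service would push the comparison into an indexed SQL query.

```python
import base64
import json

# Hypothetical data set standing in for a table ordered by (created_at, id).
ROWS = [{"id": i, "created_at": f"2024-01-01T00:00:{i:02d}Z"} for i in range(1, 8)]


def encode_cursor(row):
    raw = json.dumps([row["created_at"], row["id"]]).encode()
    return base64.urlsafe_b64encode(raw).decode()


def decode_cursor(token):
    created_at, row_id = json.loads(base64.urlsafe_b64decode(token))
    return (created_at, row_id)


def fetch_page(cursor=None, limit=3):
    after = decode_cursor(cursor) if cursor else None
    ordered = sorted(ROWS, key=lambda r: (r["created_at"], r["id"]))
    # Keyset condition: strictly after the last-seen (created_at, id) pair.
    remaining = [r for r in ordered
                 if after is None or (r["created_at"], r["id"]) > after]
    page = remaining[:limit]
    has_more = len(remaining) > limit
    return {
        "data": page,
        "next_cursor": encode_cursor(page[-1]) if has_more else None,
        "has_more": has_more,
    }


page1 = fetch_page()
page2 = fetch_page(page1["next_cursor"])
print([r["id"] for r in page1["data"]], [r["id"] for r in page2["data"]])
```

Because the cursor records a position rather than a row count, a row inserted before the cursor does not shift the next page, which is the stability property offset pagination lacks.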
Cache-Control directives set cacheability: max-age=3600 (cache for 1 hour), no-store (never cache), no-cache (cache but always revalidate), private (browser only), public (shared caches ok).
ETag is an opaque fingerprint of the resource (hash of content or version). Server includes it in GET responses. Client stores it.
Conditional requests:
- If-None-Match: <etag>. The client asks "return the body only if changed." The server returns 304 Not Modified (no body, saves bandwidth) or 200 with the new body.
- If-Match: <etag>. The client says "update only if the resource still matches this etag." The server returns 412 Precondition Failed if it was modified by someone else.
ETags for optimistic concurrency:
1. GET /resource → response has ETag: "v42"
2. Client edits, sends PUT /resource with If-Match: "v42"
3. If another writer updated it in between, server returns 412, so there is no lost update
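The conditional-request flow can be sketched as two plain functions. This is an assumption-laden illustration: the ETag is a truncated SHA-256 of the canonical JSON body, the function names are hypothetical, and a missing If-Match is treated as a failed precondition (a real API might answer 428 Precondition Required instead).

```python
import hashlib
import json


def make_etag(resource):
    """Derive a quoted ETag from the resource content."""
    body = json.dumps(resource, sort_keys=True).encode()
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def conditional_get(resource, if_none_match=None):
    etag = make_etag(resource)
    if if_none_match == etag:
        return 304, {"ETag": etag}, None  # client copy is still fresh
    return 200, {"ETag": etag}, resource


def conditional_put(current, new, if_match=None):
    # Assumption: this sketch requires If-Match on every update.
    if if_match != make_etag(current):
        return 412, current  # someone else changed it: reject the write
    return 200, new


doc = {"id": 1, "title": "draft"}
status, headers, _ = conditional_get(doc)
etag = headers["ETag"]

# Another writer updates the resource before our PUT arrives.
doc_after_other_writer = {"id": 1, "title": "edited elsewhere"}
print(conditional_get(doc, if_none_match=etag)[0])
print(conditional_put(doc_after_other_writer, {"id": 1, "title": "mine"}, etag)[0])
```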
```json
{
  "type": "https://api.example.com/errors/validation",
  "title": "Validation Failed",
  "status": 422,
  "detail": "One or more fields failed validation.",
  "instance": "/orders/abc",
  "errors": [
    {"field": "quantity", "message": "must be greater than 0"}
  ]
}
```
Key properties:
- type: URI identifying the error type (machine-readable)
- title: human-readable, stable (don't change per request)
- detail: specific to this request
- instance: URI of the specific occurrence (links to logs)
- Extend with an errors array for validation failures
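A minimal sketch of a helper that assembles such a body, assuming the `application/problem+json` media type defined by the Problem Details RFC. The `problem` function and its return shape (status, headers, body) are hypothetical and not tied to any web framework.

```python
import json


def problem(type_uri, title, status, detail=None, instance=None, **extensions):
    """Build a Problem Details response as (status, headers, json_body)."""
    body = {"type": type_uri, "title": title, "status": status}
    if detail is not None:
        body["detail"] = detail
    if instance is not None:
        body["instance"] = instance
    body.update(extensions)  # e.g. a field-level "errors" array
    headers = {"Content-Type": "application/problem+json"}
    return status, headers, json.dumps(body)


status, headers, body = problem(
    "https://api.example.com/errors/validation",
    "Validation Failed",
    422,
    detail="One or more fields failed validation.",
    instance="/orders/abc",
    errors=[{"field": "quantity", "message": "must be greater than 0"}],
)
print(status, headers["Content-Type"])
print(body)
```

Keeping the HTTP status and the `status` member in one place avoids the two drifting apart.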
Never expose: stack traces, internal class names, SQL errors, server paths. Always log the correlation ID (X-Request-ID) so support can trace the request.
Accept: application/json, application/xml;q=0.8, */*;q=0.5 (q values are quality factors, 0–1, default 1.0)
Server responds with:
- Best matching format and Content-Type: application/json
- 406 Not Acceptable if no match is possible
Language negotiation: Accept-Language: en-US, fr;q=0.8
Encoding: Accept-Encoding: gzip, br (the server can compress the response)
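How a server might pick a representation from an Accept header can be sketched as follows. This is a simplified reading of the negotiation rules: it handles q values and wildcards but ignores other media-type parameters, and the function names are illustrative. `supported` is assumed to be a non-empty list in server preference order.

```python
def parse_accept(header):
    """Return a list of (media_range, q) pairs from an Accept header."""
    ranges = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_range, q = fields[0], 1.0  # q defaults to 1.0
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                q = float(value)
        if media_range:
            ranges.append((media_range, q))
    return ranges


def quality(media_type, ranges):
    """q value of the most specific range that matches media_type."""
    main = media_type.partition("/")[0]
    best_specificity, best_q = -1, 0.0
    for media_range, q in ranges:
        if media_range == media_type:
            specificity = 2
        elif media_range == main + "/*":
            specificity = 1
        elif media_range == "*/*":
            specificity = 0
        else:
            continue
        if specificity > best_specificity:
            best_specificity, best_q = specificity, q
    return best_q


def negotiate(header, supported):
    """Return the best supported type, or None (the caller sends 406)."""
    ranges = parse_accept(header)
    best_q, best = max(((quality(t, ranges), t) for t in supported),
                       key=lambda pair: pair[0])
    return best if best_q > 0 else None


accept = "application/json, application/xml;q=0.8, */*;q=0.5"
print(negotiate(accept, ["application/xml", "application/json"]))
print(negotiate("application/json", ["text/csv"]))
```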
Why it matters: one endpoint serves multiple consumers (JSON for web, XML for legacy SOAP clients, CSV for analytics pipelines) without separate URLs. Also used for API versioning: Accept: application/vnd.myapi.v2+json.
Access-Control-Allow-Origin header.
Preflighted requests (non-simple methods or headers like PUT, DELETE, Authorization, custom headers): the browser sends an OPTIONS request first:
OPTIONS /api/orders
Origin: https://app.example.com
Access-Control-Request-Method: DELETE
Access-Control-Request-Headers: Authorization
Server responds:
Access-Control-Allow-Origin: https://app.example.com
Access-Control-Allow-Methods: GET, POST, PUT, DELETE
Access-Control-Allow-Headers: Authorization, Content-Type
Access-Control-Max-Age: 86400 (cache preflight for 24h)
The browser then sends the actual request.
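The server side of the preflight exchange can be sketched as a function that either returns the response headers or refuses. The allowlists and the `preflight_headers` name are assumptions for illustration; the `Vary: Origin` entry is an addition so shared caches keep per-origin responses apart.

```python
ALLOWED_ORIGINS = {"https://app.example.com"}      # hypothetical allowlist
ALLOWED_METHODS = ["GET", "POST", "PUT", "DELETE"]
ALLOWED_HEADERS = ["Authorization", "Content-Type"]


def preflight_headers(origin, requested_method, requested_headers=""):
    """Return CORS headers for an OPTIONS preflight, or None to refuse."""
    if origin not in ALLOWED_ORIGINS or requested_method not in ALLOWED_METHODS:
        return None
    # Header names are case-insensitive, so compare in lowercase.
    wanted = {h.strip().lower() for h in requested_headers.split(",") if h.strip()}
    if not wanted <= {h.lower() for h in ALLOWED_HEADERS}:
        return None
    return {
        "Access-Control-Allow-Origin": origin,  # echo the origin, not "*"
        "Access-Control-Allow-Methods": ", ".join(ALLOWED_METHODS),
        "Access-Control-Allow-Headers": ", ".join(ALLOWED_HEADERS),
        "Access-Control-Max-Age": "86400",
        "Vary": "Origin",
    }


print(preflight_headers("https://app.example.com", "DELETE", "Authorization"))
print(preflight_headers("https://evil.example", "DELETE", "Authorization"))
```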
Access-Control-Allow-Credentials: true is needed if cookies/auth headers must be sent cross-origin. It cannot be combined with Access-Control-Allow-Origin: *.
Breaking changes require a version bump. Process:
1. Release v2 alongside v1: never replace, always add
2. Announce deprecation on v1: add Deprecation: true and Sunset: Wed, 31 Dec 2025 00:00:00 GMT headers (the Sunset header is defined in RFC 8594) to v1 responses
3. Communicate migration guide: exact diff of what changed, code examples
4. Monitor v1 traffic: identify active callers using API key / request logs
5. Reach out to high-volume callers before sunset
6. Return 410 Gone after sunset date, never 404 (clients need to know the resource existed and was intentionally removed)
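Steps 2 and 6 can be sketched as a small response wrapper. The `v1_response` function and the fixed sunset date are illustrative; the `Deprecation: true` value mirrors the text above, and the Sunset value is formatted as an HTTP date with the standard library.

```python
from datetime import datetime, timezone
from email.utils import format_datetime

# Hypothetical sunset date for the v1 API.
SUNSET = datetime(2025, 12, 31, tzinfo=timezone.utc)


def v1_response(now, status=200):
    """Return (status, extra_headers) for a v1 endpoint at time `now`."""
    if now >= SUNSET:
        return 410, {}  # Gone: intentionally removed after the sunset date
    return status, {
        "Deprecation": "true",
        "Sunset": format_datetime(SUNSET, usegmt=True),
    }


before = v1_response(datetime(2025, 6, 1, tzinfo=timezone.utc))
after = v1_response(datetime(2026, 1, 1, tzinfo=timezone.utc))
print(before)
print(after)
```

Generating the HTTP date from a datetime rather than typing it by hand avoids weekday mismatches in the header value.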
What counts as breaking: removing fields, changing field types, renaming fields, making optional params required, changing auth scheme, changing status codes. Not breaking: adding new optional fields to responses, adding new optional request params, adding new endpoints, loosening validation.
1. Client sends Idempotency-Key: uuid-v4 (per payment attempt, not per session)
2. Server atomically: INSERT INTO idempotency_keys (key, status='PROCESSING') ON CONFLICT DO NOTHING. If there is a conflict, another request with the same key is in-flight or done
3. Execute payment: call payment processor, write to payments table
4. Update idempotency record: {key, status='DONE', response_body, response_status}
5. Return response; subsequent retries get stored response
Handling partial failures:
- Payment processor times out before confirmation: store status='UNCERTAIN'
- Reconciliation job polls processor for final state and updates
- Client retrying gets 202 Accepted with a status check URL until resolved
Audit trail:
- Append-only payment_events table: every state transition (CREATED, AUTHORIZED, CAPTURED, REFUNDED, FAILED)
- Include actor, timestamp, idempotency_key, processor_response on each event
- Never update; read by replaying events or projecting to payments table
Race condition prevention:
- Redis SET key 'PROCESSING' NX EX 30 as distributed lock before DB write
- Expires after 30s so a crashed server doesn't lock forever
ratelimit:{api_key}:{window_start} → count
- Fast (sub-ms), consistent, but Redis is now a bottleneck and single point of failure
- Mitigation: Redis Cluster, read replicas for quota checks
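Token bucket in Redis (atomic Lua script):
```lua
-- Atomically check and decrement a token counter.
-- KEYS[1] = bucket key, ARGV[1] = capacity, ARGV[2] = window in seconds
-- Note: this simplified version only counts down within a TTL window;
-- it does not refill tokens gradually over time.
local key = KEYS[1]
local capacity = tonumber(ARGV[1])
local window = tonumber(ARGV[2])
local tokens = tonumber(redis.call('GET', key)) or capacity
if tokens > 0 then
  redis.call('SET', key, tokens - 1, 'EX', window)
  return 1
else
  return 0
end
```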
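For comparison, a token bucket that refills in proportion to elapsed time can be sketched in plain Python. This is a single-process illustration with a caller-supplied clock; the `TokenBucket` class is hypothetical, and a distributed version would keep the same two fields (last refill time, current tokens) in Redis and update them atomically.

```python
class TokenBucket:
    """Token bucket: bursts up to `capacity`, refills at a steady rate."""

    def __init__(self, capacity, refill_per_second, now=0.0):
        self.capacity = capacity
        self.rate = refill_per_second
        self.tokens = float(capacity)
        self.last_refill = now

    def allow(self, now):
        # Refill proportionally to the time elapsed since the last call.
        elapsed = max(0.0, now - self.last_refill)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last_refill = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


bucket = TokenBucket(capacity=2, refill_per_second=1.0)
burst = [bucket.allow(0.0) for _ in range(3)]   # two allowed, third rejected
later = bucket.allow(1.0)                       # one token refilled after 1s
print(burst, later)
```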
Approximate local + sync:
- Each instance tracks locally; periodically syncs with Redis
- Trades perfect accuracy for resilience; over-allows briefly during sync interval
- Acceptable for most APIs; unacceptable for billing/quota-critical limits
Response headers (always include):
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 743
X-RateLimit-Reset: 1704067200
Retry-After: 57
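How these header values relate to each other can be sketched with a fixed-window counter. The `FixedWindowLimiter` class is an in-memory illustration (a dict in place of Redis, caller-supplied timestamps); in this sketch Retry-After is only attached to 429 responses.

```python
import math


class FixedWindowLimiter:
    """Counts requests per (api_key, window_start) and reports headers."""

    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window = window_seconds
        self.counts = {}

    def check(self, api_key, now):
        window_start = int(now // self.window) * self.window
        reset = window_start + self.window  # epoch second of next window
        bucket = (api_key, window_start)
        count = self.counts.get(bucket, 0)
        allowed = count < self.limit
        if allowed:
            count += 1
            self.counts[bucket] = count
        headers = {
            "X-RateLimit-Limit": str(self.limit),
            "X-RateLimit-Remaining": str(self.limit - count),
            "X-RateLimit-Reset": str(reset),
        }
        if not allowed:
            headers["Retry-After"] = str(math.ceil(reset - now))
        return (200 if allowed else 429), headers


limiter = FixedWindowLimiter(limit=2, window_seconds=60)
now = 1704067143  # 57 seconds before the window resets at 1704067200
results = [limiter.check("key-1", now) for _ in range(3)]
print(results[-1])
```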
Limit dimensions: per API key, per user, per IP, per endpoint, per tier (free/pro/enterprise). Implement as a middleware/filter at the API gateway level, not per service.
- Add Deprecation + Sunset headers to deprecated fields/endpoints
- Remove old fields after sunset date; return 410 for removed endpoints
Contract testing (Pact):
- Consumer-driven: consumers publish their expectations (pacts)
- Provider CI runs against pact broker; build fails if provider breaks a pact
- Catches breaking changes before they reach production
Field-level deprecation in OpenAPI:
```yaml
userId:
  type: integer
  deprecated: true
  description: "Use userUuid instead. Removed after 2026-06-01."
```
user_id param (IDOR prevention)
- Rate limiting per API key + per IP
Input validation:
- Validate all inputs against strict schema (size limits, type, format, regex)
- Reject unknown fields or strip them, which prevents mass assignment
- Limit JSON body size (e.g., 1MB max) to prevent large payload DoS
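These checks can be sketched with the standard library alone. The schema, the field names, and the `validate_update` function are hypothetical; a real service would more likely use a schema library, and the choice of 413/400/422 here is one reasonable mapping rather than the only one.

```python
import json

ALLOWED_FIELDS = {"name": str, "email": str}  # hypothetical update schema
MAX_BODY_BYTES = 1_000_000                    # roughly the 1MB cap above


def validate_update(raw_body):
    """Return (status, errors) for a raw JSON request body (bytes)."""
    if len(raw_body) > MAX_BODY_BYTES:
        return 413, [{"field": None, "message": "body too large"}]
    try:
        payload = json.loads(raw_body)
    except ValueError:
        return 400, [{"field": None, "message": "malformed JSON"}]
    if not isinstance(payload, dict):
        return 400, [{"field": None, "message": "expected a JSON object"}]
    errors = []
    for field, value in payload.items():
        expected = ALLOWED_FIELDS.get(field)
        if expected is None:
            # Rejecting unknown fields blocks mass assignment (e.g. is_admin).
            errors.append({"field": field, "message": "unknown field"})
        elif not isinstance(value, expected):
            errors.append({"field": field,
                           "message": f"must be {expected.__name__}"})
    return (422, errors) if errors else (200, [])


print(validate_update(b'{"email": "a@b.example", "is_admin": true}'))
print(validate_update(b'{"name": "Ada"}'))
```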
Output filtering:
- Never expose internal IDs, stack traces, DB column names in errors
- Use opaque IDs (UUIDs or encoded IDs), because sequential IDs enable enumeration attacks
Audit logging:
- Log every mutating request with auth principal, timestamp, resource, diff
- Store separately from application logs: immutable, tamper-evident
API gateway features: TLS termination, auth, rate limiting, IP allowlisting, request/response logging. Centralise these here rather than per-service.
The core tension: internal APIs optimize for developer experience and velocity; external APIs optimize for stability, discoverability, and trust.
Dual-layer architecture:
- Internal APIs (BFFs and service-to-service): gRPC for performance-critical paths, REST for simpler CRUD. Versioning via Protobuf evolution rules. Deployed continuously. Contracts enforced by Pact/schema registries.
- External APIs (developer platform): REST with OpenAPI spec as the contract. Semantic versioning with long deprecation windows (6–12 months). API key management, usage analytics, developer portal (docs, sandbox, SDKs).
API Gateway as the seam:
- External requests hit the gateway first: auth, rate limiting, request translation
- Gateway routes to internal services; internal callers bypass (or use internal gateway)
- This decouples external API stability from internal service topology
Design governance:
- API design review board: new external APIs reviewed against style guide before release
- OpenAPI-first: spec written before implementation; code generated from spec
- Breaking change policy: documented, enforced, with automated Pact checks in CI
Developer experience investment:
- Interactive docs (Swagger UI / Redoc) generated from spec
- SDKs in top languages auto-generated from OpenAPI spec (openapi-generator)
- Sandbox environment with realistic test data
- Webhook delivery with retry, delivery logs, and replay UI
Metrics that matter for a platform: API adoption rate, SDK version distribution (how many still on deprecated versions), time-to-first-successful-call (DX metric), error rate per consumer, SLA breach rate.
Immediate triage (minutes):
- Check error rate, latency P50/P95/P99, saturation (CPU, connections, DB pool) at each layer: CDN → API gateway → service → DB
- Identify if degradation is global or per-endpoint/consumer
- Pull slow query log; check connection pool exhaustion; check downstream dependency health (DB, cache, third-party APIs)
- Tighten rate limits if specific consumers are causing the spike
Short-term stabilization (hours):
- Scale horizontally if CPU/connection-bound (add instances)
- Enable aggressive caching at gateway for GET endpoints that can tolerate staleness
- Shed load: return 503 with Retry-After for non-critical endpoints
- Circuit break unhealthy downstream dependencies
Root cause categories and fixes:
- N+1 queries on REST responses: add projection endpoints, or GraphQL where clients over-fetch; add read replicas; add response-level caching (Redis)
- Missing indexes on filter params: profile slow queries, add targeted indexes
- Hot partition: if sharding by customer ID and one customer is massive, rethink sharding key or add dedicated capacity
- Thundering herd on cache miss (cache stampede): use mutex/lock-based cache population or probabilistic early expiration (XFetch algorithm)
Architectural response (weeks):
- Read path: CQRS, separating the read model (denormalized, cached) from the write model
- Write path: async where possible (202 + event); offload heavy processing to queues
- Contract with consumers: add pagination where endpoints return unbounded results
- SLA tiering: separate clusters for critical vs. background traffic; bulkhead pattern
{ "data": [...], "next_cursor": "<base64>", "has_more": true }. The web dashboard can use the same cursor API but present page numbers as a UX affordance: map page clicks to cursor hops internally.
?format=csv&from=2024-01-01&to=2024-12-31, streams chunked response with Transfer-Encoding: chunked. Process in keyset batches of 10k rows using (created_at, id) > (last_created_at, last_id). Alternatively, trigger an async export job (202 Accepted + Location: /exports/{jobId}), write to S3, return a presigned download URL when done.
(account_id, created_at DESC, id DESC) covers all common filter patterns. Date range filter with account_id prefix is highly selective. Partial index on transaction_type if type filtering is frequent.
INCR ratelimit:{key}:min:{unix_minute} with EXPIRE 60. Long window: INCR ratelimit:{key}:month:{year_month} with EXPIRE 2678400 (31 days). Lua script combines both checks atomically in a single round-trip. Redis latency is typically < 1ms on the same rack; the 5ms budget is achievable.
(last_refill_time, current_tokens) per key and a Lua script that refills tokens proportional to elapsed time on each request.
X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset (epoch of next window reset), and for 429 responses Retry-After. These headers let well-behaved clients implement client-side throttling and back-off before hitting the limit.