9 API Design Principles That Hold Up in 2026
Most “API design principles” articles still read like they were written in 2015. The fundamentals matter (REST verbs, status codes, predictable URIs), but the bar has moved. In 2026 your API is also being consumed by AI agents through MCP, governed by platform teams across multiple gateways, and benchmarked against the LLM APIs your customers already know.
This is the working list we use with WSO2 API Platform customers when they review an API at the design stage. Nine principles, in the order they matter.
1. Design for the consumer, not the implementation
The most common mistake in API design is letting the database schema bleed through to the API surface. Resources should reflect the concepts the consumer reasons about, not the tables you happen to store them in. A /customers/123/subscriptions endpoint is more useful than /db/v2/cust_subs_join, even if the second one is closer to your data model.
The discipline here is to write three or four example calls before writing any code. If the example call is awkward to write, the API is awkward to use. Stripe famously rewrote its early API after Patrick Collison wrote the first integration himself and discovered that the original shape required four calls to perform what a developer mentally thought of as one transaction. The fix was a single Charges resource that hid the underlying complexity, and that decision still shapes the API a decade later.
The practical test: hand a written example to a developer who has never used your API and ask them what they think happens. If they can answer correctly without consulting docs, the design is doing its job. If they hesitate, the resource names or response shapes need work before you write the first line of server code.
2. Use the conventions developers already know
A well-designed API is boring to read. Developers should be able to predict your URL structure, request shape, and response codes without consulting the docs. The conventions that have stayed conventional:
- Plural nouns for collections, singular for items. /orders for the collection, /orders/42 for one order.
- HTTP verbs do the action; URLs identify resources. POST /orders creates an order. Do not invent /createOrder.
- Standard CRUD mapping (the verbs you will use almost all the time):
| Method | Purpose | Idempotent? | Request body? |
|---|---|---|---|
| GET | Retrieve a resource or list | Yes | No |
| POST | Create a new resource | No | Yes |
| PUT | Replace a resource entirely | Yes | Yes |
| PATCH | Update part of a resource | Sometimes | Yes |
| DELETE | Remove a resource | Yes | No |
- kebab-case in URLs, camelCase or snake_case in JSON. Pick one for JSON and use it everywhere. Mixing styles inside one API is the fastest way to lose developer trust.
When you deviate from a convention you owe the consumer an explanation. When you follow it you owe them nothing.
3. Make endpoints predictable
Predictable endpoints reduce documentation lookups, which is the actual UX of an API.
Nested resources should map to real ownership. /users/123/orders is good when an order belongs to one user. /orders?userId=123 is also good. What is not good is offering both for the same data; pick one.
Avoid more than two levels of nesting. /companies/1/users/2/orders/3/items/4 is a mistake. When you find yourself reaching for a third level, you usually want a top-level resource with a filter parameter instead.
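The two-level rule is easy to check mechanically during a design review. The sketch below is one way to do it, assuming paths alternate between a collection segment and an identifier; the function names and the MAX_DEPTH constant are ours, not part of any standard tool.

```python
MAX_DEPTH = 2  # the two-level limit described above


def nesting_depth(path):
    """Count resource levels in a path such as /users/123/orders.

    Assumes segments alternate collection, identifier, collection, ...
    so each collection (with or without a trailing id) is one level.
    """
    segments = [s for s in path.split("?")[0].split("/") if s]
    return (len(segments) + 1) // 2


def too_deep(paths):
    """Return the paths that exceed MAX_DEPTH, in their original order."""
    return [p for p in paths if nesting_depth(p) > MAX_DEPTH]
```

Running too_deep over every path in a spec gives a short list of candidates to flatten into a top-level resource with a filter parameter.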
4. Make responses self-explanatory
A response should tell the consumer what happened without guesswork. Three layers do that work together:
Status codes are the universal signal. Stick to standard ones: 200 family for success, 400 family for client errors, 500 family for server errors. Custom 6xx codes are a red flag. We have a full HTTP status code reference if you need a refresher.
The response body returns the resource on create and update, and the deletion outcome on delete. For errors, return a machine-readable code, a human-readable message, and (where appropriate) the field that failed validation.
Headers carry the metadata. Include X-Request-Id (or Request-Id) on every response so consumers can quote it back to your support team. Include rate-limit headers (X-RateLimit-Remaining, X-RateLimit-Reset) so consumers can self-throttle.
Together those three layers let a developer debug a problem without opening your dashboard.
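As a rough illustration of the three layers working together, here is a minimal, framework-free sketch. The function name, its parameters, and the example error code are assumptions made for illustration; only the header names and the code/message/field shape come from the text above.

```python
import json
import uuid


def error_response(status, code, message, field=None, remaining=0, reset_epoch=0):
    """Build (status, headers, body) for an error using all three layers."""
    # Layer 2: machine-readable code, human-readable message, optional field.
    error = {"code": code, "message": message}
    if field is not None:
        error["field"] = field
    # Layer 3: metadata the consumer can quote back or use to self-throttle.
    headers = {
        "Content-Type": "application/json",
        "X-Request-Id": str(uuid.uuid4()),
        "X-RateLimit-Remaining": str(remaining),
        "X-RateLimit-Reset": str(reset_epoch),
    }
    # Layer 1: the status code itself is returned alongside.
    return status, headers, json.dumps({"error": error})
```

A validation failure might then be produced with error_response(422, "invalid_email", "Email address is malformed.", field="email").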
5. Build security in at the design phase
Security designed in is cheap; security retrofitted is not. The minimums in 2026:
- Authentication. OAuth 2.0 or OIDC for user-facing flows, API keys or mTLS for server-to-server, JWT for short-lived access tokens. Never invent your own scheme.
- Transport encryption. TLS 1.2 or higher on every endpoint, including ones you think are internal. mTLS for partner integrations.
- Sensitive data handling. Never put PII or secrets in URLs, since they leak to logs and Referer headers. Encrypt log payloads at rest. Mask sensitive fields in any observability tooling you use.
- Scopes and granularity. Issue tokens with the smallest scope that gets the job done. “Read all customer data” should be a different scope from “read my own customer record.”
Most production breaches we see at the platform layer come from missed basics, not novel attacks. The OWASP API Security Top 10 (last revised in 2023) reflects this directly: the top three risks are broken object-level authorization, broken authentication, and broken object property-level authorization. None of those require an exotic exploit; they require a single endpoint where the implementing engineer forgot to check that the requesting user owns the resource being modified. The fix at the design layer is to make object-level authorization an explicit requirement in your API contract, not an afterthought added during code review.
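To make the object-level check concrete, here is a minimal sketch of the ownership test that a contract-level requirement would translate into. The Order type, the in-memory ORDERS store, and the Forbidden exception are all hypothetical stand-ins for whatever your service actually uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Order:
    id: int
    owner_id: int
    total_cents: int


class Forbidden(Exception):
    """Would map to HTTP 403 (or 404 if you prefer not to reveal existence)."""


# Hypothetical in-memory store standing in for a real database.
ORDERS = {42: Order(id=42, owner_id=7, total_cents=1999)}


def get_order(order_id, requesting_user_id):
    """Fetch an order only if the requester owns it."""
    order = ORDERS.get(order_id)
    if order is None:
        raise KeyError(order_id)
    # The object-level authorization check: being authenticated is not enough,
    # the caller must own this specific object.
    if order.owner_id != requesting_user_id:
        raise Forbidden(f"user {requesting_user_id} does not own order {order_id}")
    return order
```

The point of writing it into the contract is that every handler touching an order goes through a check like this one, rather than each engineer remembering to add it.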
6. Plan for rate limits and quotas before launch
Rate limiting is part of the API contract. Pick the algorithm before you ship, not after the first incident. The four patterns worth knowing:
- Fixed window: N requests per minute, counter resets. Simple, but allows burst at window boundaries.
- Sliding window: N requests over the last 60 seconds, rolling. Smoother, more memory.
- Token bucket: fill a bucket at rate R, each request takes a token. The de-facto default for public APIs.
- Leaky bucket: requests queue and drain at rate R. Useful when downstream is the bottleneck.
Whichever you choose, do three things: return 429 Too Many Requests (not 503), include Retry-After in the response header, and surface usage to the customer in real time so they can self-correct. Thresholds that look right on paper rarely hold up after launch, so plan to revisit them after the first month of real traffic.
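For readers who have not implemented one, a token bucket fits in a few lines. This is a simplified single-process sketch, not a production limiter: the class name is ours, the clock is passed in explicitly so the behavior is easy to test, and Retry-After is rounded up to whole seconds as an assumption.

```python
import math


class TokenBucket:
    """Refill at `rate` tokens per second up to `capacity`; one token per request."""

    def __init__(self, rate, capacity, now=0.0):
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)
        self.updated = now

    def allow(self, now):
        """Return (status, headers) for a request arriving at time `now` (seconds)."""
        # Add the tokens earned since the last request, capped at capacity.
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return 200, {}
        # Not enough tokens: tell the client how long until one is available.
        wait_seconds = math.ceil((1 - self.tokens) / self.rate)
        return 429, {"Retry-After": str(wait_seconds)}
```

In a real deployment the bucket state would live in a shared store keyed by whatever quota dimension you pick, and the clock would come from time.monotonic().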
Quota design is a related decision that often gets skipped. Pick the dimension before you ship: requests per minute, requests per day, requests per resource type, requests by token cost (relevant for AI APIs where one call can cost 100x another). Stripe rate-limits by integration; OpenAI by token throughput; GitHub by authenticated user, falling back to IP address for unauthenticated requests. The right dimension depends on what the cost of a single call actually is on your infrastructure, and that varies enough by API that there is no single “correct” answer.
7. Version deliberately
APIs evolve. The question is whether the evolution is visible to your consumers. Three versioning approaches are in common use:
- URI versioning (/v1/orders, /v2/orders): easy to implement, easy to route, easy to deprecate. Most public APIs default here.
- Header versioning (Accept: application/vnd.company.v1+json): cleaner URLs, harder for the consumer to debug.
- Query string (/orders?version=1): simple but blurs the line between version and filter parameter. Avoid for production.
The actual discipline is not picking an approach; it is having a deprecation policy. State, in writing, how much notice you give before removing a version. A common baseline is twelve months for paid public APIs, six months for free public APIs, and ninety days for internal APIs.
Communicate deprecation in the response itself, not just in a changelog. Two IETF specs define complementary HTTP response headers for this purpose: RFC 8594 defines Sunset (the date the endpoint goes away) and RFC 9745 defines Deprecation (flags that the endpoint is deprecated as of a given date). Modern SDKs and well-behaved HTTP clients warn developers automatically when they see these headers, so the deprecation message reaches the consumer’s terminal before it reaches their inbox.
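The two headers use different date formats, which is easy to get wrong. The sketch below shows one way to build them, assuming our reading of the RFCs is right: Deprecation carries a Structured Fields date ("@" followed by Unix seconds) and Sunset carries a classic HTTP-date. The function name and the example dates are illustrative.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def deprecation_headers(deprecated_at, sunset_at):
    """Build Deprecation and Sunset headers from two timezone-aware UTC datetimes."""
    return {
        # RFC 9745: Structured Fields date, "@" plus seconds since the Unix epoch.
        "Deprecation": f"@{int(deprecated_at.timestamp())}",
        # RFC 8594: HTTP-date, e.g. "Fri, 01 Jan 2027 00:00:00 GMT".
        "Sunset": format_datetime(sunset_at, usegmt=True),
    }


headers = deprecation_headers(
    deprecated_at=datetime(2026, 1, 1, tzinfo=timezone.utc),
    sunset_at=datetime(2027, 1, 1, tzinfo=timezone.utc),
)
```

Attaching these to every response from a deprecated version means the twelve-month clock is visible in the traffic itself, not only in the changelog.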
8. Document as part of the design, not after
The shortest path to a well-documented API is to make the documentation the source of truth, not the output. An OpenAPI-first workflow (write the spec, generate the server stub from it, generate the docs from it) keeps the three artifacts in sync by construction.
This is also where the WSO2 API Platform earns its keep at enterprise scale. WSO2 API Manager applies governance policies against the OpenAPI spec automatically (naming rules, required fields, security minimums), so design-time decisions get enforced before code is merged.
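To give a feel for what a spec-level governance check does, here is a deliberately tiny lint over an OpenAPI document loaded as a dictionary. It is an illustration of the idea only, not how WSO2 API Manager implements its policies; the function name and the choice of required fields are assumptions.

```python
HTTP_METHODS = {"get", "post", "put", "patch", "delete"}
REQUIRED_FIELDS = ("summary", "description", "operationId")  # assumed policy


def lint_operations(spec):
    """Return one message per operation field that is missing or empty."""
    problems = []
    for path, path_item in spec.get("paths", {}).items():
        for method, operation in path_item.items():
            # Path items can also hold non-operation keys such as "parameters".
            if method not in HTTP_METHODS:
                continue
            for field in REQUIRED_FIELDS:
                if not operation.get(field):
                    problems.append(f"{method.upper()} {path}: missing {field}")
    return problems
```

Run in CI against the spec, a non-empty result fails the build, which is what moves a naming rule from a wiki page to an enforced decision.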
What good docs include at minimum:
- Authentication walkthrough with a working curl example
- Each endpoint with request shape, response shape, status codes, and at least one realistic example
- Error catalog keyed by error code
- Changelog and deprecation calendar: every version, what changed, when the old version goes away
- SDKs or Postman collection, generated from the spec, kept in sync
Docs that drift from the API are worse than no docs at all. A consumer who once tried to use an outdated example and got a cryptic error is unlikely to trust the docs again, which means every subsequent call becomes a support ticket. The cost of stale documentation is measured in developer-hours per integration, and at any meaningful scale it dwarfs the cost of keeping the spec-to-docs pipeline maintained.
9. Design for observability and AI-agent consumption
This is the principle that did not exist in 2020 and that matters most in 2026.
Observability-by-design. Decide at the API design stage what you will measure in production:
- Which fields are interesting for cohort analysis? Mark them in the spec.
- What is the “successful business event” (order created, message sent, payment captured)?
- Which fields contain PII and need to be masked in any logging pipeline?
When those decisions are made at design time, your observability stack (in our case, Moesif API monitoring) can wire up automatically. When they are made later, you end up doing a separate instrumentation project six months after launch.
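The PII decision in particular is cheap to act on once the fields are named. A minimal sketch, assuming the spec review produced a set of sensitive field names (the set below is hypothetical, and this is generic Python rather than any monitoring vendor's API):

```python
PII_FIELDS = {"email", "card_number"}  # hypothetical output of the design review


def mask_for_logging(payload, pii_fields=PII_FIELDS):
    """Return a copy of payload with PII values replaced, recursing into nested objects."""
    masked = {}
    for key, value in payload.items():
        if key in pii_fields:
            masked[key] = "***"
        elif isinstance(value, dict):
            masked[key] = mask_for_logging(value, pii_fields)
        else:
            masked[key] = value
    return masked
```

Because the function returns a copy, the original request body is untouched and only the version handed to the logging pipeline is masked. Lists of objects are not handled here and would need the same treatment.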
Agent-consumable shapes. A meaningful share of API traffic now comes from AI agents through MCP. Two design choices make your API agent-friendly without code changes:
- Self-describing operations. Your OpenAPI spec should have a clear summary, description, and operationId for every endpoint. These become the natural-language interface the agent uses to choose your endpoint over a competitor’s. Treat them as user-facing copy, not as internal documentation; vague descriptions like “manages payment objects” cause wrong tool selection in much the same way that a poorly named UI button confuses a human user.
- Idempotency where it matters. Agents retry. POST endpoints that should be idempotent (create-or-update) should accept an Idempotency-Key header and return the same response on retry. Stripe’s API documents this header explicitly, and the same pattern is widely used across payments, infrastructure, and fintech APIs; using a different name (X-Request-Id, X-Client-Token) breaks compatibility with client libraries that expect the convention. The IETF also has a draft “The Idempotency-Key HTTP Header Field” working toward standardization.
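The server side of the idempotency pattern can be sketched as a lookup keyed by the header value. This is a minimal in-memory illustration with hypothetical names (OrderService, create_order); a real implementation would also expire keys and reject a reused key that arrives with a different payload.

```python
class OrderService:
    def __init__(self):
        self._orders = []
        self._responses = {}  # Idempotency-Key value -> stored response

    def create_order(self, headers, payload):
        """Handle POST /orders, replaying the stored response on a retry."""
        key = headers.get("Idempotency-Key")
        if key is not None and key in self._responses:
            # Retry of a request we already processed: no second order is created.
            return self._responses[key]
        order = {"id": len(self._orders) + 1, **payload}
        self._orders.append(order)
        response = (201, order)
        if key is not None:
            self._responses[key] = response
        return response

    def order_count(self):
        return len(self._orders)
```

An agent that times out and retries with the same key gets the original 201 and the original order id back, which is exactly the behavior a retry loop needs.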
The WSO2 AI Gateway converts any REST API into an MCP-compatible server, so the better your spec, the better the agent experience without a second build to maintain.
Common design patterns these principles produce
The nine principles above describe the what. The patterns below show how they compose in production APIs, and which decisions repeatedly come up because the principles do not pin them down by themselves.
Pagination on every list endpoint. A GET that returns “all orders” works in development with 50 rows and falls over in production at 50,000. Pick between offset/limit (?limit=50&offset=100) and cursor-based (?cursor=abc123) pagination upfront, and apply the same shape to every list endpoint. Cursor pagination survives concurrent writes; offset pagination is simpler but degrades at high offsets. Cap the maximum page size so a client cannot request ?limit=10000.
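A cursor is usually just an opaque encoding of "where the last page ended". The sketch below assumes orders are sorted by an ascending integer id and uses a base64-wrapped JSON cursor; the function names, the MAX_LIMIT value, and the response keys (data, next_cursor) are illustrative choices, not a standard.

```python
import base64
import json

MAX_LIMIT = 100  # assumed cap; pick one that fits your payload size


def encode_cursor(last_id):
    return base64.urlsafe_b64encode(json.dumps({"after": last_id}).encode()).decode()


def decode_cursor(cursor):
    return json.loads(base64.urlsafe_b64decode(cursor.encode()))["after"]


def list_orders(orders, limit=50, cursor=None):
    """Return one page of orders plus a cursor for the next page, if any."""
    limit = max(1, min(limit, MAX_LIMIT))  # clamp instead of trusting the client
    after = decode_cursor(cursor) if cursor else 0
    page = [o for o in orders if o["id"] > after][:limit]
    has_more = bool(page) and any(o["id"] > page[-1]["id"] for o in orders)
    return {"data": page, "next_cursor": encode_cursor(page[-1]["id"]) if has_more else None}
```

Because the cursor names the last id seen rather than a row offset, rows inserted or deleted between requests do not shift the next page.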
Filtering, sorting, and field selection via query parameters. ?status=paid&customer_id=42 for filtering. ?sort=created_at or ?sort=-created_at for sorting. ?fields=id,name,email for field selection. The shape compounds across the API; once consumers learn it on one endpoint, they assume it everywhere. Inconsistent filter naming (status here, state there) is one of the most common consistency violations in API reviews.
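One way to keep the shape identical on every endpoint is to parse it in a single shared helper. The sketch below does that with the standard library; the helper name, the RESERVED set, and the returned keys are assumptions for illustration.

```python
from urllib.parse import parse_qs

# Parameter names with special meaning; everything else is treated as a filter.
RESERVED = {"sort", "fields", "limit", "offset", "cursor"}


def parse_list_query(query):
    """Split a query string into filters, sort order, and field selection."""
    # parse_qs returns lists; keep the last value if a parameter repeats.
    params = {key: values[-1] for key, values in parse_qs(query).items()}
    sort = params.get("sort")
    return {
        "filters": {k: v for k, v in params.items() if k not in RESERVED},
        "sort_by": sort.lstrip("-") if sort else None,
        "descending": bool(sort and sort.startswith("-")),
        "fields": params["fields"].split(",") if "fields" in params else None,
    }
```

For example, parse_list_query("status=paid&sort=-created_at") yields a status filter and a descending sort on created_at.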
A consistent response envelope for both success and failure. Successful responses return the resource (for GET) or the created/updated resource (for POST/PUT/PATCH). Error responses return {"error": {"code": "...", "message": "...", "field": "..."}} with the same shape across every endpoint. Consumers parse error.code to handle errors programmatically; the message is for humans and may change.
Content negotiation that matches what clients send. Read Accept and Content-Type headers and respond accordingly. Accept: application/json is the common case; Accept: text/csv for endpoints that support CSV export. Return 406 Not Acceptable if you cannot satisfy the client’s Accept header rather than silently returning a different format.
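A simplified negotiation routine looks like the sketch below. It honors q-values and the */* wildcard but, as a stated simplification, ignores partial wildcards such as text/*; the SUPPORTED tuple and the function name are hypothetical.

```python
SUPPORTED = ("application/json", "text/csv")  # first entry is the default


def negotiate(accept_header):
    """Return (status, media_type) for the given Accept header value."""
    if not accept_header:
        return 200, SUPPORTED[0]
    candidates = []
    for part in accept_header.split(","):
        media, _, params = part.strip().partition(";")
        quality = 1.0
        for param in params.split(";"):
            name, _, value = param.strip().partition("=")
            if name == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        candidates.append((quality, media.strip().lower()))
    # Highest preference first; sorted() is stable, so header order breaks ties.
    for quality, media in sorted(candidates, key=lambda c: -c[0]):
        if quality <= 0:
            continue
        if media == "*/*":
            return 200, SUPPORTED[0]
        if media in SUPPORTED:
            return 200, media
    return 406, None
```

The important branch is the last line: an unsatisfiable Accept header produces an explicit 406 instead of a silent fallback to JSON.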
Resource representations that are stable across operations. The shape returned by GET /orders/{id} should match the shape of item entries in GET /orders. The shape returned by POST /orders (creation response) should be the same as the GET shape. Inconsistent shapes for the same resource across operations are a permanent paper cut for SDK authors and a frequent source of agent confusion.
Webhook + polling as a pair for asynchronous results. For long-running operations, return 202 Accepted with a job ID and a status URL. Let consumers either poll the status URL or subscribe to a webhook to be notified when the job completes. The pattern scales better than holding HTTP connections open for minutes.
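The moving parts of this pattern are small. The sketch below models them with an in-memory job table; the /jobs/... status URL, the status values, and the function names are all assumptions, and the webhook is represented by a plain callback rather than a real HTTP delivery.

```python
import uuid

JOBS = {}  # hypothetical in-memory job table


def start_export():
    """Accept the work and return 202 with a pointer to the job status."""
    job_id = str(uuid.uuid4())
    JOBS[job_id] = {"status": "pending", "result_url": None}
    status_url = f"/jobs/{job_id}"
    return 202, {"Location": status_url}, {"job_id": job_id, "status_url": status_url}


def get_job(job_id):
    """Polling endpoint: report the current state of a job."""
    job = JOBS.get(job_id)
    return (404, None) if job is None else (200, job)


def complete_job(job_id, result_url, notify=None):
    """Called by the worker when done; `notify` stands in for a webhook delivery."""
    JOBS[job_id] = {"status": "succeeded", "result_url": result_url}
    if notify is not None:
        notify({"job_id": job_id, **JOBS[job_id]})
```

Consumers that cannot receive webhooks poll get_job; consumers that can simply wait for the notification, and both see the same job record.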
Bulk operations for chatty workflows. When consumers regularly need to make N small calls (especially agent runtimes looping over collections), expose a batch endpoint that accepts a list of operations in a single request. This saves network round-trips and reduces per-call overhead.
Caching headers that match the resource’s actual mutability. Cache-Control: max-age=300 on GET endpoints whose responses do not change every second. Cache-Control: no-store on responses containing sensitive data. ETag on resources where conditional requests (If-None-Match) would meaningfully reduce bandwidth. The defaults shipped by most frameworks are wrong for both extremes (over-caching sensitive data, under-caching cacheable data).
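Conditional requests are simpler than they sound. The sketch below derives an ETag from the serialized resource and answers 304 when the client already holds the current version. It is a simplification: the hashing scheme and function names are our own, and it compares a single strong ETag rather than handling lists, weak validators, or the * wildcard.

```python
import hashlib
import json


def etag_for(resource):
    """Derive a quoted ETag from a stable serialization of the resource."""
    body = json.dumps(resource, sort_keys=True).encode()
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def get_with_etag(resource, if_none_match=None):
    """Return (status, headers, body), skipping the body if the client is current."""
    etag = etag_for(resource)
    headers = {"ETag": etag, "Cache-Control": "max-age=300"}
    if if_none_match == etag:
        return 304, headers, None
    return 200, headers, resource
```

The client stores the ETag from the first response and sends it back in If-None-Match; as long as the resource is unchanged, the server replies with headers only.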
These patterns are not separate principles; they are what the principles produce when applied. Reading them in this concrete form is often the fastest way to spot which principles your API is following only partially.
Where to take this next
The fastest way to put these principles into production is to make them enforceable. Write your OpenAPI spec first, set up automated governance against it, and instrument every call from day one so you can see whether your design is actually serving your consumers.
Start a 14-day Moesif free trial to get per-user, per-endpoint observability on the API you are designing. No credit card required. To see how design, governance, runtime, and monetization fit together in one architecture, walk through the WSO2 API Platform with our team.
Frequently asked questions
What are the principles of API design? Design for the consumer, follow REST conventions, keep endpoints predictable, make responses self-explanatory, bake in security, plan rate limits, version deliberately, document as you design, and design for observability and AI-agent consumption.
What are the 7 core API design principles? The most common short lists keep seven REST fundamentals: consumer-first, conventions, predictability, self-explanatory responses, security, rate limiting, and versioning. They leave out documentation-as-design and the combined observability and AI-agent principle, both of which are now table stakes for any API designed in 2026.
What are the 4 pillars of REST API? Stateless interactions, a uniform interface (consistent verbs and resources), cacheability where appropriate, and a client-server separation of concerns. These are four of the six constraints Roy Fielding originally defined (the other two are a layered system and optional code on demand), and they still hold.
Where do API design and API governance meet? Design says what the API should look like. Governance enforces it across an organization at scale. Platforms like WSO2 API Manager turn design principles into automated checks on every spec.