Many questions arise when we start designing an API, especially if we want to create a REST API and adhere to the REST core principles:
- Client-Server Architecture
- Layered System
- Uniform Interface
One topic in this space that is debated quite often is the nesting of resources, also called sub-resources.
- Why would anyone nest their resources?
- Are they a good idea in the first place?
- Should we nest our resources?
- When should we nest our resources?
- If we nest our resources, what should we keep in mind?
Since this decision can have a considerable impact on many parts of your API, like security, maintainability or changeability, I want to shed some light on this topic in the hope that it helps you make a more informed decision.
First, we will look into the reasons that speak for nested resources. After that, we will talk about the reasons that make nested resources problematic.
Let’s start with the central question: Why should we use a nested resource design approach?
Why should we use this approach:
over this one:
The main reason for this approach is readability; a nested resource URL can convey that one resource belongs to another one. It gives the appearance of a hierarchical relationship, like directories give in file-systems.
These URLs convey less meaning about the relationship:
Than these URLs:
We can directly see that the rating we are requesting belongs to a specific book. In many cases, this can make debugging easier.
I said appearance of hierarchical relationship because the underlying data-model doesn’t have to be hierarchical. For example, on GitHub, a user can have contributed code to multiple repositories, and a repository can have contributions from various users. It’s a many-to-many relationship.
If you only knew about one of the endpoints, it could seem that it was a one-to-many relationship.
Other, more technical reasons are relative IDs and the context a nested resource depends on.
Houses, for example, have house-numbers, but they are local to the streets they belong to. If you know the house has number 42 but you don’t remember the street, it doesn’t help you much.
Another example could be file-names in a file-system. Just knowing that our file is called README.md won’t help if there are hundreds of files with that name in hundreds of different directories.
If we use a relational database, we often have unique keys for all of our data records, but as we see, with other kinds of data-stores, like file-systems, this doesn’t necessarily have to be the case.
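The house-number case can be sketched in a few lines. This is a minimal illustration with made-up data; the street names, the `houses` dictionary, and the `/streets/.../houses/...` URL layout are assumptions, not part of any real API.

```python
# Hypothetical data: a house number is only unique within its street.
houses = {
    ("baker-street", 42): "red brick house",
    ("elm-street", 42): "white cottage",
}


def find_by_number(number: int) -> list[str]:
    """Look up streets by house number alone; several streets may match."""
    return sorted(street for (street, n) in houses if n == number)


def house_url(street: str, number: int) -> str:
    """Build a nested URL that carries the parent context the ID needs."""
    return f"/streets/{street}/houses/{number}"
```

With this data, the number 42 alone matches two streets, while the nested URL identifies exactly one house.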
Nested URLs can also be manipulated rather easily. If a hierarchy is encoded in a URL, we can drop parts of the URL to climb up this hierarchy. This makes APIs with nested resources quite a bit simpler to navigate.
To sum it all up, we want to use nested resources to improve readability and, in turn, developer experience. Sometimes we even have to use them because the data-source doesn’t give us a way to identify a nested resource solely by its ID.
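Climbing the hierarchy by dropping URL parts could look roughly like this. It is a sketch that assumes every path prefix is itself a valid resource; the host `api.example.com` and the book and rating IDs are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def parent_url(url: str) -> str:
    """Drop the last path segment to move one level up the hierarchy."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    parent_path = "/" + "/".join(segments[:-1])
    # Query string and fragment are discarded on purpose.
    return urlunsplit((parts.scheme, parts.netloc, parent_path, "", ""))


# Hypothetical nested URL used for illustration only.
rating_url = "https://api.example.com/books/42/ratings/7"
ratings_url = parent_url(rating_url)   # the collection of ratings for book 42
book_url = parent_url(ratings_url)     # the book itself
```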
Now that we talked about the reasons why we should use nesting, it’s also important to talk about the other side: Why should we not nest our resources?
While nesting is sometimes necessary and can’t be avoided, it is often a choice that comes with specific costs or dangers we should keep in mind.
Let’s look at them one-by-one.
Potentially Long URLs
We learned before that nesting resources could make our URLs more readable, but this isn’t a sure bet.
Especially in rather complex systems with many relationships between the resources, the nested approach can lead to rather long and complicated URLs.
This issue can become even more problematic if we use long strings as IDs:
So when we start to go down this path, we should step back sometimes and check whether we are still accomplishing our goal of improved readability.
A rule of thumb is a maximum nesting depth of two. Sometimes a depth of three is also okay, for example, if our IDs are short and easily readable.
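One way to keep an eye on this rule of thumb is a small check over the route list. The sketch below assumes paths alternate between collection names and IDs (`/collection/id/collection/id`) and counts collections as the depth; both the counting convention and the example paths are assumptions.

```python
def nesting_depth(path: str) -> int:
    """Count the collections in a path of the form /collection/id/collection/id."""
    segments = [s for s in path.split("/") if s]
    # Collections sit at even positions, IDs at odd positions.
    return len(segments[0::2])


def within_rule_of_thumb(path: str, max_depth: int = 2) -> bool:
    """Check a path against a maximum nesting depth (two by default)."""
    return nesting_depth(path) <= max_depth
```

Under this convention, `/books/42/ratings/7` has a depth of two, while `/blogs/1/articles/2/comments/3` has a depth of three and would only pass with `max_depth=3`.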
In general, using nested resources isn’t as flexible as using root resources only.
For example, take a many-to-many relationship: repositories have multiple contributors, but every user can also contribute to various repositories.
If we want to realize this with nested resources, we have to create two endpoints for this relationship alone, one nested under repositories and one nested under users.
If we want to realize this without nesting, we could define one root resource for contributions that also allows filter parameters in its URL.
The parameters are optional, so we could also use it to get all contributions, and we can POST to it to change and create relationships.
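A root resource with optional filters could be sketched like this. The `Contribution` type, the sample data, and the parameter names `user` and `repository` are assumptions chosen for illustration; a real handler would read them from the query string of something like `/contributions?user=alice`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Contribution:
    user: str
    repository: str


# Hypothetical records behind a single /contributions root resource.
CONTRIBUTIONS = [
    Contribution("alice", "api-server"),
    Contribution("alice", "docs"),
    Contribution("bob", "api-server"),
]


def list_contributions(
    user: Optional[str] = None, repository: Optional[str] = None
) -> list[Contribution]:
    """Return contributions; each filter is only applied when it is given."""
    return [
        c
        for c in CONTRIBUTIONS
        if (user is None or c.user == user)
        and (repository is None or c.repository == repository)
    ]
```

Calling `list_contributions()` without arguments returns every contribution, so one function covers both directions of the relationship as well as the unfiltered case.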
While this doesn’t seem to be a problem with one-to-many relationships, in which one part of the relationship can’t have multiple connections, we can still get to a point where we want to search for all records of a nested resource across its parent resources.
So while having this endpoint:
We could still want to get all children of all mothers and create a new endpoint for this.
Redundant endpoints also increase the surface of our API, and while more readable URLs for our resource relationships are a good thing for developer experience, a huge number of endpoints is not.
Multiple endpoints increase the effort for the API owner to document the whole thing and make onboarding for new customers much more troublesome.
Multiple endpoints that return the same representations can also lead to problems with caching and can violate one of the core principles of RESTful API design, cacheability.
This problem can be solved via HTTP redirects, so all representations are returned from a central root resource and can be cached, but there is still code needed to implement this.
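Such a redirect could be sketched as follows. The route pattern, the `/ratings/{id}` root resource, and the choice of status 308 (Permanent Redirect) are assumptions; the sketch also assumes rating IDs are globally unique, otherwise the root URL could not identify the resource.

```python
import re

# Hypothetical nested route: /books/{book_id}/ratings/{rating_id}
NESTED_RATING = re.compile(
    r"^/books/(?P<book_id>[^/]+)/ratings/(?P<rating_id>[^/]+)$"
)


def redirect_to_root(path: str):
    """Return (status, headers) pointing at the root resource, or None."""
    match = NESTED_RATING.match(path)
    if match is None:
        return None  # not a nested rating URL, nothing to redirect
    return 308, {"Location": f"/ratings/{match.group('rating_id')}"}
```

The nested URL stays readable for clients, while the representation is only served, and cached, under the root URL.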
It can also violate another core principle, the Uniform Interface.
When a client holds a representation of a resource, including any metadata attached, it has enough information to modify or delete the resource on the server, provided it has permission to do so.
If the representation doesn’t include information about the nesting and we don’t have root resources to directly access it, we can’t create, update or delete it.
Multiple Database Queries
If we traverse a relationship graph down instead of using one unique identifier (if it exists) to retrieve a representation of a resource, we need to check if the relationship realized in a URL holds true.
Take this example of getting a nested comment:
- Is there a blog with ID X?
- Let’s ask the DB!
- Does our blog with ID X have an article with ID Y?
- Let’s ask the DB!
- Does our article with ID Y have a comment with ID Z?
- Let’s ask the DB!
Getting all comments on all articles of all blogs is also a problem.
- query for all blogs
- query each blog for each of its articles
- query each article for each of their comments
The N+1 query problem hits us hard with this API design.
If we just had a root resource for our comments, we could query it and throw in a few filter parameters if needed. If comments have globally unique IDs, we could query them directly.
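The difference in lookups can be sketched with a hypothetical in-memory store that counts every call as one database query. The `CountingStore` class, its data, and the method names are illustrative assumptions, not a real database API.

```python
class CountingStore:
    """Hypothetical store; every lookup method counts as one query."""

    def __init__(self) -> None:
        self.queries = 0
        self.blogs = {1}
        self.articles = {10: 1}    # article_id -> blog_id
        self.comments = {100: 10}  # comment_id -> article_id

    def blog_exists(self, blog_id: int) -> bool:
        self.queries += 1
        return blog_id in self.blogs

    def article_in_blog(self, article_id: int, blog_id: int) -> bool:
        self.queries += 1
        return self.articles.get(article_id) == blog_id

    def comment_in_article(self, comment_id: int, article_id: int) -> bool:
        self.queries += 1
        return self.comments.get(comment_id) == article_id

    def comment_exists(self, comment_id: int) -> bool:
        self.queries += 1
        return comment_id in self.comments


def nested_comment_found(store, blog_id, article_id, comment_id) -> bool:
    """Validate every level of /blogs/X/articles/Y/comments/Z in turn."""
    return (
        store.blog_exists(blog_id)
        and store.article_in_blog(article_id, blog_id)
        and store.comment_in_article(comment_id, article_id)
    )


def root_comment_found(store, comment_id) -> bool:
    """Look the comment up directly by its globally unique ID."""
    return store.comment_exists(comment_id)
```

For an existing comment, the nested variant runs three lookups and the root variant runs one.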
If we share links to our resources, all data encoded inside the URL is potentially exposed to third parties, even if they don’t have access to request the representation from our API.
URLs will be logged by intermediaries when requesting anything via HTTP on the Internet, so the links don’t even have to be actively shared on social media or the like.
For example, this image link:
If we share it somewhere, people learn that we have a user with a specific name and that they uploaded images on our service.
If the image link was a root resource, no such information would be apparent.
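What a third party can read out of a link can be sketched by splitting the path into collection and ID pairs. The paths used in the usage note, including the user name and image ID, are made up for illustration.

```python
def exposed_context(path: str) -> dict[str, str]:
    """Pair each collection name with its ID, as anyone reading the URL could."""
    segments = [s for s in path.split("/") if s]
    # Assumes the /collection/id/collection/id layout of nested resources.
    return dict(zip(segments[0::2], segments[1::2]))
```

For a hypothetical nested link such as `/users/jane-doe/images/9f3c`, this yields both the user name and the image ID, while the root link `/images/9f3c` only yields the image ID.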
If our relationships change, the URLs they’re encoded into aren’t stable anymore.
Sometimes this can be useful, but more often than not we want to keep our URLs so old links won’t stop working.
For example, this owner-product relationship:
If the product were accessible as a root resource it wouldn’t matter who owns it.
As I mentioned before, if the relationships change rather often, we can also consider treating the relationship itself as a resource.
With this approach, we can change the relationships via one single endpoint but link our other resources directly via their own root resource that isn’t affected by this change.
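The effect on URL stability can be sketched with a relationship kept in its own collection. The `ownerships` mapping, the URL layouts, and `transfer_ownership` as a stand-in for a write to a relationship endpoint are all assumptions made for illustration.

```python
# Hypothetical data: the relationship is stored separately from the product.
ownerships = {"p1": "alice"}  # product_id -> owner_id


def nested_product_url(product_id: str) -> str:
    """Nested design: the current owner is encoded in the URL."""
    return f"/owners/{ownerships[product_id]}/products/{product_id}"


def root_product_url(product_id: str) -> str:
    """Root design: the URL depends on the product ID only."""
    return f"/products/{product_id}"


def transfer_ownership(product_id: str, new_owner: str) -> None:
    """Stand-in for changing the relationship via its own resource."""
    ownerships[product_id] = new_owner
```

After `transfer_ownership("p1", "bob")`, the nested URL changes to `/owners/bob/products/p1`, so old links break, while `/products/p1` keeps working.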
So what is the take-away of all this?
Should we nest our resources or not?
Sometimes it can’t be avoided, because the data-source simply doesn’t give us any other choice, but if we have the choice we should consider all the pros and cons.
If the data is strictly hierarchical, not too deeply nested and the relationships don’t change too often, I would go with nested resources.
The downsides aren’t too big for the wins in developer experience.
If the data is prone to relationship changes or has quite complex relationships to start with, it’s easier to maintain root resources or even to consider completely different approaches like GraphQL.
More endpoints and, as the nesting scenario implies, more complex endpoints mean more code and documentation to write. This is usually not a question of feasibility in terms of skills or know-how, but simply a question of development and maintenance costs. So even if we know how to do it and security or cacheability isn’t much of a concern, we have to ask ourselves if it gives us any competitive advantage.