Designing APIs that Enable Scalable Frontends

December 21, 2017

This blog post was originally written for and posted on Rangle’s blog. You can find a link to the original here.

Successful web development depends on clear communication between backend servers and frontend applications. Consumers of an API need to easily understand how to use it to develop features and improve the application. REST (REpresentational State Transfer) has historically been the paradigm for this communication.

REST describes how the frontend system communicates with the backend server. It gives the frontend a simple interface for accessing data and executing server-side functions. This interface is built on a commonly agreed-upon set of methods that are part of the HTTP requests every website uses. It’s the established standard across applications developed in recent years.

However, even with its legacy of success, REST implementations still present some challenges:

The frontend can receive too much data that it isn’t using. This is a waste of bandwidth for the consumer of the application as well as a large memory overhead as any data sent will be parsed into a JavaScript object.
Multiple requests may be required to gather all of the data to render a view on the client side. These requests can result in long load times and frame drops as large chunks of data are loaded incrementally.
Backend systems will also be querying or requesting unimportant data that they then serve up. This adds computation time and unnecessary load across the backend system all the way down to the database.

These problems and others can compound to hamper a product and cripple its ability to scale to meet production demand. It can also be difficult to optimize for end users because of the breadth of issues across both the frontend and backend systems. We’ll highlight several design patterns that have emerged to improve the quality of both ends of the application and deliver features to market faster. In particular, we’ll be looking to GraphQL as a potential solution.

Custom Endpoints

The first pattern is really a brute-force approach to developing endpoints. Determine exactly what data a given request needs and return a response tailored to it. For example, /mail/getMail?id=1 might be an endpoint in the API that returns all of the mail a user has received.

The upside of this approach is that each endpoint is easy to optimize because it does only one thing. However, the downside becomes more visible when we start to add more endpoints to get specific information about a user’s mail. We might have a new scenario where we don’t want to include the sender information in the payload. So we might create a new endpoint to handle that:

/mail/getMailWithoutSenderDetails?id=1

Now we’ve started down a dangerous path where our API becomes cluttered with endpoints. These endpoints add up, and maintaining all of them becomes slower and more tedious. Without the right testing, it will become harder over time to determine the purpose of each endpoint and why it was created.
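To make the clutter concrete, here is a minimal sketch of what the two endpoints above might look like on the server. Everything in it is an assumption for illustration: the in-memory `MAIL_BY_USER` store, the handler names, the `ROUTES` table, and the reading of `id` as a user id.

```python
from typing import Any

# Hypothetical in-memory store standing in for a database, keyed by user id.
MAIL_BY_USER: dict[int, list[dict[str, Any]]] = {
    1: [
        {
            "id": 101,
            "subject": "Welcome",
            "timestamp": "2017-12-21T09:00:00Z",
            "sender": {"id": 7, "name": "Ada", "email": "ada@example.com"},
        }
    ]
}


def get_mail(user_id: int) -> list[dict[str, Any]]:
    """Hypothetical handler for /mail/getMail?id=..."""
    return [dict(mail) for mail in MAIL_BY_USER[user_id]]


def get_mail_without_sender_details(user_id: int) -> list[dict[str, Any]]:
    """Hypothetical handler for /mail/getMailWithoutSenderDetails?id=..."""
    return [
        {key: value for key, value in mail.items() if key != "sender"}
        for mail in MAIL_BY_USER[user_id]
    ]


# Each new variation of the payload adds another route and another handler.
ROUTES = {
    "/mail/getMail": get_mail,
    "/mail/getMailWithoutSenderDetails": get_mail_without_sender_details,
}
```

The two handlers differ by a single field, yet each one needs its own route, tests, and documentation.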

Flagging Data Fields

Okay, so custom endpoints are out. What about getting only specific data fields by flagging the inclusion of them within our request? We’ll be able to accomplish the same task of not receiving sender information by adding a flag to our query parameters:

/mail/getMail?id=1&includeSenderDetail=false

Now the API server is able to read a flag and filter out sender details when the flag is false. This reduces clutter in our API and also allows us to optimize our endpoint’s response size dynamically per request. Later on, the frontend may need to filter out more data to trim this dataset further. However, when we repeat this pattern on our endpoint, we begin to see some issues with this approach.

Adding more of these flags complicates our endpoints, and API consumers will eventually need to review every flag to ensure that they are getting the data they want from the endpoint. These flags can also break existing API contracts: the default for every new flag must return the same data as before the flag was added, or existing consumers will see their responses change. Finally, these flags are not specific enough to filter the data coming back from a large request. If a request has 65 fields and only 5 are needed for a feature, we might end up needing upwards of 60 of these flags depending on how flexible we want to make our API.
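A rough sketch of the flag approach, under the same assumptions as before: the `MAIL_BY_USER` store and the `handle_get_mail` function are hypothetical, and only the `includeSenderDetail` parameter comes from the URL above. Note how the flag has to default to true so that callers who never send it keep getting the payload they already rely on.

```python
from typing import Any
from urllib.parse import parse_qs, urlsplit

# Hypothetical in-memory store standing in for a database, keyed by user id.
MAIL_BY_USER: dict[int, list[dict[str, Any]]] = {
    1: [
        {
            "id": 101,
            "subject": "Welcome",
            "timestamp": "2017-12-21T09:00:00Z",
            "sender": {"id": 7, "name": "Ada"},
        }
    ]
}


def handle_get_mail(url: str) -> list[dict[str, Any]]:
    """Hypothetical handler for /mail/getMail with an optional flag."""
    params = parse_qs(urlsplit(url).query)
    user_id = int(params["id"][0])

    # Default to including sender details so that requests written before
    # the flag existed still receive the same response.
    flag = params.get("includeSenderDetail", ["true"])[0]
    include_sender = flag.lower() != "false"

    result = []
    for mail in MAIL_BY_USER[user_id]:
        item = dict(mail)
        if not include_sender:
            item.pop("sender", None)
        result.append(item)
    return result
```

Every additional flag adds another branch like the one above, which is where the approach starts to strain.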

Custom Enum to Specify Data Fields

What if there was a way that we could group multiple flags together into a common list of data fields that the client wants to receive? Let’s iterate on the previous approach and try this style of pruning down our response. For example, if we wanted to get the first three data fields from each mail object coming back in our response we’d specify:

/mail/getMail/1/?datafields=1&datafields=2&datafields=3

This is an excellent option for our frontend use case because we can specifically request just the data fields that our application is going to use, which is great for bandwidth and memory usage. It may also speed up our requests on the backend, as our server will be able to filter queries for data down to just the specific data fields we’ve requested and spend less time sending that data to the application. Also, if any of these data fields that were filtered out required computation, we can completely eliminate that overhead.

The predominant downside of this approach is the need to create a new enum that must be maintained for each endpoint. The API will also need to serve this enum up, as unfortunately the enum is not very readable to someone working on the frontend. They’ll need to request the enum to answer the question: “What does datafields=1 correspond to?”
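Here is one way the enum approach could be sketched. The numbering in `MailField` is invented for illustration, since nothing above says which number maps to which field, and the helper names are hypothetical as well.

```python
from enum import IntEnum
from typing import Any
from urllib.parse import parse_qs, urlsplit


class MailField(IntEnum):
    # Hypothetical numbering; the real mapping would be defined by the API.
    ID = 1
    SUBJECT = 2
    TIMESTAMP = 3
    SENDER = 4


SAMPLE_MAIL: dict[str, Any] = {
    "id": 101,
    "subject": "Welcome",
    "timestamp": "2017-12-21T09:00:00Z",
    "sender": {"id": 7, "name": "Ada"},
}


def requested_fields(url: str) -> list[MailField]:
    """Read the repeated datafields parameter into enum members."""
    raw = parse_qs(urlsplit(url).query).get("datafields", [])
    return [MailField(int(value)) for value in raw]


def prune(mail: dict[str, Any], fields: list[MailField]) -> dict[str, Any]:
    """Keep only the requested fields of one mail object."""
    return {field.name.lower(): mail[field.name.lower()] for field in fields}


def describe_fields() -> dict[int, str]:
    """The lookup table the API would have to serve to frontend developers."""
    return {field.value: field.name.lower() for field in MailField}
```

The `describe_fields` helper is the extra piece the frontend would have to call just to decode the numbers, which is exactly the readability cost described above.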

Nested-graph Requests

Another approach is to try passing the names of our enum instead. But if we’re already going to be passing this enum we might as well solve another problem: filtering data fields nested in objects. We want to give our mail endpoint the ability to filter out all of the sender details except for the name. Using the tools in our existing toolbelt we can either build a custom endpoint, specify a flag to prune down the sender details, or we could add a new item to our existing enum that specifically relates to the name in the nested data structure.

All of the above will work, but it would be nice not to hack something together that we’re going to regret when it comes time to maintain our bad decisions. Instead, we can pass our endpoint a “graph” of what we would like to receive:

/mail/getMail/1/?with={id,subject,timestamp,sender.name}

In our new request we pass in the name and a little bit of the structure of the fields that we’d like to receive back. In this case we want id, subject, timestamp, and the nested value sender.name.

This allows for very specific requests that are both readable and efficient. This improvement does come with the upfront cost of developing this feature on the backend. Some framework or tooling will need to be developed to parse each requested field as an attribute of the server’s underlying model and return it in a response. Maintaining that code will be a lot of work, and any bug in it has a chance of impacting every endpoint across the entire system. Definitely a high-risk solution.
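To give a feel for the tooling this would take, here is a deliberately small sketch of a parser for the `with` parameter and a selector that applies it to a nested record. The function names and sample record are hypothetical, and a production version would also need validation, list handling, and access control, none of which is shown.

```python
from typing import Any
from urllib.parse import parse_qs, urlsplit

SAMPLE_MAIL: dict[str, Any] = {
    "id": 101,
    "subject": "Welcome",
    "timestamp": "2017-12-21T09:00:00Z",
    "body": "Hello!",
    "sender": {"id": 7, "name": "Ada", "email": "ada@example.com"},
}


def parse_with(url: str) -> list[list[str]]:
    """Turn with={id,sender.name} into paths like [["id"], ["sender", "name"]]."""
    raw = parse_qs(urlsplit(url).query)["with"][0]
    names = raw.strip("{}").split(",")
    return [name.strip().split(".") for name in names if name.strip()]


def select(record: dict[str, Any], paths: list[list[str]]) -> dict[str, Any]:
    """Copy only the requested (possibly nested) fields out of a record.

    Raises KeyError if a requested field does not exist on the record.
    """
    result: dict[str, Any] = {}
    for path in paths:
        source: Any = record
        target = result
        # Walk down to the parent of the final key, mirroring the structure.
        for key in path[:-1]:
            source = source[key]
            target = target.setdefault(key, {})
        target[path[-1]] = source[path[-1]]
    return result
```

Even this toy version has to make decisions about error handling and nesting, which hints at how much surface area a home-grown solution would carry.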

GraphQL Requests

Fortunately for everyone, requesting specific nested data fields, filtering down datasets, and providing a readable interface for clients are problems that have been solved by a common query language: GraphQL. GraphQL is a popular query language that lets clients write requests shaped like the JSON object they would like to receive back from the server. For example, a request looks like this:

/mail/getMail/1/?query=
{
  mail {
    id
    timestamp
    subject
    sender {
      id
      name
    }
  }
}
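In practice, GraphQL servers commonly accept the query as a JSON body sent with a POST request to a single endpoint. The sketch below assumes that convention and builds such a request for the same fields, written in GraphQL's selection syntax where a nested field is followed directly by its braces. The URL and the `build_graphql_request` helper are hypothetical, and the request is only constructed, not sent.

```python
import json
from urllib.request import Request

# Hypothetical endpoint; a real API would publish its own URL.
GRAPHQL_URL = "https://api.example.com/graphql"

MAIL_QUERY = """
{
  mail {
    id
    timestamp
    subject
    sender {
      id
      name
    }
  }
}
"""


def build_graphql_request(query: str, url: str = GRAPHQL_URL) -> Request:
    """Wrap a GraphQL query string in a JSON POST request."""
    body = json.dumps({"query": query}).encode("utf-8")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_graphql_request(MAIL_QUERY)
# urllib.request.urlopen(request) would send it; that step is left out here
# because the endpoint above is made up.
```

The response to a query like this mirrors its shape: a `mail` object with the four requested fields and a nested `sender` containing only `id` and `name`.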

Several features make GraphQL a great fit:

Handles these types of queries, so we won’t need to spend time building our own response parser for nested-graph requests
Exposes an interactive request model that lets frontend developers “explore” which data fields can be requested from the backend
Optimizes API requests to return exactly the data fields we want in a response
Covers every model in our system with a single endpoint, reducing maintenance overhead
Provides server libraries that make it easy to add GraphQL to many popular servers
Makes it easy to add new data fields as new features are scaled out
Lets the frontend grow and add features on top of the API with little to no upfront backend development once GraphQL has been integrated

GraphQL makes a compelling case for any application development team looking to quickly scale out a frontend solution. Communicating using the data that’s actually being requested also helps everyone understand the data that’s being modeled more clearly without having to interpret data field IDs or arbitrary flags. Adopting this strategy will give any team looking for efficient development of features an easy win.