Hello!
A few days ago I had the opportunity to give a presentation at the MeetJS meetup, which is a series of meetings of the JS/frontend community in Poland. The title of the presentation could be translated as "GraphQL as an alternative to REST API".
This was the first time I gave a presentation at a public conference/meetup, so it was a unique experience for me and I'm really glad I had this opportunity.
Because of this event I also thought it would be a nice opportunity to write a blog article on the topic. In this article I will explain what GraphQL is, what the benefits of using it are and what the downsides are. I will use REST API as a reference in some points. If you don't know REST yet, you should probably catch up on it first, as it's the most common architectural style for transferring data between systems. Explaining the details of REST API is out of the scope of this article, so if you're not comfortable with it, start with some other resources and then come back here. I will assume at least basic knowledge of REST API; extensive experience is probably not needed though.
But what is wrong with REST API?
Let's start with REST API and a short explanation of what is wrong with it and why you should even consider any alternatives.
Overfetching
The first issue you might have is overfetching. Overfetching is any situation where you get more data than you need. For example, you need the name of some user but the only endpoint you have available also returns a lot of unnecessary data:
In the example above we also get the id of the user, the address and the birthday. In a real-life scenario this could be even worse - when you use a headless CMS, for example, you might get around 30 different fields when you only need one. This of course means more bandwidth used and in some cases more work for the server (like additional DB calls). Both result in worse performance.
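To make the overfetching case a bit more concrete, here is a small TypeScript sketch. The UserResource shape, the /users/1 path and all the values are made up for illustration; the point is only the difference between what the view needs and what travels over the wire.

```typescript
// Hypothetical shape of what a GET /users/1 endpoint might return.
interface UserResource {
  id: number;
  name: string;
  address: { street: string; city: string };
  birthday: string;
}

// Stand-in for the parsed JSON body; the values are made up for illustration.
const userResource: UserResource = {
  id: 1,
  name: "Bob",
  address: { street: "1 Example Street", city: "Warsaw" },
  birthday: "1985-04-12",
};

// The view only needs the name...
const neededPayload = JSON.stringify({ name: userResource.name });
// ...but the whole resource is what the endpoint sends.
const receivedPayload = JSON.stringify(userResource);

console.log(`needed ${neededPayload.length} characters, received ${receivedPayload.length}`);
```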
Underfetching
The second issue you might encounter is underfetching. In the previous point we fetched too much data and in this one, as you might have guessed, a single request doesn't give us enough.
Let's imagine we need to create a small part of the page where we display:
- User name
- User's posts
- User's last three followers
Unfortunately, in REST this will often mean that we need to perform three requests, like in the image below:
This comes from the fact that in REST each resource (or set of resources of a given type) is represented by a specific URL. That's why we make a different HTTP request for:
- User data
- User's posts data
- User's followers data
This is not ideal for performance reasons. Not only will we have three HTTP requests, but the server will also have no chance to optimize the process of preparing the data.
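The three round trips described above can be sketched like this. It's a simplified, synchronous in-memory stand-in for HTTP, and the paths, the last=3 parameter and the data are all hypothetical.

```typescript
// In-memory stand-in for a REST backend; paths and data are hypothetical.
const restData: Record<string, unknown> = {
  "/users/1": { id: 1, name: "Bob" },
  "/users/1/posts": [{ id: 10, title: "Hello GraphQL" }],
  "/users/1/followers?last=3": [
    { id: 2, name: "Alice" },
    { id: 3, name: "Carol" },
    { id: 4, name: "Dave" },
  ],
};

let roundTrips = 0;

// Synchronous fake of an HTTP GET that counts how many requests were made.
function restGet(path: string): unknown {
  roundTrips += 1;
  return restData[path];
}

// One small part of the page, three separate resources.
const profileView = {
  user: restGet("/users/1"),
  posts: restGet("/users/1/posts"),
  followers: restGet("/users/1/followers?last=3"),
};

console.log(`${roundTrips} requests for: ${Object.keys(profileView).join(", ")}`);
```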
N+1 issue
Another problem that comes with underfetching is the N+1 problem. You might have heard about it, as it's quite a common issue related to fetching data from databases, but it looks almost the same in REST architecture.
An example of this could be the following:
- You have a /books endpoint which only returns the ISBN of each book
- In order to get more information about a book you need to call another URL like /books/{ID}
Look at the image below to get an idea:
So if you need to display a list of 100 books with the name of each one, you need to perform:
- One /books request - in order to get the ISBN of each book
- 100 requests to /books/{ID} to get the name of each of the 100 books
This adds up to 101 requests for 100 books, and that's why it's called the N+1 issue: one request for the list plus one more for each of the N resources. This is an issue for two reasons:
- There are 101 HTTP requests each time someone enters your site. Even if the backend responds really quickly and you have cached responses, it's still a problem since each HTTP request takes some time
- You don't give your backend server a chance to fetch the data in the best way. It doesn't know about the 101 requests you're going to make, so it will just fetch the data separately, one by one, for each request.
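The request count from the example above can be reproduced with a short sketch. Again, the two functions are synchronous fakes standing in for GET /books and GET /books/{ID}, and the catalogue is generated data, not a real API.

```typescript
interface BookSummary { isbn: string }
interface BookDetails { isbn: string; name: string }

// 100 made-up books standing in for the backend's data.
const catalogue: BookDetails[] = [];
for (let i = 1; i <= 100; i += 1) {
  catalogue.push({ isbn: `isbn-${i}`, name: `Book ${i}` });
}

let requestCount = 0;

// Fake of GET /books: returns only the identifiers.
function getBooks(): BookSummary[] {
  requestCount += 1;
  return catalogue.map(({ isbn }) => ({ isbn }));
}

// Fake of GET /books/{ID}: returns the details of a single book.
function getBook(isbn: string): BookDetails | undefined {
  requestCount += 1;
  return catalogue.find((book) => book.isbn === isbn);
}

// One request for the list, then one more per book to get its name.
const bookNames = getBooks().map(({ isbn }) => getBook(isbn)?.name);

console.log(`${bookNames.length} names fetched with ${requestCount} requests`);
```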
Coupling backend and frontend
Someone could argue that these are not such big issues. After all, you can simply ask your backend developer to make an endpoint that returns exactly the data you need, can't you? Well, you can, but the issue is that it couples the backend and the frontend together much more tightly. Coupling in software engineering describes how closely modules or parts of your system are connected or, in other words, the strength of the relationship between them. Why is high coupling bad? Because each time you change one module, other modules that are highly coupled to it need to be changed as well.
In our example - if we tell our backend developer to make an API that returns exactly the data we need, that's fine until there is a need for a change. Let's imagine that we have an application which displays 100 books (as in the example above) and we asked our backend developer to make an API with the exact data we need. So we have something like:
- GET /books - returns the books we need and for each one it also returns the name
This way we don't need to make another call for book details and we avoid the N+1 issue. That's good, but then the client comes and asks for another field - now they want to include the book's publication date in the books list. In such a scenario the frontend developer again needs to ask the backend developer, and they both need to make changes, synchronize them and make sure they're deployed at the same time. This is the issue we get from high coupling between parts of our system.
Another issue might be that we can't always change the API easily. There are many cases where it's hard. For example:
- We have an API developed by a third-party company and it's expensive to make such changes
- We use a public API, for example for government data or stock market data
In such scenarios we're stuck with the N+1 issue and performance issues.
GraphQL
Now that we understand what issues we might encounter while using REST, let's finally get to GraphQL. I know it was quite a long introduction, but I believe there's no point in showing an alternative without explaining the issue first.
A few words of history
I don't know if you remember those times, but back in 2012 the Facebook mobile app wasn't as good as it is now. It used to crash a lot and the performance was poor. The reason was that the mobile apps were simply wrappers around the mobile website and therefore the possibilities for improving them were limited. In that year the Facebook team decided to rebuild their app and improve it. They tried different options for the API, including REST, but eventually they started work on what is now known as GraphQL.
For details of the history you can check the post on the Facebook blog.
So what is GraphQL?
GraphQL is simply a query and data manipulation language for APIs. Like REST, it's not tied to any database, storage, language or framework. Instead it's more like an approach to API development. You could say that GraphQL is a presentation layer for your backend, exactly like REST. GraphQL is traditionally served over HTTP but it's actually agnostic to the transport layer - you could also use a different method of connection, like WebSockets. HTTP is the most important one though, and I will focus only on it.
There are features of GraphQL which make it different from REST though.
Declarative data fetching
Declarative data fetching is probably the most important feature of GraphQL. Remember when we talked about the issues with REST? They all came from the fact that our API had fixed endpoints with fixed data that these endpoints return. Because of this we had our overfetching, underfetching and N+1 issues. Because of this we had to choose between poor performance and high coupling between frontend and backend.
Declarative data fetching is a response to that issue. When I talked about underfetching I presented an example of a small part of a page with the name, followers and blog posts, where we needed to make three REST requests to get the data we need. We also got more data than we needed from each of these endpoints. With GraphQL our API call could look much simpler:
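Here is a minimal sketch of such a call in TypeScript. The field names, the last argument and the sample data are my assumptions for illustration; the only convention relied on is that GraphQL over HTTP is usually a POST with the query in a JSON body.

```typescript
// Hypothetical query: the field names and the `last` argument depend on the schema.
const profileQuery = `
  query {
    user(id: "1") {
      name
      posts { title }
      followers(last: 3) { name }
    }
  }
`;

// One request to one endpoint, carrying the description of the data we want.
const graphqlRequest = {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ query: profileQuery }),
};

// Shape of the response such a query would produce (made-up data).
const exampleResponse = {
  data: {
    user: {
      name: "Bob",
      posts: [{ title: "Hello GraphQL" }],
      followers: [{ name: "Alice" }, { name: "Carol" }, { name: "Dave" }],
    },
  },
};
```

An object like graphqlRequest could be handed to fetch together with the URL of your GraphQL endpoint (often something like /graphql, but that depends on the server).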
If you take a closer look at the request and response, you might notice that we simply defined the data we need and our API returned exactly what we wanted. This is declarative data fetching - it gives us the possibility to define what data (fields) we need. We also don't need to call different URLs for different kinds of resources; instead we can ask for the user and in the same request ask for different resources connected to the user, like posts and followers.
It's a powerful feature and it addresses all of our issues with underfetching, overfetching, N+1 and high coupling.
Coupling
Remember how I said that you might run into an issue with, for example, government data? Or with data from another company? In such scenarios it can be really costly or even impossible to change the data returned. If such public APIs used GraphQL, these issues would not exist, as all clients could simply ask for the data they need.
There is only a single endpoint
One characteristic of GraphQL is that there is only a single endpoint available, which acts like a proxy for all the data. Instead of calling different endpoints for different data, you always call the same one and just describe what you want to query or modify. You can say that GraphQL acts as a proxy not only because you call a single URL but also because the data behind it can come from its own database or from other microservices, systems or APIs.
We define how the data should be retrieved by the server using "resolver functions". The job of these functions is to retrieve the data the user asked for. In many cases a resolver will simply call your database (or, more likely, use the service layer), but keep in mind this is not always the case.
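To give a rough idea of what resolver functions look like, here is a sketch using plain TypeScript objects and no GraphQL library. The nested Query/User layout mirrors the resolver-map convention used by several JavaScript GraphQL servers, but the exact signature depends on the framework you pick; the data sources here are hypothetical in-memory arrays.

```typescript
interface UserRecord { id: string; name: string }
interface PostRecord { authorId: string; title: string }

// In-memory stand-ins for a database table and for another service.
const usersTable: UserRecord[] = [{ id: "1", name: "Bob" }];
const postsService: PostRecord[] = [{ authorId: "1", title: "Hello GraphQL" }];

// One function per field: each one decides where its data comes from.
const resolvers = {
  Query: {
    user: (_parent: unknown, args: { id: string }): UserRecord | undefined =>
      usersTable.find((user) => user.id === args.id),
  },
  User: {
    // Receives the already-resolved user as its parent value.
    posts: (parent: UserRecord): PostRecord[] =>
      postsService.filter((post) => post.authorId === parent.id),
  },
};

// What a server would do for the query { user(id: "1") { posts { title } } }.
const bob = resolvers.Query.user(undefined, { id: "1" });
const bobsPosts = bob ? resolvers.User.posts(bob) : [];
```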
Strongly typed
GraphQL is strongly typed. The types of each field are defined using the Schema Definition Language (SDL). Of course, if you come from languages that are not strongly typed you might think it's something you won't need, but remember that APIs are meant to be used by other people. It's not just some simple function that only you will write, change and use.
The API might be used by different developers, perhaps not from the same company. The fact that the API is strongly typed helps establish a contract between the clients of the API and the developers who create it.
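As an illustration, a schema for the user example could look roughly like the SDL below, kept here in a TypeScript string. The types and fields are my assumptions, not a real API. The interfaces underneath show the same contract as a TypeScript client might see it.

```typescript
// Hypothetical schema written in SDL; `!` marks a non-nullable field.
const typeDefs = `
  type User {
    id: ID!
    name: String!
    posts: [Post!]!
    followers(last: Int): [User!]!
  }

  type Post {
    id: ID!
    title: String!
  }

  type Query {
    user(id: ID!): User
  }
`;

// The same contract seen from a TypeScript client.
interface PostDto { id: string; title: string }
interface UserDto { id: string; name: string; posts: PostDto[]; followers: UserDto[] }
```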
Response mirrors the request
As you could see above, the response from the API looks very similar to the request. This is really useful, as the shape of the returned data is easy to predict.
Introspection
In GraphQL you can use a feature called "introspection" to describe the API. This means you can simply ask the GraphQL API for its defined resources and types:
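A minimal sketch of such an introspection request is shown here. The __schema field is part of the GraphQL specification itself; sending it in a JSON body is the same HTTP convention as for any other query, and the variable names are mine.

```typescript
// `__schema` is a built-in introspection field defined by the GraphQL specification.
const introspectionQuery = `
  query {
    __schema {
      types {
        name
        kind
      }
    }
  }
`;

// Sent to the same single endpoint as every other query.
const introspectionBody = JSON.stringify({ query: introspectionQuery });
```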
This feature can be used by the GraphQL client you use, so it can automatically generate documentation for you - without any additional work from the developer! Check the GIF below for an example of such generated docs:
Mutations for data manipulation
Usually an API not only returns data but can also be used for inserting, modifying or deleting data. In REST you do that by using different HTTP methods: POST, PUT, DELETE and sometimes PATCH.
In GraphQL, on the other hand, you use a thing called mutations. You can think of mutations as predefined methods you can call. In REST you have different URLs combined with HTTP methods for the different things you want to do. Here you have different methods under different names that you can call. Below you can see an example of calling such a method:
Notice that you not only call the method with some parameters (name: "Bob", age: 36) but can also define what data you want returned when the record is created or modified - just like in queries.
Of course the defined mutations are also available through introspection and therefore in the automatically generated docs.
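A mutation call with the parameters mentioned above could be sketched as follows. The createUser name, its arguments and the returned id are hypothetical and depend entirely on the schema of the API.

```typescript
// Hypothetical mutation: `createUser` and its arguments depend on the schema.
const createUserMutation = `
  mutation {
    createUser(name: "Bob", age: 36) {
      id
      name
    }
  }
`;

// Sent the same way as a query: a JSON body posted to the single endpoint.
const mutationBody = JSON.stringify({ query: createUserMutation });

// Shape of a possible response; only the requested fields come back (made-up id).
const mutationResponse = { data: { createUser: { id: "42", name: "Bob" } } };
```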
Downsides
GraphQL, like anything else, has its pros and cons. When it comes to downsides, there are a few that matter most.
Can't we do the same with REST?
After reading all of this, someone could ask - well, that's cool, but can't we do the same with REST? For example, use query parameters to ask for the fields we need:
- GET /books?fields=name,publishedDate
And indeed we could. This is not a standard though, and we might not find it in APIs created by other people. Also, it could work well with simple fields but it would be much harder to ask for complex data types - for example, to ask for a User and their followers (which are also of the User type) and to fetch only the name of each follower.
And in some cases we might simply not need that flexibility at all - for example, if we have a simple microservice used only by other microservices in our internal architecture.
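For flat fields, the fields parameter idea is easy to sketch on the server side. The helper below is a hypothetical illustration, not part of any framework, and it also shows the limitation: it has no way to express "only the name of each follower".

```typescript
type BookRow = Record<string, unknown>;

// Keeps only the properties named in a comma-separated `fields` parameter.
function selectFields(row: BookRow, fieldsParam: string): BookRow {
  const wanted = fieldsParam.split(",").map((field) => field.trim());
  const picked: BookRow = {};
  for (const field of wanted) {
    if (field in row) {
      picked[field] = row[field];
    }
  }
  return picked;
}

// Made-up book record as the backend might store it.
const fullBook: BookRow = {
  isbn: "isbn-1",
  name: "Book 1",
  publishedDate: "2020-01-01",
  pages: 320,
};

// What GET /books?fields=name,publishedDate could return for one book.
const trimmedBook = selectFields(fullBook, "name,publishedDate");
```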
Adds complexity
For simple APIs GraphQL might be just too much. It can add complexity which is unnecessary in many cases.
Caching might be harder with GraphQL
In REST we have endpoints. Each endpoint represents some data. In GraphQL, on the other hand, we have one endpoint and the returned data is computed based on the request we make to that endpoint.
If we have a cache layer between our server and the client, this might be an issue, as caching whole URLs is just easier.
It doesn't mean we can't have good performance using GraphQL - we simply need to think a little bit differently about the way we use the cache and the connections between layers in our architecture.
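One way to think about that difference is the cache key. The sketch below is deliberately naive and all names in it are mine: for REST the URL is the key, while for GraphQL the key has to be derived from the query and its variables. The whitespace normalization is a simplification that would also alter string literals inside a query.

```typescript
// REST: the URL alone identifies the response, so it can serve as the cache key.
const restCacheKey = (url: string): string => `GET ${url}`;

// GraphQL: every request hits the same URL, so the key has to come from the body.
function graphqlCacheKey(query: string, variables: Record<string, unknown> = {}): string {
  const normalizedQuery = query.replace(/\s+/g, " ").trim();
  return `${normalizedQuery}|${JSON.stringify(variables)}`;
}

const responseCache = new Map<string, unknown>();

// Runs `execute` only when no cached result exists for this query.
function cachedExecute(query: string, execute: () => unknown): unknown {
  const key = graphqlCacheKey(query);
  if (!responseCache.has(key)) {
    responseCache.set(key, execute());
  }
  return responseCache.get(key);
}
```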
Performance for multi level queries
Since GraphQL lets you connect data the way you want, you might be tempted to ask for data too "deeply". For example, you ask for the author of a blog post, then for all of their posts, then for all reviews and for the user behind each review, etc. Such use cases might result in poor performance on the backend side, as querying for the data could become expensive.
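A common way to protect the backend is to reject queries that are nested too deeply. The sketch below is a naive version of that idea, assuming depth can be approximated by counting braces; a real server would inspect the parsed query instead. The limit of 4 is an arbitrary value chosen for the example.

```typescript
// Naive depth check: counts brace nesting and ignores strings and comments.
function queryDepth(query: string): number {
  let depth = 0;
  let maxDepth = 0;
  for (let i = 0; i < query.length; i += 1) {
    const char = query.charAt(i);
    if (char === "{") {
      depth += 1;
      maxDepth = Math.max(maxDepth, depth);
    } else if (char === "}") {
      depth -= 1;
    }
  }
  return maxDepth;
}

const MAX_DEPTH = 4; // arbitrary limit chosen for this example

function isTooDeep(query: string): boolean {
  return queryDepth(query) > MAX_DEPTH;
}

// Post -> author -> posts -> reviews -> user, as in the example above.
const deepQuery = "{ post { author { posts { reviews { user { name } } } } } }";
```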
Less popular
REST API is way more popular than GraphQL. This means more resources, more frameworks and more questions on StackOverflow. Your colleagues might also feel safer using something they already know than learning something new.
Of course GraphQL isn't that new anymore, but there is still a huge gap between REST and GraphQL when it comes to popularity. On the other hand, you will already find quite a lot of resources on the topic and frameworks which support it. In Java, for example, you can easily create a GraphQL API using Spring - it's not that different from creating a REST API.
Anyway - will it replace REST?
Well, there is no single answer. GraphQL is not meant to be a replacement for REST but an alternative. In my personal opinion, in some cases it's simply better. Examples of such scenarios are:
- An API for different clients, for example Android + iOS + web app - in such a scenario each client can make changes at their own pace or use the API slightly differently - it's quite common to have slightly different data in a web app and a mobile app
- Publicly available APIs - you don't know your clients. You don't know what use cases they might have for your data. In such a scenario you can't provide a specific API; instead it must be generic so everyone can use it. If you use GraphQL, your clients can fetch the data they need and also avoid issues with underfetching/overfetching
- Using the API available in a CMS - in some cases you might have the option to pick between the REST API and the GraphQL API provided by the CMS you use. This is, for example, the case if you use Liferay - both REST and GraphQL are available if you plan to use it as a headless CMS for your data. In such a scenario I believe you should go for GraphQL, as even if REST seems fine at first it might not be fine later when the use cases get more complex
There are also scenarios where REST might work just fine. For example:
- Simple server with not too many endpoints
- Microservice for internal architecture
Summary
To summarize - I believe that if you have ever worked with REST you should also know the basics of GraphQL. This way you will be able to make better decisions about the architecture of your application. This will of course result in a better app, a better API for your clients, or perhaps a shorter time to market and simpler maintenance of your app once it is released.
In this article I deliberately stayed away from server-side implementation code, as it looks different for different languages/frameworks while the concept stays the same. I also haven't explained much of the GraphQL syntax, simply because there is nice documentation available at https://graphql.org/learn/. I believe you will find everything you need there once you decide to learn more about GraphQL or start using it.
As always, I hope you learned something. If you have any questions you can always contact me or leave a comment.