With GraphQL you can query exactly what you want whenever you want. That is amazing for working with an API, but also has complex security implications.
Instead of asking for legitimate, useful data, a malicious actor could submit an expensive, nested query to overload your server, database, network, or all of these. Without the right protections you open yourself up to a DoS (Denial of Service) attack.
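To make that concrete, here's a sketch of what such a query could look like against a threads-and-messages style API (the field names and amounts are made up; what matters is the shape, nested paginated lists whose amounts multiply):

```js
// A hypothetical "evil" query. No single pagination amount or nesting level
// looks alarming on its own, but the amounts multiply: 30 threads, each with
// 30 participants, each with 30 of their threads is already ~27,000 records.
const evilQuery = /* GraphQL */ `
  query evilQuery {
    allThreads(amount: 30) {
      participants(amount: 30) {
        threads(amount: 30) {
          title
        }
      }
    }
  }
`
```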
Neither the depth nor the individual amounts are exceptionally high in this query, so it would pass through our current protections. Yet it potentially fetches tens of thousands of records, meaning it's intensive on the database, server, and network: a worst-case scenario.
To prevent this we would need to analyze queries before we run them, calculate their complexity, and block them if they're too expensive. While this is more work than both of our previous protections, it'll make 100% sure no malicious query can get to our resolvers.
Before you go ahead and spend a ton of time implementing query cost analysis, be certain you need it. Try to crash or slow down your staging API with a nasty query and see how far you get — maybe your API doesn't have these kinds of nested relationships, or maybe it can handle fetching thousands of records at a time perfectly fine and doesn't need query cost analysis!
Implementing Query Cost Analysis
There are a couple of packages on npm for implementing query cost analysis. Our two front-runners were graphql-validation-complexity, a plug-n-play module, and graphql-cost-analysis, which gives you more control by letting you specify a @cost directive. There's also graphql-query-complexity, but I wouldn't recommend choosing it over graphql-cost-analysis since it's the same idea without directive or multiplier support. We went with graphql-cost-analysis because we have a large discrepancy between our fastest resolvers (20μs) and our slowest ones (10s+), so we needed the control it gives us. That being said, maybe graphql-validation-complexity is enough for you, so definitely try it out!
The way it works is that you specify the relative cost of resolving a certain field or type. It also has multiplication support: if you request a list, the cost of any nested field within it is multiplied by the pagination amount, which is very neat.
This is what the @cost directive looks like in practice:
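(The field names and cost values in this sketch are illustrative, loosely modeled on a threads API, rather than our exact schema.)

```js
import gql from 'graphql-tag'

const typeDefs = gql`
  # Declare the directive so the schema builder accepts it;
  # graphql-cost-analysis reads these annotations while validating a query.
  directive @cost(
    complexity: Int
    useMultipliers: Boolean
    multipliers: [String]
  ) on OBJECT | FIELD_DEFINITION

  type Query {
    # Cost of searchThreads = cost of its selections × the "amount" argument
    searchThreads(amount: Int = 20): [Thread]
      @cost(multipliers: ["amount"], complexity: 1)
  }

  type Thread {
    title: String @cost(complexity: 1)
    # Resolving participants is a heavier query, so it gets a higher base cost,
    # again multiplied by the requested amount
    participants(amount: Int = 20): [User]
      @cost(multipliers: ["amount"], complexity: 3)
  }

  type User {
    name: String @cost(complexity: 1)
    threads(amount: Int = 20): [Thread]
      @cost(multipliers: ["amount"], complexity: 2)
  }
`

export default typeDefs
```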
This is just a small snippet of API types, but you get the point. You specify how complex a certain field is and which argument to multiply it by, set the maximum allowed cost, and graphql-cost-analysis does the rest for you.
I determined how complex certain resolvers are via the performance tracking data exposed by Apollo Engine. I went through the entire schema and assigned a value based on the p99 service time. Then we looked through all the queries on our client to figure out the most expensive one, which came out at ~500 complexity points. To give us a bit of leeway for the future, we set the maximum complexity to 750.
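Wiring the check into the server then comes down to adding graphql-cost-analysis as a validation rule with that budget. A minimal sketch, assuming an express-graphql setup (the schema import and route are hypothetical, and the variables have to be passed through so multipliers like amount can be resolved):

```js
import express from 'express'
import graphqlHTTP from 'express-graphql' // default export in 0.x; newer versions export { graphqlHTTP }
import costAnalysis from 'graphql-cost-analysis'
import schema from './schema' // hypothetical module exporting the executable schema

const app = express()

app.use(
  '/graphql',
  // Passing a function lets us read each request's variables, which
  // graphql-cost-analysis needs to resolve multipliers like `amount`.
  graphqlHTTP((req, res, params) => ({
    schema,
    validationRules: [
      costAnalysis({
        maximumCost: 750, // most expensive legitimate query is ~500, so leave some leeway
        defaultCost: 1, // cost for fields without an @cost directive
        variables: params.variables,
      }),
    ],
  }))
)

app.listen(3000)
```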
Running the evilQuery from above now that we've added graphql-cost-analysis, I get an error message: “GraphQL query exceeds maximum complexity, please remove some nesting or fields and try again. (max: 750, actual: 1010319)”
1 million complexity points? Rejected!