

How to GraphQL with Ruby, Rails, Active Record, and no N+1
source link: https://evilmartians.com/chronicles/how-to-graphql-with-ruby-rails-active-record-and-no-n-plus-one

You work on a mature web application that cleanly separates backend and frontend. The server-side code, written in Ruby, is mostly responsible for translating HTTP requests into SQL statements (with the help of an ORM) through a rich and well-documented API. You choose GraphQL over REST to streamline your endpoints, but your database is not happy with all the extra queries. After much searching, you find an exhaustive hands-on guide on fighting N+1 from a fellow GraphQL-ing Rubyist… Here it comes!
GraphQL can do wonders in a backend-only Rails application, giving your clients (whether a frontend framework or other API consumers) a single endpoint for fetching data in any shapes and sizes they might need.
There’s only one catch, but it’s a big one. N+1 big.
As the list of associations to load is always determined at runtime, it is very hard to be smart about querying the database.
You can either accept the sad reality of one query for the parent record plus one query for each association (hence “N+1”, even though the stricter term would be “1+N”), or you can load all possible associations in advance with fancy SQL statements. But if you have a rich schema, and that’s the reason to switch to GraphQL in the first place, preloading can put an even bigger strain on a database than letting N+1 run amok. Luckily, there are tools in the Ruby-GraphQL world that allow us to be more selective and smarter about what we load, when, and how.
It’s always better to have an example
To keep things concrete, let’s draw up a simple schema for a simple “Twitter clone” application. The goal here is not to be original but to be able to relate to the types right away. They are Tweet, User, and Viewer. The Viewer is the user who views the feed of other users’ tweets. We created a separate type for a “current user” because it may expose properties otherwise inaccessible on “general” users.
class Types::BaseObject < GraphQL::Schema::Object
  def current_user
    context[:current_user]
  end
end

class Types::Tweet < Types::BaseObject
  field :content, String, null: false
  field :author, Types::User, null: false
end

class Types::User < Types::BaseObject
  field :nickname, String, null: false
end

class Types::Viewer < Types::BaseObject
  field :feed, [Types::Tweet], null: false

  def feed
    # In this case, FeedBuilder is a query object
    # that returns a Tweet relation based on passed params
    FeedBuilder.for(current_user)
  end
end

class Types::Query < Types::BaseObject
  field :viewer, Types::Viewer, null: true, resolver_method: :current_user
end

class GraphqlSchema < GraphQL::Schema
  query Types::Query
end
I have also prepared a gist that contains our whole Rails “application” in a single file. You can’t run it, but it’s functional enough to pass the included specs for comparing different optimization methods that we discuss in this article. To view the code and run the specs, you can run the following in your terminal in any temporary folder:
curl "https://gist.githubusercontent.com/DmitryTsepelev/d0d4f52b1d0a0f6acf3c5894b11a52ca/raw/cba338548f3f87c165fc7ec07eb2c5b55120f7a2/2_demo.rb" > demo.rb
createdb nplusonedb # To create a PostgreSQL test database, requires Postgres installation
rspec demo.rb # to run tests that compare different N+1-fighting techniques
This code contains an N+1 problem right away. Querying the feed that includes nicknames of tweet authors will trigger a single query to the tweets table and N queries to the users table.
query {
  viewer {
    feed {
      content
      author {
        nickname
      }
    }
  }
}
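To make the arithmetic tangible, here is a minimal plain-Ruby sketch that counts the queries such a feed would trigger. FakeDB and render_feed are hypothetical stand-ins invented for this illustration; they are not Active Record or graphql-ruby.

```ruby
# Toy in-memory "database" that counts every query it serves.
# FakeDB and its methods are hypothetical, not Active Record.
class FakeDB
  attr_reader :query_count

  def initialize(users:, tweets:)
    @users = users
    @tweets = tweets
    @query_count = 0
  end

  # One query: fetch the whole feed.
  def tweets
    @query_count += 1
    @tweets
  end

  # One query per call: fetch a single user by ID.
  def find_user(id)
    @query_count += 1
    @users.fetch(id)
  end
end

# Resolves the feed the naive way: every tweet looks up its own author.
def render_feed(db)
  db.tweets.map do |tweet|
    { content: tweet[:content], author: db.find_user(tweet[:author_id])[:nickname] }
  end
end

db = FakeDB.new(
  users: { 1 => { nickname: "alice" }, 2 => { nickname: "bob" } },
  tweets: [
    { content: "Hello", author_id: 1 },
    { content: "World", author_id: 2 },
    { content: "Again", author_id: 1 }
  ]
)

render_feed(db)
puts db.query_count # prints 4: 1 query for tweets + 3 queries for authors
```

Note that the same author is fetched twice here: nothing in the naive approach deduplicates lookups.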
Solution #0: Load all the associations!
Let’s start by cleaning up our code and extracting feed loading to a resolver—a special class that encapsulates our database-querying logic.
class Resolvers::FeedResolver < Resolvers::BaseResolver
  type [Types::Tweet], null: false

  def resolve
    FeedBuilder.for(current_user)
  end
end

class Types::Viewer < Types::BaseObject
  field :feed, resolver: Resolvers::FeedResolver
end
If you’re interested, here’s the definition for our FeedBuilder module that abstracts out some Active Record calls:
module FeedBuilder
  module_function

  def for(user)
    Tweet.where(author: user.followed_users)
         .order(created_at: :desc)
         .limit(10)
  end
end
Extracting logic to a resolver allows us to create alternative resolvers and hot-swap them to compare results. Here’s a resolver that solves the N+1 problem by preloading all associations:
class Resolvers::FeedResolverPreload < Resolvers::BaseResolver
  type [Types::Tweet], null: false

  def resolve
    FeedBuilder.for(current_user).includes(:author) # Use AR eager loading magic
  end
end
This solution is the most obvious, but not ideal: we will make an extra SQL query to preload users no matter what, even if we request just the tweets and don’t care about their authors (I know, it’s hard to imagine, but let’s say it’s for an anonymized data-mining operation).
Also, we have to define a list of associations at the top level (in the Query type or inside resolvers that belong to it). It’s easy to forget to add a new association to the list when a new nested field appears deep inside the graph.
However, this approach is helpful when you know that the client asks for the author data most of the time (for instance, when you control the frontend code).
Solution #1: Lookaheads
While resolving a query, GraphQL’s execution engine knows which data was requested, so it’s possible to find out at runtime what should be loaded. The graphql-ruby gem comes with a handy Lookahead feature that can tell us in advance if a specific field was requested. Let’s try it out in a separate resolver:
class Resolvers::FeedResolverLookahead < Resolvers::BaseResolver
  type [Types::Tweet], null: false
  extras [:lookahead]

  def resolve(lookahead:)
    FeedBuilder.for(current_user)
               .merge(relation_with_includes(lookahead))
  end

  private

  def relation_with_includes(lookahead)
    # .selects?(:author) returns true when author field is requested
    return Tweet.all unless lookahead.selects?(:author)

    Tweet.includes(:author)
  end
end
In this case, we query the users table only when the client asks for the author field. This approach works fine only when associations are minimal and not nested. If we take a more complex data model where users have avatars and tweets have likes, then our resolver can get out of hand real quick:
class Resolvers::FeedResolverLookahead < Resolvers::BaseResolver
  type [Types::Tweet], null: false
  extras [:lookahead]

  def resolve(lookahead:)
    scope =
      Tweet.where(user: User.followed_by(current_user))
           .order(created_at: :desc)
           .limit(10)
    scope = with_author(scope, lookahead) if lookahead.selects?(:author)
    scope = with_liked_by(scope, lookahead) if lookahead.selects?(:liked_by)
    scope
  end

  private

  def with_author(scope, lookahead)
    if lookahead.selection(:author).selects?(:avatar)
      scope.includes(user: :avatar_attachment)
    else
      scope.includes(:user)
    end
  end

  def with_liked_by(scope, lookahead)
    if lookahead.selection(:liked_by).selects?(:user)
      if lookahead.selection(:liked_by).selection(:user).selects?(:avatar)
        scope.includes(likes: { user: :avatar_attachment })
      else
        scope.includes(likes: :user)
      end
    else
      scope.includes(:likes)
    end
  end
end
You’re right, that’s not elegant at all! What if there was a way to load associations only when they are accessed? Lazy preloading can help us!
Solution #2: Lazy preloading (by Evil Martians)
With some help from my Evil Martian colleagues, I’ve written a little gem called ar_lazy_preload that lets us fall back to the preloading solution but makes it smarter without any additional effort. It makes a single request to fetch all associated objects only after the association is accessed for the first time. Of course, it works outside of GraphQL too and can be really handy in REST APIs or while building server-rendered views. All you need is to add gem "ar_lazy_preload" to your Gemfile, run bundle install, and then you’ll be able to write your resolver like so:
class Resolvers::FeedResolverLazyPreload < Resolvers::BaseResolver
  type [Types::Tweet], null: false

  def resolve
    FeedBuilder.for(current_user).lazy_preload(:author)
  end
end
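The idea of “load the association for every sibling record the first time any one of them asks for it” can be sketched in plain Ruby. This is only a conceptual model under my own assumptions: LazyFeed and LazyTweet are hypothetical names, and the sketch says nothing about how ar_lazy_preload is actually implemented.

```ruby
# Conceptual sketch of lazy preloading. All names are hypothetical;
# this does not mirror the internals of ar_lazy_preload.
class LazyFeed
  attr_reader :query_count, :tweets

  def initialize(tweet_rows, user_rows)
    @user_rows = user_rows   # stands in for the users table
    @query_count = 1         # loading the tweets counts as one query
    @authors = nil           # filled on first access to any author
    @tweets = tweet_rows.map { |row| LazyTweet.new(row, self) }
  end

  # Called by any tweet; loads authors for ALL sibling tweets at once.
  def author_for(author_id)
    @authors ||= begin
      @query_count += 1
      @user_rows.slice(*@tweets.map(&:author_id).uniq)
    end
    @authors[author_id]
  end
end

class LazyTweet
  attr_reader :content, :author_id

  def initialize(row, feed)
    @content = row[:content]
    @author_id = row[:author_id]
    @feed = feed # shared context that knows about sibling records
  end

  def author
    @feed.author_for(@author_id)
  end
end

feed = LazyFeed.new(
  [{ content: "Hello", author_id: 1 }, { content: "World", author_id: 2 }],
  { 1 => "alice", 2 => "bob", 3 => "carol" }
)

feed.tweets.map(&:content)
puts feed.query_count # prints 1: authors were never touched

feed.tweets.map(&:author)
puts feed.query_count # prints 2: one batched load on first access
```

The key design point is the shared context: each record keeps a reference to the collection it was loaded with, so a single access can trigger loading for all siblings.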
The gem is created with laziness in mind, so if you feel too lazy even to type .lazy_preload all the time, you can enable it globally for all Active Record calls by adding a line of configuration:
ArLazyPreload.config.auto_preload = true
However, this approach has some downsides:
- we have finally brought in our first external dependency;
- we do not have much control over the queries that are made, and they are hard to customize;
- if lazy preloading is not turned on globally, we still have to list all possible associations at the top level;
- if one table is referenced from two places, we will make twice as many database requests.
What else can we do?
Solution #3: graphql-ruby lazy resolvers
The graphql-ruby gem that makes GraphQL possible in our Ruby apps comes bundled with a way to use lazy execution:
- instead of returning data, you can return a special lazy object (this object should remember the data it replaced);
- when a lazy value is returned from a resolver, the execution engine stops further processing of the current subtree;
- when all non-lazy values are resolved, the execution engine asks the lazy object to resolve;
- the lazy object loads the data it needs to resolve and returns it for each lazy field.
It takes some time to wrap your head around this, so let’s implement a lazy resolver step by step. First of all, we can reuse the initial FeedResolver that is not aware of associations:
class Resolvers::FeedResolver < Resolvers::BaseResolver
  type [Types::Tweet], null: false

  def resolve
    FeedBuilder.for(current_user)
  end
end
Then, we should return a lazy object from our Tweet type. We need to pass the ID of the user and the query context, because we will use the context to store a list of IDs to load:
class Types::Tweet < Types::BaseObject
  field :content, String, null: false
  field :author, Types::User, null: false

  def author
    # The foreign key of the author association used throughout this article
    Resolvers::LazyUserResolver.new(context, object.author_id)
  end
end
Each time a new object is initialized, we add a pending user ID to the query context, and, when #user is called for the first time, we make a single database request to get all the users we need. After that, we can fill user data for all lazy fields. Here is how we can implement it:
class Resolvers::LazyUserResolver
  def initialize(context, user_id)
    @user_id = user_id
    @lazy_state = context[:lazy_user_resolver] ||= {
      user_ids: Set.new,
      users_cache: nil
    }
    @lazy_state[:user_ids] << user_id
  end

  def user
    users_cache[@user_id]
  end

  private

  def users_cache
    @lazy_state[:users_cache] ||=
      begin
        user_ids = @lazy_state[:user_ids].to_a
        @lazy_state[:user_ids].clear
        User.where(id: user_ids).index_by(&:id)
      end
  end
end
Wondering how the execution engine can tell the difference between regular and lazy objects? We should register the lazy resolver in the schema:
class GraphqlSchema < GraphQL::Schema
  lazy_resolve(Resolvers::LazyUserResolver, :user)
  query Types::Query
end
It tells the execution engine to stop resolving users when a Resolvers::LazyUserResolver object is returned and only come back to it after all the other, non-lazy fields are resolved.
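The two-phase flow can be illustrated without any gems. The sketch below is a toy under stated assumptions: USERS, LazyUser, and execute are hypothetical names made up for this example, and the real graphql-ruby executor is far more involved.

```ruby
# Toy illustration of deferred resolution; not graphql-ruby's executor.
USERS = { 1 => "alice", 2 => "bob" }.freeze # stands in for the users table

class LazyUser
  def initialize(context, user_id)
    @user_id = user_id
    # State shared by every lazy object created within the same "query"
    @state = context[:lazy_users] ||= { pending: [], loaded: {}, batches: 0 }
    @state[:pending] << user_id
  end

  # Loads every pending ID in one batch the first time any value is needed.
  def value
    unless @state[:loaded].key?(@user_id)
      ids = @state[:pending].uniq
      @state[:pending].clear
      @state[:batches] += 1 # one "query" per batch
      ids.each { |id| @state[:loaded][id] = USERS[id] }
    end
    @state[:loaded][@user_id]
  end
end

# Phase 1 calls every resolver and keeps lazy objects as placeholders;
# phase 2 replaces each placeholder with its resolved value.
def execute(tweets, context)
  rows = tweets.map do |tweet|
    { content: tweet[:content], author: LazyUser.new(context, tweet[:author_id]) }
  end
  rows.each { |row| row[:author] = row[:author].value }
end

context = {}
result = execute(
  [
    { content: "Hello", author_id: 1 },
    { content: "World", author_id: 2 },
    { content: "Again", author_id: 1 }
  ],
  context
)

puts result.map { |row| row[:author] }.inspect # prints ["alice", "bob", "alice"]
puts context[:lazy_users][:batches]            # prints 1
```

Because all placeholders are created before any of them is asked for a value, the first call to value already knows every ID it needs.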
That works, but it’s quite a bit of boilerplate code that you might have to repeat often. Plus, the code can become quite convoluted when our lazy resolvers need to resolve other lazy objects. Fortunately, there exists a less verbose alternative.
Solution #4: Batch loading
The graphql-batch gem from Shopify uses the same lazy mechanism of graphql-ruby but hides the ugly boilerplate part. All we need to do is inherit from GraphQL::Batch::Loader and implement the perform method:
class RecordLoader < GraphQL::Batch::Loader
  def initialize(model)
    @model = model
  end

  def perform(ids)
    @model.where(id: ids).each { |record| fulfill(record.id, record) }
    ids.each { |id| fulfill(id, nil) unless fulfilled?(id) }
  end
end
This loader (taken from the examples directory in the official repo) expects a model class in the initializer (to decide where the data should be loaded from). The #perform method is responsible for fetching data, and the #fulfill method is used to associate a key with the loaded data.
Batch loader usage is similar to the lazy version. We pass User to the initializer and the ID of the user to load lazily (this ID will be used as a key to fetch the associated user):
class Types::Tweet < Types::BaseObject
  field :content, String, null: false
  field :author, Types::User, null: false

  def author
    RecordLoader.for(::User).load(object.author_id)
  end
end
As usual, we need to turn on lazy loading in our schema:
class GraphqlSchema < GraphQL::Schema
  query Types::Query
  use GraphQL::Batch
end
How does this work? When use GraphQL::Batch is added to the schema, the Promise#sync method is registered to resolve lazily (it uses Promise.rb under the hood). When the #load method is called on a class that inherits from GraphQL::Batch::Loader, it returns a Promise object, which is why the execution engine treats it as a lazy value.
This approach has a useful side effect: you can chain loading in the following way:
def product_image(id:)
  RecordLoader.for(Product).load(id).then do |product|
    RecordLoader.for(Image).load(product.image_id)
  end
end
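To see why chained loads can still be batched, here is a gem-free toy model. ToyPromise and ToyLoader are hypothetical classes written for this illustration only; they imitate the general promise-plus-loader idea and are not the graphql-batch or Promise.rb API.

```ruby
# Toy promise/loader pair; hypothetical, not graphql-batch or Promise.rb.
class ToyPromise
  attr_writer :source

  def initialize(source)
    @source = source # loader or promise that can make progress for us
    @callbacks = []
    @done = false
  end

  def fulfill(value)
    @value = value
    @done = true
    @callbacks.each { |callback| callback.call(value) }
  end

  def on_fulfill(&callback)
    if @done
      callback.call(@value)
    else
      @callbacks << callback
    end
  end

  # Returns a child promise; a block that returns a promise is flattened.
  def then(&block)
    child = ToyPromise.new(self)
    on_fulfill do |value|
      result = block.call(value)
      if result.is_a?(ToyPromise)
        child.source = result
        result.on_fulfill { |inner| child.fulfill(inner) }
      else
        child.fulfill(result)
      end
    end
    child
  end

  def wait
    sync
  end

  # Drives the source until this promise has a value.
  def sync
    @source.wait until @done
    @value
  end
end

class ToyLoader
  attr_reader :batches

  def initialize(table)
    @table = table # stands in for a database table
    @pending = {}
    @batches = 0
  end

  def load(id)
    @pending[id] ||= ToyPromise.new(self)
  end

  # One "query" for every ID requested so far.
  def wait
    batch = @pending
    @pending = {}
    return if batch.empty?

    @batches += 1
    batch.each { |id, promise| promise.fulfill(@table[id]) }
  end
end

products = ToyLoader.new({ 1 => { image_id: 10 }, 2 => { image_id: 20 } })
images = ToyLoader.new({ 10 => "mug.png", 20 => "cap.png" })

promises = [1, 2].map do |id|
  products.load(id).then { |product| images.load(product[:image_id]) }
end

puts promises.map(&:sync).inspect # prints ["mug.png", "cap.png"]
puts products.batches             # prints 1
puts images.batches               # prints 1
```

Fulfilling the whole product batch runs every chained block before any image is fetched, so the image IDs are queued together and loaded in a single second batch.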
Solution #5: Better schema design
But even with all the advanced techniques we described above, it is still possible to end up with N+1. Imagine that we are adding an admin panel where you can see a list of users. When a user is selected, a user profile pops up, and you can see a list of their followers. In the GraphQL world, where data should be accessed from the place it belongs to, we could do something like this:
class Types::User < Types::BaseObject
  field :nickname, String, null: false
  # Reference the GraphQL type explicitly: a bare User would resolve to the model
  field :followers, [Types::User], null: false do
    argument :limit, Integer, required: true, default_value: 2
    argument :cursor, Integer, required: false
  end

  def followers(limit:, cursor: nil)
    scope = object.followers.order(id: :desc).limit(limit)
    # Bind the cursor value through a placeholder
    scope = scope.where("id < ?", cursor) if cursor
    scope
  end
end
class Types::Query < Types::BaseObject
  field :users, [Types::User], null: false
  field :user, Types::User, null: true do
    argument :user_id, ID, required: true
  end

  def users
    ::User.all
  end

  def user(user_id:)
    ::User.find(user_id)
  end
end
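The limit and cursor arguments on followers implement cursor (keyset) pagination. A small in-memory sketch shows how consecutive pages are derived; paginate_followers is a hypothetical helper for illustration and works on plain arrays of IDs rather than Active Record relations.

```ruby
# In-memory stand-in for `followers.order(id: :desc)` with a cursor.
# paginate_followers is a hypothetical helper, not part of the schema above.
def paginate_followers(follower_ids, limit:, cursor: nil)
  ids = follower_ids.sort.reverse                 # highest ID first
  ids = ids.select { |id| id < cursor } if cursor # resume below the cursor
  ids.first(limit)
end

all_followers = [3, 8, 5, 13, 1]

page1 = paginate_followers(all_followers, limit: 2)
page2 = paginate_followers(all_followers, limit: 2, cursor: page1.last)

puts page1.inspect # prints [13, 8]
puts page2.inspect # prints [5, 3]
```

The client passes the last ID of the previous page as the next cursor, so each page is a separate limited query scoped to a single user’s followers.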
The list of users can be fetched using the following query:
query GetUsers($limit: Int) {
  users(limit: $limit) {
    nickname
  }
}
A list of users who follow a specific user can be loaded like so:
query GetUser($userId: ID!, $followersLimit: Int, $followersCursor: Int) {
  user(userId: $userId) {
    followers(limit: $followersLimit, cursor: $followersCursor) {
      nickname
    }
  }
}
The problem appears when someone tries to load a list of users with their followers in the same query:
query GetUsersWithFollowers(
  $limit: Int
  $followersLimit: Int
  $followersCursor: Int
) {
  users(limit: $limit) {
    nickname
    followers(limit: $followersLimit, cursor: $followersCursor) {
      nickname
    }
  }
}
In this case, we cannot get rid of N+1 at all: we have to make a database call for each user because of cursor pagination. To handle such a case, we could use a less elegant solution and move pagination to the top level:
class Types::Query < Types::BaseObject
  field :users, [Types::User], null: false
  field :user, Types::User, null: true do
    argument :user_id, ID, required: true
  end
  field :user_followers, [Types::User], null: false do
    # The resolver method below expects user_id, so it has to be declared
    argument :user_id, ID, required: true
    argument :limit, Integer, required: true, default_value: 2
    argument :cursor, Integer, required: false
  end

  def users
    ::User.all
  end

  def user(user_id:)
    ::User.find(user_id)
  end

  def user_followers(user_id:, limit:, cursor: nil)
    scope = UserConnection.where(user_id: user_id).order(user_id: :desc).limit(limit)
    # Bind the cursor value through a placeholder
    scope = scope.where("user_id < ?", cursor) if cursor
    scope
  end
end
This design still makes it possible to load users and their followers, but we move from N+1 queries on the server side to N+1 HTTP requests. The solution looks fine, but hey, we love GraphQL for its logical schema structure! We want to fetch followers from the User type!
No problem. We can restrict fetching the followers field when multiple users are requested. Let’s return an error when it happens:
class Types::Query < Types::BaseObject
  field :users, [Types::User], null: false, extras: [:lookahead]
  field :user, Types::User, null: true do
    argument :user_id, ID, required: true
  end

  def users(lookahead:)
    if lookahead.selects?(:followers)
      raise GraphQL::ExecutionError, "followers can be accessed in singular association only"
    end

    ::User.all
  end

  def user(user_id:)
    ::User.find(user_id)
  end
end
With this schema, it’s still possible to fetch followers of a singular user, and we have completely prevented the unwanted scenario. Don’t forget to mention it in the docs!
That’s it! You’ve made it to the end of our little guide, and now you have at least six different approaches to try out in your Ruby-GraphQL code to make your application N+1 free.
Don’t forget to check out other articles on GraphQL and the N+1 problem on our blog: from the beginner-friendly code-along tutorial on building a Rails GraphQL application with a React frontend in three parts (start here) to the more specific use cases of using GraphQL with Active Storage Direct Upload, dealing with persisted queries coming from Apollo, and reporting non-nullable violations in graphql-ruby.
We also have a couple of gems to make dealing with N+1 easier in “classic” Rails apps and a couple of articles to go along with them: Squash N+1 queries early with n_plus_one_control test matchers for Ruby and Rails and Fighting the Hydra of N+1 queries.
Over the past few years, our team has invested a lot of effort, including building open source, into making GraphQL a first-class citizen in Rails applications. If you are thinking of introducing a GraphQL API in your Ruby backend, feel free to give us a shout.