3

Understanding n+1 query problems in Ruby on Rails

 1 year ago
source link: https://sourcediving.com/understanding-n-1-query-problems-in-ruby-on-rails-b2ea4be86a2d
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.

Understanding n+1 query problems in Ruby on Rails

n+1 queries are a common problem in web application development. This common pattern (or antipattern) can degrade application performance by increasing server response times and by placing additional strain on limited database resources.

1*DJuNcRPQSgDccWSQ1BntGA.jpeg

Photo by Glenn Hansen on Unsplash

However, n+1 queries are sometimes desireable, and can be used along with other design patterns to improve performance. Because this pattern can have both positive and negative effects, it’s important that developers have a good understanding of n+1 patterns and their implications in the context of their application.

This guide aims to explain n+1 queries in the context of modern Ruby on Rails web applications.

What is an n+1 query?

An n+1 query occurs when we are loading a collection of records from a database, and also one related record for each record in that collection. The first query loads the collection of records from the database, and then one additional query is executed for each n record. Hence, the total number of queries executed is n + 1.

💡 Tip

It sometimes helps to think of this as a 1+n query, since the 1 (the query that loads the collection) is executed first, and then an additional n queries are executed thereafter.

A simple example

Suppose we have a collection of recipes and each recipe belongs to a user. In our app, we wish to display a list of each recipe’s title and the name of the user who published the recipe next to it:

"Mac and cheese by Marco"
"Lobster thermidore by Joe"
"Fish tacos by Betty"

We might choose to implement this in Ruby like this:

def title_and_author(recipe:, user:)
"#{recipe.title} by #{user.name}"
end

…and call our new helper method like so:

Recipe.limit(3).each do |recipe|
puts title_and_author(recipe: recipe, user: recipe.user)
end

If we were to look at the application logs in this hypothetical example, we would expect to see something like the following:

SELECT "recipes".* FROM "recipes" LIMIT 3
SELECT "users".* FROM "users" WHERE "users"."id" = 1 LIMIT 1
SELECT "users".* FROM "users" WHERE "users"."id" = 2 LIMIT 1
SELECT "users".* FROM "users" WHERE "users"."id" = 3 LIMIT 1

In this case, n = 3 (the number of recipes). For each recipe, an additional query is executed to load the related user record.

1 query to load the recipes. 3 queries to load each user.

total queries = n + 1 = 3 + 1 = 4

4 queries in total.

📝 Note

n+1 has become a common term to describe additional queries being executed for a set of n records. In practice, it is often more than simply 1 extra query per record. Sometimes the correct expression might be 2n+1 or 3n+2. We typically still refer to these as being "n+1 problems".

What causes n+1 query problems?

n+1s are a problem caused by human error, and tend to happen in applications that employ lazy-loading to improve performance.

Because Rails’s lazy-loading is so easy to use, and will defer the loading of resources until they are called in code, it’s common for developers to introduce n+1 queries to a feature without realising it.

Using common code design patterns and best-practices can sometimes make it harder to recognise an n+1 query by reading the code alone, because these encourage code to be separated out into many different places and reused in different contexts.

Here are some examples of code patterns that might introduce n+1 queries…

Iterating over a collection

n+1s, by definition, appear when we are working with collections of records. Often, they can appear when we call methods within iteration blocks, without pre-loading all of the data those blocks rely on.

Sometimes these patterns are very easy to spot, like in the following example:

<!-- app/views/recipes/index.html.erb --><% Recipe.limit(3).each do |recipe| %>
<%= recipe.user.name %>
<% end %>

But sometimes the patterns are less obvious. For example:

@recipes = Recipe.limit(3)
@user_names = @recipes.map(&:user).map(&:name)

In both of these examples, just the @recipes collection has been loaded from the database to begin with. Calling user on each recipe iteratively will trigger an additional query to load each recipe's user individually.

Separating code across architectural layers

Often these patterns can be hidden behind a series of different methods in different application layers. These instances are difficult to spot by just reading the code, especially when we’re only focussing on one particular area of the code (such as the view templates):

<!-- app/views/recipes/index.html.erb -->
<h1>Latest users to publish recipes</h1>
<%= latest_recipe_user_names(@recipes) %># app/helpers/recipes_helper.rb
module RecipesHelper
def latest_recipe_user_names(recipes)
recipes.map { |recipe| recipe.author_name }
.to_sentence
end
end# app/models/recipe.rb
class Recipe < ApplicationRecord
belongs_to :user def author_name
user.name
end
end

Metaprogramming / calling relations dynamically

Metaprogramming can make it harder still to recognise an n+1 by reading the code alone. Study the following example, and try to spot how it might introduce n+1 queries:

def collect_associated_names(collection, assoc_name)
association = collection.model.reflect_on_association(assoc_name)
unless association&.klass.instance_methods.include?(:name)
return Array.new(collection.count, nil)
end collection.map do |item|
if association.belongs_to? || association.has_one?
item.public_send(assoc_name).name
else
item.public_send(assoc_name).map(&:name)
end
end.flatten
end
@recipes = Recipe.limit(3)
collect_associated_names(@recipes, :user) # => ["Betty", "Joe", "Marco"]

If we were to inspect the application logs in each of the above examples, we would see the same n+1 pattern in each case:

SELECT "recipes".* FROM "recipes" LIMIT 3
SELECT "users".* FROM "users" WHERE "users"."id" = 1 LIMIT 1
SELECT "users".* FROM "users" WHERE "users"."id" = 2 LIMIT 1
SELECT "users".* FROM "users" WHERE "users"."id" = 3 LIMIT 1

Talking to strangers

n+1s can show up when an object tries to talk to a stranger.

A stranger, as defined in the Law of Demeter, is an object that is only related to the current object through one of its dependencies.

Consider our earlier example:

<!-- app/views/recipes/index.html.erb --><% @recipes.each do |recipe| %>
<%= recipe.user.name %>
<% end %>

In this case, our current context is the view template object. @recipes, and the local variable recipe are both friends or direct dependencies of the current view template object. But user has no direct relationship with the current view template object, and is only related to the template through the intermediary object recipe. In this example, user is a stranger to the view template.

By talking to user (by calling the method #name), this code may invoke n+1 database queries, because each recipe's user has not been explicitly eager-loaded as a dependency of the view template's dependency @recipes.

To state this problem more generally: n+1 queries are likely to occur when an object calls methods on other objects that are not direct dependencies of this object.

This tends to happen because, as we move further away from objects within our current working context, we tend to think less about their properties and how they are defined. While we’re working on the recipes index page, we’re less likely to be thinking about how the Recipe model is defined, and less likely still to think about Recipe’s relationships and dependencies.

Delegates pattern

A common pattern to avoid Law of Demeter violations, is to define delegates within a class’s public interface.

For example:

class Recipe < ApplicationRecord
belongs_to :user
delgate :name, to: :user, prefix: true
end<% @recipes.each do |recipe| %>
<%= recipe.user_name %>
<% end %>

This code is Demeter compliant because user_name has been defined as part of the public interface of Recipe, rather than us having to call recipe.user.name. By looking at the code in this example, there is no reason to suspect that this code will have any negative consequences.

However, behind the scenes, Rails’s delegate method is calling #user on each recipe individually, creating another n+1 problem—just as we saw in the previous example.

Delegates can help us write cleaner public interfaces, but we have hidden the complexity of our model’s relationship to the database behind another layer of abstraction, which makes it more difficult to recognise that an n+1 query has been introduced.

In summary

To summarize, this section:

  • There are various ways we can introduce n+1 problems into our application
  • n+1s can arise as a result of following established code-design best-practices
  • It’s often difficult to recognise an n+1 query simply by reading the code

How do I recognise an n+1 query?

As described in the section above, it’s difficult to spot an n+1 query by simply reading our application code.

Instead, the best way to find n+1 queries is by reading your application’s log. Here, you can see a record of every database query being made as part of each request, and look for instances that might indicate an n+1 query that can be optimised.

A simple n+1 log example

Typically, an n+1 query will look like many similar queries being executed one after the other:

SELECT "recipes".* FROM "recipes" LIMIT 3
SELECT "users".* FROM "users" WHERE "users"."id" = 1 LIMIT 1
SELECT "users".* FROM "users" WHERE "users"."id" = 2 LIMIT 1
SELECT "users".* FROM "users" WHERE "users"."id" = 3 LIMIT 1

These instances are the simplest, and are usually the easiest to detect and solve.

Sometimes, though, they are not so easy to identify, and require a more thorough read of the log files.

A non-sequential n+1 log example

Sometimes the repetitive n+1 queries are separated by other queries, and so they don’t stand out as obviously from scanning the application logs:

SELECT "recipes".* FROM "recipes" LIMIT 2
SELECT "users".* FROM "users" WHERE "users"."id" = 1 LIMIT 1
SELECT "ingredients".* FROM "ingredients" WHERE "ingredients"."recipe_id" IN (1,2)
SELECT "users".* FROM "users" WHERE "users"."id" = 2 LIMIT 1

In this example, the n+1 problem on users is separated by the SELECT queries on the ingredients table (the ingredients are properly eager-loaded as a collection). It’s important to review the entire log for a given request to look for similar queries becase they might not always be grouped together.

Analogous query example

Sometimes, we can load the same resources in different ways. This can lead to us to make more database queries than are necessary, but these aren’t always easy to spot in the log.

For example:

<% @recipes.each do |recipe| %>
<% if recipe.comments.any? %>
<%= render recipe.comments %>
<div>Showing a total of <%= pluralize("comment", recipe.comments.count) %></div>
<% end %>
<% end %>--- log output ---
Recipe Load (0.0ms) SELECT "recipes".* FROM "recipes"
Comment Exists? (0.1ms) SELECT 1 AS one FROM "comments" WHERE "comments"."recipe_id" = 1 LIMIT 1
Comment Load (0.1ms) SELECT "comments".* FROM "comments" WHERE "comments"."recipe_id" = 1
Comment Count (0.1ms) SELECT COUNT(*) FROM "comments" WHERE "comments"."recipe_id" = 1
Comment Exists? (0.0ms) SELECT 1 AS one FROM "comments" WHERE "comments"."recipe_id" = 2 LIMIT 1
Comment Load (0.0ms) SELECT "comments".* FROM "comments" WHERE "comments"."recipe_id" = 2
Comment Count (0.1ms) SELECT COUNT(*) FROM "comments" WHERE "comments"."recipe_id" = 2
Comment Exists? (0.0ms) SELECT 1 AS one FROM "comments" WHERE "comments"."recipe_id" = 3 LIMIT 1
Comment Load (0.0ms) SELECT "comments".* FROM "comments" WHERE "comments"."recipe_id" = 3
Comment Count (0.0ms) SELECT COUNT(*) FROM "comments" WHERE "comments"."recipe_id" = 3

In this example, the logs show 3n+1 queries, because for each recipe we are:

  • Checking for the existence of any related comments
  • Loading all of the related comments
  • Counting the related comments

… and each of these executes a separate query.

It’s important to understand that sometimes a query can look different from other functionally similar queries, and that this can hide opportunities for optimisation.

The above example could be optimised to execute only two queries:

@recipes = Recipe.limit(3).includes(:comments)<% @recipes.each do |recipe| %>
<% if recipe.comments.load.any? %>
<%= render recipe.comments %>
<div>Showing a total of <%= pluralize("comment", recipe.comments.length) %></div>
<% end %>
<% end %>--- log output ---
Recipe Load (0.0ms) SELECT "recipes".* FROM "recipes"
Comment Load (0.1ms) SELECT "comments".* FROM "comments" WHERE "comments"."recipe_id" IN (1,2,3)

When should I fix an n+1 query (and when should I not)?

When optimising a particular endpoint for performance, the ultimate goal should be to improve overall user experience. With this goal in mind, there are various factors you should consider, before you decide to address a particular n+1 query, including:

  1. Does this endpoint currently meet our SLOs?
  2. Do we cache any aspects of this endpoint?
  3. Does eager-loading really improve performance?

Does this endpoint currently meet our SLOs?

Service Level Objectives (SLOs) are the objective metrics we use to determine whether or not a part of our application is performing to the desired standards we have defined. These should be a strong consideration when deciding whether or not to fix performance issues such as n+1 problems. Software teams have limited resources to apply to solving problems, and often an endpoint’s performance is Good Enough™, even when it isn’t perfect.

If an endpoint has n+1 queries, but is currently performing within your SLOs, it might be reasonable to focus on other optimisation opportunities elsewhere.

If an endpoint is performing below your team’s SLOs, you should probably raise this as a problem to be addressed with your team.

Do we cache any aspects of this endpoint?

Caching — specifically fragment caching or Russian Doll caching — can drastically change whether or not it pays to fix n+1 problems. In some cases, code that results in n+1 queries can actually be beneficial to the overall performance of an endpoint.

Consider the following example:

<% @recipes.each do |recipe| %>
<% cache(recipe) do %>
<% cache recipe.user do %>
<h3>Written by: <%= recipe.user.name %></h3>
<% end %>
<%- @recipe.steps.each do |step| %>
<% cache(step) %>
<%= step.description %>
<% end %>
<% end %>
<% end %>
<% end %>

In this example, this endpoint would execute at least two n+1 queries. For each recipe, we’re also loading its associated user and steps. If the collection @recipes contained ten recipes, this code could trigger twenty-one database queries.

However, because we are caching the recipe content as a fragment of HTML, the methods #user and #steps will not be called on subsequent page loads—the recipe's HTML content would be returned from the cache instead. This means that, as long as this recipe is not updated, this endpoint will only execute one query per request to load this content.

If just one of the recipes is updated, the records associated with that recipe would trigger additional queries again in subsequent loads, but this would likely still be fewer than the original twenty-one queries.

Since we’re serving mostly cached content when our recipes page is loaded, eager-loading the related user and steps might actually be detrimental. With a warm cache, we only need to load the recipes collection from the database on each page load, and subsequently loading the users and steps would execute additional queries without providing any performance benefit.

Because of this effect, it is important to review the application logs with caching turned on in development mode, to see what effect caching has on the queries that are executed. This should be a factor in whether or not n+1 queries should be fixed or implemented.

See this interview with DHH and Nate Berkopec for more information.

Does eager-loading really improve performance?

As developers, we are taught to think of n+1s as always causing slower responses, and increasing database burden. Sometimes, paradoxically, it can be faster to execute multiple, simple queries than one more complex query across joins and with other conditions.

Consider the following example:

@recipes = Recipe.eager_load(:user)
@recipes.each do |recipe|
puts recipe.user.name
end--- log output ---
SELECT "recipes"."id" AS t0_r0, "recipes"."user_id" AS t0_r1, "recipes"."title" AS t0_r2, "recipes"."created_at" AS t0_r3, "recipes"."updated_at" AS t0_r4, "users"."id" AS t1_r0, "users"."name" AS t1_r1, "users"."created_at" AS t1_r2, "users"."updated_at" AS t1_r3 FROM "recipes" LEFT OUTER JOIN "users" ON "users"."id" = "recipes"."user_id"

Eager-loading users along with recipes creates a fairly lengthy sql query with an outer join. It is possible that, in a large production dataset with tens of millions of records, this query actually takes longer to execute than the alternative solution shown below, which performs three simple queries on the primary key index of a single table:

@recipes = Recipe.all
@recipes.each do |recipe|
puts recipe.user.name
end--- log output ---
SELECT "users".* FROM "users" WHERE "users"."id" = 1 LIMIT 1
SELECT "users".* FROM "users" WHERE "users"."id" = 2 LIMIT 1
SELECT "users".* FROM "users" WHERE "users"."id" = 3 LIMIT 1

The important point to iterate here, is that performance optimisation should be data-driven and outcome focussed. Sometimes, changes you make might not have the positive impact you expect them to, and so one should, if possible, benchmark changes with a production-like dataset before merging them; or monitor the impact of these changes shortly after deploying them; in order to validate that they are indeed improvements.

Summary

To summarise the above:

  • To identify n+1 query problems before they are deployed, you should review the application log to see how a given endpoint interacts with the database
  • Review the log both with caching disabled, and with caching enabled. Be sure to run a few requests to see how caching changes the database usage
  • Whether to spend time on resolving an n+1 or not should be a decision that is informed by our SLOs, and reviewed with your team
  • Resolving n+1 queries isn’t always best for optimal performance

About Joyk


Aggregate valuable and interesting links.
Joyk means Joy of geeK