
Track cache hits with custom metrics

Robert Beekman on Nov 13, 2017

Every server running an app that uses AppSignal sends a collection of samples and metrics to our Push API every 30 seconds.

Each request has a key we use to determine which app the data came from. To do that, we need to query our database to find the app for each incoming request. With thirty billion requests per month, we're constantly trying to find ways to reduce the number of queries to make AppSignal faster.

We implemented caching to reduce the number of queries on our database clusters. Whenever we fetch an app from the database, we store it in Memcached for one minute. After deploying this change to production, we found that we were doing more queries than before. It seemed like the cache was being invalidated too often. To find out where, we added custom metrics around each place that invalidates the cache.
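The read-through pattern described above can be sketched as follows. This is a minimal illustration, not our production code: TinyCache is a hypothetical in-memory stand-in with a fetch/delete interface similar to Rails.cache, and DATABASE, QUERY_LOG, find_app and the "app/..." key format are made-up names so the example runs without Rails or Memcached.

```ruby
# Hypothetical in-memory stand-in for a cache store such as Rails.cache.
class TinyCache
  def initialize(clock: -> { Time.now })
    @clock = clock
    @store = {}
  end

  # Returns the cached value for key. On a miss (or an expired entry) it
  # runs the block, stores the result for expires_in seconds, and returns it.
  def fetch(key, expires_in:)
    entry = @store[key]
    return entry[:value] if entry && entry[:expires_at] > @clock.call

    value = yield
    @store[key] = { value: value, expires_at: @clock.call + expires_in }
    value
  end

  # Removes the entry; returns true if something was deleted.
  def delete(key)
    !@store.delete(key).nil?
  end
end

# Hypothetical data: DATABASE stands in for the real app collection,
# QUERY_LOG records every lookup that reaches it.
DATABASE = { "push-key-1" => { name: "Example app" } }.freeze
QUERY_LOG = []

# Looks up an app by push key, caching the result for one minute.
def find_app(cache, push_key)
  cache.fetch("app/#{push_key}", expires_in: 60) do
    QUERY_LOG << push_key
    DATABASE[push_key]
  end
end
```

With a cache like this, repeated calls to find_app for the same key within a minute only reach the database once, unless something deletes the key in between. That last part is exactly what went wrong for us.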

There are a couple of places where we invalidate the cache, such as when the push processed time is updated or when we detect a new namespace.

# Update last push processed at time
if app.last_push_processed_at < 5.minutes.ago
  app.set(:last_push_processed_at => Time.now)
  Rails.cache.delete(cache_key)
end
 
if namespaces_diff.any?
  app.add_to_set(:namespaces => namespaces_diff)
  Rails.cache.delete(cache_key)
end

We added multiple counters to determine which of these cache invalidations was the culprit. In this example we increment the app.cache.invalidate counter to count the total number of invalidations, and use specific keys such as app.cache.invalidate_push_time and app.cache.invalidate_namespaces to count each kind of invalidation separately.

# Update last push processed at time
if app.last_push_processed_at < 5.minutes.ago
  app.set(:last_push_processed_at => Time.now)
  Rails.cache.delete(cache_key)
  Appsignal.increment_counter('app.cache.invalidate', 1)
  Appsignal.increment_counter('app.cache.invalidate_push_time', 1)
end
 
if namespaces_diff.any?
  app.add_to_set(:namespaces => namespaces_diff)
  Rails.cache.delete(cache_key)
  Appsignal.increment_counter('app.cache.invalidate', 1)
  Appsignal.increment_counter('app.cache.invalidate_namespaces', 1)
end
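Since every invalidation repeats the same three calls, they could be wrapped in a small helper. The sketch below is one possible shape, not what we shipped: invalidate_cache, FakeCache and FakeMetrics are hypothetical names. The helper only assumes a cache that responds to delete(key) and a metrics object that responds to increment_counter(name, value), which is the interface Rails.cache and Appsignal offer.

```ruby
# Deletes a cache entry and counts both the total and the specific reason.
def invalidate_cache(cache, metrics, cache_key, reason)
  cache.delete(cache_key)
  metrics.increment_counter("app.cache.invalidate", 1)
  metrics.increment_counter("app.cache.invalidate_#{reason}", 1)
end

# Hypothetical fakes so the sketch runs without Rails or the AppSignal gem.
class FakeCache
  attr_reader :deleted

  def initialize
    @deleted = []
  end

  def delete(key)
    @deleted << key
    true
  end
end

class FakeMetrics
  attr_reader :counters

  def initialize
    @counters = Hash.new(0)
  end

  def increment_counter(name, value)
    @counters[name] += value
  end
end

cache = FakeCache.new
metrics = FakeMetrics.new
invalidate_cache(cache, metrics, "app/push-key-1", :push_time)
invalidate_cache(cache, metrics, "app/push-key-1", :namespaces)
```

In an application, the call would look something like invalidate_cache(Rails.cache, Appsignal, cache_key, :namespaces). Keeping the counter names in one place makes it harder to add a new invalidation without also counting it.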

With the custom metrics above in place, we could graph our cache invalidations over time. It became immediately apparent which invalidation caused the rise in queries: the app.cache.invalidate_namespaces counter was incremented on every request, meaning the namespace check cleared the cache each time.

The total number of cacheable requests is counted as app.cache.maybe.

After deploying a fix for this issue, the number of invalidations dropped to zero as long as the namespaces for an app weren't updated.
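The fix itself isn't shown here. Purely as an illustration, a check like namespaces_diff.any? stays quiet when the diff contains only namespaces the app doesn't have yet. A minimal sketch, assuming namespaces are plain arrays of strings (new_namespaces is a hypothetical helper, not the actual code):

```ruby
# Returns the incoming namespaces that are not yet stored on the app.
# The result is empty when every incoming namespace is already known,
# so a guard such as `diff.any?` would not trigger an invalidation.
def new_namespaces(known, incoming)
  incoming.uniq - known
end

known = %w[web background]
unchanged = new_namespaces(known, %w[web])        # nothing new
changed   = new_namespaces(known, %w[web admin])  # "admin" is new
```

Here unchanged is an empty array, so the cache would be left alone, while changed contains only "admin" and would trigger a single invalidation.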

Adding custom metrics makes it easier to understand what's happening where, when, and how often. In this case, counting cache invalidations and showing them in a readable graph allowed us to find the issue quickly. It takes only a couple of lines of code to increment a counter and create a dashboard.

Let us know if you have any questions about custom metrics and if we can help you to get them set up in your application. We're happy to help!
