Cache and reuse generated attribute methods by casperisfine · Pull Request #4209...
source link: https://github.com/rails/rails/pull/42095
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.
Context
By experimenting with #42085, I was able to better profile our application, which made me realize that define_attribute_methods
was responsible for about 140MB
of total resident memory as well as for ~15%
of boot time in our app. Of course you mileage may vary based on the size of the app.
The main bulk of the extra memory is from CodeGenerator
:
retained memory by file
-----------------------------------
...
125.79 MB rails-5e55ac69d843/activemodel/lib/active_model/attribute_methods.rb
A quick digging session shows that we have 238k
attribute methods, but only 50k (21%)
of them are unique:
>> attr_methods = ActiveRecord::Base.descendants.flat_map { |m| m.send(:generated_attribute_methods).public_instance_methods(false) }; nil => nil >> attr_methods.size => 238818 >> attr_methods.uniq.size => 50707
So if we were to be able to re-use them, we could theoretically cut that resident memory by ~80%
, so cut our total memory usage by ~88MB
(on the totally wrong assumption that sharing a method has no overhead, but it's just for napkin math needs).
Looking at the detail of the duplicated methods, it's pretty clear what going on:
>> attr_methods.group_by(&:itself).transform_values(&:size).sort_by(&:last).reverse.take(20) => [[:clear_id_change, 875], [:id_previously_was, 875], [:id_previous_change, 875], [:id_previously_changed?, 875], [:id_changed?, 875], [:id_change, 875], [:id_will_change!, 875], [:saved_change_to_id?, 875], [:id_change_to_be_saved, 875], [:will_save_change_to_id?, 875], [:id_before_last_save, 875], [:saved_change_to_id, 875], [:id_came_from_user?, 875], [:restore_id!, 875], [:created_at_was, 859], [:created_at_will_change!, 859], [:created_at, 859], [:created_at_change, 859], [:will_save_change_to_created_at?, 859], [:created_at_change_to_be_saved, 859]]
Just from that you can easily infer that we have 875 ActiveRecord::Base
descendants, and that the vast majority of them have a created_at
and updated_at
column. Then we have a long tail of other popular column names across models, some fairly specific to our usage, such as shop_id (817)
, but many that shouldn't be too uncommon in most apps such as type (250)
, deleted_at (171)
, name (141)
, title (72)
, etc.
Overall I think it's fair to assume attribute names are not random, and that it's common for an attribute name to be shared across multiple models.
micro benchmarks
Based on the above analysis I wondered wether we could use define_method(name, UnboundMethod)
to share the MRI InstructionSequence
between these similar methods, and wether it's effective:
# frozen_string_literal: true require 'heap-profiler' class GeneratedAttributeMethods < Module end cache = GeneratedAttributeMethods.new CACHE = {} 50_000.times do |i| meth_name = "foo_#{i}" cache.module_eval("def #{meth_name}; 42; end") CACHE[meth_name] = cache.instance_method(meth_name) end mods = 5.times.map { GeneratedAttributeMethods.new } $names = 25.times.map { |i| "foo_#{i}" } if ARGV.first == 'cache' def def_methods(mod) $names.each do |name| mod.define_method(name, CACHE[name]) end end else def def_methods(mod) mod.module_eval($names.map { |m| "def #{m}; 42; end" }.join(";")) end end HeapProfiler.report("/tmp/heap-prof-#{ARGV.first || 'orig'}") do mods.each { |m| def_methods(m) } end
$ heap-profiler /tmp/heap-prof-orig/ | grep 'Total retained' Total retained: 59.67 kB (391 objects) $ heap-profiler /tmp/heap-prof-cache/ | grep 'Total retained' Total retained: 6.44 kB (136 objects)
Answer, looks like yes.
This patch
So once the above is laid out, I believe the change is quite straightforward. Rather than directly generating the method code in each generated_attribute_methods
modules, we first define them in a central "cache" module, and then copy them from the cache into the final module.
Actual effect
I tried this patch against our app, the retained memory reduction was about 80MB
, so not far off from the napkin math.
retained memory by file
-----------------------------------
...
46.35 MB rails-88a98efb114b/activemodel/lib/active_model/attribute_methods.rb
Backward compatibility
Unfortunately at this stage there's one backward incompatibility issue that might break gems using the define_method_attribute*
API, such as my own frozen_record
: https://github.com/byroot/frozen_record/blob/42e41eb60d17df6a692e2e18e2ea7af6a28265b9/lib/frozen_record/compact.rb#L29-L31
This whole caching strategy assumes that two attribute methods with the same name will have the same source code. The ability for subclasses to define define_method_attribute*
breaks this assumption.
So I need to find a way to handle this.
Recommend
About Joyk
Aggregate valuable and interesting links.
Joyk means Joy of geeK