
John Fremlin's blog: How to optimise Common Lisp under SBCL: object pools

source link: http://john.freml.in/sbcl-optimise-gc-example

Posted 2009-12-21 11:41:00 GMT


Suppose your program is spending a long time in the garbage collector. This issue is likely to affect any high throughput Lisp program, unless carefully designed to avoid it.

The best way to reduce pressure on the GC and the memory hierarchy is to allocate less memory in the first place. If you have run out of ideas for changing your algorithm, the next best thing is to use dynamic-extent allocation to put objects on the stack. However, this is not always convenient or even possible.
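As a quick illustration of the dynamic-extent approach (this sketch is mine, not from the post; the function name is made up): declaring a local array DYNAMIC-EXTENT lets SBCL allocate it on the stack, so it never touches the heap or the GC, provided the array does not escape the function.

```lisp
;; Illustrative sketch: SUM-SQUARES is a hypothetical example, not
;; code from the post. The DYNAMIC-EXTENT declaration tells SBCL the
;; array dies when the function returns, so it can live on the stack.
(defun sum-squares (n)
  (let ((buf (make-array n :element-type 'fixnum)))
    (declare (dynamic-extent buf))
    ;; Fill the stack-allocated buffer; it must not be returned
    ;; or stored anywhere that outlives this call.
    (loop for i below n do (setf (aref buf i) (* i i)))
    (reduce #'+ buf)))
```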

One workaround when you have a good idea of the object lifetime is to make an object pool. Instead of freeing objects, keep them referenced in a pool. When you need a new one, request it from the pool and only allocate if empty. This seems like it should reduce the GC pressure and it often does.

Note however, that on SBCL it may sometimes make your program slower. Here is a program that demonstrates this.

;; Return the number of CPU cycles elapsed while executing BODY,
;; using SBCL's internal cycle counter (note: internal, unexported API).
(defmacro with-returning-cycles (&body body)
  (alexandria:with-unique-names (h0 l0 h1 l1)
    `(multiple-value-bind (,h0 ,l0)
	 (sb-impl::read-cycle-counter)
       ,@body
       (multiple-value-bind (,h1 ,l1)
	   (sb-impl::read-cycle-counter)
	 (sb-impl::elapsed-cycles ,h0 ,l0 ,h1 ,l1)))))
(defun make-buf ()
  (make-array 20))
(defun touch-buffers (n)
  (let ((buffers (loop repeat n collect (get-buf))))
    (let ((x (random #x100)))
      (loop for a in buffers
	    do (setf (aref a 0) x)))
    (mapc 'drop-buf buffers)))
 
#+:buf-pool
(progn
  (defvar *pool* nil)
  (defun get-buf ()
    (or (pop *pool*) (make-buf)))
  (defun drop-buf (buf)
    (push buf *pool*)))
 
#-:buf-pool
(progn
  (defun get-buf ()
    (make-buf))
  (defun drop-buf (buf)
    (declare (ignore buf))))
 
(defun benchmark (function &key (count 1000) (repeats 5))
  (let ((totals (make-array repeats :initial-element 0)))
    ;; warm up
    (loop repeat repeats do
	  (funcall function))
    (loop repeat count
	  do
	  (sb-ext:gc)
	  (loop for i below repeats do
		(incf (aref totals i)
		      (with-returning-cycles
			(funcall function)))))
    (map 'vector (lambda (x) (round x count)) totals)))

Using the non-pooled version, the cycle counts stay fairly consistent, though the first run after a GC is slightly slower.

CL-USER> (gc :full t) (benchmark (lambda () (touch-buffers 10000)))
#(4335098 3977324 4140443 4134816 4212069)
CL-USER> (gc :full t) (benchmark (lambda () (touch-buffers 10000)))
#(4111324 3750190 3726014 3897913 4096875)

However, with the pooled version, the first run after a GC suffers numerous page faults: the GC write-protects pages in oldspace, and mutating the pooled objects stored there forces those pages to be un-write-protected again. So the first run is twice as slow as the naive version! Of course, later runs are much faster.

CL-USER> (gc :full t) (benchmark (lambda () (touch-buffers 10000)))
#(7105889 1335499 1268755 1283744 1298589)
CL-USER> (gc :full t) (benchmark (lambda () (touch-buffers 10000)))
#(8959631 1586089 1519643 1525521 1546846)

This post demonstrates the need for measuring whether your optimisation is beneficial in a real test case, and that you might need to consider the effects of oldspace write protection. (Having a LIFO pool mitigates this to a certain extent, and is probably a good idea from a memory hierarchy perspective provided there is one pool per core.)
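A bounded LIFO pool along those lines might be sketched as follows (illustrative code of mine, not from the post; all names are made up). Keeping the pool small favours recently dropped buffers, which are more likely to be cache-warm and to live on pages the GC has already touched, while letting excess buffers be collected normally.

```lisp
;; Hedged sketch of a bounded LIFO buffer pool. Names (*BUFFER-POOL*,
;; GET-BUF*, DROP-BUF*) are illustrative and distinct from the post's
;; GET-BUF/DROP-BUF. A per-core pool would replace the special
;; variables with per-thread state.
(defparameter *buffer-pool* nil)
(defparameter *buffer-pool-size* 0)
(defparameter *max-pool-size* 1024)

(defun make-buf* ()
  (make-array 20))

(defun get-buf* ()
  "Take the most recently dropped buffer (LIFO), or allocate a fresh one."
  (let ((buf (pop *buffer-pool*)))
    (cond (buf (decf *buffer-pool-size*) buf)
          (t (make-buf*)))))

(defun drop-buf* (buf)
  "Return BUF to the pool unless the pool is already full;
a full pool simply lets BUF become garbage."
  (when (< *buffer-pool-size* *max-pool-size*)
    (push buf *buffer-pool*)
    (incf *buffer-pool-size*)))
```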


