John Fremlin's blog: How to optimise Common Lisp under SBCL: planning

source link: http://john.freml.in/sbcl-optimise-plan

Posted 2009-12-21 11:43:00 GMT

Planning the architecture — Your typical software project fulfils many functions. Before starting to optimise, you have to decide which functions you wish to make faster. For example, in tpd2 you can define a new URL handler (slowly) and request it over HTTP (quickly). Once you have decided on a set of performance requirements, you can consider whether it is possible to achieve them with your system. This means you need to figure out a realistic numeric value for each requirement. Saying "as fast as possible" probably means that you have no way of quantifying the benefit of faster performance, and so are wasting your energies pointlessly optimising something that doesn't need to go fast.

Hopefully, your existing design already meets your performance requirements. In that case, fantastic. You don't need anything more. Implement it knowing you needn't worry about performance.

For an example of a sensible requirement that can invalidate a design: suppose you want to serve 10k requests/s, and it takes around 300µs to start a new thread. Then thread creation alone would cost 10,000 × 300µs = 3 seconds of CPU time per second of wall clock, so you cannot have one new thread for each request. There is no point starting out with a design that is based around one thread per request. (A quick way to measure this cost on your own machine is sketched below.)
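A minimal sketch for checking the thread-creation figure on your own hardware, assuming an SBCL built with thread support (the 1000 iterations and the empty thunk are arbitrary choices, not from the original post):

;; Time creating and joining 1000 trivial threads;
;; divide the reported real time by 1000 for the per-thread cost.
(time (loop repeat 1000
            do (sb-thread:join-thread
                (sb-thread:make-thread (lambda ())))))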

You have to estimate the minimum amount of actual work your approach will require. This requires common sense or experience. Is the bottleneck resource going to be network bandwidth? Disk seeks? Division operations on the CPU? JPEG image decodes? Is there enough RAM?

A very common source of unnecessary work is changing from one data representation to another at module boundaries (for example, YUV to RGB image conversions). Often this conversion dominates the actual, useful work. When I was hacking on embedded video codecs (H.264 etc.), simply copying and converting data took a large slice of the processing time. The same was true of Unicode conversion in early versions of tpd2.

If you are unsure about what is important and want to test your theories, run a few mock-up experiments with made-up data. The time macro is very helpful for this. For example:

CL-USER> (let ((s (make-string 1000000 :initial-element #\x)))
           (time (sb-ext:octets-to-string
                  (sb-ext:string-to-octets s :external-format :utf-8)
                  :external-format :utf-8))
           nil)
Evaluation took:
  0.146 seconds of real time
  0.150000 seconds of total run time (0.150000 user, 0.000000 system)
  [ Run times consist of 0.040 seconds GC time, and 0.110 seconds non-GC time. ]
  102.74% CPU
  260,730,414 processor cycles
  21,773,968 bytes consed
 

So it takes approximately 260 cycles per character to convert a string to UTF-8 octets and back (including GCing the resulting mess). Suppose each request involves 1kB of data; then at 10k requests/s we have to process 10MB/s. If we convert to UTF-8 and back with SBCL's slow string conversion, we will use 2.6 GHz worth of CPU just doing that. So we cannot afford to do it, and we can figure that out before barrelling ahead down an impossible path.
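A quick sanity check of those numbers at the REPL, using the figures from the timing above:

;; Cycles per character: total cycles over characters converted.
(/ 260730414 1000000.0)  ; => ~260.7 cycles per character
;; Budget at 10k requests/s with 1kB (1000 characters) per request:
(* 10000 1000 260)       ; => 2,600,000,000 cycles/s, i.e. 2.6 GHz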

PS. If you want to run more automated experiments, you can get cycle counts with this macro (be careful to repeat tests until you are confident that the results are statistically significant).

(defmacro with-returning-cycles (&body body)
  "Run BODY and return the number of CPU cycles it took."
  (alexandria:with-unique-names (h0 l0 h1 l1)
    `(multiple-value-bind (,h0 ,l0)
         (sb-impl::read-cycle-counter)
       (locally ,@body)
       (multiple-value-bind (,h1 ,l1)
           (sb-impl::read-cycle-counter)
         (sb-impl::elapsed-cycles ,h0 ,l0 ,h1 ,l1)))))
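As a minimal usage sketch (not from the original post), wrapping the earlier round-trip conversion returns its raw cycle count; remember the caveat above about repeating runs:

;; Hypothetical usage: cycle count for one string-to-UTF-8 round trip.
(let ((s (make-string 1000000 :initial-element #\x)))
  (with-returning-cycles
    (sb-ext:octets-to-string
     (sb-ext:string-to-octets s :external-format :utf-8)
     :external-format :utf-8)))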


