

John Fremlin's blog: How to optimise Common Lisp under SBCL: planning
source link: http://john.freml.in/sbcl-optimise-plan

Posted 2009-12-21 11:43:00 GMT
Planning the architecture

Your typical software project fulfils many functions. Before starting to optimise, you have to decide which functions you wish to make faster. For example, in tpd2 you can define a new URL (slowly) and request it over HTTP (quickly).

Once you have decided on a set of performance requirements, you can consider whether it is possible to achieve them with your system. This means you need to figure out a realistic numeric value for each performance requirement. Saying "as fast as possible" probably means that you have no way of quantifying the benefit of faster performance and so are wasting your energies pointlessly optimising something that doesn't need to go fast.
Hopefully, your existing design already meets your performance requirements. In that case, fantastic. You don't need anything more. Implement it knowing you needn't worry about performance.
For an example of a sensible requirement that can invalidate a design: suppose you want to serve 10k requests/s, and it takes around 300us to start a new thread. Starting one thread per request would then cost about 3 seconds of thread-creation time for every second of wall-clock time, so you cannot have one new thread for each request. There is no point starting out with a design that is based around one thread for each request.
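The thread-per-request estimate can be checked with a few lines of arithmetic. This is only a sketch: the helper name overhead-seconds-per-second is made up for illustration, and it assumes the 300us thread start cost is paid once per request and cannot be overlapped with anything useful.

```lisp
;; Hypothetical helper (not from tpd2): seconds of fixed overhead incurred
;; for every second of wall-clock time at a given request rate.
(defun overhead-seconds-per-second (requests-per-second overhead-seconds)
  (* requests-per-second overhead-seconds))

;; 10k requests/s and 300us (3/10000 s) per thread start. Exact rationals
;; are used to keep floating-point noise out of the estimate.
(defparameter *thread-start-load*
  (overhead-seconds-per-second 10000 3/10000))

(format t "Thread creation costs ~a s per second of wall-clock time~%"
        *thread-start-load*)
```

Any result above 1 means that a single core cannot keep up with the overhead alone, before any useful work is done.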
You have to estimate the minimum amount of actual work your approach will require. This requires common sense or experience. Is the bottleneck resource going to be network bandwidth? Disk seeks? Division operations on the CPU? JPEG image decodes? Is there enough RAM?
A very common source of unnecessary work is in changing from one data representation to another at module boundaries (for example, YUV to RGB image conversions). Often it can dominate the actual, useful work. When I was hacking on embedded video codecs (H.264 etc.), simply copying and converting took a large slice of processing. The same with Unicode conversion in early versions of tpd2.
If you are unsure about what is important and want to test your theories, then run a few mock-up experiments with made-up data. The time macro is very helpful for this. For example,
CL-USER> (let ((s (make-string 1000000 :initial-element #\x)))
           (time (sb-ext:octets-to-string
                  (sb-ext:string-to-octets s :external-format :utf-8)
                  :external-format :utf-8))
           nil)
Evaluation took:
  0.146 seconds of real time
  0.150000 seconds of total run time (0.150000 user, 0.000000 system)
  [ Run times consist of 0.040 seconds GC time, and 0.110 seconds non-GC time. ]
  102.74% CPU
  260,730,414 processor cycles
  21,773,968 bytes consed
So it takes approximately 260 cycles per character to convert from Unicode to UTF-8 and back (including GCing the resulting mess). Suppose each request involves 1kB of data; at 10k requests/s we then have to process 10MB/s of data. If we convert to UTF-8 and back with SBCL's slow string conversion, we will spend about 2.6 billion cycles per second, that is, the whole of a 2.6 GHz core, just doing that. So we cannot afford to do it, and we can figure that out before barrelling ahead down an impossible path.
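The same back-of-the-envelope calculation can be written down so that it is easy to redo with different figures. This is a sketch under stated assumptions: the function name cycles-per-second-needed is invented for illustration, and 1kB per request is taken to mean 1000 one-byte characters.

```lisp
;; Hypothetical helper: total cycles per second needed for a per-unit cost.
(defun cycles-per-second-needed (cycles-per-unit units-per-request
                                 requests-per-second)
  (* cycles-per-unit units-per-request requests-per-second))

;; Figures taken from the TIME output for the one-million-character round trip.
(defparameter *measured-cycles* 260730414)
(defparameter *chars-converted* 1000000)
(defparameter *cycles-per-char* (/ *measured-cycles* *chars-converted*))

;; Assumption: 1kB per request = 1000 characters; target is 10k requests/s.
(defparameter *required-hz*
  (cycles-per-second-needed *cycles-per-char* 1000 10000))

(format t "~,2f cycles per character, ~,2f GHz needed~%"
        *cycles-per-char* (/ *required-hz* 1e9))
```

Keeping the measured figures as exact rationals until the final format call avoids rounding the per-character cost before it is multiplied up.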
PS. If you want to run more automated experiments, you can get cycle counts with this macro (be careful to repeat tests until you are confident that the results are statistically significant).
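;; Evaluate BODY and return the number of processor cycles it took.
;; Relies on the Alexandria library and on SBCL-internal (sb-impl::) symbols.
(defmacro with-returning-cycles (&body body)
  (alexandria:with-unique-names (h0 l0 h1 l1)
    `(multiple-value-bind (,h0 ,l0) (sb-impl::read-cycle-counter)
       (locally ,@body)
       (multiple-value-bind (,h1 ,l1) (sb-impl::read-cycle-counter)
         (sb-impl::elapsed-cycles ,h0 ,l0 ,h1 ,l1)))))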
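One way to follow the advice about repeating tests is to collect several samples and look at their spread instead of trusting a single run. The sketch below is an assumption-laden illustration, not the author's method: it uses the portable get-internal-run-time rather than a cycle counter, and the names time-repeatedly and sample-median are made up.

```lisp
;; Hypothetical helper: call THUNK RUNS times and return the elapsed run
;; times (in internal time units), sorted in ascending order.
(defun time-repeatedly (thunk &key (runs 10))
  (sort (loop repeat runs
              collect (let ((start (get-internal-run-time)))
                        (funcall thunk)
                        (- (get-internal-run-time) start)))
        #'<))

;; Median of a non-empty sorted list (upper-middle element for even lengths).
(defun sample-median (sorted-samples)
  (nth (floor (length sorted-samples) 2) sorted-samples))

;; Made-up workload: sum a list of 100000 ones, five times over.
(defparameter *samples*
  (time-repeatedly
   (lambda () (reduce #'+ (make-list 100000 :initial-element 1)))
   :runs 5))

(format t "min ~a, median ~a, max ~a (internal time units)~%"
        (first *samples*) (sample-median *samples*) (car (last *samples*)))
```

If the minimum and the maximum are far apart, more runs (or a larger workload per run) are probably needed before the numbers mean anything.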