3

SQLite in Go, with and without cgo

 2 years ago
source link: https://datastation.multiprocess.io/blog/2022-05-12-sqlite-in-go-with-and-without-cgo.html
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.
Stay in the loop about new blog posts; subscribe to the mailing list!

SQLite in Go, with and without cgo

Published on May 12, 2022 by Phil Eaton
sqlitegobenchmark

Most people use the mattn/go-sqlite3 package to interact with SQLite in Go. This package uses cgo and bundles the SQLite C amalgamation files with the Go source.

But Go developers often prefer not to use cgo (see for example cgo is not go). I mention this because there happens to be an underrated translation of SQLite's C source code to Go. Since it is a translation of C to Go, you don't need to use cgo to call into it. Some developers find this idea compelling. And this Go translation is an impressive work in and of itself.

But for real-world usage there are at least two major concerns I had: compatibility and performance. According to their documentation they pass most or all SQLite3 tests, so it seems pretty compatible. But I didn't see anything on their site or documentation (which, to their detriment, is pretty scant) that talked about performance. So I took a look.

This post summarizes some basic ingestion and query benchmarks using mattn/go-sqlite3 and modernc.org/sqlite. tldr; the Go translation is consistently at least twice as slow for both small datasets and large, for INSERTs and SELECTs.

Benchmarks

There are two main benchmarks, one ingest benchmark and one query benchmark. They are both run 10 times and both run against a growing number of rows.

Ingest

This benchmark inserts 10_000, 479_827, and 4_798_270 rows 10 times each. There are ten columns and the contents are a mix of randomly generated strings and integers.

Query

This benchmark runs a single GROUP BY query: SELECT COUNT(1), age FROM people GROUP BY age ORDER BY COUNT(1) DESC. It runs 10 times against each of the sizes of rows ingested.

Here is the mattn/go-sqlite3 version. And here is the SQLite Go translated version.

Machine Specs

I am running these benchmarks on a dedicated bare metal instance, OVH Rise-1.

  • RAM: 64 GB DDR4 ECC 2,133 MHz
  • Disk: 2x450 GB SSD NVMe in Soft RAID
  • Processor: Intel Xeon E3-1230v6 - 4c/8t - 3.5 GHz/3.9 GHz

Results

Below are the averages across all 10 runs for each query category, number of rows acting on, and the library (cgo or no cgo) used.

Category Average Time (Seconds) Standard Deviation # Rows Library
insert 0.046708 0.002034 10000 mattn
insert 0.094758 0.001631 10000 modernc
group_by 0.000048 0.000006 10000 mattn
group_by 0.003762 0.000246 10000 modernc
insert 2.148416 0.012983 479827 mattn
insert 4.533048 0.01208 479827 modernc
group_by 0.000048 0.000006 479827 mattn
group_by 0.230283 0.002151 479827 modernc
insert 21.344969 0.084976 4798270 mattn
insert 45.322141 0.114671 4798270 modernc
group_by 0.000051 0.000005 4798270 mattn
group_by 2.791617 0.01678 4798270 modernc

Summary

modernc.org/sqlite is a very impressive project. But it is at least twice as slow in every variation even with smaller datasets. And SELECT performance gets way worse as the dataset size increases. The Go translation also has a greater standard deviation which might be attributable to the garbage collector. My guess is that mattn/go-sqlite3 is so constant in SELECT performance even as the dataset increases because even 4 million rows is peanuts for SQLite. Or maybe my benchmark code is wrong!

Based on these results, if your workload has solely small datasets (i.e. small business apps) the tradeoff allowing you to avoid cgo could be worth it. Otherwise if you care strongly about performance you'll be better off with the real SQLite and mattn/go-sqlite3.

Share

Did you know there is a Go translation of SQLite's C source code? This translation allows you to use SQLite in Go without cgo, which many Go developers find enticing.

But how does it perform? Unfortunately, at least twice as slow in all tested cases.https://t.co/rgarrMJFKQ pic.twitter.com/UzDO9BU371

— Multiprocess Labs (@multiprocessio) May 12, 2022

With questions, criticism or ideas, email or Tweet me.

Also, check out DataStation and dsq.

Stay in the loop about new blog posts; subscribe to the mailing list!

About Joyk


Aggregate valuable and interesting links.
Joyk means Joy of geeK