

SQLite in Go, with and without cgo
source link: https://datastation.multiprocess.io/blog/2022-05-12-sqlite-in-go-with-and-without-cgo.html
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.

SQLite in Go, with and without cgo
Most people use the mattn/go-sqlite3 package to interact with SQLite in Go. This package uses cgo and bundles the SQLite C amalgamation files with the Go source.
But Go developers often prefer not to use cgo (see for example cgo is not go). I mention this because there happens to be an underrated translation of SQLite's C source code to Go. Since it is a translation of C to Go, you don't need to use cgo to call into it. Some developers find this idea compelling. And this Go translation is an impressive work in and of itself.
But for real-world usage there are at least two major concerns I had: compatibility and performance. According to their documentation they pass most or all SQLite3 tests, so it seems pretty compatible. But I didn't see anything on their site or documentation (which, to their detriment, is pretty scant) that talked about performance. So I took a look.
This post summarizes some basic ingestion and query benchmarks using
mattn/go-sqlite3 and
modernc.org/sqlite. tldr; the
Go translation is consistently at least twice as slow for both small
datasets and large, for INSERT
s and SELECT
s.
Benchmarks
There are two main benchmarks, one ingest benchmark and one query benchmark. They are both run 10 times and both run against a growing number of rows.
Ingest
This benchmark inserts 10_000, 479_827, and 4_798_270 rows 10 times each. There are ten columns and the contents are a mix of randomly generated strings and integers.
Query
This benchmark runs a single GROUP BY
query: SELECT COUNT(1), age FROM people GROUP BY age ORDER BY COUNT(1) DESC
. It runs 10 times
against each of the sizes of rows ingested.
Here is the mattn/go-sqlite3 version. And here is the SQLite Go translated version.
Machine Specs
I am running these benchmarks on a dedicated bare metal instance, OVH Rise-1.
- RAM: 64 GB DDR4 ECC 2,133 MHz
- Disk: 2x450 GB SSD NVMe in Soft RAID
- Processor: Intel Xeon E3-1230v6 - 4c/8t - 3.5 GHz/3.9 GHz
Results
Below are the averages across all 10 runs for each query category, number of rows acting on, and the library (cgo or no cgo) used.
Category | Average Time (Seconds) | Standard Deviation | # Rows | Library |
---|---|---|---|---|
insert | 0.046708 | 0.002034 | 10000 | mattn |
insert | 0.094758 | 0.001631 | 10000 | modernc |
group_by | 0.000048 | 0.000006 | 10000 | mattn |
group_by | 0.003762 | 0.000246 | 10000 | modernc |
insert | 2.148416 | 0.012983 | 479827 | mattn |
insert | 4.533048 | 0.01208 | 479827 | modernc |
group_by | 0.000048 | 0.000006 | 479827 | mattn |
group_by | 0.230283 | 0.002151 | 479827 | modernc |
insert | 21.344969 | 0.084976 | 4798270 | mattn |
insert | 45.322141 | 0.114671 | 4798270 | modernc |
group_by | 0.000051 | 0.000005 | 4798270 | mattn |
group_by | 2.791617 | 0.01678 | 4798270 | modernc |
Summary
modernc.org/sqlite is a very
impressive project. But it is at least twice as slow in every
variation even with smaller datasets. And SELECT
performance gets
way worse as the dataset size increases. The Go translation also has a
greater standard deviation which might be attributable to the garbage
collector. My guess is that mattn/go-sqlite3 is so constant in
SELECT
performance even as the dataset increases because even 4
million rows is peanuts for SQLite. Or maybe my benchmark code is wrong!
Based on these results, if your workload has solely small datasets (i.e. small business apps) the tradeoff allowing you to avoid cgo could be worth it. Otherwise if you care strongly about performance you'll be better off with the real SQLite and mattn/go-sqlite3.
Share
Did you know there is a Go translation of SQLite's C source code? This translation allows you to use SQLite in Go without cgo, which many Go developers find enticing.
— Multiprocess Labs (@multiprocessio) May 12, 2022
But how does it perform? Unfortunately, at least twice as slow in all tested cases.https://t.co/rgarrMJFKQ pic.twitter.com/UzDO9BU371
With questions, criticism or ideas, email or Tweet me.
Also, check out DataStation and dsq.
Recommend
About Joyk
Aggregate valuable and interesting links.
Joyk means Joy of geeK