Making Sense Out of Datomic, The Revolutionary Non-NoSQL Database
source link: https://blog.jakubholy.net/2013/06/16/making-sense-out-of-datomic-the-revolutionary-non-nosql-database/
Why? Why?!?
As we shall see shortly, Datomic is very different from traditional RDBMSs as well as from the various NoSQL databases. It isn't even a database - it is a database on top of a database. I couldn't wrap my head around that until now. The key to understanding Datomic and its unique design and advantages is actually simple.
The mainstream databases (and languages) were designed around the following constraints of the 1970s:
- memory is expensive
- storage is expensive
- it is necessary to use dedicated, expensive machines
But Datomic isn't an academic project. It is pragmatic: it wants to fit into our existing environments and make it easy for us to start using its futuristic capabilities now. And it is not as fresh and green as it might seem. Rich Hickey, the mastermind behind Clojure and Datomic, has reportedly thought about both projects for years, and the designs are really well thought through.
The Weird Architecture of Datomic
- Datomic is a database on top of another database (or rather storage): in-memory, a file system, a traditional RDBMS, or Amazon DynamoDB.
- You do not send your query to the server and get back the result. Instead, you get back all the data you need to execute the query and run the query - and all subsequent queries - locally. Thus, "joins" are pretty cheap and you can do plenty of otherwise impossible things (combine data from multiple databases and local data structures, run any code on them, ...). Each application using Datomic - a "peer" - will have the data it needs, based on its unique needs and usage patterns, close to itself.
- All writes go through one component, called the Transactor, which essentially serializes the writes, thus ensuring ACID. It might sound like a bottleneck, but it isn't for most practical purposes[1] given the design and typical application needs. (Reportedly, Datomic could handle all transactions for all credit cards in the world. Listen to the experiences of Room Key with their rather write-heavy load in the Relevance Podcast with Kurt Zimmer (Podcast Episode 033).)
- Datomic works quite similarly to a version control system such as Git. It never overwrites data, there are no updates. You only mark the data as not valid anymore and add new data, which produces a new version of the database (think of git hash / svn revision number). You can then query the latest state of the database or the state as of a particular version. (Of course the whole database isn't copied whenever you add a fact to it. Datomic is smart and efficient.)
- It is not a single, monolithic server; the storage, transactor, and peers are physically separate pieces.
This architecture follows from conditions that have replaced the old constraints:
- Network access is as fast as or faster than disk access => can fetch all the data over the network
- Plenty of memory => can store a substantial subset of it on each peer according to its actual needs
- Storage is huge and cheap => we can easily store historical data
- Experiences with efficient, immutable, "persistent" data structures used in modern FP languages => cheap creation of new "database values"
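The version-control analogy above can be made concrete with a toy model. The sketch below is not Datomic's API (Datomic is normally used from Clojure or Java); it is a minimal Python illustration, and `Fact`, `FactLog`, `transact` and `as_of` are hypothetical names. It assumes each attribute holds a single value, and it rebuilds a view by replaying the log, whereas a real implementation would rely on persistent data structures with structural sharing to avoid that cost.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Fact:
    """One immutable fact: entity, attribute, value, version, added flag."""
    entity: str
    attribute: str
    value: Any
    tx: int
    added: bool  # True = assertion, False = "not valid anymore"


class FactLog:
    """Toy append-only log; nothing is ever updated or deleted in place."""

    def __init__(self) -> None:
        self._facts: List[Fact] = []
        self._tx = 0

    def transact(self, changes: List[Tuple[str, str, Any, bool]]) -> int:
        """Append a batch of changes and return the new version number."""
        self._tx += 1
        for entity, attribute, value, added in changes:
            self._facts.append(Fact(entity, attribute, value, self._tx, added))
        return self._tx

    def as_of(self, tx: Optional[int] = None) -> Dict[Tuple[str, str], Any]:
        """Rebuild the (entity, attribute) -> value view as of version tx."""
        limit = self._tx if tx is None else tx
        view: Dict[Tuple[str, str], Any] = {}
        for fact in self._facts:
            if fact.tx > limit:
                break  # facts are appended in version order
            key = (fact.entity, fact.attribute)
            if fact.added:
                view[key] = fact.value
            elif view.get(key) == fact.value:
                del view[key]
        return view


log = FactLog()
v1 = log.transact([("alice", "email", "alice@old.example", True)])
v2 = log.transact([
    ("alice", "email", "alice@old.example", False),  # mark the old fact invalid
    ("alice", "email", "alice@new.example", True),   # add the new fact
])
print(log.as_of(v1))  # the database as it was at version 1
print(log.as_of())    # the latest state
```

The "update" in the second transaction is a retraction plus an assertion, so the old e-mail address is still reachable through `as_of(v1)` even though the latest view no longer contains it.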
The Unique Value Proposition And Capabilities of Datomic
We have now learned about and hopefully understood the unique design of Datomic. But what does it give us, and what distinguishes it from other databases?
The architecture, together with a few other design decisions, provides the following key characteristics:
- Programmability - data, schema, query input/output, transaction metadata are all just elementary data structures that you have fully available at the peer and can thus combine and process in powerful ways unimaginable before
- Persistence/accountability - you never lose history, can annotate transactions with metadata about who/why etc., support for finding out how things were, how they have been changing, performing what-if analysis
- Elastic scalability - since a lot of the load has been pushed to the peers
- Flexibility - no rigid schema, easy to navigate and combine and cache data based on each peer's unique needs, extensibility via data functions
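To illustrate the programmability and accountability points, here is a small Python sketch under stated assumptions. It is not Datomic's query language or API: the tuple layout, the attribute names (`plan`, `tx/author`, `tx/reason`), the sample data and the helper functions are all hypothetical. It only shows the idea that, once facts and transaction metadata are plain data available on the peer, a history query and a join against a local data structure are ordinary code.

```python
from typing import Any, Dict, List, Tuple

Datom = Tuple[Any, str, Any, int]  # (entity, attribute, value, transaction)

# Hypothetical data already fetched to the peer. Transactions are treated as
# entities too, so "who" and "why" are ordinary facts about them.
datoms: List[Datom] = [
    ("alice", "plan", "free", 1),
    (1, "tx/author", "signup-service", 1),
    ("alice", "plan", "premium", 2),
    (2, "tx/author", "bob@support", 2),
    (2, "tx/reason", "upgrade requested by phone", 2),
]

# A local data structure that never lived in the database.
local_prices: Dict[str, int] = {"free": 0, "premium": 20}


def history(entity: Any, attribute: str) -> List[Dict[str, Any]]:
    """List every value the attribute has had, with its transaction metadata."""
    rows = []
    for e, a, v, tx in datoms:
        if e == entity and a == attribute:
            # Collect all facts whose entity is the transaction itself.
            meta = {ma: mv for me, ma, mv, _ in datoms if me == tx}
            rows.append({"value": v, "tx": tx, **meta})
    return rows


def priced_history(entity: Any) -> List[Tuple[int, str, int]]:
    """Join database facts with the local price table, entirely on the peer."""
    return [
        (row["tx"], row["value"], local_prices[row["value"]])
        for row in history(entity, "plan")
    ]


for row in history("alice", "plan"):
    print(row)
print(priced_history("alice"))
```

Because nothing is overwritten, the question "who changed this and why?" needs no separate audit table in this model; it is just another lookup over the same facts.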
Closing Notes
Datomic has similar goals to relational databases (especially ACID) and could be used in similar use cases. Performance-wise, if writes are more important than reads, if you need to write a really large amount of data every second continuously, or if you have billions of "rows", then you might prefer another solution. Thanks to the design and the recommended architecture for heavily loaded installations, i.e. with memcached in front of the storage, the performance of the backend isn't so important (as the peers have the data they need locally or get it from memcached), so it should be selected based more on its usage-related characteristics.
Summary
The design of Datomic - peers fetching data and running queries locally, a single coordinator of writes (the transactor), building on existing databases/storage tools (and keeping all the history) - seemed very strange and perhaps inefficient to me until I realized that the traditional databases are designed around constraints that do not exist anymore. Datomic now makes sense to me and seems like a tool with intriguing capabilities and great potential. I hope you see it the same way now :-).
I have left out some interesting topics, such as what data structures can be stored in Datomic and the data model and query model used. To learn about these and more about Datomic, head to Datomic for Five Year Olds and Datomic's home page.
Bonus Links
- Data functions for optimistic and pessimistic locking in Datomic (forum answer)
- HighScalability.com: VoltDB Decapitates Six SQL Urban Myths and Delivers Internet Scale OLTP in the Process - a description of the architecture of VoltDB, which has a few things in common with Datomic (single-threaded writes, "stored procedures" as units of transaction, etc.)
- VoltDB - Mike Stonebraker's incredibly scalable, SQL, ACID database that also breaks with the constraints of the 70s and leverages huge RAM, single-threaded access, etc.