Cassandra Internals: Writing Process

Reading Time: 3 minutes

What is Apache Cassandra?

Apache Cassandra is a massively scalable open source non-relational database that offers continuous availability, linear scale performance, operational simplicity and easy data distribution across multiple data centres and cloud availability zones. It was originally developed at Facebook The main reason that Cassandra was developed is to solve Inbox-search problem. To read more about Cassandra you can refer to this blog.

Why you should go for Cassandra over a Relational Database:-

Relational Database Cassandra

Handles moderate incoming data velocity

Handles high incoming data velocity

Data arriving from one/few locations

Data arriving from many locations

Manages primarily structured data

Manages all types of data

Supports complex/nested transactions

Supports simple transactions

Single points of failure with failover

No single points of failure; constant uptime

Supports moderate data volumes

Supports very high data volumes

Centralized deployments

Decentralized deployments

Data are written in mostly one location

Data written in many locations

Supports read scalability (with consistency sacrifices)

Supports read and write scalability

Deployed in vertical scale up fashion

Deployed in horizontal scale out fashion

How the write Request works in Cassandra:-

The client sends a write request to a single, random Cassandra node. The node who receives the request acts as a proxy and writes the data to the cluster.
The cluster of nodes is stored as a “ring” of nodes and writes are replicated to N nodes using a replication placement strategy.
With the RackAwareStrategy, Cassandra will determine the “distance” from the current node for reliability and availability purposes.
Now”distance” is broken into three buckets: the same rack as the current node, same data centre as the current node, or a different data centre.
You configure Cassandra to write data to N nodes for redundancy and it will write the first copy to the primary node for that data, the second copy to the next node in the ring in another data centre, and the rest of the copies to machines in the same data centre as the proxy.
This ensures that a single failure does not take down the entire cluster and the cluster will be available even if an entire data centre goes offline.

In Short, the write request goes from your client to a single random node, which sends the write to N different nodes according to the replication placement strategy. Now node waits for the N successes and then returns success to the client.

Each of those N nodes gets that write request in the form of a “RowMutation” message. The node performs two actions for this message:

Append the mutation to the commit log for transactional purposes
Update an in-memory Memtable structure with the change

Cassandra does not update data in-place on disk, nor update indices, so there are no intensive synchronous disk operations to block the write.

There are several asynchronous operations which occur regularly:

A “full” Memtable structure is written to a disk-based structure called an SSTable so we don’t get too much data in-memory only.
The set of temporary SSTables which exist for a given ColumnFamily is merged into one large SSTable. At this point, the temporary SSTables are old and can be garbage collected at some point in the future.

That’s how the Writing process works in Cassandra internally.

Refrences :- A Brief Introduction to Apache Cassandra and Cassandra Internals

What is Apache Cassandra?

Why you should go for Cassandra over a Relational Database:-

How the write Request works in Cassandra:-

Recommend

2021趋势研判：母婴市场现状与行业趋势

新能源车崛起，传统汽车如何避免诺基亚式危机？

四个维度，通过直播塑造个人品牌

五芳斋贺岁片《小福气》，看完被治愈了！

参与率82%的线上跨年，口碑爆棚的社群活动原来可以这么玩？

Increase mobile game revenue with cross-platform functionality (VB Live)

The future of creation and play in the Metaverse

Revolution of AI in 2020! Is It Real?

Self Awareness: The Key to Mental Health and Workplace Success

Code a Java Game with (almost) Zero Coding Skills

About Joyk