Implementation of a Distributed LRU Cache Using ZooKeeper
source link: https://dev.to/zakariamaaraki/implementation-of-distributed-lru-cache-using-zookeeper-159b
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.
In this tutorial, we are going to see how to create a distributed cache (LRU: Least recently used) using ZooKeeper for leader election.
Before starting you can find the source code of the project at the following link https://github.com/zakariamaaraki/Distributed-LRU-Cache
What is ZooKeeper
Apache ZooKeeper is an open source volunteer project under the Apache Software Foundation. It's essentially a service for distributed systems offering a hierarchical key-value store.
ZooKeeper is used to provide a distributed configuration service, leader-election mechanism, and service discovery / registry for large distributed systems.
To build our distributed cache, we are going to use two features provided by ZooKeeper :
- Leader-election to elect the master (leader) (we need only one instance to talk with. Why ? we are going to see that after)
- Service discovery / registry, to be able to forward requests to the master (leader)
Why Leader election ?
Suppose we have two instances (A and B) with which we communicate, we suppose also that both instances are alwayse synchronized. If we send the request 1:Set(x,1) to instance A, then to instance B request 2:Set(x,2), there is no guarantee that request 1 is executed before request 2, for several reasons, maybe instance A crash and restart or maybe because of the latency.
So the solution to this problem is to have only one instance to talk to, we can solve this problem by choosing one instance to be the master (leader), and we are allowed to talk only to the master.
Ok but what happens if this instance does down (this is a single point of failure), this is not good !!
to solve this problem we need an automatic mechanism that elect a leader whenever the leader instance goes down.
There is a lot of algorithms for doing leader election, for example :
Leader Election with ZooKeeper
Doing leader election with ZooKeeper is very simple. A simple way of doing that is to use the sequence & ephemeral flags when creating znodes that represent proposals of clients. The idea is to have a znode, say /cache-election, such that each znode creates a child znode cache-election/p-0000X With both flags sequence and ephemeral.
With the sequence flag, ZooKeeper automatically appends a sequence number that is greater than any one previously appended to a child of /cache-election. The instance that created the znode with the smallest appended sequence number is the leader.
Service Discovery
At the same time, we're using ZooKeeper as Service Discovery using Spring Cloud Zookeeper that leverages this extension for service registration and discovery.
Consistency vs Availability
Note that according to the CAP theorem, ZooKeeper is a CP system. This implies that it sacrifices availability in order to achieve consistency and partition tolerance. In other words, if it cannot guarantee correct behaviour it will not respond to queries.
Using a Service Discovery with a CP system is not a good idea, so it might be better if we use Eureka (which is a AP system) as a service discovery/registry instead of ZooKeeper (to guarantee availability) but in this case we'll need to set up another cluster.
Implementation
CRUD Operations
Cache Data structure
The java.util.LinkedHashMap.removeEldestEntry() method in Java is used to keep a track of whether the map removes any eldest entry from the map. So each time a new element is added to the LinkedHashMap, the eldest entry is removed from the map. This method is generally invoked after the addition of the elements into the map by the use of put() and putall() method.
public class Cache<K,V> {
public boolean isLeader;
public Long startUpTime;
private Map<K,V> map;
public Cache(int capacity) {
this.startUpTime = System.currentTimeMillis();
map = new LinkedHashMap(capacity, 0.75f, true) {
protected boolean removeEldestEntry(Map.Entry eldest) {
return size() > capacity;
}
};
}
public V get(K key) {
return map.get(key);
}
public boolean containsKey(K key) {
return map.containsKey(key);
}
public V remove(K key) {
return map.remove(key);
}
public V set(K key, V value) {
return map.put(key,value);
}
public String toString() {
return map.toString();
}
}
Leader Election code snippet
private void attemptForLeaderPosition() {
final List<String> childNodePaths = zooKeeperService.getChildren(LEADER_ELECTION_ROOT_NODE, false);
Collections.sort(childNodePaths);
int index = childNodePaths.indexOf(processNodePath.substring(processNodePath.lastIndexOf('/') + 1));
if(index == 0) {
log.info("[Process: " + nodeId + "] I am the new leader!");
cache.isLeader = true;
} else {
final String watchedNodeShortPath = childNodePaths.get(index - 1);
cache.isLeader = false;
watchedNodePath = LEADER_ELECTION_ROOT_NODE + "/" + watchedNodeShortPath;
log.info("[Process: " + nodeId + "] - Setting watch on node with path: " + watchedNodePath);
zooKeeperService.watchNode(watchedNodePath, true);
}
}
There is a lot of things to talk about, so i suggest you to see the whole code of the system by visiting this link : https://github.com/zakariamaaraki/Distributed-LRU-Cache
Recommend
-
8
pmem.ioPersistent Memory Programming libvmemcache - buffer-based LRU cache Introduction...
-
9
Caching in Python Using the LRU Cache Strategy by
-
8
在计算机软件领域,缓存(Cache)指的是将部分数据存储在内存中,以便下次能够更快地访问这些数据,这也是一个典型的用空间换时间的例子。一般用于缓存的内存空间是固定的,当有更多的数据需要缓存的时候,需要将已缓存的部分数据清除后再...
-
14
Implementing an LRU Cache in Rust Jan 28 ・Updated on Jan 29 ・19 min read...
-
4
Implement an LRU Cache
-
2
计算速度太慢?试试 lru_cache 装饰器发布于 今天 14:14 众所周知,python语言是相当好用的,但是它的执行性能也是相对其他语言比较慢的。还好python提供了一个非常...
-
2
用 Go 实现一个 LRU cache crossoverJie · 大约4小时之前 · 41 次点击 · 预计阅读时间 2 分...
-
6
Midpoint Insertion Strategy in MySQL LRU Cache 8 min read 455 reads Published on 26...
-
0
cookiecache Cache beyond the edge with cookies! This library provides a distributed PSR-16 compatible cache implementation for PHP 6 that uses browser cookies to store data, saving HUNDREDS of valuabl...
-
5
Caching in Python With lru_cache There are many ways to achieve fast and responsive applications. Caching is one approach that, when used correctly, makes things much faster while decreasing the load on computing resourc...
About Joyk
Aggregate valuable and interesting links.
Joyk means Joy of geeK