YDB – An open-source Distributed SQL Database

 2 years ago
source link: https://ydb.tech/

YDB is an open-source Distributed SQL Database that combines high availability and scalability with strong consistency and ACID transactions.

True Elastic Scalability
Add or remove nodes to easily scale up and scale down as needed. YDB is proven to work in real production with millions of transactions per second and petabytes of data for mission-critical, real-time applications.
Fault-tolerant
YDB is designed to work in three availability zones, ensuring availability even if a node or availability zone goes offline.
Easy to use
YDB combines strong consistency, ACID transactions, high-performance queries, and fast data ingestion with a familiar SQL dialect and JSON API support. It works with any modern workload: key-value, relational, or JSON.
Automatic disaster recovery
YDB recovers automatically after a disk, a server, or even a data center fails, with minimal latency disruption for applications.
Available in any cloud
YDB is available for self-deployment, including Kubernetes, on-premise and cloud environments.
Open Source
Licensed under Apache 2.0.
No risk of cloud or vendor lock-in.

Success stories

Market Cart
Site Visits
Jaeger Tracing
Voice Assistant
Smart Home

Market Cart

The Cart is one of the key components of any marketplace or online store.

Using YDB as its database allowed Market Cart to withstand a hundredfold increase in load on the Cart while still meeting strict response-time guarantees.

Moreover, the migration was completed by just one developer in one month.

Use cases

Dealing with suddenly increasing workloads
YDB’s elasticity lets you quickly change the amount of resources allocated to the database, adjusting throughput to match the load. Increase or decrease computing resources as needed ahead of expected peaks, such as Black Friday or a planned marketing campaign.
Cache with SQL interface
Low response times and throughput scalability allow YDB to be used simultaneously as an online database and a precomputed cache. SQL access significantly increases usability and enables operational analytics on data in the cache. Tour operator and travel aggregator websites, for example, can use YDB to cache flight or tour search results, as well as to recalculate prices and check seasonal availability.
Jaeger Tracing
YDB’s efficient use of computing resources and its scalability make recording Jaeger traces cost-effective and easy to use. The setup is time-tested and proven effective.
Document database
Support for the JSON API and the JSONDocument data type expands YDB’s capabilities as a document database.
Centralized inventory tracking system
YDB ensures strong transaction consistency. This lets you provide a consistent view of inventory across thousands of warehouses and retail facilities, and makes YDB the right choice for e-commerce, warehouse, and transport logistics applications.
Storage for the IoT Ecosystem
With automatic sharding support, YDB lets you handle data streams from a wide range of devices, which is the load profile typical of IoT projects.

How it works

YDB architecture

We use commodity hardware and a shared-nothing architecture with disaggregated compute and storage layers, and we build the system from logical components called tablets.

Hierarchy

As in a file system, tables can be organized into a hierarchy using directories.

Table

YDB provides users with a familiar abstraction: tables. Every table must have a primary key, and its data is stored sorted by that key. Tables are automatically sharded by primary key range, with splits triggered by either size or load.
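
To make range sharding concrete, here is a minimal Python sketch of how a key can be routed to a shard when shards cover contiguous primary key ranges. The split keys and the `shard_for_key` helper are hypothetical and only illustrate the idea; they are not taken from YDB's implementation.

```python
from bisect import bisect_right

# Hypothetical shard map: each entry is the first key of the next shard.
# Shard 0 holds keys < 100, shard 1 holds 100..499, shard 2 holds keys >= 500.
SPLIT_KEYS = [100, 500]


def shard_for_key(key: int, split_keys: list[int] = SPLIT_KEYS) -> int:
    """Return the index of the shard whose key range contains `key`."""
    # Binary search works because the split keys are sorted,
    # just like the table data itself.
    return bisect_right(split_keys, key)


if __name__ == "__main__":
    for key in (7, 100, 499, 500):
        print(key, "-> shard", shard_for_key(key))
```

Because the data is sorted by primary key, a range scan only has to touch the shards whose ranges overlap the requested interval.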

Split by load

The tablet will automatically split when the load increases.

Split by size

The tablet will automatically split when the size increases.
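
The two split triggers can be sketched roughly as follows. This is an illustration under stated assumptions, not YDB's actual policy: the `Shard` class, both thresholds, and the choice to split at the median key are all hypothetical.

```python
from dataclasses import dataclass, field

# Illustrative thresholds only; these are not YDB defaults.
MAX_ROWS = 4
MAX_REQUESTS_PER_SEC = 1000.0


@dataclass
class Shard:
    keys: list[int] = field(default_factory=list)  # kept sorted by key
    requests_per_sec: float = 0.0


def needs_split(shard: Shard) -> bool:
    """A shard is split when it is too big or too hot (and has >= 2 keys)."""
    too_big = len(shard.keys) > MAX_ROWS
    too_hot = shard.requests_per_sec > MAX_REQUESTS_PER_SEC
    return len(shard.keys) >= 2 and (too_big or too_hot)


def split(shard: Shard) -> tuple[Shard, Shard]:
    """Split at the median key; each half starts with half the load estimate."""
    mid = len(shard.keys) // 2
    half_load = shard.requests_per_sec / 2
    return Shard(shard.keys[:mid], half_load), Shard(shard.keys[mid:], half_load)


if __name__ == "__main__":
    big = Shard(keys=[1, 2, 3, 4, 5, 6], requests_per_sec=10.0)
    if needs_split(big):
        left, right = split(big)
        print(left.keys, right.keys)
```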

Automatic balancing

YDB distributes tablets evenly among the nodes and moves heavily loaded tablets off overloaded nodes. CPU, memory, and network metrics are tracked.
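
One simple way to picture this kind of balancing is a greedy loop that moves a tablet from the busiest node to the idlest one as long as the move narrows the gap. The sketch below assumes a single combined load number per tablet and a hypothetical `rebalance` function; the real balancer is not described here and tracks several metrics.

```python
def rebalance(nodes: dict[str, list[float]], max_moves: int = 100) -> list[tuple[str, str, float]]:
    """Greedily even out load. `nodes` maps node name -> tablet loads.

    Mutates `nodes` in place and returns the moves as (source, target, load).
    """
    moves = []
    for _ in range(max_moves):
        busiest = max(nodes, key=lambda n: sum(nodes[n]))
        idlest = min(nodes, key=lambda n: sum(nodes[n]))
        gap = sum(nodes[busiest]) - sum(nodes[idlest])
        # Only a tablet lighter than the gap narrows the imbalance when moved.
        candidates = [load for load in nodes[busiest] if 0 < load < gap]
        if not candidates:
            break
        tablet = max(candidates)
        nodes[busiest].remove(tablet)
        nodes[idlest].append(tablet)
        moves.append((busiest, idlest, tablet))
    return moves


if __name__ == "__main__":
    cluster = {"a": [5.0, 4.0, 3.0], "b": [1.0], "c": []}
    for source, target, load in rebalance(cluster):
        print(f"move tablet with load {load} from {source} to {target}")
    print({name: sum(loads) for name, loads in cluster.items()})
```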

Distributed Storage Internals

We write all the code for working with block devices ourselves. The PDisk component is responsible for working with the block device. Above PDisk is the VDisk abstraction layer. A special component, DSProxy, sits between the tablet (part of the table) and VDisk. DSProxy analyzes disk availability and characteristics and, based on them, can decide to exclude a disk from operation.

DSProxy

YDB writes data to three availability zones, doesn’t send requests to obviously bad disks, and continues to operate without interruption even if one AZ and a disk in another AZ are lost.
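
The failure scenario above can be modeled with a toy example. Everything in it is an assumption made for illustration: the group layout of three disks per zone, the `Disk` class, and the rule that the group can serve requests while healthy disks remain in at least two zones. The real DSProxy logic and quorum rules are more involved and are not described in this text.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Disk:
    name: str
    zone: str
    healthy: bool = True


def usable_disks(disks: list[Disk]) -> list[Disk]:
    """Skip disks marked as bad, so no requests are sent to them."""
    return [d for d in disks if d.healthy]


def can_serve(disks: list[Disk], min_zones: int = 2) -> bool:
    """Hypothetical rule: healthy disks must remain in at least `min_zones` zones."""
    return len({d.zone for d in usable_disks(disks)}) >= min_zones


if __name__ == "__main__":
    group = [Disk(f"{zone}-{i}", zone) for zone in ("az1", "az2", "az3") for i in range(3)]
    # Lose all of az1 plus one disk in az2.
    degraded = [
        replace(d, healthy=False) if d.zone == "az1" or d.name == "az2-0" else d
        for d in group
    ]
    print("usable:", [d.name for d in usable_disks(degraded)])
    print("can serve:", can_serve(degraded))
```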

How to start

Docker

Pull the current public version of the Docker image:

docker pull cr.yandex/yc/yandex-docker-local-ydb:latest

Create a working directory and start the container from it:

docker run -d --rm --name ydb-local -h localhost -p 2135:2135 -p 8765:8765 -p 2136:2136 -v $(pwd)/ydb_certs:/ydb_certs -v $(pwd)/ydb_data:/ydb_data -e YDB_DEFAULT_LOG_LEVEL=NOTICE -e GRPC_TLS_PORT=2135 -e GRPC_PORT=2136 -e MON_PORT=8765 cr.yandex/yc/yandex-docker-local-ydb:latest

For details, see the "Getting started > Self-hosted deploy > Docker" section of the YDB documentation.
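
After starting the container, a quick way to confirm that it is listening is to probe the three ports published by the `docker run` command above. The following Python sketch uses only the standard library; the `port_is_open` helper and the labels in `PORTS` are illustrative names, and a successful TCP connection only shows that the port accepts connections, not that the database is fully ready.

```python
import socket

# Ports published by the docker run command above.
PORTS = {"grpc-tls": 2135, "grpc": 2136, "monitoring": 8765}


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    for name, port in PORTS.items():
        state = "open" if port_is_open("localhost", port) else "closed"
        print(f"{name:<10} {port}: {state}")
```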

