ArangoDB 3.12 – Performance for all Your Data Models


We are proud to announce the GA release of ArangoDB 3.12!

Congrats to the team and community on the latest ArangoDB release! ArangoDB 3.12 focuses on improving performance and observability, both for the core database and for our search offering. In this blog post, we go through some of the most important changes and give you an idea of how you can use them in your products.

If you would rather try ArangoDB 3.12 directly than read about it, you can download the Community Version or Enterprise Trial, pull our Docker images, or head over to our Managed Service ArangoGraph for a free trial.

Improved memory accounting and usage

Version 3.12 features multiple improvements to the observability of ArangoDB deployments. Memory usage is more accurately tracked and additional metrics have been added for monitoring the memory consumption.

Note that AQL queries may now report a higher memory usage and thus run into memory limits sooner.
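
If a query now hits its limit sooner, the per-query limit can be adjusted when the query is submitted. The following is a minimal sketch, assuming the HTTP cursor API (POST /_api/cursor) and its memoryLimit attribute; the helper name build_cursor_request and the collection name accounts are made up for illustration, and the request body is only built, not sent.

```python
import json

def build_cursor_request(query, bind_vars=None, memory_limit_bytes=None):
    """Build the JSON body for POST /_api/cursor (hypothetical helper).

    If memory_limit_bytes is None, the attribute is omitted and the
    server-side default (startup option --query.memory-limit) applies.
    """
    body = {"query": query, "bindVars": bind_vars or {}}
    if memory_limit_bytes is not None:
        # memoryLimit is measured in bytes; the query fails once it is exceeded.
        body["memoryLimit"] = memory_limit_bytes
    return json.dumps(body)

# Hypothetical query against a hypothetical collection "accounts".
payload = build_cursor_request(
    "FOR a IN accounts FILTER a.flagged == true RETURN a._key",
    memory_limit_bytes=64 * 1024 * 1024,  # 64 MiB
)
```

Raising the limit for a single query in this way leaves the server-wide default untouched, which is usually the safer choice after an upgrade.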


The RocksDB block cache metric rocksdb_block_cache_usage now also includes the memory used for table building, table reading, file metadata, flushing and compactions by default.
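
To keep an eye on this metric, it can be read from the server's Prometheus-style metrics output (we assume the /_admin/metrics/v2 endpoint here). The sketch below only parses text; the sample is made up and is not real server output, and read_metric is a hypothetical helper.

```python
def read_metric(metrics_text, name):
    """Return the first sample value of `name` from Prometheus text, or None.

    Assumes lines of the form `name{labels} value` without a trailing timestamp.
    """
    for line in metrics_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        metric, _, value = line.rpartition(" ")
        # Drop an optional {label="..."} suffix before comparing names.
        if metric.split("{", 1)[0] == name:
            return float(value)
    return None

# Made-up sample in Prometheus exposition format.
sample = """\
# TYPE rocksdb_block_cache_usage gauge
rocksdb_block_cache_usage 268435456
"""

usage_bytes = read_metric(sample, "rocksdb_block_cache_usage")
print(f"block cache usage: {usage_bytes / 2**20:.0f} MiB")
```

Because the metric now covers more kinds of memory, dashboards and alert thresholds built on the old values may need to be revisited.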

Furthermore, the memory usage of some subsystems has been optimized. When dropping a database, all contained collections are now marked as dropped immediately. Ongoing operations on these collections can be stopped earlier, and memory for the underlying collections and indexes can be reclaimed sooner. Memory used for index selectivity estimates is now also released early. ArangoSearch now has a smaller memory footprint for removal operations.

Together, these changes make 3.12 more stable and more resilient against out-of-memory situations, particularly in resource-constrained environments such as containerized deployments, because memory scarcity is detected earlier and handled more gracefully.

Parallel execution within an AQL query

The new async-prefetch optimizer rule allows certain operations of a query to asynchronously prefetch the next batch of data while the current batch is being processed, so that parts of the query run in parallel. This improves performance as long as there is spare (scheduler) capacity.

The new Par column in a query explain output shows which nodes of a query are eligible for asynchronous prefetching. Write queries, graph execution nodes, nodes inside subqueries, LIMIT nodes and their dependencies above, as well as all query parts that include a RemoteNode are not eligible.

The profiling output for queries includes a new Par column as well, but it shows the number of successful parallel asynchronous prefetch calls.
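
To judge the effect of the rule on a given query, it can help to compare the explain output with and without it. A minimal sketch, assuming the HTTP explain API (POST /_api/explain) and the usual convention of disabling an optimizer rule by prefixing its name with a minus sign; build_explain_request and the collection name coll are hypothetical.

```python
import json

def build_explain_request(query, disabled_rules=()):
    """Build the JSON body for POST /_api/explain (hypothetical helper).

    Optimizer rules are switched off by prefixing the rule name with "-".
    """
    body = {"query": query}
    rules = [f"-{rule}" for rule in disabled_rules]
    if rules:
        body["options"] = {"optimizer": {"rules": rules}}
    return json.dumps(body)

# Hypothetical read-only query; write queries are not eligible anyway.
query = "FOR doc IN coll FILTER doc.value > 10 SORT doc.value RETURN doc"

with_prefetch = build_explain_request(query)
without_prefetch = build_explain_request(query, disabled_rules=["async-prefetch"])
```

Sending both bodies and comparing the Par column of the two plans shows which nodes the rule actually affects.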

Improved joins

The AQL optimizer now automatically recognizes opportunities to speed up local joins (e.g. when using SmartJoins or SatelliteCollections) with the merge join algorithm. Queries containing segments of two or more index scans that are local to a database server can now be optimized if the filter conditions are eligible.
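
For readers unfamiliar with the algorithm, here is a textbook merge join over two inputs that are already sorted by the join key, which is what index scans can provide. This is a generic illustration under that assumption, not ArangoDB's implementation; all names and data are made up.

```python
def merge_join(left, right, key):
    """Join two lists of dicts that are both sorted ascending by `key`."""
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][key], right[j][key]
        if lk < rk:
            i += 1  # left row has no partner yet, advance left
        elif lk > rk:
            j += 1  # right row has no partner, advance right
        else:
            # Pair left[i] with the whole run of equal keys on the right.
            j_run = j
            while j_run < len(right) and right[j_run][key] == lk:
                result.append((left[i], right[j_run]))
                j_run += 1
            i += 1  # j stays at the start of the run for duplicate left keys
    return result

customers = [{"cid": 1, "name": "a"}, {"cid": 2, "name": "b"}, {"cid": 4, "name": "d"}]
orders = [{"cid": 2, "item": "x"}, {"cid": 2, "item": "y"},
          {"cid": 3, "item": "z"}, {"cid": 4, "item": "w"}]

pairs = merge_join(customers, orders, "cid")
```

Both inputs are read once in order, so no hash table or repeated index lookup per row is needed.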

Multi-dimensional indexes

The previously experimental ZKD index type is now stable and has been renamed to MDI. Existing indexes keep the ZKD type.

Multi-dimensional indexes can now be declared as sparse to exclude documents in which the indexed attributes are missing or explicitly set to null. Any value other than null still needs to be numeric.

Multi-dimensional indexes now support storedValues to cover queries for better performance.

An additional MDI-prefixed index variant has been added that lets you specify additional attributes for the index to narrow down the search space using equality checks. This can, for example, be used as a vertex-centric index for graph traversals, if created on an edge collection with the first attribute in prefixFields set to _from or _to.
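
The options above can be combined in an index definition. The sketch below builds two such definitions as they might be sent to the HTTP index API (POST /_api/index?collection=<name>); the attribute names reflect our understanding of the 3.12 index options, while the helper mdi_index and the field names (x, y, label, start, end) are hypothetical.

```python
import json

def mdi_index(fields, sparse=False, stored_values=None, prefix_fields=None):
    """Build a multi-dimensional index definition (hypothetical helper)."""
    definition = {
        # The prefixed variant is selected when prefix fields are given.
        "type": "mdi-prefixed" if prefix_fields else "mdi",
        "fields": list(fields),
        "fieldValueTypes": "double",  # indexed values must be numeric
        "sparse": sparse,
    }
    if prefix_fields:
        definition["prefixFields"] = list(prefix_fields)
    if stored_values:
        definition["storedValues"] = list(stored_values)
    return definition

# Sparse index on two numeric attributes that also stores "label" for covering.
points_index = mdi_index(["x", "y"], sparse=True, stored_values=["label"])

# Vertex-centric variant on an edge collection: equality on _from, range on time.
edges_index = mdi_index(["start", "end"], prefix_fields=["_from"])

print(json.dumps(edges_index, indent=2))
```

The second definition is the vertex-centric case mentioned above: a traversal can first narrow down to the edges of one vertex and then apply the range conditions.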

WAND optimization (Enterprise Edition)

For ArangoSearch Views and inverted indexes (and by extension search-alias Views), you can define a list of sort expressions you want to optimize. This is also known as WAND optimization.

If you query a View with the SEARCH operation in combination with a SORT and LIMIT operation, search results can be retrieved faster if the SORT expression matches one of the optimized expressions.

Only sorting by highest rank is supported, that is, sorting by the result of a scoring function in descending order (DESC).
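
As a rough illustration, a View with one optimized sort expression and a matching query might look as follows. This is a sketch under assumptions: we assume the View property is called optimizeTopK and that @doc is the placeholder for the document variable; the View name, collection name, and field names are hypothetical, and nothing is sent to a server.

```python
import json

# Hypothetical View over a hypothetical collection "articles".
view_definition = {
    "name": "articles_view",
    "type": "arangosearch",
    "links": {"articles": {"fields": {"body": {"analyzers": ["text_en"]}}}},
    # Sort expressions to optimize; only descending score order is supported.
    "optimizeTopK": ["BM25(@doc) DESC"],
}

# SEARCH combined with SORT by score DESC and LIMIT, as described above.
top_k_query = """
FOR doc IN articles_view
  SEARCH ANALYZER(doc.body == @term, "text_en")
  SORT BM25(doc) DESC
  LIMIT 10
  RETURN doc.title
"""

create_view_body = json.dumps(view_definition)  # body for POST /_api/view
```

The SORT expression in the query has to match one of the listed expressions for the optimization to apply.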

SEARCH parallelization

Search queries can now be parallelized across segments using multiple threads. This helps to speed up many queries. The effect is particularly spectacular if not all search data is cached in RAM, since then reading the data from disk or SSD is the bottleneck for the query. We have seen speedups of 16x in such situations because the parallelization helps to better use the available I/O bandwidth.
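
The idea can be pictured with a toy model: each segment is scanned by its own worker, and the per-segment hits are merged into one top-k result. This is only an illustration of the principle, not ArangoSearch's implementation; the scoring (a simple term count) and all names and data are made up.

```python
from concurrent.futures import ThreadPoolExecutor
import heapq

def search_segment(segment, term):
    """Scan one segment and return (score, doc_id) pairs for matching docs."""
    return [(text.count(term), doc_id) for doc_id, text in segment if term in text]

def parallel_search(segments, term, top_k=3, threads=4):
    """Scan all segments concurrently and merge the hits into a top-k list."""
    with ThreadPoolExecutor(max_workers=threads) as pool:
        per_segment = pool.map(lambda seg: search_segment(seg, term), segments)
        all_hits = [hit for hits in per_segment for hit in hits]
    return heapq.nlargest(top_k, all_hits)

# Two made-up segments of (doc_id, text) pairs.
segments = [
    [("d1", "graph graph db"), ("d2", "document db")],
    [("d3", "graph search"), ("d4", "graph graph graph")],
]

best = parallel_search(segments, "graph", top_k=2)
```

In a real engine the gain comes mostly from overlapping disk reads across segments, which is why the effect is largest when the data is not already in RAM.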

Other notable features

Wildcard Analyzer
multi_delimiter Analyzer
External versioning support
Filter matching syntax for UPSERT operations
readOwnWrites option for UPSERT operations
Added AQL functions:
- PARSE_COLLECTION()
- PARSE_KEY()
- REPEAT()
- TO_CHAR()
- RANDOM()
Improved late document materialization
Transparent compression of requests and responses between ArangoDB servers and client tools (to save network bandwidth between availability zones)
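
As an example of the two UPSERT additions from the list above, the following sketch builds a cursor request with an UPSERT that uses the filter syntax together with the readOwnWrites option. It reflects our reading of the 3.12 syntax ($CURRENT refers to the document being matched); the collection name users and the attribute names are hypothetical, and the request is only built, not sent.

```python
import json

# UPSERT with a FILTER condition instead of an example object.
upsert_query = """
UPSERT FILTER $CURRENT.name == @name
INSERT { name: @name, logins: 1 }
UPDATE { logins: OLD.logins + 1 }
IN users
OPTIONS { readOwnWrites: false }
"""

# Body for POST /_api/cursor; bind parameters keep user input out of the query text.
request_body = json.dumps({
    "query": upsert_query,
    "bindVars": {"name": "alice"},
})
```

The filter form allows conditions beyond plain equality on an example document, such as range comparisons on attributes.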

Learn more

Join us on April 4th, 2024 for our release webinar to learn more about ArangoDB 3.12.
