Databricks acquires Lilac AI to boost data quality for LLM training

source link: https://www.infoworld.com/article/3714681/databricks-acquires-lilac-ai-to-boost-data-quality-for-llm-training.html

Lilac AI’s suite of products when integrated with Databricks could help enterprises explore their unstructured data and use it to build generative AI applications.

By Anirban Ghoshal

Senior Writer,

InfoWorld | Mar 20, 2024 3:53 am PDT

artificial intelligence

Data lakehouse provider Databricks said it is acquiring Boston-based Lilac AI to help enterprises explore and use their unstructured data for building generative AI-based applications.

“Today, we are thrilled to announce that Lilac is joining Databricks. Lilac is a scalable, user-friendly tool for data scientists to search, cluster, and analyze any kind of text dataset with a focus on generative AI,” the company wrote in a blog post.

Lilac AI, according to listings on its portal, offered a service named Garden that would allow enterprises to search, quantify, and edit data for large language models (LLMs) that are to be used in generative AI-based applications.

This means Garden will allow data scientists and researchers to explore data clusters, derive new data categories using human feedback and classifiers, and tailor datasets based on these insights.

The offering, according to Databricks, can also be used to analyze model outputs for bias or toxicity and to prepare data for retrieval-augmented generation (RAG) and for fine-tuning or pre-training LLMs.

After the acquisition, the integration of Lilac’s Garden tool will help Databricks’ enterprise customers accelerate the development of generative AI applications, the company’s senior executives wrote.

Further, the executives said that they see Lilac as an essential add-on to MosaicML’s end-to-end tooling for developing generative AI-based applications.

Last year in June, Databricks acquired LLM and model-training software provider MosaicML for $1.3 billion to boost its generative AI offerings.

Lilac AI’s popularity as an open source project in the data science and AI research communities, along with the fact that Databricks’ own Mosaic AI team has been using Lilac to curate data over the past year, drove the acquisition, Databricks CTO Matei Zaharia and other senior executives wrote.

Lilac's founders, Daniel Smilkov and Nikhil Thorat, have at least a decade of experience at Google. While Thorat co-created TensorFlow.js and was the former tech lead of the Google Image Search user interface, Smilkov co-led TensorFlow.js at the internet giant.

Databricks, at least for the last year, has been acquiring companies to boost its generative AI capabilities to compete with rivals, such as Snowflake.

Before the Lilac AI and MosaicML acquisitions, the company had acquired AI-centric data governance platform provider Okera for an undisclosed sum in May last year.

The Okera acquisition was expected to boost Databricks’ data governance capabilities for training and managing large language models (LLMs), such as its own open source Dolly 2.0 LLM.

Snowflake, too, has been acquiring companies that not only boost its generative AI offerings but also bolster its capabilities around data management.

In May last year, the cloud-based data warehouse company acquired Neeva, a startup based in Mountain View, California, for an undisclosed sum in an effort to add generative AI-based search to its Data Cloud platform.

In February 2023, Snowflake acquired LeapYear to boost its data clean room abilities.

The LeapYear acquisition came just a month after Snowflake agreed to buy artificial intelligence-based time series forecasting platform provider Myst AI, taking the company’s acquisition count to seven companies in three years.
