Databricks acquires Lilac to supercharge data quality efforts for gen AI apps

Key Points:

  • Databricks acquires Lilac, a Boston-based data understanding and manipulation startup, to enhance data quality for large language model applications.
  • Databricks previously invested in Mistral, a generative AI startup, as part of its push to become a comprehensive platform for data and generative AI solutions.
  • Lilac’s scalable open-source solution provides AI-driven features to analyze and modify unstructured text data, assisting teams in ensuring high-quality data for training AI models.


Databricks has announced the acquisition of Lilac, a Boston-based startup focused on tools for data understanding and manipulation. The terms of the deal were not disclosed. By bringing Lilac’s team and technology into its data intelligence platform, Databricks aims to enhance users’ ability to improve data quality for developing large language model (LLM) applications.


This move aligns with Databricks’ strategy to expand beyond being a data platform and become a comprehensive solution for generative AI applications. Previously, the company invested in Mistral, a notable generative AI startup, and acquired MosaicML, whose products it now offers under the Mosaic AI name. These initiatives position Databricks as a key player in the generative AI domain.


Lilac, founded by former Google engineers in 2023, provides an open-source solution for exploring and manipulating unstructured data. Its scalable platform offers intuitive user interfaces and AI-driven features for analyzing, understanding, and modifying text data at scale. Databricks expects the acquisition to help developers curate datasets for custom generative AI systems more effectively.


Databricks executives highlighted the value of Lilac’s technology in analyzing model outputs for bias or toxicity and preparing data for various AI applications. By integrating Lilac’s tech stack into Databricks’ Mosaic AI tooling, the goal is to simplify data curation processes and improve visibility and control over unstructured data for businesses.


This strategic acquisition reinforces Databricks’ commitment to providing end-to-end tools for developing high-quality generative AI applications. The company’s platform now offers a range of resources for building LLM-powered systems, including open models from industry leaders and specialized tools for experimentation and customization.




©2024 The Horizon