Introducing DBRX: A New State-of-the-Art Open LLM

Key Points:

DBRX sets a new state-of-the-art for open LLMs and surpasses GPT-3.5.
DBRX’s fine-grained mixture-of-experts architecture improves training and inference performance.
DBRX outperforms established open source models on language understanding, programming, and math benchmarks.

Summary:

Databricks has unveiled DBRX, a large language model (LLM) that surpasses established open LLMs on standard benchmarks. It also outperforms the closed GPT-3.5 and, in some applications such as SQL, challenges GPT-4. DBRX uses a fine-grained mixture-of-experts (MoE) architecture, which improves both training and inference performance. It excels in tasks ranging from language understanding to programming and mathematics.
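
To make "fine-grained" concrete, the short sketch below counts how many distinct subsets of experts a router could select for a token. The figures used (16 experts with 4 active per token for DBRX, against 8 experts with 2 active for a coarser MoE design) come from Databricks' public description of the model, not from this summary, so treat them as assumptions.

```python
from math import comb


def expert_combinations(num_experts: int, active_per_token: int) -> int:
    """Number of distinct expert subsets a top-k router can choose from."""
    return comb(num_experts, active_per_token)


# Assumed configurations (not stated in this summary).
fine_grained = expert_combinations(16, 4)  # DBRX-style: 16 experts, 4 active
coarse = expert_combinations(8, 2)         # coarser MoE: 8 experts, 2 active
ratio = fine_grained / coarse

print(fine_grained, coarse, ratio)
```

More, smaller experts give the router many more ways to combine them for each token, which is the intuition usually offered for why a fine-grained design can improve quality.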

DBRX stands out for its efficiency: it is smaller than some competing models yet delivers stronger benchmark results. It was trained on a curated dataset using a suite of Databricks tools. The model weights are available on Hugging Face under an open license, and Databricks customers can access DBRX via APIs and pretrain or finetune their own models. DBRX is also integrated into GenAI-powered products, where it shows strong performance in applications like SQL.

Databricks’ focus on efficient training is evident in DBRX, which requires significantly less training compute than previous models of comparable quality. At inference time, DBRX achieves higher throughput than dense models like LLaMA2-70B, because an MoE model activates only a subset of its parameters for each input. This lets DBRX balance model quality against inference speed. DBRX is also competitive with closed models like GPT-3.5, Gemini 1.0 Pro, and Mistral Medium across various benchmarks.
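
The quality and speed trade-off described above comes from routing: each input is sent to only a few experts, so most of the model's parameters stay idle for that input. Below is a minimal sketch of top-k routing in pure Python. It is a toy illustration, not DBRX's actual router; the function names, the toy scalar "experts", and the choice to renormalize weights over the selected experts are all assumptions made for the example.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_logits, k):
    """Return [(expert_index, weight), ...] for the k highest-scoring experts.

    Weights are renormalized to sum to 1 over the selected experts
    (an assumption; real routers differ in how they normalize).
    """
    probs = softmax(router_logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]


def moe_forward(x, experts, router_logits, k):
    """Weighted sum of the outputs of the selected experts only."""
    return sum(w * experts[i](x) for i, w in route_top_k(router_logits, k))


# Hypothetical toy setup: 8 "experts", each just scales its input.
experts = [lambda x, s=s: s * x for s in range(1, 9)]
logits = [0.1, 2.0, -1.0, 0.5, 1.5, 0.0, -0.5, 0.3]

selected = route_top_k(logits, k=2)
y = moe_forward(3.0, experts, logits, k=2)
print(selected, y)
```

Only two of the eight toy experts run for this input, which is why an MoE model can hold many parameters in total while paying the compute cost of a much smaller model per token.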

Developing DBRX posed significant scientific and performance challenges, which Databricks addressed with its training stack. Using tools like Unity Catalog, Apache Spark™, and MLflow, Databricks produced DBRX over three months, building on years of LLM expertise. The model’s release is a step towards advancing GenAI capabilities, giving enterprises and the open community access to an advanced language model.

March 27, 2024

©2024 The Horizon