AI21 and Databricks show open source can radically slim down AI

Key Points:

  • Efficiency gains from running smaller models
  • Jamba’s combination of Transformer and state space model
  • DBRX’s use of “mixture of experts” approach


Two new open-source large language models have emerged as competitors to closed-source giants like OpenAI and Anthropic. AI21 Labs introduced Jamba, a hybrid model that combines a Transformer with a state space model. It matches the performance of larger models on various benchmarks while significantly reducing memory usage, and it offers the longest context window (the amount of text the model can consider at once) of any open-source model released to date. Jamba is available under the Apache 2.0 open-source license.
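AI21 has not published Jamba's exact layer layout in this summary, but the memory argument can be sketched with toy versions of the two ingredient blocks: self-attention, whose cache grows with sequence length, and a state-space scan, whose hidden state stays a fixed size. The dimensions, mixing ratio, and residual wiring below are illustrative assumptions, not Jamba's real configuration.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 8, 16  # toy model width and sequence length

def attention_block(x):
    """Self-attention: every position attends to every other,
    so cached keys/values (and memory) grow with sequence length."""
    scores = x @ x.T / np.sqrt(d)
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    return w @ x

def ssm_block(x, A, B, C):
    """Linear state-space scan: a fixed-size hidden state summarizes
    the past, so per-step memory is constant regardless of length."""
    h = np.zeros(A.shape[0])
    out = []
    for t in range(x.shape[0]):
        h = A @ h + B @ x[t]
        out.append(C @ h)
    return np.stack(out)

# Illustrative parameters for the state-space scan.
A = 0.9 * np.eye(d)
B = 0.1 * rng.standard_normal((d, d))
C = 0.1 * rng.standard_normal((d, d))

# A hybrid stack interleaves the two block types with residual
# connections (the ratio here is arbitrary, not Jamba's).
x = rng.standard_normal((n, d))
x = ssm_block(x, A, B, C) + x
x = attention_block(x) + x
x = ssm_block(x, A, B, C) + x
```

Because most layers in such a stack can be state-space blocks, only the occasional attention layer pays the length-dependent memory cost, which is how a hybrid can handle very long inputs cheaply.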


Meanwhile, Databricks unveiled DBRX, developed by its internal AI team, MosaicML, which uses a "mixture of experts" approach to conserve computing resources. Despite activating only 36 billion of its 132 billion parameters for any given prediction, DBRX outperforms larger models like GPT-3.5 on language-understanding and coding tests, and it generates text faster as a chatbot than models with more parameters. Databricks positions DBRX as a vehicle to drive enterprise adoption of open-source models, emphasizing customization and improved AI application quality. The code for DBRX is available on GitHub and Hugging Face under Databricks' open-source license.
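The mixture-of-experts idea behind that 36B-of-132B figure can be sketched with a toy top-k gating layer: a small router scores all experts but only the top-scoring few actually run, so compute scales with the number of active experts rather than the total. The dimensions, expert count, and gating scheme below are illustrative assumptions, not DBRX's actual design.

```python
import numpy as np

rng = np.random.default_rng(0)

def moe_layer(x, experts, gate_w, top_k=2):
    """Score all experts, run only the top_k highest-scoring ones,
    and mix their outputs with softmax weights. Compute cost scales
    with top_k, not with the total number of experts."""
    scores = x @ gate_w                   # one routing score per expert
    top = np.argsort(scores)[-top_k:]     # indices of the chosen experts
    weights = np.exp(scores[top] - scores[top].max())
    weights /= weights.sum()              # softmax over chosen experts only
    return sum(w * experts[i](x) for w, i in zip(weights, top))

# Toy setup: 8 experts, each a small linear map; only 2 run per input,
# so 3/4 of the parameters sit idle for any given token.
dim, n_experts = 4, 8
expert_mats = [rng.standard_normal((dim, dim)) for _ in range(n_experts)]
experts = [lambda v, M=M: v @ M for M in expert_mats]
gate_w = rng.standard_normal((dim, n_experts))

x = rng.standard_normal(dim)
y = moe_layer(x, experts, gate_w)
```

This is the same trade-off the article describes at scale: the model stores many experts' worth of parameters but pays the inference cost of only a fraction of them per prediction.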


While these advancements mark significant progress in the open-source AI landscape, one notable limitation is that neither model is multimodal: unlike GPT-4 and Gemini, they process only text, not images or video. Despite this drawback, the efficiency gains and strong benchmark performance of Jamba and DBRX make them compelling options for organizations seeking to build advanced language capabilities into their AI applications.



©2024 The Horizon