Databricks acquires AI startup MosaicML in $1.3 billion deal
Major companies are buying up open-source AI tools to position themselves in this hot sector.
I’ll be writing slightly less than usual for a few weeks in the summer. This news really got my attention!
Just as open-source A.I. is starting to have its Linux moment, major companies and BigTech are rethinking how agnostic, and how open or closed, they are going to be.
MosaicML has been working on reinventing what a database is via artificial intelligence. It’s also a bit like Hugging Face, in that it lets you easily train and deploy LLMs and other generative AI models on your own data, in your own secure environment.
It’s definitely a “democratization of AI” play. Databricks and Snowflake are both super fun to follow and are diversifying impressively well.
I really did not expect this. Databricks, the Data and AI company, today announced it has entered into a definitive agreement to acquire MosaicML, a leading generative AI platform. Together, Databricks and MosaicML will make generative AI accessible for every organization, enabling them to build, own and secure generative AI models with their own data. The transaction is valued at approximately $1.3 billion, inclusive of retention packages.
In late June 2023, it definitely feels like M&A and IPOs are set to accelerate in the second half of the year.
But wait a second: MosaicML is only a 2-year-old startup, which makes the valuation wild, at roughly $21 million per employee. Wow, that’s something. It’s a premium Databricks must think is worth it, and a very fast exit.
With enterprises looking to leverage LLMs, it’s a hot sector, especially on the open-source side. I mean, they just released MPT-30B. It’s sort of a big deal. Prior to this, MosaicML had raised just under $64 million from investors that included DCVC, AME Cloud Ventures, Lux, Frontline, Atlas, Playground Global and Samsung Next.
LLMs are putting pressure on companies to adapt faster than ever before. This is a good move by Databricks, I think.
Databricks snaps up MosaicML to build private, custom machine models
Naveen Rao, [left] MosaicML co-founder and CEO, and Hanlin Tang, co-founder and CTO. The company's training technologies are being applied to "building experts," using large language models more efficiently to handle corporate data.
Think about it, though: this young startup is staffed with semiconductor veterans and has built a program called Composer that makes it easy and affordable to take any standard version of an AI program, such as OpenAI's GPT, and dramatically speed up its development, specifically the beginning phase known as the training of a neural network.
The implications here are Databricks fortifying itself with LLMs, stamping its place in open-source and doing a rather expensive acqui-hire.
And I love how crazy it is and feels. But think about it, there’s a lot of synergy here. Databricks' core platform helps customers store and sort incoming data from different sources in their own cloud clusters, while MosaicML offers tools to spin up custom AI models at low cost.
So there’s this crazy active ecosystem now running simultaneously all over the world, built around open-source innovation in A.I. Hugging Face and recent startup wonder-kid Mistral AI are just the tip of the iceberg.
Democratization of AI Buzzword Play
“Every organization should be able to benefit from the AI revolution with more control over how their data is used. Databricks and MosaicML have an incredible opportunity to democratize AI and make the lakehouse the best place to build generative AI and LLMs,” Ali Ghodsi, cofounder and CEO of Databricks, said in the release.
I guess what people are talking about is the wildly high valuation. Notably, its last investor-round valuation was just $222 million, meaning it has leaped roughly 6x with this exit, a remarkable price that really does underscore just how frothy the AI market is right now, as well as the demand for talent and tech in the space.
Companies aligning themselves with open source is the new cool in A.I., as Meta’s share price has found out, and it is something Amazon is investing in. Meanwhile, Microsoft and Google still feel a bit stuck in the old world of closed source, with the hype trains of OpenAI vs. Gemini, and so on.
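For anyone who wants to check the math, here is a quick back-of-the-envelope sketch using only the figures quoted in this post. The variable names are my own, and the headcount is merely what the $21 million per-employee figure implies, not a reported number.

```python
# Figures quoted in this post (all approximate, in USD).
deal_value = 1.3e9             # acquisition price, inclusive of retention packages
last_round_valuation = 222e6   # last investor-round valuation
price_per_employee = 21e6      # per-employee figure quoted earlier in the post

# How many times the last private valuation the exit price represents.
exit_multiple = deal_value / last_round_valuation

# Headcount implied by the per-employee figure (an inference, not a reported number).
implied_headcount = deal_value / price_per_employee

print(f"Exit multiple: {exit_multiple:.1f}x")
print(f"Implied headcount: {implied_headcount:.0f}")
```

So the "6x" is a slight round-up of about 5.9x, and the per-employee figure implies a team of roughly 62 people.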
Databricks Upgrade in LLM Credibility
There is a lot of synergy here, though: Databricks immediately elevates itself in terms of LLM credibility. The deal will see MosaicML become a part of the Databricks Lakehouse Platform, providing generative AI tooling alongside Databricks’ existing multicloud offerings, which include integration, storage, processing, governance, sharing, analytics and AI-related services.
MPT-30B looks really impressive, I’m not going to lie.
This year the startup introduced cloud-based commercial services where businesses can, for a fee, both train a neural network and perform inference, the rendering of predictions in response to user queries. Too good to pass up, as it turns out.
However, the more profound element of MosaicML's approach implies that whole areas of working with data, such as the traditional relational database, could be completely reinvented. This alone is getting a lot of attention, it would seem.
The quality of its customer base also really stands out; these are some pioneering and credible teams. Its customers include the Allen Institute for AI, Generally Intelligent, Hippocratic AI, Replit and Scatter Labs.
In my opinion, this elevates the speculative valuation of Databricks, which has yet to IPO, currently at around $42 billion. No doubt it’s going to be a monster IPO when it happens. But who knows in this world: Google acquiring Stripe is a rumor that piqued my interest, so anything can happen. Databricks itself would be a nice acquisition for someone like Amazon.
MosaicML was an Industry Leader
If you think of the open-source buzz in 2023, MosaicML was a major player. MosaicML is known for its state-of-the-art MPT large language models (LLMs). With over 3.3 million downloads of MPT-7B and the recent release of MPT-30B, MosaicML has showcased how organizations can quickly build and train their own state-of-the-art models using their data in a cost-effective way.
It was bringing LLMs to the masses, and to smaller enterprises and companies, in a way few others could.
The announcement tweet is sort of epic, and it really tells you the pace at which LLMs are being integrated into enterprise and data companies now.
Given how important LLMs are going to be across the board, this could look like a monumental acquisition in hindsight.
The Future of the Database
Unlike a traditional relational database, such as Oracle, or a document database, such as MongoDB, where the schema is preordained, said Rao, with a large language model "the schema is discovered from [the data], it produces a latent representation based upon the data, it's flexible." And the query is also flexible, unlike fixed lookups in a language such as SQL, which dominates traditional databases.
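To make that contrast a little more concrete, here is a toy sketch. It is only an illustration under heavy assumptions: the table, the sample documents and the helper names (`embed`, `cosine`, `search`) are all made up, and simple word counts stand in very crudely for the latent representation an LLM would learn. It is not how MosaicML or any LLM actually works.

```python
import math
import sqlite3
from collections import Counter

# Fixed schema: columns are declared up front and the query must name them.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tickets (id INTEGER, product TEXT, status TEXT)")
conn.executemany(
    "INSERT INTO tickets VALUES (?, ?, ?)",
    [(1, "router", "open"), (2, "laptop", "closed")],
)
fixed_rows = conn.execute("SELECT id FROM tickets WHERE status = 'open'").fetchall()

# Toy stand-in for a "discovered" representation: nothing is declared in
# advance, the features (words) come from whatever text is in the data.
docs = [
    "router keeps dropping wifi connection",
    "laptop battery drains overnight",
    "invoice charged twice this month",
]

def embed(text):
    """Hypothetical helper: represent text as a bag of word counts."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def search(query):
    """Free-text query: return the document most similar to the query."""
    q = embed(query)
    return max(docs, key=lambda d: cosine(q, embed(d)))

best = search("my wifi connection is unstable")
print(fixed_rows)
print(best)
```

The SQL lookup only works if you already know the column is called `status`; the free-text search needs no such knowledge, which is the flexibility Rao is pointing at.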
The San Francisco-based vendor was founded in 2021 and had raised $37 million in venture capital. It seems like peak open-AI LLM hype, so well timed and well played.
So did MosaicML have a data moat for its open-source enterprise LLM on-boarding?
Databricks said that the entire MosaicML team will join Databricks after the deal closes. That retention arrangement is likely one reason for the large price tag here.
MosaicML had made a name for itself by demonstrating its prowess in the MLPerf benchmark tests, which show how fast a neural network can be trained (ZDNet). Among the secrets to speeding up AI is the observation that smaller neural networks, built with greater focus, can be more efficient.
MosaicML seems like a way bigger fish than Snowflake bagging Neeva. But either way, databases and search are both really trending with LLMs, it turns out.
Thanks for reading!