Google stunned the world with Gemini, a ChatGPT rival that could receive voice instructions and react to a real-time feed of the world while answering those voice prompts. It was miles ahead of what ChatGPT could do. Also, it was all fake. Disappointingly so.
We’ll certainly get to that future, and Google will help lead the way, but Gemini isn’t the ultimate ChatGPT rival quite yet. Google is clearly afraid of ChatGPT, going so far as to share a fake demo of its own chatbot and tout Gemini’s supremacy over ChatGPT in benchmark tests. But Google isn’t the only company chasing ChatGPT, and you should put Mistral’s AI on your radar.
Coming from a French startup that’s now valued at $2 billion, Mixtral 8x7B (that’s the AI’s name) might be one of the best ChatGPT alternatives out there. But you can’t do anything with it despite it being available in the wild.
According to TechCrunch, Mistral closed its Series A funding, collecting €385 million from investors, or about $415 million. That puts Mistral’s valuation at roughly $2 billion. The Series A round follows a $112 million seed round from earlier this year.
Mistral released its first AI model, Mistral 7B, in September. The model has around 7 billion parameters, which is considered small in comparison to GPT-4 and Claude 2. The French startup made Mistral 7B (Mistral-tiny) available to anyone to download, though it was aimed at developers. You could not use Mistral like you use ChatGPT.
This month, Mistral unveiled Mixtral 8x7B (Mistral-small). Again, this is available to developers for free. If you know your way around AI software, you can start using the new model and incorporate it into your apps. But Mixtral 8x7B doesn’t have a commercial face like ChatGPT does.
Mistral isn’t going to give you the ChatGPT generative AI experience you’ve come to expect from such products. Mistral isn’t even the first company to take a different approach. We’ve seen Amazon unveil its own generative AI product that targets professionals only.
Instead, Mistral is opening its platform to developers in beta. A Mistral-medium version is also in the prototyping phase.
Like Google, Mistral uses benchmarks to show that Mixtral 8x7B is better than ChatGPT and LLaMA 2 70B in most tests. Interestingly, Mistral compares its AI to GPT-3.5, the language model that powers the free version of ChatGPT.
Mixtral is “pre-trained on data extracted from the open Web – we train experts and routers simultaneously,” according to a company blog post. What are experts and routers? They’re part of how Mixtral works:
Mixtral is a sparse mixture-of-experts network. It is a decoder-only model where the feedforward block picks from a set of 8 distinct groups of parameters. At every layer, for every token, a router network chooses two of these groups (the “experts”) to process the token and combine their output additively.
This technique increases the number of parameters of a model while controlling cost and latency, as the model only uses a fraction of the total set of parameters per token. Concretely, Mixtral has 45B total parameters but only uses 12B parameters per token. It, therefore, processes input and generates output at the same speed and for the same cost as a 12B model.
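To make the routing idea a bit more concrete, here is a minimal Python sketch of what the blog post describes: a router scores 8 experts for a token, keeps the top two, and adds up their weighted outputs. This is a toy illustration, not Mistral’s code. The names (`moe_layer`, `router_weights`, `experts`), the tiny hidden size, the random weights, and the choice to normalize the router scores with a softmax over only the two chosen experts are all assumptions made for the example.

```python
import math
import random

NUM_EXPERTS = 8   # from the blog post: 8 groups of parameters per layer
TOP_K = 2         # from the blog post: the router picks two experts per token
DIM = 4           # toy hidden size, chosen only for illustration


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def make_matrix(rows, cols, rng):
    """Random toy weights; a real model would learn these in training."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def moe_layer(token, router_weights, experts, top_k=TOP_K):
    """Route one token vector through its top_k experts and sum the weighted outputs."""
    logits = matvec(router_weights, token)  # one router score per expert
    chosen = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:top_k]
    # Assumption: gate weights are a softmax over the chosen experts only.
    gates = softmax([logits[i] for i in chosen])
    output = [0.0] * len(token)
    for gate, idx in zip(gates, chosen):
        expert_out = matvec(experts[idx], token)  # toy expert: a single linear map
        output = [o + gate * e for o, e in zip(output, expert_out)]
    return output, chosen, gates


rng = random.Random(0)
router_weights = make_matrix(NUM_EXPERTS, DIM, rng)
experts = [make_matrix(DIM, DIM, rng) for _ in range(NUM_EXPERTS)]
token = [0.5, -1.0, 0.25, 2.0]

output, chosen, gates = moe_layer(token, router_weights, experts)
print("experts used:", chosen)
print("gate weights:", [round(g, 3) for g in gates])
print("fraction of experts active:", TOP_K / NUM_EXPERTS)
```

Only two of the eight experts do any work for a given token, which is where the savings come from. The 12B-of-45B figure Mistral quotes works out to a bit more than a quarter of the model, plausibly because the parts of the network outside the expert blocks still run for every token.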
That might be exciting if one could do anything with it. I mean regular mortals like you and me who are dying to put this new AI to good use, not developers. Elsewhere in the blog post, the company says Mixtral can “gracefully handle” a context of 32K tokens, handles English, French, Italian, German, and Spanish, and “shows strong performance in code generation.”
While you can use Gemini AI via Bard and the Pixel 8, Google’s newest ChatGPT rival only works in English.
We’ll have to wait for developers to see what they make with Mistral AI next. But the emergence of this new AI company is certainly an exciting development.