Mixtral is a mystery to me. How in the world is that team on par with/beating GOOGLE, who presumably have all the resources in the world to throw at this?
Mixtral is on par with Gemini Pro, not Gemini Ultra (and even there it is further behind Gemini Pro than Gemini Pro is behind GPT-3.5). But to directly answer your question, they are quite well-funded, having raised over $700M to date. I definitely wouldn't count them out.
Mixtral is missing in half of the benchmarks in that paper. Hardly conclusive. It’s also common knowledge that these benchmarks have a lot of issues[0]. A good litmus test, but not a substitute for actually seeing how the models do in the real world.
On the topic of “hardly conclusive” things, Gemini Pro literally told me just a few minutes ago[1] that the Avatar movies did not have humans in them. There was no funny business in the prompting. At least Mixtral knows that Avatar has humans in it. Most of Gemini Pro’s responses have been fine, but not exceptional.
Right. I'm just pointing out that comparing one model with a distilled version of another and then making broad statements about the companies behind them isn't really useful.
Surely you could make a comparison of two unreleased models, but it wouldn't be interesting because you don't have any real data (and benchmarks don't really mean anything).
Debating the usefulness of hn commentary is a somewhat philosophical issue, but I think it's entirely fair to draw parallels between what is, not what might be.
Gemini Ultra is self-evidently not ready for production. What are the issues? Who knows, but in a game that as of right now is mostly about reducing the amount of brute force required, something as "simple" as not being efficient enough is actually not something to gloss over. If your engine's entire shtick is having the greatest graphics but you can't make it run at acceptable fps, well, then it's not actually a usable product.
An LLM that has not actually been released could very well be in a comparably dire state, and fixing it while also delivering on the promised performance might be entirely non-trivial.
My understanding, however fuzzy, is that all the safety/politeness tuning results in models that are at times less likely to give accurate responses. That said, I suspect that either way both types of models largely give similar answers for soft questions, aside from the politeness and safety differences.
There's a survivorship bias going on here. You've never heard of the thousands of teams out there that are Mistral's size but AREN'T getting results that compete on the global stage, but they do exist. But you've heard of Google, whether they're getting it right or not.
"Thousands of teams" is a vast exaggeration. A tiny handful of companies out there have received funding to the tune of a billion dollars for model training like Mistral has. All of them have researchers with loaded resumes, and most are producing stuff of value. The thousands of other startups in the ecosystem are then taking these APIs and adding trivial abstractions on top.
Slight correction: Mistral AI was founded by two people from Meta (Guillaume Lample, Timothée Lacroix) and one from DeepMind (Arthur Mensch).
For new technologies, what matters most might be the universities people come from, rather than the companies. The founders of Google graduated from Stanford. The founders of Mistral AI graduated from École Polytechnique and École Normale Supérieure, which are renowned in France, notably for their scientific training.
Same as OpenAI, Anthropic, Cohere, Adept and hundreds of other small-to-mid-sized AI startups. When the dust settles and the space gets more mature, the exodus from Google Brain/DeepMind over the last few years will be considered this generation's Fairchild moment.