This week in tech: Open source LLM

davidn
Cisco Employee

You may have heard about Mistral AI, a French artificial intelligence company. It was founded in April 2023 by researchers previously employed by Meta and Google. It raised 385 million euros (about $415 million) in October 2023, and in December 2023 it attained a valuation of more than $2 billion.

The company produces open-source large language models, the most notable of which are Mistral and Mixtral. Mistral AI just released Mixtral 8x7B, a high-quality sparse mixture-of-experts (SMoE) model with open weights, licensed under Apache 2.0. Mixtral outperforms Llama 2 70B on most benchmarks with 6x faster inference. It is the strongest open-weight model with a permissive license and the best model overall in terms of cost/performance trade-offs. In particular, it matches or outperforms GPT-3.5 on most standard benchmarks.

Mixtral has the following capabilities:

  • It gracefully handles a context of 32k tokens.
  • It handles English, French, Italian, German and Spanish.
  • It shows strong performance in code generation.
  • It can be fine-tuned into an instruction-following model that achieves a score of 8.3 on MT-Bench.

The following table compares Mixtral to the Llama 2 family and the GPT-3.5 base model. Mixtral matches or outperforms Llama 2 70B, as well as GPT-3.5, on most benchmarks.

(Benchmark comparison table; source: https://mistral.ai)

You can download and run Mistral's 8x7B model and its uncensored variants locally using open-source tools; a minimal sketch of one way to do this is shown below. It could be a good alternative to GPT-4, and you can fine-tune it with your own data. To learn more about Mistral AI models, check out their documentation page.
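For example, here is a minimal Python sketch of loading the Mixtral 8x7B Instruct checkpoint with the Hugging Face Transformers library. The model ID, the 4-bit quantization flag (which requires the bitsandbytes package), and the assumption that you have a GPU with enough memory are my own choices for illustration, not something specified in this post; adjust them for your setup.

from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed model ID: the official Mixtral 8x7B instruct checkpoint on Hugging Face.
model_id = "mistralai/Mixtral-8x7B-Instruct-v0.1"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",   # spread the weights across available GPUs / CPU
    load_in_4bit=True,   # 4-bit quantization to fit smaller hardware; needs bitsandbytes
)

# Build a chat-style prompt and generate a reply.
messages = [{"role": "user", "content": "Explain what a sparse mixture-of-experts model is."}]
inputs = tokenizer.apply_chat_template(messages, return_tensors="pt").to(model.device)
outputs = model.generate(inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

If you just want to chat with the model locally, tools such as Ollama (for example, ollama run mixtral) or llama.cpp offer an even simpler path.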
