Added

New Providers & Models: Mistral Large, Perplexity, Gemma 7b, and other models have been added

Check out the newly added models on orq.ai. You can find them in the model garden.

| Perplexity | Azure | Anyscale |
| --- | --- | --- |
| Llama-2-70b-chat | Mistral-Large | Google/Gemma-7b-it |
| Mistral-7b-instruct | Llama-2-7b | |
| Mixtral-8x7b-instruct | Llama-2-7b-chat | |
| Pplx-70b-chat | Llama-2-70b | |
| Pplx-70b-online | Llama-2-70b-chat | |
| Pplx-7b-chat | Llama-2-13b | |
| Pplx-7b-online | Llama-2-13b-chat | |

The table above gives an overview of all the newly added models, categorized by provider.

The Llama models were previously only available through Anyscale, but Azure now provides them as well. This is great news for users who work solely with Azure-hosted models.

Perplexity models

We're excited to introduce Perplexity models to orq.ai. Perplexity is unique in a few different ways:

  • Freshness - Whereas most models can't access the internet, Pplx-7b-online and Pplx-70b-online can. This allows them to return up-to-date information.
  • Hallucinations - To help prevent inaccurate statements, Perplexity's online models can be used to check whether an LLM's output matches the latest information available online.
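As a minimal sketch of that fact-checking idea: the snippet below builds a request for Pplx-7b-online asking it to verify a claim produced by another LLM. It assumes Perplexity's OpenAI-compatible chat completions endpoint; the system and user prompts are our own illustration, not a prescribed recipe.

```python
import json
import os
import urllib.request

# Assumption: Perplexity exposes an OpenAI-compatible chat completions
# endpoint, and the model name matches the table above.
API_URL = "https://api.perplexity.ai/chat/completions"

def build_fact_check_request(claim: str, model: str = "pplx-7b-online") -> dict:
    """Build a request asking an online model to verify a claim
    (e.g. another LLM's output) against current information on the web."""
    return {
        "model": model,
        "messages": [
            {
                "role": "system",
                "content": "You verify claims against up-to-date online sources.",
            },
            {
                "role": "user",
                "content": f"Is the following statement still accurate? {claim}",
            },
        ],
    }

payload = build_fact_check_request(
    "Mistral's newest flagship model is Mistral Large."
)

# Only send the request when an API key is actually configured.
api_key = os.environ.get("PERPLEXITY_API_KEY")
if api_key:
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        print(json.load(resp)["choices"][0]["message"]["content"])
```

The same request shape works for the offline Pplx chat models; only the online variants consult the web.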

Mistral's new flagship model: Mistral Large

Some key strengths and reasons to try out Mistral Large:

  • Fluency and cultural awareness - It can fluently communicate in English, French, Spanish, German, and Italian. It possesses a sophisticated understanding of grammar and cultural context.
  • 32k context window - With its 32k tokens context window, it can accurately recall information from extensive documents.
  • JSON format and function calling - It has built-in function calling and a constrained output (JSON) mode, which makes it ideal for app development.
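As a sketch of those last two features, the snippet below builds two request payloads for Mistral Large. It assumes Mistral's OpenAI-compatible request shape and the model id `mistral-large-latest`; the `get_weather` tool is a made-up illustration, not part of any API.

```python
import json

# Assumptions: OpenAI-compatible request fields ("response_format",
# "tools"), and "mistral-large-latest" selecting Mistral Large.

def build_json_mode_request(prompt: str) -> dict:
    """Ask Mistral Large for output constrained to valid JSON."""
    return {
        "model": "mistral-large-latest",
        "response_format": {"type": "json_object"},  # constrained output mode
        "messages": [{"role": "user", "content": prompt}],
    }

def build_tool_call_request(prompt: str) -> dict:
    """Declare a function the model may call instead of answering directly."""
    return {
        "model": "mistral-large-latest",
        "messages": [{"role": "user", "content": prompt}],
        "tools": [
            {
                "type": "function",
                "function": {
                    "name": "get_weather",  # hypothetical tool
                    "description": "Look up the forecast for a city.",
                    "parameters": {
                        "type": "object",
                        "properties": {"city": {"type": "string"}},
                        "required": ["city"],
                    },
                },
            }
        ],
    }

print(json.dumps(build_json_mode_request(
    "Return tomorrow's forecast for Paris as JSON with keys 'city' and 'forecast'."
), indent=2))
```

With JSON mode the response body is guaranteed to parse as JSON, which removes a whole class of output-parsing glue code from app backends.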

Google's Gemma 7b model

Gemma might not be as 'good' as Gemini, but it stands out in other ways.

  • Open source - Developers now have access to a transparent model that allows for customization.
  • Speed and cost - Because Gemma has only 7 billion parameters instead of Gemini's 60 billion, it is much cheaper and faster. This even allows you to run it on a laptop or other consumer devices where low latency is key.