LLM Providers
Groq
# pip install llama-index-llms-groq
from llama_index.llms.groq import Groq

llm = Groq(model="mixtral-8x7b-32768", api_key="xxx")  # replace "xxx" with your Groq API key
response = llm.complete("Explain the importance of low latency LLMs")
print(response)
Rate limits:
ID | REQUESTS PER MINUTE | REQUESTS PER DAY | TOKENS PER MINUTE |
---|---|---|---|
llama2-70b-4096 | 30 | 14,400 | 15,000 |
mixtral-8x7b-32768 | 30 | 14,400 | 9,000 |
gemma-7b-it | 30 | 14,400 | 15,000 |
llama3-70b-8192 | 30 | 14,400 | 7,000 |
llama3-8b-8192 | 30 | 14,400 | 12,000 |
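The tokens-per-minute quotas above can be enforced client-side before each request, so calls fail fast locally instead of hitting a 429. A minimal sketch with a sliding-window budget (`TokenBudget` is a hypothetical helper, not part of the Groq SDK; the `now` parameter exists only to make the example deterministic):

```python
import time
from collections import deque

class TokenBudget:
    """Client-side sliding-window limiter for a tokens-per-minute quota."""

    def __init__(self, tokens_per_minute, window=60.0):
        self.limit = tokens_per_minute
        self.window = window
        self.events = deque()  # (timestamp, tokens) pairs

    def used(self, now=None):
        """Tokens spent inside the current window."""
        now = time.monotonic() if now is None else now
        while self.events and now - self.events[0][0] >= self.window:
            self.events.popleft()
        return sum(t for _, t in self.events)

    def try_spend(self, tokens, now=None):
        """Record the spend and return True if it fits the quota, else False."""
        now = time.monotonic() if now is None else now
        if self.used(now) + tokens > self.limit:
            return False
        self.events.append((now, tokens))
        return True

# mixtral-8x7b-32768 allows 9,000 tokens per minute
budget = TokenBudget(9000)
budget.try_spend(8000, now=0.0)   # True
budget.try_spend(2000, now=1.0)   # False: would exceed 9,000 in the window
budget.try_spend(2000, now=61.0)  # True: the first spend has left the window
```

Before sending a request, estimate its token count and call `try_spend`; on `False`, sleep and retry.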
Together AI
Models are grouped by category:
- CHAT
- LANGUAGE
- ……

We need the CHAT category.
Organization | Model | API model string | Context length |
---|---|---|---|
Meta | LLaMA-3 Chat (8B) | meta-llama/Llama-3-8b-chat-hf | 8000 |
Meta | LLaMA-3 Chat (70B) | meta-llama/Llama-3-70b-chat-hf | 8000 |
Microsoft | WizardLM-2 (8x22B) | microsoft/WizardLM-2-8x22B | 65536 |
mistralai | Mistral (7B) Instruct | mistralai/Mistral-7B-Instruct-v0.1 | 8192 |
mistralai | Mistral (7B) Instruct v0.2 | mistralai/Mistral-7B-Instruct-v0.2 | 32768 |
mistralai | Mixtral-8x7B Instruct (46.7B) | mistralai/Mixtral-8x7B-Instruct-v0.1 | 32768 |
mistralai | Mixtral-8x22B Instruct (141B) | mistralai/Mixtral-8x22B-Instruct-v0.1 | 65536 |
curl -X POST "https://api.together.xyz/v1/chat/completions" \
-H "Authorization: Bearer $API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "mistralai/Mixtral-8x7B-Instruct-v0.1",
"messages": [
{"role": "system", "content": "You are an expert travel guide"},
{"role": "user", "content": "Tell me fun things to do in San Francisco."}
]
}'
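The same request can be issued from Python with only the standard library. A minimal sketch (assumes your key is in the `TOGETHER_API_KEY` environment variable; `together_chat` is a hypothetical helper, not an official client):

```python
import json
import os
import urllib.request

def together_chat(messages, model="mistralai/Mixtral-8x7B-Instruct-v0.1"):
    """POST a chat completion to the Together API and return the reply text."""
    req = urllib.request.Request(
        "https://api.together.xyz/v1/chat/completions",
        data=json.dumps({"model": model, "messages": messages}).encode(),
        headers={
            "Authorization": "Bearer " + os.environ["TOGETHER_API_KEY"],
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

# answer = together_chat([
#     {"role": "system", "content": "You are an expert travel guide"},
#     {"role": "user", "content": "Tell me fun things to do in San Francisco."},
# ])
```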
Rate limits:
Tier | Rate limit |
---|---|
Free | 1 QPS |
Paid | 100 QPS |
Even simpler, with LlamaIndex:
# pip install llama-index-llms-together
from llama_index.llms.together import TogetherLLM

# set the API key in the environment or pass it to the LLM:
# import os
# os.environ["TOGETHER_API_KEY"] = "your api key"
llm = TogetherLLM(
model="mistralai/Mixtral-8x7B-Instruct-v0.1", api_key="xxx"
)
resp = llm.complete("Who is Paul Graham?")
print(resp)
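To stay under the free tier's 1 QPS limit above, calls can be paced client-side. A minimal sketch (`paced_calls` is a hypothetical helper; the `sleep`/`clock` parameters are injectable only so the example is testable):

```python
import time

def paced_calls(fn, args_list, qps=1.0, sleep=time.sleep, clock=time.monotonic):
    """Invoke fn on each item, spacing calls at least 1/qps seconds apart."""
    interval = 1.0 / qps
    results, last = [], None
    for args in args_list:
        now = clock()
        if last is not None and now - last < interval:
            sleep(interval - (now - last))  # wait out the remainder of the interval
        last = clock()
        results.append(fn(args))
    return results

# free tier: 1 QPS, so at most one request per second, e.g.
# replies = paced_calls(lambda p: llm.complete(p), prompts, qps=1.0)
```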
Only the BGE versions.