LlamaIndex, Part 2 (QA and Evaluation)
Production-level paradigms
QA
Use cases:
What
- Semantic queries (semantic search / top-k)
- Summarization
Where
- Over documents
- Building a multi-document agent over the LlamaIndex docs
- Over structured data (e.g., JSON)
- Searching Pandas tables
- Text to SQL
How
All of the links above point to the following Q&A patterns.
The simplest Q&A:
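from llama_index.core import VectorStoreIndex, SimpleDirectoryReader

# Load every file under ./data into Document objects
documents = SimpleDirectoryReader("data").load_data()
# Build an in-memory vector index over the documents
index = VectorStoreIndex.from_documents(documents)
# Wrap the index in a query engine (retrieval + answer synthesis)
query_engine = index.as_query_engine()
response = query_engine.query("What did the author do growing up?")
print(response)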
Selecting among different data sources
Compare/Contrast Queries
I don't understand this part.
Besides the explicit synthesis/routing flows described above, LlamaIndex can support more general multi-document queries as well. It does this through the SubQuestionQueryEngine class. Given a query, this query engine generates a "query plan" containing sub-queries against sub-documents before synthesizing the final answer.
This query engine can execute any number of sub-queries against any subset of query engine tools before synthesizing the final answer. This makes it especially well suited to compare/contrast queries across documents, as well as queries pertaining to a specific document.
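The query-plan idea can be sketched without any LLM. In this minimal sketch the plan is a list of (tool name, sub-question) pairs, each pair is run against its tool, and the partial answers are combined. All names here (`run_query_plan`, the toy tools, the join-based "synthesis") are hypothetical stand-ins for what the LLM-driven engine does; this is not LlamaIndex's implementation.

```python
from typing import Callable, Dict, List, Tuple

# A "tool" here is just a function from question to answer. In LlamaIndex
# this role is played by a query engine wrapped as a tool.
Tool = Callable[[str], str]


def run_query_plan(plan: List[Tuple[str, str]], tools: Dict[str, Tool]) -> str:
    """Run each (tool_name, sub_question) pair, then combine the answers."""
    partial_answers = []
    for tool_name, sub_question in plan:
        answer = tools[tool_name](sub_question)
        partial_answers.append(f"[{tool_name}] {answer}")
    # Stand-in for LLM synthesis: simply join the partial answers.
    return "\n".join(partial_answers)


# Toy tools over two hypothetical documents (canned answers).
tools = {
    "doc_a": lambda question: "Revenue grew 10%.",
    "doc_b": lambda question: "Revenue fell 5%.",
}

# In the real engine the plan is generated by an LLM from the user query;
# here it is written by hand.
plan = [
    ("doc_a", "How did revenue change in document A?"),
    ("doc_b", "How did revenue change in document B?"),
]

combined = run_query_plan(plan, tools)
print(combined)
```

Because every sub-question names its own tool, the same loop covers both compare/contrast queries (several tools) and single-document queries (one tool).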
LlamaIndex can also support iterative multi-step queries. Given a complex query, it breaks the query down into an initial subquestion, then sequentially generates further subquestions based on the returned answers until the final answer is reached.
For instance, given the question "Who was in the first batch of the accelerator program the author started?", the module first decomposes the query into a simpler initial question, "What was the accelerator program the author started?", queries the index, and then asks follow-up questions.
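The iterative flow can be sketched as a plain loop: ask a sub-question, record the answer, and let the next sub-question depend on what has been answered so far. This is a minimal sketch, not LlamaIndex's implementation. `multi_step_query`, `toy_next_question`, and `toy_ask_index` are hypothetical names, and the canned answers ("Program X", "Alice and Bob") are placeholders rather than real data.

```python
from typing import Callable, List, Optional, Tuple

History = List[Tuple[str, str]]  # (sub_question, answer) pairs so far


def multi_step_query(
    query: str,
    next_question: Callable[[str, History], Optional[str]],
    ask_index: Callable[[str], str],
    max_steps: int = 5,
) -> History:
    """Ask sub-questions one at a time until next_question returns None."""
    history: History = []
    for _ in range(max_steps):
        sub_question = next_question(query, history)
        if sub_question is None:  # nothing left to ask
            break
        history.append((sub_question, ask_index(sub_question)))
    return history


def toy_next_question(query: str, history: History) -> Optional[str]:
    """Stand-in for the LLM that decomposes the query step by step."""
    if not history:
        return "What was the accelerator program the author started?"
    if len(history) == 1:
        program = history[0][1]  # reuse the previous answer
        return f"Who was in the first batch of {program}?"
    return None


def toy_ask_index(question: str) -> str:
    """Stand-in for querying the index; answers are placeholders."""
    answers = {
        "What was the accelerator program the author started?": "Program X",
        "Who was in the first batch of Program X?": "Alice and Bob",
    }
    return answers[question]


steps = multi_step_query(
    "Who was in the first batch of the accelerator program the author started?",
    toy_next_question,
    toy_ask_index,
)
for sub_question, answer in steps:
    print(sub_question, "->", answer)
```

The `max_steps` cap is a design choice of this sketch so that a decomposer that never returns `None` cannot loop forever.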
Eval
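- Response evaluation
- Retrieval evaluation
- Response evaluation
  - Evaluate using GPT-4
  - Evaluation dimensions
    - Generated answer vs. reference answer: correctness and semantic similarity
    - Generated answer vs. retrieved contexts: Faithfulness
    - Generated answer vs. query: Answer Relevancy
    - Retrieved contexts vs. query: Context Relevancy
  - Generating reference answers
- Retrieval evaluation
  - Metrics: ranking metrics such as mean reciprocal rank (MRR), hit rate, precision, and more.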
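The retrieval metrics named above can be computed by hand, which helps when reading evaluator output. The sketch below uses the common textbook definitions and does not call LlamaIndex; the function names and the three toy queries are made up for illustration.

```python
from typing import List, Sequence, Set


def hit_rate(retrieved: Sequence[str], relevant: Set[str]) -> float:
    """1.0 if any retrieved id is relevant, else 0.0."""
    return 1.0 if any(doc_id in relevant for doc_id in retrieved) else 0.0


def reciprocal_rank(retrieved: Sequence[str], relevant: Set[str]) -> float:
    """1 / rank of the first relevant id (ranks start at 1), else 0.0."""
    for rank, doc_id in enumerate(retrieved, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def precision(retrieved: Sequence[str], relevant: Set[str]) -> float:
    """Fraction of retrieved ids that are relevant."""
    if not retrieved:
        return 0.0
    return sum(doc_id in relevant for doc_id in retrieved) / len(retrieved)


def mean(values: List[float]) -> float:
    return sum(values) / len(values)


# Toy evaluation set: (retrieved ids in ranked order, relevant ids).
examples = [
    (["a", "b", "c"], {"b"}),       # first hit at rank 2
    (["d", "e", "f"], {"d", "f"}),  # first hit at rank 1
    (["g", "h", "i"], {"z"}),       # no hit
]

mean_hit_rate = mean([hit_rate(r, rel) for r, rel in examples])
mrr = mean([reciprocal_rank(r, rel) for r, rel in examples])
mean_precision = mean([precision(r, rel) for r, rel in examples])

print(f"hit rate={mean_hit_rate:.3f}  MRR={mrr:.3f}  precision={mean_precision:.3f}")
```

MRR rewards putting the first relevant result near the top, while hit rate only checks whether one appears at all, so the two can diverge on the same retriever.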
Usage examples
Integrations with other tools
- UpTrain: 1.9K: trial available, but you have to book a demo; looks expensive
- Tonic Validate (includes a web UI for visualizing results): has a commercial version; trial available, then $200/month
- DeepEval: 1.6K
- Ragas: 4.4K
  - Looks good
  - LlamaIndex --> Ragas --> LangSmith and other tools
  - However, running the quick start failed with the following error:
    ModuleNotFoundError: No module named 'ragas.metrics'; 'ragas' is not a package
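In Python, the form "No module named 'X.Y'; 'X' is not a package" means the import system found something named X, but it is a single module rather than a package. A frequent cause is a local script named ragas.py in the working directory shadowing the installed library. Whether that was the cause here is an assumption, since these notes do not record it. The sketch below uses only the standard library to show what a name resolves to; `describe_import` is a hypothetical helper.

```python
import importlib.util


def describe_import(name: str) -> str:
    """Report whether a top-level name resolves to a package, a module, or nothing."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return f"{name}: not installed (or not on sys.path)"
    if spec.submodule_search_locations is None:
        # A plain .py file: importing 'name.something' from it will fail
        # with "'name' is not a package".
        return f"{name}: plain module at {spec.origin}, not a package"
    return f"{name}: package at {list(spec.submodule_search_locations)}"


# Run this from the same directory as the failing quick start.
print(describe_import("ragas"))
```

If the output points at a ragas.py inside the project directory, renaming that file (and removing any stale `__pycache__` entry for it) is the usual fix.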