# Query Transformations

Query transformations are a family of methods focused on rewriting and/or modifying the question used for retrieval.

As usual, start with environment setup:

```python
! pip install langchain_community tiktoken langchain-openai langchainhub chromadb langchain
```

Set up the LangChain environment variables:

```python
import os

os.environ['LANGCHAIN_TRACING_V2'] = 'true'
os.environ['LANGCHAIN_ENDPOINT'] = 'https://api.smith.langchain.com'
os.environ['LANGCHAIN_API_KEY'] = '<your-api-key>'
```

## Part 5: Multi Query

Docs: https://python.langchain.com/docs/modules/data_connection/retrievers/MultiQueryRetriever

Distance-based vector database retrieval embeds (represents) the query in a high-dimensional space and finds similar embedded documents by "distance". Retrieval can, however, produce different results given subtle changes in the query wording, or when the embeddings fail to capture the semantics of the data well. Prompt engineering/tuning is sometimes done by hand to work around these problems, but it can be tedious.

The MultiQueryRetriever automates the prompt-tuning process by using an LLM to generate multiple queries from different perspectives for a given user input query. For each query it retrieves a set of relevant documents, then takes the unique union across all queries to obtain a larger set of potentially relevant documents. By generating multiple perspectives on the same question, the MultiQueryRetriever may be able to overcome some of the limitations of distance-based retrieval and obtain richer results.

Let's run a simple example of the multi-query approach.

1. Environment setup

```python
# Build a simple vector database
from langchain_community.document_loaders import WebBaseLoader
from langchain_community.vectorstores import Chroma
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Load a blog post we want to examine
loader = WebBaseLoader("https://lilianweng.github.io/posts/2023-06-23-agent/")
data = loader.load()

# Split the document with RecursiveCharacterTextSplitter
text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=0)
splits = text_splitter.split_documents(data)

# Instantiate the vector DB
embedding = OpenAIEmbeddings()
vectordb = Chroma.from_documents(documents=splits, embedding=embedding)
```

2. Simple usage

```python
from langchain.retrievers.multi_query import MultiQueryRetriever
from langchain_openai import ChatOpenAI

question = "What are the approaches to Task Decomposition?"
llm = ChatOpenAI(temperature=0)
retriever_from_llm = MultiQueryRetriever.from_llm(
    retriever=vectordb.as_retriever(), llm=llm
)

# Set logging for the queries
import logging

logging.basicConfig()
logging.getLogger("langchain.retrievers.multi_query").setLevel(logging.INFO)

unique_docs = retriever_from_llm.get_relevant_documents(query=question)
len(unique_docs)
```

Output:

```
INFO:langchain.retrievers.multi_query:Generated queries: ['1. How can Task Decomposition be achieved through different methods?', '2. What strategies are commonly used for Task Decomposition?', '3. What are the various techniques for breaking down tasks in Task Decomposition?']
```

Supplying your own prompt

You can also supply your own prompt, paired with an output parser that splits the result into a list of queries.

```python
from typing import List

from langchain.chains import LLMChain
from langchain.output_parsers import PydanticOutputParser
from langchain.prompts import PromptTemplate
from pydantic import BaseModel, Field


# Output parser will split the LLM result into a list of queries
class LineList(BaseModel):
    # "lines" is the key (attribute name) of the parsed output
    lines: List[str] = Field(description="Lines of text")


class LineListOutputParser(PydanticOutputParser):
    def __init__(self) -> None:
        super().__init__(pydantic_object=LineList)

    def parse(self, text: str) -> LineList:
        lines = text.strip().split("\n")
        return LineList(lines=lines)


output_parser = LineListOutputParser()

QUERY_PROMPT = PromptTemplate(
    input_variables=["question"],
    template="""You are an AI language model assistant. Your task is to generate five
    different versions of the given user question to retrieve relevant documents from a vector
    database. By generating multiple perspectives on the user question, your goal is to help
    the user overcome some of the limitations of the distance-based similarity search.
    Provide these alternative questions separated by newlines.
    Original question: {question}""",
)
llm = ChatOpenAI(temperature=0)

# Chain
llm_chain = LLMChain(llm=llm, prompt=QUERY_PROMPT, output_parser=output_parser)

# Other inputs
question = "What are the approaches to Task Decomposition?"

# Run
retriever = MultiQueryRetriever(
    retriever=vectordb.as_retriever(), llm_chain=llm_chain, parser_key="lines"
)  # "lines" is the key (attribute name) of the parsed output

# Results
unique_docs = retriever.get_relevant_documents(
    query="What does the course say about regression?"
)
len(unique_docs)
```

Output:

```
INFO:langchain.retrievers.multi_query:Generated queries: ["1. What is the course's perspective on regression?", '2. Can you provide information on regression as discussed in the course?', '3. How does the course cover the topic of regression?', "4. What are the course's teachings on regression?", '5. In relation to the course, what is mentioned about regression?']
```

That concludes the quick example. Now back to the indexing process.

```python
# Load the blog post
import bs4
from langchain_community.document_loaders import WebBaseLoader

loader = WebBaseLoader(
    web_paths=("https://lilianweng.github.io/posts/2023-06-23-agent/",),  # the blog URL to load
    bs_kwargs=dict(
        parse_only=bs4.SoupStrainer(
            class_=("post-content", "post-title", "post-header")  # only parse HTML elements with these classes
        )
    ),
)
blog_docs = loader.load()  # load and parse the blog content

# Split
from langchain.text_splitter import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter.from_tiktoken_encoder(
    chunk_size=300,  # tokens per chunk
    chunk_overlap=50  # token overlap between chunks
)

# Create the splits
splits = text_splitter.split_documents(blog_docs)  # break the blog content into smaller chunks

# Index
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import Chroma

vectorstore = Chroma.from_documents(documents=splits,
                                    embedding=OpenAIEmbeddings())  # embed the chunks with the OpenAI embedding model
retriever = vectorstore.as_retriever()  # create a retriever from the vector store
```

The main steps in the code above:

1. Load the blog: WebBaseLoader loads the blog content from the given URL, parsing only specific HTML classes to extract the main parts of the article.
2. Split the document: RecursiveCharacterTextSplitter splits the document into smaller chunks so it can be processed and indexed more efficiently.
3. Create the vector store: OpenAIEmbeddings converts the chunks into vectors, and Chroma stores them. This step is the key to efficient retrieval.
4. Create the retriever: a retriever is created from the vector store, used later to fetch relevant chunks for a query.

This prepares everything the retrieval-augmented generation (RAG) system needs to retrieve information from the pre-indexed blog content in response to a user query.

Prompt

```python
from langchain.prompts import ChatPromptTemplate

# Multi Query: Different Perspectives
template = """You are an AI language model assistant. Your task is to generate five
different versions of the given user question to retrieve relevant documents from a vector
database. By generating multiple perspectives on the user question, your goal is to help
the user overcome some of the limitations of the distance-based similarity search.
Provide these alternative questions separated by newlines. Original question: {question}"""
prompt_perspectives = ChatPromptTemplate.from_template(template)

from langchain_core.output_parsers import StrOutputParser
from langchain_openai import ChatOpenAI

generate_queries = (
    prompt_perspectives
    | ChatOpenAI(temperature=0)
    | StrOutputParser()
    | (lambda x: x.split("\n"))
)
```

```python
from langchain.load import dumps, loads

def get_unique_union(documents: list[list]):
    """ Unique union of retrieved docs """
    # Flatten list of lists, and convert each Document to string
    flattened_docs = [dumps(doc) for sublist in documents for doc in sublist]
    # Get unique documents
    unique_docs = list(set(flattened_docs))
    # Return
    return [loads(doc) for doc in unique_docs]
```
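Before wiring up the full retrieval chain below, it may help to see what `get_unique_union` does with overlapping results. A minimal sketch with hypothetical in-memory documents (not taken from the blog index):

```python
from langchain_core.documents import Document

doc_a = Document(page_content="Task decomposition splits a goal into smaller steps.")
doc_b = Document(page_content="LLM agents use planning modules.")

# Two per-query result lists that overlap on doc_a
results = [[doc_a, doc_b], [doc_a]]

unique = get_unique_union(results)
len(unique)  # 2 -- the duplicate serialization of doc_a is dropped
```

Deduplicating by the serialized string works because identical Documents serialize to identical JSON; note that documents differing only in metadata would be treated as distinct.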
```python
# Retrieve
question = "What is task decomposition for LLM agents?"
retrieval_chain = generate_queries | retriever.map() | get_unique_union
docs = retrieval_chain.invoke({"question": question})
len(docs)
```

```python
from operator import itemgetter
from langchain_openai import ChatOpenAI
from langchain_core.runnables import RunnablePassthrough

# RAG
template = """Answer the following question based on this context:

{context}

Question: {question}
"""

prompt = ChatPromptTemplate.from_template(template)

llm = ChatOpenAI(temperature=0)

final_rag_chain = (
    {"context": retrieval_chain, "question": itemgetter("question")}
    | prompt
    | llm
    | StrOutputParser()
)

final_rag_chain.invoke({"question": question})
```

This code shows how to build a retrieval-augmented generation (RAG) system with the LangChain framework. The system focuses on generating query variants from multiple perspectives to improve retrieval quality and answer accuracy. In detail:

1. Multi-perspective question generation: a ChatPromptTemplate defines a prompt asking the LLM assistant to generate five different versions of the given user question for retrieving relevant documents from the vector database. The goal is to overcome some limitations of distance-based similarity search by offering multiple views of the user's question.
2. The query-generation chain: the prompt template is piped into ChatOpenAI (temperature 0 for more consistent answers) and StrOutputParser, then through a function that splits the generated string on newlines to produce the list of queries.
3. Unique union of retrieved documents: the get_unique_union function flattens the retrieved lists of documents (a list of lists), deduplicates them, and converts them back into Document objects.
4. Retrieval stage: the retrieval_chain maps the generated queries over the retriever to fetch relevant documents, then passes them through get_unique_union to obtain a unique set.
5. Retrieval-augmented generation (RAG): a second ChatPromptTemplate combines the retrieved context with the original question to generate an answer grounded in that context. The final chain, final_rag_chain, connects the retrieval chain's context and the original question to the large language model to produce the final answer.

This workflow illustrates how LangChain is used to build more complex NLP systems: a large language model generates multiple perspectives on a query, retrieval is performed efficiently, and an accurate answer is generated from the retrieved information. The approach is especially suited to scenarios that require finding the most relevant content in a large body of information, such as question answering, document retrieval, and recommendation systems.

## Part 6: RAG-Fusion

Docs: https://github.com/langchain-ai/langchain/blob/master/cookbook/rag_fusion.ipynb?ref=blog.langchain.dev

Blog / repo: https://towardsdatascience.com/forget-rag-the-future-is-rag-fusion-1147298d8ad1

```python
from langchain.prompts import ChatPromptTemplate

# RAG-Fusion: Related
template = """You are a helpful assistant that generates multiple search queries based on a single input query. \n
Generate multiple search queries related to: {question} \n
Output (4 queries):"""
prompt_rag_fusion = ChatPromptTemplate.from_template(template)

from langchain_core.output_parsers import StrOutputParser
from langchain_openai import ChatOpenAI

generate_queries = (
    prompt_rag_fusion
    | ChatOpenAI(temperature=0)
    | StrOutputParser()
    | (lambda x: x.split("\n"))
)
```

```python
def reciprocal_rank_fusion(results: list[list], k=60):
    """ Reciprocal_rank_fusion that takes multiple lists of ranked documents
        and an optional parameter k used in the RRF formula """

    # Initialize a dictionary to hold fused scores for each unique document
    fused_scores = {}

    # Iterate through each list of ranked documents
    for docs in results:
        # Iterate through each document in the list, with its rank (position in the list)
        for rank, doc in enumerate(docs):
            # Convert the document to a string format to use as a key
            # (assumes documents can be serialized to JSON)
            doc_str = dumps(doc)
            # If the document is not yet in the fused_scores dictionary,
            # add it with an initial score of 0
            if doc_str not in fused_scores:
                fused_scores[doc_str] = 0
            # Update the score of the document using the RRF formula: 1 / (rank + k)
            fused_scores[doc_str] += 1 / (rank + k)

    # Sort the documents by fused score in descending order to get the final reranked results
    reranked_results = [
        (loads(doc), score)
        for doc, score in sorted(fused_scores.items(), key=lambda x: x[1], reverse=True)
    ]

    # Return the reranked results as a list of tuples,
    # each containing the document and its fused score
    return reranked_results

retrieval_chain_rag_fusion = generate_queries | retriever.map() | reciprocal_rank_fusion
docs = retrieval_chain_rag_fusion.invoke({"question": question})
len(docs)
```

```python
from langchain_core.runnables import RunnablePassthrough

# RAG
template = """Answer the following question based on this context:

{context}

Question: {question}
"""

prompt = ChatPromptTemplate.from_template(template)

final_rag_chain = (
    {"context": retrieval_chain_rag_fusion, "question": itemgetter("question")}
    | prompt
    | llm
    | StrOutputParser()
)

final_rag_chain.invoke({"question": question})
```

Trace: https://smith.langchain.com/public/071202c9-9f4d-41b1-bf9d-86b7c5a7525b/r
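The fused score computed in `reciprocal_rank_fusion` above rewards documents that appear in several per-query result lists. A quick worked example with hypothetical ranks (not actual retrieval output):

```python
# Suppose a document appears at rank 0 in one query's results and rank 2 in another's.
# With the default k = 60, its fused score is:
score = 1 / (0 + 60) + 1 / (2 + 60)   # ≈ 0.01667 + 0.01613 = 0.03280

# A document at rank 0 in only one list scores just 1/60 ≈ 0.01667,
# so documents retrieved by several query variants float to the top of the reranking.
```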
## Part 7: Decomposition

Paper: https://arxiv.org/pdf/2205.10625.pdf

We will go through this in segments, with detailed explanation and comments.

Segment 1: question decomposition

```python
from langchain.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI
from langchain_core.output_parsers import StrOutputParser

# Template for decomposing a question into sub-questions
template = """You are a helpful assistant that generates multiple sub-questions related to an input question. \n
The goal is to break down the input into a set of sub-problems / sub-questions that can be answered in isolation. \n
Generate multiple search queries related to: {question} \n
Output (4 queries):"""
prompt_decomposition = ChatPromptTemplate.from_template(template)
```

This defines a template whose purpose is to decompose an input question into multiple sub-questions or sub-queries. This helps the retrieval and answering that follow, because it lets the system explore the question from several angles.

```python
# Initialize the large language model
llm = ChatOpenAI(temperature=0)
```

Initialize a ChatOpenAI object with temperature 0 to produce more consistent, less random answers.

```python
# Chain that generates the sub-questions
generate_queries_decomposition = (
    prompt_decomposition | llm | StrOutputParser() | (lambda x: x.split("\n"))
)
```

This chain combines the decomposition template, the ChatOpenAI object, and the output parser, then uses a lambda to split the generated text on newlines into a list of sub-questions.

```python
# Run the decomposition
question = "What are the main components of LLM-powered autonomous agent system?"
decomposed_questions = generate_queries_decomposition.invoke({"question": question})
```

This decomposes the given question into a set of sub-questions. It is where the chain is actually applied; `question` is the original question to decompose.

Segment 2: retrieval and answer generation for the sub-questions (example code was not provided; what follows is a hypothetical illustration)

In this step, we assume there is a process that handles each sub-question: retrieving relevant documents and generating an answer. It would use the `retriever` object introduced earlier and one or more RAG chains.

```python
# Hypothetical code: retrieve and answer each sub-question
def retrieve_and_rag(question, prompt_rag, sub_question_generator_chain):
    """Retrieve documents and run RAG for each sub-question"""
    # Generate the sub-questions with the sub-question generator chain
    sub_questions = sub_question_generator_chain.invoke({"question": question})
    rag_results = []
    for sub_question in sub_questions:
        # Retrieve documents for each sub-question
        retrieved_docs = retriever.get_relevant_documents(sub_question)
        # Run RAG over the retrieved documents and the sub-question to generate an answer
        answer = (prompt_rag | llm | StrOutputParser()).invoke(
            {"context": retrieved_docs, "question": sub_question}
        )
        rag_results.append(answer)
    return rag_results, sub_questions
```

This hypothetical function, retrieve_and_rag, retrieves documents and generates an answer for each sub-question produced by the decomposition. First it uses the sub-question generator chain to produce the list of sub-questions. Then, for each sub-question, it uses a retriever to find relevant documents and applies RAG to those documents and the sub-question to generate an answer.

Segment 3: synthesizing the final answer

The next step uses the sub-question answers to synthesize a final answer to the original question.

```python
# Format the Q&A pairs used for answer synthesis
def format_qa_pairs(questions, answers):
    """Format question and answer pairs"""
    formatted_string = ""
    for i, (question, answer) in enumerate(zip(questions, answers), start=1):
        formatted_string += f"Question {i}: {question}\nAnswer {i}: {answer}\n\n"
    return formatted_string.strip()

# Hypothetical RAG prompt for answering each sub-question (not defined in the original notes)
prompt_rag = ChatPromptTemplate.from_template(
    "Answer the following question based on this context:\n\n{context}\n\nQuestion: {question}"
)

# Synthesize an answer from the Q&A pairs
answers, questions = retrieve_and_rag(question, prompt_rag, generate_queries_decomposition)
context = format_qa_pairs(questions, answers)
```

This formats all the sub-question answers into a single block of text, so that this information can be used to synthesize an answer to the original question.

```python
# Template for synthesizing the final answer
template = """Here is a set of Q+A pairs:

{context}

Use these to synthesize an answer to the question: {question}
"""

prompt = ChatPromptTemplate.from_template(template)

# Final RAG chain for answer synthesis
final_rag_chain = (
    prompt
    | llm
    | StrOutputParser()
)

# Run the chain to get the final answer
final_answer = final_rag_chain.invoke({"context": context, "question": question})
```

This code defines a new prompt template that takes the formatted Q&A pairs as context and synthesizes a final answer to the original question, then runs the final RAG chain to obtain and return that answer.

The flow demonstrates the whole pipeline from question decomposition, to retrieving relevant information, to synthesizing the final answer. It leverages the capabilities of large language models for complex question-answering tasks, improving the accuracy and completeness of the answer by decomposing the question and answering the sub-questions step by step.

## Part 8: Step Back

Paper: https://arxiv.org/pdf/2310.06117.pdf
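The notes above stop at the paper link for step-back prompting. As a minimal sketch of the idea (assuming the same `llm`, `retriever`, and `question` as before; the prompt wording is illustrative, not from the original notes), the technique first asks the model for a more generic "step-back" question, retrieves with both the original and the step-back question, and answers from the combined context:

```python
from langchain.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

# Hypothetical step-back prompt: rewrite the question into a more generic one
step_back_prompt = ChatPromptTemplate.from_template(
    """You are an expert at world knowledge. Rewrite the following question
into a more generic step-back question that is easier to answer.

Question: {question}
Step-back question:"""
)

generate_step_back = step_back_prompt | llm | StrOutputParser()
step_back_q = generate_step_back.invoke({"question": question})

# Answer using context retrieved for both the original and the step-back question
response_prompt = ChatPromptTemplate.from_template(
    """Answer the question using the context below.

{normal_context}
{step_back_context}

Question: {question}
"""
)

chain = response_prompt | llm | StrOutputParser()
answer = chain.invoke({
    "normal_context": retriever.get_relevant_documents(question),
    "step_back_context": retriever.get_relevant_documents(step_back_q),
    "question": question,
})
```

The broader step-back question tends to retrieve background passages that the narrowly phrased original would miss, which is the effect the paper reports.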