
GPT Researcher application build - Generative AI from No-Code to Low-Code: A Four-Part Series on Development and Applications - Cupoy


🔎 GPT Researcher

GPT Researcher is an autonomous agent designed to handle all kinds of online research tasks. The agent can generate detailed, formal, and objective research reports, with customization options for focusing on relevant sources, structural outlines, and lessons learned. Inspired by the recently published Plan-and-Solve and RAG papers, GPT Researcher addresses problems of speed, determinism, and reliability: by running agents in parallel rather than synchronously, it delivers more stable performance at higher speed.

[2305.04091] Plan-and-Solve Prompting: Improving Zero-Shot Chain-of-Thought Reasoning by Large Language Models
[2005.11401] Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks

Harness the power of AI to deliver accurate, objective, and factual information to individuals and organizations.

Why GPT Researcher?
Forming objective conclusions for manual research tasks takes time and experience, sometimes weeks, to find the right resources and information. Current large language models (LLMs) are trained on historical, outdated information and carry a serious risk of hallucination, which makes them nearly unfit for research tasks. Web-search solutions (such as ChatGPT + Web plugin) consider only a limited set of sources and content, which in some cases leads to superficial conclusions or non-objective answers. Relying on only a subset of sources can introduce bias when determining the correct conclusion for a research question or task.

Architecture

The core idea is to run "planner" and "executor" agents: the planner generates questions to research, while the executor agents find the most relevant information for each generated research question. Finally, the planner filters and aggregates all the relevant information and creates a research report.

The agents use both gpt-3.5-turbo and gpt-4-turbo (128K context) to complete a research task, using each only where necessary to optimize for cost. A research task takes about 3 minutes on average and costs roughly $0.10.

In detail:

- Create a domain-specific agent based on the research query or task.
- Generate a set of research questions that together form an objective opinion on the given task.
- For each research question, trigger a crawler agent that searches online sources for information relevant to the task.
- For each scraped source, summarize the relevant information and keep track of its origin.
- Finally, filter and aggregate all the summarized sources and generate the final research report.

Demo

https://github.com/assafelovic/gpt-researcher/assets/13554167/a00c89a6-a295-4dd0-b58d-098a31c40fda

Tutorials

- How it works: "How we built GPT Researcher" (docs.tavily.com)
- How to install (Loom video, www.loom.com)
- Live demo (Loom video, www.loom.com)

Features

📝 Generate research questions, outlines, resources, and lesson reports
🌐 Aggregate over 20 web sources per research task to form objective and factual conclusions
🖥️ Includes an easy-to-use web interface (HTML/CSS/JS)
🔍 Supports scraping of JavaScript-rendered web sources
📂 Keeps track of visited and used web sources
📄 Export research reports to PDF or other formats...
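The planner/executor flow described above can be sketched roughly as follows. This is a minimal illustration, not the actual GPT Researcher code: every function name and body here is a stub standing in for LLM calls and web crawling, and only the control flow (plan once, execute each question in parallel, then aggregate into a report) mirrors the description.

```python
from concurrent.futures import ThreadPoolExecutor

# Minimal sketch of the planner/executor pattern described above.
# All names and bodies are illustrative stubs, not the real
# GPT Researcher API: a real run would call an LLM for planning and
# summarizing, and a search engine plus crawler for execution.

def plan_questions(task: str) -> list[str]:
    """Planner: split the task into research questions (stub)."""
    return [f"{task}: background",
            f"{task}: state of the art",
            f"{task}: open problems"]

def execute_question(question: str) -> dict:
    """Executor: crawl sources for one question and summarize them (stub)."""
    return {"question": question,
            "summary": f"summary of findings for '{question}'",
            "sources": ["https://example.com/source-1"]}

def write_report(task: str, findings: list[dict]) -> str:
    """Planner: filter and aggregate all findings into the final report."""
    lines = [f"Research report: {task}"]
    for f in findings:
        srcs = ", ".join(f["sources"])
        lines.append(f"- {f['question']}: {f['summary']} (sources: {srcs})")
    return "\n".join(lines)

def research(task: str) -> str:
    questions = plan_questions(task)
    # Executor agents run in parallel, one per question, rather than
    # synchronously -- the property the text credits for speed and
    # stability. pool.map preserves question order in the results.
    with ThreadPoolExecutor() as pool:
        findings = list(pool.map(execute_question, questions))
    return write_report(task, findings)

print(research("autonomous research agents"))
```

The real system adds the cost optimization noted above by routing cheap steps (summarization) to gpt-3.5-turbo and reserving gpt-4-turbo for planning and report writing.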
📖 Documentation

See the following site for the full documentation: Getting Started (docs.tavily.com)

- Getting started (installation, environment setup, simple examples)
- How-to examples (demos, integrations, Docker support)
- Reference (full API documentation)
- Tavily API integration (advanced explanations of core concepts)

Quick start

Step 0 - Install Python 3.11 or later. (See: Install Python on Windows, Mac, and Linux, www.tutorialsteacher.com)

I built my own Docker image as follows. Before running the default Dockerfile, remember to install the "aiofiles" library first, or the container will not run (and rebuilding the image from scratch takes a long time):

RUN pip install aiofiles

Running through Docker is recommended; running directly under Windows produces many bugs.

docker build -t gpt-researcher-image .

Make sure the image runs with port 8000 mapped:

docker run -d -p 8000:8000 gpt-researcher-image

Then open the web UI on port 8000: http://localhost:8000

Finally, push the working image to your own Docker Hub repository. Pushed successfully! The app can compile the collected material into a PDF file for download!
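As a sketch, the aiofiles fix above slots into a Dockerfile like this. This is illustrative only, not the repository's actual Dockerfile: the base image, file layout, and start command are assumptions. The point is simply to put the extra RUN line in its own layer after the bulk dependency install, so Docker's layer cache keeps rebuilds fast.

```dockerfile
# Illustrative Dockerfile -- base image, paths, and the start command
# are assumptions, not copied from the repository.
FROM python:3.11-slim
WORKDIR /app

# Install the project's own dependencies first; this layer is cached.
COPY requirements.txt .
RUN pip install -r requirements.txt

# The missing library noted above. Keeping it in its own layer means
# editing this line does not invalidate the large install layer above.
RUN pip install aiofiles

COPY . .
EXPOSE 8000
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
```

With the image built, pushing to Docker Hub uses the standard commands: `docker tag gpt-researcher-image <your-dockerhub-user>/gpt-researcher-image` followed by `docker push <your-dockerhub-user>/gpt-researcher-image`.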