Design: Cross-Document Comparison (Agentic Workflow)
1. Background & Problem
Users often need to compare multiple documents (e.g., "Compare the financial reports of Q1 and Q2" or "Differences between Product A and Product B specs"). Standard RAG retrieves chunks based on semantic similarity to the query. "Multi-Query" retrieval helps, but standard RAG may still:
- Retrieve too many chunks from one document and miss the other.
- Fail to align comparable attributes (e.g., comparing "revenue" in Doc A with "profit" in Doc B).
- Produce a generic text answer instead of a structured comparison.
2. Solution: Agentic Comparison Workflow
We will implement a specialized workflow (or "Light Agent") that:
- Analyzes the Request: Identifies the subjects to compare (e.g., "Q1 Report", "Q2 Report") and the dimensions (e.g., "Revenue", "Risks").
- Targeted Retrieval:
- Explicitly filters/searches for Doc A.
- Explicitly filters/searches for Doc B.
- Structured Synthesis: Generates the answer, potentially forcing a Markdown Table format for clarity.
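To make the targeted-retrieval step concrete, the sketch below contrasts a global top-k selection with a per-document quota. The ScoredChunk shape and both helper names are illustrative assumptions, not existing code in the project.

```typescript
// Hypothetical chunk shape; the real retrieval result type may differ.
interface ScoredChunk {
  fileId: string;
  text: string;
  score: number;
}

// Global top-k: roughly what a standard RAG retriever does.
function topKGlobal(chunks: ScoredChunk[], k: number): ScoredChunk[] {
  return chunks.slice().sort((a, b) => b.score - a.score).slice(0, k);
}

// Per-document top-k: every compared file gets its own quota,
// so a high-scoring document cannot crowd out the other one.
function topKPerFile(
  chunks: ScoredChunk[],
  fileIds: string[],
  kPerFile: number
): ScoredChunk[] {
  const result: ScoredChunk[] = [];
  for (const fileId of fileIds) {
    const own = chunks.filter((c) => c.fileId === fileId);
    for (const c of topKGlobal(own, kPerFile)) {
      result.push(c);
    }
  }
  return result;
}

// Made-up scores: three strong Q1 chunks and one weaker Q2 chunk.
const sample: ScoredChunk[] = [
  { fileId: "q1", text: "Q1 revenue ...", score: 0.91 },
  { fileId: "q1", text: "Q1 risks ...", score: 0.88 },
  { fileId: "q1", text: "Q1 outlook ...", score: 0.85 },
  { fileId: "q2", text: "Q2 revenue ...", score: 0.6 },
];

const globalPick = topKGlobal(sample, 2); // both chunks come from "q1"
const balancedPick = topKPerFile(sample, ["q1", "q2"], 1); // one chunk per file
```

With the sample data, the global selection contains only Q1 chunks, while the per-file selection keeps one chunk from each document, which is the property the comparison workflow relies on.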
3. Technical Architecture
3.1 Backend (ComparisonService or extension to RagService)
- Intent Detection: Modify ChatService or RagService to detect comparison intent (can use an LLM or simple heuristics + keywords).
- Planning: If comparison is detected:
- Identify Target Files: Resolve file names/IDs from the query (e.g., "Q1" -> matches file "2024_Q1_Report.pdf").
- Dimension Extraction: What to compare? (e.g., "summary", "key metrics").
- Execution:
- Run Search on File A with query "key metrics".
- Run Search on File B with query "key metrics".
- Combine context.
- Prompting: Use a prompt optimized for comparison (e.g., "Generate a comparison table...").
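A minimal sketch of the "simple heuristics + keywords" option for intent detection follows. The regular expressions, the ComparisonIntent shape, and the "key metrics" default are assumptions chosen for illustration; an LLM-based detector would replace the pattern matching but could keep the same return shape.

```typescript
interface ComparisonIntent {
  isComparison: boolean;
  subjects: string[]; // approximate file names, not yet resolved to IDs
  dimensions: string; // what to compare
}

// Assumed default when the query names no dimension.
const DEFAULT_DIMENSIONS = "key metrics";

// Tried in order; each captures subject A and subject B.
const COMPARISON_PATTERNS: RegExp[] = [
  /\bcompare\s+(.+?)\s+(?:and|with|to|vs\.?|versus)\s+(.+)/i,
  /\bdifferences?\s+between\s+(.+?)\s+and\s+(.+)/i,
  /^(.+?)\s+(?:vs\.?|versus)\s+(.+)/i,
];

// Optional trailing clause such as "in terms of pricing".
const DIMENSION_PATTERN =
  /\s+(?:in terms of|regarding|with respect to)\s+(.+)$/i;

// Drop a leading article and trailing punctuation.
function cleanSubject(raw: string): string {
  return raw
    .replace(/^(?:the|a|an)\s+/i, "")
    .replace(/[?.!,\s]+$/, "")
    .trim();
}

function detectComparisonIntent(query: string): ComparisonIntent {
  let text = query.trim();
  let dimensions = DEFAULT_DIMENSIONS;

  // Peel off the dimension clause first so it does not leak into subject B.
  const dimMatch = DIMENSION_PATTERN.exec(text);
  if (dimMatch) {
    dimensions = cleanSubject(dimMatch[1]);
    text = text.slice(0, dimMatch.index);
  }

  for (const pattern of COMPARISON_PATTERNS) {
    const m = pattern.exec(text);
    if (m) {
      return {
        isComparison: true,
        subjects: [cleanSubject(m[1]), cleanSubject(m[2])],
        dimensions,
      };
    }
  }
  return { isComparison: false, subjects: [], dimensions: "" };
}
```

The heuristic is deliberately naive. For "Compare the financial reports of Q1 and Q2" it splits the subjects into "financial reports of Q1" and "Q2", which is one reason the design leaves room for an LLM call at this step.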
3.2 Frontend (ChatInterface)
- UI Trigger: (Optional) specific "Compare" button, or just natural language.
- Visuals: Render the response as standard Markdown (which supports tables).
- Source Attribution: Ensure citations map back to the correct respective documents.
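One possible way to keep citations tied to the right document is to give every retrieved chunk a marker that encodes its source (a letter per document, a number per chunk) and to resolve those markers when rendering. This is only a sketch: the marker format and the SourceChunk and Citation shapes are assumptions, and the design does not prescribe a citation scheme.

```typescript
// Hypothetical shapes for illustration.
interface SourceChunk {
  fileId: string;
  fileName: string;
  text: string;
}

interface Citation {
  marker: string; // e.g. "A1" = first chunk of the first document
  fileId: string;
  fileName: string;
}

// Assign markers A1, A2, ... to the first document, B1, ... to the second.
function buildCitationIndex(chunksByFile: SourceChunk[][]): Citation[] {
  const citations: Citation[] = [];
  chunksByFile.forEach((chunks, fileIdx) => {
    const letter = String.fromCharCode(65 + fileIdx); // 65 is "A"
    chunks.forEach((chunk, chunkIdx) => {
      citations.push({
        marker: letter + (chunkIdx + 1),
        fileId: chunk.fileId,
        fileName: chunk.fileName,
      });
    });
  });
  return citations;
}

// Find markers such as [A1] in the answer and map them back to files.
// Unknown markers are ignored; repeated markers are reported once.
function resolveCitations(answer: string, index: Citation[]): Citation[] {
  const found: Citation[] = [];
  const pattern = /\[([A-Z]\d+)\]/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(answer)) !== null) {
    const marker = match[1];
    const hit = index.filter((c) => c.marker === marker)[0];
    if (hit && found.indexOf(hit) === -1) {
      found.push(hit);
    }
  }
  return found;
}
```

The frontend could then render each resolved citation as a link to its file, so a cell in the comparison table always points at the document it was taken from.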
4. Implementation Steps
- Intent & Entity Extraction (Simple Version):
- In RagService, add a step detectComparisonIntent(query).
- Return subjects: string[] (approximate filenames) and dimensions: string.
- Targeted Search:
- Use elasticsearchService to search specifically within the resolved file IDs (if we can map names to IDs).
- Fall back to broad search if file mapping fails.
- Comparison Prompt:
- Update rag.service.ts to use a comparisonPrompt if intent is detected.
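The comparison prompt step could look roughly like the sketch below. The function name buildComparisonPrompt, the FileContext shape, and the exact wording of the instructions are assumptions; the only requirement taken from the design is that the prompt pushes the model toward a Markdown table with one column per document.

```typescript
// Hypothetical container for the passages retrieved for one document.
interface FileContext {
  label: string; // column header, e.g. "Q1 Report"
  chunks: string[];
}

function buildComparisonPrompt(
  question: string,
  dimensions: string,
  files: FileContext[]
): string {
  // Header row the model is asked to reproduce.
  const header =
    "| Dimension | " + files.map((f) => f.label).join(" | ") + " |";

  // One context section per document, so passages cannot be mixed up.
  const contextBlocks = files.map((f) => {
    const body =
      f.chunks.length > 0
        ? f.chunks.map((c, i) => `(${i + 1}) ${c}`).join("\n")
        : "(no passages retrieved)";
    return `### ${f.label}\n${body}`;
  });

  return [
    "You are comparing documents. Use only the context below.",
    `Focus on: ${dimensions}.`,
    "Answer with a Markdown table whose header row is:",
    header,
    'If a value is not in the context, write "not found" instead of guessing.',
    "",
    contextBlocks.join("\n\n"),
    "",
    `Question: ${question}`,
  ].join("\n");
}
```

Keeping each document's passages under its own heading is a design choice aimed at the attribute-alignment problem from Section 1: the model sees which passage belongs to which column before it fills in the table.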
5. Risks & Limitations
- File Name Matching: Mapping a user's informal reference such as "Q1" to "2024_Q1_Report_Final.pdf" is hard without fuzzy matching or LLM resolution.
- Mitigation: Use a lightweight LLM call or fuzzy search on the file list to resolve IDs.
- Latency: Two searches + entity resolution might add latency.
- Mitigation: Run searches in parallel.
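Both mitigations can be sketched together. The example below uses token overlap as a simple stand-in for fuzzy matching and runs one search per subject concurrently with Promise.all. KnowledgeFile, SearchFn, the 0.5 threshold, and the convention that a null file ID means "broad search" are all assumptions made for this sketch; the real search call would go through elasticsearchService.

```typescript
interface KnowledgeFile {
  id: string;
  name: string;
}

// Lowercase and split on anything that is not a letter or digit.
function tokenize(text: string): string[] {
  return text
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((t) => t.length > 0);
}

// Fraction of the subject's tokens that appear in the file name (0 to 1).
function matchScore(subject: string, fileName: string): number {
  const subjectTokens = tokenize(subject);
  if (subjectTokens.length === 0) return 0;
  const nameTokens = tokenize(fileName);
  const hits = subjectTokens.filter((t) => nameTokens.indexOf(t) !== -1);
  return hits.length / subjectTokens.length;
}

// Best-scoring file for a subject, or null if nothing reaches the threshold.
function resolveFile(
  subject: string,
  files: KnowledgeFile[],
  threshold = 0.5
): KnowledgeFile | null {
  let best: KnowledgeFile | null = null;
  let bestScore = 0;
  for (const file of files) {
    const score = matchScore(subject, file.name);
    if (score > bestScore) {
      best = file;
      bestScore = score;
    }
  }
  return bestScore >= threshold ? best : null;
}

// Injected search function; fileId === null means "search everything".
type SearchFn = (query: string, fileId: string | null) => Promise<string[]>;

// Start all per-subject searches at once and wait for every result.
function searchAllSubjects(
  subjects: string[],
  dimensions: string,
  files: KnowledgeFile[],
  search: SearchFn
): Promise<string[][]> {
  return Promise.all(
    subjects.map((subject) => {
      const file = resolveFile(subject, files);
      return file
        ? search(dimensions, file.id)
        : search(`${subject} ${dimensions}`, null); // broad fallback
    })
  );
}
```

Because the searches are started before any of them is awaited, total retrieval time is close to that of the slowest single search rather than the sum of both. Results come back in the same order as the subjects.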
6. MVP Scope
- Automated detection of "Compare A and B".
- Attempt to identify if A and B refer to specific files in the selected knowledge base.
- If identified, restrict search scopes accordingly (or boost them).
- Generate a table response.
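The difference between restricting and boosting the search scope can be expressed in the Elasticsearch query DSL as a bool query with either a filter clause or a should clause. The sketch below only builds the request body; the field names content and fileId, the boost value, and the helper name buildScopedQuery are assumptions, since the index mapping is not described here.

```typescript
type QueryClause = Record<string, unknown>;

interface SearchBody {
  size: number;
  query: {
    bool: {
      must: QueryClause[];
      filter?: QueryClause[];
      should?: QueryClause[];
    };
  };
}

type ScopeMode = "restrict" | "boost";

function buildScopedQuery(
  text: string,
  fileIds: string[],
  mode: ScopeMode,
  size = 5
): SearchBody {
  const body: SearchBody = {
    size,
    query: { bool: { must: [{ match: { content: text } }] } },
  };

  // No resolved files: leave the query unscoped (broad search fallback).
  if (fileIds.length === 0) return body;

  if (mode === "restrict") {
    // filter: only chunks from these files can match; scoring is unaffected.
    body.query.bool.filter = [{ terms: { fileId: fileIds } }];
  } else {
    // should: chunks from any file can match, but these files score higher.
    body.query.bool.should = [{ terms: { fileId: fileIds, boost: 2.0 } }];
  }
  return body;
}
```

Restricting gives the strongest guarantee that each column of the table is filled from the intended document; boosting is more forgiving when file resolution is uncertain, at the cost of possibly mixing in chunks from other files.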