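Developer c57c3028e2 fix: shuffleArray bug + Playwright multi-round conversation test + beginner assessment scripts
- Fix shuffleArray returning a new array while call sites declared with const never captured the return value (3 places)
- Add test-multiround.mjs, a Playwright multi-round conversation test (full short-answer + follow-up flow)
- Add do-assessment.mjs / check-result.mjs assessment walkthrough scripts
- Add AI workflow instruction rules to CLAUDE.md
- Add playwright dependency to package.json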

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-08 22:34:04 +08:00

CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Project Overview

Simple Knowledge Base is a full-stack RAG (Retrieval-Augmented Generation) Q&A system built with React 19 + NestJS. It's a monorepo with Japanese/Chinese documentation but English code.

Key Features:

  • Multi-model support (OpenAI-compatible APIs + Google Gemini native SDK)
  • Dual processing modes: Fast (Tika text-only) and High-precision (Vision pipeline)
  • User isolation with JWT authentication and per-user knowledge bases
  • Hybrid search (vector + keyword) with Elasticsearch
  • Multi-language interface (Japanese, Chinese, English)
  • Streaming responses via Server-Sent Events (SSE)

Development Setup

Prerequisites

  • Node.js 18+
  • Yarn
  • Docker & Docker Compose

Initial Setup

# Install dependencies
yarn install

# Start infrastructure services
docker-compose up -d elasticsearch tika libreoffice

# Configure environment
cp server/.env.sample server/.env
# Edit server/.env with API keys and configuration

Development Commands

# Start both frontend and backend in development mode
yarn dev

# Frontend only (port 13001)
cd web && yarn dev

# Backend only (port 3001)
cd server && yarn start:dev

# Run tests
cd server && yarn test
cd server && yarn test:e2e

# Lint and format
cd server && yarn lint
cd server && yarn format

Docker Services

  • Elasticsearch: 9200 (vector storage)
  • Apache Tika: 9998 (document text extraction)
  • LibreOffice Server: 8100 (document conversion)
  • Backend API: 3001
  • Frontend: 13001 (dev), 80/443 (production via nginx)

Architecture

Project Structure

simple-kb/
├── web/                    # React frontend (Vite)
│   ├── components/         # UI components (ChatInterface, ConfigPanel, etc.)
│   ├── contexts/          # React Context providers
│   ├── services/          # API client services
│   └── utils/             # Utility functions
├── server/                # NestJS backend
│   ├── src/
│   │   ├── ai/            # AI services (embedding, etc.)
│   │   ├── api/           # API module
│   │   ├── auth/          # JWT authentication
│   │   ├── chat/          # Chat/RAG module
│   │   ├── elasticsearch/ # Elasticsearch integration
│   │   ├── import-task/   # Import task management
│   │   ├── knowledge-base/# Knowledge base management
│   │   ├── libreoffice/   # LibreOffice integration
│   │   ├── model-config/  # Model configuration management
│   │   ├── vision/        # Vision model integration
│   │   └── vision-pipeline/# Vision pipeline orchestration
│   ├── data/              # SQLite database storage
│   ├── uploads/           # Uploaded files storage
│   └── temp/              # Temporary files
├── docs/                  # Comprehensive documentation (Japanese/Chinese)
├── nginx/                 # Nginx configuration
├── libreoffice-server/    # LibreOffice conversion service (Python/FastAPI)
└── docker-compose.yml     # Docker orchestration

Key Architectural Concepts

Dual Processing Modes:

  1. Fast Mode: Apache Tika for text-only extraction (quick, no API cost)
  2. High-Precision Mode: Vision Pipeline (LibreOffice → PDF → Images → Vision Model) for mixed image/text documents (slower, incurs API costs)

Multi-Model Support:

  • OpenAI-compatible APIs (OpenAI, DeepSeek, Claude, etc.)
  • Google Gemini native SDK
  • Configurable LLM, Embedding, and Rerank models

RAG System:

  • Hybrid search (vector + keyword) with Elasticsearch
  • Streaming responses via Server-Sent Events (SSE)
  • Source citation and similarity scoring
  • Chunk configuration (size, overlap)
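The chunk configuration (size, overlap) above can be illustrated with a minimal sketch. The helper name `chunkText` and the character-based splitting are assumptions made for illustration; the project's actual chunker may split on tokens or sentences instead.

```typescript
// Hypothetical helper, not taken from the codebase: splits text into
// fixed-size character chunks where consecutive chunks share `overlap` characters.
function chunkText(text: string, size: number, overlap: number): string[] {
  if (size <= 0) {
    throw new Error("size must be positive");
  }
  if (overlap < 0 || overlap >= size) {
    throw new Error("overlap must be in the range [0, size)");
  }

  const chunks: string[] = [];
  const step = size - overlap; // how far the window advances each iteration

  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + size));
    // Stop once the current window has reached the end of the text,
    // so no trailing chunk made only of overlap is emitted.
    if (start + size >= text.length) {
      break;
    }
  }
  return chunks;
}

// Usage: a size of 4 with an overlap of 1 advances the window by 3 characters.
const exampleChunks = chunkText("abcdefghij", 4, 1);
console.log(exampleChunks); // ["abcd", "defg", "ghij"]
```

A larger overlap keeps more shared context between neighboring chunks at the cost of indexing more text, which is typically the trade-off to tune here.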

Code Standards

Language Requirements

  • Code comments must be in English
  • Log messages must be in English
  • Error messages must support internationalization so the frontend can display them in multiple languages
  • API response messages must support internationalization so the frontend can display them in multiple languages
  • Interface supports Japanese, Chinese, and English
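One way to meet the internationalization requirement above is to have the backend work with stable message codes that are resolved to localized text. The sketch below is only an illustration: the `MESSAGES` catalog, the `FILE_NOT_FOUND` code, and the `translate` helper are hypothetical names, not the project's actual i18n mechanism.

```typescript
type Locale = "ja" | "zh" | "en";

// Hypothetical message catalog keyed by a stable error code.
const MESSAGES: Record<string, Record<Locale, string>> = {
  FILE_NOT_FOUND: {
    en: "File not found: {name}",
    zh: "未找到文件:{name}",
    ja: "ファイルが見つかりません:{name}",
  },
};

// Resolves a message code to localized text, substituting {placeholders}.
// Falls back to English, then to the raw code, when no translation exists.
function translate(
  code: string,
  locale: Locale,
  params: Record<string, string> = {},
): string {
  const template = MESSAGES[code]?.[locale] ?? MESSAGES[code]?.en ?? code;
  return template.replace(
    /\{(\w+)\}/g,
    (match: string, key: string) => params[key] ?? match,
  );
}

// Usage: the same code yields text in each supported interface language.
console.log(translate("FILE_NOT_FOUND", "en", { name: "report.pdf" }));
console.log(translate("FILE_NOT_FOUND", "ja", { name: "report.pdf" }));
```

Keeping codes stable and the text in a catalog means code comments and logs can stay in English while user-facing messages follow the selected interface language.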

Testing

  • Backend uses Jest for unit and e2e tests
  • Frontend currently has no test framework configured
  • Run tests: cd server && yarn test (unit) or cd server && yarn test:e2e (e2e)

Code Quality

  • ESLint and Prettier configured for backend
  • Format code: cd server && yarn format
  • Lint code: cd server && yarn lint

Common Development Tasks

Adding a New API Endpoint

  1. Create controller in appropriate module under server/src/
  2. Add service methods with English comments
  3. Update DTOs and validation
  4. Add tests in *.spec.ts files

Adding a New Frontend Component

  1. Create component in web/components/
  2. Add TypeScript interfaces in web/types.ts
  3. Use Tailwind CSS for styling
  4. Connect to backend services in web/services/

Debugging

  • Existing backend logs may be in Chinese; new log messages must be in English (see Language Requirements)
  • Check Elasticsearch: curl http://localhost:9200/_cat/indices
  • Check Tika: curl http://localhost:9998/tika
  • Check LibreOffice: curl http://localhost:8100/health

Environment Configuration

Key environment variables (server/.env):

  • OPENAI_API_KEY: OpenAI-compatible API key
  • GEMINI_API_KEY: Google Gemini API key
  • ELASTICSEARCH_HOST: Elasticsearch URL (default: http://localhost:9200)
  • TIKA_HOST: Apache Tika URL (default: http://localhost:9998)
  • LIBREOFFICE_URL: LibreOffice server URL (default: http://localhost:8100)
  • JWT_SECRET: JWT signing secret
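As a rough sketch of how the defaults listed above could be applied, the function below reads the service URLs from an environment-like object. The `ServiceConfig` shape and `loadServiceConfig` name are hypothetical, and treating `JWT_SECRET` as mandatory is an assumption; the backend's real configuration loading may differ.

```typescript
// Hypothetical config shape; field names are illustrative only.
interface ServiceConfig {
  elasticsearchHost: string;
  tikaHost: string;
  libreofficeUrl: string;
  jwtSecret: string;
}

type Env = Record<string, string | undefined>;

// Applies the documented defaults for the service URLs.
// Assumption: JWT_SECRET has no safe default, so a missing value is an error.
function loadServiceConfig(env: Env): ServiceConfig {
  const jwtSecret = env.JWT_SECRET;
  if (!jwtSecret) {
    throw new Error("JWT_SECRET is required");
  }
  return {
    elasticsearchHost: env.ELASTICSEARCH_HOST ?? "http://localhost:9200",
    tikaHost: env.TIKA_HOST ?? "http://localhost:9998",
    libreofficeUrl: env.LIBREOFFICE_URL ?? "http://localhost:8100",
    jwtSecret,
  };
}

// Usage: in a Node.js process this would typically be called with process.env.
const exampleConfig = loadServiceConfig({
  JWT_SECRET: "change-me",
  TIKA_HOST: "http://tika:9998",
});
console.log(exampleConfig.elasticsearchHost); // "http://localhost:9200"
console.log(exampleConfig.tikaHost); // "http://tika:9998"
```

Passing the environment in as a parameter keeps the function easy to exercise in tests without touching the real process environment.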

Deployment

Development

docker-compose up -d elasticsearch tika libreoffice
yarn dev

Production

docker-compose up -d  # Builds and starts all services

Ports in Production

  • Frontend: 80/443 (via nginx)
  • Backend API: 3001 (proxied through nginx)
  • Elasticsearch: 9200
  • Tika: 9998
  • LibreOffice: 8100

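AI Workflow Instructions

This project has gstack (54 skills) and Superpowers (14 skills) installed. Coordinate their use according to the following rules:

Automatic Trigger Rules

When the user's intent matches one of the following scenarios, automatically invoke the corresponding gstack skill (using the Skill tool), and tell the user which skill is being started before invoking it:

  • User discusses requirements, ideas, or product direction → invoke the office-hours skill and say "Starting office-hours (product strategy advisor)..."
  • User discusses feature scope or priorities → invoke the plan-ceo-review skill and say "Starting plan-ceo-review (strategic review)..."
  • User discusses technical approach or architecture design → invoke the plan-eng-review skill and say "Starting plan-eng-review (architecture review)..."
  • User asks for a code review → invoke the review skill and say "Starting review (code review)..."
  • User asks for testing/QA → invoke the qa skill and say "Starting qa (automated testing)..."
  • User asks for a security review → invoke the cso skill and say "Starting cso (security audit)..."
  • User asks for a release → invoke the ship skill and say "Starting ship (release workflow)..."
  • User reports a bug that needs debugging → invoke the investigate skill and say "Starting investigate (systematic debugging)..."

Superpowers Keeps Its Automatic Triggers

Superpowers skills (brainstorming, test-driven-development, systematic-debugging, etc.) keep their original automatic trigger mechanism; do not interfere with it.

Notification Mechanism

Every time a gstack skill is invoked automatically, explicitly tell the user which skill is being started and what it does.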

Troubleshooting

Common Issues

  1. Elasticsearch not starting: Check memory limits in docker-compose.yml
  2. File upload failures: Ensure uploads/ and temp/ directories exist with proper permissions
  3. Vision pipeline errors: Verify LibreOffice server is running and accessible
  4. API key errors: Check environment variables in server/.env

Database Management

  • SQLite database: server/data/metadata.db
  • Elasticsearch indices: Managed automatically by the application
  • To reset: Delete server/data/metadata.db and Elasticsearch data volume

Documentation

  • README.md: Project overview in Japanese
  • docs/: Comprehensive documentation (mostly Japanese/Chinese)
  • DESIGN.md: System architecture and design
  • API.md: API reference
  • DEVELOPMENT_STANDARDS.md: Mandates English comments/logs and internationalized messages

When modifying code, always add English comments and logs as required by the development standards. Error and UI messages must be properly internationalized. The project has extensive existing documentation in Japanese/Chinese; refer to the docs/ directory for detailed technical information.