# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

Simple Knowledge Base is a full-stack RAG (Retrieval-Augmented Generation) Q&A system built with React 19 + NestJS. It's a monorepo with Japanese/Chinese documentation but English code.

**Key Features:**
- Multi-model support (OpenAI-compatible APIs + Google Gemini native SDK)
- Dual processing modes: Fast (Tika text-only) and High-precision (Vision pipeline)
- User isolation with JWT authentication and per-user knowledge bases
- Hybrid search (vector + keyword) with Elasticsearch
- Multi-language interface (Japanese, Chinese, English)
- Streaming responses via Server-Sent Events (SSE)

## Development Setup

### Prerequisites
- Node.js 18+
- Yarn
- Docker & Docker Compose

### Initial Setup
```bash
# Install dependencies
yarn install

# Start infrastructure services
docker-compose up -d elasticsearch tika libreoffice

# Configure environment
cp server/.env.sample server/.env
# Edit server/.env with API keys and configuration
```

### Development Commands
```bash
# Start both frontend and backend in development mode
yarn dev

# Frontend only (port 13001)
cd web && yarn dev

# Backend only (port 3001)
cd server && yarn start:dev

# Run tests
cd server && yarn test
cd server && yarn test:e2e

# Lint and format
cd server && yarn lint
cd server && yarn format
```

### Docker Services
- **Elasticsearch**: 9200 (vector storage)
- **Apache Tika**: 9998 (document text extraction)
- **LibreOffice Server**: 8100 (document conversion)
- **Backend API**: 3001
- **Frontend**: 13001 (dev), 80/443 (production via nginx)

## Architecture

### Project Structure
```
simple-kb/
├── web/                      # React frontend (Vite)
│   ├── components/           # UI components (ChatInterface, ConfigPanel, etc.)
│   ├── contexts/             # React Context providers
│   ├── services/             # API client services
│   └── utils/                # Utility functions
├── server/                   # NestJS backend
│   ├── src/
│   │   ├── ai/               # AI services (embedding, etc.)
│   │   ├── api/              # API module
│   │   ├── auth/             # JWT authentication
│   │   ├── chat/             # Chat/RAG module
│   │   ├── elasticsearch/    # Elasticsearch integration
│   │   ├── import-task/      # Import task management
│   │   ├── knowledge-base/   # Knowledge base management
│   │   ├── libreoffice/      # LibreOffice integration
│   │   ├── model-config/     # Model configuration management
│   │   ├── vision/           # Vision model integration
│   │   └── vision-pipeline/  # Vision pipeline orchestration
│   ├── data/                 # SQLite database storage
│   ├── uploads/              # Uploaded files storage
│   └── temp/                 # Temporary files
├── docs/                     # Comprehensive documentation (Japanese/Chinese)
├── nginx/                    # Nginx configuration
├── libreoffice-server/       # LibreOffice conversion service (Python/FastAPI)
└── docker-compose.yml        # Docker orchestration
```

### Key Architectural Concepts

**Dual Processing Modes:**
1. **Fast Mode**: Apache Tika for text-only extraction (quick, no API cost)
2. **High-Precision Mode**: Vision Pipeline (LibreOffice → PDF → Images → Vision Model) for mixed image/text documents (slower, incurs API costs)

**Multi-Model Support:**
- OpenAI-compatible APIs (OpenAI, DeepSeek, Claude, etc.)
- Google Gemini native SDK
- Configurable LLM, Embedding, and Rerank models

**RAG System:**
- Hybrid search (vector + keyword) with Elasticsearch
- Streaming responses via Server-Sent Events (SSE)
- Source citation and similarity scoring
- Chunk configuration (size, overlap)
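
The chunk size and overlap settings can be illustrated with a minimal sketch. It assumes simple character-based chunking; `chunkText` and its parameters are hypothetical names introduced here, and the project's actual splitter may work on tokens or sentences instead.

```typescript
// Hypothetical helper: split text into fixed-size chunks where each chunk
// repeats the last `overlap` characters of the previous one.
function chunkText(text: string, size: number, overlap: number): string[] {
  if (size <= 0) throw new Error("size must be positive");
  if (overlap < 0 || overlap >= size) throw new Error("overlap must be in [0, size)");

  const chunks: string[] = [];
  const step = size - overlap; // how far the window advances each iteration
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + size));
    if (start + size >= text.length) break; // this window already reached the end
  }
  return chunks;
}

// Example: size 4, overlap 2
console.log(chunkText("abcdefgh", 4, 2)); // [ 'abcd', 'cdef', 'efgh' ]
```

A larger overlap keeps more context across chunk boundaries at the cost of more chunks to embed and index.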

## Code Standards

### Language Requirements
- **Code comments must be in English**
- **Log messages must be in English**
- **Error messages must support internationalization** to enable multi-language frontend interface
- **API response messages must support internationalization** to enable multi-language frontend interface
- Interface supports Japanese, Chinese, and English
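
One way to meet the internationalization requirement is to return a stable message key and resolve it per locale. The sketch below is an illustration only: the `MessageKey` values, the `messages` table, and `translate` are hypothetical and not taken from the project's code.

```typescript
type Locale = "ja" | "zh" | "en";
type MessageKey = "FILE_NOT_FOUND" | "UPLOAD_FAILED"; // hypothetical keys

const messages: Record<Locale, Record<MessageKey, string>> = {
  en: { FILE_NOT_FOUND: "File not found", UPLOAD_FAILED: "Upload failed" },
  ja: { FILE_NOT_FOUND: "ファイルが見つかりません", UPLOAD_FAILED: "アップロードに失敗しました" },
  zh: { FILE_NOT_FOUND: "未找到文件", UPLOAD_FAILED: "上传失败" },
};

// Resolve a message key for the requested locale, falling back to English
// when the locale is not one of the supported ones.
function translate(key: MessageKey, locale: string): string {
  const known = Object.prototype.hasOwnProperty.call(messages, locale);
  const table = known ? messages[locale as Locale] : messages.en;
  return table[key];
}
```

With this shape, the backend can send only the key (for example `FILE_NOT_FOUND`) and let the frontend pick the text for the user's language.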

### Testing
- Backend uses Jest for unit and e2e tests
- Frontend currently has no test framework configured
- Run tests: `cd server && yarn test` or `yarn test:e2e`

### Code Quality
- ESLint and Prettier configured for backend
- Format code: `cd server && yarn format`
- Lint code: `cd server && yarn lint`

## Common Development Tasks

### Adding a New API Endpoint
1. Create controller in appropriate module under `server/src/`
2. Add service methods with English comments
3. Update DTOs and validation
4. Add tests in `*.spec.ts` files

### Adding a New Frontend Component
1. Create component in `web/components/`
2. Add TypeScript interfaces in `web/types.ts`
3. Use Tailwind CSS for styling
4. Connect to backend services in `web/services/`

### Debugging
- Existing backend logs are in Chinese, although new log messages must be in English (see Language Requirements)
- Check Elasticsearch: `curl http://localhost:9200/_cat/indices`
- Check Tika: `curl http://localhost:9998/tika`
- Check LibreOffice: `curl http://localhost:8100/health`
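
The three service checks above can also be scripted. This is a sketch under the assumption that the services run on their default local ports; `checkServices` and the `Fetcher` type are hypothetical names introduced here, and the fetch function is passed in so the check can be stubbed.

```typescript
// Minimal fetch-like signature; only the `ok` flag of the response is used.
type Fetcher = (url: string) => Promise<{ ok: boolean }>;

const endpoints: Record<string, string> = {
  elasticsearch: "http://localhost:9200/_cat/indices",
  tika: "http://localhost:9998/tika",
  libreoffice: "http://localhost:8100/health",
};

// Returns a map of service name -> true if the endpoint answered with a
// successful status, false if it answered with an error or was unreachable.
async function checkServices(fetcher: Fetcher): Promise<Record<string, boolean>> {
  const result: Record<string, boolean> = {};
  for (const [name, url] of Object.entries(endpoints)) {
    try {
      result[name] = (await fetcher(url)).ok;
    } catch {
      result[name] = false; // connection refused, DNS failure, etc.
    }
  }
  return result;
}
```

On Node.js 18+, where `fetch` is available globally, it could be called as `checkServices((url) => fetch(url)).then(console.log)`.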

## Environment Configuration

Key environment variables (`server/.env`):
- `OPENAI_API_KEY`: OpenAI-compatible API key
- `GEMINI_API_KEY`: Google Gemini API key
- `ELASTICSEARCH_HOST`: Elasticsearch URL (default: http://localhost:9200)
- `TIKA_HOST`: Apache Tika URL (default: http://localhost:9998)
- `LIBREOFFICE_URL`: LibreOffice server URL (default: http://localhost:8100)
- `JWT_SECRET`: JWT signing secret
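
A small loader can make the documented defaults explicit. This is a sketch, not the project's actual configuration code: `loadConfig`, `AppConfig`, and `Env` are hypothetical names, and treating `JWT_SECRET` as mandatory is an assumption.

```typescript
interface AppConfig {
  openaiApiKey?: string;
  geminiApiKey?: string;
  elasticsearchHost: string;
  tikaHost: string;
  libreofficeUrl: string;
  jwtSecret: string;
}

type Env = Record<string, string | undefined>;

// Build a typed config from environment variables, applying the defaults
// listed above when a variable is not set.
function loadConfig(env: Env): AppConfig {
  const jwtSecret = env["JWT_SECRET"];
  if (!jwtSecret) {
    throw new Error("JWT_SECRET is not set"); // assumption: treated as required
  }
  return {
    openaiApiKey: env["OPENAI_API_KEY"],
    geminiApiKey: env["GEMINI_API_KEY"],
    elasticsearchHost: env["ELASTICSEARCH_HOST"] ?? "http://localhost:9200",
    tikaHost: env["TIKA_HOST"] ?? "http://localhost:9998",
    libreofficeUrl: env["LIBREOFFICE_URL"] ?? "http://localhost:8100",
    jwtSecret,
  };
}
```

In a Node.js process this would typically be called once at startup as `loadConfig(process.env)`.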

## Deployment

### Development
```bash
docker-compose up -d elasticsearch tika libreoffice
yarn dev
```

### Production
```bash
docker-compose up -d  # Builds and starts all services
```

### Ports in Production
- Frontend: 80/443 (via nginx)
- Backend API: 3001 (proxied through nginx)
- Elasticsearch: 9200
- Tika: 9998
- LibreOffice: 8100

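## AI Workflow Instructions

This project has **gstack** (54 skills) and **Superpowers** (14 skills) installed. Coordinate their use according to the following rules:

### Automatic Trigger Rules
When the user's intent matches one of the scenarios below, **automatically invoke the corresponding gstack skill** (using the Skill tool), and tell the user which skill is being started before invoking it:

- User discusses requirements, ideas, or product direction → invoke the `office-hours` skill and say "Starting **office-hours** (product strategy advisor)..."
- User discusses feature scope or priorities → invoke the `plan-ceo-review` skill and say "Starting **plan-ceo-review** (strategic review)..."
- User discusses technical approach or architecture design → invoke the `plan-eng-review` skill and say "Starting **plan-eng-review** (architecture review)..."
- User asks for a code review → invoke the `review` skill and say "Starting **review** (code review)..."
- User asks for testing/QA → invoke the `qa` skill and say "Starting **qa** (automated testing)..."
- User asks for a security review → invoke the `cso` skill and say "Starting **cso** (security audit)..."
- User asks for a release → invoke the `ship` skill and say "Starting **ship** (release process)..."
- User reports a bug that needs debugging → invoke the `investigate` skill and say "Starting **investigate** (systematic debugging)..."

### Superpowers Keeps Its Automatic Triggers
Superpowers skills (brainstorming, test-driven-development, systematic-debugging, etc.) keep their existing automatic trigger mechanism; do not interfere with it.

### Notification
Each time a gstack skill is invoked automatically, explicitly tell the user which skill is being started and what it does.
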
## Troubleshooting

### Common Issues
1. **Elasticsearch not starting**: Check memory limits in `docker-compose.yml`
2. **File upload failures**: Ensure `uploads/` and `temp/` directories exist with proper permissions
3. **Vision pipeline errors**: Verify LibreOffice server is running and accessible
4. **API key errors**: Check environment variables in `server/.env`

### Database Management
- SQLite database: `server/data/metadata.db`
- Elasticsearch indices: Managed automatically by the application
- To reset: Delete `server/data/metadata.db` and Elasticsearch data volume

## Documentation

- **README.md**: Project overview in Japanese
- **docs/**: Comprehensive documentation (mostly Japanese/Chinese)
- **DESIGN.md**: System architecture and design
- **API.md**: API reference
- **DEVELOPMENT_STANDARDS.md**: Mandates English comments/logs and internationalized messages

When modifying code, always add English comments and logs as required by development standards. Error and UI messages must be properly internationalized. The project has extensive existing documentation in Japanese/Chinese; refer to the `docs/` directory for detailed technical information.