# Feishu WebSocket Integration - Design Document
**Date**: 2026-03-17
**Author**: Sisyphus AI Agent
**Status**: Draft for Review
## Executive Summary
Add WebSocket long-connection support for the Feishu bot integration, enabling deployment on an internal network without a public domain or internet-facing endpoints. The system will support both the existing webhook mode and the new WebSocket mode, so users can choose their preferred connection method.
---
## 1. Architecture Overview
### Current Architecture (Webhook Mode)
```
Feishu Cloud → Public Domain → NAT/Firewall → Internal Server
                                              └→ POST /api/feishu/webhook/:appId
```
**Limitations:**
- Requires public domain with SSL certificate
- Requires NAT/firewall port forwarding or reverse proxy
- Not suitable for pure internal network deployment
### New Architecture (WebSocket Mode)
```
Feishu Cloud ←──────── WebSocket (wss://open.feishu.cn) ──────── Internal Server
Feishu Cloud → Webhook (optional backup) → Internal Server
```
**Advantages:**
- No public domain required
- No inbound NAT/firewall configuration needed; only outbound access to Feishu is required
- Direct connection from internal network to Feishu cloud
- Real-time message delivery over a persistent connection
- Connection reuse, giving better resource efficiency
### Architecture Diagram
```
┌─────────────────────────────────────────────────────────────┐
│                    Feishu Open Platform                     │
│               (WebSocket Event Subscription)                │
└───────────────────────┬─────────────────────────────────────┘
          ┌─────────────┴─────────────┐
          │                           │
    ┌─────▼──────┐             ┌──────▼─────┐
    │   Bot A    │             │    Bot B   │
    │ ws://.../A │             │ ws://.../B │
    └─────┬──────┘             └──────┬─────┘
          │                           │
    ┌─────▼───────────────────────────▼─────┐
    │             AuraK Server              │
    │ ┌───────────────────────────────────┐ │
    │ │           FeishuModule            │ │
    │ │  ┌─────────────────────────────┐  │ │
    │ │  │       FeishuWsManager       │  │ │
    │ │  │  - per-bot connections      │  │ │
    │ │  │  - auto-reconnect           │  │ │
    │ │  │  - message routing          │  │ │
    │ │  └─────────────────────────────┘  │ │
    │ │  ┌─────────────────────────────┐  │ │
    │ │  │        FeishuService        │  │ │
    │ │  │  - existing logic           │  │ │
    │ │  │  - new ws connect/disconnect│  │ │
    │ │  └─────────────────────────────┘  │ │
    │ │  ┌─────────────────────────────┐  │ │
    │ │  │      FeishuController       │  │ │
    │ │  │  - webhook endpoints        │  │ │
    │ │  │  - ws management APIs       │  │ │
    │ │  └─────────────────────────────┘  │ │
    │ └───────────────────────────────────┘ │
    └───────────────────────────────────────┘
```
---
## 2. Implementation Plan
### 2.1 New Components
#### 2.1.1 FeishuWsManager
**Purpose**: Manage WebSocket connections for each bot
**Responsibilities:**
- Establish and maintain WebSocket connections
- Handle connection lifecycle (connect, disconnect, reconnect)
- Route incoming messages to appropriate bot handlers
- Manage connection state per bot
**Location**: `server/src/feishu/feishu-ws.manager.ts`
**Key Methods:**
```typescript
class FeishuWsManager {
  // Start WebSocket connection for a bot
  async connect(bot: FeishuBot): Promise<void>

  // Stop WebSocket connection for a bot
  async disconnect(botId: string): Promise<void>

  // Get connection status
  getStatus(botId: string): ConnectionStatus

  // Get all active connections
  getAllConnections(): Map<string, ConnectionStatus>
}
```
#### 2.1.2 ConnectionStatus Type
```typescript
enum ConnectionState {
  DISCONNECTED = 'disconnected',
  CONNECTING = 'connecting',
  CONNECTED = 'connected',
  ERROR = 'error'
}

interface ConnectionStatus {
  botId: string
  state: ConnectionState
  connectedAt?: Date
  lastHeartbeat?: Date
  error?: string
}
```
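To show how these types might be used, the following sketch tracks per-bot status in a `Map`. The `StatusRegistry` class and its method names are hypothetical, and the state values are written as a string union that mirrors the values of `ConnectionState`; the design does not prescribe this structure.

```typescript
// String values mirror the ConnectionState enum above.
type ConnectionStateValue = 'disconnected' | 'connecting' | 'connected' | 'error'

interface ConnectionStatus {
  botId: string
  state: ConnectionStateValue
  connectedAt?: Date
  lastHeartbeat?: Date
  error?: string
}

// Hypothetical helper: an in-memory registry of per-bot connection status.
class StatusRegistry {
  private readonly statuses = new Map<string, ConnectionStatus>()

  // Unknown bots are reported as disconnected rather than throwing.
  get(botId: string): ConnectionStatus {
    return this.statuses.get(botId) ?? { botId, state: 'disconnected' }
  }

  markConnecting(botId: string): void {
    this.statuses.set(botId, { botId, state: 'connecting' })
  }

  markConnected(botId: string, now: Date = new Date()): void {
    this.statuses.set(botId, { botId, state: 'connected', connectedAt: now, lastHeartbeat: now })
  }

  // Heartbeats only update bots that are currently connected.
  heartbeat(botId: string, now: Date = new Date()): void {
    const current = this.statuses.get(botId)
    if (current?.state === 'connected') {
      this.statuses.set(botId, { ...current, lastHeartbeat: now })
    }
  }

  markError(botId: string, error: string): void {
    this.statuses.set(botId, { botId, state: 'error', error })
  }

  markDisconnected(botId: string): void {
    this.statuses.set(botId, { botId, state: 'disconnected' })
  }
}
```

Replacing the whole record on each transition keeps stale fields such as `connectedAt` from leaking into the `error` or `disconnected` states.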
### 2.2 Modified Components
#### 2.2.1 FeishuService
**New Methods:**
```typescript
class FeishuService {
  // Start WebSocket connection for a bot
  async startWsConnection(botId: string): Promise<void>

  // Stop WebSocket connection for a bot
  async stopWsConnection(botId: string): Promise<void>

  // Get connection status
  async getWsStatus(botId: string): Promise<ConnectionStatus>

  // List all connection statuses
  async getAllWsStatuses(): Promise<ConnectionStatus[]>
}
```
#### 2.2.2 FeishuController
**New Endpoints:**
```typescript
// POST /api/feishu/bots/:id/ws/connect    - Start WebSocket connection
// POST /api/feishu/bots/:id/ws/disconnect - Stop WebSocket connection
// GET  /api/feishu/bots/:id/ws/status     - Get connection status
// GET  /api/feishu/ws/status              - Get all connection statuses
```
**Modified Endpoints:**
- Keep existing webhook endpoints unchanged
- Add WebSocket status indicator in bot list response
#### 2.2.3 FeishuBot Entity
**New Fields:**
```typescript
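@Entity('feishu_bots')
export class FeishuBot {
  // ... existing fields ...

  // Opt-in flag; existing bots default to webhook mode
  @Column({ default: false })
  useWebSocket: boolean

  // Last known connection state; a nullable column needs an explicit type
  @Column({ type: 'varchar', nullable: true })
  wsConnectionState: string | null
}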
```
### 2.3 Feishu SDK Integration
**Package**: `@larksuiteoapi/node-sdk`
**Installation:**
```bash
cd server && yarn add @larksuiteoapi/node-sdk
```
**Configuration:**
```typescript
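import * as Lark from '@larksuiteoapi/node-sdk'
// Long-connection mode uses WSClient, which authenticates with the app credentials
const wsClient = new Lark.WSClient({
  appId: bot.appId,
  appSecret: bot.appSecret,
  loggerLevel: Lark.LoggerLevel.info,
})
// Incoming events are dispatched by type; no public callback URL is involved
wsClient.start({
  eventDispatcher: new Lark.EventDispatcher({}).register({
    'im.message.receive_v1': async (data) => { /* forward to FeishuService.processChatMessage() */ },
  }),
})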
```
### 2.4 Event Handling
**Flow for WebSocket Mode:**
```
Feishu Cloud ──WebSocket──> FeishuWsManager.on('message')
                                  Parse event type
                          ┌───────────────┼───────────────┐
                          │               │               │
                     im.message.     im.message.        other
                     receive_v1   p2p_msg_received
                          │               │
                          ▼               ▼
                 FeishuService.processChatMessage()
                          Send response via
                   FeishuService.sendTextMessage()
```
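The routing step in this flow can be sketched as a small dispatch table keyed by event type. This is only an illustration: `EventRouter`, `IncomingEvent`, and `chatHandler` are hypothetical names, and the handler is a stand-in for `FeishuService.processChatMessage()` that simply echoes its payload.

```typescript
// Minimal event shape for this sketch; real Feishu payloads carry more fields.
interface IncomingEvent {
  type: string
  payload: unknown
}

type EventHandler = (payload: unknown) => Promise<string>

// Hypothetical router: dispatches by event type and ignores unregistered types.
class EventRouter {
  private readonly handlers = new Map<string, EventHandler>()

  register(type: string, handler: EventHandler): this {
    this.handlers.set(type, handler)
    return this
  }

  // Resolves to the handler's reply, or null for event types with no handler.
  async route(event: IncomingEvent): Promise<string | null> {
    const handler = this.handlers.get(event.type)
    return handler ? handler(event.payload) : null
  }
}

// Stand-in for FeishuService.processChatMessage(); it only echoes the payload.
const chatHandler: EventHandler = async (payload) => `processed: ${JSON.stringify(payload)}`

// Both message event types from the diagram share the same handler.
const router = new EventRouter()
  .register('im.message.receive_v1', chatHandler)
  .register('im.message.p2p_msg_received', chatHandler)
```

Returning `null` for the "other" branch lets the caller decide whether unhandled event types should be logged or silently dropped.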
### 2.5 Configuration Changes
**Feishu Open Platform:**
Users must configure the following in the Feishu developer console:
1. Go to "Event Subscription" (事件与回调)
2. Select "Use long connection to receive events" (使用长连接接收事件)
3. Add event: `im.message.receive_v1`
4. **Important**: The local WebSocket client must already be running before you save this setting
---
## 3. Data Flow
### WebSocket Message Flow
```
1. User triggers connect API
2. FeishuController.connect(botId)
3. FeishuService.startWsConnection(botId)
4. FeishuWsManager.connect(bot)
5. SDK establishes WebSocket to open.feishu.cn
6. Connection established, events flow:
   ├─> on('message') ──> _processEvent() ──> _handleMessage()
   │                                                │
   │                                                ▼
   │                               FeishuService.processChatMessage()
   │                                                │
   │                                                ▼
   │                                 FeishuService.sendTextMessage() (via SDK)
   ├─> on('error') ──> log error ──> trigger reconnect
   └─> on('close') ──> trigger auto-reconnect
```
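Step 6 of this flow could be wired up roughly as follows. The `Transport` and `LifecycleHooks` interfaces are hypothetical stand-ins, not the SDK's real interface; they exist only to show which callback leads to which action.

```typescript
// Hypothetical stand-in for the SDK connection; each method corresponds to one
// of the on('message' | 'error' | 'close') hooks in the flow above.
interface Transport {
  onMessage(listener: (raw: string) => void): void
  onError(listener: (err: Error) => void): void
  onClose(listener: () => void): void
}

// Actions the manager performs in response to connection events (assumed names).
interface LifecycleHooks {
  handleMessage(raw: string): void
  scheduleReconnect(reason: 'error' | 'close'): void
  log(message: string): void
}

function wireLifecycle(transport: Transport, hooks: LifecycleHooks): void {
  // Messages go straight to the processing pipeline.
  transport.onMessage((raw) => hooks.handleMessage(raw))

  // Errors are logged first, then a reconnect is requested.
  transport.onError((err) => {
    hooks.log(`WebSocket error: ${err.message}`)
    hooks.scheduleReconnect('error')
  })

  // A closed connection always requests a reconnect.
  transport.onClose(() => hooks.scheduleReconnect('close'))
}
```

Because an error is often followed by a close, a real `scheduleReconnect` would likely need to ignore a second request while a reconnect is already pending.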
---
## 4. Error Handling
### Connection Errors
| Error Type | Handling |
|------------|----------|
| Network timeout | Retry with exponential backoff (max 5 attempts) |
| Invalid credentials | Mark bot as error state, notify user |
| Token expired | Refresh token, reconnect |
| Feishu server error | Wait 30s, retry |
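
The table above can be read as a mapping from error type to recovery action. The sketch below encodes it that way; the type names and the `recoveryFor` function are hypothetical, and only the numbers (5 attempts, 30 seconds) come from the table.

```typescript
type ConnectionErrorKind =
  | 'network_timeout'
  | 'invalid_credentials'
  | 'token_expired'
  | 'server_error'

type RecoveryAction =
  | { kind: 'retry_with_backoff'; maxAttempts: number }
  | { kind: 'mark_error_and_notify' }
  | { kind: 'refresh_token_and_reconnect' }
  | { kind: 'retry_after'; delayMs: number }

// One case per table row; the union type makes the switch exhaustive.
function recoveryFor(error: ConnectionErrorKind): RecoveryAction {
  switch (error) {
    case 'network_timeout':
      return { kind: 'retry_with_backoff', maxAttempts: 5 }
    case 'invalid_credentials':
      // Retrying cannot succeed until the user fixes the credentials.
      return { kind: 'mark_error_and_notify' }
    case 'token_expired':
      return { kind: 'refresh_token_and_reconnect' }
    case 'server_error':
      return { kind: 'retry_after', delayMs: 30000 }
  }
}
```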
### Auto-Reconnect Strategy
```
Initial delay: 1 second
Max delay: 60 seconds
Backoff multiplier: 2x
Max attempts: 5
Reset on successful connection
```
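As a minimal sketch of this strategy, the delay before each retry can be computed from the attempt number. The function and option names here are illustrative assumptions; the defaults are the values listed above.

```typescript
interface BackoffOptions {
  initialDelayMs: number
  maxDelayMs: number
  multiplier: number
  maxAttempts: number
}

// Defaults taken from the strategy above.
const DEFAULT_BACKOFF: BackoffOptions = {
  initialDelayMs: 1000,
  maxDelayMs: 60000,
  multiplier: 2,
  maxAttempts: 5,
}

// Delay before the given retry attempt (1-based), or null once attempts are exhausted.
function reconnectDelayMs(attempt: number, opts: BackoffOptions = DEFAULT_BACKOFF): number | null {
  if (attempt < 1 || attempt > opts.maxAttempts) return null
  const delay = opts.initialDelayMs * Math.pow(opts.multiplier, attempt - 1)
  return Math.min(delay, opts.maxDelayMs)
}
```

With these defaults the five attempts wait 1, 2, 4, 8, and 16 seconds, so the 60-second cap only takes effect if the attempt limit is raised. "Reset on successful connection" corresponds to the caller starting again from attempt 1 after a connection succeeds.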
---
## 5. API Design
### 5.1 Connect WebSocket
```
POST /api/feishu/bots/:id/ws/connect
Response 200:
{
  "success": true,
  "botId": "bot_xxx",
  "status": "connecting"
}
Response 400:
{
  "success": false,
  "error": "Bot not found or disabled"
}
```
### 5.2 Disconnect WebSocket
```
POST /api/feishu/bots/:id/ws/disconnect
Response 200:
{
  "success": true,
  "botId": "bot_xxx",
  "status": "disconnected"
}
```
### 5.3 Get Connection Status
```
GET /api/feishu/bots/:id/ws/status
Response 200:
{
  "botId": "bot_xxx",
  "state": "connected",
  "connectedAt": "2026-03-17T10:00:00Z",
  "lastHeartbeat": "2026-03-17T10:05:00Z"
}
```
### 5.4 Get All Statuses
```
GET /api/feishu/ws/status
Response 200:
{
  "connections": [
    {
      "botId": "bot_xxx",
      "state": "connected",
      "connectedAt": "2026-03-17T10:00:00Z"
    },
    {
      "botId": "bot_yyy",
      "state": "disconnected"
    }
  ]
}
```
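One way the status responses above might be produced from in-memory status records is sketched below. The `StatusRecord` and `StatusDto` shapes and both function names are assumptions for illustration; timestamps are serialized without milliseconds so they match the examples.

```typescript
type ConnectionStateName = 'disconnected' | 'connecting' | 'connected' | 'error'

// In-memory record, using Date objects as in ConnectionStatus.
interface StatusRecord {
  botId: string
  state: ConnectionStateName
  connectedAt?: Date
  lastHeartbeat?: Date
  error?: string
}

// Wire format: dates become ISO 8601 strings, absent fields are omitted.
interface StatusDto {
  botId: string
  state: ConnectionStateName
  connectedAt?: string
  lastHeartbeat?: string
  error?: string
}

// Date.toISOString() includes milliseconds; strip them to match the examples.
function toIsoSeconds(date: Date): string {
  return date.toISOString().replace(/\.\d{3}Z$/, 'Z')
}

function toStatusDto(status: StatusRecord): StatusDto {
  const dto: StatusDto = { botId: status.botId, state: status.state }
  if (status.connectedAt) dto.connectedAt = toIsoSeconds(status.connectedAt)
  if (status.lastHeartbeat) dto.lastHeartbeat = toIsoSeconds(status.lastHeartbeat)
  if (status.error) dto.error = status.error
  return dto
}

// Body for GET /api/feishu/ws/status.
function toAllStatusesResponse(statuses: Iterable<StatusRecord>): { connections: StatusDto[] } {
  return { connections: Array.from(statuses, (s) => toStatusDto(s)) }
}
```

Accepting any `Iterable` means the values of a `Map<string, ConnectionStatus>` can be passed in directly.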
---
## 6. Security Considerations
1. **Credential Storage**: App ID and App Secret are stored encrypted in the database
2. **Connection Validation**: Verify that the bot belongs to the authenticated user before connecting
3. **Rate Limiting**: Implement per-bot message rate limiting
4. **Connection Limits**: Max 10 concurrent WebSocket connections per server instance
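
Items 3 and 4 could be enforced with two small checks, sketched below. The fixed-window algorithm, the class and function names, and the limiter's parameters are assumptions for illustration; only the limit of 10 connections comes from the list above.

```typescript
const MAX_CONNECTIONS = 10

// Fixed-window limiter; the per-bot limit and window size are chosen by the caller.
class PerBotRateLimiter {
  private readonly windows = new Map<string, { startedAt: number; count: number }>()
  private readonly limit: number
  private readonly windowMs: number

  constructor(limit: number, windowMs: number) {
    this.limit = limit
    this.windowMs = windowMs
  }

  // Returns true when the message may be processed, false when the bot is over its limit.
  allow(botId: string, nowMs: number): boolean {
    const current = this.windows.get(botId)
    if (!current || nowMs - current.startedAt >= this.windowMs) {
      // First message for this bot, or the previous window has expired.
      this.windows.set(botId, { startedAt: nowMs, count: 1 })
      return true
    }
    if (current.count >= this.limit) return false
    current.count += 1
    return true
  }
}

// A bot that already holds a connection does not consume a new slot.
function canOpenConnection(
  activeBotIds: ReadonlySet<string>,
  botId: string,
  max: number = MAX_CONNECTIONS,
): boolean {
  return activeBotIds.has(botId) || activeBotIds.size < max
}
```

Passing the current time in as `nowMs` keeps the limiter deterministic, which makes it straightforward to cover in the unit tests described in the next section.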
---
## 7. Testing Strategy
### Unit Tests
- FeishuWsManager connection lifecycle
- Message routing logic
- Error handling and reconnection
### Integration Tests
- Full WebSocket connection flow
- Message send/receive cycle
- Reconnection after network failure
### Manual Testing
- Local development without ngrok
- Verify webhook still works alongside WebSocket
---
## 8. Backward Compatibility
- **Existing webhook endpoints**: Unchanged, continue to work
- **Bot configuration**: Existing bots keep webhook mode by default
- **Migration path**: Users can switch to WebSocket anytime via API
- **Dual mode**: Both modes can run simultaneously for different bots
---
## 9. Migration Guide
### For Existing Users
1. Update AuraK to new version (with WebSocket support)
2. Install Feishu SDK: `yarn add @larksuiteoapi/node-sdk`
3. Call `POST /api/feishu/bots/:id/ws/connect` to start the local WebSocket client for the bot
4. In the Feishu Developer Console, while the client is running, change the event subscription to "Use long connection to receive events"
---
## 10. Limitations
1. **Outbound Network Required**: Server must be able to reach `open.feishu.cn` via WebSocket
2. **Single Connection Per Bot**: Each bot needs its own WebSocket connection
3. **Feishu SDK Required**: The official SDK must be installed; raw WebSocket connections are not supported
4. **Private Feishu**: Private (self-hosted) Feishu deployments are not supported
---
## 11. File Changes Summary
### New Files
- `server/src/feishu/feishu-ws.manager.ts` - WebSocket connection manager
- `server/src/feishu/dto/ws-status.dto.ts` - WebSocket status DTOs
### Modified Files
- `server/src/feishu/feishu.service.ts` - Add WS methods
- `server/src/feishu/feishu.controller.ts` - Add WS endpoints
- `server/src/feishu/entities/feishu-bot.entity.ts` - Add WS fields
- `server/src/feishu/feishu.module.ts` - Register new manager
### Dependencies
- Add: `@larksuiteoapi/node-sdk`
---
## 12. Success Criteria
- [ ] Server can establish WebSocket connection to Feishu
- [ ] Messages received via WebSocket are processed correctly
- [ ] Responses sent back to Feishu via SDK
- [ ] Auto-reconnect works after network interruption
- [ ] Webhook mode continues to work unchanged
- [ ] Both modes can coexist for different bots
- [ ] Internal network deployment works without public domain