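chore: SETUP.md + test report scripts + documentation updates

- SETUP.md: complete environment setup guide (for colleagues)
- SETUP_QUICK.md: quick environment setup (4 steps)
- s22~s26: TNA end-to-end tests, coverage report, regression check
- procedure_grammar.lark: experimental Lark grammar

Co-Authored-By: Claude <noreply@anthropic.com>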
@@ -0,0 +1,364 @@
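# COBOL Test Data Generator: Environment Setup and Run Guide

## 1. System Overview

The COBOL test data generator (cobol-java-v3) is a Python toolchain that parses COBOL programs, extracts their control-flow structure, generates test data covering every branch, and writes that data in a fixed flat-file format for compiling and running with GnuCOBOL.

### Core Capabilities

| Capability | Description |
|------|------|
| Parse the COBOL DATA DIVISION | Lark grammar (Earley parser) → field definitions |
| Parse the COBOL PROCEDURE DIVISION | Line-level state machine → decision-point tree |
| Branch-coverage data generation | True/False paths generated for each decision point → records |
| Flat file output | COBOL fixed-length binary files |
| GnuCOBOL compile and run | Test data → compile with cobc → run and verify |
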
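
To make the "flat file output" capability concrete, here is a minimal sketch of packing one record into a fixed-length byte string. It is not the project's `flatfile.py`: the `LAYOUT` table, the field names, and `pack_record` are hypothetical, and the sketch assumes text (DISPLAY) fields only, with numeric fields zero-padded on the left and alphanumeric fields space-padded on the right. Every record produced this way has the same length, which is what a COBOL program reading fixed-length records expects.

```python
# Hypothetical field layout: (name, width, kind). Not taken from the project.
LAYOUT = [
    ("CUST-ID", 6, "num"),       # like PIC 9(6): zero-padded on the left
    ("CUST-NAME", 10, "alnum"),  # like PIC X(10): space-padded on the right
    ("AMOUNT", 5, "num"),        # like PIC 9(5)
]


def pack_record(values: dict, layout=LAYOUT) -> bytes:
    """Encode one record as a fixed-length byte string, field by field."""
    parts = []
    for name, width, kind in layout:
        raw = str(values.get(name, ""))
        text = raw.rjust(width, "0") if kind == "num" else raw.ljust(width)
        if len(text) > width:
            raise ValueError(f"{name}: value {raw!r} exceeds width {width}")
        parts.append(text.encode("ascii"))
    return b"".join(parts)


record = pack_record({"CUST-ID": 42, "CUST-NAME": "ALICE", "AMOUNT": 1500})
print(record)       # b'000042ALICE     01500'
print(len(record))  # 21
```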

---

## 2. Prerequisites

### 2.1 Hardware Requirements

| Item | Minimum | Recommended |
|------|------|------|
| CPU | 2 cores | 4+ cores |
| Memory | 4 GB | 8 GB |
| Disk | 500 MB | 2 GB |
| OS | Windows 10/11 64-bit | Windows 11 |

### 2.2 Software Requirements

| Software | Version | Purpose |
|------|------|------|
| **Python** | 3.12+ | Runs the test data generator |
| **GnuCOBOL (cobc)** | 3.2.0 | Compiles COBOL programs and verifies them at run time |
| **Git** | Any | Fetches the code |

### 2.3 Python Dependencies

```
lark>=1.1.0        # Lark Earley parser (DATA DIVISION parsing)
```

`pathlib` (path handling) is part of the Python standard library and does not need to be installed separately.

Install command:
```bash
pip install lark
```

### 2.4 Installing GnuCOBOL

GnuCOBOL 3.2.0 (formerly OpenCOBOL) must be installed separately.

**Download**:
- GnuCOBOL 3.2 Windows binary package
- Recommended: the GC32-BDB-SP1 build (includes DB2/SQLite support)

**Verify after installation**:
```bash
cobc --version
# Example output: cobc (GnuCOBOL) 3.2.0
```

**Environment variables**:
```bash
# cobc must be on PATH
# Typical path: C:\GnuCOBOL\bin
# or a custom installation path

# COB_LIBRARY_PATH is used at run time to locate DLLs (subprograms compiled as SHARED)
# e.g.: set COB_LIBRARY_PATH=D:\cobol-java\cobol-tna-system\bin
```
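
The environment requirements above can be checked from Python before running anything. The following is a minimal sketch, not part of the project: `split_library_path` and `check_environment` are hypothetical helper names, and the sketch assumes `COB_LIBRARY_PATH` entries are separated by the platform's path separator (`;` on Windows, `:` elsewhere). It reports a problem when `cobc` cannot be found on PATH or when a `COB_LIBRARY_PATH` entry is not an existing directory.

```python
import os
import shutil


def split_library_path(value: str, sep: str = os.pathsep) -> list[str]:
    """Split a COB_LIBRARY_PATH-style string into non-empty directory entries."""
    return [entry for entry in value.split(sep) if entry.strip()]


def check_environment(env=os.environ) -> list[str]:
    """Return human-readable problems; an empty list means every check passed."""
    problems = []
    # shutil.which searches the given PATH string for an executable named cobc
    if shutil.which("cobc", path=env.get("PATH")) is None:
        problems.append("cobc not found on PATH")
    for directory in split_library_path(env.get("COB_LIBRARY_PATH", "")):
        if not os.path.isdir(directory):
            problems.append(f"COB_LIBRARY_PATH entry does not exist: {directory}")
    return problems


if __name__ == "__main__":
    for line in check_environment() or ["environment looks OK"]:
        print(line)
```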

---

## 3. Environment Setup Steps

### 3.1 Install Python 3.12+

```bash
# Download: https://www.python.org/downloads/
# During installation, check "Add Python to PATH"
python --version
# Python 3.12.x

# pathlib ships with Python, so only lark needs to be installed
pip install lark
```

### 3.2 Install GnuCOBOL 3.2

1. Download the GC32-BDB-SP1 package
2. Extract it to a directory such as `D:\360安全浏览器下载\GC32-BDB-SP1-rename-7z-to-exe\`
3. Add its `bin\` subdirectory to the system PATH
4. Verify:
```bash
cobc --version
# cobc (GnuCOBOL) 3.2.0
```
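
The version check in step 4 can also be automated. This is a small sketch under the assumption that the first line of `cobc --version` ends with a dotted version number, as in the example output above; `parse_cobc_version` and `installed_cobc_version` are hypothetical helper names, not part of the project.

```python
import re
import subprocess


def parse_cobc_version(output: str) -> tuple[int, ...]:
    """Extract the version tuple from the first line of `cobc --version` output."""
    first_line = output.splitlines()[0] if output else ""
    match = re.search(r"(\d+(?:\.\d+)+)", first_line)
    if match is None:
        raise ValueError(f"no version number found in {first_line!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def installed_cobc_version() -> tuple[int, ...]:
    """Run cobc and parse its version; raises FileNotFoundError if cobc is missing."""
    result = subprocess.run(
        ["cobc", "--version"], capture_output=True, text=True, check=True
    )
    return parse_cobc_version(result.stdout)


print(parse_cobc_version("cobc (GnuCOBOL) 3.2.0"))  # (3, 2, 0)
```

Because versions are returned as tuples, a check such as `installed_cobc_version() >= (3, 2)` compares them component by component.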

### 3.3 Clone the Code

```bash
# Clone under D:\cobol-java so the paths used in the rest of this guide match
cd D:\cobol-java
git clone https://gittea.dev/hangshuo652/cobol-java-v3.git
# or pull from an existing clone
cd D:\cobol-java\cobol-java-v3
git pull
```

### 3.4 Verify the Installation

```bash
cd D:\cobol-java\cobol-java-v3
python -c "from cobol_testgen import extract_structure; print('OK')"
# Output: OK
```


---

## 4. Directory Structure

```
cobol-java-v3/
├── cobol_testgen/                 # Core code
│   ├── __init__.py                # Public API (extract_structure, generate_data)
│   ├── read.py                    # Preprocessor + DATA DIVISION parsing
│   ├── core.py                    # Legacy PROCEDURE DIVISION parser (BrParser)
│   ├── cond.py                    # Condition parser
│   ├── coverage.py                # Coverage statistics
│   ├── design_mcdc.py             # Linear path enumeration (O(N) instead of O(2^N))
│   ├── pipeline_bridge.py         # Bridge layer between the old and new parsers
│   ├── procedure_parser.py        # New PROCEDURE DIVISION parser
│   ├── flatfile.py                # Flat file writer
│   ├── design.py                  # Value generation + constraint application
│   ├── models.py                  # Data models (BrSeq, BrIf, BrEval...)
│   ├── grammar.lark               # DATA DIVISION Lark grammar
│   └── procedure_grammar.lark     # PROCEDURE DIVISION Lark grammar (experimental)
├── test-data/                     # Test suite
│   ├── s15_coverage_verification.py   # Basic coverage verification (8 control structures)
│   ├── s19_final_bridge_test.py       # Bridge verification
│   ├── s21_cond_fix_verify.py         # Condition parsing verification
│   ├── s25_per_program_report.py      # Detailed per-program report
│   └── s26_regression_check.py        # Regression check
├── SETUP.md                       # This file
└── docs/                          # Design documents
```
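
The tree describes `design_mcdc.py` only as linear path enumeration, O(N) instead of O(2^N). As an illustration of why a linear number of paths can be enough for branch coverage, here is a minimal sketch of one common approach: take a baseline path and flip one decision at a time. This is an assumption about the general idea, not the module's actual algorithm, and it treats the N decisions as independent (nested decisions would need extra handling). All function names are hypothetical.

```python
from itertools import product


def exhaustive_paths(n: int) -> list[tuple[bool, ...]]:
    """All 2**n True/False combinations for n independent decisions."""
    return list(product([True, False], repeat=n))


def linear_paths(n: int) -> list[tuple[bool, ...]]:
    """A baseline path (all True) plus one path per decision with it flipped."""
    baseline = (True,) * n
    paths = [baseline]
    for i in range(n):
        paths.append(baseline[:i] + (False,) + baseline[i + 1:])
    return paths


def covers_all_branches(paths, n: int) -> bool:
    """Check that every decision takes both outcomes somewhere in the path set."""
    return all({p[i] for p in paths} == {True, False} for i in range(n))


n = 10
print(len(exhaustive_paths(n)))                 # 1024
print(len(linear_paths(n)))                     # 11
print(covers_all_branches(linear_paths(n), n))  # True
```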

---

## 5. Running the Tests

### 5.1 Quick Verification (10 seconds)

```bash
cd D:\cobol-java\cobol-java-v3
python test-data/s15_coverage_verification.py
```

Expected output:
```
S15: 17 PASS / 0 FAIL
```

### 5.2 Full Coverage Report for All 43 Programs (2-3 minutes)

```bash
python test-data/s25_per_program_report.py
```

Expected end of output:
```
100%: 43 programs
TOTAL 3178 3178 100%
```

### 5.3 Quick Regression Check (2 minutes)

```bash
python test-data/s26_regression_check.py
```

Expected output:
```
Total: 3178/3178 = 100.00%
ALL 43/43 AT 100% — NO REGRESSIONS
```
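
If the regression check is run from a script or a CI job, the `Total:` line can be parsed instead of being read by eye. The sketch below assumes the line has the shape shown in the expected output and that the first number is the covered count and the second the total; `parse_total` and `has_regression` are hypothetical helpers, not part of the project.

```python
import re

TOTAL_RE = re.compile(r"Total:\s*(\d+)/(\d+)")


def parse_total(report_text: str) -> tuple[int, int]:
    """Return (covered, total) from a 'Total: covered/total = pct%' line."""
    match = TOTAL_RE.search(report_text)
    if match is None:
        raise ValueError("no 'Total:' line found in report output")
    return int(match.group(1)), int(match.group(2))


def has_regression(report_text: str) -> bool:
    """True when fewer branches are covered than exist."""
    covered, total = parse_total(report_text)
    return covered < total


sample = "Total: 3178/3178 = 100.00%"
print(parse_total(sample))     # (3178, 3178)
print(has_regression(sample))  # False
```

A wrapper could capture the output of `s26_regression_check.py` with `subprocess.run(..., capture_output=True, text=True)` and exit non-zero when `has_regression` returns `True`.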

### 5.4 Specifying COPYBOOK Directories

If a COBOL program depends on COPYBOOKs, pass `copybook_dirs` when calling `generate_data`:

```python
from pathlib import Path

from cobol_testgen import extract_structure, generate_data

# Read the raw COBOL source as UTF-8 (the file is closed automatically)
src = Path("program.cbl").read_text(encoding="utf-8")
st = extract_structure(src)  # parse control flow and decision points
recs = generate_data(src, st, copybook_dirs=["path/to/copybooks"])
```

---

## 6. Key APIs

### 6.1 extract_structure(cobol_source)

**Input**: COBOL program source text
**Returns**: dict containing the total branch count, the list of decision points, the branch tree object, and more

```python
st = extract_structure(src)
branches = st["total_branches"]     # total number of branches
dps = st["decision_points"]         # list of decision points
tree = st["branch_tree_obj"]        # branch tree object
```

### 6.2 generate_data(cobol_source, structure, copybook_dirs=None)

**Input**:
- `cobol_source`: the original COBOL source (not preprocessed)
- `structure`: the dict returned by `extract_structure`
- `copybook_dirs`: list of COPYBOOK search paths (optional)

**Returns**: list[dict], where each record holds a value for every field

```python
recs = generate_data(src, st)
# or with COPYBOOK directories
recs = generate_data(src, st, copybook_dirs=["./cpy", "../common/copybooks"])
```
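
Since the return value is a plain `list[dict]`, the generated records are easy to inspect with standard tools. The following sketch dumps records to CSV text; `records_to_csv` is a hypothetical helper and the sample records are stand-ins, because the actual field names depend on the COBOL program being analyzed.

```python
import csv
import io


def records_to_csv(records: list[dict]) -> str:
    """Serialize records to CSV text, using the union of all keys as the header."""
    fieldnames = []
    for rec in records:
        for key in rec:
            if key not in fieldnames:
                fieldnames.append(key)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, restval="", lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


# Stand-in records; real ones would come from generate_data(src, st).
sample = [{"WS-FLAG": "Y", "WS-AMT": 100}, {"WS-FLAG": "N", "WS-AMT": 0}]
print(records_to_csv(sample))
```

Collecting the header from all records, rather than only the first, keeps the export working even if some records omit a field.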

### 6.3 Coverage Data

After `generate_data` runs, the `structure` dict contains a `coverage` key:

```python
cov = st["coverage"]
total = cov["total"]                # total number of branches
covered = cov["covered"]            # number of covered branches
pct = cov["pct"]                    # coverage percentage
dps = cov["decision_points"]        # per-decision-point details
```
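
A short sketch of consuming the coverage dict follows. It relies only on the `total` and `covered` keys listed above and recomputes the percentage from those counts, because the scale of `pct` (0 to 100 or 0 to 1) is not stated here. `summarize_coverage` is a hypothetical helper and the sample dict is a stand-in for `st["coverage"]`.

```python
def summarize_coverage(cov: dict) -> str:
    """Format a one-line summary, recomputing the percentage from the raw counts."""
    total, covered = cov["total"], cov["covered"]
    if covered > total:
        raise ValueError(f"covered ({covered}) exceeds total ({total})")
    # A program with no branches is treated as fully covered
    pct = 100.0 * covered / total if total else 100.0
    return f"{covered}/{total} branches covered ({pct:.2f}%)"


# Stand-in dict with the keys listed above; real data comes from st["coverage"].
sample = {"total": 8, "covered": 6, "pct": 75.0, "decision_points": []}
print(summarize_coverage(sample))  # 6/8 branches covered (75.00%)
```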

---

## 7. Runtime Requirements in Detail (Configuration Checklist for Colleagues)

### Required

- [ ] Python 3.12+ is installed and on PATH
- [ ] `pip install lark` completed successfully
- [ ] GnuCOBOL (cobc) 3.2.0 is installed and on PATH
- [ ] `cobc --version` prints its normal output
- [ ] No firewall blocks git access to `gittea.dev`
- [ ] Drive `D:\` has at least 500 MB free

### If Compiling and Running with GnuCOBOL

- [ ] The `cobc` command is available (`which cobc` or `where cobc`)
- [ ] The subprogram DLL path is listed in the `COB_LIBRARY_PATH` environment variable
- [ ] EXEC SQL requires SQLite3 support (included in the GC32-BDB-SP1 build)

### Common Problems

| Problem | Cause | Fix |
|------|------|------|
| `ModuleNotFoundError: No module named 'lark'` | Lark is missing | `pip install lark` |
| `cobc: command not found` | GnuCOBOL is not on PATH | Add its `bin\` directory to PATH |
| `Errno 13 Permission denied` | File permissions | Run as administrator or change the file permissions |
| `gbk codec can't decode byte` | Encoding problem | Set `PYTHONIOENCODING=utf-8`; if the error is raised while reading a file, `PYTHONUTF8=1` also makes UTF-8 the default file encoding |
| `name 'pp_str' is not defined` | Bug in the report script | Fixed; run `git pull` to get the latest code |
| `EXEC SQL ... not supported` | DB2/SQLite is required | Use the GC32-BDB-SP1 build of GnuCOBOL |


---

## 8. Benchmark Test Programs

The system includes two sets of benchmark programs:

### Telecom Billing System (37 programs)

```
Path: D:\cobol-java\cobol-test-programs/
COPYBOOK: common/copybooks/
Types: Matching / KeyBreak / Division / CSV / Sort, etc.
```

### Time and Attendance Management System (6 programs)

```
Path: D:\cobol-java\cobol-tna-system/
COPYBOOK: cpy/
Subprograms: sub/*.cbl → bin/*.dll
Type: time and attendance management for a Japanese company (working-hours statistics)
EXEC SQL: ZAN06UPD requires SQLite3 support
```


---

## 9. Quick Start Scripts

### Windows (batch)

```batch
@echo off
cd /d D:\cobol-java\cobol-java-v3
echo === COBOL Test Data Generator ===
echo [1/3] Checking dependencies...
python -c "import lark" 2>nul || pip install lark
echo [2/3] Running regression test...
python test-data\s15_coverage_verification.py
if %errorlevel% neq 0 echo FAILED && exit /b 1
echo [3/3] Running full coverage report...
set PYTHONIOENCODING=utf-8
python test-data\s25_per_program_report.py
echo === DONE ===
```

### Linux/macOS

```bash
#!/bin/bash
# Stop early if the project directory is missing
cd /path/to/cobol-java-v3 || exit 1
echo "=== COBOL Test Data Generator ==="
echo "[1/3] Checking dependencies..."
python3 -c "import lark" 2>/dev/null || pip3 install lark
echo "[2/3] Running regression test..."
# Abort if the quick verification fails
if ! python3 test-data/s15_coverage_verification.py; then echo "FAILED"; exit 1; fi
echo "[3/3] Running full coverage report..."
PYTHONIOENCODING=utf-8 python3 test-data/s25_per_program_report.py
echo "=== DONE ==="
```


---

## 10. Version Information

| Version | Date | Notes |
|:----:|:----:|------|
| v3.0 | 2026-06-25 | Current version. 43/43 programs at 100% branch coverage |
| v2.0 | 2026-06-20 | New PROCEDURE DIVISION parser + linear path enumeration |
| v1.0 | 2026-06-14 | Initial version, BrParser regex parser |

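
---

## Appendix: How the Coverage Data Is Verified

The system uses three layers of verification to make sure the coverage data is genuine:

1. **S15 tests**: 8 hand-built COBOL fragments; for each decision point, the hand-counted number of branches is compared one by one with the number the system detects
2. **Every constraint is matched exactly through `_match_constraint`**: field names on both the constraint side and the parser side have their subscripts stripped before comparison
3. **All unconditional fallbacks have been removed**: there is no "mark everything as covered if any path reaches it" logic

```python
# The real coverage logic of _mark_if in coverage.py (no fallback).
# Simplified excerpt: c, simple, and inv_simple are not defined in this excerpt.
def _mark_if(dp, cons):
    # Mark a branch as covered only when the constraint-side field name
    # equals the parser-side field name (with defensive subscript stripping)
    if _match_constraint(c, simple):
        dp.active_branches.add('T' if c[3] else 'F')
    elif _match_constraint(c, inv_simple):
        dp.active_branches.add('F')
    # There is no else branch and no unconditional add
```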
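
The subscript stripping mentioned in point 2 can be illustrated with a small sketch. This is not the project's `_match_constraint`: `strip_subscript` and `same_field` are hypothetical helpers, the sketch assumes a subscript is a trailing parenthesized group such as `(1)` or `(IDX)`, and the case-insensitive comparison is an extra assumption based on COBOL names not being case-sensitive.

```python
import re

# Matches a trailing parenthesized group, with optional surrounding whitespace.
_SUBSCRIPT_RE = re.compile(r"\s*\(.*\)\s*$")


def strip_subscript(name: str) -> str:
    """Drop a trailing parenthesized subscript: 'WS-AMT(IDX)' -> 'WS-AMT'."""
    return _SUBSCRIPT_RE.sub("", name.strip())


def same_field(constraint_name: str, parsed_name: str) -> bool:
    """Compare two field names after removing subscripts from both sides."""
    return strip_subscript(constraint_name).upper() == strip_subscript(parsed_name).upper()


print(strip_subscript("WS-AMT(IDX)"))         # WS-AMT
print(same_field("WS-AMT(1)", "ws-amt (2)"))  # True
print(same_field("WS-AMT(1)", "WS-AMT2(1)"))  # False
```

Stripping on both sides means a constraint on `WS-AMT(1)` and a parsed condition on `WS-AMT(IDX)` refer to the same field, while distinct names such as `WS-AMT2` still do not match.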