NB-076
|
9cefbdf114
|
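R16: expert defect review: found and fixed a nested COPYBOOK resolution bug
Review method: 14 checks run against the real tool, not a static review
1. Non-deterministic output check ✓ values identical across 5 runs
2. Crash test on fringe COBOL features (ALTER/ENTRY) ✓ no crash
3. Large-program performance (500 fields + 250 IFs) ✓ completes in seconds
4. Path-explosion guard (10 IFs in PERFORM UNTIL) ✓ no explosion
5. Nested COPYBOOK resolution → BUG found and fixed
6. Nested IF depth ✓
7. Malformed JCL input (binary/BOM/1000-line continuation) ✓ no crash
8. Keyword strings inside comments falsely triggering matching ✓ no false positives
9. False positives from variable names containing a keyword substring ✓ WS-SORT-KEY does not trigger SORT
10. Non-COBOL input (Chinese/Japanese text, HTML, binary) ✓ no false positives
11. OPEN I-O direction parsing ✓
12. DataWriter JSON format ✓
13. Cross-run isolation ✓
14. Config loading ✓
Fix: resolve_copybooks gains a recursion parameter plus a depth guard
Before: COPY L1 -> a 'COPY L2.' inside L1.cpy was not resolved
After: resolved recursively, capped at 10 levels to guard against cycles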
Co-Authored-By: Claude <noreply@anthropic.com>
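A minimal sketch of the recursive resolution with a depth guard described in this commit. It is not the project's actual resolve_copybooks: the `library` dict (copybook name to text), the simplified COPY pattern (no REPLACING, no IN/OF), and the choice to drop lines that cannot be resolved are all assumptions made for illustration. Only the 10-level cap comes from the message.

```python
import re

# Simplified COPY statement, e.g. "COPY L2." (assumption: no REPLACING / IN / OF).
COPY_RE = re.compile(r"^\s*COPY\s+([A-Za-z0-9_-]+)\s*\.\s*$", re.IGNORECASE)

MAX_COPY_DEPTH = 10  # cap stated in the commit message


def resolve_copybooks(source, library, depth=0):
    """Inline COPY statements recursively.

    `library` is a hypothetical in-memory dict {NAME: text}; a real
    implementation would presumably read .cpy files from disk.
    """
    out = []
    for line in source.splitlines():
        m = COPY_RE.match(line)
        if not m:
            out.append(line)
            continue
        name = m.group(1).upper()
        if depth >= MAX_COPY_DEPTH or name not in library:
            # Depth cap reached (possible cycle) or unknown copybook:
            # drop the COPY line instead of recursing further.
            continue
        out.append(resolve_copybooks(library[name], library, depth + 1))
    return "\n".join(out)


library = {
    "L1": "01 REC-A.\nCOPY L2.",
    "L2": "05 FIELD-B PIC X.",
}
resolved = resolve_copybooks("COPY L1.\n01 OTHER.", library)
```

A self-referencing copybook terminates because each nested call increases `depth`, so expansion stops once the cap is reached rather than recursing without bound.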
|
2026-06-22 10:49:18 +08:00 |
|
NB-076
|
abb283669c
|
R13: final sweep — EXEC stripping + INSPECT bugfix + more EQ assertions
1. Lark: preprocess strips EXEC CICS/SQL...END-EXEC blocks
-> CI01_CICS/DB01_SELECT_UPDATE now parse, 75/75 samples pass
2. propagate_assignments INSPECT TALLYING bugfix:
was reading source from count_var (wrong field) instead of
asgn['tgt']. Now CNT='005' instead of '003' for len(HELLO)=5.
3. 26 new EQ/falsifiable assertions added (propagate chains,
orchestrator state, data_writer, report generator)
4. Hardened: ACCEPT DATE string len check, DataWriter JSON format
16 suites / 0 FAIL.
Co-Authored-By: Claude <noreply@anthropic.com>
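The EXEC stripping in item 1 could look roughly like the sketch below. The function name `strip_exec_blocks` and the regular expression are hypothetical; the commit only states that preprocess removes EXEC CICS/SQL...END-EXEC blocks before Lark sees the source.

```python
import re

# EXEC CICS ... END-EXEC or EXEC SQL ... END-EXEC, possibly spanning
# several lines; a trailing period is removed together with the block.
EXEC_BLOCK_RE = re.compile(
    r"\bEXEC\s+(?:CICS|SQL)\b.*?\bEND-EXEC\b\.?",
    re.IGNORECASE | re.DOTALL,
)


def strip_exec_blocks(source):
    """Remove embedded CICS/SQL blocks so a plain COBOL grammar can parse the rest."""
    return EXEC_BLOCK_RE.sub("", source)


sample = (
    "MOVE 1 TO A.\n"
    "EXEC SQL\n"
    "  SELECT X INTO :Y FROM T\n"
    "END-EXEC.\n"
    "MOVE 2 TO B."
)
cleaned = strip_exec_blocks(sample)
```

The non-greedy `.*?` stops at the first END-EXEC, so two separate blocks in one program are removed independently. Note that this sketch does not preserve line numbers, which may matter if later stages report source positions.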
|
2026-06-22 09:37:58 +08:00 |
|
NB-076
|
4be2aae66d
|
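fix: production-grade COBOL program parsing: COPY + OCCURS TO + FD fixes
Parsing defects in production programs found by adversarial testing, and their fixes:
Defect 1: COPY statements were never preprocessed (an 18-month-old bug)
- resolve_copybooks() was called from the main() CLI but never on the extract_structure() path
- Fix: preprocess() now calls resolve_copybooks() at the top of the function
- Unresolvable COPY lines are removed (so Lark does not hit an unrecognized directive inside an FD block)
Defect 2: the fd rule in the Lark grammar required data_item+ (at least one record)
- In production programs an FD can pull in its record definitions via COPY
- Once the COPY line was removed, the FD had no data_item and Lark crashed
- Fix: fd now uses data_item* (zero or more)
Defect 3: OCCURS 1 TO 100 TIMES (variable-length table)
- The grammar supported only OCCURS INT TIMES, not OCCURS 1 TO 100 TIMES
- Fix: occurs_clause gains an optional 'TO' INT part
Result: 2 of 4 production programs now parse (CRDVAL, GENDATA)
- The remaining 2 (CRDCALC, CRDRPT) are not fixed because of a limitation with fixed-format continuation lines
Full regression: 767 passed (0 new failures)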
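To illustrate the shape of the OCCURS fix, here is a small regex stand-in for the grammar rule. It is not the Lark grammar from the repository: `parse_occurs` and the pattern are hypothetical, and treating TIMES as optional is an assumption. The point is only that the upper bound becomes an optional part after the first integer.

```python
import re

# OCCURS <min> [TO <max>] [TIMES]; a regex stand-in for the occurs_clause rule.
OCCURS_RE = re.compile(
    r"\bOCCURS\s+(\d+)(?:\s+TO\s+(\d+))?(?:\s+TIMES)?\b",
    re.IGNORECASE,
)


def parse_occurs(clause):
    """Return (min, max) for an OCCURS clause, or None if the line has none."""
    m = OCCURS_RE.search(clause)
    if m is None:
        return None
    low = int(m.group(1))
    # Fixed-size tables have no TO part, so min and max coincide.
    high = int(m.group(2)) if m.group(2) else low
    return low, high


fixed = parse_occurs("05 TBL OCCURS 5 TIMES.")
ranged = parse_occurs("05 TBL OCCURS 1 TO 100 TIMES DEPENDING ON N.")
```

Returning the same value for both bounds in the fixed-size case keeps callers from having to branch on which form was used.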
|
2026-06-21 16:13:58 +08:00 |
|
NB-076
|
dbee3b7251
|
fix: Lark grammar + parse_file_section SD/ASCENDING KEY support
Bug fixes found through statement benchmark testing:
1. grammar.lark: Add ASCENDING/DESCENDING KEY IS + INDEXED BY to
occurs_clause — fixes HINA024 (SEARCH ALL) parsing crash
2. grammar.lark: Add SD (Sort Description) entry type to file_section
— fixes HINA034 (SORT), ST01, ST02 parsing crashes
3. read.py parse_file_section(): Handle SD blocks alongside FD blocks
— enables SORT/MERGE file structure extraction
4 previously crashing files now parse successfully:
- HINA024.cbl (SEARCH ALL): paras=3, files=0
- HINA034.cbl (SORT): paras=1, files=3
- ST01_SORT.cbl: paras=2, files=3
- ST02_MERGE.cbl: paras=1, files=4
Regression: 749 passed (unchanged — classify_program internally caught
the crashes, so tests already 'passed'; real improvement is in data
quality: structure extraction now works for these programs)
|
2026-06-21 12:21:36 +08:00 |
|
hangshuo652
|
bc1d56d1a4
|
feat: Phase 2 complete — 13 Phases of COBOL type classification and test benchmark
P0.6: gcov infrastructure
P1: extract_structure output expansion (11 new feature fields)
P2: Confusion group rule engine (8 pairs + contradiction + backtrack)
P3: 4-factor confidence calculation + quality gate update
P4: 33+2 COBOL program type test samples (22 files, 7 categories)
P5: parametrized/ test data generation engine
P6: japanese_data.py lookup tables
P7-10: Type-specific test suites (~159 parametrized tests)
P11: Full classification pipeline (classify_program) + orchestrator integration
P12: Documentation (module-interfaces, test-plan v3.0, coverage-matrix)
Architecture decisions:
- classification_pipeline/ merged to hina/pipeline/
- parametrized/ as independent module
- japanese_data.py as root-level file
- hina/__all__ only exports classify_program()
Co-Authored-By: Claude <noreply@anthropic.com>
|
2026-06-19 23:51:55 +08:00 |
|
hangshuo652
|
e2486db510
|
fix: 3 issues found during real COBOL validation
|
2026-06-18 16:26:44 +08:00 |
|