NB-076
|
dbee3b7251
|
fix: Lark grammar + parse_file_section SD/ASCENDING KEY support
Bug fixes found through statement benchmark testing:
1. grammar.lark: Add ASCENDING/DESCENDING KEY IS + INDEXED BY to
occurs_clause — fixes HINA024 (SEARCH ALL) parsing crash
2. grammar.lark: Add SD (Sort Description) entry type to file_section
— fixes HINA034 (SORT), ST01, ST02 parsing crashes
3. read.py parse_file_section(): Handle SD blocks alongside FD blocks
— enables SORT/MERGE file structure extraction
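Illustration for fix 3: a minimal sketch of treating SD entries the same way as FD entries when scanning a FILE SECTION. This is not the actual read.py code. The regex, the helper name list_file_entries, and the sample COBOL fragment are all assumptions made for illustration.

```python
import re

# Hypothetical pattern: an FD or SD level indicator followed by a file name.
# The real parse_file_section() may work quite differently.
ENTRY_RE = re.compile(
    r"^[ \t]*(FD|SD)[ \t]+([A-Z0-9][A-Z0-9-]*)",
    re.IGNORECASE | re.MULTILINE,
)


def list_file_entries(file_section: str) -> list[tuple[str, str]]:
    """Return (entry_type, file_name) pairs for every FD/SD entry found."""
    return [
        (m.group(1).upper(), m.group(2).upper())
        for m in ENTRY_RE.finditer(file_section)
    ]


# Made-up FILE SECTION with one sort work file (SD) between two FDs.
SAMPLE = """\
       FILE SECTION.
       FD  IN-FILE.
       01  IN-REC           PIC X(80).
       SD  SORT-WORK.
       01  SORT-REC.
           05  SORT-KEY     PIC 9(5).
           05  FILLER       PIC X(75).
       FD  OUT-FILE.
       01  OUT-REC          PIC X(80).
"""

entries = list_file_entries(SAMPLE)
```

A scanner that only matched FD would skip SORT-WORK here, so the sort work file would be missing from the extracted structure.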
4 previously crashing files now parse successfully:
- HINA024.cbl (SEARCH ALL): paras=3, files=0
- HINA034.cbl (SORT): paras=1, files=3
- ST01_SORT.cbl: paras=2, files=3
- ST02_MERGE.cbl: paras=1, files=4
Regression: 749 passed (unchanged). classify_program caught these
crashes internally, so the tests already passed before this fix. The real
improvement is in data quality: structure extraction now works for these
programs.
|
2026-06-21 12:21:36 +08:00 |
|
hangshuo652
|
bc1d56d1a4
|
feat: Phase 2 complete — 13 Phases of COBOL type classification and test benchmark
P0.6: gcov infrastructure
P1: extract_structure output expansion (11 new feature fields)
P2: Confusion group rule engine (8 pairs + contradiction + backtrack)
P3: 4-factor confidence calculation + quality gate update
P4: 33+2 COBOL program type test samples (22 files, 7 categories)
P5: parametrized/ test data generation engine
P6: japanese_data.py lookup tables
P7-10: Type-specific test suites (~159 parametrized tests)
P11: Full classification pipeline (classify_program) + orchestrator integration
P12: Documentation (module-interfaces, test-plan v3.0, coverage-matrix)
Architecture decisions:
- classification_pipeline/ merged to hina/pipeline/
- parametrized/ as independent module
- japanese_data.py as root-level file
- hina/__all__ only exports classify_program()
Co-Authored-By: Claude <noreply@anthropic.com>
|
2026-06-19 23:51:55 +08:00 |
|