cobol-java-v3

hangshuo652/cobol-java-v3

Fork 0

Commit Graph

Author	SHA1	Message	Date
NB-076	65e9919933	feat: matching program full recognition — L1 regex keyword + confidence consensus Three-part fix for matching program classification: 1. L1 regex keyword WS-[-\w]*KEY (confidence 0.65): - Captures WS-KEY, WS-MAST-KEY, WS-TRAN-KEY, WS-PREV-KEY etc. - Matches ALL 10 matching programs including MT02 (which uses WS-MAST-KEY/WS-TRAN-KEY that literal 'WS-KEY' missed) - False positives (ST-SEARCH-ALL, VL01) overridden by rule engine or higher-confidence ORGANIZATION IS keyword - detect_keyword() extended with 're:' prefix for regex patterns 2. Consensus bonus in compute_confidence_v2: - When L1 keyword category matches rule engine's final category, context_factor boosted by +0.15 - Pushes matching programs from manual (0.50-0.69) toward review (0.70-0.89) range 3. Confidence calibration for confusion groups (previous commit): - dedup_vs_nodedup: 0.85→0.50 for negative detection - validation_vs_keybreak: 0.80→0.55 for has_counter - simple_vs_two_stage: 0.80→0.50 for sequential OPEN Results - matching programs: MT01: 0.38→0.75, MT02: 0.30→0.60, MT03: 0.30→0.60, MT16: 0.45→0.81, MT17: 0.36→0.65, MT18: 0.60→0.60, MT19: 0.30→0.60, MT20: 0.30→0.65, MT33: 0.30→0.60 All now rule_engine (not fallback), no false negatives. Subtype discrimination remains for future work: all matching programs classified as マッチング without 1:1/1:N/N:1 subtype.	2026-06-21 13:25:39 +08:00
hangshuo652	bc1d56d1a4	feat: Phase 2 complete — 13 Phases of COBOL type classification and test benchmark P0.6: gcov infrastructure P1: extract_structure output expansion (11 new feature fields) P2: Confusion group rule engine (8 pairs + contradiction + backtrack) P3: 4-factor confidence calculation + quality gate update P4: 33+2 COBOL program type test samples (22 files, 7 categories) P5: parametrized/ test data generation engine P6: japanese_data.py lookup tables P7-10: Type-specific test suites (~159 parametrized tests) P11: Full classification pipeline (classify_program) + orchestrator integration P12: Documentation (module-interfaces, test-plan v3.0, coverage-matrix) Architecture decisions: - classification_pipeline/ merged to hina/pipeline/ - parametrized/ as independent module - japanese_data.py as root-level file - hina/__all__ only exports classify_program() Co-Authored-By: Claude <noreply@anthropic.com>	2026-06-19 23:51:55 +08:00

Author

SHA1

Message

Date

NB-076

65e9919933

feat: matching program full recognition — L1 regex keyword + confidence consensus

Three-part fix for matching program classification:
1. L1 regex keyword WS-[-\w]*KEY (confidence 0.65):
   - Captures WS-KEY, WS-MAST-KEY, WS-TRAN-KEY, WS-PREV-KEY etc.
   - Matches ALL 10 matching programs including MT02 (which uses
     WS-MAST-KEY/WS-TRAN-KEY that literal 'WS-KEY' missed)
   - False positives (ST-SEARCH-ALL, VL01) overridden by rule engine
     or higher-confidence ORGANIZATION IS keyword
   - detect_keyword() extended with 're:' prefix for regex patterns

2. Consensus bonus in compute_confidence_v2:
   - When L1 keyword category matches rule engine's final category,
     context_factor boosted by +0.15
   - Pushes matching programs from manual (0.50-0.69) toward
     review (0.70-0.89) range

3. Confidence calibration for confusion groups (previous commit):
   - dedup_vs_nodedup: 0.85→0.50 for negative detection
   - validation_vs_keybreak: 0.80→0.55 for has_counter
   - simple_vs_two_stage: 0.80→0.50 for sequential OPEN

Results - matching programs:
  MT01: 0.38→0.75, MT02: 0.30→0.60, MT03: 0.30→0.60,
  MT16: 0.45→0.81, MT17: 0.36→0.65, MT18: 0.60→0.60,
  MT19: 0.30→0.60, MT20: 0.30→0.65, MT33: 0.30→0.60
  All now rule_engine (not fallback), no false negatives.

Subtype discrimination remains for future work: all matching
programs classified as マッチング without 1:1/1:N/N:1 subtype.

2026-06-21 13:25:39 +08:00

hangshuo652

bc1d56d1a4

feat: Phase 2 complete — 13 Phases of COBOL type classification and test benchmark

P0.6: gcov infrastructure
P1: extract_structure output expansion (11 new feature fields)
P2: Confusion group rule engine (8 pairs + contradiction + backtrack)
P3: 4-factor confidence calculation + quality gate update
P4: 33+2 COBOL program type test samples (22 files, 7 categories)
P5: parametrized/ test data generation engine
P6: japanese_data.py lookup tables
P7-10: Type-specific test suites (~159 parametrized tests)
P11: Full classification pipeline (classify_program) + orchestrator integration
P12: Documentation (module-interfaces, test-plan v3.0, coverage-matrix)

Architecture decisions:
- classification_pipeline/ merged to hina/pipeline/
- parametrized/ as independent module
- japanese_data.py as root-level file
- hina/__all__ only exports classify_program()

Co-Authored-By: Claude <noreply@anthropic.com>

2026-06-19 23:51:55 +08:00

2 Commits