feat: Phase 2 complete — 13 Phases of COBOL type classification and test benchmark

P0.6: gcov infrastructure
P1: extract_structure output expansion (11 new feature fields)
P2: Confusion group rule engine (8 pairs + contradiction + backtrack)
P3: 4-factor confidence calculation + quality gate update
P4: 33+2 COBOL program type test samples (22 files, 7 categories)
P5: parametrized/ test data generation engine
P6: japanese_data.py lookup tables
P7-10: Type-specific test suites (~159 parametrized tests)
P11: Full classification pipeline (classify_program) + orchestrator integration
P12: Documentation (module-interfaces, test-plan v3.0, coverage-matrix)

Architecture decisions:
- classification_pipeline/ merged to hina/pipeline/
- parametrized/ as independent module
- japanese_data.py as root-level file
- hina/__all__ only exports classify_program()
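
The last decision above can be illustrated with a minimal sketch. It assumes nothing about the real package beyond the names in this message: `hina_sketch` is a hypothetical in-memory stand-in for `hina`, and the signatures and return values of `classify_program` and `extract_structure` are placeholders, not the actual implementation.

```python
import sys
import types

# Hypothetical stand-in for the hina package. The real module contents are
# not part of this message, so every function body below is a placeholder.
PACKAGE_SOURCE = '''
__all__ = ["classify_program"]

def extract_structure(source):
    # Assumed internal helper (named in P1); its location and output are guesses.
    return {"length": len(source)}

def classify_program(source):
    # Assumed signature and return shape, for illustration only.
    return {"type": "UNKNOWN", "features": extract_structure(source)}
'''

demo = types.ModuleType("hina_sketch")
exec(PACKAGE_SOURCE, demo.__dict__)
sys.modules["hina_sketch"] = demo

# A star import copies only the names listed in __all__.
namespace = {}
exec("from hina_sketch import *", namespace)
exported = sorted(name for name in namespace if not name.startswith("__"))
print(exported)
```

Names left out of `__all__` remain reachable through explicit attribute access (here `demo.extract_structure`); `__all__` only limits what a star import exposes.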

Co-Authored-By: Claude <noreply@anthropic.com>
hangshuo652
2026-06-19 23:51:55 +08:00
parent 63b5284715
commit bc1d56d1a4
129 changed files with 19378 additions and 261 deletions
@@ -0,0 +1,24 @@
* ==== TYPE: CV01 CSV(NO NEWLINE) ====
* FEATURE: STRING concatenation without newline
* BRANCHES: 2, DECISIONS: 1
IDENTIFICATION DIVISION.
PROGRAM-ID. CV01.
DATA DIVISION.
WORKING-STORAGE SECTION.
01 WS-FIELD1 PIC X(10) VALUE 'ALPHA'.
01 WS-FIELD2 PIC X(10) VALUE 'BETA'.
01 WS-FIELD3 PIC X(10) VALUE 'GAMMA'.
01 WS-CSV-LINE PIC X(100) VALUE SPACES.
01 WS-PTR PIC 9(3) VALUE 1.
PROCEDURE DIVISION.
MAIN-PROCEDURE.
MOVE 1 TO WS-PTR.
STRING WS-FIELD1 DELIMITED BY SPACES
',' DELIMITED BY SIZE
WS-FIELD2 DELIMITED BY SPACES
',' DELIMITED BY SIZE
WS-FIELD3 DELIMITED BY SPACES
INTO WS-CSV-LINE
WITH POINTER WS-PTR.
DISPLAY 'CSV: ' WS-CSV-LINE.
STOP RUN.
@@ -0,0 +1,25 @@
* ==== TYPE: CV02 CSV(WITH NEWLINE) ====
* FEATURE: INSPECT REPLACING for newline handling
* BRANCHES: 2, DECISIONS: 1
IDENTIFICATION DIVISION.
PROGRAM-ID. CV02.
DATA DIVISION.
WORKING-STORAGE SECTION.
01 WS-LINE PIC X(100) VALUE
'FIELD1,FIELD2,FIELD3'.
01 WS-PTR PIC 9(3) VALUE 1.
01 WS-COM-COUNT PIC 9(3) VALUE 0.
PROCEDURE DIVISION.
MAIN-PROCEDURE.
INSPECT WS-LINE TALLYING WS-COM-COUNT
FOR ALL ','.
DISPLAY 'COMMA COUNT: ' WS-COM-COUNT.
INSPECT WS-LINE REPLACING ALL ',' BY '|'.
DISPLAY 'PIPE LINE: ' WS-LINE.
INSPECT WS-LINE TALLYING WS-PTR
FOR CHARACTERS BEFORE INITIAL SPACE.
STRING ';' DELIMITED BY SIZE
INTO WS-LINE
WITH POINTER WS-PTR.
DISPLAY 'CSV+TERM: ' WS-LINE.
STOP RUN.
@@ -0,0 +1,27 @@
* ==== TYPE: CV03 ASCII-EBCDIC ====
* FEATURE: ASCII to EBCDIC conversion simulation
* BRANCHES: 2, DECISIONS: 1
IDENTIFICATION DIVISION.
PROGRAM-ID. CV03.
DATA DIVISION.
WORKING-STORAGE SECTION.
01 WS-ASCII-DATA PIC X(10) VALUE 'ABCDEF0123'.
01 WS-EBCDIC-DATA PIC X(10).
01 WS-I PIC 9(2) VALUE 1.
01 WS-CHAR PIC X(1).
PROCEDURE DIVISION.
MAIN-PROCEDURE.
MOVE SPACES TO WS-EBCDIC-DATA.
PERFORM VARYING WS-I FROM 1 BY 1 UNTIL WS-I > 10
MOVE WS-ASCII-DATA(WS-I:1) TO WS-CHAR
IF WS-CHAR >= 'A' AND <= 'Z'
DISPLAY 'ALPHA AT ' WS-I
ELSE IF WS-CHAR >= '0' AND <= '9'
DISPLAY 'DIGIT AT ' WS-I
ELSE
DISPLAY 'OTHER AT ' WS-I
END-IF END-IF
END-PERFORM.
MOVE WS-ASCII-DATA TO WS-EBCDIC-DATA.
DISPLAY 'EBCDIC: ' WS-EBCDIC-DATA.
STOP RUN.