cobol-java-v3/data/test_case.py
hangshuo652 bc1d56d1a4 feat: Phase 2 complete — 13 Phases of COBOL type classification and test benchmark
P0.6: gcov infrastructure
P1: extract_structure output expansion (11 new feature fields)
P2: Confusion group rule engine (8 pairs + contradiction + backtrack)
P3: 4-factor confidence calculation + quality gate update
P4: 33+2 COBOL program type test samples (22 files, 7 categories)
P5: parametrized/ test data generation engine
P6: japanese_data.py lookup tables
P7-10: Type-specific test suites (~159 parametrized tests)
P11: Full classification pipeline (classify_program) + orchestrator integration
P12: Documentation (module-interfaces, test-plan v3.0, coverage-matrix)

Architecture decisions:
- classification_pipeline/ merged to hina/pipeline/
- parametrized/ as independent module
- japanese_data.py as root-level file
- hina/__all__ only exports classify_program()

Co-Authored-By: Claude <noreply@anthropic.com>
2026-06-19 23:51:55 +08:00

67 lines
2.1 KiB
Python
"""Test data models: test case, test suite, and Spark configuration.

Example:
    tc = TestCase(id="TC-001", fields={"TX-AMOUNT": 1500000})
    suite = TestSuite(test_cases=[tc], spark_config=SparkConfig(num_records=1000))
"""
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SparkConfig:
    """Configuration for Spark test data generation.

    ────────── Fields ──────────
    num_records: number of records to generate
    replication: replication strategy, "key_varied" or "exact_copy"
    key_field:   name of the key field (used by key_varied)
    edge_cases:  edge cases to include: ["null", "max", "min", "empty"]
    """
    num_records: int = 100
    replication: str = "key_varied"
    key_field: str = ""
    edge_cases: list[str] = field(default_factory=list)


@dataclass
class TestCase:
    """A single test case: one combination of field values to verify.

    ────────── Fields ──────────
    id:               case ID (e.g. "TC-001")
    fields:           {field name: value}
    coverage_targets: list of decision point IDs covered by this case
    """
    id: str
    fields: dict = field(default_factory=dict)
    coverage_targets: list[str] = field(default_factory=list)


@dataclass
class TestSuite:
    """Test suite: multiple test cases plus an optional Spark configuration.

    ────────── Fields ──────────
    schema:       optional field schema
    test_cases:   list of test cases
    spark_config: None means non-Spark mode
    """
    schema: Optional[dict] = None
    test_cases: list[TestCase] = field(default_factory=list)
    spark_config: Optional[SparkConfig] = None

    @property
    def has_spark(self) -> bool:
        # True when a Spark configuration is attached to the suite.
        return self.spark_config is not None


# Module-level smoke checks for the models above.
_tc = TestCase(id="TC-001", fields={"BR-AMT": 1500000})
assert _tc.id == "TC-001"
assert _tc.fields["BR-AMT"] == 1500000
_ts = TestSuite(test_cases=[_tc], spark_config=SparkConfig(num_records=1000))
assert _ts.spark_config.num_records == 1000
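
The file documents the `replication` strategies only by name, so the following is a minimal sketch of how a consumer might expand one `TestCase` into `num_records` rows. The helper `expand_records`, the `ACCT-NO` field, and the suffixing scheme for `key_varied` are all assumptions made for illustration; none of them appear in the source. The two dataclasses are repeated in reduced form only so the block runs on its own.

```python
from __future__ import annotations

import copy
from dataclasses import dataclass, field


# Reduced stand-ins for the dataclasses in test_case.py (same field names).
@dataclass
class SparkConfig:
    num_records: int = 100
    replication: str = "key_varied"
    key_field: str = ""
    edge_cases: list[str] = field(default_factory=list)


@dataclass
class TestCase:
    id: str
    fields: dict = field(default_factory=dict)
    coverage_targets: list[str] = field(default_factory=list)


def expand_records(tc: TestCase, cfg: SparkConfig) -> list[dict]:
    """Hypothetical helper: replicate one test case into cfg.num_records rows.

    Assumed semantics (not confirmed by the source):
      - "exact_copy": every row is an identical copy of tc.fields
      - "key_varied": every row is a copy whose key_field gets a unique suffix
    """
    if cfg.replication not in ("key_varied", "exact_copy"):
        raise ValueError(f"unknown replication strategy: {cfg.replication!r}")
    if cfg.replication == "key_varied" and cfg.key_field not in tc.fields:
        raise ValueError("key_varied needs key_field to name an existing field")

    rows = []
    for i in range(cfg.num_records):
        row = copy.deepcopy(tc.fields)  # never mutate the original case
        if cfg.replication == "key_varied":
            # Assumed scheme: append a zero-padded counter to the key value.
            row[cfg.key_field] = f"{tc.fields[cfg.key_field]}-{i:06d}"
        rows.append(row)
    return rows


# "ACCT-NO" is an illustrative field name, not one taken from the project.
tc = TestCase(id="TC-001", fields={"ACCT-NO": "A100", "TX-AMOUNT": 1500000})
varied = expand_records(tc, SparkConfig(num_records=3, key_field="ACCT-NO"))
copies = expand_records(tc, SparkConfig(num_records=2, replication="exact_copy"))
print([r["ACCT-NO"] for r in varied])
```

Under these assumptions, `key_varied` keeps every non-key field constant while making each row distinct, whereas `exact_copy` yields identical rows. With the default `key_field=""` the sketch raises `ValueError` instead of silently producing duplicate keys.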