feat: matching program full recognition — L1 regex keyword + confidence consensus

Three-part fix for matching program classification:

1. L1 regex keyword WS-[-\w]*KEY (confidence 0.65):
   - Captures WS-KEY, WS-MAST-KEY, WS-TRAN-KEY, WS-PREV-KEY etc.
   - Matches ALL 10 matching programs including MT02 (which uses WS-MAST-KEY/WS-TRAN-KEY that literal 'WS-KEY' missed)
   - False positives (ST-SEARCH-ALL, VL01) overridden by rule engine or higher-confidence ORGANIZATION IS keyword
   - detect_keyword() extended with 're:' prefix for regex patterns

2. Consensus bonus in compute_confidence_v2:
   - When L1 keyword category matches rule engine's final category, context_factor boosted by +0.15
   - Pushes matching programs from manual (0.50-0.69) toward review (0.70-0.89) range

3. Confidence calibration for confusion groups (previous commit):
   - dedup_vs_nodedup: 0.85→0.50 for negative detection
   - validation_vs_keybreak: 0.80→0.55 for has_counter
   - simple_vs_two_stage: 0.80→0.50 for sequential OPEN

Results - matching programs:
   MT01: 0.38→0.75, MT02: 0.30→0.60, MT03: 0.30→0.60,
   MT16: 0.45→0.81, MT17: 0.36→0.65, MT18: 0.60→0.60,
   MT19: 0.30→0.60, MT20: 0.30→0.65, MT33: 0.30→0.60

All now rule_engine (not fallback), no false negatives.

Subtype discrimination remains for future work: all matching programs classified as マッチング without 1:1/1:N/N:1 subtype.
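
The difference between the regex keyword and the literal 'WS-KEY' can be checked in isolation. The following is a minimal sketch, not the project's code: `matches_keyword` is a hypothetical stand-in for `detect_keyword()`, the way it interprets the 're:' prefix is an assumption, and the COBOL-style lines are made up for illustration.

```python
import re

# Prefix named in the commit message; how it is handled here is an assumption.
REGEX_PREFIX = "re:"


def matches_keyword(pattern: str, source: str) -> bool:
    """Hypothetical helper: 're:'-prefixed patterns are regexes, others are literals."""
    if pattern.startswith(REGEX_PREFIX):
        return re.search(pattern[len(REGEX_PREFIX):], source) is not None
    return pattern in source


# Illustrative data-division lines (invented, not taken from MT02 or any real program).
lines = [
    "05 WS-KEY PIC X(10).",
    "05 WS-MAST-KEY PIC X(10).",
    "05 WS-TRAN-KEY PIC X(10).",
    "05 WS-COUNT PIC 9(4).",
]

# The regex keyword matches every WS-...KEY variant; the literal only matches WS-KEY.
regex_hits = [matches_keyword(r"re:WS-[-\w]*KEY", line) for line in lines]
literal_hits = [matches_keyword("WS-KEY", line) for line in lines]
```

Under these assumptions the regex form matches the first three lines and the literal form matches only the first, which is the gap the commit describes for programs that declare only WS-MAST-KEY/WS-TRAN-KEY.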
2026-06-21 13:25:39 +08:00
parent 958b12e9a9
commit 65e9919933
3 changed files with 32 additions and 5 deletions
@@ -20,6 +20,7 @@ def compute_confidence_v2(
    structure_features: dict[str, Any],
    contradictions: list[dict[str, Any]] | None = None,
    resolution: dict[str, Any] | None = None,
+    consensus_category: str | None = None,
 ) -> dict[str, Any]:
    """Four-factor confidence computation.

@@ -31,6 +32,8 @@ def compute_confidence_v2(
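        contradictions: list of contradictions, each containing {"type": str, "resolved": bool, ...}
        resolution: contradiction resolution summary,
            e.g. {"resolved_count": 0, "total_count": 0}
+        consensus_category: when not None and equal to the category in keyword_result,
+            the L1 keyword and the rule engine agree on the final category, so a consensus bonus is applied.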

    Returns:
        dict: {
@@ -46,7 +49,7 @@ def compute_confidence_v2(
    # ── 1. Base confidence ──
    base = keyword_result.get("base_confidence", 0.7)

-    # ── 2. Context factor (keyword match count) ──
+    # ── 2. Context factor (keyword match count + consensus bonus) ──
    match_count = keyword_result.get("match_count", 0)
    if match_count >= 3:
        context_factor = 1.0
@@ -57,6 +60,11 @@ def compute_confidence_v2(
    else:
        context_factor = 0.50

+    # Consensus bonus when the L1 keyword category agrees with the rule engine's category
+    kw_category = keyword_result.get("category", "")
+    if consensus_category and kw_category and kw_category == consensus_category:
+        context_factor = min(context_factor + 0.15, 1.0)
+
    # ── 3. Consistency factor (contradiction detection) ──
    contradictions = contradictions or []
    unresolved_count = sum(1 for c in contradictions if not c.get("resolved", False))