feat: recognize all matching programs via L1 regex keyword + confidence consensus

Three-part fix for matching program classification:

1. L1 regex keyword WS-[-\w]*KEY (confidence 0.65):
   - Captures WS-KEY, WS-MAST-KEY, WS-TRAN-KEY, WS-PREV-KEY, etc.
   - Matches ALL 10 matching programs, including MT02 (which uses
     WS-MAST-KEY/WS-TRAN-KEY, which the literal 'WS-KEY' missed)
   - False positives (ST-SEARCH-ALL, VL01) are overridden by the rule
     engine or by the higher-confidence ORGANIZATION IS keyword
   - detect_keyword() extended with a 're:' prefix for regex patterns

2. Consensus bonus in compute_confidence_v2:
   - When the L1 keyword category matches the rule engine's final
     category, context_factor is boosted by +0.15
   - Pushes matching programs from the manual range (0.50-0.69)
     toward the review range (0.70-0.89)

3. Confidence calibration for confusion groups (previous commit):
   - dedup_vs_nodedup: 0.85→0.50 for negative detection
   - validation_vs_keybreak: 0.80→0.55 for has_counter
   - simple_vs_two_stage: 0.80→0.50 for sequential OPEN

Results for matching programs (confidence before→after):
  MT01: 0.38→0.75, MT02: 0.30→0.60, MT03: 0.30→0.60,
  MT16: 0.45→0.81, MT17: 0.36→0.65, MT18: 0.60→0.60,
  MT19: 0.30→0.60, MT20: 0.30→0.65, MT33: 0.30→0.60

All are now classified by rule_engine (not fallback), with no false negatives.

Subtype discrimination remains for future work: all matching
programs are classified as マッチング (matching) without a 1:1/1:N/N:1 subtype.
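
To see where the listed confidences fall relative to the manual (0.50-0.69) and review (0.70-0.89) ranges, here is a small worked check. The band() helper is hypothetical; treating 0.70 and 0.90 as exclusive upper bounds is an assumption, and "unspecified" is a placeholder because the commit message does not name the ranges outside those two.

```python
from collections import Counter

# Before/after confidences copied from the results listed in the commit message.
RESULTS = {
    "MT01": (0.38, 0.75), "MT02": (0.30, 0.60), "MT03": (0.30, 0.60),
    "MT16": (0.45, 0.81), "MT17": (0.36, 0.65), "MT18": (0.60, 0.60),
    "MT19": (0.30, 0.60), "MT20": (0.30, 0.65), "MT33": (0.30, 0.60),
}


def band(confidence: float) -> str:
    """Map a confidence to the range it falls into (hypothetical helper)."""
    if 0.70 <= confidence < 0.90:
        return "review"
    if 0.50 <= confidence < 0.70:
        return "manual"
    return "unspecified"  # placeholder: these ranges are not named in the commit


before = Counter(band(old) for old, _ in RESULTS.values())
after = Counter(band(new) for _, new in RESULTS.values())
print("before:", dict(before))
print("after: ", dict(after))
```

With these bounds, only MT18 sat in the manual range before the change; afterwards MT01 and MT16 reach the review range and the other seven listed programs sit in the manual range.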
@@ -92,8 +92,9 @@ def _build_keyword_result_for_v2(keyword_info: dict | None) -> dict:
         return {
             "base_confidence": keyword_info["confidence"],
             "match_count": len(keyword_info["all_matches"]),
+            "category": keyword_info.get("category"),
         }
-    return {"base_confidence": 0.0, "match_count": 0}
+    return {"base_confidence": 0.0, "match_count": 0, "category": None}
 
 
 def _build_structure_features(structure: dict) -> dict:
@@ -213,11 +214,16 @@ def _path_rule_engine(
 
     structure_features = _build_structure_features(structure)
 
+    # Consensus check: grant a bonus when the L1 keyword category agrees with the rule engine's final category
+    kw_cat = keyword_info["category"] if keyword_info else None
+    consensus_cat = kw_cat if (kw_cat and kw_cat == final_category) else None
+
     v2_confidence = compute_confidence_v2(
         keyword_result=keyword_result_v2,
         structure_features=structure_features,
         contradictions=contradictions,
         resolution=resolution_map,
+        consensus_category=consensus_cat,
     )
 
     # 6. Assemble the result
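
The consensus step can be illustrated in isolation with the sketch below. The body of compute_confidence_v2 is not part of this diff, so how context_factor feeds into the final confidence is not shown here. The helpers pick_consensus_category and boost_context_factor are hypothetical names; only the agreement test and the +0.15 bonus are taken from this commit. The sketch uses .get() so a missing "category" key is treated as no consensus, which is an assumption rather than the committed behavior.

```python
from __future__ import annotations

CONSENSUS_BONUS = 0.15  # value stated in the commit message


def pick_consensus_category(keyword_info: dict | None, final_category: str | None) -> str | None:
    """Return the shared category when L1 and the rule engine agree, else None."""
    # Assumption: tolerate a missing key instead of raising KeyError.
    kw_cat = keyword_info.get("category") if keyword_info else None
    return kw_cat if (kw_cat and kw_cat == final_category) else None


def boost_context_factor(context_factor: float, consensus_category: str | None) -> float:
    """Add the consensus bonus only when a consensus category is present."""
    if consensus_category:
        return context_factor + CONSENSUS_BONUS
    return context_factor


# Agreement: both layers say マッチング, so the bonus applies.
agree = pick_consensus_category({"category": "マッチング"}, "マッチング")
boosted = boost_context_factor(0.5, agree)

# Disagreement or no keyword hit: context_factor is left unchanged.
disagree = pick_consensus_category({"category": "マッチング"}, "other")
unchanged = boost_context_factor(0.5, disagree)
```

Returning None rather than a boolean mirrors the consensus_category argument in the diff, which carries the agreed category itself.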