The #16 concept-merge fix improved overall graph score (0.293→0.331) and 4/5 categories, but multi-hop (cat4) regressed 0.132 → 0.104 on sample 0 (199 QA).
Hypothesis: the pre-fix mega-hubs (3 L1 nodes absorbing everything) accidentally bridged some 2-hop reasoning chains — any two entities were ~2 hops apart through a hub. With 688 properly-separated concepts, those incidental bridges are gone, so some multi-hop answers are now less reachable during context assembly.
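To make the hypothesis concrete, here is a minimal sketch on toy data (not taken from the benchmark graph). It assumes an undirected adjacency-set representation; the node names and the `hop_distance` / `undirected` helpers are hypothetical and are not part of this repo. It shows how a single hub puts every pair of entities 2 hops apart, while separated concepts only connect entities that genuinely share a concept.

```python
from collections import deque


def undirected(edges):
    """Build an undirected adjacency map {node: set(neighbours)} from edge pairs."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def hop_distance(adj, src, dst):
    """Breadth-first search: number of edges on a shortest path, or None if unreachable."""
    if src == dst:
        return 0
    seen = {src}
    queue = deque([(src, 0)])
    while queue:
        node, dist = queue.popleft()
        for nxt in adj.get(node, ()):
            if nxt == dst:
                return dist + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


# Toy graphs (hypothetical): one mega-hub vs. properly separated concepts.
with_hub = undirected([("alice", "hub"), ("bob", "hub"), ("carol", "hub")])
separated = undirected(
    [("alice", "concept_a"), ("carol", "concept_a"), ("bob", "concept_b")]
)

print(hop_distance(with_hub, "alice", "bob"))     # 2: bridged only by the hub
print(hop_distance(separated, "alice", "bob"))    # None: the incidental bridge is gone
print(hop_distance(separated, "alice", "carol"))  # 2: a real shared concept
```

Running the same kind of distance check between the entities named in a failing cat4 question, on the pre-fix and post-fix graphs, would be one way to test whether lost hub bridges explain the regression.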
To investigate:
Inspect the cat4 questions that flipped graph 1→0 between results/baseline_s0_full.json and results/postfix_s0_graph.json.
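The comparison above could be sketched as follows. The schema of the result files is not described in this issue, so this assumes each file is a JSON list of per-question records with hypothetical keys `question`, `category`, and `graph` (a 0/1 score); adjust the key names to whatever the files actually contain.

```python
import json


def load_records(path):
    """Load a results file; assumes a JSON list of per-question dicts (assumption)."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def flipped_questions(baseline, postfix, category=4):
    """Return questions in `category` whose graph score went 1 -> 0.

    The keys "question", "category" and "graph" are assumed, not confirmed.
    """
    post_by_question = {rec["question"]: rec for rec in postfix}
    flipped = []
    for rec in baseline:
        if rec.get("category") != category:
            continue
        after = post_by_question.get(rec["question"])
        if after is None:
            continue  # question missing from the post-fix run; skip it
        if rec.get("graph") == 1 and after.get("graph") == 0:
            flipped.append(rec["question"])
    return flipped


# In-memory toy records (hypothetical) standing in for the two result files.
baseline_demo = [
    {"question": "q1", "category": 4, "graph": 1},
    {"question": "q2", "category": 4, "graph": 1},
    {"question": "q3", "category": 2, "graph": 1},
]
postfix_demo = [
    {"question": "q1", "category": 4, "graph": 0},
    {"question": "q2", "category": 4, "graph": 1},
    {"question": "q3", "category": 2, "graph": 0},
]

print(flipped_questions(baseline_demo, postfix_demo))  # ['q1']
```

On the real data, the two lists would come from `load_records("results/baseline_s0_full.json")` and `load_records("results/postfix_s0_graph.json")`, provided the files match the assumed layout.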
The likely place to address this is get_context (and/or the bi-temporal fact edges from Graph model v2, #18, which give real relational paths), not reintroducing hubs.
Low priority compared with the inference/adversarial gains; tracking so it isn't lost. Refs #16, #18.