This paper traces object hallucinations in large vision-language models (LVLMs) to their visual encoders and identifies three underlying issues: statistical bias, inherent bias, and vulnerability. To address them, it introduces SHIELD, a training-free framework with three strategies: re-weighting visual tokens to reduce statistical bias, injecting noise-derived tokens to counteract inherent bias, and applying adversarial attacks with contrastive decoding to mitigate vulnerability. Experiments across multiple benchmarks and LVLM families show that SHIELD reduces object hallucinations while maintaining strong general performance. The code is publicly available.
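The contrastive decoding step mentioned above can be illustrated with the generic contrastive formula used in visual contrastive decoding. This is a minimal sketch, not SHIELD's implementation: the summary does not give the exact formulation, so the weight `alpha`, the function names, the toy vocabulary, and the choice to treat the adversarially perturbed branch as the "negative" distribution are all assumptions made for illustration.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]

def contrastive_scores(clean_logits, perturbed_logits, alpha=1.0):
    """Generic contrastive decoding: (1 + alpha) * clean - alpha * perturbed.

    Tokens whose probability rises under the perturbed image are penalized.
    `alpha` is a hypothetical hyperparameter, not a value from the paper.
    """
    clean = log_softmax(clean_logits)
    perturbed = log_softmax(perturbed_logits)
    return [(1 + alpha) * c - alpha * p for c, p in zip(clean, perturbed)]

def argmax_token(vocab, scores):
    return vocab[max(range(len(vocab)), key=scores.__getitem__)]

# Toy next-token logits (made-up numbers). "frisbee" stands in for an object
# that is not in the image but is slightly preferred by the clean pass.
vocab = ["dog", "frisbee", "grass"]
clean_logits = [1.8, 2.0, 0.5]
perturbed_logits = [0.5, 2.5, 0.4]  # adversarial image amplifies "frisbee"

greedy_choice = argmax_token(vocab, clean_logits)
contrastive_choice = argmax_token(
    vocab, contrastive_scores(clean_logits, perturbed_logits)
)
print(greedy_choice, contrastive_choice)
```

In this toy setting, plain greedy decoding picks the token that the perturbed image amplifies, while the contrastive score demotes it. Real systems typically also restrict the candidate set to tokens that are plausible under the clean distribution; that step is omitted here for brevity.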
IncidentMind is a token-budgeted multi-agent system for autonomous root cause analysis of production AI failures. It pre-syncs Slack, Confluence, and Jira into a HydraDB temporal knowledge graph via MCP, converting all agent queries into a single graph traversal. A tri-tier inference strategy uses minilm-l6 for sync-time tasks, quantized Llama-3-14B for agent reasoning, and GPT-4o-mini only when confidence falls below 85%, reducing per-incident cost from $1.50 to $0.003. Structured token budgeting compresses 50,000 raw log tokens to 1,050 tokens (a reduction of about 98%). Across 847 production incidents, IncidentMind achieved 91% fix accuracy and reduced mean time to detect from 4.2 hours to 3 minutes.
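The confidence-gated escalation described above can be sketched as follows. This is an illustrative sketch only: the `route` function, the `Result` type, and the stub models are hypothetical and stand in for the local reasoning model and the hosted fallback. Only the 85% threshold and the token counts come from the summary.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

# From the summary: escalate only when confidence falls below 85%.
CONFIDENCE_THRESHOLD = 0.85

# A model is any callable returning (answer, confidence in [0, 1]).
ModelFn = Callable[[str], Tuple[str, float]]

@dataclass
class Result:
    answer: str
    confidence: float
    tier: str  # which tier produced the answer

def route(prompt: str, local: ModelFn, fallback: ModelFn,
          threshold: float = CONFIDENCE_THRESHOLD) -> Result:
    """Try the cheap local model first; escalate only on low confidence."""
    answer, confidence = local(prompt)
    if confidence >= threshold:
        return Result(answer, confidence, "local")
    answer, confidence = fallback(prompt)
    return Result(answer, confidence, "fallback")

# Hypothetical stubs standing in for real model calls.
def stub_local(prompt: str) -> Tuple[str, float]:
    if "cache" in prompt:
        return ("flush and warm the cache", 0.92)
    return ("unknown", 0.40)

def stub_fallback(prompt: str) -> Tuple[str, float]:
    return ("roll back the last deploy", 0.90)

easy = route("cache miss storm", stub_local, stub_fallback)
hard = route("intermittent 502s", stub_local, stub_fallback)

# Token-budget arithmetic from the summary: 50,000 -> 1,050 tokens.
token_reduction = 1 - 1_050 / 50_000
print(easy.tier, hard.tier, f"{token_reduction:.1%}")
```

The point of this design is that the expensive tier is paid for only on the fraction of incidents where the local model is unsure, which is what makes the reported per-incident cost plausible.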
The paper introduces Self-Aligned Reward (SAR), a fine-grained RL signal that complements verifiable rewards to improve both the accuracy and the efficiency of LLM reasoning. SAR is defined as the relative perplexity difference between an answer conditioned on the query and the same answer scored on its own, thereby favoring concise, query-specific responses and penalizing redundancy. Quantitative analysis confirms that SAR reliably ranks answer quality, assigning higher scores to concise correct answers than to verbose ones. Integrating SAR with PPO or GRPO reduces average answer length by 30% while boosting accuracy by 4% across four model families and seven benchmarks, with strong out-of-domain generalization. The approach reaches a Pareto-optimal trade-off between correctness and efficiency, trimming unnecessary elaboration without hurting advanced reasoning behaviors. Code and data are publicly released.
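One plausible reading of the SAR definition above is sketched below. The summary only says "relative perplexity difference", so the normalization by the standalone perplexity, the sign convention, and the clipping to [-1, 1] are assumptions, as are the function names and the toy log-probabilities.

```python
import math

def perplexity(token_logprobs):
    """Perplexity from per-token natural-log probabilities."""
    return math.exp(-sum(token_logprobs) / len(token_logprobs))

def self_aligned_reward(lp_given_query, lp_standalone):
    """Assumed form: (ppl(a) - ppl(a | q)) / ppl(a), clipped to [-1, 1].

    lp_given_query: log-probs of the answer tokens with the query in context.
    lp_standalone:  log-probs of the same answer tokens without the query.
    """
    ppl_cond = perplexity(lp_given_query)
    ppl_alone = perplexity(lp_standalone)
    reward = (ppl_alone - ppl_cond) / ppl_alone
    return max(-1.0, min(1.0, reward))

# Toy numbers (made up). A query-specific answer becomes much more likely
# once the query is in context; a generic answer barely changes.
specific = self_aligned_reward([-0.2, -0.3, -0.1], [-2.0, -2.5, -1.5])
generic = self_aligned_reward([-1.0, -1.1, -0.9], [-1.0, -1.2, -0.9])
print(round(specific, 3), round(generic, 3))
```

Under this reading, an answer that depends heavily on the query scores close to 1, while filler text that is about equally likely with or without the query scores near 0. That matches the stated preference for concise, query-specific responses.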
Autoregressive language model inference is not fully determined by fixed weights; instability phenomena such as drift and hallucination arise from structural trajectory dynamics. Causal isolation experiments using gradient scrambling demonstrate that trajectory geometry constitutes a control field, and that state-dependent feedback (e.g., switching between two frozen models without parameter updates) is both necessary and sufficient for stability. Fixed-setpoint control fails due to control friction, whereas the proposed boundary-aware Dynamic Operator Mixing (Band DOM) achieves stability with approximately 79% of inference steps requiring zero control input. The paper also identifies a fundamental limit: dynamic stability and semantic consistency are decoupled. Stabilized trajectories exhibit mode-switching in over 85% of trials while maintaining geometric smoothness, revealing a kinetic/potential decomposition of inference dynamics.
The paper identifies five failure modes specific to production AI systems that traditional observability misses. It proposes an observability architecture integrating Prometheus, Grafana, and OpenObserve. Metrics are defined across retrieval quality, vector database health, LLM inference performance, and end-to-end pipeline latency. The framework was validated in a production environment handling 2 million daily queries. It reduced mean time to detection by up to 97% for previously undetectable incidents.