UncertaiNLP 2025 Paper Index
- Fellow Traveler

| # | Paper Title | Authors | Relevance | Links | Key Connection to Entropy Engine |
| --- | --- | --- | --- | --- | --- |
| 23 | ERGO: Entropy-guided Resetting for Generation Optimization in Multi-turn Language Models | Khalid, Jeyaganthan, Do, Fu, Sharma, O'Brien, Zhu | CRITICAL | ACL: .../main.23/; arXiv: 2510.14077 | Independent convergence with Entropy Engine. Uses Shannon entropy H over next-token distributions, ΔH̄(t) as a drift signal, and non-prescriptive input restructuring. Reports a 56.6% performance gain, 24.7% aptitude increase, and 35.3% unreliability reduction. |
| 26 | DeLTa: A Decoding Strategy based on Logit Trajectory Prediction Improves Factuality and Reasoning Ability | He, Takase, Ishibashi, Shimodaira | HIGH | ACL: .../main.26/; arXiv: 2503.02343 | Analyzes logit trajectories across transformer layers to improve factuality. Validates the logit-level monitoring approach relevant to the Engine's planned hardware implementation path. |
| 21 | Read Your Own Mind: Reasoning Helps Surface Self-Confidence Signals in LLMs | Podolak, Verma | HIGH | ACL: .../main.21/; arXiv: 2505.23845 | Shows that reliable uncertainty estimation requires explicit exploration of the generative space. Semantic entropy is validated as a reliable signal. Supports the Engine's premise that distributional monitoring reveals genuine model state. |
| 16 | Confidence-Based Response Abstinence: Improving LLM Trustworthiness via Activation-Based Uncertainty Estimation | Huang, Datla, Zhu, Samuel, Liu, Kumar, Soni (Capital One) | HIGH | ACL: .../main.16/; arXiv: 2510.13750 | FFN activations carry more information than post-softmax probabilities. Financial-industry RAG application. Abstention principle: the cost of a wrong answer exceeds the cost of no answer. Validates the Engine's deeper-than-probability monitoring and high-stakes deployment context. |
| 2 | Phases of Uncertainty: Confidence-Calibration Dynamics in Language Model Training | Durai | MODERATE | ACL: .../main.2/ | Treats uncertainty as a training-dependent property, not a static one. Relevant to the Engine's temporal-dynamics approach: uncertainty changes over time rather than being a fixed characteristic. |
| 6 | Certain but not Probable? Differentiating Certainty from Probability in LLM Token Outputs for Probabilistic Scenarios | Toney, Wails | MODERATE | ACL: .../main.6/ | Token-level probability diverges from theoretical distributions. Highlights the gap between model confidence and actual correctness, the exact problem the Engine monitors. |
| 5 | The Geometry of Creative Variability: How Credal Sets Expose Calibration Gaps in Language Models | Garces Arias, Rodemann, Heumann | MODERATE | ACL: .../main.5/ | Uses credal sets (imprecise probability) for LLM uncertainty. The geometric approach to distributional analysis connects to the Engine's distributional monitoring framework. |
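
To make the entropy-drift idea in the ERGO entry concrete, here is a minimal Python sketch. It assumes ΔH̄(t) means the change in average per-token Shannon entropy between consecutive turns; the index does not spell out the paper's exact definition, so treat that reading as an assumption. The function names and the `threshold` parameter are hypothetical and are not taken from ERGO or from Entropy Engine.

```python
import math
from typing import List, Sequence

Distribution = Sequence[float]  # one next-token probability distribution


def shannon_entropy(probs: Distribution) -> float:
    """Shannon entropy in bits; zero-probability tokens contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0.0)


def mean_turn_entropy(token_dists: Sequence[Distribution]) -> float:
    """Average entropy over all next-token distributions produced in one turn."""
    if not token_dists:
        raise ValueError("a turn needs at least one token distribution")
    return sum(shannon_entropy(d) for d in token_dists) / len(token_dists)


def entropy_drift(turns: Sequence[Sequence[Distribution]]) -> List[float]:
    """Change in mean entropy between consecutive turns (assumed reading of the drift signal)."""
    means = [mean_turn_entropy(t) for t in turns]
    return [later - earlier for earlier, later in zip(means, means[1:])]


def should_reset(turns: Sequence[Sequence[Distribution]], threshold: float) -> bool:
    """Hypothetical trigger: flag a reset when the latest drift exceeds the threshold."""
    drift = entropy_drift(turns)
    return bool(drift) and drift[-1] > threshold


# Toy data: two turns, two generated tokens each.
turn_1 = [[1.0, 0.0], [0.5, 0.5]]                 # entropies 0 and 1 bit
turn_2 = [[0.25, 0.25, 0.25, 0.25], [0.5, 0.5]]   # entropies 2 and 1 bit
drift = entropy_drift([turn_1, turn_2])
reset = should_reset([turn_1, turn_2], threshold=0.5)
```

In this toy case the mean entropy rises from 0.5 to 1.5 bits, so the drift is 1.0 bit and the hypothetical trigger fires. A real monitor would read the distributions from the model's softmax output, and the threshold would need tuning.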

