Benchmarking aggregation tasks on 100,000 rows, the author builds a deterministic engine to replace RAG for computation queries
Summary
In this blog post, the author benchmarks retrieval-augmented generation (RAG) pipelines against a deterministic full-scan engine on aggregation tasks over 100,000 rows. The results show that larger context windows do not improve accuracy on these tasks; they make errors harder to detect. The author concludes that computation-heavy queries should be routed away from RAG entirely, and builds a system that directs such queries to a deterministic full-scan engine to preserve accuracy.
Key takeaways
Increasing the context window size in a RAG system does not improve accuracy on aggregation tasks; it makes errors harder to detect.
Benchmarks on 100,000 rows comparing retrieval pipelines with a deterministic full-scan engine indicate that computation queries should be routed away from RAG entirely.
The author built a system that automatically routes computation queries to a deterministic full-scan engine, avoiding RAG's inaccuracies on such queries.
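
The routing idea in the last takeaway can be sketched roughly as follows. This is a minimal illustration and not the author's implementation: the digest does not say how computation queries are detected, so the keyword pattern, the function names (`is_computation_query`, `route`, `full_scan_total`), and the row layout are all assumptions made for the example.

```python
import re

# Hypothetical heuristic: treat a query as "computation" if it mentions an
# aggregation keyword. The original post may classify queries differently.
AGGREGATION_PATTERN = re.compile(
    r"\b(sum|total|average|mean|median|count|how many|max|maximum|min|minimum)\b",
    re.IGNORECASE,
)


def is_computation_query(query: str) -> bool:
    """Return True when the query appears to ask for an aggregate value."""
    return bool(AGGREGATION_PATTERN.search(query))


def route(query: str) -> str:
    """Pick a backend name: 'full_scan' for aggregates, 'rag' otherwise."""
    return "full_scan" if is_computation_query(query) else "rag"


def full_scan_total(rows, column, predicate=lambda row: True):
    """Deterministically sum `column` over every row that passes `predicate`.

    Every row is visited, so the result does not depend on which rows a
    retriever happened to surface.
    """
    return sum(row[column] for row in rows if predicate(row))


# Toy rows, invented for illustration only.
rows = [
    {"region": "north", "amount": 10},
    {"region": "south", "amount": 5},
    {"region": "north", "amount": 7},
]

backend = route("What is the total amount for the north region?")
north_total = full_scan_total(rows, "amount", lambda r: r["region"] == "north")
```

A keyword match is deliberately crude; the point is only that the decision happens before retrieval, so aggregate questions never reach the RAG path.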