Thinkgap feed

AI signal, minus the noise.

Curated items are read from the processed items table and served as a bilingual feed.

78 items

GITHUB Jun 15, 2026

llama.cpp Release b9659 Fixes Miscounting of n_tokens in mtmd (#24656)

The llama.cpp project released tag b9659, which fixes a bug in the mtmd component that miscounted n_tokens (PR #24656). The release also provides pre-built binaries for a wide range of platforms: macOS (ARM64, Intel), Linux (x64, ARM64, and s390x, with Vulkan, ROCm, OpenVINO, and SYCL backends), Android (ARM64), and Windows (x64 and ARM64, with CUDA 12/13, Vulkan, SYCL, and HIP backends). The KleidiAI-enabled macOS Apple Silicon build is disabled in this release, while the iOS XCFramework artifact is available.

GITHUB Jun 15, 2026

llama.cpp b9658 Now Includes Full Unparsed Prompt in Debug on Chat Parse Errors

The llama.cpp project released build b9658. A key change improves chat debugging: on parse errors, the debug output now includes the full unparsed prompt. The release also provides pre-built binaries for many platforms, including macOS (Apple Silicon, Intel), Linux (CPU, Vulkan, ROCm, OpenVINO, SYCL), Android (arm64 CPU), and Windows (CPU, CUDA, Vulkan, SYCL, HIP). The KleidiAI-enabled macOS Apple Silicon build is disabled in this release.

GITHUB Jun 15, 2026

llama.cpp Release b9656 Hardens PEG Tool Call Parsing and Error Handling

llama.cpp release b9656 hardens PEG-native tool call parsing. The parser now accepts an optional leading "type":"function" field to accommodate OpenAI-style tool call serialization; this lenient handling is gated behind an analysis flag. On a final parse failure, the parser returns a clean error and logs the unparsed fragment instead of throwing raw internal state. When the arguments string is not valid JSON, it is preserved as-is, so prompt rendering no longer aborts. Parse failures are surfaced with clearer error messages, eliminating silent empty assistant turns.
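
The behaviours described above can be illustrated with a small sketch. This is not llama.cpp's parser (which is PEG-based C++) and does not reproduce its API: the function name `parse_tool_call`, the `allow_type_field` parameter standing in for the analysis flag, and the exact JSON shapes accepted are all assumptions made for illustration.

```python
import json
from typing import Any


def parse_tool_call(text: str, allow_type_field: bool = True) -> dict[str, Any]:
    """Hypothetical lenient tool-call parser (illustration only).

    allow_type_field is an assumed stand-in for the flag that gates
    acceptance of the OpenAI-style "type":"function" field.
    """
    try:
        obj = json.loads(text)
    except json.JSONDecodeError as exc:
        # Return a clean error that carries the unparsed fragment,
        # rather than leaking parser internals.
        raise ValueError(
            f"tool call parse failed; unparsed fragment: {text!r}"
        ) from exc

    if not isinstance(obj, dict):
        raise ValueError(f"tool call must be a JSON object, got: {text!r}")

    if "type" in obj:
        if not allow_type_field:
            raise ValueError('unexpected "type" field in tool call')
        if obj["type"] != "function":
            raise ValueError(f'unsupported tool call type: {obj["type"]!r}')
        # Assumed shapes: fields sit next to "type", or are nested
        # under a "function" key as in OpenAI-style serialization.
        if isinstance(obj.get("function"), dict):
            obj = obj["function"]

    name = obj.get("name")
    if not isinstance(name, str) or not name:
        raise ValueError(f"tool call is missing a function name: {text!r}")

    raw_args = obj.get("arguments", "{}")
    if isinstance(raw_args, str):
        try:
            args: Any = json.loads(raw_args)
        except json.JSONDecodeError:
            # Preserve the raw string so later rendering can still proceed.
            args = raw_args
    else:
        args = raw_args

    return {"name": name, "arguments": args}


if __name__ == "__main__":
    call = json.dumps(
        {"type": "function", "name": "get_weather", "arguments": '{"city": "Oslo"}'}
    )
    print(parse_tool_call(call))
```

In this sketch a malformed arguments string is passed through unchanged rather than rejected, which mirrors the release note's point that invalid JSON arguments should not abort prompt rendering.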

GITHUB Jun 15, 2026

llama.cpp Release b9655 Fixes Long-standing Grammar Generator Bug in Chat

The llama.cpp project released tag b9655, which fixes an 'oldie but goodie' grammar generator bug in the chat feature that surfaced during recent changes (PR #24653). Additionally, an erroneous case in the PEG parser test was updated. The release provides pre-built binaries for a wide range of platforms including macOS (Apple Silicon, Intel, KleidiAI), Linux (x64, arm64, s390x, Vulkan, ROCm, OpenVINO, SYCL), Android (arm64), and Windows (x64, arm64, CUDA 12/13, Vulkan, SYCL, HIP). openEuler builds and UI components are also included.

GITHUB Jun 15, 2026

llama.cpp Adds Post-Decode Callback to mtmd for Multimodal Processing

llama.cpp release b9654 adds a post-decode callback to mtmd, the project's multimodal module (PR #24645). The development was assisted by the Qwen3.6-27B language model. Pre-built binaries are provided for macOS Apple Silicon, Linux x64/arm64, Windows x64/arm64, and Android, with various GPU backends (Vulkan, CUDA 12/13, ROCm, SYCL, HIP); some build configurations are disabled.

GITHUB Jun 15, 2026

llama.cpp Release b9653 Adds Vulkan Support for More CONCAT Operations and Multi-Platform Binaries

The b9653 release of llama.cpp extends the Vulkan backend to handle additional CONCAT tensor operation types, improving compatibility for models that rely on these operations. It also ships pre-built binaries for macOS (Apple Silicon, Intel), Linux (multiple GPU backends including Vulkan, ROCm, OpenVINO, SYCL), Android, Windows (CUDA 12/13, Vulkan, SYCL, HIP), and openEuler platforms. The release was published automatically on June 15, 2026.