AI intelligence feed

R LOCALLLAMAJun 28, 2026

User Claims Qwen3-VL-2B is the Only Viable VLM for JSON Extraction on Low-End Hardware

A Reddit user reports that after extensive testing on three low-end laptops (Intel i3, 8GB RAM, integrated GPU), Qwen3-VL-2B in Q4_K_M GGUF quantization reliably extracts data from images to JSON, outperforming Qwen3-VL-4B and Qwen3.5 2B. The user notes this model is absent from major benchmarks like Artificial Analysis and the Open LLM Leaderboard, which list the 4B version instead. The post questions why it is ignored and asks if any other model can handle the task on similarly constrained devices like phones or Raspberry Pis. No quantitative benchmarks or replication details are provided.

R LOCALLLAMAJun 28, 2026

Clark Labs Releases Ternary Sana 1.6B Text-to-Image Transformer, 8.6× Smaller with Near-FP16 Quality

Clark Labs has compressed the Sana 1.6B text-to-image transformer to ternary quantization (~1.85 bits per weight), achieving an 8.6× size reduction from 3.21 GB (FP16) to just 374 MB while retaining near-FP16 image quality. The model uses group-wise scales and maintains a small high-precision tail (~5% of parameters for conditioning and projection layers) to preserve important details. The packed ternary weights are provided alongside an unpacked bf16 version that is a drop-in replacement for diffusers. Released under the Apache-2.0 license, this compressed model enables efficient local deployment of Sana 1.6B on resource-constrained hardware.

R LOCALLLAMAJun 28, 2026

Koboldcpp v1.116 Released

Koboldcpp version 1.116 has been released. The announcement provides no additional details about changes, fixes, or new features.

R LOCALLLAMAJun 28, 2026Highlight

55-LLM blind peer evaluation reveals systematic same-family bias in LLM judges

An open evaluation pitted 55 LLMs from 11 developer families against 198 hand-written prompts; models then blind-graded each other across 22,254 judgments, excluding self-ratings. All eight families with sufficient data showed statistically significant same-family rating bias: Qwen judges favored other Qwen models by +0.91 points, while Mistral judges penalized other Mistral models by −1.02 points—the largest absolute bias. Other families ranged from xAI (+0.75) to Meta (−0.68). Aggregate leaderboards obscured category-level variation, with six different models topping nine categories, and code tasks provoked the highest judge disagreement. The full dataset, code, and prompts are MIT-licensed, and the author outlines next steps including anchoring to ground truth and isolating judge bias from response quality.

R LOCALLLAMAJun 28, 2026Highlight

claude_converter: Convert Claude Code Sessions into Fine-tuning Data for Local Models

A developer released claude_converter, an open-source tool that converts Claude Code session .jsonl files into the messages format accepted by fine-tuning frameworks like TRL/SFTTrainer, Axolotl, and LLaMA-Factory (ShareGPT format). It includes a clean_messages() helper to strip tool-use blocks and an inspect_session() function for token counts and breakdowns. The tool has zero dependencies and can be installed via `uv pip install claude-converter`. Users are advised to filter sessions to only those where the final assistant turn solved the problem before training.

R LOCALLLAMAJun 28, 2026

Model Registry: Torrents for open models using Hugging Face as a fallback web seed

Developer Ravindra Marella created Model Registry, a GitHub repository and website for sharing .torrent files of popular open models. The system uses a custom backend service to redirect BitTorrent client requests to Hugging Face URLs, providing a web seed fallback when no peers are available. The service is currently experimental, with occasional CDN errors that usually succeed after retries. Plans include automating torrent creation and publishing via GitHub Actions, though the free runners' 100 GB disk limit is a hurdle for models over 100 GB.

AI signal, minus the noise.

User Claims Qwen3-VL-2B is the Only Viable VLM for JSON Extraction on Low-End Hardware

Clark Labs Releases Ternary Sana 1.6B Text-to-Image Transformer, 8.6× Smaller with Near-FP16 Quality

Koboldcpp v1.116 Released

55-LLM blind peer evaluation reveals systematic same-family bias in LLM judges

claude_converter: Convert Claude Code Sessions into Fine-tuning Data for Local Models

Model Registry: Torrents for open models using Hugging Face as a fallback web seed