AI intelligence feed

GITHUB | Jul 3, 2026 | Highlight

Anthropic Releases Claude Code: An Agentic Terminal-Based Coding Assistant

Claude Code is an agentic coding tool that runs directly in the terminal. It reads the codebase for context and carries out routine tasks from natural-language commands, including explaining complex code and managing git workflows. Every interaction is a conversational instruction, so developers can work faster without leaving the command line.

GITHUB | Jul 3, 2026

safishamsi's AI Coding Skill: From Any Folder to Queryable Knowledge Graph Across Claude Code, Cursor, and Gemini CLI

Developer safishamsi has published, via GitHub Sponsors, an AI coding assistant skill that works with tools such as Claude Code, Codex, OpenCode, Cursor, and Gemini CLI. The skill ingests a folder containing code, SQL schemas, R scripts, shell scripts, documentation, papers, images, or videos, and turns the contents into a single queryable knowledge graph that unifies application code, database schema, and infrastructure. The listing gives no further details on availability or pricing.

GITHUB | Jul 3, 2026 | Highlight

vLLM Integrates aiter Backend for NVFP4 MOE on AMD gfx950 (MI350/MI355), Delivering 30-40% End-to-End Speedup

This pull request adds aiter backend support to vLLM to accelerate NVFP4 Mixture of Experts (MOE) inference on AMD's gfx950 GPUs (MI350/MI355). It incorporates a 2-stage fused NVFP4 MOE implementation from the ROCm/aiter project, enabled with the `--moe-backend aiter` flag. Benchmarks show roughly a 2-3x speedup on the fused MOE operation and a 30-40% improvement in end-to-end throughput for NVFP4 MOE models such as Qwen3-30B-A3B-FP4. The implementation still uses BF16 MFMA instructions and depends on aiter PR #4021 being integrated upstream.
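The gap between the 2-3x operator speedup and the 30-40% end-to-end gain is what Amdahl's law predicts when only part of the runtime is accelerated. The sketch below is illustrative arithmetic, not data from the pull request: it assumes throughput scales inversely with runtime, and the function names are hypothetical.

```python
def end_to_end_speedup(moe_fraction: float, moe_speedup: float) -> float:
    """Amdahl's law: overall speedup when only `moe_fraction` of runtime gets faster."""
    return 1.0 / ((1.0 - moe_fraction) + moe_fraction / moe_speedup)


def implied_moe_fraction(overall_speedup: float, moe_speedup: float) -> float:
    """Invert Amdahl's law: share of baseline runtime the accelerated op must occupy."""
    return (1.0 - 1.0 / overall_speedup) / (1.0 - 1.0 / moe_speedup)


if __name__ == "__main__":
    # Assumed pairing of the reported ranges: 2x op speedup with +30%, 3x with +40%.
    for op, overall in [(2.0, 1.3), (3.0, 1.4)]:
        share = implied_moe_fraction(overall, op)
        print(f"op {op:.0f}x, end-to-end {overall:.1f}x -> MOE share ~{share:.0%}")
```

Under these assumptions the fused MOE operation would account for a bit under half of the baseline runtime; the pull request summary does not report that share directly.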

GITHUB | Jul 3, 2026

llama.cpp Pull Request Adds `get_rows_back` Operation with fp32/fp16 Support

A pull request to the llama.cpp project introduces the `get_rows_back` operator. The implementation currently only supports fp32 and fp16 data types, as these are supported by the CPU backend. All related unit tests are passing.
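For readers unfamiliar with the operator, the sketch below illustrates the semantics usually associated with a `get_rows_back` operation, assuming it is the gradient counterpart of `get_rows` (a row gather). It is a pure-Python illustration with hypothetical signatures, not the code from the pull request.

```python
def get_rows(table, indices):
    """Forward op: gather the selected rows of `table`."""
    return [list(table[i]) for i in indices]


def get_rows_back(grad_rows, indices, n_rows):
    """Backward op (assumed semantics): scatter-add each gradient row back to
    the table row it was gathered from. Repeated indices accumulate."""
    n_cols = len(grad_rows[0]) if grad_rows else 0
    grad_table = [[0.0] * n_cols for _ in range(n_rows)]
    for grad_row, row_index in zip(grad_rows, indices):
        for col, value in enumerate(grad_row):
            grad_table[row_index][col] += value
    return grad_table


if __name__ == "__main__":
    indices = [0, 2, 0]  # row 0 is gathered twice
    upstream = [[1.0, 2.0], [3.0, 4.0], [10.0, 20.0]]
    print(get_rows_back(upstream, indices, n_rows=3))
    # [[11.0, 22.0], [0.0, 0.0], [3.0, 4.0]]
```

The accumulation on repeated indices is the part that distinguishes this from a plain scatter: a row that was read twice in the forward pass receives the sum of both gradients.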

GITHUB | Jul 3, 2026 | Highlight

SGLang Adds LingBot Realtime Prompt, KV Window, and Lazy VAE Controls for Diffusion Models

A pull request to the SGLang project introduces three new optional controls for LingBot diffusion models. Realtime support now accepts composite prompt and camera-action inputs in a single event and resets the cross-attention cache only when the prompt changes. An optional interactive KV window control, disabled by default and activated via the SGLANG_LINGBOT_ENABLE_INTERACTIVE_KV_WINDOW environment variable, dynamically adjusts sampling windows for static versus moving camera-control chunks to improve motion continuity. A lazy VAE encode control, also disabled by default and configured through SGLANG_LINGBOT_LAZY_VAE_ENCODE_BLACK_FRAMES, encodes only the initial image plus a configurable number of black padding frames and then extends the latent condition to avoid redundant encoding on long padding tails. The additions include accuracy tests for static hover and forward-moving dragon scenarios.
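The "reset the cross-attention cache only when the prompt changes" behavior can be sketched as follows. This is a minimal illustration under stated assumptions: the `RealtimeSession` class, the event keys `prompt` and `camera_action`, and the `encode_prompt` callable are all hypothetical and are not SGLang APIs.

```python
class RealtimeSession:
    """Toy session that keeps a prompt-derived cache across realtime events."""

    def __init__(self, encode_prompt):
        self._encode_prompt = encode_prompt  # hypothetical text-encoder callable
        self._prompt = None
        self._cross_attn_cache = None
        self.cache_resets = 0

    def handle_event(self, event):
        """Process one composite event carrying a prompt and a camera action."""
        prompt = event.get("prompt")
        # Rebuild the cache only when a prompt is present and differs from the last one.
        if prompt is not None and prompt != self._prompt:
            self._prompt = prompt
            self._cross_attn_cache = self._encode_prompt(prompt)
            self.cache_resets += 1
        return {"cache": self._cross_attn_cache, "action": event.get("camera_action")}


if __name__ == "__main__":
    session = RealtimeSession(encode_prompt=str.upper)
    session.handle_event({"prompt": "a dragon", "camera_action": "hover"})
    session.handle_event({"prompt": "a dragon", "camera_action": "forward"})
    print(session.cache_resets)  # 1
```

Camera-only changes reuse the cached prompt state, so the relatively expensive prompt encoding is paid once per distinct prompt rather than once per event.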

GITHUB | Jul 3, 2026 | Highlight

llama.cpp Release b9864: SSE Pings Prevent Connection Drops During Slow Prefill

llama.cpp release b9864 fixes a server-side issue where healthy client connections could be dropped during slow prompt prefill. The server and WebUI now ping silent SSE streams every 1 second and disconnect a client only after 3 seconds of inactivity, so long-running prefill phases no longer trigger timeouts. The SSE ping interval is exposed as a per-request field (`sse_ping_interval`), which the WebUI sets to 1 second in its request body; the global default stays at 30 seconds for API clients, preserving backward compatibility. On the server side, the parameter moves into the request schema with type and range validation. Pre-built binaries are provided for macOS, Linux, Windows, and Android across multiple backends.
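The interaction between the ping interval and the inactivity timeout can be simulated in a few lines. This is a timing sketch only, not llama.cpp code: the function names are hypothetical, and it models a stream that sends its first real data after a 10-second prefill.

```python
def ping_times(data_times, ping_interval, end_time):
    """Times at which keep-alive pings are sent: one each time the stream
    has been silent for `ping_interval` seconds."""
    pings = []
    last_sent = 0.0
    for t in sorted(data_times) + [end_time]:
        while last_sent + ping_interval < t:
            last_sent += ping_interval
            pings.append(last_sent)
        last_sent = t
    return pings


def client_times_out(arrival_times, timeout, end_time):
    """True if any gap between consecutive arrivals exceeds `timeout` seconds."""
    last_seen = 0.0
    for t in sorted(arrival_times) + [end_time]:
        if t - last_seen > timeout:
            return True
        last_seen = t
    return False


if __name__ == "__main__":
    first_token = [10.0]  # assumed: prefill takes 10 s before any data arrives
    for interval in (1.0, 30.0):
        arrivals = ping_times(first_token, interval, 10.0) + first_token
        print(interval, client_times_out(arrivals, timeout=3.0, end_time=10.0))
    # 1.0 False
    # 30.0 True
```

With a 1-second interval the longest silence the client sees is 1 second, well inside the 3-second window; with the 30-second default no ping arrives during the prefill, so a 3-second watchdog would fire.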
