Thinkgap feed

AI signal, minus the noise.

Curated items are read from the processed items table and served as a bilingual feed.

29 items

REDDIT MACHINELEARNING · Jun 15, 2026

Cleo: Finetuning Qwen3.5-2B-Base into a Full Text-to-SQL Analyst with a Unified Harness

Cleo is an open-source text-to-SQL model built by finetuning Qwen3.5-2B-Base, designed to pack full analyst behavior into a 2B-parameter model. The system uses the same structured harness for training, evaluation, and inference, and implements a gather-repair-answer contract in which candidate queries are searched with live execution evidence. Key design choices include co-optimizing the model contract, SQL safety layer, dialect handling, timeouts, and clarification behavior. The model, harness, and datasets are fully open-source on GitHub and Hugging Face. The project demonstrates how tightly coupling training and inference in a single harness can enable small models to handle complex SQL generation and interactive debugging.
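The gather-repair-answer idea can be illustrated with a minimal sketch. This is not Cleo's harness or API: the function names, the `repair` callback (standing in for the model), and the SELECT-only safety check are all assumptions made for illustration, and timeouts and dialect handling are omitted.

```python
import sqlite3

def is_read_only(sql):
    # Toy safety layer (assumption): allow one SELECT statement only.
    stripped = sql.strip().rstrip(";")
    return stripped.lower().startswith("select") and ";" not in stripped

def run_candidate(conn, sql):
    # Execute one candidate and return (rows, error) as execution evidence.
    if not is_read_only(sql):
        return None, "rejected by safety layer"
    try:
        return conn.execute(sql).fetchall(), None
    except sqlite3.Error as exc:
        return None, str(exc)

def gather_repair_answer(conn, candidates, repair, max_repairs=1):
    # Gather: try each candidate. Repair: feed the error back to `repair`.
    # Answer: return the first result that executed cleanly, plus all evidence.
    evidence = []
    for sql in candidates:
        attempt = sql
        for _ in range(max_repairs + 1):
            rows, error = run_candidate(conn, attempt)
            evidence.append((attempt, rows, error))
            if error is None:
                return rows, evidence
            attempt = repair(attempt, error)
            if attempt is None:
                break
    return None, evidence

# Hypothetical demo data and a hand-written stand-in for the model's repair step.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, total REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 10.0), (2, 5.5)])

def toy_repair(sql, error):
    if "no such column" in error:
        return sql.replace("amount", "total")
    return None

rows, evidence = gather_repair_answer(
    conn, ["SELECT SUM(amount) FROM orders"], toy_repair
)
print(rows)
```

In this sketch the first candidate fails against the live database, the error text is passed to the repair step, and the repaired query supplies the answer; the `evidence` list keeps every attempt so the final answer can be audited.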

REDDIT MACHINELEARNING · Jun 15, 2026

FeynRL: An Open-Source Framework for Transparent RL Post-Training of LLMs, VLMs, and Agents

Reddit user /u/summerday10 released FeynRL, an open-source framework designed to make reinforcement learning post-training for large language models, vision-language models, and agents fully transparent and modifiable. The framework exposes the entire training loop (data loading, rollout generation, reward computation, loss construction, optimization, and evaluation) so researchers can develop new algorithms without fighting hidden systems. It currently includes examples for supervised fine-tuning, DPO, and RL-style training, and supports single-GPU, multi-GPU, and cluster setups. The project was motivated by the belief that open weights alone are insufficient: open training codebases that keep algorithms explicit and separate from systems code are necessary for advancing open ML/AI research.
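To make the six stages concrete, here is a toy loop in plain Python with each stage as its own function. It is not FeynRL code and uses none of its APIs: the one-parameter Bernoulli policy, the REINFORCE update with a mean-reward baseline, and all names are assumptions chosen to keep the sketch self-contained.

```python
import math
import random

def sigmoid(theta):
    return 1.0 / (1.0 + math.exp(-theta))

def load_data():
    # Data loading: eight identical toy prompts.
    return ["prompt"] * 8

def rollout(theta, prompts, rng):
    # Rollout generation: sample action 1 with probability sigmoid(theta).
    p = sigmoid(theta)
    return [1 if rng.random() < p else 0 for _ in prompts]

def compute_rewards(actions):
    # Reward computation: action 1 counts as the correct answer.
    return [float(a) for a in actions]

def policy_gradient(theta, actions, rewards):
    # Loss construction: REINFORCE with a mean-reward baseline.
    # For a Bernoulli(sigmoid(theta)) policy, d/dtheta log pi(a) = a - p.
    p = sigmoid(theta)
    baseline = sum(rewards) / len(rewards)
    terms = [(r - baseline) * (a - p) for a, r in zip(actions, rewards)]
    return sum(terms) / len(terms)

def evaluate(theta):
    # Evaluation: probability of producing the correct action.
    return sigmoid(theta)

def train(steps=200, lr=0.5, seed=0):
    rng = random.Random(seed)
    theta = 0.0
    prompts = load_data()
    for _ in range(steps):
        actions = rollout(theta, prompts, rng)
        rewards = compute_rewards(actions)
        grad = policy_gradient(theta, actions, rewards)
        theta += lr * grad  # Optimization: gradient ascent on expected reward.
    return theta

theta = train()
print(evaluate(0.0), evaluate(theta))
```

Because every stage is an ordinary function, swapping in a different reward or loss means editing one function rather than working around a hidden trainer, which is the property the post argues for.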

REDDIT MACHINELEARNING · Jun 15, 2026

LLMs Have Model-Specific Favorite Names: 'Elena Vasquez' and 'Marcus Chen' Strongly Indicate Claude-Generated Content

Researchers discovered that large language models exhibit strong, model-specific and version-specific priors over character names. The names 'Elena Vasquez' and 'Marcus Chen' frequently appear as a correlated ensemble across dozens of websites in diverse roles, including volcano experts, podcast hosts, thriller protagonists, and authors of 1,000+ papers published in two months, making them a reliable signal that content was generated by Claude. The team identified a third name in the ensemble, further solidifying the fingerprint. The finding emerged as a side observation from a model diffing method (CDD) and grew into a standalone paper (arXiv:2606.02184).

REDDIT MACHINELEARNING · Jun 15, 2026

Reddit User Explores Feasibility of Decentralized AI Training with a Proof-of-Training Mechanism

A Reddit user proposed a decentralized AI training framework inspired by Bitcoin mining, where participants would contribute GPU resources to train an open-source model and receive tokens as rewards. The post highlights several technical obstacles: verifying genuine training work, preventing the submission of fake or harmful gradients, objectively measuring model improvements for reward distribution, and comparing efficiency against centralized data centers. The user specifically asks whether a 'proof-of-training' mechanism could exist, linking rewards directly to measurable model improvement rather than mere compute rental. The discussion invites input from experts in distributed systems, machine learning, and crypto economics on the viability of such an architecture.
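One way to picture rewards tied to measurable improvement is the toy scoring rule below. It is only a sketch under strong assumptions that the post does not make: a trusted coordinator with a private validation set, a one-parameter linear model, and a fixed token pool split in proportion to each update's measured loss reduction. All names are hypothetical.

```python
def val_loss(w, data):
    # Mean squared error of the model y = w * x on held-out data.
    return sum((w * x - y) ** 2 for x, y in data) / len(data)

def score_updates(w, updates, data, pool=100.0):
    # Score each submitted update by how much it lowers validation loss.
    base = val_loss(w, data)
    gains = {}
    for name, delta in updates.items():
        gain = base - val_loss(w + delta, data)
        gains[name] = max(gain, 0.0)  # Useless or harmful updates earn nothing.
    total = sum(gains.values())
    if total == 0.0:
        return {name: 0.0 for name in gains}
    return {name: pool * g / total for name, g in gains.items()}

# Hypothetical validation set generated by y = 2x, and three submissions.
data = [(1.0, 2.0), (2.0, 4.0)]
updates = {"honest": 1.0, "fake": 0.0, "harmful": -1.0}
rewards = score_updates(0.0, updates, data)
print(rewards)
```

Here only the update that actually reduces validation loss is paid. The sketch leaves the post's harder questions open, including who holds the validation set in a decentralized setting and how the cost of re-evaluating every update compares with centralized training.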

REDDIT MACHINELEARNING · Jun 15, 2026

Recent CS Graduate Seeks GPU Compute Access for LLM/VLM Research, Offering Co-authorship

A recent CS graduate with publications at EACL 2026, IJCNLP-AACL 2025, MICCAI 2026, an EMNLP 2025 workshop, and an ARR submission is seeking access to multi-GPU compute (4x/8x L40S, A100, H100, H200) for LLM and VLM research. The researcher offers weekly progress updates, detailed compute usage reports, reproducible code, documentation, and co-authorship on papers targeting top conferences like *CL, CVPR, and ICLR. The request highlights the compute bottleneck faced by early-career researchers with ideas but insufficient infrastructure.

REDDIT MACHINELEARNING · Jun 15, 2026

PhD study: UX Designers & AI/ML Practitioners to test a "Trust in LLM-based Chatbots" Design Method

A PhD researcher from Mainz University of Applied Sciences is recruiting UX designers and AI/ML practitioners to evaluate a structured method for designing interface elements that calibrate user trust in LLM-based chatbots, so that users neither over-rely on nor under-trust them. Participants complete an anonymous 20-30 minute online survey in which they apply the method to a worked example, rate its clarity, usefulness, and applicability, and provide open feedback. The study seeks critical feedback to refine the method for the dissertation. No personal data is collected beyond optional professional background questions, and no compensation is provided.