Thinkgap feed

AI signal, minus the noise.

Curated items are read from the processed items table and served as a bilingual feed.

8 items

MEDIUM | LARGE LANGUAGE MODELS | Jun 14, 2026

Building a Production LLM Memory System from Scratch (Part 3 — FastAPI + STM + LTM + RAG)

This Medium tutorial is Part 3 of a series on building a production-grade LLM memory system. Only a teaser is accessible: a link to the previous article and a prompt to continue reading on Medium. The title suggests the tutorial integrates FastAPI, short-term memory (STM), long-term memory (LTM), and retrieval-augmented generation (RAG), but the raw feed content is a brief promotional snippet and gives no concrete technical details.

MEDIUM | LARGE LANGUAGE MODELS | Jun 12, 2026

A Machine Learning Engineer’s Guide to LLM Concepts: Tokens, Transformers, Embeddings, Prompts, RAG, and Fine-Tuning

This tutorial provides a practical overview of core LLM concepts for machine learning engineers. It begins with foundational elements like tokens, transformer architectures, and embeddings, then covers advanced techniques including prompt engineering, retrieval-augmented generation (RAG), and fine-tuning. The guide emphasizes developing sound engineering judgment to move beyond trial-and-error prompting. No new research or product announcements are made; it serves as an educational resource.

MEDIUM | LARGE LANGUAGE MODELS | Jun 11, 2026

Stop Building LLM Wrappers: Why 2026 Belongs to RAG Architects

The article declares the 'wrapper startup' era dead. It argues that in 2024 and 2025 a decent product could be built by simply wrapping a large language model, but that this approach is now obsolete. It predicts that 2026 will belong to RAG architects; the truncated content provides no supporting details or evidence.

MEDIUM | LARGE LANGUAGE MODELS | Jun 10, 2026

How Embeddings Power Retrieval-Augmented Generation (RAG) Systems

This Medium tutorial by Cletus Jay Ajibade provides a beginner-friendly guide to how Retrieval-Augmented Generation (RAG) systems leverage embeddings, vector databases, and large language models to search private company data. It explains the workflow of converting data into vector embeddings, performing similarity search, and using LLMs to generate context-aware answers. The piece is an introductory overview aimed at demystifying enterprise AI search architecture.
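The embed-then-search workflow summarized above can be illustrated with a minimal sketch. It is not taken from the tutorial: the `embed` function is a toy bag-of-words stand-in for a real embedding model, the in-memory list stands in for a vector database, and `DOCS`, `cosine`, and `top_k` are hypothetical names chosen for illustration.

```python
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    """Toy embedding (assumption): word counts instead of a learned vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def top_k(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Return the k documents most similar to the query."""
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

# Hypothetical "private company data" used only for this example.
DOCS = [
    "Refund policy: refunds are issued within 14 days.",
    "Office hours are 9 to 5 on weekdays.",
    "Shipping takes 3 to 5 business days.",
]

if __name__ == "__main__":
    print(top_k("What is the refund policy?", DOCS, k=1))
```

In a real system the retrieved passages would then be passed to an LLM as context; that generation step is left out here because it depends on the model provider.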

MEDIUM | LARGE LANGUAGE MODELS | Jun 10, 2026

How Retrieval-Augmented Generation (RAG) Extends LLM Knowledge Beyond Training Data

This tutorial explains that language model knowledge is frozen after training, and introduces Retrieval-Augmented Generation (RAG) as a method to let LLMs read new information such as private documents or real-time data. It highlights RAG’s role in giving models access to up-to-date answers beyond their original training cut-off.
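One way to picture how retrieved text reaches a frozen model is prompt augmentation: the new information is placed in the prompt next to the question. The sketch below is an assumption for illustration, not the tutorial's code; `build_prompt` and the prompt wording are hypothetical, and the call to an actual LLM is omitted.

```python
def build_prompt(question: str, passages: list[str]) -> str:
    """Assemble a prompt that asks the model to answer from supplied context."""
    # Number each passage so the model (and a reader) can refer back to it.
    context = "\n".join(f"[{i}] {p}" for i, p in enumerate(passages, start=1))
    return (
        "Answer the question using only the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

if __name__ == "__main__":
    # Hypothetical passage representing information newer than the training data.
    passages = ["The Q3 pricing update takes effect on October 1."]
    print(build_prompt("When does the new pricing take effect?", passages))
```

The model's weights never change in this scheme; only the prompt does, which is why the approach works for private documents and recent data alike.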

MEDIUM | LARGE LANGUAGE MODELS | Jun 10, 2026

Tutorial: How RAG-Anything Handles Complex PDF Documents for RAG

PDF files are complex, containing formatted content rather than plain text, which makes them challenging for retrieval-augmented generation (RAG) pipelines. This article introduces RAG-Anything, a tool designed to process such complex PDFs and extract usable content for RAG systems. The tutorial explains how RAG-Anything overcomes common PDF extraction hurdles.