An Interactive Mini Transformer Demo with Editable Weights Illustrates the Forward Pass in a Single HTML File
English summary
A software engineer built a minimal transformer (one attention head, one block, a 6-token vocabulary, 3-dimensional embeddings) that predicts the next word from four input words. Every computation from word embeddings to logits is displayed in a self-contained web page where weights and word vectors can be edited live and all downstream numbers update instantly. A randomise button scrambles the weights to show that an untrained model produces meaningless predictions, underscoring the necessity of training. The tool deliberately covers only the forward pass and omits backpropagation; the creator plans to build a companion visualisation for backpropagation next.
Chinese summary
一位软件工程师构建了一个最小化Transformer(单注意力头、单块、6词词表、3维嵌入),由四个词预测下一个词。所有从嵌入到logits的计算展现在一个独立的HTML页面中,权重和词向量可实时编辑,下游数值即时更新。随机化按钮打乱权重,展示未训练模型产生无意义预测,强调训练的必要性。该工具刻意只展示前向传播,不含反向传播,创建者计划后续加入。
Key points
A complete forward pass of a tiny transformer is visualised, with all matrices and vectors displayed numerically on screen.
完整展示微型Transformer的前向传播,所有矩阵和向量以数字形式呈现在屏幕上。
Weights and word embeddings are editable, and every subsequent calculation recomputes in real time, allowing hands-on exploration.
权重和词嵌入可编辑,后续所有计算实时重算,支持动手探索。
A randomise button scrambles all weights, turning the model’s prediction into nonsense and showing concretely that untrained weights carry no useful information.
随机化按钮打乱所有权重,使预测变为无意义结果,具体证明未训练的权重毫无价值。
The entire tool is a single, self-contained HTML file with no external libraries, making it easy to share and run.
整个工具是一个独立的HTML文件,无外部库,易于分享和运行。
Backpropagation is not covered; the creator intends to build a companion visualisation for that stage next.
不包含反向传播;创建者计划下一步构建配套的反向传播可视化。
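
Illustrative sketch
The forward pass summarised above can be approximated in a few lines of plain Python. This is a minimal sketch under stated assumptions, not the demo's actual code: the vocabulary words, the weight names (embed, wq, wk, wv, unembed), the uniform [-1, 1] initialisation, the causal mask and the residual connection are all hypothetical choices, and layer normalisation and any feed-forward sublayer are left out because the summary does not say what the demo's block contains. Only the sizes (6 tokens, 3 dimensions, 4 input words, one attention head) come from the summary.

```python
import math
import random

VOCAB = ["the", "cat", "sat", "on", "mat", "dog"]  # hypothetical 6-word vocabulary
D = 3  # embedding dimension, as stated in the summary


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def random_weights(rng):
    """Draw every weight uniformly from [-1, 1] (assumed stand-in for a randomise button)."""
    def mat(rows, cols):
        return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]
    return {
        "embed": mat(len(VOCAB), D),    # one 3-d vector per vocabulary word
        "wq": mat(D, D),                # query projection
        "wk": mat(D, D),                # key projection
        "wv": mat(D, D),                # value projection
        "unembed": mat(D, len(VOCAB)),  # maps the final vector to 6 logits
    }


def forward(token_ids, w):
    """Return next-token probabilities for a list of token ids."""
    x = [w["embed"][t] for t in token_ids]  # look up embeddings: (n, D)
    q, k, v = matmul(x, w["wq"]), matmul(x, w["wk"]), matmul(x, w["wv"])
    out = []
    for i in range(len(token_ids)):
        # Assumed causal mask: position i attends only to positions 0..i.
        scores = [
            sum(a * b for a, b in zip(q[i], k[j])) / math.sqrt(D)
            for j in range(i + 1)
        ]
        attn = softmax(scores)
        mixed = [sum(attn[j] * v[j][c] for j in range(i + 1)) for c in range(D)]
        # Assumed residual connection: add the attention output to the input.
        out.append([xi + mi for xi, mi in zip(x[i], mixed)])
    logits = matmul([out[-1]], w["unembed"])[0]  # predict from the last position
    return softmax(logits)


# Usage: four input words, untrained (random) weights.
weights = random_weights(random.Random(0))
prompt = [VOCAB.index(word) for word in ["the", "cat", "sat", "on"]]
probs = forward(prompt, weights)
print({word: round(p, 3) for word, p in zip(VOCAB, probs)})

# Editing a single number changes everything downstream of it.
weights["embed"][VOCAB.index("on")][0] += 0.5
print({word: round(p, 3) for word, p in zip(VOCAB, forward(prompt, weights))})
```

Because the weights are random, the printed distribution has no relationship to the input words, which is the point the randomise button makes. Nudging one embedding value and calling forward again mirrors the live editing described above: every number after the edited one is recomputed.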