A Blind Visual Paradigm for Testing Skill Transfer in Small Models Without Fine-Tuning
English summary
A user proposes an experimental paradigm to test whether a large language model can distill a reusable 'procedural scaffold' from the way it outperforms a small model on a Three.js task, and whether supplying that scaffold to the small model deepens its outputs without any fine-tuning. The design is cross-domain: the large model improves a complex scene (domain 1) and derives the scaffold from that improvement; the scaffold is then given to the small model for an unrelated Three.js task (domain 2, a low-poly turret). A third model instance, a fresh copy of the large model with no context, acts as blind judge and compares renders produced by the small model with and without the scaffold on visual quality and structural coherence. The experiment has not been run yet. The core claim is that if the scaffolded small model outperforms the baseline on an unseen domain, the scaffold carries genuinely transferable procedural knowledge rather than domain-specific hints.
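The two-domain flow described in the summary can be sketched roughly as follows. This is a minimal illustration, not the author's implementation: the `Model` type, the function names, and the prompt wording are all hypothetical, and real model clients are left out on purpose.

```python
from dataclasses import dataclass
from typing import Callable

# A "model" here is any function from prompt text to generated text.
# Hypothetical stand-in; no real API client is assumed.
Model = Callable[[str], str]


@dataclass
class Arms:
    baseline_code: str    # small model, task prompt only
    scaffolded_code: str  # small model, scaffold + task prompt


def extract_scaffold(large: Model, small: Model, domain1_task: str) -> str:
    """Domain 1: the large model improves the small model's attempt,
    then writes down the procedure behind that improvement."""
    small_attempt = small(domain1_task)
    improved = large(
        f"Task: {domain1_task}\n\nImprove this Three.js attempt:\n{small_attempt}"
    )
    # Assumed prompt wording; the proposal does not specify it.
    return large(
        "Compare the two versions and write the reusable procedure that "
        "explains the improvement. Do not mention the task's subject matter.\n\n"
        f"BEFORE:\n{small_attempt}\n\nAFTER:\n{improved}"
    )


def run_domain2(small: Model, scaffold: str, domain2_task: str) -> Arms:
    """Domain 2: the same small model, without and with the scaffold."""
    return Arms(
        baseline_code=small(domain2_task),
        scaffolded_code=small(f"{scaffold}\n\n{domain2_task}"),
    )
```

Both arms use the same small model and the same domain 2 prompt, so the scaffold text is the only difference between them. Rendering the two code outputs to images is a separate step not shown here.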
Key points
The author observed that small models (~9B parameters) often produce 'shallow' outputs that lack the planning depth and procedural discipline of large models.
A blind, cross-domain experimental paradigm is proposed using Three.js visual rendering tasks as a testbed, where output quality is directly exposed by the rendered image.
The large model extracts a 'procedural scaffold' from the improvement it makes over the small model's output in one domain; the scaffold is then supplied to the small model for a task in a completely different domain.
A fresh instance of the large model, given no context about the experiment, acts as blind judge and scores the images on visual quality, silhouette recognizability, structural coherence, and detail density.
The experiment has not yet been executed; the post is a proposal for testing whether small models can gain transferable procedural knowledge without fine-tuning.
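The blind-judging step in the key points above could be organized along these lines. This is a sketch under stated assumptions: the `Judge` signature, the randomized A/B ordering, and the averaging of score differences are illustrative choices that the proposal does not spell out; only the four rubric criteria come from the post.

```python
import random
from statistics import mean
from typing import Callable, Dict, List, Tuple

# The four rubric criteria named in the proposal.
CRITERIA = (
    "visual_quality",
    "silhouette_recognizability",
    "structural_coherence",
    "detail_density",
)

Scores = Dict[str, float]
# Hypothetical judge: receives two unlabeled images (A, B) and returns
# rubric scores for each. The score scale is left to the judge prompt.
Judge = Callable[[bytes, bytes], Tuple[Scores, Scores]]


def blind_trial(judge: Judge, baseline_png: bytes, scaffolded_png: bytes,
                rng: random.Random) -> Scores:
    """Score one pair with randomized order; return scaffolded minus baseline."""
    swap = rng.random() < 0.5  # hide which arm is shown first
    if swap:
        first, second = scaffolded_png, baseline_png
    else:
        first, second = baseline_png, scaffolded_png
    a, b = judge(first, second)
    # Unblind only after the judge has answered.
    base, scaf = (b, a) if swap else (a, b)
    return {c: scaf[c] - base[c] for c in CRITERIA}


def summarize(diffs: List[Scores]) -> Scores:
    """Mean per-criterion difference across all judged pairs."""
    return {c: mean(d[c] for d in diffs) for c in CRITERIA}
```

Randomizing presentation order per pair is one way to keep position bias from being mistaken for a scaffold effect; a positive mean difference on a criterion would favor the scaffolded arm.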