ComfyUI Workflow Uses Qwen VLM and Krea 2 to Achieve Character Consistency Across Unlimited Scenes Without IPAdapter or LoRAs
English summary
A user describes a method for generating an effectively unlimited number of high-resolution images with consistent characters. Instead of relying on image-based memory such as IPAdapter or character LoRAs, a language model reconstructs the full semantic state of every frame from scratch. The user writes a single story prompt containing detailed character sheets and scene descriptions; a Qwen VLM node splits the story into panels and rewrites each character's complete description for every panel before passing the result to Krea 2. The characters stay surprisingly consistent, with no reference images or multi-panel tricks required. The method was demonstrated with Krea 2 and likely works with other capable models; the full ComfyUI workflow is publicly shared so that others can try it with Flux, HiDream, or Seedream.
Key points
Consistency is achieved by having an LLM (Qwen VLM) re-describe every character from scratch for each panel, not by using reference images or adapters.
The workflow supports an effectively unlimited number of separate high-resolution images with only a single story prompt.
No reference images, IPAdapter, character LoRAs, or multi-panel generation tricks are needed.
The method was demonstrated with Krea 2 and Qwen VLM inside ComfyUI, and may work with other models like Flux or HiDream.
Full workflow and explanation are publicly available for experimentation.
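Illustrative sketch
The re-description idea summarized above can be illustrated in plain Python. This is a minimal sketch under stated assumptions, not the shared ComfyUI workflow: in the actual workflow a Qwen VLM node does the splitting and rewriting, whereas here a deterministic stand-in simply copies each character sheet verbatim into every panel prompt. The names `Character`, `Scene`, and `build_panel_prompts`, along with the sample characters and scenes, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Character:
    name: str
    sheet: str  # full visual description, restated in every panel


@dataclass(frozen=True)
class Scene:
    action: str  # what happens in this panel
    cast: tuple  # names of the characters present in this panel


def build_panel_prompts(characters, scenes):
    """Return one self-contained prompt per scene.

    Hypothetical stand-in for the language-model step: each prompt
    restates the full sheet of every character in the scene, so no
    panel depends on an earlier panel or on a reference image.
    """
    sheets = {c.name: c.sheet for c in characters}
    prompts = []
    for scene in scenes:
        cast_text = " ".join(f"{name}: {sheets[name]}." for name in scene.cast)
        prompts.append(f"{scene.action} {cast_text}")
    return prompts


# Sample data, invented for this example.
characters = [
    Character("Mara", "a tall woman with short silver hair, a red wool coat, and round glasses"),
    Character("Tobi", "a small boy with curly black hair, a yellow raincoat, and green boots"),
]
scenes = [
    Scene("Mara and Tobi wait at a rainy bus stop at dusk.", ("Mara", "Tobi")),
    Scene("Tobi jumps into a puddle in an empty street.", ("Tobi",)),
]

panel_prompts = build_panel_prompts(characters, scenes)
for i, prompt in enumerate(panel_prompts, start=1):
    print(f"Panel {i}: {prompt}")
```

Each resulting string would be sent to the image model as an independent generation. Because every prompt carries the complete description of each character it mentions, the number of panels is limited only by the length of the story, which matches the "effectively unlimited" claim in the summary.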