Social source: r/LocalLLaMA | June 28, 2026 | Importance: 3/5

User builds a Jarvis-level voice assistant on four RTX 4090s, says the Gemma 4 31B QAT model performs best

Summary

A local AI enthusiast built a personal voice assistant with premium capabilities including voice verification, wake words, continuous conversation, Home Assistant control, Hermes Agent integration, and deep research. The system runs on a custom server with four modified RTX 4090s (192GB VRAM total), 128GB DDR5 RAM, and a 3000W PSU powered via a 240V/30A dryer line. After testing large models such as Qwen 397B, MiniMax M3, Nemotron 3 Ultra, and GLM 4.7/5.2, the user found that Google's Gemma 4 31B QAT outperforms them all and is shockingly fast for its size. The assistant is deployed throughout the house using conference speaker-mics, with heat managed by a laundry room exhaust fan.

Key points

Custom hardware: 4 modded RTX 4090s with 48GB each (192GB VRAM), 128GB DDR5, 3000W PSU, powered by a dryer line.
The personal assistant features voice verification, wake words, continuous conversation, long-term memory, dynamic system prompt, Home Assistant and Hermes Agent integration, and deep research.
Among the models tested, the user found that Gemma 4 31B QAT outperformed Qwen 397B, MiniMax M3, Nemotron 3 Ultra, GLM 4.7, and a "lobotomized" (heavily cut-down) GLM 5.2, while being shockingly fast for its size.
The setup is noisy and hot, but the heat is managed with an exhaust fan. At idle the cards draw about as much power as half a hand dryer; at full load, about as much as two to three hand dryers.
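
The hardware figures above can be sanity-checked with a little arithmetic. The sketch below is not from the original post. It recomputes the total VRAM and the raw capacity of the 240V/30A line from the quoted numbers. Two things in it are assumptions: the 80% continuous-load derating factor (a common electrical rule of thumb, not something the post mentions) and the 4-bit weight width used for the model footprint estimate (the post does not state the QAT bit width).

```python
# Back-of-the-envelope check of the figures quoted in the post.
# Values marked "assumption" do not appear in the source.

GPU_COUNT = 4               # from the post
VRAM_PER_GPU_GB = 48        # from the post (modded RTX 4090)
PSU_WATTS = 3000            # from the post
CIRCUIT_VOLTS = 240         # from the post (dryer line)
CIRCUIT_AMPS = 30           # from the post (dryer line)
CONTINUOUS_DERATE = 0.8     # assumption: 80% rule of thumb for continuous loads
ASSUMED_BITS_PER_WEIGHT = 4 # assumption: the post does not give the QAT bit width


def total_vram_gb(gpu_count: int, vram_per_gpu_gb: int) -> int:
    """Total VRAM across all cards, in GB."""
    return gpu_count * vram_per_gpu_gb


def circuit_capacity_watts(volts: float, amps: float, derate: float = 1.0) -> float:
    """Power available on a circuit (P = V * I), optionally derated."""
    return volts * amps * derate


def weight_footprint_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in GB (1 GB = 1e9 bytes).

    Ignores the KV cache, activations, and runtime overhead.
    """
    return params_billion * bits_per_weight / 8


if __name__ == "__main__":
    vram = total_vram_gb(GPU_COUNT, VRAM_PER_GPU_GB)
    raw = circuit_capacity_watts(CIRCUIT_VOLTS, CIRCUIT_AMPS)
    derated = circuit_capacity_watts(CIRCUIT_VOLTS, CIRCUIT_AMPS, CONTINUOUS_DERATE)
    weights = weight_footprint_gb(31, ASSUMED_BITS_PER_WEIGHT)

    print(f"Total VRAM: {vram} GB")
    print(f"Circuit capacity: {raw:.0f} W raw, {derated:.0f} W derated")
    print(f"PSU fits within derated circuit: {PSU_WATTS <= derated}")
    print(f"Estimated weights for a 31B model: {weights:.1f} GB")
```

Under these assumptions the 3000W PSU sits well within the circuit's capacity, and the weights of a 31B model would occupy only a small fraction of the 192GB of VRAM, which is consistent with the user being able to try much larger models on the same machine.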
