Immersive Reading · In-App Guide · Multi-Window Popout · Full-Text Search · PET Mode
- Added Immersive Reading Mode: a full-screen reading overlay with wake word listening, TTS ducking, and an AudioModeManager state machine for hands-free article listening
- Added In-App User Guide: an 8-chapter interactive guide covering every feature, from basics to advanced
- Added Multi-Window Popout: pop a single session out into its own browser window, with BroadcastChannel sync and grid layout management
- Added Full-Text Search: progressive history loading and an MCP search bridge for searching all conversation transcripts across sessions
- Added Browser Preview: a live CDP screenshot sidecar that embeds Playwright browser previews in session cards
- Added PET Mode: a chat bubble interface with smart scrolling, a transparent background, and a relocated settings panel
- Added Context Window Display: real-time context usage in the input bar and message cards, with limits adjusted dynamically per model
- Added ChatGPT Image and Midjourney Video generation: an AI-generated content gallery with MCP tool integration
- Added Feedback System: an in-app bug report and feature request panel with validation animations and admin MCP tools
- Added OpenRouter Integration: free coding models via OpenCode MVP
- Redesigned Reader UI: a podcast-style sidebar with a centered PlaybackBar, waveform visualizer, and language labels
- Redesigned Reading Queue: player mode, auto-advance to the next article, and localStorage persistence
- Redesigned Settings: unified into a sidebar navigation stack
- Upgraded Wake Word Engine: new ONNX runtime, refined voiceprint verification, and a 48 kHz audio pipeline
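The Reading Queue's auto-advance and localStorage persistence can be illustrated with a minimal sketch. This is not the app's actual code: `ReadingQueue`, `KeyValueStore`, `QueueState`, and the `"reading-queue"` storage key are all hypothetical names chosen for illustration. The store interface is a small subset of the Web Storage API, so `window.localStorage` could be passed in directly in a browser.

```typescript
// Hypothetical subset of the Web Storage API; window.localStorage satisfies it.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// Hypothetical persisted shape: article IDs plus the playing position.
interface QueueState {
  items: string[];
  index: number;
}

class ReadingQueue {
  private store: KeyValueStore;
  private key: string;
  private state: QueueState;

  constructor(store: KeyValueStore, key: string = "reading-queue") {
    this.store = store;
    this.key = key;
    this.state = this.load();
  }

  // Restore the saved queue, falling back to an empty one on bad data.
  private load(): QueueState {
    try {
      const raw = this.store.getItem(this.key);
      const parsed = raw ? JSON.parse(raw) : null;
      if (parsed && Array.isArray(parsed.items) && Number.isInteger(parsed.index)) {
        return { items: parsed.items, index: parsed.index };
      }
    } catch {
      // Corrupt JSON: ignore it and start with an empty queue.
    }
    return { items: [], index: 0 };
  }

  private save(): void {
    this.store.setItem(this.key, JSON.stringify(this.state));
  }

  enqueue(id: string): void {
    this.state.items.push(id);
    this.save();
  }

  current(): string | undefined {
    return this.state.items[this.state.index];
  }

  // Call when the current article finishes; returns the next ID, if any.
  advance(): string | undefined {
    if (this.state.index < this.state.items.length) {
      this.state.index += 1;
    }
    this.save();
    return this.current();
  }
}
```

Because every mutation is written back to the store, a queue constructed after a page reload resumes at the article that was playing.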
OpenCode Agent · Faster Wake Word · Web Voice Chat Rebuilt
- Added OpenCode as a new agent flavor, bringing the supported agents to six: Claude Code, Codex, Gemini CLI, Openclaw, Hermes, and OpenCode
- Wake word and voiceprint models are now cached persistently in IndexedDB, so return visits skip the download
- Web voice chat rebuilt on a native scrolling list: text can be selected and copied across message cards, and the view auto-anchors to the unread position
- Endword handling hardened: in-flight TTS is cancelled automatically and the embedding buffer is cleared, preventing phantom commands
- Added Claude and Codex CLI usage tracking inside Settings, showing token consumption and cost
- Fixed voiceprint enrollment failures (CORS), button icon glyphs leaking into copied text, and right-edge misalignment of the voice input box
Self-hosted Server · Hermes Agent · Multi-device Sync
- Added Hermes Agent via the ACP protocol
- Self-hosted server support with automatic multi-device sync
- Web screen recording that captures microphone and TTS audio together
- Long messages are automatically chunked for TTS playback
- Claude model upgraded to Opus 4.7
- Voice and sync stability fixes
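As an illustration of auto-chunked TTS for long messages, the sketch below splits text at sentence boundaries (Latin and CJK punctuation) and packs sentences into chunks under a size limit. It is only one plausible approach, not the app's implementation: the function name `chunkForTts` and the 200-character default are assumptions made for this example.

```typescript
// Hypothetical helper: split a long message into TTS-sized chunks.
// The 200-character default is an illustrative assumption.
function chunkForTts(text: string, maxChars: number = 200): string[] {
  if (maxChars <= 0) throw new RangeError("maxChars must be positive");

  // Split into sentences, keeping terminal punctuation attached.
  const sentences = text.match(/[^.!?。!?]+[.!?。!?]*\s*/g) ?? [];
  const chunks: string[] = [];
  let current = "";

  // Push the accumulated chunk (if non-empty) and reset the accumulator.
  const flush = (): void => {
    const trimmed = current.trim();
    if (trimmed) chunks.push(trimmed);
    current = "";
  };

  for (const sentence of sentences) {
    // Start a new chunk when this sentence would overflow the current one.
    if (current.length + sentence.length > maxChars) flush();

    if (sentence.length <= maxChars) {
      current += sentence;
      continue;
    }

    // A single sentence longer than the limit is hard-split.
    for (let i = 0; i < sentence.length; i += maxChars) {
      const piece = sentence.slice(i, i + maxChars).trim();
      if (piece) chunks.push(piece);
    }
  }

  flush();
  return chunks;
}
```

Each returned chunk could then be sent to the speech engine in order, so playback of the first chunk can begin before the whole message has been synthesized.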
HeraldVox Initial Release
- Voice wake word plus voice input: control AI agents by speaking Chinese
- Supports four mainstream agents: Claude Code, Codex, Gemini CLI, and Openclaw
- Pilot mode: a card list of multiple agent sessions, switched with a single voice command
- Console mode: watch the AI's execution in real time
- Focus mode: a dedicated conversation with one agent
- Works on mobile and desktop as a web app, with nothing to install
- End-to-end encryption, zero data collection