| Name | Last commit | Last update |
|---|---|---|
| .. | ||
| data/jvs-dw | ||
| prisma | ||
| sql | ||
| src | ||
| tests | ||
| .dockerignore | ||
| .env.example | ||
| .gitignore | ||
| .swcrc | ||
| Dockerfile | ||
| jest.config.cjs | ||
| nest-cli.json | ||
| package.json | ||
| tsconfig.build.json | ||
| tsconfig.json | ||
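
Interaction (user requirement): tap 🎤 to start → while the user speaks, text scrolls into the input box in real time → tap again to simply exit (text already on screen is kept).

Implementation (reuses the proven realtime-coach pattern; zero changes to the ASR container):

- Frontend: captures PCM16 audio at 16 kHz with an RMS silence gate (no frames are sent during silence) → pushes frames over socket.io. `dictation:partial` (the current sentence, overwritten on each update) and `dictation:final` (a finalized sentence, accumulated) → `setInput` renders in real time. `base` preserves the text already in the input box, and dictation is appended after it.
- Backend `DictationGateway` (socket.io, same JWT handshake auth as coach): splits sentences by the gap between frame arrivals. While the user is speaking, every 700 ms it wraps the current sentence's PCM in a 44-byte WAV header and calls `TranscribeService` to produce a partial. A pause of ≥800 ms, or a sentence longer than 30 s, finalizes the whole sentence and clears the buffer. `inFlight` prevents overlapping decodes; the buffer is cleared before the final decode so that frames from the next sentence are not mixed in. The SenseVoice offline model has RTF ~0.1, so re-decoding at sentence level is much faster than real time.

Measured (simulated browser pushing frames): first partial 1.3 s after speech starts, a scrolling update roughly every 0.7 s, whole-sentence final 1.3 s after a pause, and the text is identical to one-shot recognition. `tsc` reports 0 errors on both frontend and backend.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>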
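
The commit message names three mechanisms: an RMS silence gate, a 44-byte WAV header wrapped around raw PCM, and sentence splitting by frame-arrival gap. The following is a minimal sketch of those ideas, not the repository's actual code. All function names (`rmsLevel`, `shouldSendFrame`, `wrapPcm16AsWav`, `nextAction`), the state shape, and the RMS threshold of 0.01 are assumptions made for illustration; only the 16 kHz PCM16 format and the 700 ms / 800 ms / 30 s timings come from the message above.

```typescript
// Hypothetical helpers: names, state shape and the RMS threshold are illustrative.

/** RMS level of a PCM16 frame, normalised to the range 0..1. */
function rmsLevel(frame: Int16Array): number {
  if (frame.length === 0) return 0;
  const sumSquares = frame.reduce((acc, s) => acc + (s / 32768) * (s / 32768), 0);
  return Math.sqrt(sumSquares / frame.length);
}

/** Silence gate: true when the frame is loud enough to be sent (threshold assumed). */
function shouldSendFrame(frame: Int16Array, threshold = 0.01): boolean {
  return rmsLevel(frame) >= threshold;
}

/** Prepend a canonical 44-byte RIFF/WAVE header to raw PCM16 bytes. */
function wrapPcm16AsWav(pcm: Uint8Array, sampleRate = 16000, channels = 1): Uint8Array {
  const blockAlign = channels * 2; // 2 bytes per 16-bit sample
  const out = new Uint8Array(44 + pcm.length);
  const view = new DataView(out.buffer);
  const ascii = (offset: number, text: string): void => {
    for (let i = 0; i < text.length; i++) view.setUint8(offset + i, text.charCodeAt(i));
  };
  ascii(0, "RIFF");
  view.setUint32(4, 36 + pcm.length, true); // RIFF chunk size = file size - 8
  ascii(8, "WAVE");
  ascii(12, "fmt ");
  view.setUint32(16, 16, true); // fmt chunk size for PCM
  view.setUint16(20, 1, true); // audio format 1 = linear PCM
  view.setUint16(22, channels, true);
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, sampleRate * blockAlign, true); // byte rate
  view.setUint16(32, blockAlign, true);
  view.setUint16(34, 16, true); // bits per sample
  ascii(36, "data");
  view.setUint32(40, pcm.length, true);
  out.set(pcm, 44);
  return out;
}

type SegmentAction = "wait" | "partial" | "final";

interface SegmentState {
  now: number; // current time, ms
  lastFrameAt: number; // arrival time of the most recent frame, ms
  lastPartialAt: number; // time of the last partial decode, ms
  sentenceStartedAt: number; // arrival time of the sentence's first frame, ms
  bufferedBytes: number; // PCM bytes buffered for the current sentence
  inFlight: boolean; // a decode is already running
}

/** Decide what to do on a timer tick, using the timings quoted in the commit message. */
function nextAction(s: SegmentState): SegmentAction {
  if (s.bufferedBytes === 0) return "wait";
  if (s.inFlight) return "wait"; // assumption: never start a decode while one is running
  if (s.now - s.lastFrameAt >= 800) return "final"; // pause ends the sentence
  if (s.now - s.sentenceStartedAt >= 30_000) return "final"; // hard cap on sentence length
  if (s.now - s.lastPartialAt >= 700) return "partial";
  return "wait";
}
```

In this sketch a caller that receives `"final"` would copy and clear the sentence buffer before starting the decode, which matches the ordering the commit message describes for keeping the next sentence's frames out of the finalized one.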