
Digital Human Generator


Tightly synchronizes the digital human's video with its audio to produce natural, fluent lip sync.

Preview

Offline: gradio_app_dh
Real-time: gradio_app_dh_streaming

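Examples

Offline mode

Video- and audio-driven: supports cross-language use, lip sync, watermarking, super-resolution, and ping-pong (forward-and-reverse) looping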

Original 1080P
Original 480P

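Real-time mode

Camera- and microphone-driven: supports cross-language use, lip sync, real-time interruption, real-time insertion, and real-time switching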

gradio_app_dh_streaming

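Extensions

AI digital human assisted live streaming: the digital human interacts with live-room messages in real time. The extension provides live stream settings, real-time message replies, voice template cloning, and other features that help streamers produce high-quality live content with little effort.

Live settings: gradio_app_dh_liveio1
Real-time messages: gradio_app_dh_liveio2
Message replies: gradio_app_dh_liveio3
Voice templates: gradio_app_dh_liveio4
Real-time integration (new): gradio_app_dh_liveio5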

📅 Planned support

💡 Preview modes (new)

Real-time mode supports the following preview modes; choose one according to your use case:

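| Mode | Description | Supported video source |
|---|---|---|
| obs | Pushes directly to the OBS virtual camera; no display window is needed | Video |
| cv2 | Local pop-up preview window (default) | Camera / video |
| ffplay | Pop-up preview window using ffplay; requires FFmpeg to be installed | Camera / video |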
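
As a minimal sketch of how the table above could be encoded, the following Python snippet checks whether a preview mode is compatible with a given video source. It covers only the three modes listed. The names `PREVIEW_MODES`, `DEFAULT_MODE`, and `check_preview_mode`, and the source labels `"camera"` and `"video"`, are illustrative assumptions and are not part of the application's actual API.

```python
# Hypothetical encoding of the preview-mode table; all names are illustrative.
PREVIEW_MODES = {
    "obs": {"video"},               # OBS virtual camera: video sources only
    "cv2": {"camera", "video"},     # local pop-up window (default)
    "ffplay": {"camera", "video"},  # ffplay pop-up window, requires FFmpeg
}

DEFAULT_MODE = "cv2"


def check_preview_mode(source: str, mode: str = DEFAULT_MODE) -> bool:
    """Return True if `mode` supports `source`; raise on an unknown mode."""
    if mode not in PREVIEW_MODES:
        raise ValueError(f"unknown preview mode: {mode!r}")
    return source in PREVIEW_MODES[mode]


if __name__ == "__main__":
    print(check_preview_mode("camera", "obs"))  # camera + obs is not listed
    print(check_preview_mode("video", "obs"))
    print(check_preview_mode("camera"))         # falls back to cv2
```

Keeping the restriction in a lookup table means a new mode can be added by editing data rather than control flow.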

Workflow

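⚡ Performance reference (new)

The following is a log excerpt from real-time digital human inference at 30 fps. It is for reference only; actual performance may vary with device configuration and other factors.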
[DH] -> Enter audio path or (!path|!m|b|c|q): Audio input: D:\Projects\creator\creator-box\webapp\upload\video_product.wav [Interrupt]
[DH] 2026-05-11 14:17:39.189 | INFO  1364  interface_streaming_v2.py:579 - Feeding audio #0: source=AudioSource.FILE, path=D:\Projects\creator\creator-box\webapp\upload\video_product.wav, interrupt=True
[DH] 2026-05-11 14:17:39.189 | INFO  1364  interface_streaming_v2.py:594 - Audio task #0 set as priority (interrupt mode), pending queue untouched
[DH] -> Enter audio path or (!path|!m|b|c|q): 2026-05-11 14:17:39.365 | INFO  1332  interface_streaming_v2.py:635 - Processing audio task #0 (interrupt=True)
[DH] 2026-05-11 14:17:39.365 | INFO  1332  interface_streaming_v2.py:672 - Audio #0: Loading audio file: D:\Projects\creator\creator-box\webapp\upload\video_product.wav
[DH] 2026-05-11 14:17:40.782 | INFO  1332  interface_streaming_v2.py:678 - Audio #0: Loaded: samples=489291, duration=30.581s, sr=16000
[DH] 2026-05-11 14:17:40.782 | INFO  1332  interface_streaming_v2.py:687 - Audio #0: Precomputing audio features (fps=30.0) ...
[DH] 2026-05-11 14:17:41.012 | INFO  1332  features.py:44 - Wenet features cached: key=wenet_feat:e31898fd38810..., shape=(784, 256)
[DH] 2026-05-11 14:17:41.014 | INFO  1332  features.py:62 - Audio features precomputation completed: total_frames=917, feat_shape=(20, 256), audio_duration=30.581s, fps=30.0
[DH] 2026-05-11 14:17:41.014 | INFO  1332  interface_streaming_v2.py:696 - Audio #0: Audio feature precomputation completed: 917 frames
[DH] 2026-05-11 14:17:41.015 | INFO  1332  interface_streaming_v2.py:713 - Audio #0: Sync thread started
[DH] 2026-05-11 14:17:41.015 | INFO  1660  interface_streaming_v2.py:837 - Audio #0: sync_worker_file started: fps=30.0, total_features=917
[DH] 2026-05-11 14:17:41.018 | INFO  1332  interface_streaming_v2.py:722 - Audio #0: Audio playback thread started
[DH] 2026-05-11 14:17:42.029 | INFO  2164  interface_streaming_v2.py:414 - [STATS] phase=steady, n=30, infer=29.74ms, av_gap=-75.33ms, display=0.15ms, skipped=0
[DH] 2026-05-11 14:17:43.053 | INFO  2164  interface_streaming_v2.py:414 - [STATS] phase=steady, n=30, infer=20.69ms, av_gap=12.22ms, display=0.25ms, skipped=1
[DH] 2026-05-11 14:17:44.080 | INFO  2164  interface_streaming_v2.py:414 - [STATS] phase=steady, n=30, infer=20.76ms, av_gap=7.11ms, display=0.03ms, skipped=1
[DH] 2026-05-11 14:17:45.101 | INFO  2164  interface_streaming_v2.py:414 - [STATS] phase=steady, n=30, infer=20.94ms, av_gap=8.67ms, display=0.08ms, skipped=0
[DH] 2026-05-11 14:17:46.132 | INFO  2164  interface_streaming_v2.py:414 - [STATS] phase=steady, n=30, infer=20.81ms, av_gap=12.22ms, display=0.13ms, skipped=1
[DH] 2026-05-11 14:17:47.156 | INFO  2164  interface_streaming_v2.py:414 - [STATS] phase=steady, n=30, infer=21.02ms, av_gap=11.33ms, display=0.10ms, skipped=1
[DH] 2026-05-11 14:17:48.177 | INFO  2164  interface_streaming_v2.py:414 - [STATS] phase=steady, n=30, infer=21.23ms, av_gap=7.33ms, display=0.03ms, skipped=1
[DH] 2026-05-11 14:17:49.197 | INFO  2164  interface_streaming_v2.py:414 - [STATS] phase=steady, n=30, infer=21.00ms, av_gap=8.67ms, display=0.02ms, skipped=0
[DH] 2026-05-11 14:17:50.229 | INFO  2164  interface_streaming_v2.py:414 - [STATS] phase=steady, n=30, infer=21.90ms, av_gap=10.22ms, display=0.28ms, skipped=1
[DH] 2026-05-11 14:17:51.251 | INFO  2164  interface_streaming_v2.py:414 - [STATS] phase=steady, n=30, infer=21.60ms, av_gap=11.56ms, display=0.02ms, skipped=1
[DH] 2026-05-11 14:17:52.283 | INFO  2164  interface_streaming_v2.py:414 - [STATS] phase=steady, n=30, infer=21.82ms, av_gap=9.11ms, display=0.57ms, skipped=1
[DH] 2026-05-11 14:17:53.306 | INFO  2164  interface_streaming_v2.py:414 - [STATS] phase=steady, n=30, infer=21.36ms, av_gap=7.33ms, display=0.49ms, skipped=0
[DH] 2026-05-11 14:17:54.327 | INFO  2164  interface_streaming_v2.py:414 - [STATS] phase=steady, n=30, infer=21.02ms, av_gap=10.89ms, display=0.52ms, skipped=1
[DH] 2026-05-11 14:17:55.353 | INFO  2164  interface_streaming_v2.py:414 - [STATS] phase=steady, n=30, infer=21.79ms, av_gap=9.33ms, display=0.05ms, skipped=1
[DH] 2026-05-11 14:17:56.384 | INFO  2164  interface_streaming_v2.py:414 - [STATS] phase=steady, n=30, infer=21.94ms, av_gap=9.33ms, display=0.49ms, skipped=1
[DH] 2026-05-11 14:17:57.409 | INFO  2164  interface_streaming_v2.py:414 - [STATS] phase=steady, n=30, infer=22.16ms, av_gap=4.44ms, display=0.49ms, skipped=1
[DH] 2026-05-11 14:17:58.430 | INFO  2164  interface_streaming_v2.py:414 - [STATS] phase=steady, n=30, infer=21.05ms, av_gap=14.00ms, display=0.62ms, skipped=0
[DH] 2026-05-11 14:17:59.464 | INFO  2164  interface_streaming_v2.py:414 - [STATS] phase=steady, n=30, infer=20.89ms, av_gap=11.33ms, display=0.56ms, skipped=1
[DH] 2026-05-11 14:18:00.492 | INFO  2164  interface_streaming_v2.py:414 - [STATS] phase=steady, n=30, infer=21.17ms, av_gap=6.89ms, display=0.03ms, skipped=1
[DH] 2026-05-11 14:18:01.517 | INFO  2164  interface_streaming_v2.py:414 - [STATS] phase=steady, n=30, infer=20.91ms, av_gap=9.78ms, display=0.57ms, skipped=1
[DH] 2026-05-11 14:18:02.544 | INFO  2164  interface_streaming_v2.py:414 - [STATS] phase=steady, n=30, infer=20.99ms, av_gap=7.11ms, display=0.53ms, skipped=1
[DH] 2026-05-11 14:18:03.576 | INFO  2164  interface_streaming_v2.py:414 - [STATS] phase=steady, n=30, infer=20.38ms, av_gap=11.33ms, display=0.64ms, skipped=1
[DH] 2026-05-11 14:18:04.607 | INFO  2164  interface_streaming_v2.py:414 - [STATS] phase=steady, n=30, infer=20.24ms, av_gap=7.78ms, display=0.72ms, skipped=1
[DH] 2026-05-11 14:18:05.634 | INFO  2164  interface_streaming_v2.py:414 - [STATS] phase=steady, n=30, infer=20.55ms, av_gap=12.67ms, display=0.62ms, skipped=0
[DH] 2026-05-11 14:18:06.666 | INFO  2164  interface_streaming_v2.py:414 - [STATS] phase=steady, n=30, infer=20.82ms, av_gap=10.00ms, display=0.57ms, skipped=1
[DH] 2026-05-11 14:18:07.695 | INFO  2164  interface_streaming_v2.py:414 - [STATS] phase=steady, n=30, infer=19.97ms, av_gap=12.22ms, display=0.07ms, skipped=1
[DH] 2026-05-11 14:18:08.720 | INFO  2164  interface_streaming_v2.py:414 - [STATS] phase=steady, n=30, infer=20.54ms, av_gap=10.00ms, display=0.45ms, skipped=1
[DH] 2026-05-11 14:18:09.752 | INFO  2164  interface_streaming_v2.py:414 - [STATS] phase=steady, n=30, infer=21.15ms, av_gap=9.78ms, display=0.57ms, skipped=1
[DH] 2026-05-11 14:18:10.775 | INFO  2164  interface_streaming_v2.py:414 - [STATS] phase=steady, n=30, infer=20.40ms, av_gap=7.78ms, display=0.62ms, skipped=1
[DH] 2026-05-11 14:18:11.899 | INFO  7332  interface_streaming_v2.py:814 - Audio #0: Playback finished: played_sec=30.581, total_sec=30.581, interrupted=False
[DH] 2026-05-11 14:18:11.899 | INFO  1332  interface_streaming_v2.py:728 - Audio #0: Playback finished
[DH] [OK] Audio #0 playback completed
[DH] [COMPLETE] id=0 source=AudioSource.FILE path=D:\Projects\creator\creator-box\webapp\upload\video_product.wav interrupt=True
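
The frame count reported in the log can be cross-checked from the sample count, the sample rate, and the frame rate. The snippet below is a small worked calculation using values copied from the log. Truncating to a whole number of frames is an assumption that happens to match the logged value; it is not a confirmed detail of the implementation.

```python
samples = 489291     # from "Loaded: samples=489291"
sample_rate = 16000  # from "sr=16000"
fps = 30.0           # from "fps=30.0"

duration_s = samples / sample_rate
total_frames = int(duration_s * fps)  # assumption: truncate to whole frames

print(f"duration={duration_s:.3f}s, total_frames={total_frames}")
# prints: duration=30.581s, total_frames=917
```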

Metrics

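  • phase is the current phase (startup or steady state)
  • n is the number of frames processed per second (fixed at 30)
  • infer is the inference time per frame (lower is better)
  • av_gap is the audio-video alignment offset: a positive value means the video frame time leads the audio (video is "ahead"), and a negative value means the video trails the audio (video is "behind")
  • skipped is the number of skipped frames (when inference takes too long for frames to be output on time, some frames are skipped to stay in sync)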
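
To make these metrics easier to read across a longer run, the `[STATS]` lines can be parsed and compared against the per-frame time budget (1000 ms / 30 fps ≈ 33.3 ms). The following is a minimal sketch, assuming the line format stays exactly as in the log excerpt. The helpers `parse_stats`, `summarize`, and `describe_av_gap` are hypothetical and not part of the application.

```python
import re
from statistics import mean

# Pattern for the fields shown in the [STATS] lines of the log excerpt.
STATS_RE = re.compile(
    r"\[STATS\] phase=(?P<phase>\w+), n=(?P<n>\d+), "
    r"infer=(?P<infer>-?[\d.]+)ms, av_gap=(?P<av_gap>-?[\d.]+)ms, "
    r"display=(?P<display>-?[\d.]+)ms, skipped=(?P<skipped>\d+)"
)


def parse_stats(lines):
    """Extract the fields from every line that contains a [STATS] record."""
    rows = []
    for line in lines:
        m = STATS_RE.search(line)
        if m:
            rows.append({
                "phase": m["phase"],
                "n": int(m["n"]),
                "infer": float(m["infer"]),
                "av_gap": float(m["av_gap"]),
                "display": float(m["display"]),
                "skipped": int(m["skipped"]),
            })
    return rows


def summarize(rows, fps=30.0):
    """Compare the mean inference time with the per-frame budget at `fps`."""
    budget_ms = 1000.0 / fps
    mean_infer = mean(r["infer"] for r in rows)
    return {
        "frame_budget_ms": budget_ms,
        "mean_infer_ms": mean_infer,
        "within_budget": mean_infer < budget_ms,
        "total_skipped": sum(r["skipped"] for r in rows),
        "mean_av_gap_ms": mean(r["av_gap"] for r in rows),
    }


def describe_av_gap(av_gap_ms):
    """Apply the sign convention from the metric list."""
    if av_gap_ms > 0:
        return "video ahead of audio"
    if av_gap_ms < 0:
        return "video behind audio"
    return "in sync"


# Two lines copied from the log excerpt.
SAMPLE = [
    "[DH] 2026-05-11 14:17:42.029 | INFO  2164  interface_streaming_v2.py:414 - [STATS] phase=steady, n=30, infer=29.74ms, av_gap=-75.33ms, display=0.15ms, skipped=0",
    "[DH] 2026-05-11 14:17:43.053 | INFO  2164  interface_streaming_v2.py:414 - [STATS] phase=steady, n=30, infer=20.69ms, av_gap=12.22ms, display=0.25ms, skipped=1",
]

if __name__ == "__main__":
    rows = parse_stats(SAMPLE)
    print(summarize(rows))
    print(describe_av_gap(rows[0]["av_gap"]))
```

In the excerpt, steady-state inference sits around 20 to 22 ms per frame, which leaves headroom under the 33.3 ms budget.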

🎥 Video tutorial

Notes

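  • liveio requires preprocessed audio and video and a configured broadcast director. Real-time script generation is not yet supported; the SocketIO API can be used to build custom extensions
  • When using real-time mode, close any component applications you are not using to reduce inference time and improve performance
  • When using the Microsoft Edge browser, enable the setting at edge://settings/content/mediaAutoplay
  • Do not use this feature to process illegal or non-compliant content. Any use that infringes the privacy, copyright, or other lawful rights and interests of others is prohibited
  • The content above only demonstrates capabilities. The examples are sourced from the internet. If any content infringes your rights, please contact us to request its removal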