# KTV Synth Service

KTV/MTV video synthesis service using FFmpeg. Creates karaoke videos with dual audio tracks (accompaniment + original) and synchronized ASS subtitles.

## Overview

This service processes video clips, audio tracks, and subtitles to produce:

- **MTV (Music Television)**: Single audio track with original vocals and subtitles
- **KTV (Karaoke Television)**: Dual audio tracks - accompaniment (default) and original vocals

## Architecture

- **Framework**: ahserver + longtasks + Redis
- **Port**: 9084
- **Queue**: `ktv_synth`
- **Worker**: FFmpeg subprocess (no GPU required)

## Features

- Two-step FFmpeg synthesis pipeline
- ASS subtitle rendering with karaoke effects
- Dual audio track support with proper metadata
- Configurable video looping for scene clips
- 1920x1080 output resolution with Lanczos scaling
- Automatic duration calculation

## Installation

### Prerequisites

- Python 3.8+
- FFmpeg with libx264 and AAC support
- Redis server

### Setup

```bash
# Enter the service directory
cd /data/ymq/ktv-synth-service

# Ensure FFmpeg is installed
ffmpeg -version

# Ensure Redis is running
redis-cli ping
```

## Usage

### Starting the Service

```bash
./start.sh
```

The service will start on port 9084 and begin processing tasks from the Redis queue.

### Stopping the Service

```bash
./stop.sh
```

### Health Check

Visit `http://localhost:9084/app/health.dspy` or check the service status.

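The health check can also be scripted. The following is a minimal sketch using only the Python standard library. The helper names `health_url` and `is_healthy` are hypothetical, and treating an HTTP 200 response as "healthy" is an assumption, since the response body of `health.dspy` is not documented here.

```python
import http.client
import urllib.request


def health_url(host="localhost", port=9084):
    """Build the health-check URL for a given host and port."""
    return f"http://{host}:{port}/app/health.dspy"


def is_healthy(url, timeout=3.0):
    """Return True if the endpoint answers with HTTP 200 (assumed to mean healthy)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (OSError, http.client.HTTPException):
        # Connection refused, timeouts, and HTTP error statuses all end up here.
        return False


if __name__ == "__main__":
    url = health_url()
    print(url, "healthy" if is_healthy(url) else "unreachable or unhealthy")
```

A check like this could run after `./start.sh` in a deployment script to confirm the service is up before tasks are queued.
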
## API

### Task Payload

Submit tasks to the Redis queue `ktv_synth`:

```json
{
  "task_type": "synthesize",
  "video_files": [
    "/path/to/scene1.mp4",
    "/path/to/scene2.mp4",
    "/path/to/scene3.mp4"
  ],
  "original_audio": "/path/to/original.wav",
  "accompaniment": "/path/to/no_vocals.wav",
  "subtitle_path": "/path/to/subtitles.ass",
  "output_dir": "/tmp/ktv-synth-outputs",
  "title": "SongName",
  "duration": 240.5,
  "loops": 3,
  "output_modes": ["mtv", "ktv"]
}
```

### Parameters

- `video_files` (required): List of video file paths (scene clips to loop)
- `original_audio` (required): Path to original full audio with vocals
- `accompaniment` (required for KTV): Path to accompaniment track (no vocals)
- `subtitle_path` (required): Path to ASS subtitle file
- `output_dir` (optional): Output directory (default: `/tmp/ktv-synth-outputs`)
- `title` (optional): Song title for output naming (default: `output`)
- `duration` (optional): Target duration in seconds (auto-calculated if not provided)
- `loops` (optional): Number of video loops (auto-calculated if not provided)
- `output_modes` (optional): List of outputs to generate: `["mtv"]`, `["ktv"]`, or `["mtv", "ktv"]`

### Response

```json
{
  "mtv_path": "/tmp/ktv-synth-outputs/SongName_MTV.mp4",
  "ktv_path": "/tmp/ktv-synth-outputs/SongName_KTV.mp4",
  "mtv_size_mb": 125.45,
  "ktv_size_mb": 145.67,
  "duration": 240.5
}
```

## Technical Details

### Two-Step Synthesis Process

#### Step 1: Create Silent Looped Video Track

Concatenates and loops scene clips to match target duration:

```bash
ffmpeg -y -f concat -safe 0 -stream_loop {loops} -i {concat_list} \
  -t {duration} -an -c:v libx264 -preset fast -crf 23 {temp_video}
```

#### Step 2a: MTV Synthesis (Single Track)

Combines video with original audio and ASS subtitles:

```bash
ffmpeg -y -i {temp_video} -i {original_audio} \
  -map 0:v -map 1:a \
  -vf "ass={subtitle_path},scale=1920:1080:flags=lanczos" \
  -c:v libx264 -preset fast -crf 23 \
  -c:a aac -b:a 192k \
  {mtv_output}
```

#### Step 2b: KTV Synthesis (Dual Track)

Creates dual audio tracks with accompaniment as default:

```bash
ffmpeg -y -i {temp_video} -i {accompaniment} -i {original_audio} \
  -map 0:v -map 1:a -map 2:a \
  -vf "ass={subtitle_path},scale=1920:1080:flags=lanczos" \
  -c:v libx264 -preset fast -crf 23 \
  -c:a:0 aac -b:a:0 192k -metadata:s:a:0 handler_name="伴奏(Accompaniment)" \
  -c:a:1 aac -b:a:1 192k -metadata:s:a:1 handler_name="原唱(Original)" \
  -disposition:a:0 default -disposition:a:1 0 \
  {ktv_output}
```

### Video Encoding Settings

- **Codec**: H.264 (libx264)
- **Preset**: fast
- **CRF**: 23 (balanced quality/size)
- **Resolution**: 1920x1080
- **Scaling**: Lanczos (high quality)

### Audio Encoding Settings

- **Codec**: AAC
- **Bitrate**: 192 kbps
- **Tracks**: 1 (MTV) or 2 (KTV)

### KTV Audio Track Metadata

- **Track 0**: Accompaniment (default playback)
  - Handler: "伴奏(Accompaniment)"
  - Disposition: default
- **Track 1**: Original with vocals
  - Handler: "原唱(Original)"
  - Disposition: 0 (not default)

## Configuration

Edit `conf/config.json`:

```json
{
  "port": 9084,
  "queue": "ktv_synth",
  "filesroot": "/tmp/ktv-synth-outputs",
  "redis_url": "redis://127.0.0.1:6379",
  "worker_cnt": 1,
  "stuck_seconds": 1800,
  "max_age_hours": 24
}
```

## Troubleshooting

### FFmpeg Errors

Check FFmpeg installation and codec support:

```bash
ffmpeg -codecs | grep libx264
ffmpeg -codecs | grep aac
```

### Redis Connection

Verify Redis is running:

```bash
redis-cli ping
```

### Permission Issues

Ensure the service has write access to output directories:

```bash
chmod 755 /tmp/ktv-synth-outputs
```

### High Memory Usage

Reduce worker count in `conf/config.json`:

```json
{
  "worker_cnt": 1
}
```

## Performance

- **MTV Generation**: ~2-3x real-time (240s video in ~80-120s)
- **KTV Generation**: ~2-3x real-time
- **Concurrent Tasks**: Limited by `worker_cnt` (default: 1)
- **Memory**: ~500MB-1GB per worker (depends on video resolution)

## Integration

This service integrates with:

- **demucs-service**: Audio source separation (provides accompaniment tracks)
- **whisper-service**: Subtitle generation (provides ASS files)
- **wan22-service**: Video generation (provides scene clips)

## License

Internal use only.

## Support

For issues or questions, contact the development team.
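
## Example: Validating a Task Payload

The parameter rules listed under API can be checked on the client side before a task is queued. The sketch below is illustrative only and is not the service's own validation code. The function names `normalize_payload` and `output_paths` are hypothetical. Defaulting `output_modes` to both outputs is an assumption, since no default is documented. The `{title}_MTV.mp4` / `{title}_KTV.mp4` naming is inferred from the response example.

```python
from pathlib import PurePosixPath

# Defaults as documented in the Parameters list.
DEFAULTS = {"output_dir": "/tmp/ktv-synth-outputs", "title": "output"}
ALWAYS_REQUIRED = ("video_files", "original_audio", "subtitle_path")
VALID_MODES = {"mtv", "ktv"}


def normalize_payload(payload):
    """Return a copy of the payload with defaults applied, or raise ValueError."""
    task = dict(DEFAULTS)
    task.update(payload)
    # Assumption: generate both outputs when output_modes is omitted.
    modes = task.setdefault("output_modes", ["mtv", "ktv"])
    if not modes or set(modes) - VALID_MODES:
        raise ValueError(f"output_modes must be a non-empty subset of {sorted(VALID_MODES)}")
    required = list(ALWAYS_REQUIRED)
    if "ktv" in modes:
        # The accompaniment track is only required for KTV output.
        required.append("accompaniment")
    missing = [key for key in required if not task.get(key)]
    if missing:
        raise ValueError("missing required fields: " + ", ".join(missing))
    return task


def output_paths(task):
    """Map each requested mode to its expected output file (naming is inferred)."""
    out_dir = PurePosixPath(task["output_dir"])
    return {
        mode: str(out_dir / f"{task['title']}_{mode.upper()}.mp4")
        for mode in task["output_modes"]
    }


if __name__ == "__main__":
    task = normalize_payload({
        "video_files": ["/path/to/scene1.mp4"],
        "original_audio": "/path/to/original.wav",
        "accompaniment": "/path/to/no_vocals.wav",
        "subtitle_path": "/path/to/subtitles.ass",
        "title": "SongName",
    })
    print(output_paths(task))
```

Rejecting an incomplete payload before it reaches the `ktv_synth` queue avoids spending a worker slot on a task that FFmpeg would fail on anyway.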