# KTV Synth Service

KTV/MTV video synthesis service using FFmpeg. Creates karaoke videos with dual audio tracks (accompaniment + original) and synchronized ASS subtitles.

## Overview

This service processes video clips, audio tracks, and subtitles to produce:

- **MTV (Music Television)**: Single audio track with original vocals and subtitles
- **KTV (Karaoke Television)**: Dual audio tracks - accompaniment (default) and original vocals

## Architecture

- **Framework**: ahserver + longtasks + Redis
- **Port**: 9084
- **Queue**: `ktv_synth`
- **Worker**: FFmpeg subprocess (no GPU required)

## Features

- Two-step FFmpeg synthesis pipeline
- ASS subtitle rendering with karaoke effects
- Dual audio track support with proper metadata
- Configurable video looping for scene clips
- 1920x1080 output resolution with Lanczos scaling
- Automatic duration calculation

## Installation

### Prerequisites

- Python 3.8+
- FFmpeg with libx264 and AAC support
- Redis server

### Setup

```bash
# Change to the service directory
cd /data/ymq/ktv-synth-service

# Ensure FFmpeg is installed
ffmpeg -version

# Ensure Redis is running
redis-cli ping
```

## Usage

### Starting the Service

```bash
./start.sh
```

The service will start on port 9084 and begin processing tasks from the Redis queue.

### Stopping the Service

```bash
./stop.sh
```

### Health Check

Request `http://localhost:9084/app/health.dspy` in a browser or with an HTTP client to check the service status.

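For scripted monitoring, the check can be sketched in Python. This is a minimal sketch that assumes the endpoint answers with HTTP 200 when the service is healthy; the response body format is not documented here, so it is ignored. The helper name `is_healthy` is hypothetical.

```python
import urllib.error
import urllib.request

HEALTH_URL = "http://localhost:9084/app/health.dspy"


def is_healthy(url: str = HEALTH_URL, timeout: float = 3.0) -> bool:
    """Return True if the endpoint answers with HTTP 200 (assumed to mean healthy)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status.
        return False


if __name__ == "__main__":
    print("healthy" if is_healthy() else "unreachable or unhealthy")
```
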
## API

### Task Payload

Submit tasks to the Redis queue `ktv_synth`:

```json
{
  "task_type": "synthesize",
  "video_files": [
    "/path/to/scene1.mp4",
    "/path/to/scene2.mp4",
    "/path/to/scene3.mp4"
  ],
  "original_audio": "/path/to/original.wav",
  "accompaniment": "/path/to/no_vocals.wav",
  "subtitle_path": "/path/to/subtitles.ass",
  "output_dir": "/tmp/ktv-synth-outputs",
  "title": "SongName",
  "duration": 240.5,
  "loops": 3,
  "output_modes": ["mtv", "ktv"]
}
```

### Parameters

- `video_files` (required): List of video file paths (scene clips to loop)
- `original_audio` (required): Path to original full audio with vocals
- `accompaniment` (required for KTV): Path to accompaniment track (no vocals)
- `subtitle_path` (required): Path to ASS subtitle file
- `output_dir` (optional): Output directory (default: `/tmp/ktv-synth-outputs`)
- `title` (optional): Song title for output naming (default: `output`)
- `duration` (optional): Target duration in seconds (auto-calculated if not provided)
- `loops` (optional): Number of video loops (auto-calculated if not provided)
- `output_modes` (optional): List of outputs to generate: `["mtv"]`, `["ktv"]`, or `["mtv", "ktv"]`

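The rules above can be checked on the client side before a task is queued. The following is a minimal sketch, not the service's own validation: `build_payload` is a hypothetical helper, and because the default for `output_modes` is not documented, the accompaniment check only runs when `output_modes` is given explicitly.

```python
import json

REQUIRED = ("video_files", "original_audio", "subtitle_path")


def build_payload(**fields) -> dict:
    """Build a task payload and check the documented requirements (hypothetical helper)."""
    payload = {
        "task_type": "synthesize",
        "output_dir": "/tmp/ktv-synth-outputs",  # documented default
        "title": "output",  # documented default
    }
    payload.update(fields)
    missing = [key for key in REQUIRED if not payload.get(key)]
    # The accompaniment track is only required when a KTV output is requested.
    if "ktv" in payload.get("output_modes", []) and not payload.get("accompaniment"):
        missing.append("accompaniment")
    if missing:
        raise ValueError("missing required fields: " + ", ".join(missing))
    return payload


if __name__ == "__main__":
    task = build_payload(
        video_files=["/path/to/scene1.mp4"],
        original_audio="/path/to/original.wav",
        subtitle_path="/path/to/subtitles.ass",
        output_modes=["mtv"],
    )
    print(json.dumps(task, ensure_ascii=False, indent=2))
```

How the serialized payload is pushed onto the `ktv_synth` queue depends on the longtasks framework and is not shown here.
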
### Response

```json
{
  "mtv_path": "/tmp/ktv-synth-outputs/SongName_MTV.mp4",
  "ktv_path": "/tmp/ktv-synth-outputs/SongName_KTV.mp4",
  "mtv_size_mb": 125.45,
  "ktv_size_mb": 145.67,
  "duration": 240.5
}
```

## Technical Details

### Two-Step Synthesis Process

#### Step 1: Create Silent Looped Video Track

Concatenates and loops scene clips to match target duration:

```bash
ffmpeg -y -f concat -safe 0 -stream_loop {loops} -i {concat_list} \
  -t {duration} -an -c:v libx264 -preset fast -crf 23 {temp_video}
```

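The inputs to this command can be prepared along the following lines. This is a sketch under stated assumptions: the helper names are hypothetical, and the way the service itself derives `loops` is not documented here. It relies on two FFmpeg behaviours: the concat demuxer reads one `file '<path>'` line per clip, and `-stream_loop N` repeats the input N additional times after the first pass.

```python
import math
from typing import List


def concat_list_text(video_files: List[str]) -> str:
    """Render the list file read by FFmpeg's concat demuxer, one `file` line per clip."""
    lines = []
    for path in video_files:
        # Inside single quotes, a literal ' is written as '\''
        escaped = path.replace("'", "'\\''")
        lines.append(f"file '{escaped}'")
    return "\n".join(lines) + "\n"


def extra_loops(duration: float, clips_total: float) -> int:
    """Repetitions needed after the first pass so the looped clips cover `duration`."""
    if clips_total <= 0:
        raise ValueError("clips_total must be positive")
    return max(0, math.ceil(duration / clips_total) - 1)


def step1_command(concat_list: str, loops: int, duration: float, temp_video: str) -> List[str]:
    """Argument list equivalent to the Step 1 command shown above."""
    return [
        "ffmpeg", "-y", "-f", "concat", "-safe", "0",
        "-stream_loop", str(loops), "-i", concat_list,
        "-t", str(duration), "-an",
        "-c:v", "libx264", "-preset", "fast", "-crf", "23",
        temp_video,
    ]


if __name__ == "__main__":
    print(concat_list_text(["/path/to/scene1.mp4", "/path/to/scene2.mp4"]), end="")
    # 30 s of clips looped to cover a 240.5 s song
    print(step1_command("concat.txt", extra_loops(240.5, 30.0), 240.5, "temp.mp4"))
```

For example, 30 seconds of scene clips and a 240.5 second target need nine passes in total, so `extra_loops` returns 8.
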
#### Step 2a: MTV Synthesis (Single Track)

Combines video with original audio and ASS subtitles:

```bash
ffmpeg -y -i {temp_video} -i {original_audio} \
  -map 0:v -map 1:a \
  -vf "ass={subtitle_path},scale=1920:1080:flags=lanczos" \
  -c:v libx264 -preset fast -crf 23 \
  -c:a aac -b:a 192k \
  {mtv_output}
```

#### Step 2b: KTV Synthesis (Dual Track)

Creates dual audio tracks with accompaniment as default:

```bash
ffmpeg -y -i {temp_video} -i {accompaniment} -i {original_audio} \
  -map 0:v -map 1:a -map 2:a \
  -vf "ass={subtitle_path},scale=1920:1080:flags=lanczos" \
  -c:v libx264 -preset fast -crf 23 \
  -c:a:0 aac -b:a:0 192k -metadata:s:a:0 handler_name="伴奏(Accompaniment)" \
  -c:a:1 aac -b:a:1 192k -metadata:s:a:1 handler_name="原唱(Original)" \
  -disposition:a:0 default -disposition:a:1 0 \
  {ktv_output}
```

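When the KTV command is launched from Python, passing the arguments as a list avoids shell quoting of the non-ASCII handler names. The sketch below mirrors the command above; `ktv_command` is a hypothetical helper, and it assumes `subtitle_path` contains no characters that need escaping inside an FFmpeg filtergraph (such as `:` or `,`).

```python
from typing import List

VIDEO_FILTER = "ass={subs},scale=1920:1080:flags=lanczos"


def ktv_command(temp_video: str, accompaniment: str, original_audio: str,
                subtitle_path: str, ktv_output: str) -> List[str]:
    """Argument list for the dual-track KTV synthesis (accompaniment is the default track)."""
    return [
        "ffmpeg", "-y",
        "-i", temp_video, "-i", accompaniment, "-i", original_audio,
        "-map", "0:v", "-map", "1:a", "-map", "2:a",
        # Assumption: subtitle_path needs no filtergraph escaping.
        "-vf", VIDEO_FILTER.format(subs=subtitle_path),
        "-c:v", "libx264", "-preset", "fast", "-crf", "23",
        "-c:a:0", "aac", "-b:a:0", "192k",
        "-metadata:s:a:0", "handler_name=伴奏(Accompaniment)",
        "-c:a:1", "aac", "-b:a:1", "192k",
        "-metadata:s:a:1", "handler_name=原唱(Original)",
        "-disposition:a:0", "default", "-disposition:a:1", "0",
        ktv_output,
    ]


if __name__ == "__main__":
    print(ktv_command("temp.mp4", "no_vocals.wav", "original.wav", "subtitles.ass", "SongName_KTV.mp4"))
```

The resulting list can be handed to `subprocess.run(cmd, check=True)`, which raises an exception if FFmpeg exits with a non-zero status.
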
### Video Encoding Settings

- **Codec**: H.264 (libx264)
- **Preset**: fast
- **CRF**: 23 (balanced quality/size)
- **Resolution**: 1920x1080
- **Scaling**: Lanczos (high quality)

### Audio Encoding Settings

- **Codec**: AAC
- **Bitrate**: 192 kbps
- **Tracks**: 1 (MTV) or 2 (KTV)

### KTV Audio Track Metadata

- **Track 0**: Accompaniment (default playback)
  - Handler: "伴奏(Accompaniment)"
  - Disposition: default
- **Track 1**: Original with vocals
  - Handler: "原唱(Original)"
  - Disposition: 0 (not default)

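The track layout of a finished KTV file can be inspected with `ffprobe -v error -select_streams a -show_streams -of json <file>`. The sketch below parses that JSON output; `audio_tracks` is a hypothetical helper, and the sample JSON is a hand-written, trimmed illustration rather than captured output.

```python
import json
from typing import List, Tuple


def audio_tracks(ffprobe_json: str) -> List[Tuple[str, bool]]:
    """Return (handler_name, is_default) for each audio stream in ffprobe's JSON output."""
    streams = json.loads(ffprobe_json).get("streams", [])
    return [
        (
            stream.get("tags", {}).get("handler_name", ""),
            stream.get("disposition", {}).get("default", 0) == 1,
        )
        for stream in streams
        if stream.get("codec_type") == "audio"
    ]


# Hand-written sample shaped like ffprobe's JSON output (illustration only)
SAMPLE = """
{"streams": [
  {"index": 1, "codec_type": "audio", "disposition": {"default": 1},
   "tags": {"handler_name": "伴奏(Accompaniment)"}},
  {"index": 2, "codec_type": "audio", "disposition": {"default": 0},
   "tags": {"handler_name": "原唱(Original)"}}
]}
"""

if __name__ == "__main__":
    for name, is_default in audio_tracks(SAMPLE):
        print(name, "(default)" if is_default else "")
```
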
## Configuration

Edit `conf/config.json`:

```json
{
  "port": 9084,
  "queue": "ktv_synth",
  "filesroot": "/tmp/ktv-synth-outputs",
  "redis_url": "redis://127.0.0.1:6379",
  "worker_cnt": 1,
  "stuck_seconds": 1800,
  "max_age_hours": 24
}
```

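After editing the file, a quick sanity check can catch typos before the service is restarted. This is a sketch only: the service reads its configuration through its own framework, `config_problems` is a hypothetical helper, and the expected types are inferred from the example values above.

```python
import json
from typing import List

# Key types inferred from the example config above (an assumption, not a schema)
EXPECTED_TYPES = {
    "port": int,
    "queue": str,
    "filesroot": str,
    "redis_url": str,
    "worker_cnt": int,
    "stuck_seconds": int,
    "max_age_hours": int,
}


def config_problems(text: str) -> List[str]:
    """Return a list of human-readable problems; an empty list means the check passed."""
    cfg = json.loads(text)
    problems = []
    for key, expected in EXPECTED_TYPES.items():
        if key not in cfg:
            problems.append(f"missing key: {key}")
        elif not isinstance(cfg[key], expected):
            problems.append(f"{key} should be {expected.__name__}")
    return problems


if __name__ == "__main__":
    # Example input with a missing "port" and a string where an integer is expected
    print(config_problems('{"queue": "ktv_synth", "worker_cnt": "1"}'))
```
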
## Troubleshooting

### FFmpeg Errors

Check FFmpeg installation and codec support:

```bash
ffmpeg -codecs | grep libx264
ffmpeg -codecs | grep aac
```

### Redis Connection

Verify Redis is running:

```bash
redis-cli ping
```

### Permission Issues

Ensure the user running the service has write access to the output directory. Mode 755 grants write access only to the directory owner, so the directory should be owned by that user:

```bash
chmod 755 /tmp/ktv-synth-outputs
```

### High Memory Usage

Reduce worker count in `conf/config.json`:

```json
{
  "worker_cnt": 1
}
```

## Performance

- **MTV Generation**: ~2-3x faster than real time (240s video in ~80-120s)
- **KTV Generation**: ~2-3x faster than real time
- **Concurrent Tasks**: Limited by `worker_cnt` (default: 1)
- **Memory**: ~500MB-1GB per worker (depends on video resolution)

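The arithmetic behind the MTV figure can be reproduced with a small helper. This is only a rough planning aid: `estimated_seconds` is a hypothetical function that applies the 2-3x range quoted above to one output, and real timings depend on hardware and source material.

```python
from typing import Tuple


def estimated_seconds(duration: float, speed_low: float = 2.0,
                      speed_high: float = 3.0) -> Tuple[float, float]:
    """Return (fastest, slowest) wall-clock estimates in seconds for one output."""
    if duration < 0 or speed_low <= 0 or speed_high <= 0:
        raise ValueError("duration must be non-negative and speeds positive")
    return duration / speed_high, duration / speed_low


if __name__ == "__main__":
    fastest, slowest = estimated_seconds(240)
    print(f"{fastest:.0f}-{slowest:.0f} s")  # prints "80-120 s"
```
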
## Integration

This service integrates with:

- **demucs-service**: Audio source separation (provides accompaniment tracks)
- **whisper-service**: Subtitle generation (provides ASS files)
- **wan22-service**: Video generation (provides scene clips)

## License

Internal use only.

## Support

For issues or questions, contact the development team.