# Services running on the GPUs

## clip.service

Multimedia embedding service: provides embeddings for text, images, video, and audio.

```
[Unit]
Wants=systemd-networkd.service

[Service]
WorkingDirectory=/share/run/clip
ExecStart=/share/run/clip/start.sh
ExecStop=/share/run/clip/stop.sh
StandardOutput=append:/var/log/clip/clip.log
StandardError=append:/var/log/clip/clip.log
SyslogIdentifier=clip

[Install]
WantedBy=multi-user.target
```

### HTTP service

https://embedding.opencomputing.net:10443/mme

### Model

* Model name: laion/CLIP-ViT-B-32-laion2B-s34B-b79K
* Model path:

### Run script

```
#!/bin/bash
CUDA_VISIBLE_DEVICES=4 /share/vllm-0.8.5/bin/python -m llmengine.mm_embedding -p 8882
```

## entities.service

Entity extraction service.

```
[Unit]
Wants=systemd-networkd.service

[Service]
WorkingDirectory=/share/run/entities
ExecStart=/share/run/entities/start.sh
ExecStop=/share/run/entities/stop.sh
StandardOutput=append:/var/log/entities/entities.log
StandardError=append:/var/log/entities/entities.log
SyslogIdentifier=entities

[Install]
WantedBy=multi-user.target
```

### HTTP service

https://entities.opencomputing.net:10443

### Model

* Path: /share/models/LTP/small

### Run script

```
#!/bin/bash
CUDA_VISIBLE_DEVICES=7 /share/vllm-0.8.5/bin/python -m llmengine.entity -p 9990 /share/models/LTP/small
```

## f5tts.service

Text-to-speech service (F5-TTS).

```
[Unit]
Wants=systemd-networkd.service

[Service]
User=ymq
Group=ymq
WorkingDirectory=/share/ymq/run/f5tts
Type=forking
ExecStart=/share/ymq/run/f5tts/start.sh
ExecStop=/share/ymq/run/f5tts/stop.sh
StandardOutput=append:/var/log/f5tts/f5tts.log
StandardError=append:/var/log/f5tts/f5tts.log
SyslogIdentifier=f5tts

[Install]
WantedBy=multi-user.target
```

### HTTP service

https://tts.opencomputing.net:10443

### Model

* Model path: /share/models/SWivid/F5-TTS/F5TTS_v1_Base/model_1250000.safetensors

### Repository

https://git.opencomputing.cn/yumoqing/f5tts

### Run script

```
#!/usr/bin/bash
echo start 3 instances for f5tts engine
rundir=/share/ymq/run/f5tts
CUDA_VISIBLE_DEVICES=6 ${rundir}/f5tts.env/bin/python ${rundir}/app/f5tts.py -w ${rundir} -p 9995 &
CUDA_VISIBLE_DEVICES=6 ${rundir}/f5tts.env/bin/python ${rundir}/app/f5tts.py -w ${rundir} -p 9995 &
CUDA_VISIBLE_DEVICES=6 ${rundir}/f5tts.env/bin/python ${rundir}/app/f5tts.py -w ${rundir} -p 9995 &
```

## fastwhisper.service

ASR (speech recognition) model service.

```
[Unit]
Wants=systemd-networkd.service

[Service]
User=ymq
Group=ymq
Type=forking
WorkingDirectory=/d/ymq/run/fastwhisper
ExecStart=/d/ymq/run/fastwhisper/start.sh
ExecStop=/d/ymq/run/fastwhisper/stop.sh
StandardOutput=append:/d/ymq/run/fastwhisper/logs/fastwhisper.log
StandardError=append:/d/ymq/run/fastwhisper/logs/fastwhisper.log
SyslogIdentifier=fastwhisper

[Install]
WantedBy=multi-user.target
```

### HTTP service

https://asr.opencomputing.net:10443/fw

### Model

* Path: /data/ymq/models/deepdml/faster-whisper-large-v3-turbo-ct2

### Repository

https://git.opencomputing.cn/yumoqing/fastwhisper

### Run script

```
#!/bin/bash
cd /d/ymq/run/fastwhisper
/d/ymq/run/fastwhisper/py3/bin/python /d/ymq/run/fastwhisper/app/fastwhisper.py -w /d/ymq/run/fastwhisper &
exit 0
```

## fvlm.service

FastVLM model service.

```
[Unit]
Wants=systemd-networkd.service

[Service]
User=ymq
Group=ymq
Type=forking
WorkingDirectory=/share/ymq/run/fvlm
# ExecStart=/share/ymq/run/fvlm/fvlm.env/bin/python app/fastvlm.py -p 9994
ExecStart=/share/ymq/run/fvlm/start.sh
ExecStop=/share/ymq/run/fvlm/stop.sh
StandardOutput=append:/var/log/fvlm/fvlm.log
StandardError=append:/var/log/fvlm/fvlm.log
SyslogIdentifier=fvlm

[Install]
WantedBy=multi-user.target
```

### HTTP service

https://fastvlm.opencomputing.net:10443

### Model

* Path: /share/models/apple/llava-fastvithd_0.5b_stage3

### Repository

https://git.kaiyuancloud.cn/yumoqing/fvlm

### Run script

```
#!/usr/bin/bash
CUDA_VISIBLE_DEVICES=6 /share/ymq/run/fvlm/fvlm.env/bin/python app/fastvlm.py -p 9994 &
```

## m2m.service

Multilingual translation service.

```
[Unit]
Wants=systemd-networkd.service

[Service]
WorkingDirectory=/share/run/m2m
ExecStart=/share/run/m2m/start.sh
ExecStop=/share/run/m2m/stop.sh
StandardOutput=append:/var/log/m2m/m2m.log
StandardError=append:/var/log/m2m/m2m.log
SyslogIdentifier=m2m

[Install]
WantedBy=multi-user.target
```

### HTTP service

https://t2t.opencomputing.net:10443/m2m

### Model

* Path: /share/models/facebook/m2m100_1.2B

### Run script

```
#!/bin/bash
/share/vllm-0.8.5/bin/python -m llmengine.m2m -p 8883 /share/models/facebook/m2m100_1.2B
```

## neo4j.service

Knowledge-graph database.

```
[Unit]
Wants=systemd-networkd.service

[Service]
Type=forking
WorkingDirectory=/share/run/neo4j
ExecStart=/share/run/neo4j/start.sh
ExecStop=/share/run/neo4j/stop.sh
StandardOutput=append:/var/log/neo4j/neo4j.log
StandardError=append:/var/log/neo4j/neo4j.log
SyslogIdentifier=neo4j
TimeoutStartSec=300

[Install]
WantedBy=multi-user.target
```

### HTTP service

https://graphdb.opencomputing.net:10443

### Model

* Path:

### Repository

https://git.opencomputing.cn/yumoqing/llmengine

### Run script

```
#!/bin/bash
CUDA_VISIBLE_DEVICES=7 /share/vllm-0.8.5/bin/python -m llmengine.neo4j -p 8885 Neo4j &
```

## nvidia-asr.service

NVIDIA ASR model service.

```
[Unit]
Wants=systemd-networkd.service

[Service]
User=ymq
Group=ymq
Type=forking
WorkingDirectory=/share/ymq/run/nvidia-asr
ExecStart=/share/ymq/run/nvidia-asr/start.sh
ExecStop=/share/ymq/run/nvidia-asr/stop.sh
StandardOutput=append:/var/log/asr/asr.log
StandardError=append:/var/log/asr/asr.log
SyslogIdentifier=asr

[Install]
WantedBy=multi-user.target
```

### HTTP service

https://asr.opencomputing.net:10443

### Model

* Path: /share/models/nvidia/parakeet-tdt-0.6b-v2/parakeet-tdt-0.6b-v2.nemo

### Repository

https://git.opencomputing.cn/yumoqing/nvidia-asr.git

### Run script

```
#!/usr/bin/bash
rundir=/share/ymq/run/nvidia-asr
cd $rundir
CUDA_VISIBLE_DEVICES=6 $rundir/nvidia-asr.env/bin/python app/asr.py -p 9992 &
# CUDA_VISIBLE_DEVICES=6 $rundir/nvidia-asr.env/bin/python app/asr.py -p 9992 &
```

## ollama.service

```
[Unit]
Description=Ollama Service
After=network-online.target

[Service]
ExecStart=/d/ollama/start.sh
# CUDA_VISIBLE_DEVICES=2,3 /usr/local/bin/ollama serve
User=ollama
Group=ollama
Restart=always
RestartSec=3
Environment="PATH=/d/ymq/.local/bin:/d/ymq/bin:/d/ymq/bin:/usr/local/cuda-12/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin"

[Install]
WantedBy=default.target
```

### HTTP service

https://ollama.opencomputing.net:10443

## qwen3.service

Qwen3-0.6B model service.

```
[Unit]
Wants=systemd-networkd.service

[Service]
WorkingDirectory=/share/run/qwen3
Type=forking
ExecStart=/share/run/qwen3/start.sh
ExecStop=/share/run/qwen3/stop.sh
StandardOutput=append:/var/log/qwen3/qwen3.log
StandardError=append:/var/log/qwen3/qwen3.log
SyslogIdentifier=qwen3

[Install]
WantedBy=multi-user.target
```

### HTTP service

https://t2t.opencomputing.net:10443/qwen3-6b

### Model

* Path: /d/models/Qwen/Qwen3-0.6B

### Run script

```
#!/usr/bin/bash
rundir=/share/run/qwen3
CUDA_VISIBLE_DEVICES=7 /share/vllm-0.8.5/bin/python -m llmengine.server -w ${rundir} -p 9089 /d/models/Qwen/Qwen3-0.6B &
exit 0
```

## qwen3coder.service

Qwen3-Coder-30B-A3B-Instruct model service.

```
[Unit]
Wants=systemd-networkd.service

[Service]
WorkingDirectory=/share/run/qwen3coder
Type=forking
ExecStart=/share/run/qwen3coder/start.sh
ExecStop=/share/run/qwen3coder/stop.sh
StandardOutput=append:/var/log/qwen3coder/qwen3coder.log
StandardError=append:/var/log/qwen3coder/qwen3coder.log
SyslogIdentifier=qwen3coder

[Install]
WantedBy=multi-user.target
```

### HTTP service

https://t2t.opencomputing.net:10443

### Model

* Path: /d/models/Qwen/Qwen3-Coder-30B-A3B-Instruct

### Run script

```
#!/usr/bin/bash
rundir=/share/run/qwen3coder
CUDA_VISIBLE_DEVICES=0 /share/vllm-0.8.5/bin/python -m llmengine.server -w ${rundir} -p 9088 /d/models/Qwen/Qwen3-Coder-30B-A3B-Instruct &
exit 0
```

## reranker.service

```
[Unit]
Description=A Rerank Service using Qwen3-Reranker-0.6B
# After=network.target DeepSeek70B-kyyds671b-ray.service
# Requires=DeepSeek70B-kyyds671b-ray.service
StartLimitIntervalSec=60
StartLimitBurst=5

[Service]
# Core launch parameters (kept from the original configuration)
WorkingDirectory=/share/run/reranker
# Environment variables; the start scripts and services on all nodes must match
#Environment="NCCL_SOCKET_IFNAME=enp196s0f0np0"
#ExecStartPre=/data/kyyds671b/ray_check.sh
ExecStart=/share/run/reranker/start.sh
ExecStop=/share/run/reranker/stop.sh
# Timeout and stop control (newly added)
# Extend the start timeout to 120 seconds
# TimeoutStartSec=120
# Allow 30 seconds when stopping
# TimeoutStopSec=30
# Send SIGINT first (better suited to Python programs)
# KillSignal=SIGINT
# Final forced-termination signal
# RestartKillSignal=SIGKILL
# Mixed kill mode
# KillMode=mixed
# Restart policy
# Restart=on-failure
# RestartSec=10s
# Service management (original configuration plus enhancements)
#Restart=always
#RestartSec=10
# Restart interval changed from 5 seconds to 10 seconds
# append keeps appending (like >>); file truncates and rewrites (like >)
StandardOutput=append:/var/log/rerank/rerank.log
StandardError=append:/var/log/rerank/error.log
SyslogIdentifier=rerank
# Resource limits (optional)
#LimitNOFILE=65536
#LimitNPROC=65536
# GPU support
#Environment=CUDA_VISIBLE_DEVICES=0,1

[Install]
WantedBy=multi-user.target
```

### HTTP service

https://reranker.opencomputing.net:10443

### Model

* Path: /share/models/BAAI/bge-reranker-v2-m3

### Run script

```
#!/bin/bash
# CUDA_VISIBLE_DEVICES=7 /share/vllm-0.8.5/bin/python -m llmengine.rerank -p 9997 /d/ymq/models/Qwen/Qwen3-Reranker-0___6B
CUDA_VISIBLE_DEVICES=4 /share/vllm-0.8.5/bin/python -m llmengine.rerank -p 9997 /share/models/BAAI/bge-reranker-v2-m3
```

## subtitler.service

Song-lyric alignment (subtitling) service.

```
[Unit]
Wants=systemd-networkd.service

[Service]
User=ymq
Group=ymq
Type=forking
WorkingDirectory=/d/ymq/py/subtitle
ExecStart=/d/ymq/py/subtitle/start.sh
ExecStop=/d/ymq/py/subtitle/stop.sh
StandardOutput=append:/var/log/subtitler/subtitler.log
StandardError=append:/var/log/subtitler/subtitler.log
SyslogIdentifier=subtitler

[Install]
WantedBy=multi-user.target
```

### Model

* Paths:
  1. /data/ymq/models/MahmoudAshraf/mms-300m-1130-forced-aligner (forced-alignment model)
  2. /data/ymq/models/mdx_models/UVR-MDX-NET-Voc_FT.onnx (separates vocals from accompaniment)

### Repository

https://git.opencomputing.cn/yumoqing/subtitle

### Run script

```
#!/usr/bin/bash
cd /d/ymq/py/subtitle
/d/ymq/py/subtitle/py3/bin/python /d/ymq/py/subtitle/app/subtitler.py -p 9901 -w /d/ymq/py/subtitle &
exit 0
```

## triples.service

Triple (subject-predicate-object) extraction service.

```
[Unit]
Wants=systemd-networkd.service

[Service]
Type=forking
WorkingDirectory=/share/run/triples
ExecStart=/share/run/triples/start.sh
ExecStop=/share/run/triples/stop.sh
StandardOutput=append:/var/log/triples/triples.log
StandardError=append:/var/log/triples/triples.log
SyslogIdentifier=triples

[Install]
WantedBy=multi-user.target
```

### HTTP service

https://triples.opencomputing.net:10443

### Model

* Path: /share/models/Babelscape/mrebel-large

### Run script

```
#!/bin/bash
CUDA_VISIBLE_DEVICES=7 /share/vllm-0.8.5/bin/python -m llmengine.triple -p 9991 /share/models/Babelscape/mrebel-large &
CUDA_VISIBLE_DEVICES=7 /share/vllm-0.8.5/bin/python -m llmengine.triple -p 9991 /share/models/Babelscape/mrebel-large &
CUDA_VISIBLE_DEVICES=7 /share/vllm-0.8.5/bin/python -m llmengine.triple -p 9991 /share/models/Babelscape/mrebel-large &
```

## vdb.service

Milvus vector database service.

```
[Unit]
Wants=systemd-networkd.service

[Service]
User=wangmeihua
Group=wangmeihua
Type=forking
WorkingDirectory=/d/wangmeihua/vdbtest/vdb
ExecStart=/d/wangmeihua/vdbtest/vdb/start.sh
ExecStop=/d/wangmeihua/vdbtest/vdb/stop.sh
StandardOutput=append:/var/log/vdb/vdb.log
StandardError=append:/var/log/vdb/vdb.log
SyslogIdentifier=vdb

[Install]
WantedBy=multi-user.target
```

### HTTP service

https://vectordb.opencomputing.net:10443/milvus

### Repository

https://git.opencomputing.cn/yumoqing/vdb

### Run script

```
#!/usr/bin/bash
# Check whether a vdb process is already running
PID=$(ps aux | grep "vdbapp.py" | grep -v grep | awk '{print $2}' | head -1)
if [ -n "$PID" ]; then
    echo "Error: the VDB service is already running (PID: $PID)"
    echo "Stop the existing service first: kill $PID"
    exit 1
fi
cd /share/wangmeihua/vdbtest/vdb
/share/wangmeihua/vdbtest/vdb/vdbvenv/bin/python /d/wangmeihua/vdbtest/vdb/app/vdbapp.py -p 8887 -w /d/wangmeihua/vdbtest/vdb &
echo "VDB service started (PID: $!)"
exit 0
```
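All of the units log with `StandardOutput=append:`, so the files under `/var/log` grow without bound. A hedged logrotate sketch, assuming logrotate is installed on the host; `copytruncate` is used because the services keep their log file descriptor open:

```
# /etc/logrotate.d/gpu-services (sketch)
/var/log/clip/*.log /var/log/entities/*.log /var/log/f5tts/*.log
/var/log/fvlm/*.log /var/log/m2m/*.log /var/log/neo4j/*.log
/var/log/asr/*.log /var/log/qwen3/*.log /var/log/qwen3coder/*.log
/var/log/rerank/*.log /var/log/subtitler/*.log /var/log/triples/*.log
/var/log/vdb/*.log {
    weekly
    rotate 4
    compress
    missingok
    notifempty
    copytruncate
}
```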
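## Notes

Every run script above pins its service to one GPU by prefixing the launch command with `CUDA_VISIBLE_DEVICES=N`. Written as a prefix, the assignment is exported only into that single child process, so many services can share one host while each sees a different device. A minimal sketch (with `bash -c` standing in for the actual engine process):

```shell
#!/usr/bin/bash
# A variable assignment prefixed to a command is visible only to that command,
# not to the surrounding shell -- the pattern the run scripts use to pin GPUs.
child=$(CUDA_VISIBLE_DEVICES=6 bash -c 'echo "$CUDA_VISIBLE_DEVICES"')
echo "child saw: $child"
echo "parent sees: ${CUDA_VISIBLE_DEVICES:-unset}"
```

The commented-out `Environment=CUDA_VISIBLE_DEVICES=0,1` line in the reranker unit shows the equivalent service-level way to do the same thing.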
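Each unit points `ExecStop` at a `stop.sh`, but none of the stop scripts are reproduced above. A minimal sketch of what one could look like, assuming the service can be identified by a substring of its command line; the pattern `app/f5tts.py` is a placeholder, not the actual stop.sh of any service here:

```shell
#!/usr/bin/bash
# Hypothetical stop.sh sketch: find the service by its command line, stop it.
PATTERN="app/f5tts.py"             # placeholder; one pattern per service
PIDS=$(pgrep -f "$PATTERN" || true)
if [ -z "$PIDS" ]; then
    echo "no running process matches $PATTERN"
else
    kill $PIDS                     # polite SIGTERM first
    sleep 5
    for pid in $PIDS; do
        # force-kill anything still alive after the grace period
        kill -0 "$pid" 2>/dev/null && kill -9 "$pid" || true
    done
fi
```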
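With this many engines on one host, a quick way to see which backends are up is to probe their listening ports. A sketch using bash's built-in `/dev/tcp` redirection, run on the GPU host itself; the port list is collected from the run scripts above:

```shell
#!/usr/bin/bash
# Probe localhost TCP ports; ports taken from the run scripts in this document.
port_open() {
    # bash treats /dev/tcp/HOST/PORT as a virtual file; open fails => closed
    (exec 3<>"/dev/tcp/127.0.0.1/$1") 2>/dev/null
}
for p in 8882 8883 8885 8887 9088 9089 9901 9990 9991 9992 9994 9995 9997; do
    if port_open "$p"; then
        echo "port $p: listening"
    else
        echo "port $p: closed"
    fi
done
```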