How to download a large model and serve it with LLaMA-Factory

Download the model with ModelScope

Download with Python

# !pip install modelscope


from modelscope.hub.snapshot_download import snapshot_download

# Download Qwen3-8B from the ModelScope hub: local_dir is where the
# final model files land, cache_dir holds intermediate download files.
model_dir = snapshot_download(
    model_id='Qwen/Qwen3-8B',
    local_dir='/ubuntu-22.04/LLaMA-Factory/models/qwen3-8b',
    cache_dir='/ubuntu-22.04/LLaMA-Factory/models/qwen3-8b-cache')
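After the download finishes, it is worth confirming that the weight shards actually landed in `local_dir` before pointing a server at it. A minimal sanity check (the helper name is ours; the path mirrors the one used above):

```python
import os

def find_weight_files(model_dir):
    """List weight shards (.safetensors / .bin) directly under model_dir."""
    if not os.path.isdir(model_dir):
        return []
    return sorted(
        name for name in os.listdir(model_dir)
        if name.endswith((".safetensors", ".bin"))
    )

files = find_weight_files('/ubuntu-22.04/LLaMA-Factory/models/qwen3-8b')
print(files or "no weight files found -- re-check the download")
```

An empty result usually means the download was interrupted or the files are still sitting in `cache_dir`.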

Serve the model with LLaMA-Factory

Launch from the terminal

# Serve an OpenAI-compatible API on 0.0.0.0:8001 using GPUs 2 and 3.
CUDA_VISIBLE_DEVICES=2,3 \
API_HOST=0.0.0.0 \
API_PORT=8001 \
API_KEY=sk-test \
llamafactory-cli api \
  --model_name_or_path /ubuntu-22.04/LLaMA-Factory/models/qwen3-8b \
  --template qwen \
  --finetuning_type lora \
  --trust_remote_code \
  --max_new_tokens 32768
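Once the server is up, it speaks the OpenAI-compatible chat completions protocol, authenticated with the `API_KEY` set above. A sketch of how a client request could be assembled (the model name in the body is an assumption; check `GET /v1/models` on the running server for the actual served name):

```python
import json

API_BASE = "http://localhost:8001/v1"  # API_HOST / API_PORT from the command above
API_KEY = "sk-test"                    # must match the API_KEY the server was started with

def build_chat_request(prompt, model="qwen3-8b", max_tokens=512):
    """Assemble headers and JSON body for POST {API_BASE}/chat/completions."""
    headers = {
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    }
    body = json.dumps({
        "model": model,  # assumed name; verify via GET /v1/models
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    })
    return headers, body

headers, body = build_chat_request("Hello")
print(body)
```

Send the request with any HTTP client (e.g. `requests.post(f"{API_BASE}/chat/completions", headers=headers, data=body)`) while the server is running.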

Serve the model with vLLM

Launch from the terminal

# Serve the model on GPU 5 at 0.0.0.0:8004; clients must request it
# by its served name (deepseek-ocr here), not by the filesystem path.
CUDA_VISIBLE_DEVICES=5 vllm serve /ubuntu-22.04/LLaMA-Factory/models/qwen3-8b \
  --port 8004 --host 0.0.0.0 \
  --max-num-seqs 4 --max-model-len 4096 \
  --served-model-name deepseek-ocr \
  --gpu-memory-utilization 0.2
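The `--max-num-seqs 4` and `--max-model-len 4096` flags together bound how many tokens the KV cache may ever need to hold at once, which is what has to fit (along with the weights) inside the 20% memory slice granted by `--gpu-memory-utilization 0.2`. A back-of-the-envelope check:

```python
def max_kv_tokens(max_num_seqs, max_model_len):
    """Worst-case tokens resident in the KV cache at once: every
    concurrent sequence grows to the full context length."""
    return max_num_seqs * max_model_len

print(max_kv_tokens(4, 4096))  # 4 * 4096 = 16384 tokens
```

If the server fails to start with an out-of-memory or insufficient-KV-cache error, raising `--gpu-memory-utilization` or lowering these two flags is the usual first move.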