endpoint_set_model_for_inference
Set up a model for inference
- In the installation script's .env file, read the HUB_SERVER_API_TOKEN variable to obtain the API key
- Call the API with curl to add a runtime framework to the model; replace meta/llama-3.1-8b-instruct with the real repo id
curl -X POST \
"http://${HUB_SERVER_IP}:${HUB_SERVER_PORT}/api/v1/models/meta/llama-3.1-8b-instruct/runtime_framework?current_user=meta" \
-H 'Content-Type: application/json' \
-H "Authorization: Bearer ${API_KEY}" \
-d '{
"container_port": 8000,
"enabled": 1,
"frame_cpu_image": "",
"frame_image": "nvcr.io/nim/meta/llama-3.1-8b-instruct:1.1.2",
"frame_name": "nim-llama-3.1-8b-instruct",
"frame_version": "1.1.2",
"type":1
}'
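The response to the call above contains the new runtime framework's id, which the next step needs. A minimal sketch of extracting it in shell, assuming the response body has the shape {"data":{"id":...}} (verify against what your hub server actually returns):

```shell
# Example response payload; in practice capture it with RESPONSE=$(curl ... )
# NOTE: the {"data":{"id":...}} shape is an assumption -- adjust to match
# the real response of your hub server.
RESPONSE='{"data":{"id":42,"frame_name":"nim-llama-3.1-8b-instruct"}}'

# Pull the numeric id out of the JSON (use jq instead if it is available:
#   FRAMEWORK_ID=$(printf '%s' "$RESPONSE" | jq -r '.data.id') )
FRAMEWORK_ID=$(printf '%s' "$RESPONSE" | sed -n 's/.*"id":\([0-9]*\).*/\1/p')
echo "$FRAMEWORK_ID"   # -> 42
```

The extracted value replaces {id} in the URL of the next call.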
- Take the id returned by the previous call, substitute it for {id} below, and use curl to call the API to deploy the model with that runtime framework
curl -X POST \
"http://${HUB_SERVER_IP}:${HUB_SERVER_PORT}/api/v1/runtime_framework/{id}?deploy_type=1&current_user=meta" \
-H 'Content-Type: application/json' \
-H "Authorization: Bearer ${API_KEY}" \
-d '{
"models": [
"meta/llama-3.1-8b-instruct"
]
}'