endpoint_set_model_for_inference
Set up a model for inference
- In the installation script's .env file, read the HUB_SERVER_API_TOKEN variable to obtain the API key
- Call the API with curl to add a runtime framework to the model; replace meta/llama-3.1-8b-instruct with the real repo id
curl -X POST \
"http://${HUB_SERVER_IP}:${HUB_SERVER_PORT}/api/v1/models/meta/llama-3.1-8b-instruct/runtime_framework?current_user=meta" \
-H 'Content-Type: application/json' \
-H "Authorization: Bearer ${API_KEY}" \
-d '{
"container_port": 8000,
"enabled": 1,
"frame_cpu_image": "",
"frame_image": "nvcr.io/nim/meta/llama-3.1-8b-instruct:1.1.2",
"frame_name": "nim-llama-3.1-8b-instruct",
"frame_version": "1.1.2",
"type":1
}'
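The response to the call above contains the new runtime framework's id, which the next step needs. A minimal sketch of extracting it in shell, assuming the response body has the shape {"data":{"id":...}} (verify against what your hub server actually returns):

```shell
# Example response payload; in practice capture it with RESPONSE=$(curl ... )
# NOTE: the {"data":{"id":...}} shape is an assumption -- adjust to match
# the real response of your hub server.
RESPONSE='{"data":{"id":42,"frame_name":"nim-llama-3.1-8b-instruct"}}'

# Pull the numeric id out of the JSON (use jq instead if it is available:
#   FRAMEWORK_ID=$(printf '%s' "$RESPONSE" | jq -r '.data.id') )
FRAMEWORK_ID=$(printf '%s' "$RESPONSE" | sed -n 's/.*"id":\([0-9]*\).*/\1/p')
echo "$FRAMEWORK_ID"   # -> 42
```

The extracted value replaces {id} in the URL of the next call.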
- Take the id returned by the previous call, substitute it for {id} below, and use curl to call the API to deploy the model with that runtime framework
curl -X POST \
"http://${HUB_SERVER_IP}:${HUB_SERVER_PORT}/api/v1/runtime_framework/{id}?deploy_type=1&current_user=meta" \
-H 'Content-Type: application/json' \
-H "Authorization: Bearer ${API_KEY}" \
-d '{
"models": [
"meta/llama-3.1-8b-instruct"
]
}'