Starting the Service
First, run the following command to start the service:
taco_llm serve facebook/opt-125m --api-key taco-llm-test
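Model loading can take a while, so a client script may want to wait until the server accepts requests before sending any. The sketch below polls until a probe succeeds. It assumes the server exposes the OpenAI-style `GET /v1/models` endpoint on port 8000, which this page does not state; `models_request` and `wait_for_server` are hypothetical helper names, not part of taco_llm.

```python
import time
import urllib.error
import urllib.request


def models_request(base_url, api_key):
    """Build a GET request for the (assumed) OpenAI-style /models endpoint."""
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/models",
        headers={"Authorization": f"Bearer {api_key}"},
    )


def wait_for_server(base_url, api_key, timeout_s=60.0, interval_s=2.0, probe=None):
    """Return True once the probe succeeds, or False when the timeout expires."""

    def default_probe():
        request = models_request(base_url, api_key)
        with urllib.request.urlopen(request, timeout=5) as resp:
            return resp.status == 200

    probe = probe or default_probe
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            if probe():
                return True
        except (urllib.error.URLError, OSError):
            pass  # Connection refused or HTTP error: the server is not ready yet.
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


# Usage against the command above (base URL and key taken from this page):
#   if wait_for_server("http://localhost:8000/v1", "taco-llm-test"):
#       print("server ready")
```

The `probe` argument exists only so the polling loop can be exercised without a live server; in normal use it can be left out.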
Sending Requests
You can use the official OpenAI Python client to send requests:
from openai import OpenAI

# Point the client at the local service; the API key must match --api-key.
client = OpenAI(
    base_url="http://localhost:8000/v1",
    api_key="taco-llm-test",
)

completion = client.chat.completions.create(
    model="facebook/opt-125m",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(completion.choices[0].message)
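The same client can also stream the reply piece by piece. This is a minimal sketch, assuming the service honors `stream=True` on chat completions the way the OpenAI API does; `stream_chat` is a hypothetical helper name. It takes the client as an argument, so it does not import `openai` itself.

```python
def stream_chat(client, model, prompt):
    """Yield text pieces from a streamed chat completion as they arrive."""
    stream = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        stream=True,
    )
    for chunk in stream:
        if not chunk.choices:
            continue  # Some chunks carry no choices (for example, usage-only chunks).
        piece = chunk.choices[0].delta.content
        if piece:  # The content is None on role-only and final chunks.
            yield piece


# Usage against a running server (requires the openai package):
#   from openai import OpenAI
#   client = OpenAI(base_url="http://localhost:8000/v1", api_key="taco-llm-test")
#   for piece in stream_chat(client, "facebook/opt-125m", "Hello!"):
#       print(piece, end="", flush=True)
```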
You can also send requests directly with an HTTP client:
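import json

import requests

api_key = "taco-llm-test"
headers = {"Authorization": f"Bearer {api_key}"}
pload = {
    "model": "facebook/opt-125m",  # Same model name that was passed to `taco_llm serve`.
    "prompt": "Hello!",
    "stream": True,
    "max_tokens": 128,
}
response = requests.post(
    "http://localhost:8000/v1/completions",
    headers=headers,
    json=pload,
    stream=True,
)
response.raise_for_status()

# Assumption: with "stream": True the endpoint replies in the OpenAI
# server-sent-events format, one 'data: {...}' line per chunk, ending
# with 'data: [DONE]'.
for line in response.iter_lines():
    if not line:
        continue  # Skip the blank separator lines between events.
    text = line.decode("utf-8")
    if not text.startswith("data:"):
        continue
    payload = text[len("data:"):].strip()
    if payload == "[DONE]":
        break
    data = json.loads(payload)
    print(data["choices"][0]["text"], end="", flush=True)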
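When streaming is not needed, the request and response handling become simpler. The following sketch uses only the standard library and assumes the endpoint returns the OpenAI completions response shape, with the generated text under `choices[0].text`; `build_completion_request` and `completion_text` are hypothetical helper names.

```python
import json
import urllib.request


def build_completion_request(base_url, api_key, model, prompt, max_tokens=128):
    """Build a non-streaming POST request for the /completions endpoint."""
    body = {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "stream": False,
    }
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def completion_text(response_body):
    """Extract the generated text from a JSON completions response body."""
    data = json.loads(response_body)
    return data["choices"][0]["text"]


# Usage against a running server:
#   request = build_completion_request(
#       "http://localhost:8000/v1", "taco-llm-test", "facebook/opt-125m", "Hello!"
#   )
#   with urllib.request.urlopen(request) as resp:
#       print(completion_text(resp.read()))
```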
Full Server-Side Parameter Configuration
Full Client-Side Parameter Configuration
Chat: tools and tool_choice.
Completions: suffix.