AI Gateway Deployment

Last updated: 2025-10-10 10:36:21

This document describes how to deploy Envoy AI Gateway on TKE and connect it to large language models.

Overview

An AI gateway is a traffic-governance component purpose-built for large language model services. It extends the traditional API gateway with capabilities such as on-demand switching between multiple models, tenant authentication and quota control, token-level rate limiting, and content-safety compliance checks, and it can automatically fail over when a model becomes unavailable to keep the service stable.
Envoy AI Gateway is an open-source, Kubernetes-native AI gateway designed to manage and route traffic for large language model services. Built on the mature Envoy Proxy, it provides a unified, secure, and extensible access layer between application clients and the various AI service providers.

Prerequisites

1. You have created a TKE cluster. A cluster in the Hong Kong region is recommended. If you do not have one yet, see Creating a Cluster.
2. You have created a node pool with at least 3 nodes; the SA5.LARGE8 instance type is recommended. You can confirm the node count with the check below.
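A minimal way to confirm that the cluster is reachable and has enough nodes (this simply lists the nodes the cluster reports; it is not required by the later steps):
# List cluster nodes; at least 3 should be in the Ready state
kubectl get nodes
# Count the nodes; the result should be 3 or more
kubectl get nodes --no-headers | wc -l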

Environment Verification

Verify kubectl:
kubectl version --client
Expected output:
Client Version: v1.32.2-tke.6
Kustomize Version: v5.5.0
Verify the Helm installation:
helm version
Expected output:
version.BuildInfo{Version:"v3.19.0", GitCommit:"3d8990f0836691f0229297773f3524598f46bda6", GitTreeState:"clean", GoVersion:"go1.24.7"}
If the helm command is not found, it can be installed on TencentOS with the following one-line command:
curl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash
Verify the curl installation:
curl --version
curl 8.4.0 (x86_64-koji-linux-gnu) libcurl/8.4.0 OpenSSL/3.0.12 zlib/1.2.13 brotli/1.1.0 libidn2/2.3.4 libpsl/0.21.2 (+libidn2/2.3.4) libssh/0.10.5/openssl/zlib nghttp2/1.58.0 OpenLDAP/2.6.5
Release-Date: 2023-10-11
Protocols: dict file ftp ftps gopher gophers http https imap imaps ldap ldaps mqtt pop3 pop3s rtsp scp sftp smb smbs smtp smtps telnet tftp
Features: alt-svc AsynchDNS brotli GSS-API HSTS HTTP2 HTTPS-proxy IDN IPv6 Kerberos Largefile libz NTLM PSL SPNEGO SSL threadsafe TLS-SRP UnixSocket

Deploy the AI Gateway

Step 1: Deploy Envoy Gateway

Envoy AI Gateway is built on top of Envoy Gateway. Install Envoy Gateway with Helm and wait for the deployment to become ready:
helm upgrade -i eg oci://docker.io/envoyproxy/gateway-helm \
  --version v0.0.0-latest \
  --namespace envoy-gateway-system \
  --create-namespace

kubectl wait --timeout=2m -n envoy-gateway-system deployment/envoy-gateway --for=condition=Available

Step 2: Deploy Envoy AI Gateway

Install the AI Gateway Helm chart, then wait for the deployment to become ready:
helm upgrade -i aieg oci://docker.io/envoyproxy/ai-gateway-helm \
  --version v0.0.0-latest \
  --namespace envoy-ai-gateway-system \
  --create-namespace

kubectl wait --timeout=2m -n envoy-ai-gateway-system deployment/ai-gateway-controller --for=condition=Available

Step 3: Configure Envoy AI Gateway

After installing Envoy AI Gateway, apply the AI Gateway-specific configuration to Envoy Gateway, restart the deployment, and wait for it to become ready:
kubectl apply -f https://raw.githubusercontent.com/envoyproxy/ai-gateway/main/manifests/envoy-gateway-config/redis.yaml
kubectl apply -f https://raw.githubusercontent.com/envoyproxy/ai-gateway/main/manifests/envoy-gateway-config/config.yaml
kubectl apply -f https://raw.githubusercontent.com/envoyproxy/ai-gateway/main/manifests/envoy-gateway-config/rbac.yaml

kubectl rollout restart -n envoy-gateway-system deployment/envoy-gateway

kubectl wait --timeout=2m -n envoy-gateway-system deployment/envoy-gateway --for=condition=Available

Step 4: Check Health Status

Check the AI Gateway pods:
kubectl get pods -n envoy-ai-gateway-system
Check the Envoy Gateway pods:
kubectl get pods -n envoy-gateway-system
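All pods in both namespaces should be in the Running state. The loop below is a minimal sketch for scripting that check; the namespace names come from the installation steps above:
# Print any pod in the two gateway namespaces whose status is not Running
for ns in envoy-ai-gateway-system envoy-gateway-system; do
  kubectl get pods -n "$ns" --no-headers | awk -v ns="$ns" '$3 != "Running" {print ns "/" $1 " is " $3}'
done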

Step 5: Deploy a Test Backend

First, deploy the basic AI Gateway setup, which includes a test backend:
kubectl apply -f https://raw.githubusercontent.com/envoyproxy/ai-gateway/main/examples/basic/basic.yaml
Wait for the gateway pods to become ready:
kubectl wait pods --timeout=2m \
  -l gateway.envoyproxy.io/owning-gateway-name=envoy-ai-gateway-basic \
  -n envoy-gateway-system \
  --for=condition=Ready
Set the gateway URL:
export GATEWAY_URL=$(kubectl get gateway/envoy-ai-gateway-basic -o jsonpath='{.status.addresses[0].value}')
Verify that the URL is set:
echo $GATEWAY_URL
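If the command prints an empty value (for example, a load balancer address has not been assigned yet), a common workaround is to port-forward the Envoy service created for this gateway and point GATEWAY_URL at it. This is a sketch: it assumes the generated Service carries the same owning-gateway label as the pods and that the basic example's listener is on port 80; the local port 8080 is an arbitrary choice.
# Forward local port 8080 to the Envoy service owned by the basic gateway
ENVOY_SVC=$(kubectl get svc -n envoy-gateway-system \
  -l gateway.envoyproxy.io/owning-gateway-name=envoy-ai-gateway-basic \
  -o jsonpath='{.items[0].metadata.name}')
kubectl port-forward -n envoy-gateway-system "svc/$ENVOY_SVC" 8080:80 &
export GATEWAY_URL="localhost:8080"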

Step 6: Test the AI Gateway

Open a new terminal and send a test request to the AI Gateway with the following command:
curl -H "Content-Type: application/json" \
  -d '{ "model": "some-cool-self-hosted-model","messages": [{"role":"system","content": "Hi."}]}' \
  $GATEWAY_URL/v1/chat/completions
Expected output:
{"choices":[{"message":{"role":"assistant", "content":"I am inevitable."}}]}

Connect a Large Language Model Through the AI Gateway

Step 1: Download the Configuration Template

This walkthrough uses the Kimi large language model as an example. First, download the AI Gateway's default model configuration template; in Step 2 you will replace the hostname and API key in the template, so you need an API key from the corresponding model provider.
curl -O https://raw.githubusercontent.com/envoyproxy/ai-gateway/main/examples/basic/openai.yaml

Step 2: Modify the Configuration Template

Edit openai.yaml and replace the hostname and OPENAI_API_KEY. Using Kimi as an example, the YAML configuration looks like the following:
apiVersion: aigateway.envoyproxy.io/v1alpha1
kind: AIGatewayRoute
metadata:
  name: envoy-ai-gateway-basic-openai
  namespace: default
spec:
  parentRefs:
    - name: envoy-ai-gateway-basic
      kind: Gateway
      group: gateway.networking.k8s.io
  rules:
    - matches:
        - headers:
            - type: Exact
              name: x-ai-eg-model
              value: kimi-k2-0905-preview
      backendRefs:
        - name: envoy-ai-gateway-basic-openai
---
apiVersion: aigateway.envoyproxy.io/v1alpha1
kind: AIServiceBackend
metadata:
  name: envoy-ai-gateway-basic-openai
  namespace: default
spec:
  schema:
    name: OpenAI
  backendRef:
    name: envoy-ai-gateway-basic-openai
    kind: Backend
    group: gateway.envoyproxy.io
---
apiVersion: aigateway.envoyproxy.io/v1alpha1
kind: BackendSecurityPolicy
metadata:
  name: envoy-ai-gateway-basic-openai-apikey
  namespace: default
spec:
  targetRefs:
    - group: aigateway.envoyproxy.io
      kind: AIServiceBackend
      name: envoy-ai-gateway-basic-openai
  type: APIKey
  apiKey:
    secretRef:
      name: envoy-ai-gateway-basic-openai-apikey
      namespace: default
---
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: Backend
metadata:
  name: envoy-ai-gateway-basic-openai
  namespace: default
spec:
  endpoints:
    - fqdn:
        hostname: api.moonshot.cn
        port: 443
---
apiVersion: gateway.networking.k8s.io/v1alpha3
kind: BackendTLSPolicy
metadata:
  name: envoy-ai-gateway-basic-openai-tls
  namespace: default
spec:
  targetRefs:
    - group: 'gateway.envoyproxy.io'
      kind: Backend
      name: envoy-ai-gateway-basic-openai
  validation:
    wellKnownCACertificates: "System"
    hostname: api.moonshot.cn
---
apiVersion: v1
kind: Secret
metadata:
  name: envoy-ai-gateway-basic-openai-apikey
  namespace: default
type: Opaque
stringData:
  apiKey: sk-xxxxxxxxxxxxxx # Replace with your Kimi API key.
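
If you prefer not to keep the API key in openai.yaml, one alternative (a sketch, not required by this walkthrough) is to delete the Secret manifest from the file and create the Secret from the command line instead:
# Create the API key Secret imperatively instead of via stringData in openai.yaml
kubectl create secret generic envoy-ai-gateway-basic-openai-apikey \
  --namespace default \
  --from-literal=apiKey='sk-xxxxxxxxxxxxxx'  # Replace with your Kimi API key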


Step 3: Apply the Configuration

Apply the updated configuration and wait for the gateway pods to become ready:
kubectl apply -f openai.yaml

kubectl wait pods --timeout=2m \
  -l gateway.envoyproxy.io/owning-gateway-name=envoy-ai-gateway-basic \
  -n envoy-gateway-system \
  --for=condition=Ready

Step 4: Test the Configuration

curl -H "Content-Type: application/json" -d '{
    "model": "kimi-k2-0905-preview",
    "messages": [
      {
        "role": "user",
        "content": "Hi."
      }
    ]
  }' $GATEWAY_URL/v1/chat/completions
Expected output:
{"id":"chatcmpl-68d64bc7a5422d1970e278da","object":"chat.completion","created":1758874567,"model":"kimi-k2-0905-preview","choices":[{"index":0,"message":{"role":"assistant","content":"Hi there! How can I help you today?"},"finish_reason":"stop"}],"usage":{"prompt_tokens":9,"completion_tokens":11,"total_tokens":20}}

FAQ

1. After running helm, the chart installation command times out and the chart cannot be installed?

Use a cluster in the Hong Kong region to avoid installation timeouts caused by network latency and similar issues.
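If the registry is reachable but slow, retrying with a longer Helm timeout may also help; this sketch reuses the Step 1 install command with an explicit --timeout (the 10m value is an arbitrary choice):
# Retry the Envoy Gateway install with a 10-minute Helm timeout
helm upgrade -i eg oci://docker.io/envoyproxy/gateway-helm \
  --version v0.0.0-latest \
  --namespace envoy-gateway-system \
  --create-namespace \
  --timeout 10m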

2. After testing with the curl command, the request returns the error: No matching route found. It is likely that the model specified your request is not configured in the Gateway

Check that the model name in the YAML file is identical to the one used in the curl command, as shown below.
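One way to compare the two, assuming the configuration from Step 2 above is applied in the default namespace, is to print the model header values matched by the configured routes and confirm they are identical to the model field sent by curl:
# Show the x-ai-eg-model values matched by the configured AIGatewayRoute resources
kubectl get aigatewayroutes -n default -o yaml | grep -A 2 'x-ai-eg-model'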