
Some Kubernetes Development/Implementation/Usage Tips - 2

Author: 王磊-字节跳动
Modified 2019-10-31 22:10:29
Published in the column: 01ZOO

Viewing a Given Resource Type

kubectl get --raw "/apis/metrics.k8s.io/v1beta1/nodes"

Or run kubectl proxy and browse the same path in a web page.

Controller Logic (JobController as an Example)

The implementation logic of JobController is relatively simple, so it makes a good example of how a Controller is written.

serviceaccount_controller 和 tokens_controller

  • serviceaccount_controller: ensures every namespace has the default ServiceAccounts, e.g. the configured "default"
  • tokens_controller: ensures every ServiceAccount has a corresponding token secret

Configurable Kubernetes Features

Whether each feature gate is enabled by default, and how mature it currently is, can be found in:

pkg/features/kube_features.go

Where the kubectl Code Lives

In kubectl, auth/convert/cp/get live under k8s.io/kubernetes/pkg, while the rest of the code is under k8s.io/kubectl. Historically everything lived in k8s.io/kubernetes/pkg and has been gradually moving to the staging directory; the move is not yet finished.

How kubectl Is Implemented

The core of kubectl is vendor/k8s.io/cli-runtime; the most important part is vendor/k8s.io/cli-runtime/pkg/resource/builder.go.

Construct the builder -> set builder parameters -> Do() sets up the visitors -> Infos() retrieves and decorates the results

type RESTClientGetter interface {
	// ToRESTConfig returns restconfig
	ToRESTConfig() (*rest.Config, error)
	// ToDiscoveryClient returns discovery client
	// DiscoveryInterface holds the methods that discover server-supported API groups,
    // versions and resources.
	ToDiscoveryClient() (discovery.CachedDiscoveryInterface, error)
	// ToRESTMapper returns a restmapper
	// RESTMapper allows clients to map resources to kind, and map kind and version
    // to interfaces for manipulating those objects. It is primarily intended for
    // consumers of Kubernetes compatible REST APIs as defined in docs/devel/api-conventions.md.
	ToRESTMapper() (meta.RESTMapper, error)
	// ToRawKubeConfigLoader returns the kubeconfig loader as-is
	ToRawKubeConfigLoader() clientcmd.ClientConfig
}

// Result contains helper methods for dealing with the outcome of a Builder.
type Result struct {
	err     error
	visitor Visitor

	sources            []Visitor
	singleItemImplied  bool
	targetsSingleItems bool

	mapper       *mapper
	ignoreErrors []utilerrors.Matcher

	// populated by a call to Infos
	info []*Info
}

Core Components of the kubelet

[Figure: kubelet component architecture, from https://feisky.gitbooks.io/kubernetes/components/kubelet.html]

(In the figure, after the kubelet there is also a ContainerManager, whose name is easy to confuse, which sets up cgroup and device resource information before genericRuntimeManager is called.)

  • PodWorkers: podWorkers handle syncing Pods in response to events.
  • kubepod.Manager: podManager is a facade that abstracts away the various sources of pods this Kubelet services.
  • eviction.Manager: Needed to observe and respond to situations that could impact node stability
  • kubecontainer.ContainerCommandRunner: runs commands in a container, i.e. exec in container
  • cadvisor: monitoring
  • dnsConfigurer: setting up DNS resolver configuration when launching pods
  • VolumePluginMgr: Volume plugins.
  • probeManager/livenessManager: Handles container probing/ Manages container health check results.
  • kubecontainer.ContainerGC: Policy for handling garbage collection of dead containers.
  • images.ImageGCManager: Manager for image garbage collection.
  • logs.ContainerLogManager: Manager for container logs.
  • secret.Manager: Secret manager
  • configmap.Manager: ConfigMap manager.
  • certificate.Manager: Handles certificate rotations.
  • status.Manager: Syncs pods statuses with apiserver; also used as a cache of statuses.
  • volumemanager.VolumeManager: attach/mount/unmount/detach volumes for pods
  • cloudprovider.Interface
  • cloudresource.SyncManager
  • kubecontainer.Runtime: Container runtime, GetPods/SyncPod/KillPod/GetPodStatus/ImageService....
  • kubecontainer.StreamingRuntime: GetExec/GetAttach/GetPortForward
  • RuntimeService:
    • ContainerManager(Create/Start/Stop/List/Exec...Container)
    • PodSandboxManager(Run/Stop/Remove..PodSandbox)
    • ContainerStatsManager
  • PodLifecycleEventGenerator: Generates pod events.
  • oomwatcher.Watcher
  • cm.ContainerManager: Start/SystemCgroupsLimit/GetNodeConfig/GetMountedSubsystems/GetQOSContainersInfo...
  • pluginmanager.PluginManager

Entry Threads of the kubelet

kubelet.go

  • ListenAndServe/ListenAndServeReadOnly: server 10250/10255
  • ListenAndServePodResources: a gRPC server to serve the PodResources service
  • For serviceIndexer/nodeIndexer: get local cache for service and node object
  • containerGC/imageManager.GarbageCollection: periodic GarbageCollect; calls kubeGenericRuntimeManager.containerGC evictContainers/evictSandboxes/evictPodLogsDirectories / realImageGCManager.GarbageCollect
  • pluginManager.Run: CSIPlugin/DevicePlugin
  • cloudResourceSyncManager: sync node address
  • volumeManager: runs a set of asynchronous loops that figure out which volumes need to be attached/mounted/unmounted/detached based on the pods scheduled on this node and makes it so.
  • syncNodeStatus/fastStatusUpdateOnce/nodeLeaseController: updateNodeStatus has two reporting mechanisms; the lease is lightweight and less likely to fail when the cluster holds a large amount of data
  • updateRuntimeUp: every 5s , initializing the runtime dependent modules when the container runtime first comes up
  • podKiller: every 1s, Start a goroutine responsible for killing pods (that are not properly handled by pod workers).
syncLoopIteration
// Arguments:
// 1.  configCh:       a channel to read config events from (http/status/apiserver)
// 2.  handler:        the SyncHandler to dispatch pods to (state synchronization)
// 3.  syncCh:         a channel to read periodic sync events from
// 4.  housekeepingCh: a channel to read housekeeping events from
// 5.  plegCh:         a channel to read PLEG updates from (container state changes: ContainerStarted/Died/Removed/...)

cgroup Structure

https://zhuanlan.zhihu.com/p/38359775
# ubuntu 16.04; kubernetes v1.10.5
ubuntu@VM-0-12-ubuntu:~$ systemd-cgls
Control group /:
-.slice
├─init.scope
│ └─1 /sbin/init
├─system.slice
│ ├─avahi-daemon.service
│ │ ├─1268 avahi-daemon: running [VM-0-12-ubuntu.local
│ │ └─1283 avahi-daemon: chroot helpe
│ │ (omitted)
│ ├─dockerd.service
│ │ ├─ 5134 /usr/bin/dockerd --config-file=/etc/docker/daemon.json
│ │ ├─ 5143 docker-containerd --config /var/run/docker/containerd/containerd.toml
│ │ └─29537 docker-containerd-shim -namespace moby -workdir /data/docker/containerd/daemon/io.containerd.runtime.v1.linux/moby/303a0718c84995350d835f6e2d17036
│ │ │ (omitted)
│ ├─accounts-daemon.service
│ │ └─1262 /usr/lib/accountsservice/accounts-daemon
│ │ (omitted)
│ ├─NetworkManager.service
│ │ └─1287 /usr/sbin/NetworkManager --no-daemon
│ ├─kubelet.service
│ │ └─5239 /usr/bin/kubelet --cluster-dns=10.15.255.254 --network-plugin=cni --kube-reserved=cpu=80m,memory=1319Mi --cloud-config=/etc/kubernetes/qcloud.conf 
│ ├─rsyslog.service
│ │ └─1251 /usr/sbin/rsyslogd -n
│ │ (omitted)
│ └─acpid.service
│   └─1293 /usr/sbin/acpid
├─user.slice
│ └─user-500.slice
│   ├─session-129315.scope
│   │ ├─27862 sshd: ubuntu [priv] 
│   └─user@500.service
│     └─init.scope
│       ├─27870 /lib/systemd/systemd --user
│       └─27871 (sd-pam)  
└─kubepods
  ├─burstable
  │ ├─pod5645ed58-e98f-11e9-8443-52540087514c
  │ │ ├─1f8f76dacb8334bd8d8ab2a7432d2cc250286ca6b5b73ab6dca9a845b77a3a09
  │ │ │ └─8958 /configmap-reload --webhook-url=http://localhost:9090/-/reload --volume-dir=/etc/prometheus/rules/prometheus-k8s-rulefiles-0
  └─besteffort
    ├─pod3cf3ae0d-b7f4-11e9-8443-52540087514c
    │ ├─fde2178c5fa634206c2c86756c107c3de2828d2f90e2ea4c6a3b57f50c25267c
    │ │ └─5435 /pause
    │ └─5b4082efeb73ad102cc3fea33ff4c931c042a7120f0cd5277d46660aedffffde
    │   ├─ 5663 sh /install-cni.sh
    │   └─20347 sleep 3600

APIServer Structure

A good reference: https://note.youdao.com/ynoteshare1/index.html?id=63f58c5e98634c8b3df9da2b024aacd5&type=note

Key Flows

  • CreateKubeAPIServer
    • completedConfig.InstallLegacyAPI: api/all and api/legacy control all APIs and legacy APIs respectively
    • completedConfig.InstallAPIs
      • apiGroupInfo=restStorageBuilder.NewRESTStorage: the most important element is VersionedResourcesStorageMap map[string]map[string]rest.Storage: {"v1beta1":{"deployments":deploymentStorage.Deployment}}
        • Taking "app" as an example: if v1 is enabled: storageMap=RESTStorageProvider(storage_app).v1Storage
          • deploymentStorage = deploymentstore.NewStorage, storage["deployments"] = deploymentStorage.Deployment; deploymentStorage is made up of XXXREST elements, which are explained below
      • GenericAPIServer.InstallAPIGroups
        • s.installAPIResources: the core method for installing an API, wiring the API to its storage
          • apiGroupVersion.InstallREST
            • installer.Install()
              • registerResourceHandlers: associates every path in the storage map with its storage
              • e.g. actions = appendIf(actions, action{"GET", itemPath, nameParams, namer, false}, isGetter)
              • handler = restfulGetResource(getter, exporter, reqScope)
              • route := ws.GET(action.Path).To(handler).Doc(doc)....
        • s.DiscoveryGroupManager.AddGroup
        • s.Handler.GoRestfulContainer.Add(discovery.NewAPIGroupHandler(s.Serializer, apiGroup).WebService())
// NewREST returns a RESTStorage object that will work against deployments.
func NewREST(optsGetter generic.RESTOptionsGetter) (*REST, *StatusREST, *RollbackREST, error) {
	store := &genericregistry.Store{
		NewFunc:                  func() runtime.Object { return &apps.Deployment{} },
		NewListFunc:              func() runtime.Object { return &apps.DeploymentList{} },
		DefaultQualifiedResource: apps.Resource("deployments"),

		CreateStrategy: deployment.Strategy,
		UpdateStrategy: deployment.Strategy,
		DeleteStrategy: deployment.Strategy,

		TableConvertor: printerstorage.TableConvertor{TableGenerator: printers.NewTableGenerator().With(printersinternal.AddHandlers)},
	}
	options := &generic.StoreOptions{RESTOptions: optsGetter}
	if err := store.CompleteWithOptions(options); err != nil {
		return nil, nil, nil, err
	}

	statusStore := *store
	statusStore.UpdateStrategy = deployment.StatusStrategy
	return &REST{store, []string{"all"}}, &StatusREST{store: &statusStore}, &RollbackREST{store: store}, nil
}

type REST struct {
	*genericregistry.Store
	categories []string
}

genericregistry.Store defines NewFunc, NewListFunc, CreateStrategy, UpdateStrategy.
The core is DryRunnableStorage: the storage.Interface inside DryRunnableStorage is the actual CRUD entry point to the backing store.

type DryRunnableStorage struct {
	Storage storage.Interface
	Codec   runtime.Codec
}

Storage is a Cacher struct {real storage -> etcd3/store}.

generic.StoreOptions.RESTOptions determines the backend storage. It is part of completedConfig (genericapiserver.CompletedConfig) and is passed down layer by layer from the top: buildGenericConfig <- createAggregatorConfig; master.Config -> completedConfig.

Ultimately, generic.RESTOptions.Decorator = genericregistry.StorageWithCacher(cacheSize), i.e. an etcd backend with a cache (when EnableWatchCache is on, which defaults to true).

The cache implementation lives in vendor/k8s.io/apiserver/pkg/storage/cacher/cacher.go; the next section looks at this implementation in detail.

The Cache Implementation Inside the apiserver

Take watch as an example; the user is vendor/k8s.io/apiserver/pkg/registry/generic/registry/store.go

Action | Handling
------ | --------
Create | etcd3/store: Create
Delete | etcd3/store: Delete
Watch  | register a watcher on etcd3 to receive events; served from the cache
Get    | when resourceVersion is "", go directly to the store; otherwise read from the cache (waiting for the cache to reach resourceVersion)
List   | similar to Get

Debug Etcd

# download etcd
ETCD_VER=v3.4.0
DOWNLOAD_URL=https://github.com/etcd-io/etcd/releases/download
mkdir -p /tmp/etcd-download-test
curl -L ${DOWNLOAD_URL}/${ETCD_VER}/etcd-${ETCD_VER}-linux-amd64.tar.gz -o /tmp/etcd-${ETCD_VER}-linux-amd64.tar.gz && tar xzvf /tmp/etcd-${ETCD_VER}-linux-amd64.tar.gz -C /tmp/etcd-download-test --strip-components=1 && rm -f /tmp/etcd-${ETCD_VER}-linux-amd64.tar.gz

# set up the environment
export ETCDCTL_CERT=/etc/kubernetes/certs/kube-apiserver-etcd-client.crt
export ETCDCTL_KEY=/etc/kubernetes/certs/kube-apiserver-etcd-client.key
export ETCDCTL_CACERT=/etc/kubernetes/certs/kube-apiserver-etcd-ca.crt
export ETCDCTL_ENDPOINTS=https://etcd.cls-4lr4c4wx.ccs.tencent-cloud.com:2379


etcdctl get "" --prefix=true --limit=1 # get one key and value
etcdctl get "" --prefix=true --keys-only --limit=100 # get only keys
etcdctl get "/cls-4lr4c4wx/pods" --prefix=true --keys-only --limit=10 # get pod keys; cls-4lr4c4wx here is the etcd prefix
etcdctl get "/cls-4lr4c3wx/configmaps" --prefix=true --limit 1 --write-out="json" # output as json

What the Watch BOOKMARK Event in 1.16 Means

For example, a client watches pods:
GET /api/v1/namespaces/test/pods?watch=1&resourceVersion=10245&allowWatchBookmarks=true
---
200 OK
Transfer-Encoding: chunked
Content-Type: application/json
{
  "type": "ADDED",
  "object": {"kind": "Pod", "apiVersion": "v1", "metadata": {"resourceVersion": "10596", ...}, ...}
}
{
  "type": "BOOKMARK",
  "object": {"kind": "Pod", "apiVersion": "v1", "metadata": {"resourceVersion": "12746"} }
}

If the watcher then restarts, a watcher that received the BOOKMARK can resume watching from resourceVersion=12746, while one that did not must resume from resourceVersion=10596, even though between 10596 and 12746 there were in fact no events it cared about.

How the apiserver Implements the Aggregator

The aggregator itself is also a controller.

Original statement: This article is published on the Tencent Cloud Developer Community with the author's authorization and may not be reproduced without permission. For infringement concerns, contact cloudcommunity@tencent.com for removal.
