Effective Management Practices for K8S Cluster Infrastructure

How We Manage Our Widely Varied Kubernetes Infrastructures in Alibaba

Agenda
  • Background
  • Alibaba Kubernetes Architecture
  • Infrastructure Management
  • CI/CD Pipelines
  • Quick Demo

Background
  • Who are we?
  • Scale of Alibaba Kubernetes clusters (hundreds of internal clusters, 5k-10k nodes each)
  • Variety of cluster infrastructures (200+ addons)
  • Significance of keeping large-scale clusters stable

Architecture of Alibaba Kubernetes Infrastructure
  [Architecture diagram; recoverable labels only]
  • Cluster types: Tenant Cluster, Meta Cluster
  • Data Plane node stacks:
      • Kubelet, Pouch-Container, Pods; CNI: alinet; ultron-plugin
      • Alibaba ECS: Kubelet, containerd, Pods; CNI: terway; csi-plugin
      • Bare Metal: Kubelet, containerd, kata; CNI: alinet; CSI: ultron-plugin
  • Control Plane: kube-apiserver, kube-controller-manager, kube-scheduler, etcd, alpha, Customized Scheduler
  • Add-ons: Alert Operator, Monitoring Operator, Metrics Operator, Repair Operator, Customized Operator
  • Other labels: Multi-tenant, Network Controller, Storage Controller, Kruise, Defender Operator, KubeNode Operator

Infrastructure Management - Master

    apiVersion: /v1alpha1
    kind: Cluster
    metadata:
      labels:
        cluster.id: c3f1b726caecf4d0ca076f73ee781e312
      name: kubernetes-cluster
      namespace: c3f1b726caecf4d0ca076f73ee781e312
    spec:
      kubernetes:
        kcm:
          commit: 0bfce06
          name: kubernetes.kdm.kcm
          replicas: 3
          version: v1.16.3-alibaba.2
        kore:
          name: kubernetes.kdm.korepanel
          replicas: 3
          version: v1.16.3-alibaba.2
        rols:
          name: kubernetes.kdm.roles
          version: v1.16.3-alibaba.2
        scheduler:
          commit: 0bfce06
          name: kubernetes.kdm.scheduler
          replicas: 3
          version: v1.16.3-alibaba.2

Simplified logic of managing master versions
  [Diagram labels: Ops, CI, Kubernetes Version, Cluster API, Cluster Spec, watch, Kube-Apiserver, Kube-Controller-Manager, Kube-Scheduler]
  1. Push the k8s version
  2. Update the cluster spec
  3. Upgrade the master version
  • Use the Cluster API to manage master versions
  • Operators manage the infrastructure

Infrastructure Management - Addon
Simplified logic of managing addon versions
  [Diagram labels: Ops, CI, Operator-Manager, Old Pod, Canary Pod, New Pod]
  1. Push the operator version
  2. Call the Operator-Manager to trigger the canary (gray) release
  3. Operate the canary pod
  4. Upgrade the version
  • First create a canary pod, then update the operator rules and watch the canary pod status
  • Call UpdateOperatorRule to empty the rules and delete the canary pod
  • Upgrade to the new version

Infrastructure Management - Dataplane
Simplified logic of managing data plane versions
  [Diagram labels: Ops, CI, Machine Operator, MachineComponentSet, watch, kube-node-agent, RPM]
  1. Push the RPM version
  2. Create a MachineComponentSet
  • Call the kube-node-agent to upgrade the RPM version
  • Upgrade the RPM
  • Use partition to control the gray-release batches
  • KubeNode: upgrade a dataplane component

"Philosophy"
  • Components vary from cluster to cluster: how to manage components
  • Always provide a stable component version: how to make stable releases
  • Continuous and non-disruptive cluster delivery: how to build safe delivery pipelines

Component Management
  • Image-oriented
      • Only patch the container image
      • Simple, but does not fit all cases
  • YAML-oriented
      • Helm template
      • Separate image and meta-config
  • Design for CI
      • Helm + version control

Component Management
  Template (YAML): constants that never change

    apiVersion: apps.kruise.io/v1alpha1
    kind: DaemonSet
    metadata:
      name: asi-proxy-ds-1
      namespace: kube-system
    spec:
      template:
        spec:
          containers:
          - image: {{ .image.nginx.repository }}:{{ .image.nginx.tag }}
            resource: {{ toYaml .resource | indent 8 }}
          tolerations: {{ toYaml .tolerations | indent 8 }}

  Image: expected to be the same across clusters

    image:
      nginx:
        repository: nginx
        tag: latest

  Meta-Config: varies from cluster to cluster

    resource:
      requests:
        cpu: 1
        memory: 2Gi
      limit:
        cpu: 2
        memory: 4Gi
    tolerations:
    - operator: Exists

  Infrastructure Components = YAML = Template + Image + Meta-Config

Component Management
  • Do what kubectl apply does
  • Compare with the current spec / cluster spec
  • PATCH the diff to the apiserver
  • Filter out dangerous fields
  • Three-way diff
  • Trigger the operators' reconcile
  [Diagram labels: Template, Image, Meta; DB Spec, Target Spec, Cluster Spec; Resource Diff; New Cluster Spec; Recorded Image Version; New Meta Version; pods. Example values: replicas: 3, cpu: 1, mem: 2Gi]

Version Release & Testing
  • Branch update
  • Run e2e tests
  • Release and delivery
  [Diagram labels: Dev branch; New features & fixes; Deploy to e2e Cluster; Control plane and addons; Release v1.0.0; Clusters]

Version Release & Testing
  • Kubernetes Conformance e2e
      • e2e-cluster-1: apiserver, kcm, scheduler, cni-service, extension-webhook, extension-controller, Pouch
      • e2e-cluster-2: apiserver, kcm, scheduler, kube-proxy, coredns, containerd
      • e2e-cluster-3: apiserver, kcm, scheduler, terway, cloud-controller-manager, containerd
  • Whitebox Testing: Operators in guest-cluster-1, guest-cluster-2, guest-cluster-3
  • Blackbox Testing: Canary Test Sets 1, 2, 3
  • e2e testing is not enough
  • Canary tests run continuously
      • Create/delete pod/sts/deploy
      • Upgrade sts/deploy
      • Scale up/down sts/deploy
      • Create Job
      • Create CustomResource

Intra-cluster upgrade
  • Rolling updates for Kubernetes workloads
      • Deployment (Kruise)
      • StatefulSet (Kruise)
      • DaemonSet (Kruise)
      • Dataplane components (KubeNode)
  • Rollout policy
  • Pause/Resume
  • Max unavailable

                     Deployment        StatefulSet       DaemonSet        Dataplane Components
    Rollout Policy   RollingUpdate,    RollingUpdate,    RollingUpdate    RollingUpdate
                     Canary Deploy     Canary Deploy
    Pause/Resume     Yes               Yes               Yes              Yes
    Max unavailable  Not yet           Yes               Yes              Yes
    Partition        No                No                Yes              Yes

  /openkruise/kruise

Rollout for operators
  • Enhance the ability of Operators (StatefulSet / Deployment)
  • Implement operators the way kubebuilder does
  • Sidecar container that contains the clientset, informers and plugins
  • Serve the operator with gRPC requests

Rollout for operators
  • Canary deploy for Operators
  • Flow control on a monolithic manager
  • Flow slices controlled by rules (Custom Resource)
  • Rolling update

Rollout for DaemonSet
  • Original DaemonSet
      • Lacks the ability to do rolling updates
      • Always updates all pods once the image changes
      • OnDelete?
  [Diagram states: Replicas: 5, Updated Replicas: 0; Replicas: 5, Updated Replicas: 5]

Rollout for DaemonSet
  • Kruise: enhance the ability of DaemonSets
  • Partition: the number of pods that remain on the old version
  • MaxUnavailable: the maximum number of pods that can be unavailable during a rolling update
  [Diagram states: Replicas: 5, Updated Replicas: 0, Partition: 5; Replicas: 5, Updated Replicas: 5, Partition: 4; Replicas: 5, Updated Replicas: 5, Partition: 2]

Rollout for Dataplane
  • Kubelet / Pouch / containerd
  • Similar to the Kruise DaemonSet in partition control
  • NodeSet: a group of nodes with the same characteristics; the minimum rollout unit
  • Rolling update within each NodeSet
  • Upgrade NodeSets sequentially
  [Diagram labels: Cluster - 1, Cluster - 2; NS - 1, NS - 2, NS - 3, NS - 4, NS - 5, NS - 6; Bake time]

Inter-cluster upgrades
  • Inter-cluster rollout pipelines
  • Orchestrate clusters by scale / importance of the upper-layer business apps
  • Build a gray release pipeline
  • Tekton-like implementation
  [Diagram labels: Testing Cluster, Canary Cluster, Smallflow Cluster, Production Cluster; "Contains a small percentage of service invocations"]

Inter-cluster upgrades
  • Inter-cluster rollout pipelines
  • Silent period between clusters
  • Pre-checking and post-checking
      • Time window checker / rule-based blocker
      • Metric monitoring / health checks
  [Diagram labels: Cluster - 1, Cluster - 2, Cluster - 3, Cluster - 4]

Pipelines inter-cluster
Put all things together, here comes our pipeline journey!
  [Diagram labels: Source Code, Unit Test, Build Image, e2e Cluster, Ca
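
The Partition and MaxUnavailable definitions on the "Rollout for DaemonSet" slide can be read as a small piece of arithmetic. The following is a minimal Python sketch of that reading only, not Kruise's or KubeNode's actual implementation. The names RolloutState, pods_to_update and simulate are hypothetical, and the simulation assumes every updated pod becomes ready again before the next step.

```python
from dataclasses import dataclass


@dataclass
class RolloutState:
    """Hypothetical snapshot of a workload during a rolling update."""
    replicas: int         # total pods managed by the workload
    updated: int          # pods already running the new version
    unavailable: int = 0  # pods that are currently not ready


def pods_to_update(state: RolloutState, partition: int, max_unavailable: int) -> int:
    """Return how many old pods may be updated in one reconcile step."""
    if partition < 0 or max_unavailable < 0:
        raise ValueError("partition and max_unavailable must be non-negative")
    # Partition pods must stay on the old version, so at most
    # replicas - partition pods may ever run the new version.
    target_updated = max(state.replicas - partition, 0)
    remaining = max(target_updated - state.updated, 0)
    # MaxUnavailable caps how many pods may be down at the same time.
    budget = max(max_unavailable - state.unavailable, 0)
    return min(remaining, budget)


def simulate(replicas: int, partition: int, max_unavailable: int) -> list:
    """Return the cumulative number of updated pods after each step.

    Assumption: each batch becomes ready again before the next step,
    so state.unavailable stays at zero between steps.
    """
    state = RolloutState(replicas=replicas, updated=0)
    history = []
    while True:
        batch = pods_to_update(state, partition, max_unavailable)
        if batch == 0:
            break
        state.updated += batch
        history.append(state.updated)
    return history
```

Under these assumptions, simulate(5, partition=2, max_unavailable=1) walks through 1, 2 and 3 updated pods and then stops, leaving two pods on the old version. Lowering the partition step by step is what turns this into a gray release.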
