Effective Practices for Managing Kubernetes Cluster Infrastructure
How we Manage our Widely Varied Kubernetes Infrastructures in Alibaba

Agenda
- Background
- Alibaba Kubernetes Architecture
- Infrastructure Management
- CI/CD Pipelines
- Quick Demo

Background
- Who are we?
- Scale of Alibaba Kubernetes clusters: hundreds of internal clusters, 5k-10k nodes each
- Variety of cluster infrastructures: 200+ addons
- Significance of keeping large-scale clusters stable

Architecture of Alibaba Kubernetes Infrastructure
- Meta Cluster and Tenant Cluster
- Data Plane
  - Kubelet, Pouch-Container, Pods; CNI: alinet; ultron-plugin
  - Alibaba ECS: Kubelet, containerd, Pods; CNI: terway; csi-plugin
  - Bare Metal: Kubelet, containerd, kata; CNI: alinet; CSI: ultron-plugin
  - Multi-tenant
- Control Plane: kube-apiserver, kube-controller-manager, kube-scheduler, etcd, alpha, Customized Scheduler
- Add-ons: Network Controller, Storage Controller, Kruise, Defender Operator, KubeNode Operator, Alert Operator, Monitoring Operator, Metrics Operator, Repair Operator, Customized Operator

Infrastructure Management - Master
Cluster spec:

    apiVersion: /v1alpha1
    kind: Cluster
    metadata:
      labels:
        cluster.id: c3f1b726caecf4d0ca076f73ee781e312
      name: kubernetes-cluster
      namespace: c3f1b726caecf4d0ca076f73ee781e312
    spec:
      kubernetes:
        kcm:
          commit: 0bfce06
          name: kubernetes.kdm.kcm
          replicas: 3
          version: v1.16.3-alibaba.2
        kore:
          name: kubernetes.kdm.korepanel
          replicas: 3
          version: v1.16.3-alibaba.2
        rols:
          name: kubernetes.kdm.roles
          version: v1.16.3-alibaba.2
        scheduler:
          commit: 0bfce06
          name: kubernetes.kdm.scheduler
          replicas: 3
          version: v1.16.3-alibaba.2

Simplified logic of managing master versions (Ops/CI, Cluster Spec, Cluster API, Kube-Apiserver, Kube-Controller-Manager, Kube-Scheduler):
- 1. Ops/CI pushes the Kubernetes version.
- 2. The cluster spec is updated.
- 3. Cluster API watches the cluster spec and upgrades the master version.
- Use Cluster API to manage master versions.
- Use operators to manage the infrastructure.

Infrastructure Management - Addon
Simplified logic of managing addon versions (Ops/CI, Operator-Manager, Old Pod, Canary Pod, New Pod):
- 1. Push the operator version.
- 2. Call Operator-Manager to trigger the canary (gray) release.
- 3. Operate the canary pod: first create a canary pod, then update the operator rules and watch the canary pod status.
- 4. Upgrade the version: call UpdateOperatorRule to empty the rules, delete the canary pod, and upgrade to the new version.

Infrastructure Management - Dataplane
Simplified logic of managing data plane versions (Ops/CI, Machine Operator, MachineComponentSet, kube-node-agent, RPM):
- 1. Push the RPM version.
- 2. Create a MachineComponentSet.
- Machine Operator watches the MachineComponentSet and calls kube-node-agent to upgrade the RPM version.
- kube-node-agent upgrades the RPM.
- Use partition to control the gray-release batches.
- KubeNode: upgrade a dataplane component.

"Philosophy"
- Components vary from cluster to cluster: how to manage components
- Always provide a stable component version: how to make stable releases
- Continuous and non-disruptive cluster delivery: how to build safe delivery pipelines

Component Management
- Image-oriented
  - Only patch the container image
  - Simple, but does not fit all cases
- YAML-oriented
  - Helm template
  - Separate image and meta-config
  - Designed for CI
  - Helm + version control

YAML template:

    apiVersion: apps.kruise.io/v1alpha1
    kind: DaemonSet
    metadata:
      name: asi-proxy-ds-1
      namespace: kube-system
    spec:
      template:
        spec:
          containers:
          - image: {{ .image.nginx.repository }}:{{ .image.nginx.tag }}
            resource: {{ toYaml .resource | indent 8 }}
          tolerations: {{ toYaml .tolerations | indent 8 }}

Image and meta-config values:

    image:
      nginx:
        repository: nginx
        tag: latest
    resource:
      requests:
        cpu: 1
        memory: 2Gi
      limit:
        cpu: 2
        memory: 4Gi
    tolerations:
    - operator: Exists

- Infrastructure Components = YAML = Template + Image + Meta-Config
- Template: constants that never change
- Image: expected to be the same
- Meta-Config: varies from cluster to cluster

Applying a component:
- Do what kubectl apply does
- Compare with the current spec / cluster spec
- PATCH the diff to the apiserver
- Filter out dangerous fields
- Three-way diff
- Trigger the operators' reconcile
- Diagram labels: Template, Image, Meta; DB Spec, Target Spec, Cluster Spec; Resource Diff; New Cluster Spec; Recorded Image Version; New Meta Version
- Example fields: replicas: 3, cpu: 1, mem: 2Gi

Version Release & Testing
- Branch update -> Run e2e tests -> Release and delivery
- Dev branch: new features & fixes
- Deploy the control plane and addons to the e2e cluster
- Release v1.0.0 is delivered to the clusters
- White-box testing: Kubernetes Conformance e2e
  - e2e-cluster-1: apiserver, kcm, scheduler, cni-service, extension-webhook, extension-controller, Pouch
  - e2e-cluster-2: apiserver, kcm, scheduler, kube-proxy, coredns, containerd
  - e2e-cluster-3: apiserver, kcm, scheduler, terway, cloud-controller-manager, containerd
- Black-box testing
  - guest-cluster-1: Operators, Canary Test Sets 1
  - guest-cluster-2: Operators, Canary Test Sets 2
  - guest-cluster-3: Operators, Canary Test Sets 3

e2e testing is not enough
- Canary tests run continuously
  - Create/delete pod/sts/deploy
  - Upgrade sts/deploy
  - Scale up/down sts/deploy
  - Create Job
  - Create CustomResource

Intra-cluster upgrade
- Rolling updates for Kubernetes workloads
  - Deployment (Kruise)
  - StatefulSet (Kruise)
  - DaemonSet (Kruise)
  - Dataplane components (KubeNode)
- Rollout policy
- Pause/Resume
- Max unavailable

| | Deployment | StatefulSet | DaemonSet | Dataplane Components |
| Rollout Policy | RollingUpdate, Canary Deploy | RollingUpdate, Canary Deploy | RollingUpdate | RollingUpdate |
| Pause/Resume | Yes | Yes | Yes | Yes |
| Max unavailable | Not yet | Yes | Yes | Yes |
| Partition | No | No | Yes | Yes |

/openkruise/kruise

Rollout for operators
- Enhance the ability of Operator (StatefulSet / Deployment)
- Implement the operator the way kubebuilder does
- Sidecar container which contains clientset, informer and plugins
- Serve the operator with gRPC requests
- Canary deploy for operators
- Flow control on a monolithic manager
- Flow slices controlled by rules (Custom Resource)
- Rolling update

Rollout for DaemonSet
- Original DaemonSet
  - Lacks rolling-update ability: always updates all pods once the image changes
  - OnDelete?
  - Replicas: 5, Updated Replicas: 0 -> Replicas: 5, Updated Replicas: 5
- Kruise: enhance the ability of DaemonSets
  - Partition: the number of pods that remain on the old version
  - MaxUnavailable: the maximum number of pods that can be unavailable during a rolling update
  - With Replicas: 5, Partition: 5 leaves Updated Replicas at 0; lowering Partition to 4 and then 2 lets more pods move to the new version

Rollout for Dataplane
- Kubelet / Pouch / containerd
- Similar to Kruise DaemonSet in partition control
- NodeSet: a group of nodes with the same characteristics; the minimum rollout unit
- Rolling update within each NodeSet
- Upgrade NodeSets sequentially
- Diagram: Cluster-1 (NS-1, NS-2, NS-3), Cluster-2 (NS-4, NS-5, NS-6), with bake time

Inter-cluster upgrades
- Inter-cluster rollout pipelines
- Orchestrate clusters by scale / importance of the upper-layer business apps
- Build a gray release pipeline
- Tekton-like implementation
- Testing Cluster -> Canary Cluster -> Smallflow Cluster (contains a small percentage of service invocations) -> Production Cluster
- Silent period between clusters
- Pre-checking and post-checking
  - Time window checker / rule-based blocker
  - Metric monitoring / health checks
- Cluster-1 -> Cluster-2 -> Cluster-3 -> Cluster-4

Pipelines inter-cluster
Putting it all together, here is our pipeline journey: Source Code -> Unit Test -> Build Image -> e2e Cluster -> Ca...
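The Component Management slides mention a three-way diff, filtering out dangerous fields, and PATCHing the result to the apiserver. The slides do not show the implementation, so the following is only a minimal sketch under assumptions: specs are plain nested dictionaries, a `None` value in the patch means "delete this key" (as in a JSON merge patch), and the function names and the dotted-path list of dangerous fields are hypothetical.

```python
from typing import Any, Dict, Set


def three_way_patch(last_applied: Dict[str, Any],
                    target: Dict[str, Any],
                    live: Dict[str, Any]) -> Dict[str, Any]:
    """Build a merge-style patch that moves `live` towards `target`.

    Keys that exist only in `live` (set by other controllers) are left alone.
    Keys that we applied before but no longer want are marked for deletion.
    """
    patch: Dict[str, Any] = {}
    for key, want in target.items():
        have = live.get(key)
        if isinstance(want, dict) and isinstance(have, dict):
            prev = last_applied.get(key)
            sub = three_way_patch(prev if isinstance(prev, dict) else {}, want, have)
            if sub:
                patch[key] = sub
        elif want != have:
            patch[key] = want
    for key in last_applied:
        if key not in target and key in live:
            patch[key] = None  # we set it earlier and dropped it now: delete
    return patch


def filter_dangerous(patch: Dict[str, Any], dangerous: Set[str],
                     prefix: str = "") -> Dict[str, Any]:
    """Drop patch entries whose dotted path is listed in `dangerous` (hypothetical list)."""
    safe: Dict[str, Any] = {}
    for key, value in patch.items():
        path = prefix + key
        if path in dangerous:
            continue
        if isinstance(value, dict):
            nested = filter_dangerous(value, dangerous, path + ".")
            if nested:
                safe[key] = nested
        else:
            safe[key] = value
    return safe


# Example values loosely based on the fields shown on the slide.
last_applied = {"spec": {"replicas": 3, "resource": {"cpu": "1", "mem": "2Gi"}, "legacy": True}}
target = {"spec": {"replicas": 5, "resource": {"cpu": "2", "mem": "2Gi"}}}
live = {"spec": {"replicas": 4, "resource": {"cpu": "1", "mem": "2Gi"},
                 "legacy": True, "injected": "x"}}

raw_patch = three_way_patch(last_applied, target, live)
safe_patch = filter_dangerous(raw_patch, {"spec.replicas"})
```

Here `injected` never appears in the patch because it was not set by the previous apply, and `spec.replicas` is stripped because the sketch treats it as dangerous; which fields actually count as dangerous in the authors' system is not stated in the slides.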
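The Rollout for DaemonSet slides define Partition as the number of pods kept on the old version and MaxUnavailable as the cap on unavailable pods during a rolling update. How the two interact can be sketched as a small batch planner. This is an illustration only, not Kruise's actual controller logic; the `Pod` record and `next_batch` function are hypothetical names.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Pod:
    name: str
    updated: bool    # already running the new version
    available: bool  # currently ready to serve


def next_batch(pods: List[Pod], partition: int, max_unavailable: int) -> List[str]:
    """Return the names of old pods that may be updated in the next step."""
    replicas = len(pods)
    updated = sum(1 for p in pods if p.updated)
    # Partition pods must stay on the old version, so at most
    # replicas - partition pods may ever be updated.
    remaining = max(0, (replicas - partition) - updated)
    # Every pod we restart becomes unavailable for a while, so respect the budget.
    unavailable = sum(1 for p in pods if not p.available)
    budget = max(0, max_unavailable - unavailable)
    # Simplification: only consider old pods that are currently available.
    candidates = [p.name for p in pods if not p.updated and p.available]
    return candidates[:min(remaining, budget)]


fleet = [Pod(f"pod-{i}", updated=False, available=True) for i in range(5)]
```

With five old pods, `next_batch(fleet, partition=5, max_unavailable=1)` selects nothing, while lowering the partition to 4 releases exactly one pod, which matches the gray-release behaviour the slides describe.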
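The inter-cluster upgrade slides describe a gray release pipeline that walks clusters in order, with pre-checks, post-checks, and a silent period between clusters. The slides call the real system Tekton-like but give no code, so the following is a rough sketch under assumptions: `run_rollout` and its callback parameters are hypothetical, and the checks stand in for the time-window checker, rule-based blocker, and health checks named on the slides.

```python
import time
from typing import Callable, List, Optional, Sequence, Tuple


def run_rollout(clusters: Sequence[str],
                upgrade: Callable[[str], None],
                pre_check: Callable[[str], bool],
                post_check: Callable[[str], bool],
                silent_seconds: float = 0.0,
                sleep: Callable[[float], None] = time.sleep
                ) -> Tuple[List[str], Optional[str]]:
    """Upgrade clusters one by one; stop at the first failed check.

    Returns (clusters that completed, cluster that blocked the rollout or None).
    """
    completed: List[str] = []
    for index, cluster in enumerate(clusters):
        if not pre_check(cluster):       # e.g. outside the allowed time window
            return completed, cluster
        upgrade(cluster)
        if not post_check(cluster):      # e.g. metrics or health checks regressed
            return completed, cluster
        completed.append(cluster)
        if index < len(clusters) - 1:
            sleep(silent_seconds)        # silent period before the next cluster
    return completed, None


# Stage names follow the order shown on the slide.
stages = ["testing", "canary", "smallflow", "production"]
upgraded: List[str] = []
pauses: List[float] = []
done, blocked = run_rollout(
    stages,
    upgrade=upgraded.append,
    pre_check=lambda cluster: True,
    post_check=lambda cluster: cluster != "smallflow",  # simulate a failed health check
    silent_seconds=60,
    sleep=pauses.append,  # record the pauses instead of really sleeping
)
```

Passing `sleep` as a parameter keeps the sketch testable; in this run the simulated failure on the smallflow cluster stops the pipeline before production is touched.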