SRE Daily Topic: Kubernetes Production Deployment Guide

Date: 2026-03-09
Topic number: 9 (9 % 12 = 9)
Difficulty: ⭐⭐⭐⭐⭐
Applicable scenario: production Kubernetes cluster deployment and operations


Table of Contents

  1. Cluster architecture design
  2. Node configuration and labels
  3. Core component configuration
  4. Network plugin selection and configuration
  5. Storage options
  6. Security hardening
  7. Resource quotas and limits
  8. Monitoring and alerting
  9. Log collection
  10. Troubleshooting
  11. Best-practices checklist

1. Cluster Architecture Design

1.1 Recommended architecture

┌─────────────────────────────────────────────────────────────┐
│                        Load Balancer                         │
│                    (Cloud LB / F5 / Nginx)                   │
└─────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────┐
│                    Control Plane (HA)                        │
│  ┌───────────┐  ┌───────────┐  ┌───────────┐               │
│  │  Master 1 │  │  Master 2 │  │  Master 3 │               │
│  │  etcd     │  │  etcd     │  │  etcd     │               │
│  │  API      │  │  API      │  │  API      │               │
│  └───────────┘  └───────────┘  └───────────┘               │
└─────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────┐
│                      Worker Nodes                            │
│  ┌───────────┐  ┌───────────┐  ┌───────────┐  ┌───────────┐ │
│  │  Node 1   │  │  Node 2   │  │  Node 3   │  │  Node N   │ │
│  │  App      │  │  App      │  │  App      │  │  App      │ │
│  └───────────┘  └───────────┘  └───────────┘  └───────────┘ │
└─────────────────────────────────────────────────────────────┘

1.2 Node sizing recommendations

Cluster size   Master nodes          Worker nodes   Use case
Small          3 (3 vCPU / 6 GB)     3-10           dev/test
Medium         3 (4 vCPU / 8 GB)     10-50          production
Large          3-5 (8 vCPU / 16 GB)  50-200         large-scale production
Extra-large    5 (16 vCPU / 32 GB)   200+           enterprise scale

1.3 High-availability configuration

# kubeadm-config.yaml
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
kubernetesVersion: v1.29.0
controlPlaneEndpoint: "k8s-vip.example.com:6443"
networking:
  podSubnet: "10.244.0.0/16"
  serviceSubnet: "10.96.0.0/12"
  dnsDomain: "cluster.local"
apiServer:
  certSANs:
    - "k8s-vip.example.com"
    - "192.168.1.100"
    - "192.168.1.101"
    - "192.168.1.102"
  extraArgs:
    authorization-mode: "Node,RBAC"
    enable-admission-plugins: "NodeRestriction"  # PodSecurityPolicy was removed in v1.25; use Pod Security Admission instead (see 6.2)
    audit-log-path: "/var/log/kubernetes/audit.log"
    audit-log-maxage: "30"
    audit-log-maxbackup: "10"
    audit-log-maxsize: "100"
controllerManager:
  extraArgs:
    bind-address: "0.0.0.0"
    node-monitor-grace-period: "40s"
scheduler:
  extraArgs:
    bind-address: "0.0.0.0"
etcd:
  local:
    dataDir: "/var/lib/etcd"
    extraArgs:
      heartbeat-interval: "250"
      election-timeout: "2500"
      max-request-bytes: "10485760"
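
Initializing the first control-plane node with this file, then joining the others (a typical flow using standard kubeadm options):

kubeadm init --config kubeadm-config.yaml --upload-certs
# On each additional control-plane node, run the "kubeadm join ... --control-plane"
# command printed by the init step above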

2. Node Configuration and Labels

2.1 Kernel parameter tuning

# /etc/sysctl.d/99-kubernetes.conf
# Network tuning
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1
net.ipv4.conf.all.forwarding = 1
net.ipv4.tcp_keepalive_time = 600
net.ipv4.tcp_keepalive_intvl = 60
net.ipv4.tcp_keepalive_probes = 3
net.ipv4.tcp_max_syn_backlog = 8192
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_fin_timeout = 30
net.ipv4.ip_local_port_range = 1024 65535
net.core.somaxconn = 32768
net.core.netdev_max_backlog = 5000
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216

# File descriptors
fs.file-max = 2097152
fs.inotify.max_user_watches = 524288
fs.inotify.max_user_instances = 512

# Memory management
vm.swappiness = 1
vm.max_map_count = 262144
vm.overcommit_memory = 1
vm.panic_on_oom = 0

# Apply the configuration
sysctl -p /etc/sysctl.d/99-kubernetes.conf
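
Note: the net.bridge.* keys above exist only after the br_netfilter kernel module is loaded (and container runtimes typically also need overlay), so persist and load both before applying the sysctls:

# /etc/modules-load.d/kubernetes.conf
overlay
br_netfilter

# Load immediately
modprobe overlay
modprobe br_netfilter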

2.2 Node labeling strategy

# Role labels (the "master" role label was replaced by "control-plane" in v1.24)
kubectl label nodes <node-name> node-role.kubernetes.io/control-plane=""
kubectl label nodes <node-name> node-role.kubernetes.io/worker=""

# Environment labels
kubectl label nodes <node-name> env=production
kubectl label nodes <node-name> env=staging

# Topology labels
kubectl label nodes <node-name> zone=us-east-1a
kubectl label nodes <node-name> region=us-east-1

# Hardware labels
kubectl label nodes <node-name> instance-type=m5.xlarge
kubectl label nodes <node-name> cpu=4
kubectl label nodes <node-name> memory=16Gi

# Team labels
kubectl label nodes <node-name> team=platform
kubectl label nodes <node-name> team=backend
kubectl label nodes <node-name> team=data

# Dedicated-node labels
kubectl label nodes <node-name> dedicated=database
kubectl label nodes <node-name> dedicated=gpu
kubectl label nodes <node-name> dedicated=monitoring
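
Labels only attract matching workloads; to keep everything else off dedicated nodes, pair each dedicated label with a taint (same key/values as above):

kubectl taint nodes <node-name> dedicated=database:NoSchedule
kubectl taint nodes <node-name> dedicated=gpu:NoSchedule
kubectl taint nodes <node-name> dedicated=monitoring:NoSchedule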

2.3 Node affinity example

apiVersion: v1
kind: Pod
metadata:
  name: database-pod
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: dedicated
            operator: In
            values:
            - database
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        preference:
          matchExpressions:
          - key: zone
            operator: In
            values:
            - us-east-1a
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: app
            operator: In
            values:
            - postgresql
        topologyKey: "kubernetes.io/hostname"
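
If the database nodes are tainted as suggested in 2.2, the pod also needs a matching toleration alongside the affinity above, a minimal addition to the same spec:

  tolerations:
  - key: "dedicated"
    operator: "Equal"
    value: "database"
    effect: "NoSchedule"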

3. Core Component Configuration

3.1 kube-apiserver tuning

# Key parameters in /etc/kubernetes/manifests/kube-apiserver.yaml
spec:
  containers:
  - name: kube-apiserver
    command:
    - kube-apiserver
    - --request-timeout=1m0s
    - --max-requests-inflight=400
    - --max-mutating-requests-inflight=200
    - --apiserver-count=3
    - --service-node-port-range=30000-32767
    - --allow-privileged=true
    - --anonymous-auth=false
    - --secure-port=6443
    - --tls-cert-file=/etc/kubernetes/pki/apiserver.crt
    - --tls-private-key-file=/etc/kubernetes/pki/apiserver.key
    - --client-ca-file=/etc/kubernetes/pki/ca.crt
    - --service-account-key-file=/etc/kubernetes/pki/sa.pub
    - --service-account-signing-key-file=/etc/kubernetes/pki/sa.key
    - --service-account-issuer=https://kubernetes.default.svc
    - --encryption-provider-config=/etc/kubernetes/encryption-config.yaml
    - --profiling=false
    - --audit-policy-file=/etc/kubernetes/audit-policy.yaml
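
The audit policy referenced above must exist on every control-plane node; a minimal sketch that keeps request bodies out of the log:

# /etc/kubernetes/audit-policy.yaml
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
  # Skip noisy health and metrics probes entirely
  - level: None
    nonResourceURLs: ["/healthz*", "/readyz*", "/livez*", "/metrics"]
  # Record metadata only for everything else (no request/response bodies)
  - level: Metadata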

3.2 kube-controller-manager tuning

spec:
  containers:
  - name: kube-controller-manager
    command:
    - kube-controller-manager
    - --node-monitor-grace-period=40s
    - --node-monitor-period=5s
    - --terminated-pod-gc-threshold=12500
    - --concurrent-deployment-syncs=10
    - --concurrent-endpoint-syncs=10
    - --concurrent-rc-syncs=5
    - --concurrent-service-endpoint-syncs=10
    - --horizontal-pod-autoscaler-sync-period=15s
    - --horizontal-pod-autoscaler-downscale-stabilization=5m0s
    - --leader-elect=true
    - --use-service-account-credentials=true

3.3 kube-scheduler tuning

spec:
  containers:
  - name: kube-scheduler
    command:
    - kube-scheduler
    - --leader-elect=true
    - --profiling=false
    - --bind-address=0.0.0.0
    - --secure-port=10259
    - --feature-gates=PodSchedulingReadiness=true

3.4 kubelet configuration

# /var/lib/kubelet/config.yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
authentication:
  anonymous:
    enabled: false
  webhook:
    cacheTTL: 0s
    enabled: true
  x509:
    clientCAFile: /etc/kubernetes/pki/ca.crt
authorization:
  mode: Webhook
  webhook:
    cacheAuthorizedTTL: 0s
    cacheUnauthorizedTTL: 0s
cgroupDriver: systemd
clusterDNS:
  - 10.96.0.10
clusterDomain: cluster.local
containerLogMaxFiles: 5
containerLogMaxSize: 10Mi
cpuManagerPolicy: static
cpuManagerReconcilePeriod: 10s
eventRecordQPS: 5
evictionHard:
  imagefs.available: 15%
  memory.available: 100Mi
  nodefs.available: 10%
  nodefs.inodesFree: 5%
evictionPressureTransitionPeriod: 5m0s
failSwapOn: true
featureGates:
  RotateKubeletServerCertificate: true
fileCheckFrequency: 20s
hairpinMode: promiscuous-bridge
healthzBindAddress: 127.0.0.1
healthzPort: 10248
httpCheckFrequency: 20s
imageMinimumGCAge: 2m0s
imageGCHighThresholdPercent: 85
imageGCLowThresholdPercent: 80
kubeReserved:
  cpu: 100m
  memory: 256Mi
  ephemeral-storage: 1Gi
systemReserved:
  cpu: 100m
  memory: 256Mi
  ephemeral-storage: 1Gi
maxOpenFiles: 1000000
maxPods: 110
nodeStatusReportFrequency: 10s
nodeStatusUpdateFrequency: 10s
podPidsLimit: 4096
port: 10250
readOnlyPort: 0
rotateCertificates: true
runtimeRequestTimeout: 2m0s
serializeImagePulls: false
staticPodPath: /etc/kubernetes/manifests
streamingConnectionIdleTimeout: 4h0m0s
syncFrequency: 1m0s
volumeStatsAggPeriod: 1m0s
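
After changing the file, restart the kubelet and confirm the node stays Ready; the kubelet's configz endpoint, proxied through the API server, echoes the active configuration:

systemctl restart kubelet
kubectl get nodes
kubectl get --raw "/api/v1/nodes/<node-name>/proxy/configz" | jq .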

4. Network Plugin Selection and Configuration

4.1 Calico configuration (recommended)

# calico-config.yaml
apiVersion: operator.tigera.io/v1  # the Installation resource belongs to the Tigera operator, not projectcalico.org/v3
kind: Installation
metadata:
  name: default
spec:
  calicoNetwork:
    bgp: Disabled
    ipPools:
    - blockSize: 26
      cidr: 10.244.0.0/16
      encapsulation: VXLANCrossSubnet
      natOutgoing: Enabled
      nodeSelector: all()
    nodeAddressAutodetectionV4:
      firstFound: true
  controlPlaneReplicas: 3
  variant: Calico
---
apiVersion: projectcalico.org/v3
kind: FelixConfiguration
metadata:
  name: default
spec:
  bpfEnabled: false
  bpfLogLevel: ""
  floatingIPs: Disabled
  healthEnabled: true
  logSeverityScreen: Info
  reportingInterval: 0s
  wireguardEnabled: false
---
apiVersion: projectcalico.org/v3
kind: BGPConfiguration
metadata:
  name: default
spec:
  logSeverityScreen: Info
  nodeToNodeMeshEnabled: false
  asNumber: 64512
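
These resources are consumed by the Tigera operator; a typical bootstrap (pin the manifest to the Calico release you have validated, v3.27.0 here is only an example):

kubectl create -f https://raw.githubusercontent.com/projectcalico/calico/v3.27.0/manifests/tigera-operator.yaml
kubectl apply -f calico-config.yaml
kubectl get tigerastatus -w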

4.2 Cilium configuration (eBPF, high performance)

Cilium is normally installed with Helm; the values below are a trimmed production starting point. Key names follow the upstream cilium/cilium chart, so verify them against the chart version you actually deploy.

# cilium-values.yaml (Helm values for the cilium/cilium chart)
ipam:
  mode: kubernetes
routingMode: tunnel
tunnelProtocol: vxlan
# Replace kube-proxy with eBPF; needs a reachable API server endpoint
kubeProxyReplacement: true
k8sServiceHost: k8s-vip.example.com
k8sServicePort: 6443
bpf:
  masquerade: true
enableIPv4Masquerade: true
enableIPv6: false
loadBalancer:
  algorithm: maglev
operator:
  replicas: 2
  prometheus:
    enabled: true
    serviceMonitor:
      enabled: true
prometheus:
  enabled: true
  serviceMonitor:
    enabled: true
hubble:
  enabled: true
  relay:
    enabled: true
  metrics:
    enabled:
      - dns
      - drop
      - tcp
      - flow
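
A typical install with these values (the chart version is an example; match it to your validated release):

helm repo add cilium https://helm.cilium.io/
helm repo update
helm install cilium cilium/cilium --version 1.15.0 \
  --namespace kube-system -f cilium-values.yaml

# Verify agent and operator health (the last step requires the cilium CLI)
kubectl -n kube-system get pods -l k8s-app=cilium
cilium status --wait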

4.3 CoreDNS configuration

# coredns-configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns
  namespace: kube-system
data:
  Corefile: |
    .:53 {
        errors
        health {
            lameduck 5s
        }
        ready
        kubernetes cluster.local in-addr.arpa ip6.arpa {
            pods insecure
            fallthrough in-addr.arpa ip6.arpa
            ttl 30
        }
        prometheus :9153
        forward . /etc/resolv.conf {
            max_concurrent 1000
        }
        cache 30
        loop
        reload
        loadbalance
        log
    }
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: coredns
  namespace: kube-system
spec:
  replicas: 3
  revisionHistoryLimit: 5
  selector:
    matchLabels:
      k8s-app: kube-dns
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
  template:
    metadata:
      labels:
        k8s-app: kube-dns
    spec:
      serviceAccountName: coredns
      tolerations:
        - key: "CriticalAddonsOnly"
          operator: "Exists"
        - key: "node-role.kubernetes.io/master"
          effect: "NoSchedule"
        - key: "node-role.kubernetes.io/control-plane"
          effect: "NoSchedule"
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: k8s-app
                  operator: In
                  values:
                  - kube-dns
              topologyKey: kubernetes.io/hostname
      containers:
      - name: coredns
        image: coredns/coredns:1.11.1
        imagePullPolicy: IfNotPresent
        resources:
          limits:
            memory: 512Mi
          requests:
            cpu: 100m
            memory: 128Mi
        args: [ "-conf", "/etc/coredns/Corefile" ]
        volumeMounts:
        - name: config-volume
          mountPath: /etc/coredns
          readOnly: true
        ports:
        - containerPort: 53
          name: dns
          protocol: UDP
        - containerPort: 53
          name: dns-tcp
          protocol: TCP
        - containerPort: 9153
          name: metrics
          protocol: TCP
        livenessProbe:
          httpGet:
            path: /health
            port: 8080
            scheme: HTTP
          initialDelaySeconds: 60
          timeoutSeconds: 5
          successThreshold: 1
          failureThreshold: 5
        readinessProbe:
          httpGet:
            path: /ready
            port: 8181
            scheme: HTTP
        securityContext:
          allowPrivilegeEscalation: false
          capabilities:
            add:
            - NET_BIND_SERVICE
            drop:
            - ALL
          readOnlyRootFilesystem: true
      dnsPolicy: Default
      volumes:
        - name: config-volume
          configMap:
            name: coredns
            items:
            - key: Corefile
              path: Corefile
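
After applying, confirm the rollout and resolve an in-cluster name from a throwaway pod:

kubectl -n kube-system rollout status deployment/coredns
kubectl run dns-test --rm -it --restart=Never --image=busybox:1.36 -- \
  nslookup kubernetes.default.svc.cluster.local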

5. Storage Options

5.1 StorageClass configuration

# Local storage (high performance)
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-ssd
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
reclaimPolicy: Delete
---
# NFS storage (requires an external provisioner such as nfs-subdir-external-provisioner;
# kubernetes.io/no-provisioner cannot provision NFS volumes dynamically)
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: nfs-storage
provisioner: k8s-sigs.io/nfs-subdir-external-provisioner  # must match the deployed provisioner name
volumeBindingMode: Immediate
reclaimPolicy: Retain
parameters:
  archiveOnDelete: "false"
---
# Ceph RBD (recommended for production)
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ceph-rbd
provisioner: rbd.csi.ceph.com
parameters:
  clusterID: <cluster-id>
  pool: replicapool
  imageFeatures: layering
  csi.storage.k8s.io/provisioner-secret-name: rook-csi-rbd-provisioner
  csi.storage.k8s.io/provisioner-secret-namespace: rook-ceph
  csi.storage.k8s.io/controller-expand-secret-name: rook-csi-rbd-provisioner
  csi.storage.k8s.io/controller-expand-secret-namespace: rook-ceph
  csi.storage.k8s.io/node-stage-secret-name: rook-csi-rbd-node
  csi.storage.k8s.io/node-stage-secret-namespace: rook-ceph
volumeBindingMode: Immediate
reclaimPolicy: Delete
allowVolumeExpansion: true
---
# AWS EBS
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp3
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
  fsType: ext4
  encrypted: "true"
  kmsKeyId: "arn:aws:kms:us-east-1:123456789012:key/12345678-1234-1234-1234-123456789012"
volumeBindingMode: WaitForFirstConsumer
reclaimPolicy: Retain
allowVolumeExpansion: true
---
# Default StorageClass (no-provisioner: claims using it bind only to pre-created PVs)
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: standard
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
reclaimPolicy: Delete
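
Only one StorageClass may carry the default annotation; to change the default, clear it on the old class and set it on the new one:

kubectl patch storageclass standard \
  -p '{"metadata":{"annotations":{"storageclass.kubernetes.io/is-default-class":"false"}}}'
kubectl patch storageclass gp3 \
  -p '{"metadata":{"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'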

5.2 PersistentVolumeClaim example

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-pvc
  namespace: production
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 100Gi
  storageClassName: ceph-rbd
  volumeMode: Filesystem
---
# PVC with labels
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: database-pvc
  namespace: production
  labels:
    app: postgresql
    tier: database
  annotations:
    volume.beta.kubernetes.io/storage-provisioner: rbd.csi.ceph.com
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 500Gi
  storageClassName: ceph-rbd
  # Note: do not add a selector here. PVCs with a selector cannot be dynamically
  # provisioned; selectors only apply when binding to pre-created PVs.

5.3 StatefulSet storage configuration

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgresql
  namespace: production
spec:
  serviceName: postgresql
  replicas: 3
  podManagementPolicy: OrderedReady
  updateStrategy:
    type: RollingUpdate
  selector:
    matchLabels:
      app: postgresql
  template:
    metadata:
      labels:
        app: postgresql
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - postgresql
            topologyKey: "kubernetes.io/hostname"
      containers:
      - name: postgresql
        image: postgres:15.4
        ports:
        - containerPort: 5432
          name: postgres
        volumeMounts:
        - name: data
          mountPath: /var/lib/postgresql/data
        resources:
          requests:
            cpu: 2
            memory: 8Gi
          limits:
            cpu: 4
            memory: 16Gi
        livenessProbe:
          exec:
            command:
            - pg_isready
            - -U
            - postgres
          initialDelaySeconds: 30
          periodSeconds: 10
        readinessProbe:
          exec:
            command:
            - pg_isready
            - -U
            - postgres
          initialDelaySeconds: 5
          periodSeconds: 5
  volumeClaimTemplates:
  - metadata:
      name: data
      labels:
        app: postgresql
    spec:
      accessModes:
      - ReadWriteOnce
      storageClassName: ceph-rbd
      resources:
        requests:
          storage: 100Gi
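
The serviceName above must point at an existing headless Service, which gives each replica its stable DNS identity (postgresql-0.postgresql.production.svc, and so on):

apiVersion: v1
kind: Service
metadata:
  name: postgresql
  namespace: production
spec:
  clusterIP: None
  selector:
    app: postgresql
  ports:
  - name: postgres
    port: 5432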

6. Security Hardening

6.1 RBAC configuration

# Create the namespace with Pod Security labels
apiVersion: v1
kind: Namespace
metadata:
  name: production
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/warn: restricted
---
# ServiceAccount
apiVersion: v1
kind: ServiceAccount
metadata:
  name: app-service-account
  namespace: production
automountServiceAccountToken: false
---
# Role
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: app-role
  namespace: production
rules:
- apiGroups: [""]
  resources: ["configmaps", "secrets"]
  verbs: ["get", "list", "watch"]
- apiGroups: [""]
  resources: ["pods", "services"]
  verbs: ["get", "list", "watch"]
---
# RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: app-role-binding
  namespace: production
subjects:
- kind: ServiceAccount
  name: app-service-account
  namespace: production
roleRef:
  kind: Role
  name: app-role
  apiGroup: rbac.authorization.k8s.io
---
# ClusterRole (read-only)
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: readonly-cluster
rules:
- apiGroups: [""]
  resources: ["*"]
  verbs: ["get", "list", "watch"]
- apiGroups: ["apps"]
  resources: ["*"]
  verbs: ["get", "list", "watch"]
- apiGroups: ["batch"]
  resources: ["*"]
  verbs: ["get", "list", "watch"]
---
# ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: readonly-binding
subjects:
- kind: Group
  name: readonly-users
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: readonly-cluster
  apiGroup: rbac.authorization.k8s.io
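
Verify the bindings behave as intended with kubectl auth can-i:

kubectl auth can-i get secrets -n production \
  --as=system:serviceaccount:production:app-service-account    # expect: yes
kubectl auth can-i delete pods -n production \
  --as=system:serviceaccount:production:app-service-account    # expect: no
kubectl auth can-i list deployments -A \
  --as=jane --as-group=readonly-users                          # expect: yes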

6.2 Pod Security Standards

# Pod Security Admission configuration
apiVersion: v1
kind: Namespace
metadata:
  name: restricted-ns
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/enforce-version: latest
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/audit-version: latest
    pod-security.kubernetes.io/warn: restricted
    pod-security.kubernetes.io/warn-version: latest
---
# Example of a hardened Pod
apiVersion: v1
kind: Pod
metadata:
  name: secure-pod
  namespace: restricted-ns
spec:
  securityContext:
    runAsNonRoot: true
    runAsUser: 1000
    runAsGroup: 1000
    fsGroup: 1000
    seccompProfile:
      type: RuntimeDefault
  containers:
  - name: app
    image: myapp:latest
    securityContext:
      allowPrivilegeEscalation: false
      readOnlyRootFilesystem: true
      capabilities:
        drop:
        - ALL
      runAsNonRoot: true
      runAsUser: 1000
    resources:
      requests:
        cpu: 100m
        memory: 128Mi
      limits:
        cpu: 500m
        memory: 512Mi
    volumeMounts:
    - name: tmp
      mountPath: /tmp
    - name: cache
      mountPath: /var/cache
  volumes:
  - name: tmp
    emptyDir: {}
  - name: cache
    emptyDir: {}
  automountServiceAccountToken: false

6.3 NetworkPolicy configuration

# Default-deny all traffic
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: production
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  - Egress
---
# Allow specific service-to-service traffic
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-backend
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: backend
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend
    ports:
    - protocol: TCP
      port: 8080
---
# Allow DNS egress (to cluster DNS in kube-system)
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns-egress
  namespace: production
spec:
  podSelector: {}
  policyTypes:
  - Egress
  egress:
  - to:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: kube-system
    ports:
    - protocol: UDP
      port: 53
    - protocol: TCP
      port: 53
---
# Database access policy
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-database-access
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: postgresql
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          tier: backend
    - namespaceSelector:
        matchLabels:
          name: monitoring
      podSelector:
        matchLabels:
          app: prometheus
    ports:
    - protocol: TCP
      port: 5432

6.4 Secret encryption at rest

# /etc/kubernetes/encryption-config.yaml
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
  - resources:
      - secrets
    providers:
      - aescbc:
          keys:
            - name: key1
              secret: <base64-encoded-secret>
      - identity: {}
  - resources:
      - configmaps
    providers:
      - identity: {}
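
After restarting each kube-apiserver with --encryption-provider-config, rewrite existing Secrets so they are stored encrypted, then spot-check one directly in etcd (the stored value should begin with k8s:enc:aescbc:v1:):

kubectl get secrets --all-namespaces -o json | kubectl replace -f -

ETCDCTL_API=3 etcdctl get /registry/secrets/production/<secret-name> \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key | hexdump -C | head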

7. Resource Quotas and Limits

7.1 ResourceQuota

apiVersion: v1
kind: ResourceQuota
metadata:
  name: production-quota
  namespace: production
spec:
  hard:
    requests.cpu: "100"
    requests.memory: 200Gi
    limits.cpu: "200"
    limits.memory: 400Gi
    persistentvolumeclaims: "50"
    requests.storage: 2Ti
    pods: "500"
    services: "100"
    secrets: "200"
    configmaps: "200"
    replicationcontrollers: "50"
    services.loadbalancers: "10"
    services.nodeports: "20"
---
# Quota by priority class
apiVersion: v1
kind: ResourceQuota
metadata:
  name: critical-quota
  namespace: production
spec:
  hard:
    requests.cpu: "50"
    requests.memory: 100Gi
    pods: "100"
  scopeSelector:
    matchExpressions:
    - operator: In
      scopeName: PriorityClass
      values:
      - critical
---
# Quota by StorageClass (scoped via the per-class hard key)
apiVersion: v1
kind: ResourceQuota
metadata:
  name: storage-quota
  namespace: production
spec:
  hard:
    requests.storage: 1Ti
    ceph-rbd.storageclass.storage.k8s.io/requests.storage: 500Gi

7.2 LimitRange

apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits
  namespace: production
spec:
  limits:
  - type: Container
    default:
      cpu: 500m
      memory: 512Mi
    defaultRequest:
      cpu: 100m
      memory: 128Mi
    max:
      cpu: "8"
      memory: 16Gi
    min:
      cpu: 50m
      memory: 64Mi
    maxLimitRequestRatio:
      cpu: 10
      memory: 5
  - type: PersistentVolumeClaim
    max:
      storage: 500Gi
    min:
      storage: 1Gi
  - type: Pod
    max:
      cpu: "16"
      memory: 32Gi

7.3 PriorityClass

apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: critical
value: 1000000
globalDefault: false
description: "Critical system pods that should never be evicted"
---
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: high
value: 100000
globalDefault: false
description: "High priority application pods"
---
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: medium
value: 10000
globalDefault: true
description: "Default priority for application pods"
---
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: low
value: 1000
globalDefault: false
description: "Low priority batch jobs"

7.4 Pod Disruption Budget

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: frontend-pdb
  namespace: production
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: frontend
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: backend-pdb
  namespace: production
spec:
  maxUnavailable: 1
  selector:
    matchLabels:
      app: backend
---
# PDB for a critical service
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: database-pdb
  namespace: production
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: postgresql
  unhealthyPodEvictionPolicy: AlwaysAllow

8. Monitoring and Alerting

8.1 Prometheus configuration

# prometheus-config.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-config
  namespace: monitoring
data:
  prometheus.yml: |
    global:
      scrape_interval: 15s
      evaluation_interval: 15s
      external_labels:
        cluster: production
        environment: prod

    alerting:
      alertmanagers:
      - static_configs:
        - targets:
          - alertmanager:9093

    rule_files:
      - /etc/prometheus/rules/*.yml

    scrape_configs:
      # Prometheus self-monitoring
      - job_name: 'prometheus'
        static_configs:
        - targets: ['localhost:9090']

      # Kubernetes API Server
      - job_name: 'kubernetes-apiservers'
        kubernetes_sd_configs:
        - role: endpoints
        scheme: https
        tls_config:
          ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
        relabel_configs:
        - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
          action: keep
          regex: default;kubernetes;https

      # Kubelet
      - job_name: 'kubernetes-nodes'
        scheme: https
        tls_config:
          ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
        kubernetes_sd_configs:
        - role: node
        relabel_configs:
        - action: labelmap
          regex: __meta_kubernetes_node_label_(.+)
        - target_label: __address__
          replacement: kubernetes.default.svc:443
        - source_labels: [__meta_kubernetes_node_name]
          regex: (.+)
          target_label: __metrics_path__
          replacement: /api/v1/nodes/${1}/proxy/metrics

      # cAdvisor
      - job_name: 'kubernetes-cadvisor'
        scheme: https
        tls_config:
          ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
        kubernetes_sd_configs:
        - role: node
        relabel_configs:
        - action: labelmap
          regex: __meta_kubernetes_node_label_(.+)
        - target_label: __address__
          replacement: kubernetes.default.svc:443
        - source_labels: [__meta_kubernetes_node_name]
          regex: (.+)
          target_label: __metrics_path__
          replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor

      # Service Monitor
      - job_name: 'kubernetes-service-endpoints'
        kubernetes_sd_configs:
        - role: endpoints
        relabel_configs:
        - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
          action: keep
          regex: true
        - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scheme]
          action: replace
          target_label: __scheme__
          regex: (https?)
        - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
          action: replace
          target_label: __metrics_path__
          regex: (.+)
        - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
          action: replace
          target_label: __address__
          regex: ([^:]+)(?::\d+)?;(\d+)
          replacement: $1:$2
        - action: labelmap
          regex: __meta_kubernetes_service_label_(.+)
        - source_labels: [__meta_kubernetes_namespace]
          action: replace
          target_label: kubernetes_namespace
        - source_labels: [__meta_kubernetes_service_name]
          action: replace
          target_label: kubernetes_name

      # Pod Monitor
      - job_name: 'kubernetes-pods'
        kubernetes_sd_configs:
        - role: pod
        relabel_configs:
        - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
          action: keep
          regex: true
        - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
          action: replace
          target_label: __metrics_path__
          regex: (.+)
        - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
          action: replace
          target_label: __address__
          regex: ([^:]+)(?::\d+)?;(\d+)
          replacement: $1:$2
        - action: labelmap
          regex: __meta_kubernetes_pod_label_(.+)
        - source_labels: [__meta_kubernetes_namespace]
          action: replace
          target_label: kubernetes_namespace
        - source_labels: [__meta_kubernetes_pod_name]
          action: replace
          target_label: kubernetes_pod_name
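
The kubernetes-pods job above only scrapes pods that opt in through annotations; a workload's pod template would carry, for example:

# In a Deployment's pod template
metadata:
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/port: "8080"
    prometheus.io/path: "/metrics"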

8.2 Key alert rules

# alert-rules.yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: k8s-alerts
  namespace: monitoring
spec:
  groups:
  - name: kubernetes-system
    rules:
    # Node alerts
    - alert: NodeDown
      expr: up{job="kubernetes-nodes"} == 0
      for: 5m
      labels:
        severity: critical
      annotations:
        summary: "Node {{ $labels.node }} is down"
        description: "Node {{ $labels.node }} has been down for more than 5 minutes"

    - alert: NodeHighCPU
      expr: 100 - (avg by(instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 85
      for: 10m
      labels:
        severity: warning
      annotations:
        summary: "High CPU usage on node {{ $labels.instance }}"
        description: "CPU usage: {{ $value }}%"

    - alert: NodeHighMemory
      expr: (1 - (node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes)) * 100 > 85
      for: 10m
      labels:
        severity: warning
      annotations:
        summary: "High memory usage on node {{ $labels.instance }}"
        description: "Memory usage: {{ $value }}%"

    - alert: NodeDiskPressure
      expr: (node_filesystem_avail_bytes{mountpoint="/"} / node_filesystem_size_bytes{mountpoint="/"}) * 100 < 15
      for: 5m
      labels:
        severity: warning
      annotations:
        summary: "Low disk space on node {{ $labels.instance }}"
        description: "Available disk space: {{ $value }}%"

    # Pod alerts
    - alert: PodCrashLooping
      expr: rate(kube_pod_container_status_restarts_total[15m]) * 60 * 5 > 0
      for: 5m
      labels:
        severity: warning
      annotations:
        summary: "Pod {{ $labels.pod }} is crash looping"
        description: "Pod has restarted repeatedly over the last 15 minutes"

    - alert: PodPending
      expr: kube_pod_status_phase{phase="Pending"} == 1
      for: 10m
      labels:
        severity: warning
      annotations:
        summary: "Pod {{ $labels.pod }} is stuck in Pending"
        description: "Pod has been Pending for more than 10 minutes"

    - alert: PodOOMKilled
      expr: kube_pod_container_status_last_terminated_reason{reason="OOMKilled"} == 1
      for: 1m
      labels:
        severity: critical
      annotations:
        summary: "Pod {{ $labels.pod }} was OOMKilled"
        description: "Container {{ $labels.container }} was killed because it ran out of memory"

    # Deployment alerts
    - alert: DeploymentReplicasMismatch
      expr: kube_deployment_spec_replicas != kube_deployment_status_replicas_available
      for: 10m
      labels:
        severity: warning
      annotations:
        summary: "Deployment {{ $labels.deployment }} replica mismatch"
        description: "Available replicas have not matched the desired count ({{ $value }}) for 10 minutes"

    - alert: DeploymentRolloutStuck
      expr: kube_deployment_status_condition{condition="Progressing", status="false"} == 1
      for: 10m
      labels:
        severity: critical
      annotations:
        summary: "Deployment {{ $labels.deployment }} rollout is stuck"
        description: "The rollout has made no progress for more than 10 minutes"

    # Service alerts
    - alert: ServiceEndpointsMissing
      expr: kube_endpoint_address_available == 0
      for: 5m
      labels:
        severity: critical
      annotations:
        summary: "Endpoints {{ $labels.endpoint }} has no available addresses"
        description: "No ready backend pods behind this service for 5 minutes"

    # Certificate alerts
    - alert: KubeCertificateExpiry
      expr: (probe_ssl_earliest_cert_expiry - time()) / 86400 < 30
      for: 1h
      labels:
        severity: warning
      annotations:
        summary: "Kubernetes certificate expiring soon"
        description: "Certificate expires in {{ $value }} days"

    # etcd alerts
    - alert: EtcdHighCommitDuration
      expr: histogram_quantile(0.99, rate(etcd_disk_backend_commit_duration_seconds_bucket[5m])) > 0.25
      for: 5m
      labels:
        severity: warning
      annotations:
        summary: "etcd backend commit latency is high"
        description: "P99 commit latency: {{ $value }}s"

    - alert: EtcdNoLeader
      expr: etcd_server_has_leader == 0
      for: 1m
      labels:
        severity: critical
      annotations:
        summary: "etcd member has no leader"
        description: "An etcd member has lost its leader; the cluster may be unavailable"

    # API server alerts
    - alert: KubeAPIErrorRate
      expr: sum(rate(apiserver_request_total{code=~"5.."}[5m])) / sum(rate(apiserver_request_total[5m])) > 0.05
      for: 5m
      labels:
        severity: critical
      annotations:
        summary: "Kubernetes API error rate is high"
        description: "API error rate: {{ $value | humanizePercentage }}"

    - alert: KubeAPILatencyHigh
      expr: histogram_quantile(0.99, rate(apiserver_request_duration_seconds_bucket[5m])) > 1
      for: 5m
      labels:
        severity: warning
      annotations:
        summary: "Kubernetes API latency is high"
        description: "P99 latency: {{ $value }}s"

8.3 Recommended Grafana Dashboards

# Dashboard IDs worth importing (from grafana.com/dashboards)
# Kubernetes Cluster (Prometheus): 6417
# Kubernetes Pod Monitoring: 6336
# Prometheus Stats: 2
# AlertManager: 9575
# Node Exporter Full: 1860
# Kubelet: 10032
# API Server: 10168
# etcd: 3070
# CoreDNS: 12539

8.4 Monitoring commands

# Node status
kubectl get nodes -o wide
kubectl top nodes
kubectl describe node <node-name>

# Pod status
kubectl get pods -A -o wide
kubectl top pods -A
kubectl get pods --field-selector=status.phase!=Running -A

# Resource usage
kubectl top nodes --sort-by=cpu
kubectl top pods --sort-by=memory -n <namespace>

# Events
kubectl get events -A --sort-by='.lastTimestamp'
kubectl get events -n <namespace> --field-selector type=Warning

# Example Prometheus queries
# CPU usage by pod
sum(rate(container_cpu_usage_seconds_total{image!=""}[5m])) by (pod)

# Memory usage by pod (bytes)
sum(container_memory_usage_bytes{image!=""}) by (pod)

# Volume usage by PVC (bytes)
sum(kubelet_volume_stats_used_bytes) by (persistentvolumeclaim)

# Network receive throughput by pod
sum(rate(container_network_receive_bytes_total[5m])) by (pod)

9. Log Collection

9.1 Fluentd configuration

apiVersion: v1
kind: ConfigMap
metadata:
  name: fluentd-config
  namespace: logging
data:
  fluent.conf: |
    <system>
      log_level info
    </system>

    <source>
      @type tail
      @id in_tail_container_logs
      path /var/log/containers/*.log
      pos_file /var/log/fluentd-containers.log.pos
      tag kubernetes.*
      read_from_head true
      <parse>
        @type json
        time_key time
        time_format %Y-%m-%dT%H:%M:%S.%NZ
      </parse>
    </source>

    <filter kubernetes.**>
      @type kubernetes_metadata
      @id filter_kube_metadata
      kubernetes_url https://kubernetes.default.svc
      verify_ssl false
      ca_file /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      bearer_token_file /var/run/secrets/kubernetes.io/serviceaccount/token
    </filter>

    <match kubernetes.**>
      @type elasticsearch
      @id out_es
      host elasticsearch.logging.svc
      port 9200
      logstash_format true
      logstash_prefix kubernetes
      include_tag_key true
      tag_key @log_name
      flush_interval 5s
      <buffer>
        @type file
        path /var/log/fluentd-buffers/kubernetes.system.buffer
        flush_mode interval
        retry_type exponential_backoff
        flush_thread_count 2
        flush_interval 5s
        retry_forever
        retry_max_interval 30
        chunk_limit_size 2M
        queue_limit_length 8
        overflow_action block
      </buffer>
    </match>
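
This ConfigMap is consumed by a Fluentd DaemonSet that mounts the node log directories; a minimal sketch (the image tag and the fluentd ServiceAccount are assumptions, adjust both to your environment):

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd
  namespace: logging
spec:
  selector:
    matchLabels:
      app: fluentd
  template:
    metadata:
      labels:
        app: fluentd
    spec:
      serviceAccountName: fluentd    # needs RBAC to read pod metadata
      containers:
      - name: fluentd
        image: fluent/fluentd-kubernetes-daemonset:v1.16-debian-elasticsearch8-1
        volumeMounts:
        - name: varlog
          mountPath: /var/log
        - name: config
          mountPath: /fluentd/etc
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      - name: config
        configMap:
          name: fluentd-config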

9.2 Loki configuration

# loki-config.yaml
auth_enabled: false
server:
  http_listen_port: 3100
  grpc_listen_port: 9096
common:
  path_prefix: /loki
  storage:
    filesystem:
      chunks_directory: /loki/chunks
      rules_directory: /loki/rules
  replication_factor: 1
  ring:
    instance_addr: 127.0.0.1
    kvstore:
      store: inmemory
query_range:
  results_cache:
    cache:
      embedded_cache:
        enabled: true
        max_size_mb: 100
schema_config:
  configs:
    - from: 2020-10-24
      store: boltdb-shipper
      object_store: filesystem
      schema: v11
      index:
        prefix: index_
        period: 24h
ruler:
  alertmanager_url: http://localhost:9093
limits_config:
  retention_period: 744h
  ingestion_rate_mb: 16
  ingestion_burst_size_mb: 24
  per_stream_rate_limit: 5MB
  per_stream_rate_limit_burst: 15MB
  max_entries_limit_per_query: 5000
  max_label_name_length: 1024
  max_label_value_length: 2048
  max_label_names_per_series: 30

9.3 Log query commands

# Pod logs
kubectl logs <pod-name> -n <namespace>
kubectl logs <pod-name> -n <namespace> -c <container-name>
kubectl logs <pod-name> -n <namespace> --tail=100
kubectl logs <pod-name> -n <namespace> --since=1h
kubectl logs -f <pod-name> -n <namespace>

# Logs across multiple pods
kubectl logs -l app=<label> -n <namespace>
kubectl logs -l app=<label> -n <namespace> --all-containers

# Logs from the previous container instance (after a crash)
kubectl logs <pod-name> -n <namespace> --previous

# Export logs to a file
kubectl logs <pod-name> -n <namespace> > pod-logs.txt

# Tail many pods at once with stern (requires stern to be installed)
stern -n <namespace> <pod-name-pattern>
stern --tail=100 -n <namespace> <pod-name-pattern>

10. Troubleshooting

10.1 General diagnostic flow

# 1. Check overall cluster health (componentstatuses is deprecated; use the readyz endpoint)
kubectl cluster-info
kubectl get --raw='/readyz?verbose'
kubectl get nodes
kubectl get pods -A | grep -v Running

# 2. Check node problems
kubectl describe node <node-name>
kubectl top nodes
kubectl get events --field-selector involvedObject.kind=Node --sort-by='.lastTimestamp'

# 3. Check pod problems
kubectl describe pod <pod-name> -n <namespace>
kubectl logs <pod-name> -n <namespace>
kubectl logs <pod-name> -n <namespace> --previous
kubectl get events -n <namespace> --sort-by='.lastTimestamp'

# 4. Check network problems
kubectl exec -it <pod-name> -n <namespace> -- ping <target>
kubectl exec -it <pod-name> -n <namespace> -- nslookup <service>
kubectl exec -it <pod-name> -n <namespace> -- curl <service>:<port>

# 5. Check storage problems
kubectl get pv
kubectl get pvc -A
kubectl describe pvc <pvc-name> -n <namespace>

# 6. Check resource problems
kubectl top pods -n <namespace>
kubectl describe quota -n <namespace>
kubectl describe limitrange -n <namespace>

10.2 Pod troubleshooting

# Pod states
# Pending: scheduling failed; check resources, affinity, taints
# ContainerCreating: image pull or volume mount problem
# Running: containers are running
# Error/CrashLoopBackOff: container crashed; check the logs
# OOMKilled: out of memory

# Detailed pod status
kubectl get pod <pod-name> -n <namespace> -o yaml
kubectl get pod <pod-name> -n <namespace> -o json | jq '.status'

# Pod events
kubectl get events -n <namespace> --field-selector involvedObject.name=<pod-name>

# Open a shell in a container
kubectl exec -it <pod-name> -n <namespace> -- /bin/sh
kubectl exec -it <pod-name> -n <namespace> -- /bin/bash

# Container statuses
kubectl get pod <pod-name> -n <namespace> -o jsonpath='{.status.containerStatuses[*]}'

# Debug with an ephemeral container (useful for distroless images)
kubectl debug -it <pod-name> -n <namespace> --image=busybox --target=<container-name>
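
kubectl debug can also open a troubleshooting shell on the node itself (the node's root filesystem is mounted at /host inside the debug pod):

kubectl debug node/<node-name> -it --image=busybox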

10.3 Node troubleshooting

# Node conditions
kubectl get node <node-name> -o jsonpath='{.status.conditions}'

# Node capacity and allocation
kubectl describe node <node-name> | grep -A 5 "Allocated resources"

# Node taints
kubectl describe node <node-name> | grep -A 5 "Taints"

# Pods running on a node
kubectl get pods --all-namespaces -o wide --field-selector spec.nodeName=<node-name>

# Evict pods from a node
kubectl drain <node-name> --ignore-daemonsets --delete-emptydir-data

# Bring the node back into scheduling
kubectl uncordon <node-name>

# Check the kubelet (logs go to journald on systemd hosts)
systemctl status kubelet
journalctl -u kubelet -f

# Check the container runtime
systemctl status containerd
crictl info
crictl pods
crictl ps -a

10.4 Network troubleshooting

# Check CoreDNS
kubectl get pods -n kube-system -l k8s-app=kube-dns
kubectl logs -n kube-system -l k8s-app=kube-dns
# (the CoreDNS image has no shell; test resolution from a client pod instead)
kubectl run dnsutils --rm -it --restart=Never \
  --image=registry.k8s.io/e2e-test-images/jessie-dnsutils:1.3 -- nslookup kubernetes.default

# Check Services
kubectl get svc -A
kubectl get endpoints <service-name> -n <namespace>
kubectl describe svc <service-name> -n <namespace>

# Check NetworkPolicies
kubectl get networkpolicy -A
kubectl describe networkpolicy <policy-name> -n <namespace>

# Test connectivity
kubectl run test-pod --rm -it --image=busybox -- sh
# Inside the container:
# ping <pod-ip>
# nslookup <service-name>.<namespace>.svc.cluster.local
# wget <service-name>:<port>

# Check the CNI plugin (Calico shown)
kubectl get pods -n kube-system -l k8s-app=calico-node
kubectl logs -n kube-system -l k8s-app=calico-node

# Inspect service NAT rules on a node
iptables -t nat -L -n -v | grep <service-ip>

10.5 Storage troubleshooting

# Check PV/PVC status
kubectl get pv
kubectl get pvc -A
kubectl describe pv <pv-name>
kubectl describe pvc <pvc-name> -n <namespace>

# Check StorageClasses
kubectl get storageclass
kubectl describe storageclass <sc-name>

# Check the CSI driver
kubectl get pods -n <csi-namespace>
kubectl logs -n <csi-namespace> -l app=<csi-driver>

# Check mount problems
kubectl describe pod <pod-name> -n <namespace> | grep -A 10 "Volumes"

# Test writes from a pod that actually mounts the PVC (here at /data)
kubectl exec -it <pod-name> -n <namespace> -- \
  sh -c 'df -h /data && echo test > /data/test.txt && cat /data/test.txt'

10.6 Performance troubleshooting

# API server latency (from the API server's own metrics)
kubectl get --raw /metrics | grep apiserver_request_duration_seconds

# etcd latency as observed by the API server
kubectl get --raw /metrics | grep etcd_request_duration_seconds

# Scheduling latency: scrape the scheduler's metrics endpoint (secure port 10259),
# e.g. query scheduler_schedule_attempts_total in Prometheus

# Container start state
kubectl get pod <pod-name> -n <namespace> -o jsonpath='{.status.containerStatuses[*].state}'

# Image pull timing (from pod events)
kubectl describe pod <pod-name> -n <namespace> | grep -A 5 "Events"

# Trace kernel scheduling on a node with kubectl-trace (plugin required)
kubectl trace run <node-name> -e 'tracepoint:sched:sched_switch { printf("%s\n", comm); }'

11. Best-Practices Checklist

11.1 Pre-deployment checklist

  • Cluster version confirmed (use a stable release in production)
  • Control-plane HA in place (at least 3 nodes)
  • etcd tuned (SSD storage, dedicated disk)
  • Network plugin chosen and configured (Calico/Cilium)
  • Storage solution decided (local/NFS/Ceph/cloud)
  • RBAC permissions configured
  • NetworkPolicy enabled
  • Pod Security Standards configured
  • Resource quotas set
  • Monitoring and alerting deployed
  • Log collection configured
  • Backup plan in place

11.2 Runtime best practices

  • Always set resource requests/limits
  • Use health checks (liveness/readiness)
  • Configure Pod Disruption Budgets
  • Use PriorityClasses to rank workloads
  • Enable autoscaling (HPA/VPA)
  • Rotate certificates regularly
  • Upgrade Kubernetes regularly
  • Track capacity for cluster planning
  • Deploy across multiple availability zones
  • Configure automated backups

11.3 Security best practices

  • Apply least privilege through RBAC
  • Use dedicated ServiceAccounts instead of default
  • Enable Pod Security Admission
  • Restrict traffic with NetworkPolicy
  • Encrypt etcd data
  • Enable audit logging
  • Scan images for vulnerabilities regularly
  • Use a private image registry
  • Restrict privileged containers
  • Enable Secret encryption at rest

11.4 Operations best practices

  • Adopt a GitOps workflow
  • Manage configuration with Helm/Kustomize
  • Automate backups (etcd, PVs)
  • Establish a change-management process
  • Use blue-green/canary releases
  • Configure automated failure recovery
  • Rehearse disaster recovery regularly
  • Maintain runbooks
  • Establish an on-call rotation
  • Optimize cost continuously

11.5 Monitoring thresholds

Metric               Warning     Critical   Notes
CPU usage            >80%        >95%       sustained for 10 min
Memory usage         >80%        >95%       sustained for 10 min
Disk usage           >80%        >90%       sustained for 5 min
Pod restarts         >3/hour     >5/hour    per pod
API error rate       >1%         >5%        5-min average
API latency (P99)    >500ms      >1s        5-min average
etcd latency (P99)   >100ms      >250ms     5-min average
Nodes down           1 node      2 nodes    sustained for 5 min

Appendix: Command Quick Reference

# Cluster info (version --short was removed; componentstatuses is deprecated)
kubectl cluster-info
kubectl version
kubectl get --raw='/readyz?verbose'

# Node management
kubectl get nodes -o wide
kubectl describe node <node>
kubectl cordon/uncordon <node>
kubectl drain <node> --ignore-daemonsets

# Pod management
kubectl get pods -A -o wide
kubectl describe pod <pod> -n <ns>
kubectl logs <pod> -n <ns> [-f] [--previous]
kubectl exec -it <pod> -n <ns> -- <cmd>
kubectl delete pod <pod> -n <ns> --grace-period=0 --force

# Deployment management
kubectl get deployments -A
kubectl rollout status deployment/<name> -n <ns>
kubectl rollout undo deployment/<name> -n <ns>
kubectl scale deployment/<name> --replicas=N -n <ns>

# Service discovery
kubectl get svc -A
kubectl get endpoints -A
kubectl get ingress -A

# Configuration management
kubectl get configmap -n <ns>
kubectl get secret -n <ns>
kubectl create configmap <name> --from-file=<file>
kubectl create secret generic <name> --from-literal=key=value

# Resource management
kubectl top nodes
kubectl top pods -n <ns>
kubectl describe quota -n <ns>
kubectl describe limitrange -n <ns>

# Troubleshooting
kubectl get events -A --sort-by='.lastTimestamp'
kubectl get events -n <ns> --field-selector type=Warning
kubectl explain <resource>.<field>

# Label selectors
kubectl get pods -l app=<label>
kubectl get pods --selector=app=<label>,env=prod

# Output formats
kubectl get pods -o yaml
kubectl get pods -o json
kubectl get pods -o wide
kubectl get pods -o custom-columns=NAME:.metadata.name,STATUS:.status.phase

Document version: 2026-03-09
Maintainer: SRE Team
Update cadence: quarterly
Feedback: sre-team@example.com
