kakakakakku blog

Weekly Tech Blog: Keep on Learning!

Pod Topology Spread Constraints: Spreading Pods Across Multiple AZs

With "Pod Topology Spread Constraints" in Kubernetes, you can flexibly set the constraints that apply when Pods are scheduled. This time I try Zone Spread (Multi AZ)! The details are in the documentation below!

kubernetes.io

spec.topologySpreadConstraints

To use Pod Topology Spread Constraints, add spec.topologySpreadConstraints to the YAML. The syntax and a summary of each field are below. You don't just specify a Node Label: you can also set maxSkew and whenUnsatisfiable, which is where the flexibility comes from 💡

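  • topologyKey : the Node Label key used as the scheduling constraint (nodes sharing the same value form one topology domain)
  • maxSkew : how much difference in Pod count between domains is tolerated (maxSkew = 1 means even / maxSkew >= 2 means roughly even)
  • whenUnsatisfiable : what to do with a Pod when the constraint can't be satisfied (the default is DoNotSchedule, which leaves the Pod unscheduled; ScheduleAnyway schedules it regardless)
  • labelSelector : the Pod Label that selects which Pods are counted
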
apiVersion: v1
kind: Pod
metadata:
  name: mypod
spec:
  topologySpreadConstraints:
    - maxSkew: <integer>
      topologyKey: <string>
      whenUnsatisfiable: <string>
      labelSelector: <object>
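
To make maxSkew concrete, here is a minimal sketch of the check in shell. It assumes the simplified rule described in the Kubernetes documentation: a Pod may be placed in a domain only if that domain's count after placement, minus the smallest count across all domains, stays within maxSkew. The function name can_place and its argument layout are made up for illustration, and the real scheduler also considers which nodes are eligible.

```shell
# can_place MAX_SKEW CANDIDATE COUNT...
#   MAX_SKEW  : value of maxSkew
#   CANDIDATE : 1-based position of the domain being considered
#   COUNT...  : current number of matching Pods in each domain
# Succeeds (exit status 0) when placing one more Pod in CANDIDATE
# keeps (candidate count + 1) - (minimum count) <= MAX_SKEW.
can_place() (
  max_skew=$1; candidate=$2; shift 2
  min=$1; i=0; target=0
  for n in "$@"; do
    i=$((i + 1))
    if [ "$n" -lt "$min" ]; then min=$n; fi
    if [ "$i" -eq "$candidate" ]; then target=$n; fi
  done
  [ $((target + 1 - min)) -le "$max_skew" ]
)

# Three zones currently holding 4, 4 and 3 Pods, with maxSkew = 1.
can_place 1 1 4 4 3 && echo "zone 1: allowed" || echo "zone 1: rejected"  # prints "zone 1: rejected"
can_place 1 3 4 4 3 && echo "zone 3: allowed" || echo "zone 3: rejected"  # prints "zone 3: allowed"
```

With maxSkew = 1 the next Pod can only land in the zone that is behind, which is why the counts end up even.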

Test environment

For this test I use an Amazon EKS cluster (Kubernetes 1.21) built with eksctl. As shown below, there are 6 nodes across 3 Availability Zones. The IP addresses have been rewritten to make the explanation easier to follow (the third octet is set to .10 / .50 / .90).

$ kubectl get nodes \
>   -l 'eks.amazonaws.com/nodegroup=managed' \
>   -o=custom-columns=NAME:.metadata.name,ZONE:metadata.labels."topology\.kubernetes\.io/zone" \
>   --sort-by metadata.labels."topology\.kubernetes\.io/zone"
NAME                                                ZONE
ip-192-168-10-100.ap-northeast-1.compute.internal   ap-northeast-1a
ip-192-168-10-101.ap-northeast-1.compute.internal   ap-northeast-1a
ip-192-168-50-100.ap-northeast-1.compute.internal   ap-northeast-1c
ip-192-168-50-101.ap-northeast-1.compute.internal   ap-northeast-1c
ip-192-168-90-100.ap-northeast-1.compute.internal   ap-northeast-1d
ip-192-168-90-101.ap-northeast-1.compute.internal   ap-northeast-1d

f:id:kakku22:20220302213001p:plain

Trying Pod Topology Spread Constraints

Let's try Pod Topology Spread Constraints right away. I wrote the following manifest, which starts 12 Pods with a Deployment. The key point is that spec.topologySpreadConstraints.topologyKey is set to topology.kubernetes.io/zone.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: topology-spread-constraints
spec:
  replicas: 12
  selector:
    matchLabels:
      app: topology-spread-constraints
  template:
    metadata:
      labels:
        app: topology-spread-constraints
    spec:
      containers:
      - name: nginx
        image: nginx:1.21
        ports:
          - containerPort: 80
      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: topology.kubernetes.io/zone
          whenUnsatisfiable: DoNotSchedule
          labelSelector:
            matchLabels:
              app: topology-spread-constraints

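Checking the Pods shows that Zone Spread worked: 4 Pods are running in each Availability Zone. Note that the kubectl get pods output can't include the Zone, which makes this hard to judge from the listing alone, so I include a diagram as well!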

$ kubectl get pods \
>   -o=custom-columns=NAME:.metadata.name,NODE:.spec.nodeName \
>   -l app=topology-spread-constraints \
>   --sort-by .spec.nodeName
NAME                                           NODE
topology-spread-constraints-858b874fb9-vptnt   ip-192-168-10-100.ap-northeast-1.compute.internal
topology-spread-constraints-858b874fb9-rt948   ip-192-168-10-101.ap-northeast-1.compute.internal
topology-spread-constraints-858b874fb9-xsbh8   ip-192-168-10-101.ap-northeast-1.compute.internal
topology-spread-constraints-858b874fb9-cn992   ip-192-168-10-101.ap-northeast-1.compute.internal
topology-spread-constraints-858b874fb9-qtcm7   ip-192-168-50-100.ap-northeast-1.compute.internal
topology-spread-constraints-858b874fb9-xlxx4   ip-192-168-50-100.ap-northeast-1.compute.internal
topology-spread-constraints-858b874fb9-75pl8   ip-192-168-50-101.ap-northeast-1.compute.internal
topology-spread-constraints-858b874fb9-76t6h   ip-192-168-50-101.ap-northeast-1.compute.internal
topology-spread-constraints-858b874fb9-fnz7d   ip-192-168-90-100.ap-northeast-1.compute.internal
topology-spread-constraints-858b874fb9-cldf2   ip-192-168-90-101.ap-northeast-1.compute.internal
topology-spread-constraints-858b874fb9-d2kgt   ip-192-168-90-101.ap-northeast-1.compute.internal
topology-spread-constraints-858b874fb9-dqk2w   ip-192-168-90-101.ap-northeast-1.compute.internal

f:id:kakku22:20220302221224p:plain
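
Since the Pod listing only shows node names, one way to get per-zone counts is to join it with the node listing. The following is a rough sketch, not something from the original test: count_pods_per_zone and zone_report are hypothetical helper names, and zone_report assumes kubectl access to the cluster and the app=topology-spread-constraints label used above.

```shell
# count_pods_per_zone NODES_FILE PODS_FILE
#   NODES_FILE : lines of "<node-name> <zone>"
#   PODS_FILE  : lines of "<pod-name> <node-name>"
# Prints "<zone> <pod-count>" for every zone that has Pods, sorted by zone.
count_pods_per_zone() {
  awk 'NR == FNR { zone[$1] = $2; next }   # first file: node -> zone map
       { count[zone[$2]]++ }               # second file: tally Pods per zone
       END { for (z in count) print z, count[z] }' "$1" "$2" | sort
}

# zone_report: hypothetical wrapper that feeds live cluster data into
# count_pods_per_zone. It is only defined here, not called.
zone_report() (
  nodes=$(mktemp); pods=$(mktemp)
  kubectl get nodes --no-headers \
    -o custom-columns='NAME:.metadata.name,ZONE:.metadata.labels.topology\.kubernetes\.io/zone' > "$nodes"
  kubectl get pods --no-headers -l app=topology-spread-constraints \
    -o custom-columns='NAME:.metadata.name,NODE:.spec.nodeName' > "$pods"
  count_pods_per_zone "$nodes" "$pods"
  rm -f "$nodes" "$pods"
)
```

Running zone_report against the cluster above should list each of the three zones with a count of 4, matching the diagram.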

Combining multiple Pod Topology Spread Constraints

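In the example so far, each Availability Zone runs 4 Pods, but per node there is a gap, such as 1 Pod versus 3 Pods. Next I add the node as a condition too, with a second constraint whose spec.topologySpreadConstraints.topologyKey is kubernetes.io/hostname (the Label may differ depending on the cluster). As the documentation points out, combining multiple constraints can cause conflicts that leave Pods Pending. That actually happened this time, so I set maxSkew: 2 on the hostname constraint.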

apiVersion: apps/v1
kind: Deployment
metadata:
  name: topology-spread-constraints
spec:
  replicas: 12
  selector:
    matchLabels:
      app: topology-spread-constraints
  template:
    metadata:
      labels:
        app: topology-spread-constraints
    spec:
      containers:
      - name: nginx
        image: nginx:1.21
        ports:
          - containerPort: 80
      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: topology.kubernetes.io/zone
          whenUnsatisfiable: DoNotSchedule
          labelSelector:
            matchLabels:
              app: topology-spread-constraints
        - maxSkew: 2
          topologyKey: kubernetes.io/hostname
          whenUnsatisfiable: DoNotSchedule
          labelSelector:
            matchLabels:
              app: topology-spread-constraints

Checking the Pods shows that Zone / Node Spread worked: 4 Pods are running in each Availability Zone, and 2 Pods are running on each node.

$ kubectl get pods \
>   -o=custom-columns=NAME:.metadata.name,NODE:.spec.nodeName \
>   -l app=topology-spread-constraints \
>   --sort-by .spec.nodeName
NAME                                           NODE
topology-spread-constraints-546fcfcb77-2zs8r   ip-192-168-10-100.ap-northeast-1.compute.internal
topology-spread-constraints-546fcfcb77-6zrtj   ip-192-168-10-100.ap-northeast-1.compute.internal
topology-spread-constraints-546fcfcb77-6nflm   ip-192-168-10-101.ap-northeast-1.compute.internal
topology-spread-constraints-546fcfcb77-cjrmn   ip-192-168-10-101.ap-northeast-1.compute.internal
topology-spread-constraints-546fcfcb77-7mxf6   ip-192-168-50-100.ap-northeast-1.compute.internal
topology-spread-constraints-546fcfcb77-vglz2   ip-192-168-50-100.ap-northeast-1.compute.internal
topology-spread-constraints-546fcfcb77-qdmmb   ip-192-168-50-101.ap-northeast-1.compute.internal
topology-spread-constraints-546fcfcb77-wxwn8   ip-192-168-50-101.ap-northeast-1.compute.internal
topology-spread-constraints-546fcfcb77-ctgth   ip-192-168-90-100.ap-northeast-1.compute.internal
topology-spread-constraints-546fcfcb77-j59f6   ip-192-168-90-100.ap-northeast-1.compute.internal
topology-spread-constraints-546fcfcb77-dzlr8   ip-192-168-90-101.ap-northeast-1.compute.internal
topology-spread-constraints-546fcfcb77-8qg5x   ip-192-168-90-101.ap-northeast-1.compute.internal

f:id:kakku22:20220302222819p:plain
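
To put a number on the per-node imbalance, a small helper can compute the gap between the busiest and the least busy node from a listing like the ones above. This is only a sketch: node_skew is a hypothetical name, and it ignores nodes that run no matching Pods at all, because they never appear in the input.

```shell
# node_skew: read "<pod-name> <node-name>" lines on stdin and print
# (most Pods on any listed node) - (fewest Pods on any listed node).
node_skew() {
  awk '{ count[$2]++ }
       END {
         first = 1
         for (n in count) {
           if (first || count[n] < min) min = count[n]
           if (first || count[n] > max) max = count[n]
           first = 0
         }
         print max - min
       }'
}

# Example usage against a live cluster (requires kubectl access):
#   kubectl get pods --no-headers -l app=topology-spread-constraints \
#     -o custom-columns='NAME:.metadata.name,NODE:.spec.nodeName' | node_skew
```

Fed with the first listing (zone constraint only), where nodes hold between 1 and 3 Pods, this prints 2. Fed with the listing above, where every node holds 2 Pods, it prints 0.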

Combining Pod Topology Spread Constraints with Affinity

Pod Topology Spread Constraints can also be combined with Affinity. For example, you might want Zone Spread while keeping Pods out of Availability Zone 1d. I added a NotIn expression under spec.affinity.nodeAffinity as follows.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: topology-spread-constraints
spec:
  replicas: 12
  selector:
    matchLabels:
      app: topology-spread-constraints
  template:
    metadata:
      labels:
        app: topology-spread-constraints
    spec:
      containers:
      - name: nginx
        image: nginx:1.21
        ports:
          - containerPort: 80
      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: topology.kubernetes.io/zone
          whenUnsatisfiable: DoNotSchedule
          labelSelector:
            matchLabels:
              app: topology-spread-constraints
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: topology.kubernetes.io/zone
                operator: NotIn
                values:
                - ap-northeast-1d

Checking the Pods shows that Zone Spread worked across the Availability Zones other than 1d. Each of the two remaining Availability Zones (1a and 1c) runs 6 Pods.

$ kubectl get pods \
>   -o=custom-columns=NAME:.metadata.name,NODE:.spec.nodeName \
>   -l app=topology-spread-constraints \
>   --sort-by .spec.nodeName
NAME                                           NODE
topology-spread-constraints-6dcfb69b4f-g24gm   ip-192-168-10-100.ap-northeast-1.compute.internal
topology-spread-constraints-6dcfb69b4f-gmrbg   ip-192-168-10-100.ap-northeast-1.compute.internal
topology-spread-constraints-6dcfb69b4f-qtb8s   ip-192-168-10-101.ap-northeast-1.compute.internal
topology-spread-constraints-6dcfb69b4f-85l9t   ip-192-168-10-101.ap-northeast-1.compute.internal
topology-spread-constraints-6dcfb69b4f-z9t65   ip-192-168-10-101.ap-northeast-1.compute.internal
topology-spread-constraints-6dcfb69b4f-kxdh5   ip-192-168-10-101.ap-northeast-1.compute.internal
topology-spread-constraints-6dcfb69b4f-tqx5m   ip-192-168-50-100.ap-northeast-1.compute.internal
topology-spread-constraints-6dcfb69b4f-hk8t9   ip-192-168-50-100.ap-northeast-1.compute.internal
topology-spread-constraints-6dcfb69b4f-zsgt9   ip-192-168-50-100.ap-northeast-1.compute.internal
topology-spread-constraints-6dcfb69b4f-5ldxm   ip-192-168-50-101.ap-northeast-1.compute.internal
topology-spread-constraints-6dcfb69b4f-tqc4h   ip-192-168-50-101.ap-northeast-1.compute.internal
topology-spread-constraints-6dcfb69b4f-vkc6v   ip-192-168-50-101.ap-northeast-1.compute.internal

f:id:kakku22:20220302224603p:plain

Summary

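I used "Pod Topology Spread Constraints" in Kubernetes to try Zone Spread (Multi AZ) for Pods. This could already be done with podAntiAffinity, but Pod Topology Spread Constraints can be configured more flexibly. It is also introduced as "Schedule replicas across nodes" in the Reliability (Applications) section of the EKS Best Practices Guides, so that is worth reading alongside this!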

aws.github.io
