首页 >后端开发 >Golang >从 ECR 到 EKS 的图像无法正常工作,因为生成的 Pod 始终为 0/2

从 ECR 到 EKS 的图像无法正常工作,因为生成的 Pod 始终为 0/2

WBOY
WBOY转载
2024-02-08 22:39:08915浏览

从 ECR 到 EKS 的图像无法正常工作,因为生成的 Pod 始终为 0/2

php小编草莓在解决容器化应用部署问题时,发现从ECR(Amazon Elastic Container Registry)到EKS(Amazon Elastic Kubernetes Service)的图像无法正常工作的情况。具体表现为生成的Pod始终为0/2,这意味着容器无法正常启动或运行。这个问题可能涉及到多个方面,包括图像本身的问题、容器配置的错误或者网络环境的限制等。下面将详细介绍一些常见的解决方案,帮助开发者快速解决这个问题。

问题内容

我已经尝试了几乎所有方法来让事情走上正确的路径,但仍然无法让我的 pod 处于可用状态。

所以我有一个用 go 编写的基本应用程序。

我使用 docker build --tag docker-gs-ping . 创建了程序的映像 然后我尝试在容器内运行相同的命令 docker run --publish 8080:8080 docker-gs-ping

然后我想将我的图像保存到 amazon ecr,为此我在 ecr 中创建了一个存储库。

现在,在创建存储库后,我标记了本地中存在的图像。

docker tag f49366b7f534 ****40312665.dkr.ecr.us-east-1.amazonaws.com/docker-gs-ping:latest

f49366b7f534是我本地的图像标签。 docker-gs-ping 是 ecr 中的存储库名称。

然后我使用命令将标记的图像上传到 ecr。

docker push ****40312665.dkr.ecr.us-east-1.amazonaws.com/docker-gs-ping:latest

不确定上述命令是否会从本地推送标记的图像或最近的图像,因为无法提及要推送到 ecr 的特定图像。

目前的结果是

完成上述步骤后,我使用以下文件和命令创建了一个 vps:

eks 堆栈:

---
awstemplateformatversion: '2010-09-09'
description: 'amazon eks cluster'

parameters:
  clustername:
    type: string
    default: my-eks-cluster
  numberofworkernodes:
    type: number
    default: 1
  workernodesinstancetype:
    type: string
    default: t2.micro
  kubernetesversion:
    type: string
    default: 1.22
    
resources:

  ###########################################
  ## roles
  ###########################################
  eksrole:
    type: aws::iam::role
    properties: 
      rolename: my.eks.cluster.role
      assumerolepolicydocument:
        version: "2012-10-17"
        statement:
          - effect: allow
            principal:
              service:
                - eks.amazonaws.com
            action:
              - sts:assumerole
      path: /
      managedpolicyarns:
        - "arn:aws:iam::aws:policy/amazoneksclusterpolicy"
  eksnoderole:
    type: aws::iam::role
    properties: 
      rolename: my.eks.node.role
      assumerolepolicydocument:
        version: "2012-10-17"
        statement:
          - effect: allow
            principal:
              service:
                - ec2.amazonaws.com
            action:
              - sts:assumerole
      path: /
      managedpolicyarns:
        - "arn:aws:iam::aws:policy/amazoneksworkernodepolicy"
        - "arn:aws:iam::aws:policy/amazonec2containerregistryreadonly"
        - "arn:aws:iam::aws:policy/amazoneks_cni_policy"

  ###########################################
  ## eks cluster
  ###########################################

  ekscluster:
    type: aws::eks::cluster
    properties:
      name: !ref clustername
      version: !ref kubernetesversion
      rolearn: !getatt eksrole.arn
      resourcesvpcconfig:
        securitygroupids:
          - !importvalue controlplanesecuritygroupid
        subnetids: !split [ ',', !importvalue privatesubnetids ]

  eksnodegroup:
    type: aws::eks::nodegroup
    dependson: ekscluster
    properties:
      clustername: !ref clustername
      noderole: !getatt eksnoderole.arn
      scalingconfig:
        minsize:
          ref: numberofworkernodes
        desiredsize:
          ref: numberofworkernodes
        maxsize:
          ref: numberofworkernodes
      subnets: !split [ ',', !importvalue privatesubnetids ]

命令:aws cloudformation create-stack --region us-east-1 --stack-name my-eks-cluster --capability capability_named_iam --template-body file://eks-stack.yaml

eks vpc yaml

---
awstemplateformatversion: '2010-09-09'
description: 'amazon eks vpc - private and public subnets'

parameters:

  vpcblock:
    type: string
    default: 192.168.0.0/16
    description: the cidr range for the vpc. this should be a valid private (rfc 1918) cidr range.

  publicsubnet01block:
    type: string
    default: 192.168.0.0/18
    description: cidrblock for public subnet 01 within the vpc

  publicsubnet02block:
    type: string
    default: 192.168.64.0/18
    description: cidrblock for public subnet 02 within the vpc

  privatesubnet01block:
    type: string
    default: 192.168.128.0/18
    description: cidrblock for private subnet 01 within the vpc

  privatesubnet02block:
    type: string
    default: 192.168.192.0/18
    description: cidrblock for private subnet 02 within the vpc

metadata:
  aws::cloudformation::interface:
    parametergroups:
      -
        label:
          default: "worker network configuration"
        parameters:
          - vpcblock
          - publicsubnet01block
          - publicsubnet02block
          - privatesubnet01block
          - privatesubnet02block

resources:
  vpc:
    type: aws::ec2::vpc
    properties:
      cidrblock:  !ref vpcblock
      enablednssupport: true
      enablednshostnames: true
      tags:
      - key: name
        value: !sub '${aws::stackname}-vpc'

  internetgateway:
    type: "aws::ec2::internetgateway"

  vpcgatewayattachment:
    type: "aws::ec2::vpcgatewayattachment"
    properties:
      internetgatewayid: !ref internetgateway
      vpcid: !ref vpc

  publicroutetable:
    type: aws::ec2::routetable
    properties:
      vpcid: !ref vpc
      tags:
      - key: name
        value: public subnets
      - key: network
        value: public

  privateroutetable01:
    type: aws::ec2::routetable
    properties:
      vpcid: !ref vpc
      tags:
      - key: name
        value: private subnet az1
      - key: network
        value: private01

  privateroutetable02:
    type: aws::ec2::routetable
    properties:
      vpcid: !ref vpc
      tags:
      - key: name
        value: private subnet az2
      - key: network
        value: private02

  publicroute:
    dependson: vpcgatewayattachment
    type: aws::ec2::route
    properties:
      routetableid: !ref publicroutetable
      destinationcidrblock: 0.0.0.0/0
      gatewayid: !ref internetgateway

  privateroute01:
    dependson:
    - vpcgatewayattachment
    - natgateway01
    type: aws::ec2::route
    properties:
      routetableid: !ref privateroutetable01
      destinationcidrblock: 0.0.0.0/0
      natgatewayid: !ref natgateway01

  privateroute02:
    dependson:
    - vpcgatewayattachment
    - natgateway02
    type: aws::ec2::route
    properties:
      routetableid: !ref privateroutetable02
      destinationcidrblock: 0.0.0.0/0
      natgatewayid: !ref natgateway02

  natgateway01:
    dependson:
    - natgatewayeip1
    - publicsubnet01
    - vpcgatewayattachment
    type: aws::ec2::natgateway
    properties:
      allocationid: !getatt 'natgatewayeip1.allocationid'
      subnetid: !ref publicsubnet01
      tags:
      - key: name
        value: !sub '${aws::stackname}-natgatewayaz1'

  natgateway02:
    dependson:
    - natgatewayeip2
    - publicsubnet02
    - vpcgatewayattachment
    type: aws::ec2::natgateway
    properties:
      allocationid: !getatt 'natgatewayeip2.allocationid'
      subnetid: !ref publicsubnet02
      tags:
      - key: name
        value: !sub '${aws::stackname}-natgatewayaz2'

  natgatewayeip1:
    dependson:
    - vpcgatewayattachment
    type: 'aws::ec2::eip'
    properties:
      domain: vpc

  natgatewayeip2:
    dependson:
    - vpcgatewayattachment
    type: 'aws::ec2::eip'
    properties:
      domain: vpc

  publicsubnet01:
    type: aws::ec2::subnet
    metadata:
      comment: subnet 01
    properties:
      mappubliciponlaunch: true
      availabilityzone:
        fn::select:
        - '0'
        - fn::getazs:
            ref: aws::region
      cidrblock:
        ref: publicsubnet01block
      vpcid:
        ref: vpc
      tags:
      - key: name
        value: !sub "${aws::stackname}-publicsubnet01"
      - key: kubernetes.io/role/elb
        value: 1

  publicsubnet02:
    type: aws::ec2::subnet
    metadata:
      comment: subnet 02
    properties:
      mappubliciponlaunch: true
      availabilityzone:
        fn::select:
        - '1'
        - fn::getazs:
            ref: aws::region
      cidrblock:
        ref: publicsubnet02block
      vpcid:
        ref: vpc
      tags:
      - key: name
        value: !sub "${aws::stackname}-publicsubnet02"
      - key: kubernetes.io/role/elb
        value: 1

  privatesubnet01:
    type: aws::ec2::subnet
    metadata:
      comment: subnet 03
    properties:
      availabilityzone:
        fn::select:
        - '0'
        - fn::getazs:
            ref: aws::region
      cidrblock:
        ref: privatesubnet01block
      vpcid:
        ref: vpc
      tags:
      - key: name
        value: !sub "${aws::stackname}-privatesubnet01"
      - key: kubernetes.io/role/internal-elb
        value: 1

  privatesubnet02:
    type: aws::ec2::subnet
    metadata:
      comment: private subnet 02
    properties:
      availabilityzone:
        fn::select:
        - '1'
        - fn::getazs:
            ref: aws::region
      cidrblock:
        ref: privatesubnet02block
      vpcid:
        ref: vpc
      tags:
      - key: name
        value: !sub "${aws::stackname}-privatesubnet02"
      - key: kubernetes.io/role/internal-elb
        value: 1

  publicsubnet01routetableassociation:
    type: aws::ec2::subnetroutetableassociation
    properties:
      subnetid: !ref publicsubnet01
      routetableid: !ref publicroutetable

  publicsubnet02routetableassociation:
    type: aws::ec2::subnetroutetableassociation
    properties:
      subnetid: !ref publicsubnet02
      routetableid: !ref publicroutetable

  privatesubnet01routetableassociation:
    type: aws::ec2::subnetroutetableassociation
    properties:
      subnetid: !ref privatesubnet01
      routetableid: !ref privateroutetable01

  privatesubnet02routetableassociation:
    type: aws::ec2::subnetroutetableassociation
    properties:
      subnetid: !ref privatesubnet02
      routetableid: !ref privateroutetable02

  controlplanesecuritygroup:
    type: aws::ec2::securitygroup
    properties:
      groupdescription: cluster communication with worker nodes
      vpcid: !ref vpc

outputs:

  publicsubnetids:
    description: public subnets ids in the vpc
    value: !join [ ",", [ !ref publicsubnet01, !ref publicsubnet02 ] ]
    export:
      name: publicsubnetids
  
  privatesubnetids:
    description: private subnets ids in the vpc
    value: !join [ ",", [ !ref privatesubnet01, !ref privatesubnet02 ] ]
    export:
      name: privatesubnetids

  controlplanesecuritygroupid:
    description: security group for the cluster control plane communication with worker nodes
    value: !ref controlplanesecuritygroup
    export:
      name: controlplanesecuritygroupid

  vpcid:
    description: the vpc id
    value: !ref vpc
    export:
      name: vpcid

命令:aws cloudformation create-stack --region us-east-1 --stack-name my-eks-vpc --template-body file://eks-vpc-stack.yaml

命令后的结果:

现在我尝试部署deployment.yaml和service.yaml文件

deployment.yaml

apiversion: apps/v1
kind: deployment
metadata:
  name: helloworld
  namespace: default
spec:
  replicas: 2
  selector:
    matchlabels:
      app: helloworld
  template:
    metadata:
      labels:
        app: helloworld
    spec:
      containers:
        - name: new-container
          image: ****40312665.dkr.ecr.us-east-1.amazonaws.com/docker-gs-ping:latest
          ports:
            - containerport: 80

命令和结果:

现在service.yaml

apiversion: v1
kind: service
metadata:
  name: helloworld
spec:
  type: loadbalancer
  selector:
    app: helloworld
  ports:
    - name: http
      port: 80
      targetport: 80

命令和结果:

完成这一切后,当我运行 kubectl get 部署时,我得到如下结果:

为了调试,我尝试了 kubectl描述pod helloworld,我得到如下

C:\Users\visratna\GolandProjects\testaws>kubectl describe pod helloworld
Name:             helloworld-c6dc56598-jmpvr
Namespace:        default
Priority:         0
Service Account:  default
Node:             docker-desktop/192.168.65.4
Start Time:       Fri, 07 Jul 2023 22:22:18 +0530
Labels:           app=helloworld
                  pod-template-hash=c6dc56598
Annotations:      <none>
Status:           Pending
IP:               10.1.0.7
IPs:
  IP:           10.1.0.7
Controlled By:  ReplicaSet/helloworld-c6dc56598
Containers:
  new-container:
    Container ID:
    Image:          549840312665.dkr.ecr.us-east-1.amazonaws.com/docker-gs-ping:latest
    Image ID:
    Port:           80/TCP
    Host Port:      0/TCP
    State:          Waiting
      Reason:       ImagePullBackOff
    Ready:          False
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-sldvv (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  kube-api-access-sldvv:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason     Age                   From               Message
  ----     ------     ----                  ----               -------
  Normal   Scheduled  23m                   default-scheduler  Successfully assigned default/helloworld-c6dc56598-jmpvr to docker-desktop
  Normal   Pulling    22m (x4 over 23m)     kubelet            Pulling image "549840312665.dkr.ecr.us-east-1.amazonaws.com/docker-gs-ping:latest"
  Warning  Failed     22m (x4 over 23m)     kubelet            Failed to pull image "549840312665.dkr.ecr.us-east-1.amazonaws.com/docker-gs-ping:latest": rpc error: code = Unknown desc = Error response from daemon: Head "https://549840312665.dkr.ecr.us-east-1.amazonaws.com/v2/docker-gs-ping/manifests/latest": no basic auth credentials
  Warning  Failed     22m (x4 over 23m)     kubelet            Error: ErrImagePull
  Warning  Failed     22m (x6 over 23m)     kubelet            Error: ImagePullBackOff
  Normal   BackOff    3m47s (x85 over 23m)  kubelet            Back-off pulling image "549840312665.dkr.ecr.us-east-1.amazonaws.com/docker-gs-ping:latest"

Name:             helloworld-c6dc56598-r9b4d
Namespace:        default
Priority:         0
Service Account:  default
Node:             docker-desktop/192.168.65.4
Start Time:       Fri, 07 Jul 2023 22:22:18 +0530
Labels:           app=helloworld
                  pod-template-hash=c6dc56598
Annotations:      <none>
Status:           Pending
IP:               10.1.0.6
IPs:
  IP:           10.1.0.6
Controlled By:  ReplicaSet/helloworld-c6dc56598
Containers:
  new-container:
    Container ID:
    Image:          549840312665.dkr.ecr.us-east-1.amazonaws.com/docker-gs-ping:latest
    Image ID:
    Port:           80/TCP
    Host Port:      0/TCP
    State:          Waiting
      Reason:       ImagePullBackOff
    Ready:          False
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-84rw4 (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  kube-api-access-84rw4:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason     Age                   From               Message
  ----     ------     ----                  ----               -------
  Normal   Scheduled  23m                   default-scheduler  Successfully assigned default/helloworld-c6dc56598-r9b4d to docker-desktop
  Normal   Pulling    22m (x4 over 23m)     kubelet            Pulling image "549840312665.dkr.ecr.us-east-1.amazonaws.com/docker-gs-ping:latest"
  Warning  Failed     22m (x4 over 23m)     kubelet            Failed to pull image "549840312665.dkr.ecr.us-east-1.amazonaws.com/docker-gs-ping:latest": rpc error: code = Unknown desc = Error response from daemon: Head "https://549840312665.dkr.ecr.us-east-1.amazonaws.com/v2/docker-gs-ping/manifests/latest": no basic auth credentials
  Warning  Failed     22m (x4 over 23m)     kubelet            Error: ErrImagePull
  Warning  Failed     22m (x6 over 23m)     kubelet            Error: ImagePullBackOff
  Normal   BackOff    3m43s (x86 over 23m)  kubelet            Back-off pulling image "549840312665.dkr.ecr.us-east-1.amazonaws.com/docker-gs-ping:latest"

我已经按照 stackoverflow 上的建议尝试了许多解决方案,但似乎没有任何对我有用的解决方案,有什么建议我可以让事情正常工作吗?预先非常感谢您。

解决方法

有几件事。首先,您应该避免使用最新标签。这是一种反模式。当您将映像推送到 ECR 时,请使用构建标签或版本号作为映像标签。其次,您需要验证您的工作线程节点是否有权从 ECR 提取映像,特别是 AmazonEC2ContainerRegistryReadOnly 策略。否则,kubelet 将无法从 ECR 中提取镜像。如果注册表与集群位于不同的帐户中,则需要创建存储库[资源]策略。请参阅 https://docs.aws.amazon.com/AmazonECR /latest/userguide/repository-policies.html

以上是从 ECR 到 EKS 的图像无法正常工作,因为生成的 Pod 始终为 0/2的详细内容。更多信息请关注PHP中文网其他相关文章!

声明:
本文转载于:stackoverflow.com。如有侵权,请联系admin@php.cn删除