
Harbor with Ceph S3: retries when pushing large images

Author: 云原生小白 (WeChat public account) · First published 2022-04-20, synced to the Tencent Cloud community 2022-06-06

Symptoms

When you run a private Harbor registry with Ceph S3 as the storage backend, pushes of large images may frequently fail and retry. This is especially pronounced when publishing AI model files packaged as OCI images, where a single layer can exceed 5 GB, and pushing them to Harbor. The main symptoms look like this:

Code language: shell
docker images
REPOSITORY                                TAG       IMAGE ID       CREATED       SIZE
x.x.x.x:80/library/centos                 latest    adf05892850f   3 days ago    7.61GB

docker push x.x.x.x:80/library/centos
Using default tag: latest
The push refers to repository [x.x.x.x:80/library/centos]
9e607bb861a7: Retrying in 4 seconds 
  • The logs of the load balancer in front of rgw, of nginx, and of radosgw all report 404:
Code language: json
{
    "@timestamp":"2022-04-15T17:24:25+08:00",
    "@fields":{
        "remote_addr":"x.x.x.x",
        "remote_user":"",
        "body_bytes_sent":"223",
        "request_time":"0.057",
        "status":"404",
        "request":"PUT /xxx-harbor/docker/registry/v2/blobs/sha256/72/729ec3a6ada3a6d26faca9b4779a037231f1762f759ef34c08bdd61bf52cd704/data?partNumber=1&uploadId=2~EE0O6o35Ceuqg9ZII4MAT8_gnaEkr3n HTTP/1.1",
        "request_method":"PUT",
        "request_header":"{\"x-amz-copy-source-range\":\"bytes=0-10485759\",\"x-amz-content-sha256\":\"e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855\",\"accept-encoding\":\"gzip\",\"host\":\"x.x.x.x\",\"user-agent\":\"aws-sdk-go\\/1.15.11 (go1.17.7; linux; amd64)\",\"x-amz-date\":\"20220415T092425Z\",\"x-amz-copy-source\":\"xxx-harbor\\/docker\\/registry\\/v2\\/repositories\\/library\\/centos\\/_uploads\\/9731d0e9-5d78-4fa3-a42b-2cf8f08847f4\\/data\",\"content-length\":\"0\",\"authorization\":\"AWS4-HMAC-SHA256 Credential=7IRCELLC8J9BTBNQV87C\\/20220415\\/default\\/s3\\/aws4_request, SignedHeaders=host;x-amz-content-sha256;x-amz-copy-source;x-amz-copy-source-range;x-amz-date, Signature=7232dce2e274a01a9838760c502d78f5ce1ad12b653c2f27986fe585c7594eb3\"}"
    }
}
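
Note that the failing request is a PUT with partNumber and uploadId query parameters, i.e. an UploadPartCopy call, and that its x-amz-copy-source header names the bucket without a tenant prefix; both details matter for the root-cause analysis below.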

Configuration

Ceph S3 account

Code language: text
Account: cloudsre
Tenant:  legacy
Bucket:  harbor

Harbor storage configuration for Ceph S3

Code language: yaml
storage:
  s3:
    region: default
    bucket: harbor
    regionendpoint: http://x.x.x.x
    multipartcopythresholdsize: 5368709120
  cache:
    layerinfo: redis
  maintenance:
    uploadpurging:
      enabled: false
  delete:
    enabled: true
  redirect:
    disable: false
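
Depending on how Harbor is deployed, this storage block ends up in the registry component's config.yml; with the official installer it is typically generated from the storage_service section of harbor.yml.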

Rough flow of a Harbor image push

Current Harbor configuration:

  • multipartcopythresholdsize = 5 GB (the default is 32 MB, the maximum is 5 GB); see the sketch after this list for how this threshold picks the copy path.
  • If the image contains a single file larger than the threshold, that large blob is copied in parts, rgw returns 404, and the push keeps retrying.
  • If no single file exceeds the threshold, every blob is copied in one piece and the push succeeds.
  • If the Harbor S3 account is switched to an account under the default tenant, the push succeeds regardless of whether large files are present.
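
As an illustration of the branch described in the list above, here is a minimal Go sketch of the copy-path decision, assuming behaviour similar to the docker/distribution S3 driver; the function and constant names are made up for this example.

Code language: go
// Minimal sketch of the copy-path decision for one blob, assuming behaviour
// similar to the docker/distribution S3 driver; names are illustrative.
package main

import "fmt"

// Threshold from the Harbor configuration above: 5368709120 bytes = 5 GiB.
const multipartCopyThresholdSize int64 = 5368709120

// chooseCopyPath returns which S3 copy API a blob of the given size would use.
func chooseCopyPath(blobSize int64) string {
        if blobSize <= multipartCopyThresholdSize {
                // Small blobs: a single server-side CopyObject request.
                return "CopyObject (single request)"
        }
        // Large blobs: CreateMultipartUpload + repeated UploadPartCopy
        // (the PUT ...?partNumber=N&uploadId=... calls in the 404 log above)
        // + CompleteMultipartUpload.
        return "UploadPartCopy (multipart copy)"
}

func main() {
        fmt.Println(chooseCopyPath(200 * 1024 * 1024))      // ordinary layer
        fmt.Println(chooseCopyPath(7 * 1024 * 1024 * 1024)) // >5 GiB model layer
}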

Root cause

Code language: text
\"x-amz-copy-source\":\"xxx-harbor\\/docker\\/registry\\/v2\\/repositories\\/library\\/centos\\/_uploads\\/9731d0e9-5d78-4fa3-a42b-2cf8f08847f4\\/data\"
  • When Harbor copies an object in one piece through the AWS S3 API, Ceph rgw automatically recognizes from the request's x-amz-copy-source that the source object belongs to tenant legacy / account cloudsre, so it locates the source object correctly.
  • When Harbor copies an object in parts through the AWS S3 API, i.e. calls UploadPartCopy (UploadPartCopyInput), Ceph rgw cannot derive the correct tenant and account from x-amz-copy-source; it falls back to looking for the source object in bucket xxx-harbor under the default tenant, and the copy fails with 404.

From the rgw source code, the correct source bucket path for a multipart copy, which can be specified explicitly, is: legacy:xxx-harbor
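
To make the tenant-qualified addressing concrete, the sketch below splits an x-amz-copy-source value of the form tenant:bucket/object into its parts. It is only a model of the format, using the sample keys from the reproduction below, not rgw's actual parsing code.

Code language: go
// Illustrative split of an x-amz-copy-source value in the "tenant:bucket/object"
// form; a model of the addressing only, not rgw's real parsing code.
package main

import (
        "fmt"
        "strings"
)

// splitCopySource separates a copy source into tenant, bucket and object key.
// An empty tenant means the bucket is resolved in the requester's default
// namespace, which is where the 404 in this incident comes from.
func splitCopySource(src string) (tenant, bucket, object string) {
        bucketPart, object, _ := strings.Cut(src, "/")
        if t, b, ok := strings.Cut(bucketPart, ":"); ok {
                return t, b, object
        }
        return "", bucketPart, object
}

func main() {
        fmt.Println(splitCopySource("xxx-harbor/x.c"))        // no tenant prefix -> resolved under the default tenant
        fmt.Println(splitCopySource("legacy:xxx-harbor/x.c")) // explicit tenant -> the legacy tenant's bucket
}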

Reproducing with the AWS S3 Go SDK

Code language: go
package main

import (
        "fmt"

        "github.com/aws/aws-sdk-go/aws"
        "github.com/aws/aws-sdk-go/aws/credentials"
        "github.com/aws/aws-sdk-go/aws/session"
        "github.com/aws/aws-sdk-go/service/s3"
)

var (
        // credentials of the legacy/cloudsre account
        accessKey = "your-accessKey"
        secretKey = "your-secretKey"
        endPoint  = "http://x.x.x.x:7480"
)

func main() {
        sess, err := session.NewSession(&aws.Config{
                Credentials:      credentials.NewStaticCredentials(accessKey, secretKey, ""),
                Endpoint:         aws.String(endPoint),
                Region:           aws.String("default"),
                DisableSSL:       aws.Bool(true),
                S3ForcePathStyle: aws.Bool(true),
        })
        if err != nil {
                panic(err)
        }
        svc := s3.New(sess)

        // Start a multipart upload for the destination object "cp".
        res, err := svc.CreateMultipartUpload(&s3.CreateMultipartUploadInput{
                Bucket: aws.String("xxx-harbor"),
                Key:    aws.String("cp"),
        })
        if err != nil {
                panic(err)
        }
        uploadID := res.UploadId
        fmt.Println(*uploadID)

        // Server-side copy of the existing object x.c as part 1.
        copyInput := &s3.UploadPartCopyInput{
                Bucket: aws.String("xxx-harbor"),
                // CopySource: aws.String("xxx-harbor/x.c"),  // without the tenant prefix rgw answers 404
                CopySource: aws.String("legacy:xxx-harbor/x.c"), // tenant-qualified source: the copy succeeds
                Key:        aws.String("cp"),
                PartNumber: aws.Int64(1),
                UploadId:   uploadID,
        }
        result, err := svc.UploadPartCopy(copyInput)
        fmt.Println(result, err)
        if err != nil {
                return // with the non-tenant-qualified source this is where the 404 shows up
        }

        // Complete the multipart upload with the single copied part.
        parts := []*s3.CompletedPart{{
                ETag:       result.CopyPartResult.ETag,
                PartNumber: aws.Int64(1),
        }}
        res1, err := svc.CompleteMultipartUpload(&s3.CompleteMultipartUploadInput{
                Bucket:          aws.String("xxx-harbor"),
                Key:             aws.String("cp"),
                UploadId:        uploadID,
                MultipartUpload: &s3.CompletedMultipartUpload{Parts: parts},
        })
        fmt.Println(res1, err)
}
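
With the tenant-qualified CopySource (legacy:xxx-harbor/x.c) the part copy and the completion both succeed; with the plain xxx-harbor/x.c form, rgw answers the UploadPartCopy with the same 404 seen in the registry logs above.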

Solution

Code language: yaml
storage:
  s3:
    region: default
-   bucket: harbor-prod
+   bucket: legacy:harbor-prod
    regionendpoint: http://x.x.x.x
+   chunksize: 10485760
+   multipartcopychunksize: 10485760
    multipartcopythresholdsize: 5368709120
  cache:
    layerinfo: redis
  maintenance:
    uploadpurging:
      enabled: false
  delete:
    enabled: true
  redirect:
    disable: false
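
The decisive change is the tenant-qualified bucket name (legacy:harbor-prod): with it, the registry builds x-amz-copy-source values that rgw resolves to the right tenant, so UploadPartCopy finds the source blob. The chunksize and multipartcopychunksize settings only tune the part sizes (10485760 bytes = 10 MiB) used for multipart uploads and multipart copies in the registry's S3 driver, rather than the tenant resolution itself.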