Contents
1. Purpose
2. Mechanism
2.1. Cache vs artifacts
2.2. Good caching practices
2.2.1. Share caches across the same branch
2.2.2. Share caches across different branches
2.2.3. Disable cache on specific jobs
2.2.4. Common use cases: Cache Node.js dependencies
2.3. Where the caches are stored
2.4. How archiving and extracting works
2.5. Clearing the cache manually
2.6. Artifacts
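1. Purpose
A quick recap of [GitLab CI/CD]: Some Useful Basics: with the default Git strategy (fetch), a git clean runs before every job, so any intermediate results produced while a job runs are wiped. Most of the time that is fine. There are exceptions, though, where a job needs the intermediate results of an earlier job. The two most representative cases are the ones covered below: caches and artifacts.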
2. Mechanism
GitLab CI/CD provides a caching mechanism that can be used to save time when your jobs are running.
Caching speeds up a job by reusing content that a previous job already downloaded or produced. Use caching when you are developing software that depends on other libraries which are fetched via the internet during build time.
If caching is enabled, it’s shared between pipelines and jobs at the project level by default. Caches are not shared across projects.
2.1. Cache vs artifacts
If you use cache and artifacts to store the same path in your jobs, the cache might be overwritten because caches are restored before artifacts.
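Don’t use caching for passing artifacts between stages, as it is designed to store runtime dependencies needed to compile the project:
- Caches: store project dependencies, such as downloaded packages, so that later pipelines do not have to fetch them again.
- Artifacts: store the results of a job so that they can be passed to later stages of the same pipeline.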
Both artifacts and caches define their paths relative to the project directory, and can’t link to files outside it.
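The split between the two can be sketched in a single job. This is only an illustration: the job name, the dist/ output directory, and the npm build script are assumptions, not part of the project described in this post. The point is that the cache and the artifacts use different paths, so neither overwrites the other.

```yaml
# Hypothetical job: name, paths, and commands are illustrative only.
build:
  stage: build
  script:
    - npm ci --cache .npm --prefer-offline
    - npm run build            # assumed to write its output to dist/
  cache:
    key: ${CI_COMMIT_REF_SLUG}
    paths:
      - .npm/                  # downloaded dependencies, reused by later pipelines
  artifacts:
    paths:
      - dist/                  # build output, passed to later stages
```

Both paths are relative to the project directory, as required.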
2.2. Good caching practices
2.2.1. Share caches across the same branch
Define a cache with the key: ${CI_COMMIT_REF_SLUG} so that jobs of each branch always use the same cache:
cache:
  key: ${CI_COMMIT_REF_SLUG}
This configuration is safe from accidentally overwriting the cache, but the first pipeline of a merge request is slow because its branch has no cache yet. The next time a new commit is pushed to the branch, the cache is re-used and jobs run faster.
To enable per-job and per-branch caching:
cache:
  key: "$CI_JOB_NAME-$CI_COMMIT_REF_SLUG"
To enable per-stage and per-branch caching:
cache:
  key: "$CI_JOB_STAGE-$CI_COMMIT_REF_SLUG"
Note: CI_COMMIT_REF_SLUG, CI_JOB_NAME, and CI_JOB_STAGE are predefined environment variables.
2.2.2. Share caches across different branches
To share a cache across all branches and all jobs, use the same key for everything:
cache:
  key: one-key-to-rule-them-all
To share caches between branches, but have a unique cache for each job:
cache:
  key: ${CI_JOB_NAME}
2.2.3. Disable cache on specific jobs
If you have defined the cache globally, it means that each job uses the same definition. You can override this behavior per-job, and if you want to disable it completely, use an empty hash:
job:
  cache: {}
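Put together, a global definition with per-job overrides might look like the sketch below. The job names, paths, and echo commands are hypothetical and only mark which cache each job ends up with.

```yaml
# Global definition: every job inherits it unless it overrides cache itself.
cache:
  key: ${CI_COMMIT_REF_SLUG}
  paths:
    - vendor/

# Hypothetical job that keeps the global cache.
build:
  script:
    - echo "uses the global cache"

# Hypothetical job that disables caching with an empty hash.
lint:
  cache: {}
  script:
    - echo "runs without a cache"

# Hypothetical job that replaces the global definition with its own.
docs:
  cache:
    key: docs-${CI_COMMIT_REF_SLUG}
    paths:
      - .docs-cache/
  script:
    - echo "uses its own cache"
```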
2.2.4. Common use cases: Cache Node.js dependencies
The most common use case of caching is to avoid downloading content like dependencies or libraries repeatedly between subsequent runs of jobs. Node.js packages, PHP packages, Ruby gems, Python libraries, and others can all be cached.
By default, npm stores cache data in the home folder ~/.npm but you can’t cache things outside of the project directory. Instead, we tell npm to use ./.npm, and cache it per-branch:
build_sef:
  tags:
    - webdepartment
  cache:
    key: ${CI_COMMIT_REF_SLUG}
    paths:
      - sef/.npm/
      - sef/sef_web_legacy/build/.npm/
      - sef/sef_web_modern/build/.npm/
  stage: build
  script:
    - cd sef
    - npm ci --cache .npm --prefer-offline
    - npm run changelog
    - cd -
    - cd sef/sef_web_legacy/build
    - npm ci --cache .npm --prefer-offline
    - npx grunt buildcss
    - npx grunt compressJS
    - cd -
    - cd sef/sef_web_modern/build
    - npm ci --cache .npm --prefer-offline
    - npm run buildcss
    - npm run buildjs
    - cd -
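A later job on the same branch can reuse the npm cache without uploading it again. The sketch below assumes a hypothetical test_sef job and an npm test script that the project may not have. The policy keyword itself is a standard cache option: pull restores the cache at the start of the job and skips the upload at the end.

```yaml
# Hypothetical follow-up job; the job name and the npm script are illustrative.
test_sef:
  stage: test
  tags:
    - webdepartment
  cache:
    key: ${CI_COMMIT_REF_SLUG}
    paths:
      - sef/.npm/
    policy: pull               # restore the cache, skip the upload at the end
  script:
    - cd sef
    - npm ci --cache .npm --prefer-offline
    - npm test                 # assumes a "test" script exists in package.json
```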
2.3. Where the caches are stored
The runner is responsible for storing the cache, so it’s essential to know where it’s stored. All the cache paths defined under a job in .gitlab-ci.yml are archived in a single cache.zip file and stored in the runner’s configured cache location. By default, the archive is stored locally on the machine where the runner is installed, and the exact location depends on the type of executor.
2.4. How archiving and extracting works
This example has two jobs that belong to two consecutive stages:
stages:
  - build
  - test
before_script:
  - echo "Hello"
job A:
  stage: build
  script:
    - mkdir vendor/
    - echo "build" > vendor/hello.txt
  cache:
    key: build-cache
    paths:
      - vendor/
  after_script:
    - echo "World"
job B:
  stage: test
  script:
    - cat vendor/hello.txt
  cache:
    key: build-cache
    paths:
      - vendor/
If you have one machine with one runner installed, and all jobs for your project run on the same host:
- job A runs.
- before_script is executed.
- script is executed.
- after_script is executed.
- cache runs and the vendor/ directory is zipped into cache.zip. This file is then saved in the directory based on the runner’s setting and the cache: key.
- job B runs.
- The cache is extracted, if one is found.
- before_script is executed.
- script is executed.
By using a single runner on a single machine, you don’t have the issue where job B might execute on a runner different from job A. This setup guarantees the cache can be reused between stages. It only works if the execution goes from the build stage to the test stage in the same runner/machine. Otherwise, the cache might not be available.
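When a project has more than one runner available, one way to approximate the single-runner setup is to pin both jobs to the same runner with tags. The tag name below is hypothetical and has to match a tag actually assigned to exactly one runner. This is a sketch of that idea, not the only way to share a cache between runners.

```yaml
stages:
  - build
  - test

job A:
  stage: build
  tags:
    - shared-cache-runner      # hypothetical tag, assigned to a single runner
  script:
    - mkdir -p vendor/
    - echo "build" > vendor/hello.txt
  cache:
    key: build-cache
    paths:
      - vendor/

job B:
  stage: test
  tags:
    - shared-cache-runner      # same tag, so the job lands on the same runner
  script:
    - cat vendor/hello.txt
  cache:
    key: build-cache
    paths:
      - vendor/
```

Even then, a cache is a best-effort optimization. Files that job B strictly needs from job A are better passed as artifacts.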
2.5. Clearing the cache manually
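If you want to avoid editing .gitlab-ci.yml, you can clear the cache via the GitLab UI: on the project’s CI/CD > Pipelines page, select Clear runner caches. The next pipeline then starts with a new cache.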
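The alternative that does involve editing .gitlab-ci.yml is to change the cache key, because jobs only restore a cache whose key matches. A minimal sketch, assuming a hypothetical version prefix that is bumped by hand whenever the old cache should be abandoned:

```yaml
cache:
  # Hypothetical "v2-" prefix: it was "v1-" before, and changing it makes
  # every job start from an empty cache under the new key.
  key: "v2-${CI_COMMIT_REF_SLUG}"
  paths:
    - vendor/
```

The old archive is not deleted by this change. It is simply no longer looked up.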
2.6. Artifacts
Example:
build_sef:
  tags:
    - webdepartment
  cache:
    key: ${CI_COMMIT_REF_SLUG}
    paths:
      - sef/.npm/
      - sef/sef_web_legacy/build/.npm/
      - sef/sef_web_modern/build/.npm/
  stage: build
  script:
    - cd sef
    - npm ci --cache .npm --prefer-offline
    - npm run changelog
    - cd -
    - cd sef/sef_web_legacy/build
    - npm ci --cache .npm --prefer-offline
    - npx grunt buildcss
    - npx grunt compressJS
    - cd -
    - cd sef/sef_web_modern/build
    - npm ci --cache .npm --prefer-offline
    - npm run buildcss
    - npm run buildjs
    - cd -
    - cd sef
    - mvn -q versions:set -DnewVersion=$CI_COMMIT_TAG
    - mvn -q versions:commit
    - mvn -q -Dmaven.test.skip=true clean deploy
    - cd -
  artifacts:
    paths:
      - sef/sef_web/target/sef_web.war
      - sef/sef_wing/target/sef_wing.war
      - sef/sef_muif/target/sef_muif.war
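A job in a later stage can then pick up these files. The sketch below assumes a hypothetical deploy_sef job. Its script only lists the downloaded files, since the real deployment commands are not part of this post. Artifacts from earlier stages are downloaded by default, and dependencies narrows that down to the jobs listed.

```yaml
# Hypothetical consumer job; the name and script are illustrative only.
deploy_sef:
  stage: deploy
  tags:
    - webdepartment
  dependencies:
    - build_sef                # fetch artifacts from this job only
  cache: {}                    # no npm cache needed here
  script:
    - ls -l sef/sef_web/target/sef_web.war
    - ls -l sef/sef_wing/target/sef_wing.war
    - ls -l sef/sef_muif/target/sef_muif.war
```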
References:
Cache dependencies in GitLab CI/CD: https://docs.gitlab.com/ee/ci/caching/