ci: clear old caches to avoid master branch cache trashing

GitHub's cache action doesn't allow updating a cache under the same key.
We work around this by saving the cache with a unique key each time (a
timestamp is appended). This works fine, but since there is a limit on
the cumulative storage size of all caches, it can force the master
branch cache to be evicted if a lot of PRs are updated. Caches are
evicted with an LRU policy, so as long as the master branch cache is
used it should stay alive, but it can happen that only PR-specific
caches were used recently. As a reminder, PRs can access the master
cache, but they are isolated from each other. Because of this, it is
important to keep the master cache, which is available to all, alive
longer.

The solution is to remove all old caches per branch. This is done in a
separate workflow that checks all cache items and removes the ones that
would never be used anyway. If a PR is closed, all caches for its
branch are removed. In other cases, the most recently used one is
preserved.

It is done in a separate workflow to limit cache manipulation access.
GitHub workflows triggered by the pull_request event run in the context
of the fork and do not have access to our token, which is a good thing.
Also, it is quite awkward to get the number of the PR that triggered
the build workflow, so just do a full cleanup pass.
2024-05-19 00:05:21 +00:00

name: Cleanup caches

on:
  workflow_run:
    workflows: [build]
    types: [completed]

jobs:
  cache:
    runs-on: ubuntu-latest
    steps:
      - run: |
          gh cache list -L 100 --json id,key,ref -S last_accessed_at -O desc --jq '
            map(select(.key | startswith("x86_64-w64-mingw32-") or
                              startswith("i686-w64-mingw32-") or
                              startswith("x86_64-windows-msvc-"))) |
            group_by(.ref) |
            map({
              ref: .[0].ref,
              caches: map({
                key: .key,
                prefix: (.key | capture("^(?<prefix>[\\w_-]+-)\\d+$").prefix)
              }) | group_by(.prefix) | map({keys: map(.key)})
            }) |
            .[]
          ' |
          while read -r group; do
            pr=$(echo "$group" | jq -r '.ref | capture("refs/pull/(?<num>[0-9]+)/merge").num')
            if [[ -n "$pr" ]] && [ "$(gh pr view $pr --json state --jq '.state')" != "OPEN" ]; then
              keys=$(echo "$group" | jq -c '.caches | map(.keys) | .[]')
            else
              keys=$(echo "$group" | jq -c '.caches | map(.keys[1:]) | .[]')
            fi
            for key in $(echo "$keys" | jq -r '.[]'); do
              gh cache delete "$key"
            done
          done
        env:
          GH_REPO: ${{ github.repository }}
          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
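
The selection rule described in the commit message (keep the most recently used cache per key prefix, or delete everything when the PR is closed) can be tried offline without touching any real caches. The following is only a minimal sketch in plain bash, not the workflow's actual code: the function name `select_stale` and the sample keys are hypothetical, it assumes keys have the form `<prefix>-<digits>`, and it assumes the input is already ordered most recently accessed first. It needs bash 4 or newer for associative arrays.

```shell
#!/usr/bin/env bash
# Sketch: decide which cache keys of ONE ref would be deleted.
# Reads keys from stdin, most recently accessed first (assumed ordering).
# Prints the keys that would be passed to `gh cache delete`.
#
# select_stale KEEP_NEWEST   (hypothetical helper, not part of the workflow)
#   KEEP_NEWEST=1  keep the first key seen for each prefix, print the rest
#   KEEP_NEWEST=0  print every matching key (the closed-PR case)
select_stale() {
  local keep_newest=$1 key prefix
  local -A seen=()
  while IFS= read -r key; do
    # Split off the trailing numeric suffix (the timestamp).
    # Keys that do not match the assumed format are ignored.
    [[ $key =~ ^(.+-)[0-9]+$ ]] || continue
    prefix=${BASH_REMATCH[1]}
    if [[ $keep_newest == 1 && -z ${seen[$prefix]:-} ]]; then
      seen[$prefix]=1   # newest key for this prefix: keep it
      continue
    fi
    printf '%s\n' "$key"
  done
}

# Usage with made-up keys: an open PR (or a branch) keeps one key per prefix.
printf '%s\n' \
  x86_64-w64-mingw32-300 \
  i686-w64-mingw32-250 \
  x86_64-w64-mingw32-200 \
  x86_64-w64-mingw32-100 |
  select_stale 1
# prints:
#   x86_64-w64-mingw32-200
#   x86_64-w64-mingw32-100
```

Calling `select_stale 0` on the same input prints all four keys, which corresponds to the branch of the workflow taken when the PR is no longer open.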