Debugging GitLab CI jobs

Posted by Markus Benning on April 03, 2022

I recently had problems with failing CI jobs, and after a few failed attempts and commits I realized I needed a better way to test my changes locally without committing them to the repo.

After some research I found that gitlab-runner exec offers a way to do just that. In my case I had to run the spec job with the docker executor, but it just gave me an error:

$ gitlab-runner exec docker spec
Runtime platform    arch=amd64 os=linux pid=40400 revision=bd40e3da version=14.9.1
FATAL: unsupported script

According to #26413, gitlab-runner exec does not support YAML anchors in .gitlab-ci.yml yet. As a workaround I had to replace the alias (*install-dependencies-script) with the value its anchor refers to.

Example (before):

image: ruby:3.1

.install-dependencies-script: &install-dependencies-script
  - echo "I'm installing project dependecies..."

spec:
  stage: test
  before_script:
    - *install-dependencies-script
  script:
    - echo "failing dummy script"
    - test -z '1'

Must be changed to:

image: ruby:3.1

spec:
  stage: test
  before_script:
    - echo "I'm installing project dependecies..."
  script:
    - echo "failing dummy script"
    - test -z '1'
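
For larger files, replacing aliases by hand gets tedious. The following is a minimal sketch of automating that step. The function name expand_anchors is made up for illustration, and it assumes a local ruby with its bundled yaml and json libraries. It loads the file with aliases enabled, copies the data through JSON so that shared values are written out in full, and flattens the nested lists that an alias inside before_script or script leaves behind. Whether gitlab-runner exec accepts the result for a given file is untested here.

```shell
# Sketch: print a CI file with YAML anchors and aliases expanded.
# Assumes ruby is installed locally; expand_anchors is a hypothetical name.
expand_anchors() {
  ruby -ryaml -rjson -e '
    # Load with aliases allowed, then round-trip through JSON so that
    # shared objects become copies and are not dumped as aliases again.
    data = JSON.parse(JSON.generate(YAML.safe_load(File.read(ARGV[0]), aliases: true)))
    # Flatten nested script lists such as "- *install-dependencies-script".
    data.each_value do |job|
      next unless job.is_a?(Hash)
      %w[before_script script after_script].each do |key|
        job[key] = job[key].flatten if job[key].is_a?(Array)
      end
    end
    puts YAML.dump(data)
  ' "$1"
}
```

Calling expand_anchors .gitlab-ci.yml prints the expanded YAML to standard output. Comments in the file are lost, and hidden keys such as .install-dependencies-script stay in the output, so it is worth reviewing the result before using it.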

After this change I was able to run the CI job on my local checkout:

$ gitlab-runner exec docker spec
Runtime platform                                    arch=amd64 os=linux pid=42134 revision=bd40e3da version=14.9.1
WARNING: You most probably have uncommitted changes. 
WARNING: These changes will not be tested.         
Running with gitlab-runner 14.9.1 (bd40e3da)
Preparing the "docker" executor
Using Docker executor with image ruby:3.1 ...
Pulling docker image ruby:3.1 ...
Using docker image sha256:a365ea82f2a306909122a89321a7a54ca58a47a90a283cc9c948046ac7ba2f22 for ruby:3.1 with digest ruby@sha256:02132b99bb12b791701ae9bd86119eb879e49478b7b5d840c6c7cc9281ee63c0 ...
Preparing environment
Running on runner--project-0-concurrent-0 via painkiller...
Getting source from Git repository
Fetching changes...
Initialized empty Git repository in /builds/project-0/.git/
Created fresh repository.
Checking out 0a95ae93 as master...

Skipping Git submodules setup
Executing "step_script" stage of the job script
Using docker image sha256:a365ea82f2a306909122a89321a7a54ca58a47a90a283cc9c948046ac7ba2f22 for ruby:3.1 with digest ruby@sha256:02132b99bb12b791701ae9bd86119eb879e49478b7b5d840c6c7cc9281ee63c0 ...
$ echo "I'm installing project dependecies..."
I'm installing project dependecies...
$ echo "failing dummy script"
failing dummy script
$ test -z '1'
ERROR: Job failed: exit code 1

FATAL: exit code 1

The next problem was that the job exits immediately after the failed command and gitlab-runner cleans up the job's environment, leaving nothing to debug.

I could not find an option in gitlab-runner to skip the cleanup, so I worked around it by adding || sleep 3600 after the failing script command:

image: ruby:3.1

spec:
  stage: test
  before_script:
    - echo "I'm installing project dependecies..."
  script:
    - echo "failing dummy script"
    - test -z '1' || sleep 3600
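
One side effect of a plain || sleep 3600 is that the line counts as successful if the sleep runs to completion, so the job carries on as if nothing had failed. A minimal sketch of a variant that pauses and then still reports the original exit code is shown here. The function run_or_pause and the variable DEBUG_PAUSE are hypothetical names, not part of gitlab-runner, and the sketch assumes the function is defined in the same shell that runs the script lines.

```shell
# Hypothetical helper, not part of gitlab-runner: run a command and, if it
# fails, keep the container alive for a while before reporting the failure.
run_or_pause() {
  _status=0
  # "cmd || ..." keeps a shell running with "set -e" from aborting here.
  "$@" || _status=$?
  if [ "$_status" -ne 0 ]; then
    echo "command failed with exit code $_status, pausing" >&2
    # DEBUG_PAUSE is an assumed variable name; the default matches sleep 3600.
    # "|| true" lets the function continue when the sleep is killed by hand.
    sleep "${DEBUG_PAUSE:-3600}" || true
  fi
  return "$_status"
}
```

The failing line would then read run_or_pause test -z '1'. Once the sleep ends or is killed, the function returns the exit code of the failed command, so the job is still marked as failed.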

That gave me some time to start a shell in the container using docker exec and check what was going on:

$ docker exec -it runner--project-0-concurrent-0-4f63e69cd20247e6-build-2 bash
# time to check what's going on...
root@runner--project-0-concurrent-0:/# kill `pgrep sleep` # get the foot out of the door
root@runner--project-0-concurrent-0:/#
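
The container name changes between runs, so it has to be looked up each time. As a rough sketch, the lookup can be scripted by filtering the names that docker ps prints. The helper pick_build_container is a hypothetical name, and its pattern is only an assumption derived from the single container name shown above; other runner versions or configurations may name containers differently.

```shell
# Hypothetical helper: read container names on stdin and print the first one
# that looks like a gitlab-runner build container. The pattern is an
# assumption based on the single name shown above.
pick_build_container() {
  grep -E '^runner-.*-build(-[0-9]+)?$' | head -n 1
}

# Usage while the job is paused (requires a running docker daemon):
#   docker exec -it "$(docker ps --format '{{.Names}}' | pick_build_container)" bash
```

If nothing matches, the function prints nothing and docker exec fails with an error about the empty container name, which is a reasonable signal that the job is no longer running.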