Preview Gitlab CI/CD DSL

Irrespective of the CI/CD implementation, I've run into similar issues that are hard to debug unless you push to the repository and run a test. That makes testing a potentially nail-biting affair for later stages, or for items that only run on a tag. Some common issues:

  • A referenced script is not marked as executable.
  • The file was not added via git.
  • The regex was slightly off.
  • A child job did not have the regex added to it.
  • Dependency names change.
  • Stage names change.
  • Piles of YAML.
  • Navigation is done via find, rather than go-to-definition.
  • Debugging extends or includes is difficult.
  • Mismatched environment names (int vs integration).

I just finished working on a new pipeline for a Kubernetes deployment. The YAML file is approaching 800 lines, and that's with extends and anchors. It's a first pass, but still.

Each CI/CD system I come across requires mapping a local build system into its custom syntax. Whether it's GitLab, Travis, or GitHub, it's cast into YAML, and these pitfalls can occur.

Adding Complexity for Sanity

In the spirit of the type-safe-all-the-things mantra, and my continued abuse of JSON Schema, I have found my next target: GitLab. A community-maintained schema is available here. As I dug into this, I found open tickets asking for an official GitLab-sanctioned schema, motivated by better linting and sanity checking. But since GitLab is essentially building a custom templating language, it's hard to write a schema for that. While JSON Schema has some flexibility, it can only go so far. The pain points largely focus on extends and includes.

But if we look at what a pipeline actually needs, it boils down to a set number of core records: environment variables, services, stages, jobs, etc. If, instead of a templating language, we have an actual programming language, we no longer need the template constructs.

Below is a cursory code sample; much like in the previous posts, this is not a stable DSL.

    // Bootstrap job: generates the JSON schema and exposes it as an artifact.
    val bootStrapJob = Stages.bootstrap.createJob("schema") {
        script = listOf(
            "./boot_strap.sh"
        )
        artifacts = Artifacts(
            paths = listOf(
                "./gitlab/src/commonMain/resources/gitlab-ci.schema.json"
            ),
            expireIn = "1 week"
        )
    }
    // Build job: regenerates the schema and assembles the project, reusing the bootstrap output.
    val buildAllJob = Stages.build.createJob("all") {
        script = listOf(
            "./gradlew :gitlab:jsonSchemaGenerate",
            "./gradlew assemble"
        )
        artifacts = Artifacts(
            paths = listOf(
                "./gitlab/build"
            )
        )
        dependencies = listOf(bootStrapJob)
    }
    // Lint job: runs ktlint in the build stage alongside the assemble job.
    val buildLintJob = Stages.build.createJob("lint") {
        script = listOf(
            "./gradlew ktlintCheck"
        )
    }
    // Assemble the pipeline: shared image, ordered stages, and the jobs defined above.
    val pipeLine = pipeLineOf {
        image = Image(name = "registry.gitlab.com/animusdesign/kotlin-mp-build/master:latest")
        stages = listOf(
            Stages.bootstrap,
            Stages.build
        )
        withJobs(bootStrapJob, buildAllJob, buildLintJob)
    }

What are we buying here? Not much, really; for a simple pipeline like this, plain YAML is just as good. The complexity is only worth it when the templating language becomes hard to maintain.

Adding Sanity

Non-Executable File

Take the pain point above of forgetting to chmod +x a script, and suppose we add a job like the following snippet.

    val deployEnvJob = Stages.deployDev.createShellJob(
        "./ci_cd/scripts/get_kubecontext.sh",
        "./ci_cd/scripts/build_manifests.sh",
        "./ci_cd/scripts/deploy.sh"
    )

A validator can be added to that extension method, running through each script called in the job and ensuring it's present, executable, and tracked in git. With these validators, rather than just validating the syntax, we can validate the intent of the pipeline and its ability to work given the current repository.
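As a rough sketch of what such a validator could look like, here is a standalone function; the ValidationError type, and the way scripts would be pulled off a job, are assumptions rather than part of the DSL above.

    import java.io.File

    // Hypothetical error record; not part of the DSL shown above.
    data class ValidationError(val job: String, val message: String)

    // Check every script a shell job calls: it must exist, be executable,
    // and be tracked by git in the current repository.
    fun validateShellScripts(jobName: String, scripts: List<String>): List<ValidationError> {
        val tracked = ProcessBuilder("git", "ls-files")
            .start().inputStream.bufferedReader().readLines().toSet()

        return scripts.flatMap { path ->
            val file = File(path)
            buildList {
                if (!file.exists())
                    add(ValidationError(jobName, "$path does not exist"))
                else if (!file.canExecute())
                    add(ValidationError(jobName, "$path is not marked executable (chmod +x)"))
                if (path.removePrefix("./") !in tracked)
                    add(ValidationError(jobName, "$path is not tracked by git"))
            }
        }
    }

Run for each job before the YAML is ever emitted, this turns "the script wasn't executable" from a failed pipeline into a pre-commit error.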

Drifting Dependencies

Each job creation method returns a JobResponse, which contains a reference. This eases the concern about missing a dependency reference in YAML markup. Extending the example above:

    val deployEnvJob = Stages.deployDev.createShellJob(
        ...
    ) {
        dependsOn(bootStrapJob, buildAllJob)
    }

Even if the underlying stage or job name changes, it will still have the proper reference.
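To make that concrete, here is a rough sketch of the shape a JobResponse could take; the field names and qualifiedName logic are assumptions, not the real implementation.

    // Assumed shape of the value returned by createJob / createShellJob.
    data class JobResponse(val stage: String, val name: String) {
        // The name written into the generated YAML, derived from the object itself.
        val qualifiedName: String get() = "$stage:$name"
    }

    class JobBuilder {
        private val dependencies = mutableListOf<JobResponse>()

        fun dependsOn(vararg jobs: JobResponse) {
            dependencies += jobs
        }

        // Dependency names in the emitted YAML come from the references,
        // so renaming a stage or job updates every dependent automatically.
        fun emitDependencies(): List<String> = dependencies.map { it.qualifiedName }
    }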

Mismatched Environment Names

Is it prod, production, prd, product, or promised-land? This is a personal pain point for me, one where I will ask co-workers to double-check it for me. Why is it a pain point? That environment name drives several items:

  • Secret store path
  • IAM user name
  • Domain name
  • GitLab deploy board

With true infrastructure as code, and no, Terraform doesn't count, I start with a basic sealed class representing all the environments. That lives in a common module consumed across all utilities, cloud infrastructure, pipelines, and the application. Having a strongly typed environment name eases the potential for mismatch. I encountered a production rollback recently because the Terraform, the pipeline, and the app logic had all drifted in different directions, each looking for a different environment name. If this had been in the foundational type system, it would have been mitigated.

That plays into this by allowing an environment to be defined as a type and used in the pipeline, carrying over from other infrastructure and application code logic.
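A minimal sketch of that shared environment type; the derived values here are illustrative assumptions, not the real module.

    // Shared across infrastructure, pipeline, and application code.
    sealed class Environment(val name: String) {
        object Development : Environment("development")
        object Integration : Environment("integration")
        object Production : Environment("production")

        // Everything keyed off the environment is derived in one place,
        // so "int" vs "integration" can never drift between consumers.
        val secretPath: String get() = "secret/app/$name"
        val iamUserName: String get() = "app-deploy-$name"
        val domainName: String get() = "$name.example.com"
    }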

Location Pain

I haven't come up with an ideal solution to this yet, but the other pain point I've seen is CI/CD and deployment logic living with the code. It shows up when you've just tagged a new version of the application, but there was a CI/CD error; because the logic lives with the code, you need to cut a hot-fix tag release just to address the CI/CD.

Another pain point is PR approval. If I'm modifying the deployment logic or CI/CD for an application, it becomes a hindrance to require several approvals from the development team when it's not in their purview.

Looking Forward

There are several avenues this can take. Drastically reducing complexity and drift in pipelines is one, but it additionally empowers developers to test prior to committing.

With a simple command such as lint or validate, the tool can run through the pipeline and check that everything needed for the pipeline to run is present. To clarify: all jobs can start, and everything required to run the jobs exists. But this still leaves room for a runtime error to occur.
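A sketch of what that entry point could look like, reusing the hypothetical validator from earlier; pipeLine.jobs, job.name, and job.scripts are assumed accessors, not the actual API.

    import kotlin.system.exitProcess

    // Hypothetical "validate" command: walk every job in the pipeline,
    // run the validators, and fail before anything is pushed.
    fun main() {
        val errors = pipeLine.jobs.flatMap { job ->
            validateShellScripts(job.name, job.scripts)
        }
        if (errors.isNotEmpty()) {
            errors.forEach { println("FAIL ${it.job}: ${it.message}") }
            exitProcess(1)
        }
        println("OK: pipeline is valid")
    }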

The next option would be validate with, where a developer specifies additional parameters, say branch names, environment variables, etc., trying to mimic the target environment.

That is where a local run option would come in: being able to run through the outlined jobs, giving the developer feedback on whether their application will build. This is contingent on having a clean checkout and ensuring no mutated local state is what allows the application to build, e.g. a build completed fine because I had downloaded a resource locally that the CI/CD definition never fetches.

The last item is that a build is a build. There is a possibility to abstract this and express the intent of the necessary steps irrespective of any vendor's CI/CD format. That is to say, a developer defines the steps to build, and the tool can output GitHub, GitLab, or another CI/CD format.
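One way to frame that, purely as a sketch: keep the pipeline model vendor-neutral and give each CI/CD system its own renderer. The renderer types below are assumptions, and PipeLine stands in for whatever the pipeLineOf builder returns.

    // Hypothetical vendor-neutral rendering layer.
    interface PipelineRenderer {
        fun render(pipeline: PipeLine): String
    }

    class GitLabRenderer : PipelineRenderer {
        override fun render(pipeline: PipeLine): String =
            TODO("emit .gitlab-ci.yml from the shared model")
    }

    class GitHubActionsRenderer : PipelineRenderer {
        override fun render(pipeline: PipeLine): String =
            TODO("emit a GitHub Actions workflow from the same model")
    }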

Summary: It Works

Here is a pipeline that demonstrates the new features. It starts with a build_pipeline step that runs the code snippet above, then executes the YAML pipeline it outputs.

Up next, I will try to convert an existing pipeline.

Cons

It's Kotlin, so most people aren't familiar with it, and the start-up time is slow.