GitOps: Build infrastructure resilient applications

François D'Agostini
10 min read · Sep 30, 2020

The concept of DevOps is widely accepted and adopted in today’s business world, but automated deployments can often cause problems. In this article, we try to put the GitOps approach in perspective and show how it can positively impact modern CI/CD architectures. Written and reviewed by François D'Agostini, Régis Martin, William El Kaim and Amine Eddaifi.

According to the latest DORA report on the state of DevOps, most organizations across industries are now implementing DevOps practices. The concept is accepted and widely adopted.

However, its adoption is not without difficulties: only 20% of organizations can be considered ‘elite performers’ in DevOps practices, and 56% are low to medium performers.

The real value of DevOps is its capability to automatically test and deploy code. In fact, the main DevOps metrics for assessing an organization’s performance are the lead time and the deployment frequency. The more an organization deploys to production, the more Agile it becomes at improving its processes.

But automated deployments are not without difficulties either. Setting up an automation pipeline straight away can lead to a great deal of complexity. A straightforward approach is to use a classic continuous integration tool and develop custom deployment scripts. This can quickly become a fragile pipeline, with a high maintenance cost and hence poor efficiency.

GitOps is a recent trend in the continuous deployment area. Its momentum relies on bringing deployment processes close to software development principles. By bringing together the processes and tools used by developers and operations teams, it can definitely move an organization closer to the concept of DevOps.

This article assesses the GitOps approach and the value it brings to an organization willing to adopt it. In particular, four pillars support it:

  1. Git as the source of truth for managing application, infrastructure and deployment descriptions
  2. A dedicated CD process separated from CI
  3. The evolution towards incremental infrastructure
  4. Two implementation models (push and pull)

Git as the source of truth

As its name implies, GitOps is all about putting Git at the forefront of operations.

Git is an open-source code management tool which is very popular and well-mastered by the IT community. It includes features enabling versioning, tagging, and rollback capabilities. It is massively used to store source code for applications around the world and has proven its capability to host code.

GitOps’ main idea is to use Git to store the complete deployment infrastructure needed for an application to execute. A set of files, written following infrastructure-as-code principles, is used to allocate the required infrastructure resources, to configure the deployment and to ensure a resilient execution. The information stored in Git thus describes the exact state that an application should reach, and the tools are left to figure out how to get there:

The target system’s state is hosted on Git, not on intermediary tools

The main impact of using Git as the source of all these files (usually called manifests) is that a developer no longer needs to work their way across all the tools used to deploy infrastructure. The only tool a developer needs to know is Git. Modifying a file and then pushing the change to the Git server is enough to apply the developer’s change to the real system.
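To make the idea of a manifest concrete, here is a minimal sketch in Python that builds a Kubernetes-style Deployment description and serializes it as JSON (Kubernetes accepts JSON as well as YAML manifests). The application name, image tag and registry host are hypothetical, and a real manifest would usually carry more fields. The point is only that the file describes the desired end state, not the steps needed to reach it.

```python
import json


def deployment_manifest(name: str, image: str, replicas: int) -> dict:
    """Describe WHAT should run; no deployment command appears here."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [{"name": name, "image": image}]},
            },
        },
    }


def render(manifest: dict) -> str:
    """Stable text form (sorted keys), so Git diffs show only real changes."""
    return json.dumps(manifest, indent=2, sort_keys=True) + "\n"


if __name__ == "__main__":
    # Hypothetical application and registry, for illustration only.
    manifest = deployment_manifest(
        "shop-api", "registry.example.com/shop-api:1.4.2", replicas=3
    )
    print(render(manifest))
```

From the developer's side, scaling this hypothetical service would amount to changing `replicas` in the file stored in Git and pushing the commit.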

There is no need to interact directly with deployment tools like Terraform or Kubernetes. The only actions that matter are the simple and ubiquitous Git push and pull request (see figure below). This clearly blurs the line between a developer and an operations person.

Managing infrastructure becomes the same process as adding a feature to an app, which lowers the barrier to entry for making changes to the environment.

Git becomes the main tool to interact with various platforms and orchestrators

A developer who is not yet familiar with the entire stack can still correct bugs and perform changes easily, as they only interact with Git.

The mental load of interacting with a complete system is reduced, enabling less experienced people to perform updates. Developers outside the operations team can thus contribute to the complete stack without first spending several weeks installing all the operations tools.

Decoupled CI and CD processes

The goal of modern software development is full automation, pushing DevOps to its maximum, where changes to production can be made safely in a matter of minutes. Code pushed to production goes through a pipeline, defined as a set of steps involving the construction of artifacts, execution of tests and security checks, as well as provisioning the underlying resources to run the code.
There are classically three kinds of steps: integration, delivery and then deployment:

  • integration relates to changes to code, testing changes, merging into a shared code repository
  • delivery relates to the upload in shared repositories and registries of the code to be deployed
  • deployment is the transfer to production of new code, where it can be used by clients

To deliver the real value of DevOps, each of these steps needs to be automated and executed in a continuous way (continuous integration, continuous delivery and continuous deployment). In the rest of this article, for the sake of simplicity, “CD” means “Continuous Delivery and Deployment”.

To succeed, the GitOps approach requires decoupling the CI and CD processes, which means that integration and delivery/deployment are performed independently (see diagram below).

There is value in separating continuous integration and continuous deployment tools

To understand this constraint, it is important to note that CI and CD are very different by nature.

  • Continuous integration requires understanding how to build and compile code, running several tools that analyze the code, perform static analysis and create binaries, and pushing the resulting packaged code to a repository. This activity does not need to take the target execution environment into consideration.
  • Continuous deployment is, conversely, all about moving software binaries to an execution environment and running them there. This step no longer involves any binary creation, but it requires knowing, connecting to and interacting with the execution environment. This process has the potential to disturb existing applications running in the same environment, or even to cause a fault or crash of the execution environment (due, for example, to resource starvation).

Given that CI and CD are different activities with different areas of responsibility, the new trend is to use different tools for them as well.
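One possible way to connect the two stages, sketched here as an assumption rather than something the tools mandate, is for the CI job to end by recording the freshly built image reference in the manifest stored in Git, and to stop there. The function name, container name and image references below are hypothetical.

```python
import copy


def record_new_image(manifest: dict, container: str, image: str) -> dict:
    """Last CI step (sketch): return a copy of the manifest pointing at the new image.

    The CI job would commit the result to the Git repository. It never
    contacts the target environment, so it needs no deployment credentials.
    """
    updated = copy.deepcopy(manifest)  # leave the caller's manifest untouched
    containers = updated["spec"]["template"]["spec"]["containers"]
    matches = [c for c in containers if c["name"] == container]
    if not matches:
        raise KeyError(f"no container named {container!r} in manifest")
    for entry in matches:
        entry["image"] = image
    return updated


if __name__ == "__main__":
    current = {
        "spec": {"template": {"spec": {"containers": [
            {"name": "shop-api", "image": "registry.example.com/shop-api:1.4.2"},
        ]}}}
    }
    proposed = record_new_image(
        current, "shop-api", "registry.example.com/shop-api:1.5.0"
    )
    print(proposed["spec"]["template"]["spec"]["containers"][0]["image"])
```

In this arrangement the CD tool is the only component that reads the committed manifest and acts on the target environment.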

A continuous integration tool (Jenkins, CircleCI…) will focus on building your applications into binaries, launching a complex analysis pipeline, ensuring the code is thoroughly analyzed to meet your security guidelines.

On the other hand, a continuous deployment tool like ArgoCD, Spinnaker or Flux CD focuses on efficient deployments of your workloads, including advanced deployment patterns like blue/green deployments or canary releases. These tools also make it easy to roll back to a previous deployment if a problem is seen in the field. A CD tool also monitors the system to prevent configuration drift.

Finally, separating CI and CD increases security by reducing the exposure to intrusion.

  • If CI and CD are segregated, the CI tool no longer needs to be linked with the target environment. It can limit itself to building the binaries, without any rights to deploy anything. This contrasts with the bad practice of making the CI tool also responsible for deploying the binaries to the target platform. In that case, if a plug-in is compromised (for example, Jenkins is known to have had vulnerabilities), attackers can also access your running system and potentially cause massive damage.
  • With CI/CD segregation, only the CD tool has the right to deploy anything. This certainly limits your exposure.

A step further: CD tools running in the system

In the case of container platforms like Kubernetes, the CD tools can run inside the container platform. This is even safer because no credentials are exchanged with the outside. The CD tool runs inside the target platform and hence can be granted the deployment capability within the cluster. This makes attacks even harder.

Running the CD tool inside the target system improves security by lowering the attack surface

Incremental infrastructure

Using GitOps offers a powerful new capability: on-demand incremental deployment. Declaring a complete infrastructure in a Git repository does not mean that the whole system gets redeployed every time.

Deploying quickly and often, without having to stop and restart everything, requires being able to find and apply only the changed aspect of your deployment. Microservices architecture or distributed systems (those that are not built monolithically) will naturally benefit from such a capability.
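Finding only the changed part of a deployment can be sketched as a comparison between two revisions of the repository. The example below assumes a hypothetical layout with one manifest file per service, and compares content hashes, which is similar in spirit to how Git itself identifies file contents. It is an illustration, not the algorithm of any particular CD tool.

```python
import hashlib
from typing import Dict, List


def digest(content: str) -> str:
    """Fingerprint of a file's content."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def changed_paths(previous: Dict[str, str], current: Dict[str, str]) -> Dict[str, List[str]]:
    """Compare two revisions, each given as {path: file content}."""
    old = {path: digest(text) for path, text in previous.items()}
    new = {path: digest(text) for path, text in current.items()}
    return {
        "added": sorted(p for p in new if p not in old),
        "modified": sorted(p for p in new if p in old and new[p] != old[p]),
        "removed": sorted(p for p in old if p not in new),
    }


if __name__ == "__main__":
    # Hypothetical repository contents at two successive commits.
    rev1 = {
        "services/cart.json": '{"replicas": 2}',
        "services/search.json": '{"replicas": 1}',
    }
    rev2 = {
        "services/cart.json": '{"replicas": 4}',
        "services/search.json": '{"replicas": 1}',
        "services/payments.json": '{"replicas": 2}',
    }
    print(changed_paths(rev1, rev2))
```

Only the paths reported as added, modified or removed need any action; the untouched `services/search.json` is left alone.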

In order to work effectively, without any side effects, the CD process requires an idempotent target platform.

An idempotent operation is one that can be applied multiple times without changing the result beyond the initial application. As such, only actual modification will have an effect on the running system. In continuous deployments, it means that deploying the same app version multiple times will not restart the app every time.
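The difference can be illustrated with a toy model. The `Platform` class below is hypothetical and does not reproduce the API of any real platform; it simply counts how many times an application is started, once through an idempotent `apply` and once through a non-idempotent `push`.

```python
from typing import Dict


class Platform:
    """Toy target platform that counts how often each app is (re)started."""

    def __init__(self) -> None:
        self.running: Dict[str, str] = {}   # app name -> deployed version
        self.restarts: Dict[str, int] = {}  # app name -> number of starts

    def _start(self, app: str, version: str) -> None:
        self.running[app] = version
        self.restarts[app] = self.restarts.get(app, 0) + 1

    def apply(self, app: str, version: str) -> bool:
        """Idempotent: acts only when the requested version differs."""
        if self.running.get(app) == version:
            return False  # already in the desired state, nothing to do
        self._start(app, version)
        return True

    def push(self, app: str, version: str) -> bool:
        """Non-idempotent: restarts the app on every call."""
        self._start(app, version)
        return True


if __name__ == "__main__":
    platform = Platform()
    for _ in range(3):
        platform.apply("shop", "1.0")
    print("starts after three apply calls:", platform.restarts["shop"])
    platform.push("shop", "1.0")
    print("starts after one extra push:", platform.restarts["shop"])
```

With `apply`, a pipeline can safely resubmit the whole desired state; with `push`, the pipeline itself has to work out what changed before calling the platform.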

For example, Kubernetes is an idempotent platform (deploying the same pod multiple times has no side effect), while Cloud Foundry is not (pushing an application, even with an unchanged manifest, kills and relaunches it, hence restarting the application).

Idempotency provides a great advantage when redeploying a full system in which only a few elements have changed: speed. Deployments are done in seconds, not hours. A non-idempotent platform makes the CD pipeline more fragile and error-prone, since custom development, usually scripts, is needed to avoid unnecessary deployments.

The Push and Pull model

There are two ways to implement a CD process, depending on who requests the deployment: push or pull.

The first and simpler approach is based on a push model, where the deployment request is generated by Git. In this model, a modification of your infrastructure repository triggers a job that looks at the changes and deploys or removes the corresponding resources.

This process is similar to a CI Job. The deployment job gets triggered only when a change to the repository is made.

Push Model: changes are triggered only from Git to the target system

This approach will work most of the time, but it presents two flaws:

  • The deployment is executed only when the Git repository is modified. If, for any reason, the target system deviates from the Git repository, the deviation is not corrected until the next commit on Git.
  • This model basically mimics a CI process and is typically used when an external tool performs the deployment. In this case, you’ll need to provide your CD tool with all the credentials needed to deploy the application on the target system. This can create a security flaw.
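The first flaw can be seen in a small simulation. The `PushPipeline` class below is a hypothetical stand-in for a commit-triggered deployment job, and the target system is modeled as a plain dictionary mapping a service name to its replica count.

```python
from typing import Dict


class PushPipeline:
    """Deploys only in reaction to a commit notification (sketch)."""

    def __init__(self, target: Dict[str, int]) -> None:
        self.target = target  # stands in for the live system

    def on_commit(self, desired: Dict[str, int]) -> None:
        # Called by the Git server when the repository changes,
        # for example through a webhook.
        for name in list(self.target):
            if name not in desired:
                del self.target[name]
        self.target.update(desired)


if __name__ == "__main__":
    live: Dict[str, int] = {}
    pipeline = PushPipeline(live)

    pipeline.on_commit({"shop-api": 3})
    print("after commit:", live)

    live["shop-api"] = 1  # manual change made directly on the system
    print("after manual change:", live)  # nothing triggers a correction

    pipeline.on_commit({"shop-api": 3, "search": 2})
    print("after next commit:", live)
```

Between the two commits, nothing in this model notices that the live system no longer matches the repository.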

A more modern and DevOps-compatible approach is to use a pull model, or operator mode. This architecture makes the CD tool responsible for monitoring both Git and the target system, and for reconciling the two when they diverge.

Pull Model: changes are triggered from Git but also from deviations within the target system

The main difference from the push model is that the CD tool now monitors drift between the desired state (the Git repository) and the actual state (the target system), and ensures that the actual state remains aligned with the desired state.

This prevents the system from deviating from the desired state in uncontrolled ways. For example, if an operator makes a change directly on the infrastructure by mistake, the CD tool catches it and reverts the system to the desired state described in the Git repo.
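The reconciliation described above can be sketched as a function that a CD tool would run repeatedly. How and when it is triggered (a timer, events from the platform) is left out and would be an implementation choice. The function name and the dictionary-based state model are assumptions made for illustration.

```python
import copy
from typing import Dict, List


def reconcile(desired: Dict[str, dict], actual: Dict[str, dict]) -> List[str]:
    """One pass of the loop: align `actual` with `desired`, report what was done."""
    actions: List[str] = []
    for name, spec in desired.items():
        if name not in actual:
            actions.append(f"create {name}")
        elif actual[name] != spec:
            actions.append(f"update {name}")
        else:
            continue  # already aligned, leave it untouched
        actual[name] = copy.deepcopy(spec)
    for name in [n for n in actual if n not in desired]:
        actions.append(f"delete {name}")
        del actual[name]
    return actions


if __name__ == "__main__":
    desired = {"shop-api": {"replicas": 3}}  # what Git describes
    live: Dict[str, dict] = {}               # what is actually running

    print(reconcile(desired, live))  # first pass creates the service
    print(reconcile(desired, live))  # second pass finds nothing to do

    live["shop-api"]["replicas"] = 1  # an operator changes the system by hand
    print(reconcile(desired, live))  # the drift is detected and reverted
```

Unlike a commit-triggered job, this pass does not need a new commit to notice that the live system has drifted.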

This is fundamentally what makes a dedicated CD tool different from a classic CI tool like Jenkins, which mainly implements a push model.

We recommend the pull model to get the full benefits of GitOps:

  • when programmable execution platforms are used (like Kubernetes or OpenShift…)
  • when you opt for recent tools like ArgoCD, Spinnaker and Flux CD.

Does a GitOps-oriented system still need human intervention?

A solution that automatically and incrementally reconciles the desired application deployment state with the current one may seem totally autonomous (without any need for human intervention). But of course, there are many situations where the solution cannot make its own choice and needs to rely on humans. Resource failures, memory shortages and application bugs are some of the cases where a human remains in the loop. GitOps mainly lowers the mental load of maintaining large systems, allowing humans to focus on more valuable activities.

Conclusion

GitOps is a logical evolution of DevOps. It means operations people are really working like developers, using the exact same set of practices and tools that have been in place in development for a long time.

Leveraging an existing, widely used and deployed solution, Git, simplifies the adoption of such an approach. Changes to the system are ultimately implemented through Git only, and do not require any complex platform-specific tools. This has the potential to onboard people more quickly and lowers the barrier to entry for new developers.

Having your whole infrastructure described in a Git server also helps with portability. Want your system deployed to a customer’s premises? Simply clone the Git repo and deploy it elsewhere. Want to revert to a past deployment? Just check out a past tag on the Git repo and off you go.

Finally, when you create a pipeline in which Git is the only way to update your infrastructure, your Git commit log becomes your audit trail. All the capabilities built around Git (history comparisons, commit logs, tags, branches, etc.) can be used to track changes made to your infrastructure. This greatly simplifies your path towards compliance, as Git becomes the central point of attention for setting up workflows and security rules.

By making Git central to your operations, you also open up the possibility of using the massive ecosystem of tools that can interact with it. Git is now so standard that virtually every tool in this space works with it.

Git coupled with the right deployment tools enables true GitOps and gives you superpowers by simplifying the operations landscape.

François D'Agostini

Chief Enterprise Architect at BCG Platinion. Helping with agile transformations and scaling software development, with microservices and frontend expertise.