Automation – all or nothing
As I mentioned in my article about why we love Terraform, at Ackee we maintain infrastructure declaratively, as IaC (Infrastructure as Code). The heart of the machine called Ackee DevOps is therefore Terraform and Kubernetes. Let’s take a look at how we use these tools in the real life of agency app development.
Terraform
When writing declarative .tf files, it is necessary to keep the “all or nothing” principle in mind and avoid shortcuts. What does that mean?
Nowadays, when our applications run in public clouds (in our case GCP, Google Cloud Platform), most of the infrastructure can be “clicked up” in the web interface. This approach is certainly convenient for small projects or testing, but it is doom for the “all or nothing” principle. IaC tools such as Terraform call individual API methods, whereas, as we have confirmed several times, a single action in the web interface (in GCP, the Google Cloud Console) often calls several API methods, sometimes in a specific required order. What was clicked up therefore does not translate one-to-one into Terraform resources. That is why we try to work with Terraform right from the development environment.
Dev vs. Ops
The agreement with developers is therefore the following: If you need a piece of infrastructure that recurs across projects, typically a GCS (Google Cloud Storage) bucket for persistent data, go directly to Terraform, look at how an existing running project defines it, and define a new bucket with modified parameters (mostly the name). If you need a piece of infrastructure that you have never worked with, try to define it in Terraform, provided that it takes only a trivial amount of time. If it would take longer, you can “click it up” and leave replicating it in Terraform to the DevOps team. In the dev environment, developers have elevated user rights so that they are not slowed down by the often busy DevOps team.
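The recurring bucket case described above could look roughly like the following. This is a minimal sketch, not a definition taken from a real project: the resource label, bucket name, and location are placeholder assumptions, and a real project would copy whatever parameters its sibling projects already use.

```hcl
# Hypothetical bucket for persistent application data.
# The label "app_data", the name, and the location are placeholders.
resource "google_storage_bucket" "app_data" {
  name     = "example-project-app-data" # bucket names must be globally unique
  location = "EU"

  # Assumed setting: manage access through IAM only, without per-object ACLs.
  uniform_bucket_level_access = true
}
```

Adding another bucket is then mostly a matter of copying this block and changing the label and the name.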
Infrastructure staging
In the stage environment, developers have read-only rights and infrastructure is always deployed using Terraform. This intermediate step is a safeguard against overlooking anything that was “unexpectedly created” in the web interface of the cloud provider. It often happens that everything looks fine in the development environment, yet the stage environment reveals that something is missing in the Terraform definitions. This step ensures that git is the “single source of truth”.
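The read-only rights in the stage environment can themselves be expressed in Terraform. The sketch below is one possible way to do it and is not necessarily how the access is actually set up; the project ID and the group address are hypothetical, and the basic Viewer role is an assumption about which read-only role is appropriate.

```hcl
# Hypothetical read-only access for developers in the stage project.
# Project ID and group address are placeholders.
resource "google_project_iam_member" "developers_stage_viewer" {
  project = "example-project-stage"
  role    = "roles/viewer" # basic read-only role; a narrower role may fit better
  member  = "group:developers@example.com"
}
```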
Production version of infrastructure
In the production environment, there is nothing that can surprise us and we deploy only validated Terraform setups.
(Almost) Everything as code
With Terraform, the biggest issues arise when it is bypassed. The only problems we have run into came up when importing parts of the infrastructure that had already been created by hand; this can be more complex and time-consuming than creating those parts directly in Terraform.
An exception to the rule is Firebase, which we use frequently. For a long time, it could not be created through the API (and therefore not through Terraform). Even this exception has since been resolved:
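To illustrate what importing an already existing resource involves, here is a minimal sketch. It assumes Terraform 1.5 or newer, which supports `import` blocks; on older versions the `terraform import` command does the same job. The text above does not say which mechanism was used, and the bucket name and resource label are hypothetical.

```hcl
# Tell Terraform that this existing, manually created bucket
# should be managed by the resource block below.
import {
  to = google_storage_bucket.legacy_uploads
  id = "example-project-legacy-uploads" # placeholder: name of the existing bucket
}

# The definition has to match what really exists in the cloud,
# otherwise the next plan proposes changes to the imported resource.
resource "google_storage_bucket" "legacy_uploads" {
  name     = "example-project-legacy-uploads"
  location = "EU"
}
```

The time-consuming part is usually the second block: every attribute that was set by hand has to be found and written down before the plan comes out clean.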
https://github.com/terraform-providers/terraform-provider-google/issues/2973#issuecomment-606156150
Kubernetes
Even though Kubernetes is primarily used to run the applications themselves, we also deploy two types of infrastructure objects in it:
Company infrastructure
Our favorite services all run in containers: Jenkins, GitLab, Vault, Redmine etc.
Application infrastructure
Even though we try to use services offered by public clouds (Cloud SQL for MySQL/Postgres, Memorystore for Redis etc.), it is sometimes necessary to deploy some supporting infrastructure into a cluster. For this, we commonly use the Terraform Helm provider. For things that do not have a Helm chart, we use our own Jenkins pipeline, which can deploy Kubernetes manifests.
One thing that helped us reach perfection, and with it the “all or nothing” dream, was Sealed Secrets, the “last piece of the puzzle”. In Kubernetes, it is possible to define a complete application and, with the help of a Jenkins pipeline, deploy everything automatically from the repository directly to the cluster; this approach is called GitOps. The only thing that got in the way were passwords (or API keys and other credentials). They have to be defined somewhere, because the application cannot run without them, but it is not a good idea to store them unencrypted in a repository, even one with limited access.
Sealed Secrets solve this issue: credentials are committed to the repository in encrypted form and only the controller running in the cluster can decrypt them. Thanks to that, it is possible to really have everything “as code”.
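A deployment through the Terraform Helm provider could be sketched as follows. The release name, chart, repository URL, namespace, and values are all hypothetical placeholders, and the snippet assumes that the `helm` provider is already configured with access to the cluster.

```hcl
# Hypothetical supporting service installed from a Helm chart.
resource "helm_release" "supporting_service" {
  name             = "supporting-service"
  repository       = "https://charts.example.com" # placeholder chart repository
  chart            = "supporting-service"         # placeholder chart name
  namespace        = "example-app"
  create_namespace = true

  # Chart values are passed as YAML documents; the key below is an assumption
  # about what the hypothetical chart accepts.
  values = [
    yamlencode({
      replicaCount = 1
    })
  ]
}
```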
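For illustration, a sealed secret committed to the repository might look like the sketch below. It is written as a `kubernetes_manifest` resource only to stay in Terraform syntax; the Jenkins pipeline described above would apply the equivalent YAML manifest instead. The names, the namespace, and the ciphertext are placeholders. In practice the encrypted value is produced by the `kubeseal` CLI, and the snippet assumes the Sealed Secrets controller is installed in the cluster.

```hcl
# Hypothetical SealedSecret: safe to commit, because only the controller
# running in the cluster holds the key needed to decrypt it.
resource "kubernetes_manifest" "app_credentials" {
  manifest = {
    apiVersion = "bitnami.com/v1alpha1"
    kind       = "SealedSecret"
    metadata = {
      name      = "app-credentials"
      namespace = "example-app"
    }
    spec = {
      encryptedData = {
        # Placeholder, not real ciphertext; generate the value with kubeseal.
        DB_PASSWORD = "PLACEHOLDER_CIPHERTEXT"
      }
    }
  }
}
```

Once applied, the controller creates an ordinary Kubernetes Secret with the same name, which the application can reference as usual.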
Conclusion
It proved useful to run the infrastructure development cycle with a gradually stricter focus on the completeness of the code. In the development environment, we leave a lot of room for agile development together with our backend developers; in the stage environment, we validate the functionality together; and in the production environment, we can confidently deploy functioning infrastructure.