In this article, we share Open Climate Fix’s decision-making process around selecting Airflow as a data orchestrator and why and how we’re managing it ourselves.
Airflow and other data orchestrators are designed to make data flows easier to build and more robust. Tools like Dagster and Prefect provide a pleasant UI that surfaces your business logic, simplifies triggering new runs, and makes it easy to see what went wrong. Throw in a bit of version control, and you've got a production-ready system.
At Open Climate Fix we’ve found that there’s business value in understanding the problem we’re trying to solve before adopting a new technology. Just because a solution is trendy and everybody’s using it doesn’t mean it’s appropriate for Open Climate Fix. And we want to avoid paying for bells and whistles that fix problems we don’t have.
Managed instances such as GCP Cloud Composer or AWS MWAA provide an infrastructure-free route to orchestration, removing the need for developer time investment and enabling interoperability with other cloud services. Business-critical workflows can be offloaded to them with confidence: they are built for resilience and scalability, and they define security scopes using the cloud provider’s existing IAM tools. All this comes at a cost far lower than the salary of a dedicated DevOps engineer, even one who only spends a few days a month managing the service. For the large majority of users, a cloud-managed solution will clearly beat deploying Airflow manually.
In the pre-Airflow era, we handled orchestration using ECS task scheduling. It is a simple way to kick off cron jobs, it definitely got Open Climate Fix started, and it costs nothing beyond the compute we had to pay for anyway. However, as the number and business importance of those tasks grows, cron jobs quickly become inadequate, especially when something fails. Diagnosing an error in a workflow spanning multiple tasks is tricky: without a centralised view of the workflow graph, it is difficult to find where and why it failed.
An orchestrator seemed like the logical next step for Open Climate Fix. Amongst the many options available, Airflow seemed the sensible choice: an open-source orchestration solution that has been in active use for almost ten years. Its longevity and community support give it a leg up on its competitors, and, crucially, it can run AWS ECS tasks, which is where our jobs are already defined.
We created a Docker Compose file with the following services:
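For readers unfamiliar with the layout, a typical Airflow Compose setup looks roughly like this. This is a trimmed sketch based on the official `docker-compose.yaml`; the image tags and service names are common defaults, not necessarily our exact configuration:

```yaml
services:
  postgres:                 # metadata database backing Airflow
    image: postgres:13

  airflow-scheduler:        # parses DAGs and triggers runs on schedule
    image: apache/airflow:2.7.1
    command: scheduler
    volumes:
      - ./dags:/opt/airflow/dags
      - ./logs:/opt/airflow/logs

  airflow-webserver:        # the UI for inspecting runs and logs
    image: apache/airflow:2.7.1
    command: webserver
    ports:
      - "8080:8080"
    volumes:
      - ./dags:/opt/airflow/dags
      - ./logs:/opt/airflow/logs
```

Note that the `dags/` and `logs/` directories are bind-mounted into every Airflow service, which is what makes the permissions issue below possible.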
Trick: the DAGs and logs need to be accessible by every service in the Docker Compose file. Getting this right took us ages (two days), with plenty of frustrating `write permission` errors along the way.
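The usual fix, documented in Airflow’s “Running Airflow in Docker” guide, is to run the containers as your host user so that files written from inside the containers stay writable on the host. A minimal sketch, assuming the official compose file (which reads `AIRFLOW_UID` from a `.env` file):

```shell
# Create the host directories that will be bind-mounted into the containers.
mkdir -p ./dags ./logs ./plugins

# Run Airflow inside the containers with the host user's UID, so the
# bind-mounted dags/ and logs/ folders don't end up owned by root.
echo "AIRFLOW_UID=$(id -u)" > .env
```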
We created about ten DAGs (and growing), which provide our 24/7 solar generation forecast for National Grid ESO. Some of these DAGs run every five minutes, some run once a day.
For some companies, the AWS MWAA managed service for Airflow can be set up very quickly and will be the appropriate choice. However, for companies trying to keep their AWS overheads low, with some experience of coordinating services in the cloud or just a willingness to learn, taking Airflow deployment into your own hands can prove effective and make good financial sense.
Thanks for reading! If you would like to find out more, you can see what we have done on GitHub, or get in contact if you have any questions.