There’s a pretty exciting look into the future of CI job scaling taking shape at GitLab, and it’s clear they’re tackling some long-standing infrastructure pain points. Currently, GitLab uses Docker Machine to spin up ephemeral runners, which works but comes with several limitations. Docker Machine is essentially in maintenance mode, and its abstraction layer can be clunky and inconsistent across cloud providers. For large-scale CI/CD systems, that compromises reliability, speed, and fine-grained control, which is far from ideal in a fast-moving dev environment.

Enter the proposed architecture: a pair of new components, Fleeting and Taskscaler, designed to talk directly to cloud provider APIs. According to the GitLab epic describing the work, Fleeting acts as the abstraction layer between GitLab and virtual machine provisioning, while Taskscaler handles scheduling and dynamically scaling runners. Together they let GitLab scale CI jobs much more efficiently, removing the Docker Machine bottleneck and offering greater flexibility and control. By working directly with cloud APIs, they can spin up VMs faster, use cloud-native features, and build smarter autoscaling logic.
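
To make that split of responsibilities concrete, here is a minimal sketch of how I picture it. Everything in it is hypothetical: `InstanceGroup`, `InMemoryGroup`, `desired_instances`, and `reconcile` are names I made up for illustration, not the real Fleeting or Taskscaler APIs, and the scaling rule (enough VMs for the active jobs, plus a small idle buffer, capped at a maximum) is just one plausible policy.

```python
import math
from dataclasses import dataclass, field
from typing import Protocol


class InstanceGroup(Protocol):
    """Hypothetical provider-facing interface (the role the post assigns to Fleeting)."""

    def instances(self) -> list[str]: ...
    def increase(self, count: int) -> None: ...
    def decrease(self, instance_ids: list[str]) -> None: ...


@dataclass
class InMemoryGroup:
    """Stand-in for a cloud provider plugin; it only tracks instance IDs."""

    _ids: list[str] = field(default_factory=list)
    _next: int = 0

    def instances(self) -> list[str]:
        return list(self._ids)

    def increase(self, count: int) -> None:
        for _ in range(count):
            self._ids.append(f"vm-{self._next}")
            self._next += 1

    def decrease(self, instance_ids: list[str]) -> None:
        self._ids = [i for i in self._ids if i not in instance_ids]


def desired_instances(active_jobs: int, jobs_per_instance: int,
                      idle_instances: int, max_instances: int) -> int:
    """Assumed policy: cover active jobs, keep an idle buffer, respect the cap."""
    needed = math.ceil(active_jobs / jobs_per_instance) + idle_instances
    return min(needed, max_instances)


def reconcile(group: InstanceGroup, active_jobs: int, jobs_per_instance: int = 1,
              idle_instances: int = 1, max_instances: int = 10) -> int:
    """Scaler side (the role the post assigns to Taskscaler): move the group toward the target."""
    current = group.instances()
    target = desired_instances(active_jobs, jobs_per_instance,
                               idle_instances, max_instances)
    if target > len(current):
        group.increase(target - len(current))
    elif target < len(current):
        # Simplification: drop the newest VMs; a real scaler would pick idle ones.
        group.decrease(current[target:])
    return len(group.instances())


if __name__ == "__main__":
    group = InMemoryGroup()
    print(reconcile(group, active_jobs=5, jobs_per_instance=2))  # scale up
    print(reconcile(group, active_jobs=0, jobs_per_instance=2))  # scale down to the idle buffer
```

The point of the sketch is the seam: the scaling logic only ever sees `instances`, `increase`, and `decrease`, so swapping cloud providers means swapping the group implementation, not the scaler.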

What I really like about this direction is that GitLab is being transparent about the design and evolution of their CI architecture. Through their engineering handbook, you can actually follow along with their thinking, including the trade-offs, interim steps, and technical constraints. It’s a reminder that scalable CI isn’t just a “feature”; it’s a foundational piece of modern DevOps. When your platform can scale runners intelligently, you cut build wait times, manage costs better, and ultimately ship faster and more reliably.