ECS AutoScaler

Large-scale compute clusters are expensive, so it is important to use them well. Utilization and efficiency can be increased by running a mix of workloads on the same machines: CPU- and memory-intensive jobs, small and large ones, and a mix of batch and low-latency jobs – ones that serve end-user requests or provide infrastructure services such as storage, naming or locking.

The challenge of scaling containers on top of Amazon ECS

ECS topology is built on Clusters. Each cluster has Services (which can be thought of as applications), and each service runs Tasks. Each task has a Task definition, which tells the scheduler how many resources the task requires.

Suppose your cluster runs 10 c3.large instances (2 vCPUs and 3.8 GiB of RAM each) and 10 c4.xlarge instances (4 vCPUs and 7.5 GiB of RAM each). The cluster then has 60 vCPUs in total, which is 60 * 1024 = 61,440 CPU units, and 113 GiB of RAM.
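
The arithmetic above can be reproduced with a short sketch. The instance figures are the ones quoted in the example; the variable names are illustrative only.

```python
# ECS expresses CPU in units, where 1 vCPU corresponds to 1024 CPU units.
CPU_UNITS_PER_VCPU = 1024

# Instance counts and per-instance resources, as quoted in the example above.
fleet = [
    {"type": "c3.large", "count": 10, "vcpus": 2, "memory_gib": 3.8},
    {"type": "c4.xlarge", "count": 10, "vcpus": 4, "memory_gib": 7.5},
]

total_vcpus = sum(g["count"] * g["vcpus"] for g in fleet)
total_cpu_units = total_vcpus * CPU_UNITS_PER_VCPU
total_memory_gib = sum(g["count"] * g["memory_gib"] for g in fleet)

print(f"{total_vcpus} vCPUs = {total_cpu_units} CPU units, "
      f"{total_memory_gib:.0f} GiB RAM")
```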

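But what happens if a single Task requires 16 GiB of RAM? Although the cluster has plenty of RAM and CPU in aggregate, the task won’t start: a task must fit on a single instance, and the largest instance in this cluster has only 7.5 GiB. The solution? MCS’s ECS AutoScaler.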
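
To make the distinction between aggregate capacity and per-instance capacity concrete, here is a minimal sketch. The function name and the dictionary layout are hypothetical, and the cluster is assumed to be empty (all resources free).

```python
def fits_on_any_instance(task_cpu_units, task_memory_gib, instances):
    """Return True if at least one single instance can host the whole task."""
    return any(
        inst["free_cpu_units"] >= task_cpu_units
        and inst["free_memory_gib"] >= task_memory_gib
        for inst in instances
    )

# Hypothetical empty cluster: 10 c3.large plus 10 c4.xlarge instances.
instances = (
    [{"free_cpu_units": 2 * 1024, "free_memory_gib": 3.8} for _ in range(10)]
    + [{"free_cpu_units": 4 * 1024, "free_memory_gib": 7.5} for _ in range(10)]
)

# The cluster as a whole has far more than 16 GiB free...
aggregate_is_enough = sum(i["free_memory_gib"] for i in instances) >= 16
# ...but no individual instance can hold a 16 GiB task.
task_can_start = fits_on_any_instance(1024, 16, instances)

print(aggregate_is_enough, task_can_start)
```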

Once the ECS AutoScaler is enabled for both “Headroom” and “Autoscale”, there is no need to keep any existing scaling policies based on CPU or Memory reservation in your ECS cluster. Remove any existing scaling rules from the configuration.

Spotinst ECS AutoScaler dynamically scales your cluster up and down so that there are always sufficient resources to run all tasks, while maximizing resource efficiency in the cluster. It does this in two ways: by optimizing task placement across the cluster, in a process we call Tetris Scaling, and by automatically managing Headroom. Headroom is a buffer of spare capacity (in terms of both memory and CPU) that lets you scale out more tasks without waiting for new instances to launch, while preventing instances from being over-utilized.
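
One way to picture Headroom is as a number of spare "units", each with a CPU and memory size, that must fit into the free capacity of the running instances. The sketch below follows that picture. It is an illustration under assumptions, not the AutoScaler's actual algorithm: the function names, the parameter names (cpu_per_unit, memory_per_unit, num_units) and the rule that each unit must fit on a single instance are all hypothetical.

```python
def headroom_units_available(instances, cpu_per_unit, memory_per_unit):
    """Count whole headroom units that fit into each instance's free capacity."""
    total = 0
    for inst in instances:
        by_cpu = inst["free_cpu_units"] // cpu_per_unit
        by_memory = inst["free_memory_mib"] // memory_per_unit
        # A unit needs both CPU and memory on the same instance.
        total += min(by_cpu, by_memory)
    return total

def missing_headroom_units(instances, cpu_per_unit, memory_per_unit, num_units):
    """Return how many headroom units are still uncovered (0 if fully covered)."""
    available = headroom_units_available(instances, cpu_per_unit, memory_per_unit)
    return max(0, num_units - available)

# Hypothetical free capacity on two running instances.
instances = [
    {"free_cpu_units": 1024, "free_memory_mib": 2048},
    {"free_cpu_units": 512, "free_memory_mib": 4096},
]
missing = missing_headroom_units(
    instances, cpu_per_unit=512, memory_per_unit=1024, num_units=5
)
print(missing)
```

In this picture, a non-zero result would be a signal to add capacity before new tasks are actually requested.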

Make sure that all scaling mechanisms managed by AWS, such as ASGs and Service Auto Scaling, are disabled, in order to prevent AWS from launching additional unmonitored instances in the cluster.

ECS AutoScaler scale down behavior

Once the ECS AutoScaler is enabled on a group, Elastigroup monitors the ECS Cluster for idle instances. An instance is considered idle if both its CPU and Memory utilization are below 40%.

When an instance has been idle for the specified number of consecutive periods, Elastigroup finds spare capacity on other instances, drains the idle instance’s tasks, reschedules them on those instances, and terminates the idle instance.

Scale down uses the Evaluation Period, which is defined as the number of consecutive minutes to check before determining that an instance is underutilized.
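
The idle check and the Evaluation Period can be sketched as follows. This is a simplified illustration: it assumes one utilization sample per minute, reads "idle" as both CPU and memory being under the 40% threshold, and uses hypothetical function names.

```python
IDLE_THRESHOLD_PERCENT = 40  # threshold quoted in the text above

def is_idle_sample(cpu_percent, memory_percent, threshold=IDLE_THRESHOLD_PERCENT):
    """A sample counts as idle only if CPU and memory are both under the threshold."""
    return cpu_percent < threshold and memory_percent < threshold

def is_scale_down_candidate(samples, evaluation_periods):
    """samples: list of (cpu_percent, memory_percent) tuples, oldest first.

    The instance qualifies only if every one of the most recent
    `evaluation_periods` samples is idle.
    """
    if evaluation_periods < 1:
        raise ValueError("evaluation_periods must be at least 1")
    if len(samples) < evaluation_periods:
        return False  # not enough history yet
    recent = samples[-evaluation_periods:]
    return all(is_idle_sample(cpu, mem) for cpu, mem in recent)

# Hypothetical per-minute samples for one instance.
history = [(55, 30), (20, 35), (10, 25), (15, 20)]
print(is_scale_down_candidate(history, evaluation_periods=3))
print(is_scale_down_candidate(history, evaluation_periods=4))
```

A single busy minute inside the window resets the decision, which is why a longer Evaluation Period makes scale down more conservative.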

Note: Scale-down actions are limited to 10% of the cluster size at a time.
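
As a rough illustration of this limit, the sketch below caps a list of idle instances at 10% of the cluster size. The function name is hypothetical, and rounding down is an assumption; the text does not say how fractional results or clusters smaller than 10 instances are handled.

```python
def cap_scale_down(idle_instance_ids, cluster_size, limit_percent=10):
    """Return at most limit_percent of the cluster from the idle candidates.

    Integer arithmetic is used and the result is rounded down (an assumption).
    """
    max_removals = cluster_size * limit_percent // 100
    return idle_instance_ids[:max_removals]

# Hypothetical cluster of 20 instances, 4 of which are idle.
idle = ["i-aaa", "i-bbb", "i-ccc", "i-ddd"]
to_remove = cap_scale_down(idle, cluster_size=20)
print(to_remove)
```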

ECS AutoScaler Constraints

ECS AutoScaler supports built-in and custom ECS task placement constraints within the scaling logic. Task placement constraints give you the ability to control where your tasks are scheduled, such as in a specific Availability Zone or on instances of a specific type. You can use the built-in ECS container instance attributes, or create your own custom key-value attributes, and add a constraint that places your tasks based on the desired attribute.
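
For illustration, ECS placement constraints are written as memberOf expressions over instance attributes. The sketch below shows two such constraints and a deliberately tiny matcher that understands only the "attribute:<name> == <value>" form. The matcher, its function names and the custom "stack" attribute are hypothetical; the real ECS expression language supports many more operators, and nothing here describes how the AutoScaler evaluates constraints internally.

```python
# Constraints in the shape used by ECS task definitions.
placement_constraints = [
    {"type": "memberOf",
     "expression": "attribute:ecs.availability-zone == us-east-1a"},
    {"type": "memberOf",
     "expression": "attribute:stack == prod"},  # hypothetical custom attribute
]

def matches(expression, attributes):
    """Evaluate only the simple 'attribute:<name> == <value>' form."""
    left, op, right = expression.partition("==")
    left = left.strip()
    if not op or not left.startswith("attribute:"):
        raise ValueError(f"unsupported expression: {expression!r}")
    name = left[len("attribute:"):]
    return attributes.get(name) == right.strip()

def is_eligible(instance_attributes, constraints):
    """An instance is eligible only if it satisfies every constraint."""
    return all(matches(c["expression"], instance_attributes) for c in constraints)

# Hypothetical attributes of two container instances.
instance_a = {"ecs.availability-zone": "us-east-1a", "stack": "prod"}
instance_b = {"ecs.availability-zone": "us-east-1b", "stack": "prod"}
print(is_eligible(instance_a, placement_constraints))
print(is_eligible(instance_b, placement_constraints))
```

A scaler that honors constraints has to apply this kind of filter both when deciding where pending tasks could run and when choosing which instance type or zone to launch.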

For further information – [link to ECS task placement utilization constraint setup]

Note: When ECS AutoScaler is enabled, Weighted scaling is disabled.