They can be cancelled if CPU drops below the scale-in threshold. In my case the activities were CPU-heavy, batch-style, and not client-facing, so I preferred occasional retries and slightly longer runtimes over blowing up the AWS bill. For that workload, CPU-based autoscaling was perfectly fine.

I originally ran this setup on Temporal Cloud, and pulling detailed worker/queue metrics directly from Cloud can be tricky: you need to expose custom worker metrics yourself, then pipe them into CloudWatch. If you host Temporal yourself, it is easier :)
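
For anyone wondering what "expose the metrics yourself and pipe them into CloudWatch" can look like, here is a rough sketch, not my exact setup. It assumes you already have the numbers in hand from your worker process. The metric names, the `Temporal/Workers` namespace, and the `TaskQueue` dimension are all made up for illustration; the only real API call is boto3's `put_metric_data`.

```python
from datetime import datetime, timezone


def build_metric_data(task_queue, backlog, slots_available, timestamp=None):
    """Build a CloudWatch MetricData list for one task queue.

    Metric and dimension names here are hypothetical; pick whatever
    your alarms and scaling policies will reference.
    """
    ts = timestamp or datetime.now(timezone.utc)
    dimensions = [{"Name": "TaskQueue", "Value": task_queue}]
    return [
        {
            "MetricName": "TaskQueueBacklog",  # hypothetical name
            "Dimensions": dimensions,
            "Timestamp": ts,
            "Value": float(backlog),
            "Unit": "Count",
        },
        {
            "MetricName": "WorkerSlotsAvailable",  # hypothetical name
            "Dimensions": dimensions,
            "Timestamp": ts,
            "Value": float(slots_available),
            "Unit": "Count",
        },
    ]


def publish(metric_data, namespace="Temporal/Workers"):
    """Send the metrics to CloudWatch (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder works without boto3 installed

    client = boto3.client("cloudwatch")
    client.put_metric_data(Namespace=namespace, MetricData=metric_data)
```

You would call `publish(build_metric_data("batch-queue", backlog, slots))` on a timer from the worker, then point a CloudWatch alarm or scaling policy at whichever metric you care about. How you obtain `backlog` and `slots` depends on your SDK and setup, so that part is deliberately left out.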