Scaling and autoscaling
Scaling is the act of increasing or decreasing the size of a workload or resource depending on the demand for it. Autoscaling is doing this automatically based on some sort of trigger.
As is often the case with workloads and applications, you can become a victim of your own success. The more your application succeeds, the greater the strain on it from user and service demand. Managing this strain often requires placing limits on access to your application. You should do this if you don’t want to be overwhelmed with requests; trust me, someone will try to do exactly that. But you should also have provisions in your infrastructure that help it grow naturally with your user base.
That is where scaling comes in. Scaling can be done either vertically (adding more computing power to a single machine) or horizontally (adding more machines). Vertical scaling is ideal when a single task needs more power; horizontal scaling is what you need when handling a greater number of requests. Most DevOps workloads require the latter.
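To make the idea of a scaling trigger concrete, here is a minimal sketch of how a measured load could be turned into a horizontal scaling decision. The function name, its parameters, and the numbers are all hypothetical and chosen only for illustration; a real autoscaler would read its metrics from a monitoring service rather than take them as arguments.

```python
import math


def desired_instance_count(requests_per_second, capacity_per_instance,
                           min_instances=1, max_instances=20):
    """Return how many instances are needed to serve the current load.

    All names and default limits here are illustrative assumptions.
    """
    # Round up: a partially loaded instance still has to exist.
    needed = math.ceil(requests_per_second / capacity_per_instance)
    # Clamp the result so we never scale to zero or beyond our budget.
    return max(min_instances, min(needed, max_instances))


# 950 requests/s at roughly 100 requests/s per instance needs 10 instances.
print(desired_instance_count(950, 100))
```

The clamp matters as much as the division: the lower bound keeps the service available when traffic drops, and the upper bound keeps a traffic spike from producing an unbounded bill.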
We will now explore the different types of scaling based on how hands-on you have to be with the workload that you are scaling. We will start with manual scaling and slowly escalate toward a more automated approach.
Manual scaling with Python
Before we dive into autoscaling, let’s just look at some regular scaling (done with Python, of course). We will vertically scale an instance manually using the AWS SDK for Python (Boto3). I will be using just my regular local IDE, but you can do this with any setup that has Python, AWS credentials (configured through the AWS CLI, for example), and an AWS account. So, let’s head into the steps you would need to take to manually scale an EC2 instance using Python scripts:
1. Here is the code to create an EC2 instance (this will be up in the book’s repository as well):

Figure 4.4 – Function to create an EC2 instance
And when you run it, you’ll get the instance ID (which you will need for this next part):

Figure 4.5 – EC2 instance created with a unique ID
On the AWS EC2 console, you’ll see that an instance with that same ID and instance size has been created:

Figure 4.6 – Running EC2 instance
2. Now, vertical scaling acts on that same instance but the instance size cannot be changed while it is running, so we will stop the instance first:

Figure 4.7 – Function to stop an EC2 instance
This code will stop the instance when it is run. Confirm that the instance is stopped and note that its size is still t2.nano:

Figure 4.8 – Stopped EC2 instance
3. Now, let’s write the code to modify the instance into a t2.micro instance:

Figure 4.9 – Code to update an EC2 instance
After running this code, you’ll notice that on the console, your instance is now a t2.micro instance:

Figure 4.10 – Updated EC2 instance size
4. Once you start the instance again, it will have that extra power available.
You may have noticed that this is a slog. And vertical scaling is, more often than not, a slog of downtime. While there are use cases for it (especially when you need to work with bigger individual machines), it’s not the norm. Usually, horizontal autoscaling is the better choice because it involves far less downtime. We’ll dive into that now.
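The steps above can be sketched in one place using Boto3. This is not the exact code shown in the figures; it is a minimal sketch that assumes your AWS credentials and default region are already configured. The function names are my own, and the AMI ID in the usage comment is a placeholder you would replace with a real one. Each function takes the EC2 client as an argument so that the logic can be exercised without touching a real account.

```python
def create_instance(ec2, image_id, instance_type="t2.nano"):
    """Launch a single EC2 instance and return its instance ID."""
    response = ec2.run_instances(
        ImageId=image_id,
        InstanceType=instance_type,
        MinCount=1,
        MaxCount=1,
    )
    instance_id = response["Instances"][0]["InstanceId"]
    # Wait until the instance is running before doing anything else to it.
    ec2.get_waiter("instance_running").wait(InstanceIds=[instance_id])
    return instance_id


def stop_instance(ec2, instance_id):
    """Stop the instance and block until it has fully stopped."""
    ec2.stop_instances(InstanceIds=[instance_id])
    ec2.get_waiter("instance_stopped").wait(InstanceIds=[instance_id])


def resize_instance(ec2, instance_id, new_type="t2.micro"):
    """Change the instance type; the instance must already be stopped."""
    ec2.modify_instance_attribute(
        InstanceId=instance_id,
        InstanceType={"Value": new_type},
    )


def start_instance(ec2, instance_id):
    """Start the instance and block until it is running again."""
    ec2.start_instances(InstanceIds=[instance_id])
    ec2.get_waiter("instance_running").wait(InstanceIds=[instance_id])


def scale_vertically(ec2, instance_id, new_type="t2.micro"):
    """Stop, resize, and start an instance: the full manual scaling cycle."""
    stop_instance(ec2, instance_id)
    resize_instance(ec2, instance_id, new_type)
    start_instance(ec2, instance_id)


# Example usage (needs boto3 installed and valid AWS credentials):
#
#   import boto3
#   ec2 = boto3.client("ec2")
#   instance_id = create_instance(ec2, "ami-xxxxxxxx")  # placeholder AMI ID
#   scale_vertically(ec2, instance_id, "t2.micro")
```

The waiters are what make the downtime visible: the instance is unavailable from the moment `stop_instances` is called until the `instance_running` waiter returns.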