Automating server maintenance and patching
I once had a friend whose only job most days was to wait for a website to go down, check why it had gone down, and run one of the two commands he had been given to bring it back up again. I had another friend whose job was to manually restart an NGINX server whenever it went down. I once met a man whose job largely involved downloading CSVs from one place, putting them somewhere else, and clicking a start button. Now, to some of you, that might sound like a swell gig (to me, it doesn't sound half bad either), but the thing about it is that it is a waste of time for both that individual's employer and the individual themselves. There is no growth or improvement for either, and in my experience, that is a waste of human life.
In the coming samples, we are going to see how we can maintain multiple instance fleets using a series of common commands, and then we are going to find a way to patch an OS after we have discovered which OS it is running. We are going to do all of this with the help of Python.
Sample 1: Running fleet maintenance on multiple instance fleets at once
Maintaining a server involves a lot of work – a lot of repetitive work. This is why server maintenance was automated in the first place: automation minimizes human error and ensures that the process occurs the same way every time. A fleet of servers works similarly; since the servers are copies of an original, it is just a matter of applying the same automation script to all of them. But what about multiple instance fleets with different needs? Here, Python can be of assistance. All you need to do is associate each fleet with the correct script for maintaining it. This can allow you to manage multiple fleets over multiple clouds if you want to. So, without further ado, let's see how we can do that:
1. Let's first write the code to find the AWS instances that are running:
import boto3

ec2_client = boto3.client('ec2')
response = ec2_client.describe_instances(
    Filters=[{'Name': 'instance-state-name', 'Values': ['running']}]
)
aws_instances = response['Reservations']
This will give us a list of reservations from EC2, each of which contains the running instances we want to work with. You can use a number of identifiers to define your fleet. You can even use a pre-defined AWS Systems Manager fleet.
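Note that `describe_instances` nests instances inside reservations. A small helper can flatten that structure into the plain instance IDs that later API calls expect; this is my own sketch, and the function name and sample data are purely illustrative:

```python
def extract_instance_ids(reservations):
    """Flatten an EC2 describe_instances 'Reservations' list into instance IDs."""
    return [
        instance['InstanceId']
        for reservation in reservations
        for instance in reservation.get('Instances', [])
    ]

# Example with the same shape as a describe_instances response:
sample = [
    {'Instances': [{'InstanceId': 'i-0abc'}, {'InstanceId': 'i-0def'}]},
    {'Instances': [{'InstanceId': 'i-0123'}]},
]
print(extract_instance_ids(sample))  # ['i-0abc', 'i-0def', 'i-0123']
```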
2. Let’s now do the same thing for Google Cloud Compute Engine instances:
from google.cloud import compute_v1

instance_client = compute_v1.InstancesClient()
request = compute_v1.AggregatedListInstancesRequest()
request.project = "<ID of the GCP project that you are using>"
request.filter = "status = RUNNING"
gcp_instances = instance_client.aggregated_list(request=request)
In the Google Cloud Platform (GCP) code, there are a few differences: you need to specify the GCP project ID, and you build an explicit request object (including the filter) before passing it to the client.
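The result of `aggregated_list` is grouped by zone rather than being a flat list: it iterates as `(scope, response)` pairs, where the scope looks like `zones/us-central1-a` and the response holds that zone's instances. A small helper (my own sketch, not part of the official client) can flatten it into `(zone, instance_name)` pairs; the stand-in objects below only mimic the shape of the real response:

```python
from types import SimpleNamespace

def collect_running_instances(aggregated):
    """Collect (zone, instance_name) pairs from an aggregated_list result."""
    running = []
    for scope, response in aggregated:
        # Zones with no matching instances have an empty instance list
        for instance in getattr(response, 'instances', []) or []:
            running.append((scope.replace('zones/', ''), instance.name))
    return running

# Stand-ins mimicking the per-zone scoped lists the API returns:
scoped = SimpleNamespace(instances=[SimpleNamespace(name='web-1')])
empty = SimpleNamespace(instances=[])
pairs = collect_running_instances([
    ('zones/us-central1-a', scoped),
    ('zones/us-east1-b', empty),
])
print(pairs)  # [('us-central1-a', 'web-1')]
```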
3. Now, let's find a command to run on these instances. It can be any placeholder command; you can substitute the commands you actually want later:
command = 'sudo reboot'

# For AWS: send the command through Systems Manager (SSM)
ssm_client = boto3.client('ssm')
# send_command expects plain instance IDs, so flatten the reservations first
aws_instance_ids = [
    instance['InstanceId']
    for reservation in aws_instances
    for instance in reservation['Instances']
]
ssm_client.send_command(
    InstanceIds=aws_instance_ids,
    DocumentName='AWS-RunShellScript',
    Comment='Run a command on an EC2 instance',
    Parameters={'commands': [command]},
)
# For Google Cloud
import subprocess

from google.oauth2 import service_account
from googleapiclient import discovery

# Load the service account credentials
service_account_file = '<file_name_here>.json'
credentials = service_account.Credentials.from_service_account_file(
    service_account_file, scopes=['https://www.googleapis.com/auth/cloud-platform']
)

# Create a Compute Engine API client
compute = discovery.build('compute', 'v1', credentials=credentials)

# Get the public IP address of the VM instance
project = '<your_project>'
zone = '<your_zone>'
instance_name = '<your_instance_name>'
request = compute.instances().get(project=project, zone=zone, instance=instance_name)
response = request.execute()
public_ip = response['networkInterfaces'][0]['accessConfigs'][0]['natIP']

# SSH into the VM instance and run the command
ssh_command = f'gcloud compute ssh {instance_name} --zone {zone} --command "{command}"'
try:
    subprocess.run(ssh_command, shell=True, check=True)
except subprocess.CalledProcessError:
    print('Error executing SSH command.')
The preceding code for GCP and AWS differs a bit because of the way each provider's APIs are designed. However, both achieve the same result: executing a shell command on the remote servers.
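One caveat on the AWS side: `send_command` is asynchronous, so it helps to poll the command's status via `get_command_invocation` before assuming it succeeded. Here is a minimal sketch; the timeout defaults and the stub client are my own inventions so that the example runs without AWS credentials (in real use, pass `boto3.client('ssm')` and the `CommandId` returned by `send_command`):

```python
import time

def wait_for_command(ssm_client, command_id, instance_id, timeout=120, interval=1):
    """Poll SSM until a sent command finishes, then return its final status."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        invocation = ssm_client.get_command_invocation(
            CommandId=command_id, InstanceId=instance_id
        )
        if invocation['Status'] not in ('Pending', 'InProgress', 'Delayed'):
            return invocation['Status']
        time.sleep(interval)
    return 'TimedOut'

# Stub client so the sketch can be exercised without AWS credentials:
class StubSSM:
    def __init__(self, statuses):
        self._statuses = iter(statuses)
    def get_command_invocation(self, CommandId, InstanceId):
        return {'Status': next(self._statuses)}

stub = StubSSM(['InProgress', 'Success'])
print(wait_for_command(stub, 'cmd-123', 'i-0abc', interval=0))  # prints: Success
```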
So, if we iterate through the lists that we produced earlier and apply a command to each instance, we can make a mass change or update across our entire instance fleet.
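That iteration can be sketched as a small dispatcher that pairs each fleet with its own runner function. The fleet names, runner, and sample IDs below are hypothetical placeholders; in practice, the runner would wrap the cloud-specific snippets shown earlier:

```python
def run_on_fleets(fleets, command):
    """Dispatch one command to every fleet via that fleet's own runner.

    'fleets' maps a fleet name to (instances, runner), where runner is
    whatever function sends the command to one instance on that cloud.
    """
    results = {}
    for name, (instances, runner) in fleets.items():
        results[name] = [runner(instance, command) for instance in instances]
    return results

# Hypothetical runner standing in for the SSM/gcloud calls:
def fake_runner(instance, command):
    return f"{instance}: {command}"

fleets = {
    'aws': (['i-0abc', 'i-0def'], fake_runner),
    'gcp': (['web-1'], fake_runner),
}
print(run_on_fleets(fleets, 'sudo reboot'))
```

Because each fleet carries its own runner, fleets on different clouds – or with different maintenance scripts – can all be driven from one loop.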
This method is good for a generic fleet where we presume that all the operating systems are the same or that they run the same commands. But what if we were in an environment where the OS could differ from instance to instance? How would we then choose the right commands? In the next section, we will explore this possibility.