
3. Automate cost optimizations
The best way to make sure that your team is always following cost-optimization best practices is to automate them, reducing manual intervention.
Automation is greatly simplified by using labels: key-value pairs applied to various Google Cloud services. For example, you could label instances that only developers use during business hours with “env: development.” You could then use Cloud Scheduler to trigger a serverless Cloud Function that shuts them down over the weekend or after business hours and restarts them when needed. Here is an architecture diagram and code samples that you can use to do this yourself.
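To make the idea concrete, here's a minimal sketch of the shutdown step using the gcloud CLI; it assumes instances carry the “env: development” label from above and that gcloud is authenticated against your project, and it could be wrapped in a scheduled Cloud Function or any other scheduled job:

```bash
#!/bin/bash
# Stop every running instance labeled env=development.
gcloud compute instances list \
    --filter="labels.env=development AND status=RUNNING" \
    --format="csv[no-heading](name,zone.basename())" |
while IFS=, read -r name zone; do
  echo "Stopping ${name} in ${zone}..."
  gcloud compute instances stop "${name}" --zone "${zone}"
done
```

A matching job that runs `gcloud compute instances stop`'s counterpart, `gcloud compute instances start`, over the same filter would bring the instances back before business hours.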
Using Cloud Functions to automate the cleanup of other Compute Engine resources can also save you a lot of time and money. For example, customers often forget about unattached (orphaned) persistent disks or unused IP addresses, which accrue costs even when they are not attached to a virtual machine instance. VMs with the “deletion rule” option set to “keep disk” retain their persistent disks even after the VM is deleted. That’s great if you need the data on that disk later, but those orphaned persistent disks can add up quickly and are often forgotten! There is a Google Cloud Solutions article that describes the architecture and sample code for using Cloud Functions, Cloud Scheduler, and Stackdriver to automatically look for these orphaned disks, take a snapshot of them, and remove them. This solution can be used as a blueprint for other cost automations, such as cleaning up unused IP addresses or stopping idle VMs.
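The linked solution implements this with Cloud Functions end to end; as a rough sketch of the same core steps in plain gcloud (the snapshot naming is just an assumed convention), the cleanup loop looks like this:

```bash
#!/bin/bash
# Snapshot and delete persistent disks that have no attached instances.
# "-users:*" matches disks whose "users" field is unset, i.e. orphaned disks.
gcloud compute disks list \
    --filter="-users:*" \
    --format="csv[no-heading](name,zone.basename())" |
while IFS=, read -r disk zone; do
  snapshot="${disk}-$(date +%Y%m%d)"   # assumed snapshot-naming convention
  gcloud compute disks snapshot "${disk}" --zone "${zone}" \
      --snapshot-names "${snapshot}"
  gcloud compute disks delete "${disk}" --zone "${zone}" --quiet
done
```

Because gcloud waits for the snapshot operation to finish before returning, the delete is safe to run immediately afterward. A production version should also confirm a disk has been unattached for some time before deleting it, as a newly created but not-yet-attached disk would otherwise match the filter.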
4. Use preemptible VMs
If you have workloads that are fault tolerant, like HPC, big data, media transcoding, CI/CD pipelines or stateless web applications, using preemptible VMs to batch-process them can provide massive cost savings. In fact, customer Descartes Labs reduced their analysis costs by more than 70% by using preemptible VMs to process satellite imagery and help businesses and governments predict global food supplies.
Preemptible VMs are short-lived: they can run for a maximum of 24 hours, and they may be shut down before the 24-hour mark as well. A 30-second preemption notice is sent to the instance when a VM needs to be reclaimed, and you can use a shutdown script to clean up in that 30-second window. Be sure to review the full list of stipulations when considering preemptible VMs for your workload. All machine types are available as preemptible VMs, and you can launch one simply by adding the “--preemptible” flag to the gcloud command line or selecting the option in the Cloud Console.
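For example, launching one from the gcloud command line takes just that one extra flag (the instance name and zone here are placeholders):

```bash
gcloud compute instances create batch-worker-1 \
    --zone us-central1-a \
    --preemptible
```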
Using preemptible VMs in your architecture is a great way to scale compute at a discounted rate, but you need to be sure the workload can handle the interruption if a VM is reclaimed. One way to handle this is to make sure your application checkpoints as it processes data, i.e., writes its progress to storage outside the VM itself, like Cloud Storage or a database. As an example, we have sample code for using a shutdown script to write a checkpoint file into a Cloud Storage bucket. For web applications behind a load balancer, consider using the 30-second preemption notice to drain connections to that VM so the traffic can be shifted to another VM. Some customers also choose to automate the shutdown of preemptible VMs on a rolling basis before the 24-hour period is over, to avoid having multiple VMs shut down at the same time if they were launched together.
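As a minimal sketch of that checkpointing pattern (the local path and bucket name are assumptions), a shutdown script could look like this:

```bash
#!/bin/bash
# Runs when the instance receives its 30-second preemption notice.
# Copy the most recent checkpoint out of the VM so another instance
# can pick up where this one left off. Keep this script fast.
gsutil cp /var/tmp/checkpoint.dat "gs://my-checkpoint-bucket/$(hostname).dat"
```

You attach the script when creating the instance, for example with --metadata-from-file shutdown-script=shutdown.sh.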
5. Try autoscaling
Another great way to save on costs is to run only as much capacity as you need, when you need it. As we mentioned earlier, typically around 70% of capacity is needed for steady-state usage, but when you need extra capacity, it’s critical to have it available. In an on-prem environment, you have to purchase that extra capacity ahead of time. In the cloud, you can use autoscaling to add capacity automatically only when you need it, and release it when you don’t.
Compute Engine managed instance groups give you this autoscaling capability in Google Cloud. You can scale up gracefully to handle an increase in traffic, and then automatically scale down again when the need for instances drops. You can scale based on CPU utilization, HTTP load balancing serving capacity, or Stackdriver Monitoring metrics, giving you the flexibility to scale on whatever matters most to your application.
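As an illustrative sketch, here's how you might enable CPU-based autoscaling on an existing managed instance group (the group name, zone, and thresholds are assumptions):

```bash
gcloud compute instance-groups managed set-autoscaling web-mig \
    --zone us-central1-a \
    --min-num-replicas 3 \
    --max-num-replicas 10 \
    --target-cpu-utilization 0.60 \
    --cool-down-period 90
```

With a policy like this, the group adds instances when average CPU utilization climbs above 60% and trims back toward the three-instance floor as load subsides.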
High costs do not compute
As we’ve shown above, there are many ways to optimize your Compute Engine costs. Monitoring your environment and understanding your usage patterns are key to picking the best options to start with, so take the time to model your baseline costs up front. From there, you can choose from a wide variety of strategies to implement depending on your workload and current operating model.
For more on cost management, check out our cost management video playlist. And for more tips and tricks on saving money on other GCP services, check out our blog posts on Cloud Storage, Networking and BigQuery cost optimization strategies. We have additional blog posts coming soon, so stay tuned!