AWS Autoscale added a new feature “Warm Pool”. Let us explore this feature and see how luroConnect uses this to reduce hosting costs.
The autoscale latency problem
Usually, AWS Autoscale will launch a new server with the given AMI image based on the launch configuration or launch template configured. Launching a new server takes about 4 minutes or more. So let us say a scale-out event is configured for launching a server when the CPU across all autoscale instances exceeds 70% for 1 minute. Now, let us say a sale promotion on facebook causes a surge in traffic causes this event to trigger. It takes AWS 4+ minutes to respond and add a new server. If during this 4 minute period, the surge goes past 70% and say reaches 90-100%, it is likely that visitors will see a slowdown or even errors. The 4+ minute period is called the autoscale latency and in designing the scale-out and scale-in parameters, it plays a crucial role.
For a website that sees frequent surge in traffic in short spurts, one would be prompted to use a lower threshold for a scale-out event. A lower threshold will result in frequent triggering of scale-out events.
At the same time the scale-in threshold will also have to be reduced to ensure enough spread between scale-out and scale-in events. A lower spread will result in an unhealthy sequence of a scale-out event adding a resource for it to be immediately removed.
Autoscale designers then tend to add higher number of minimum instances, possibly of larger sizes. That reduces the effectiveness of autoscale – and increases AWS costs.
Lowering the autoscale latency results in a better autoscale system. As the latency reduces, the need for larger number of minimum instances or larger size instances reduces. This results in savings in the AWS bill.
Introducing the warm pool
AWS now introduces the concept of a warm pool. The costs saving of a warm pools come from AWS policy for not charging for instances in stopped state – except for the disks. A warm pool is a set of autoscale instances that are launched but kept in stopped state. When a scale-out event happens, the latency Is now reduced to the boot time of an instance and any initialization needed – we measured adding 3 instances took about 35 seconds to start serving traffic for Magento.
A scale-in policy simply stops the selected instance and add it back to the warm pool.
How to use a warm pool?
If you are using launch template for your autoscale, creating a warm pool is easy and documented here. If using lifecycle events, newer events have been introudced.
If using a launch configuration, we suggest upgrading to a launch template before using a warm pool. While upgrading to a launch template is easy, it is advisable to read about launch templates as they are a different and a larger concept.
Changing your instance image when in a warm pool
AWS has support for “instance refresh” – a term used by AWS to indicate an update in AMI image for all running and warm pool instances in a single command. However, this update has a crucial flaw – it can keep your website inaccessible for a short time. This is due to AWS terminating an instance before adding one. If an image has to be updated – such as a new code deploy – a custom strategy has to be deployed to ensure the website does not go down.
luroConnect support for warm pool
luroConnect now supports warm pools across all its autoscale plans, with a scripted image update policy that ensures 0 downtime during image change as well as a code deploy strategy that ensures 0 downtime on code deploy.
Issues with AWS Reference architecture and tools for a Magento application
At luroConnect we implemented our autoscaling system after addressing flaws in many implementations we had seen.
As AWS autoscale by default is integrated into AWS load balancers – ELB or ALB. Using AWS reference implementation will put the code in a autoscale instance with nginx or apache with php and the code. Traffic can be routed through the ELB/ALB which will handle SSL and route the traffic to each autoscale instance.
When code has to be updated, a new AMI will be created and AWS instance refresh can be run to update the instances.
You could use AWS CodeDeploy as described here but you need to set it up to make sure Magento setup upgrade can be run when required.
Problems with autoscale implementations for Magento
- Issues configuring FPC (Full Page Cache) with this configuration : If varnish is configured on all autoscale instances (as we have seen many implementations do), each server will warm caches on its own. Clearing pages from cache will also be difficult. Using redis as a FPC increases per page latency for cached pages.
- Media and var folders are needed to be shared across all servers. NFS is typically used to share. However, the configuration of each autoscale instance has to be such that it can discover and mount the folders from the NFS server.
- When a code change has to be deployed, it is not clear how it can be done without causing a downtime of the website. Using AWS Code Deploy requires a complex setup to ensure setup upgrade is run before one of the 0 downtime strategies can be used.
- When a new server is launched, conditions to check the health of the website are not easy to write. This results in a few error responses before the server is ready to serve traffic.
- It is difficult to use a AWS ALB to route traffic for specific purposes – for example, routing traffic to a wordpress server for /blog urls.
luroConnect Autoscale on AWS : Smooth setup and running.
luroConnect Autoscale solves these problems.
luroConnect lets AWS monitor instances and decide when to add or remove (scale out or scale in) instances. luroConnect autoscale for AWS adds cloudwatch events and lifecycle management generated by AWS Autoscale to ensure a very smooth Autoscaling operation. luroConnect uses nginx as a load balancer and does not require a ALB/ELB to operate. luroConnect Autoscale supports AWS Autoscale with warm instances and has a mechanism to update the AMI when needed without any downtime.
- Using nginx as a load balancer allows high flexibility in deciding which urls go to varnish for full page cache and which should be directly served by php. varnish as a full page cache gives the maximum impact of full page caching.
- A nfs server holds shareable content of magento – specific media and var folders for example. Using NIS, autofs and NFS, each new app server is able to discover the NFS share.
- When a code change has to be deployed, php code using nfs is shared to each app server. A php reload and opcache configuration will ensure the new code is kept in the php opcache memory for all future operations. A php file from NFS share is loaded only once.
- Before a server is added to the nginx load balancer, extensive checks are done to ensure the new autoscale instance is ready to take traffic, including warming the opcache.
- nginx as a load balancer brings in a lot of flexibilty in routing traffic such as a /blog to a wordpress website, custom rewrites, etc.