devops

Using AWS Autoscale “warm pools” to reduce costs

AWS Autoscale added a new feature “Warm Pool”.  Let us explore this feature and see how luroConnect uses this to reduce hosting costs.

The autoscale latency problem

Usually, AWS Autoscale will launch a new server with the given AMI image based on the launch configuration or launch template configured. Launching a new server takes about 4 minutes or more. So let us say a scale-out event is configured for launching a server when the CPU across all autoscale instances exceeds 70% for 1 minute. Now, let us say a sale promotion on facebook causes a surge in traffic causes this event to trigger. It takes AWS 4+ minutes to respond and add a new server. If during this 4 minute period, the surge goes past 70% and say reaches 90-100%, it is likely that visitors will see a slowdown or even errors. The 4+ minute period is called the autoscale latency and in designing the scale-out and scale-in parameters, it plays a crucial role.

For a website that sees frequent surge in traffic in short spurts, one would be prompted to use a lower threshold for a scale-out event. A lower threshold will result in frequent triggering of scale-out events.

At the same time the scale-in threshold will also have to be reduced to ensure enough spread between scale-out and scale-in events. A lower spread will result in an unhealthy sequence of a scale-out event adding a resource for it to be immediately removed.

Autoscale designers then tend to add higher number of minimum instances, possibly of larger sizes. That reduces the effectiveness of autoscale – and increases AWS costs.

Lowering the autoscale latency results in a better autoscale system. As the latency reduces, the need for larger number of minimum instances or larger size instances reduces. This results in savings in the AWS bill.

Introducing the warm pool

AWS now introduces the concept of a warm pool. The costs saving of a warm pools come from AWS policy for not charging for instances in stopped state – except for the disks. A warm pool is a set of autoscale instances that are launched but kept in stopped state. When a scale-out event happens, the latency Is now reduced to the boot time of an instance and any initialization needed – we measured adding 3 instances took about 35 seconds to start serving traffic for Magento.

A scale-in policy simply stops the selected instance and add it back to the warm pool.

Warm Pool For Autoscale

How to use a warm pool?

If you are using launch template for your autoscale, creating a warm pool is easy and documented here. If using lifecycle events, newer events have been introudced.

If using a launch configuration, we suggest upgrading to a launch template before using a warm pool. While upgrading to a launch template is easy, it is advisable to read about launch templates as they are a different and a larger concept.

Changing your instance image when in a warm pool

AWS has support for “instance refresh” – a term used by AWS to indicate an update in AMI image for all running and warm pool instances in a single command. However, this update has a crucial flaw – it can keep your website inaccessible for a short time. This is due to AWS terminating an instance before adding one. If an image has to be updated – such as a new code deploy – a custom strategy has to be deployed to ensure the website does not go down.

luroConnect support for warm pool

luroConnect now supports warm pools across all its autoscale plans, with a scripted image update policy that ensures 0 downtime during image change as well as a code deploy strategy that ensures 0 downtime on code deploy.

Issues with AWS Reference architecture and tools for a Magento application

At luroConnect we implemented our autoscaling system after addressing flaws in many implementations we had seen.

As AWS autoscale by default is integrated into AWS load balancers – ELB or ALB. Using AWS reference implementation will put the code in a autoscale instance with nginx or apache with php and the code. Traffic can be routed through the ELB/ALB which will handle SSL and route the traffic to each autoscale instance.

When code has to be updated, a new AMI will be created and AWS instance refresh can be run to update the instances.

You could use AWS CodeDeploy as described here but you need to set it up to make sure Magento setup upgrade can be run when required.

Problems with autoscale implementations for Magento

  1. Issues configuring FPC (Full Page Cache) with this configuration : If varnish is configured on all autoscale instances (as we have seen many implementations do), each server will warm caches on its own. Clearing pages from cache will also be difficult. Using redis as a FPC increases per page latency for cached pages.
  2. Media and var folders are needed to be shared across all servers. NFS is typically used to share. However, the configuration of each autoscale instance has to be such that it can discover and mount the folders from the NFS server.
  3. When a code change has to be deployed, it is not clear how it can be done without causing a downtime of the website. Using AWS Code Deploy requires a complex setup to ensure setup upgrade is run before one of the 0 downtime strategies can be used.
  4. When a new server is launched, conditions to check the health of the website are not easy to write. This results in a few error responses before the server is ready to serve traffic.
  5. It is difficult to use a AWS ALB to route traffic for specific purposes – for example, routing traffic to a wordpress server for /blog urls.

luroConnect Autoscale on AWS : Smooth setup and running.

luroConnect Autoscale solves these problems.

luroConnect lets AWS monitor instances and decide when to add or remove (scale out or scale in) instances. luroConnect autoscale for AWS adds cloudwatch events and lifecycle management generated by AWS Autoscale to ensure a very smooth Autoscaling operation. luroConnect uses nginx as a load balancer and does not require a ALB/ELB to operate. luroConnect Autoscale supports AWS Autoscale with warm instances and has a mechanism to update the AMI when needed without any downtime.

  1. Using nginx as a load balancer allows high flexibility in deciding which urls go to varnish for full page cache and which should be directly served by php. varnish as a full page cache gives the maximum impact of full page caching.
  2. A nfs server holds shareable content of magento – specific media and var folders for example. Using NIS, autofs and NFS, each new app server is able to discover the NFS share.
  3. When a code change has to be deployed, php code using nfs is shared to each app server. A php reload and opcache configuration will ensure the new code is kept in the php opcache memory for all future operations. A php file from NFS share is loaded only once.
  4. Before a server is added to the nginx load balancer, extensive checks are done to ensure the new autoscale instance is ready to take traffic, including warming the opcache.
  5. nginx as a load balancer brings in a lot of flexibilty in routing traffic such as a /blog to a wordpress website, custom rewrites, etc.

Would you like to switch to a modern hosting platform?

Schedule a call of a free evaluation!

With features like ~0 downtime code deploy and autoscale to reduce your hosting costs, luroConnect offers you unparalleled hosting environment for Magento.

Schedule a call and we will show you how we can

  • Improve your hosting, possibly with autoscale
  • Have a managed dev, staging and production environment
  • Server performance measured every minute with alerts for a slowdown
  • A multi point health check every day
  • Optimized hosting costs

How Magento can get near 0 downtime deployment

Factor III of the 12 Factor App says “Store config in the environment”.

12 Factor App is what devops lives by – a set of 12 principles written by Adam Wiggins for predictable web app deployments.

Storing configuration in environment, separate from code has the advantages of reliable deployment along with reduced time to deploy. It allows separation of the build stage from the deploy stage, with some deploys being just a change in a softlink to the web root folder.

Historical preview : Magento 1

Magento 1 did not have much of a build process – js and css were not versioned, magnification was “online” first access based as was database upgrade information, configuration was stored in the database.

The most reliable way to go from a dev configuration to a live configuration would require a set of known steps that would work or changes directly to the database.

luroConnect developed its own build and deploy process. In our build step we

  • get source code from git
  • minify css and js files in the skin and js folders using a grunt based process
  • set appropriate file ownership and permissions

During the deploy phase, we

  • Copy app/etc/local.xml from a secure deployment configuration area (our environment)
  • modify the core config data to add a version string in the skin and js URLs
  • access the website once through the index.php to cause the update scripts to run

Deploy process is of course run with the site in maintenance – we prefer to do this at the nginx level. Mostly it is a small blip.

Historical preview – pre Magento 2.2

Early Magento 2 builds were similar – except there was some help from the bin/magento command. Our deploy process did not need to version the static access anymore. Plugin enable / disable was given via config.php. Our deployment environment contained env.php.

However, developers had to manually configure and experiment with some options.

Site bringup required devops to access the admin panel or update the database with custom sql – enabling varnish, setting up CDN with a static URL, etc.

Magento 2.2 and beyond

Magento adopted the direction of the 12 factor app and presented in Magento Live UK 2017 a new set of features that would help in ensuring an ability to split the application configuration and environment configuration. Application configuration was defined in app/etc/config.php which is advised to be in git and hosting environment and secure details are kept in env.php which should not be kept in git.

It is a slightly weak conformance – as commented by 12factor app “This is a huge improvement over using constants which are checked into the code repo, but still has weaknesses: it’s easy to mistakenly check in a config file to the repo; there is a tendency for config files to be scattered about in different places and different formats, making it hard to see and manage all the config in one place. Further, these formats tend to be language- or framework-specific.”

Magento has fixed this in 2 ways

  1. The language specific aspect is addressed to some extent in Magento by allowing to use bin/magento cli to edit env.php for sensitive data. The config:sensitive:set directly writes to env.php. These commands no not require the database, hence, can be set in a pre-deploy step.
  2. Use of scoped environment variable names. These would be set in Nginx configuration or an include file such as fastcgi_params.

However, there is no documented way to set database details – except to manually edit the env.php file.

The app:config:dump command

A great help in maintaining a known configuration of the application (which 12factor app suggests be committed to git). This ensures communication between developer to operations.

The app:config:dump command writes to config.php and env.php. While config.php is suggested to be committed to git, env.php should not be committed to git.

If a value is in config.php, the Magento admin panel does not allow the parameter to be edited. This locking helps with giving stability to the application configuration. It ensures the application is developed and tested with a known configuration.

The figure alongside shows the suggested flow.

Suggested flow for using app:config:dump

Why is Magento deployment yet keeping site in maintenance?

However, we find that even after 2 1/2 years of announcement, the acceptance and understanding of these features is weak. Leaving websites in maintenance mode as code is deployed.

Developers are failing to maintain a discipline to own the configuration or devops to understand the application’s build and deploy process.

There are some practical problems as well. An eCommerce manager would like to have control on the live website on say, when backorders would be allowed storewide. Since this is locked in config.php, this request has to go through developers or devops.

luroConnect near 0 downtime deploy

luroConnect’s Magento 2 build is in a pipeline – such as a bitbucket pipeline. A commit triggers the pipeline that does the following

  • composer install (with the compose cache to speed this process)
  • bin/magento setup:di:compile
  • bin/magento setup:static-content:deploy

The contents are then tarred and sent to the staging and production servers.

Upon deploy the contents are untared, deployment related files like env.php are copied, media and var are softlinked. The web root softlink is changed to point to this new release. The process is slightly more complicated when multiple autoscale instances are running, as running instances are replaced with ones with new code.

If required the bin/magento setup:upgrade command is run and only then is it required to keep the site in maintenance.

Would you like to switch to a modern hosting platform?

Schedule a call of a free evaluation!

With features like ~0 downtime code deploy and autoscale to reduce your hosting costs, luroConnect offers you unparalleled hosting environment for Magento.

Schedule a call and we will show you how we can

  • Improve your hosting, possibly with autoscale
  • Have a managed dev, staging and production environment
  • Server performance measured every minute with alerts for a slowdown
  • A multi point health check every day
  • Optimized hosting costs

12 factor app and Magento

Adam Wiggins’ 12 factor app (https://12factor.net) is a highly respected standard for web apps. While written with SaaS applications in mind, let us explore and see how Magento and the ecosystem stands up to these factors.

1. Codebase. One codebase tracked in revision control, many deploys.
Magento is in git and hence a typical Magento project should not have a problem with this.
However, if you use vue-storefront, a popular PWA frontend to Magento, this is broken. Vue-storefront has 2 repos of its own in addition to the Magento repo, all becoming one app.
Another violation happens when a plugin vendor gets ssh access to your live server to fix a plugin issue. Plugin vendors have a serious problem integrating their code into multiple source bases without Magento supporting a versioned plugin architecture out-of-the-box.

2. Dependencies. Explicitly declare and isolate dependencies.
With composer Magento solves this problem.
Violation of plugins is a case in point – many plugins are installed not as composer dependencies. Instead they make it to the merchant repo.

Magento uses php and typical websites are deployed using php-fpm. One may argue that the php-fpm plugins that Magento depends on are not explicitly declared. Leading to the application not working exactly in 2 environments. Another case in point is dependency on php version.

3. Config. Store config in the environment.
12 factor app requires environment variables to be used. Magento has split application and environment configuration between config.php and env.php.
Here is what 12 factor says.
“Another approach to config is the use of config files which are not checked into revision control, such as config/database.yml in Rails. This is a huge improvement over using constants which are checked into the code repo, but still has weaknesses: it’s easy to mistakenly check in a config file to the repo; there is a tendency for config files to be scattered about in different places and different formats, making it hard to see and manage all the config in one place. Further, these formats tend to be language- or framework-specific.”

However, Magento has worked towards this. Specifically, with bin/magento config:set and bin/magento config:sensitive:set commands are a useful way for hosting providers to be 12 factor compliant.

luroConnect has always stored hosting configuration settings separately from the release. Upon deployment of code, the contents of deployment folder are copied. Sometimes they have settings for the application. These include hosting specific as well as sensitive settings. We are moving to using config:set and config:sensitive:set for versions of Magento that support it. We will also move towards storing sensitive variables in secure key stores.

4. Backing services. Treat backing services as attached resources.
“Resources can be attached to and detached from deploys at will.”

While Magento is very good at storing key connections outside the application and database, violations exist in 3rdparty plugins. To “ease” the deployment most store credentials and connectivity details in the database. Another issue is with SMTP plugins, instead of depending on magento’s default use of localhost and let postfix configuration manage the actual email sending, developers see the convenience of storing this information in the database.

Check out this post on SMTP and postfix configurations.

5. Build, release, run. Separate build and run stages.
Magento has been improving the code deployment process. The setup upgrade is the only command that, if needed, requires the site under maintenance.

6. Processes. Execute the app as one or more stateless processes.
Twelve-factor processes are stateless and share-nothing. Any data that needs to persist must be stored in a stateful backing service, typically a database.

Magento is very good on this count if used with nginx and php-fpm.

7. Port binding. Export services as port binding.
“PHP apps might run as a module inside Apache HTTPD” is flagged as a violation if the apache is also used as a webserver.
nginx + php-fpm gives the best isolation and performance of any stack. Php processes can be independently controlled in a server running php-fpm while nginx can be used for routing and handling web requests, terminating SSL, etc.

8. Concurrency. Scale out via a process model.
Magento is very good at this. Aided by php-fpm process model that complies with the 12 factor app, it is possible to build a cluster to handle only checkout urls for example, with routing handled by an application load balancer such as nginx.

9. Disposability. Maximize robustness with fast startup and graceful shutdown.
While Magento and php are good at this, some notes are in order.
A reload of php-fpm by default will kill all php processes even though they may be executing a request. Ensuring no new traffic is coming to the php-fpm, and waiting for draining by checking the status for number of active processes (with a timeout ofcourse) will ensure gracefulness in shutdown.
In order to ensure robustness against sudden death of the php-fpm process, it is best to keep the queue length (listen.backlog) to a small number. Turns out managing the queue to scaleout helps in application performance as well.

10. Dev/prod parity. Keep development, staging, and production as similar as possible.
The 12 factor app describes 3 gaps – time, personnel and tools. Based on our experience, the personnel gap is eliminated by automation. A commit trigger based automated CI/CD pipeline with an automated deploy to staging and production ensures there is no personnel gap.

A development environment with write access to git can be created with a similar infrastructure to help developers debug issues.

11. Logs. Treat logs as event streams.
Magento allows creation of multiple log files. Modern logging such as monolog allows more control of what is and what isn’t logged. Logs are also generated by nginx, php-fpm and other services used.
Streaming logs for querying and analysis is typically done by your hosting provider.

luroConnect uses fluentd to capture logs. Logs are sent to our Insight service, which analyzes data per minute, hour or day.

12. Admin processes. Run admin/management tasks as one-off processes.
Magento supports cron and rabbitmq based processes. In addition, setup upgrade is also used to change the state of the database during deployment.
However, suggested access to developer for “run arbitrary code or inspect the app’s models against the live database” is not recommended by luroConnect due to security and the risks of the application stability with the state being altered arbitrarily.

Deploying a Magento PWA project

Why PWA might be the future of headless eCommerce

Progressive Web Apps (PWAs) are designed to address the mobile revenue gap indicated below.

In most markets, online retail has a higher proportional audience reach among mobile users than desktop. However, mobile sales numbers are much lower. There can be many reasons for this, some in the realm of technology.

At one time it was thought mobile engagement was best achieved using an app. This was based on data for mobile users in general, but was possibly skewed towards gaming and social media. An app has advantages of being able to deliver notifications, use mobile features the way a typically website cannot, be installed as an icon resulting in easier access.

On the flip side, there is data suggesting mobile users were reluctant to install apps due to issues like memory limitations. Other disadvantages include a need for an OK from Apple to be on the app store, lack of easy-to-test infrastructure, the rather slow process of distributing updates – for example some users may not have auto update on.

Using service worker technology, widely supported by browsers today, the PWA (or progressive web application) will give some of the benefits of an app without the cost of consuming memory, requiring an OK from Apple and being as easy to deploy new updates as it is to update a website. PWAs are also as easy to test as a website. A PWA will install on a mobile as an icon – much like an app, service workers allow push notifications as well local storage, allowing for some offline capabilities.

From a technology perspective PWAs pose a completely different problem – the relative newness of technology means developers are limited, as are systems to reliably host & deploy. The cost of development will come down as more developers and websites adopt the technology.

Hosting related issues with PWA for Magento

From a hosting and code deployment perspective. Vue-storefront for example, replicates the entire catalog in elasticsearch and uses 2 nodejs processes to run the frontend of the store. PWA Studio is expected to be Magento (read php) native, yet the reference implementation of its Upward Specifcation Is in nodejs. Both developer and production environments pose challenges.

Developer Environment

It is no more a localhost WAMP stack that you can deploy and get a development environment for PWA setup. A single project will require setup of various components (vue-storefront, vue-storefront-api, Magento, graphql, elasticsearch, redis, rabbitmq, etc).

The developer environment will affect the learning curve of the many new developers starting with these new technologies as well as the productivity of experienced developers.

Here are some challenges

  1. Launch a development environment for a new project
  2. Setup a developer environment for an existing project
  3. An ability to change and test any component easily – for example if a js file was modified that affects the UI, what component(s) need be redeployed? Can this process be automated?

It is too early for us to start work on solutions for developers – since we do not do project work ourselves. However, we are working with our partners and we have an eye on releasing a “developer stack”. Contact us if you would be interested.

Production Environment

We recently took a PWA website live. Developed by our partner Codilar, it was a first for us. Some of the challenges faced and lessons learnt are summarized below.

  1. Setting up a production environment.
    Since there were so many components, not natively supported by Magento, many configuration files had to be manually modified.

    1. Vue-storefront (the UI end that replaces varnish in a classic Magento 2) needs to communicate to Redis for Full Page Cache and the vue-storefont-api
    2. Vue-storefront-api communicates with elasticsearch and Magento 2 backend via a rest API. Ideally vue-storefront should replicate the entire catalog through a indexing process into elasticsearch, but that is not fully operational yet.
    3. Magento 2 has its own redis cache and redis session. Magento 2 FPC is not used. Magento 2.3 uses RabbitMQ in addition to connection to its database.
      Here is our architecture for the deployment. We used Virtual machines as shown. We did not use a containerised architecture. The reasons will possibly be a different blog post.
  1. Starting nodejs processes automatically. Vue-storefront uses pm2 for process management. However, developer information and documentation is written using yarn to run the pm2 processes with log files being stored in ~/.pm2. In order for better control from a system administration perspective, we installed pm2 at the global level, generated systemd files (using pm2 startup) and modified them to suit the environment. We can now use “service vue-storefront start/stop/restart”.
  2. Monitoring all the components.
    Log files for each component are taken to a central log processing server using CNCF project fluentd.
    A key challenge is observability of failures. A Magento 2 API failure is not obvious. An error return code from vue-storefront needs to be traced to vue-storefront-api to Magento. Correlating the actual hit that caused a non-fatal Magento error is another challenge.
  3. How to deploy new code with minimum or even 0 downtime
    For Magento 2, until a database change is required (via a bin/magento setup:upgrade and/or indexing), we have a process to make a deployable package, giving an opportunity to deploy with 0-downtime. Check out our bitbucket pipeline presentation.
  4. How can one deploy a vue-storefront based PWA?
    The project we migrated ran on 2 git repos – one for Magento, the other for vue. Upon deploy we need to find the files that changed since the last release and decide if the change is in Magento, vue-storefront or vue-storefront-api and decide the build steps appropriately. Presently since the repos are different, we have 2 separate builds running on the production servers. A pipeline based deploy is our next step.

Note: We think a monorepo for both Magento and vue is essential in the long run due to possibility of versioning incompatibilities.

Conclusion

This is yet early work-in-progress and we hope to update our process and keep updating this article as we go.