Choosing a hosting platform for Magento: Digital Ocean

Once in a while you may want to check on the technology and options available for Magento hosting – technologies change, hosting costs come down, and life becomes easier for you and the merchant. In this multi-part series we will explore cloud platforms and their suitability for hosting a production Magento website. In this article we look at how seriously you should consider Digital Ocean as a hosting platform. Digital Ocean (DO) offers VMs it calls “droplets”. With an excellent blog and a friendly attitude towards developers, DO is a serious hosting contender, and many developers naturally recommend it to merchants for hosting Magento. How realistic is that? Let me explore a few pros and cons.

DO Choice of vCPU and memory

DO offers “shared CPU” and “dedicated CPU”. For eCommerce hosting we always prefer dedicated CPU. Shared CPUs work differently on each cloud provider, and quite often you do not get enough CPU exactly when you need it the most. Each vCPU is an Intel hyper-thread. As of this writing (July 2020), we see use of Intel Xeon Gold 6140 @ 2.30 GHz with DDR4 memory at 2.6 GHz. Our simple memory speed test showed about 1.8GHz effective throughput to memory.

DO disk : Love / hate relationship!

I love the disk speed – you get good SSD performance with no throttling, even for attached storage. Unlike on other cloud platforms, you don’t have to work out how many IOPS you need, which first requires understanding how IOPS translate to speed on that platform.

To test the disk I use a simple but effective method: I create a 1GB file from /dev/zero using dd, and dd reports the write speed. Here is the command I use:

    dd if=/dev/zero of=$tempFile bs=1M count=1024 conv=sync oflag=direct

Here tempFile holds the path to a temp file on the mounted disk I am testing.

For both the direct disk and a mounted block device we see disk speeds of 200-350 MB/s across all droplets our customers use. This is the best we have seen on cloud platforms. Physical hardware can give up to 450 MB/s.
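The same measurement can be roughly cross-checked without dd. The sketch below is an illustration only: the function name and default sizes are made up, and it uses a buffered write followed by fsync rather than direct I/O, so its numbers will not match dd with oflag=direct exactly.

```python
import os
import tempfile
import time


def measure_write_speed(directory, total_mb=1024, block_mb=1):
    """Write zeros to a temp file in `directory` and return the speed in MB/s.

    Hypothetical helper: mirrors the idea of the dd test, not its exact I/O mode.
    """
    block = bytes(block_mb * 1024 * 1024)  # a block of zeros, like /dev/zero
    blocks = total_mb // block_mb
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as handle:
            for _ in range(blocks):
                handle.write(block)
            handle.flush()
            os.fsync(handle.fileno())  # wait until the data has reached the disk
        elapsed = time.perf_counter() - start
    finally:
        os.remove(path)  # always clean up the temp file
    return (blocks * block_mb) / elapsed
```

Calling measure_write_speed("/mnt/volume") with a directory on the disk under test (the path here is a placeholder) returns a figure that can be compared across droplets, as long as the same sizes are used each time.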

So, why the hate?

We have seen disk crashes – thankfully on staging servers. So, when it comes to production servers, we always recommend that customers have a DR plan to minimize loss of data when this happens. Our suspicion is that the storage is not in a managed RAID, hence a disk crash is a droplet-specific event.

Network

File transfer speed test: A large file transfer using scp on the internal network runs at about 130 MB/sec, which is roughly 1 Gb/sec.

NFS

NFS of block storage performs poorly. We use nfscache so most reads are served from cache. However, if there is a need for a large number of reads or writes, the performance can drop dramatically.

Examples of NFS hurting performance

Magento 1: As images are created in the frontend, there is a check to see if an image file exists before an image URL is included in a page. This invariably results in a slowdown.

Magento 1 and 2: Large log files. Magento stores log files in a shared folder. Multiple log lines per hit will result in a slowdown.

Experience: We were in the process of taking a Magento 2 website live that had about 2500 errors written to a log file. The issue was related to the M1 migration, which left some attributes undefined in M2. App servers saw 30% CPU in I/O wait and the site came to a crawl. Php access logs showed under 10% CPU utilization.

Non availability of autoscale

DO does not come with an autoscale option (you could get autoscale in their Kubernetes which we are not considering here). This means you may have to keep capacity for a holiday sale for example.

Managing server costs

Server costs even when you shut them down

DO charges for servers that are shut down. If you do not want to be charged fully for a server, storing an image is an option.

Recommendation

We host customers on Digital Ocean if they do not have a need for scaling and are willing to have a DR plan. The DR plan generally increases the cost of the overall solution. We also host stores that do not need scaling, such as:
  • PWA - Vue-storefront based stores which use nodejs
  • Magento stores with a high hit ratio Varnish FPC (this depends on many factors, but essentially the need to scale across multiple app servers is less likely)
 

How Magento can get near 0 downtime deployment

Factor III of the 12 Factor App says "Store config in the environment".

12 Factor App is what devops lives by - a set of 12 principles written by Adam Wiggins for predictable web app deployments.

Storing configuration in the environment, separate from code, has the advantages of reliable deployment and reduced time to deploy. It allows separation of the build stage from the deploy stage, with some deploys being just a change of the softlink to the web root folder.

Historical preview : Magento 1

Magento 1 did not have much of a build process: js and css were not versioned, minification happened "online" on first access, as did database upgrades, and configuration was stored in the database.

The most reliable way to go from a dev configuration to a live configuration required either a known set of manual steps or changes made directly to the database.

luroConnect developed its own build and deploy process. In our build step we

  • get source code from git
  • minify css and js files in the skin and js folders using a grunt based process
  • set appropriate file ownership and permissions

During the deploy phase, we

  • Copy app/etc/local.xml from a secure deployment configuration area (our environment)
  • modify the core config data to add a version string in the skin and js URLs
  • access the website once through the index.php to cause the update scripts to run

The deploy process is of course run with the site in maintenance mode - we prefer to do this at the nginx level. Mostly it is a small blip.

Historical preview - pre Magento 2.2

Early Magento 2 builds were similar, except there was some help from the bin/magento command. Our deploy process no longer needed to version the static assets. Plugin enable / disable was given via config.php. Our deployment environment contained env.php.

However, developers had to manually configure and experiment with some options.

Site bringup required devops to access the admin panel or update the database with custom sql - enabling varnish, setting up CDN with a static URL, etc.

Magento 2.2 and beyond

Magento adopted the direction of the 12 factor app and, at Magento Live UK 2017, presented a new set of features that help split application configuration from environment configuration. Application configuration is defined in app/etc/config.php, which is advised to be kept in git. Hosting environment and secure details are kept in env.php, which should not be kept in git.

It is a slightly weak conformance - as commented by 12factor app "This is a huge improvement over using constants which are checked into the code repo, but still has weaknesses: it’s easy to mistakenly check in a config file to the repo; there is a tendency for config files to be scattered about in different places and different formats, making it hard to see and manage all the config in one place. Further, these formats tend to be language- or framework-specific."

Magento has fixed this in 2 ways

  1. The language specific aspect is addressed to some extent by allowing the bin/magento cli to be used to edit env.php for sensitive data. The config:sensitive:set command writes directly to env.php. These commands do not require the database and hence can be run in a pre-deploy step.
  2. Use of scoped environment variable names. These would be set in Nginx configuration or an include file such as fastcgi_params.

However, there is no documented way to set database details - except to manually edit the env.php file.
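To make the scoped environment variable names mentioned above concrete, here is a small sketch of how a config path maps to a variable name. It assumes the naming convention as I understand it from the Magento documentation (a CONFIG prefix, then the scope, then the config path with slashes replaced by double underscores, all uppercase); the helper name is hypothetical, so please verify the result against the documentation for your Magento version.

```python
def magento_env_name(config_path, scope="default", scope_code=None):
    """Build the environment variable name that overrides a config path.

    Assumed convention: CONFIG__<SCOPE>[__<SCOPE_CODE>]__<PATH__SEGMENTS>.
    `scope` is "default", "websites" or "stores"; the latter two need a code.
    """
    if scope not in ("default", "websites", "stores"):
        raise ValueError("scope must be default, websites or stores")
    parts = ["CONFIG", scope.upper()]
    if scope != "default":
        if not scope_code:
            raise ValueError("websites and stores scopes need a scope code")
        parts.append(scope_code.upper())
    # web/secure/base_url -> WEB, SECURE, BASE_URL
    parts.extend(segment.upper() for segment in config_path.strip("/").split("/"))
    return "__".join(parts)
```

For example, magento_env_name("web/secure/base_url") gives CONFIG__DEFAULT__WEB__SECURE__BASE_URL, which is the kind of name that would be set in the Nginx configuration or in fastcgi_params.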

The app:config:dump command

The app:config:dump command is a great help in maintaining a known configuration of the application (which 12factor app suggests be committed to git). This ensures the configuration is communicated from developers to operations.

The app:config:dump command writes to config.php and env.php. While config.php is suggested to be committed to git, env.php should not be committed to git.

If a value is in config.php, the Magento admin panel does not allow the parameter to be edited. This locking helps with giving stability to the application configuration. It ensures the application is developed and tested with a known configuration.

The figure alongside shows the suggested flow.

Suggested flow for using app:config:dump

Why do Magento deployments still keep the site in maintenance?

However, we find that even 2 1/2 years after the announcement, acceptance and understanding of these features is weak, leaving websites in maintenance mode while code is deployed.

Either developers are failing to maintain the discipline of owning the configuration, or devops are failing to understand the application's build and deploy process.

There are some practical problems as well. An eCommerce manager would like to have control on the live website on say, when backorders would be allowed storewide. Since this is locked in config.php, this request has to go through developers or devops.

luroConnect near 0 downtime deploy

luroConnect's Magento 2 build is in a pipeline - such as a bitbucket pipeline. A commit triggers the pipeline that does the following

  • composer install (with the composer cache to speed this process)
  • bin/magento setup:di:compile
  • bin/magento setup:static-content:deploy

The contents are then tarred and sent to the staging and production servers.

Upon deploy the contents are untarred, deployment related files like env.php are copied, and media and var are softlinked. The web root softlink is then changed to point to this new release. The process is slightly more complicated when multiple autoscale instances are running, as running instances are replaced with ones with new code.

If required the bin/magento setup:upgrade command is run and only then is it required to keep the site in maintenance.
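The softlink switch described above is what keeps the blip small. A minimal sketch of that one step, assuming a POSIX filesystem and a web root that is already a symlink (the function and path names are illustrative, not luroConnect's actual tooling):

```python
import os


def switch_release(web_root_link, release_dir):
    """Point the web root symlink at release_dir without a gap in between.

    A new symlink is created under a temporary name and then renamed over
    the old one; on POSIX systems the rename replaces the old link atomically.
    """
    temp_link = web_root_link + ".tmp"
    if os.path.lexists(temp_link):
        os.remove(temp_link)  # left over from an interrupted deploy
    os.symlink(release_dir, temp_link)
    os.replace(temp_link, web_root_link)
```

Because the old release directory is left untouched, rolling back is the same call with the previous release path.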

Would you like to switch to a modern hosting platform?

Schedule a call for a free evaluation!

With features like ~0 downtime code deploy and autoscale to reduce your hosting costs, luroConnect offers you an unparalleled hosting environment for Magento.

Schedule a call and we will show you how we can

  • Improve your hosting, possibly with autoscale
  • Have a managed dev, staging and production environment
  • Server performance measured every minute with alerts for a slowdown
  • A multi point health check every day
  • Optimized hosting costs

Free as in freedom, not free beer!

Introduction

The open source community has long had this saying, though there are many, including myself, who do not understand what this difference means.

With recent changes to open source license agreements, this difference has come to the fore. For example, changes in the open source licensing of redis and mongodb have restricted how AWS and other cloud providers can conduct their business. Directly relevant to eCommerce merchants is the effect on open source Magento of Adobe’s acquisition of Magento and the path that has been followed since.

Magento Open Source vs Enterprise

Adobe’s business reason to have Magento is (in my opinion) to complete their offering. Adobe has seen a huge, public and successful transition from their traditional business of one-time purchase of packaged software to a subscription and even a SaaS model. With the acquisition of Magento and the Commerce Cloud licensing model, Adobe clearly thinks Magento should be packaged with hosting – hence the word Cloud in their offering and their pricing of the Enterprise license that includes cloud hosting. This transition is seen in SAP’s hybris commerce offering that includes hosting to make SAP Commerce Cloud. Unlike SAP hybris though, Magento is open source.

If you accept the Adobe Magento Commerce Cloud offering, you submit to the fixed set of features that are offered by their cloud or subscribe to a SaaS service for integration – though sometimes even that requires qualification or may not be possible.

For example, if you want PWA, you are limited and have to wait for the PWA Studio. If you want an improved search interface, you are limited to their choice. Similarly, for CDN, image optimization and Web Application Firewall, you are limited to Fastly, Adobe’s choice in the matter.

Or, perhaps, you use a plugin that is connected to a SaaS service.

When a Magento website is self-hosted, though, the choice is yours: install a plugin, or enhance the code, possibly with a service that has to be hosted. In the examples above, you may want to use vue-storefront, one of many systems for search, or ImageMagick as an image optimization solution.

From Free Beer to Freedom!

The earlier licensing model of Magento pushed the decision to a “board level” – companies like ours always take the supported version. Mostly the open source version of Magento was attractive to those who were attracted by the “free beer”.

However, now the decision is that of freedom – since the paid version comes with restrictions.

If you feel guilty of being a taker of open source, you can sponsor community commits back to the open source Magento. The community participation is not negligible. Matt Asay of Adobe suggests it may be as high as 50% in this article.

(Indeed there are community participants who think Adobe is gaining from open source contributions, but that is a different blog article).

Shameless plug!

Full stack managed hosting support from luroConnect gives you the benefit of supported open source and the flexibility to build your own solution around it. The entire stack is based on open source – from Linux to Magento – including nginx, ModSecurity, redis, elasticsearch, sphinx and of course mysql. (We reluctantly also allow ioncube encrypted plugins.) This is coupled with a release process from your git, hosted on any cloud or open hosting provider – in the age of Uber you don’t need to own your data centers. Our multi-layered security approach and proactive monitoring come standard. With additional features like a disaster recovery plan, image optimization, peep-hole maintenance and a dashboard to monitor and control key tasks such as code deployment or indexing, we bring peace of mind to Magento hosting. Check out our pricing and you can connect with us.

Magento Open Source vs Commerce Cloud

Magento Commerce Cloud does offer additional features. See alongside (zoom for a larger view) for a comparison taken from magento.com. Some key features like WYSIWYG editor will never be released in open source.

However, not everyone needs all of the features and some of these features are available from other plugin vendors or custom development from the many certified and non-certified agencies.

In fact, even if you are a Commerce Cloud customer, you will need customizations and potentially more plugins and even third-party SaaS services to have a fully working store.

Conclusion

Magento Open Source is now a very viable option for all stores – brands and high volume stores included. With many options to customize and integrate, you have the freedom to build your own best-of-breed solution and not be restricted by the Adobe environment. With a managed hosting service you can get optimized and scalable websites.

6 tips to speed up Magento websites

Introduction

Website speed is crucial to conversions. There are many studies that indicate how a website slowdown of even one second can cause a drop in conversions. And websites that have acted on this have seen conversions grow. Take Pier 1 for example in this article.

Our tips are oriented towards what you need to do on the server side to improve performance, especially during high traffic times, when server response times tend to be inconsistent. These tips will speed up Magento websites.

Tip # 1 : Restrict Bots

When times are busy, you need to be mindful of who you allow into your store. Restricting bots is easy to implement by checking the "user agent" field. By lowering server load, this will speed up Magento or help reduce the resources needed.
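In practice this check usually lives in the web server or firewall configuration, but the logic is simple enough to sketch. The marker list below is purely hypothetical; which bots you block is a business decision, and you will normally want to keep the search engine crawlers you depend on.

```python
# Hypothetical list of user agent fragments to reject during busy periods.
BLOCKED_AGENT_MARKERS = ("ahrefsbot", "semrushbot", "mj12bot")


def is_blocked_bot(user_agent, markers=BLOCKED_AGENT_MARKERS):
    """Return True if the user agent contains any blocked marker."""
    if not user_agent:
        return False  # policy choice in this sketch: let empty agents through
    agent = user_agent.lower()  # user agent matching should ignore case
    return any(marker in agent for marker in markers)
```

A request for which is_blocked_bot returns True would be answered with a cheap rejection instead of reaching php.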

Tip # 2 : Rate limit hits from the same IP

This is one more tip for reducing load that may work in your situation. Rate limiting hits from the same IP, unless your target audience is a group that sits behind a single firewall, will restrict simple robotic attempts to crowd your site, especially if you are running a time-bound flash sale.
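Rate limiting is normally configured in the web server or load balancer rather than written by hand. To show the idea, here is a minimal sliding window sketch; the class name and the limits used with it are assumptions for illustration only.

```python
import time
from collections import defaultdict, deque


class IpRateLimiter:
    """Allow at most max_hits per IP within any window_seconds span."""

    def __init__(self, max_hits, window_seconds):
        self.max_hits = max_hits
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # ip -> timestamps of recent hits

    def allow(self, ip, now=None):
        now = time.monotonic() if now is None else now
        hits = self._hits[ip]
        # Forget hits that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_hits:
            return False  # over the limit: reject or delay this hit
        hits.append(now)
        return True
```

Note the caveat from the text: many real shoppers behind one office firewall share an IP, so the limit has to be generous enough for them.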

Bonus Tip : Use CDN to serve static content

During peak traffic, CDNs help in offloading not (just) servers but actually network bottlenecks of your servers. This causes a speedup in Magento by reducing the load on the entire system.

Tip # 3 : Use php7 instead of php5.x

php7 performance is almost 50% higher than php5. We have seen this in processing CPU intensive Magento hits. However, Magento 1.x does not officially support php7, so it needs 3rd party patches; we like the one from Inchoo. Magento 2 should be on php7.

Tip # 4 : Consider adding separate servers or pools for your checkout flow (or admin access)

While site performance is important, checkout flow performance is even more important as it is on the business end of your sales funnel. Consider adding separate checkout servers or pools. They are akin to High Occupancy Vehicle (HOV) or diamond lanes in traffic. This helps to speed up Magento's checkout process.

The same concept can be applied to restrict the number of resources you reserve for admin access.
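The routing decision behind separate pools is made on the request path. The sketch below shows only that decision; the prefixes and pool names are assumptions (many stores use a custom admin path), and in a real setup this mapping would be expressed in the web server or load balancer configuration.

```python
# Hypothetical mapping of URL prefixes to server pools.
POOL_PREFIXES = (
    ("/checkout", "checkout"),
    ("/admin", "admin"),
)


def pool_for_path(path, prefixes=POOL_PREFIXES, default="frontend"):
    """Pick the pool that should serve a request path."""
    for prefix, pool in prefixes:
        # Match the prefix itself or anything below it, but not look-alikes
        # such as /checkout-offers.
        if path == prefix or path.startswith(prefix + "/"):
            return pool
    return default
```

With this mapping, a surge of catalog browsing fills only the frontend pool, while shoppers who are paying keep their own lane.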

Tip # 5 : Manage Magento Indexing

Leaving Magento indexes on "Update on Save" can make your newly uploaded products available for shopping faster, but can slow down the site! If business permits, move indexing to lower volume times of the day (or night). This setting directly speeds up the Magento front end.

Tip # 6 : Monitor the performance of your caches

A Magento site has many caches and each one helps in reducing the server load and improving response times. But you need to monitor hit to miss ratios and out of memory conditions. More cache hits mean faster responses, which speeds up Magento. The caches one should look at are php opcache, Magento block cache, FPC (Full Page Cache) and mysql cache.
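Monitoring hit to miss ratios comes down to simple arithmetic on counters that each cache exposes. A minimal sketch, assuming you have already collected hit and miss counts per cache (the function names, the counts in the example and the 99% target are illustrative):

```python
def hit_ratio(hits, misses):
    """Fraction of lookups served from cache; 0.0 when there is no traffic."""
    total = hits + misses
    return hits / total if total else 0.0


def caches_below_target(stats, target=0.99):
    """Return the names of caches whose hit ratio is under the target.

    `stats` maps a cache name to a (hits, misses) pair.
    """
    return sorted(
        name for name, (hits, misses) in stats.items()
        if hit_ratio(hits, misses) < target
    )
```

For example, caches_below_target({"opcache": (999, 1), "fpc": (80, 20)}) flags only "fpc", which tells you where to look first.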

Conclusion

By reducing and redirecting traffic and monitoring the servers, the overall performance of a website can be improved.

We can analyze your site for free

Schedule a call

Not happy with your website performance and want an expert to look at it?

  • We will analyze your site using public information.
  • We will ask you to give us a 1 day web server log file.
  • We will try to identify what steps, if any, you should take to meet your site's performance goals.

Watch our webinar on performance and scaling in Magento

Using an analogy to vehicular traffic we explain performance and scaling in Magento.
Key takeaways

  • Know how to compare hosting options
  • Importance of good code
  • How to scale
  • Tuning Magento

A non technical guide to scaling Magento (or any other website)

Performance and scaling of a Magento website are often confused. If you are a store owner who may not be technical, a close analogy with real life will help when talking to your hosting providers and other experts.

It is no coincidence that hits to a website are called traffic. We take this analogy further to explain which factors matter to the performance and scaling of a website.

Website performance is like a car – higher performance cars drive faster and can cover a distance in a shorter time. Similarly, a higher-performing website will serve a page faster. This is often measured as page load speed. A critical component of page load that the server is responsible for is server response time. Like measuring the performance of a car, measuring the page load speed is done in test mode with little or no traffic. Sometimes the performance is measured at a random time without looking at other traffic to the site. That is like test driving a car through traffic.

Scaling is like building highways and roads for the cars to move on. Highways are resources – CPU, memory, network that the hits to a website will utilize. The task of a Magento scaling expert is to architect a system – servers and sizes, services to run on each server, connectivity of the servers and access from internet, etc.

Hits to a website are like the traffic of random cars on the highway. Each vehicle seems to have a mind of its own, joining and leaving the highway. Each visitor to the website will take their own journey, visiting different pages.

Some Observations

Observation 1: Just as a car cannot drive at its highest possible speed at all times due to traffic, a website cannot perform at its best all the time. Understanding the factors that make the website perform at its optimal level all the time is the task of both the developer and the server architect.

Observation 2: Just as traffic has vehicles of different performance, the URLs of a website do not all perform equally. A category page may not perform the same way as a product detail page, for example.

Observation 3: Better throughput will be achieved with the same resources if the vehicle performance is improved – some bottlenecks can be avoided if the vehicles move faster. Similarly, a better performing website is likely to scale better.

Observation 4: As with traffic, in order to scale one has to find the bottleneck in the highway that is causing the current slowdown, fix it and then look for the next bottleneck. This is a change in the hosting infrastructure and architecture, different from the website performance.

Observation 5: A traffic designer's job is to ensure the maximum number of vehicles can pass along the highway at the best speed for each vehicle. A hosting designer's job is to ensure maximum traffic is handled in a way that each hit is best served.

What lessons can we learn from traffic management

Lesson 1: To manage traffic better, the highway system has to be designed to be scalable. Mostly we can derive what needs to be done through bottleneck analysis. For example, is the database a bottleneck, is file system access a bottleneck, etc.

Lesson 2: When traffic increases, possibly beyond the capacity of the highway, traffic management has to account for one more variable – starvation, the amount of time a vehicle has to wait at a metered light to enter the highway. The longer the wait, the more frustrated the drivers, who will find a better route to their destination.

Lesson 3: On a highway, lanes are drawn. Better hosting will make lanes. The way most hosting providers take traffic is analogous to not having lanes, in the hope that the maximum throughput will be achieved by letting hits contend for resources. The operating system is left to decide which process gets to use resources.

What are the recommended steps to achieve scaling?

As a first step to server side scaling, we move the database layer out to another instance or server. The main reason is that it is better to allocate resources in a single server when the workload is similar.

In our multi part series we take you through achieving scaling. The series is aimed at a store owner who need not be technical but is ultimately responsible for decisions about the store. Until now you had to depend on an expert. However, there are no clear answers, and the expert is making judgement calls based most likely on prior experience. As a matter of fact, no two webstores are alike, so the results of these efforts vary. This series will make you better informed.

We start by looking at a popular form of scaling – using FPC or Full Page Cache and other types of caches.

In order to help with scale, another important aspect is code quality specifically related to scaling. Scaling is difficult to achieve reliably if there is any externally dependent blocking service executed as part of the hit. Examples include sending email directly to a recipient or an external service, or sending information from the server to an external service. All such processing should go through some form of queue handled by a different process, such as a cron job. Until Magento 1.9.2.4, for example, the default email sending was inline, slowing down the display of the order success page.
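The difference between inline and queued processing can be sketched as follows. This is an illustration of the pattern only, not Magento's implementation: the in-process queue stands in for a database table or message queue, and all names are hypothetical.

```python
import queue


def place_order(order_id, customer_email, email_queue):
    """Handle the hit: record the email to send, but do not send it here."""
    # ... the order itself would be saved at this point ...
    email_queue.put({"order_id": order_id, "to": customer_email})
    return f"Order {order_id} placed"  # the success page is not held up


def drain_email_queue(email_queue, send):
    """Run from a separate process (for example a cron job): send what is queued."""
    sent = 0
    while True:
        try:
            message = email_queue.get_nowait()
        except queue.Empty:
            return sent
        send(message)  # the slow external call happens outside the hit
        sent += 1
```

If the mail service is slow or down, only the worker waits; the shopper still sees the order success page immediately.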

Autoscaling adds and removes servers (and hence resources) – something traffic managers cannot do with highways. This gives website scaling an advantage to be more elastic.

Magento Caches – Scaling Magento Part 2

Introduction

Caches are an integral part of a strategy for performance and scaling a Magento website. Managing caches is a core function of infrastructure management. A Cache is the term used to describe a mechanism to store calculated values such as a query or HTML so a subsequent request does not need to recalculate.

While Part 1 looked at FPC specifically, in this article we review all other caches needed for a Magento store. Some of these are commonly found to be referred in best practices, but a deeper understanding will help put them in perspective of Magento.

In general, for any cache the factors to consider are

  • Effectiveness – measured in terms of impact on page speed. Caches should be highly effective.
  • Invalidation flexibility – is an inbuilt automatic mechanism available? Is the mechanism too aggressive? A High score means the mechanism is very flexible.
  • Performance of miss – if a miss were to occur what would be the performance in terms of page speed. A low score represents bad performance.
  • Cost of refill –how much would it cost to refill the cache completely to be back in its most effective state?
  • Hit ratio achievable – In a real world environment, if the hit ratio were measured periodically, what can be expected? A high score would be over 99% hits over a reasonable period such as a day.
  • Memory required : How much memory is needed on the server side to cache the content.

Cache Type       | Effectiveness | Invalidation flexibility | Performance of miss | Cost to refill | Hit ratio achievable | Memory required
Browser          | High          | High                     | Low                 | Moderate       | High                 | N/A
CDN              | High          | Low                      | Moderate            | Moderate       | High                 | N/A
FPC              | High          | Low                      | Low                 | High           | Moderate             | High
HTML Block Cache | Moderate      | High                     | Moderate            | Moderate       | High                 | Low
Mysql cache      | High          | Low                      | Low                 | High           | High                 | Moderate

Browser cache

  • Easy to set up – at the hosting level. Static resources like css, js and images should have caching enabled.
  • There is no need for invalidation as the browser checks for each cached resource if a newer version is available.
  • The ability of browsers to find changed resources means cacheable resources should have a very long expiry – say over a year.
  • Browser’s request for detecting change has a performance impact based on number of resources.
  • A key question to be answered is how merging css and js files impacts browser caching performance. If merging is enabled, each page type (home, category, product, CMS) is likely to have different css and js files. Not merging allows the browser to cache individual files and reuse them across pages. However, in HTTP/1 browsers limit the number of connections to each domain. So, the advice is to not merge css and js files if using HTTP/2, or to split domains for skin and js if using HTTP/1.

CDN

  • CDN networks cache at edge servers closer to users – reducing round trip latency from browser to server.
  • CDN also offloads the server from serving static cacheable resources, improving network performance of the server as well as freeing up server CPU to serve dynamic content.
  • CDNs may also take the load of SSL validation; however, caution is needed here, as the traffic between the CDN and the server may be unsecured, making the site vulnerable to some types of attack.
  • CDNs are notorious for invalidations – some charge for APIs, others take a few minutes before invalidations are effective across all edge points. When evaluating a CDN, this is a key factor that is often not evaluated.
  • Having many edge locations may not be a good thing, as each edge records the first access to a resource as a miss.
  • While a single miss is easily retrieved from the server or backup store, a full invalidation requires multiple GBs to be transferred to make the cache effective again.
  • CDNs, when full, give a great performance benefit on page load times.
  • Modern CDNs like section.io can also do FPC (html) caching using a distributed varnish cache architecture.

FPC

We have reviewed aspects of FPC in part 1 here.

Recap

  • FPC caches full HTML pages – except for variable content
  • Excellent for caching dynamic content
  • Requires very high amount of RAM on the server side
  • Depending on the quality of code, FPC invalidation or miss can have impact on resource utilization
  • Best implemented with autoscale – so servers are added automatically when cache is invalidated

HTML Block Cache

  • Also caches HTML but cached at the block level. Magento uses HTML blocks for building a page.
  • Since blocks may be shared across pages, these blocks do not have a high impact on invalidation as they will be regenerated once and used multiple times
  • Can dramatically improve performance if used consistently and correctly.
  • Needs developer help as many blocks are not cached by default. To cache a block, one needs a unique key that correctly identifies its variation. Check this technical blog info.
  • Invalidation can be either via a key or time (TTL). If using a key, developer needs to write appropriate event callbacks to detect change.
  • Examples of major speed up include home page blocks where latest products are shown. Depending on how frequently the store is updated, a 10 minute to 1 hour TTL on the block will result in a dramatic improvement of home page speed.
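The two ingredients named above, a unique key per block variation and a TTL, can be illustrated with a small sketch. It shows the concept only, not Magento's block cache API; the class, the key fields and the TTL value are assumptions.

```python
import time


class BlockCache:
    """Cache rendered HTML per key, regenerating it once the TTL has passed."""

    def __init__(self, clock=time.monotonic):
        self._entries = {}  # key -> (expires_at, html)
        self._clock = clock

    def get_or_render(self, key, ttl_seconds, render):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # hit: reuse the block across pages and visitors
        html = render()  # miss or expired: do the expensive work once
        self._entries[key] = (now + ttl_seconds, html)
        return html
```

The key must capture everything that changes the output. For a "latest products" block that might be a tuple such as ("home_latest_products", store_code, currency); leaving a field out means one variation is served to everyone.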

mysql cache

Mysql can store query results in memory. The amount of memory is specified in my.cnf as a combination of query_cache_limit (max memory for a single query result) and query_cache_size (max memory for all cached queries).

  • mysql automatically invalidates a cached query if any table that was used in the query changes.
  • to access the cache mysql uses a lock on a table thereby reducing the effectiveness of the cache
  • Many mysql articles recommend smaller cache values due to the lock problem but it is best to test the size of the cache for your situation. It is best to monitor "low mem prunes" and "table locks wait". The first gives the number of times a query was cached by removing another and the latter gives the number of times table lock was not immediately available. Both should be as low as possible.
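The counters mentioned above come from SHOW GLOBAL STATUS. The sketch below summarises them from a dictionary of status values; mapping "low mem prunes" to Qcache_lowmem_prunes and "table locks wait" to Table_locks_waited is my assumption about which variables are meant, and the query cache only exists in MySQL versions before 8.0.

```python
def query_cache_report(status):
    """Summarise query cache health from SHOW GLOBAL STATUS values.

    `status` maps variable names to values (strings or ints), for example
    as read from the server by a monitoring script.
    """
    hits = int(status.get("Qcache_hits", 0))
    # Com_select counts SELECTs that were actually executed, i.e. cache misses.
    selects = int(status.get("Com_select", 0))
    lookups = hits + selects
    return {
        "hit_ratio": hits / lookups if lookups else 0.0,
        "lowmem_prunes": int(status.get("Qcache_lowmem_prunes", 0)),
        "table_locks_waited": int(status.get("Table_locks_waited", 0)),
    }
```

Comparing two reports taken some minutes apart is more useful than a single reading, since these counters only grow from server start.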

Our experience with the mysql cache suggests it is useful when the front caches are empty. The cache has a negative effect on add to cart and checkout.

The mysql cache also has negative side effects when a large catalog is loaded. We therefore recommend keeping indexing on a schedule, disabling the mysql cache before indexing starts and enabling it again when indexing ends.

There are other caches too

  • Magento has caches other than the block cache. These should always be enabled on a production server.
  • PHP opcache
    PHP code is read and converted to "opcodes", which are then interpreted. An opcode cache stores the opcodes, saving the conversion step on each request. The opcode cache should have sufficient RAM and enough keys (at least the number of PHP files).
    Its status should be checked regularly using opcache.php.
  • Operating system cache
    Linux has an excellent file system cache: whenever additional RAM is available, Linux keeps opened files in RAM. This is important if, for example, you do not have a CDN, because files will most likely be retrieved from RAM rather than from disk.
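To judge how many opcache keys a site needs, it helps to count the PHP files it ships. The sketch below is one rough way to do that, assuming that .php and .phtml files are the ones that get compiled. The result can be compared with the opcache.max_accelerated_files setting. The function name and the throwaway demo directory are for illustration only.

```python
import os
import tempfile


def count_php_files(root, extensions=(".php", ".phtml")):
    """Count files under root whose names end with one of the extensions."""
    total = 0
    for _dirpath, _dirnames, filenames in os.walk(root):
        total += sum(1 for name in filenames if name.endswith(extensions))
    return total


# Demo on a temporary directory; in practice pass the path of your Magento root.
with tempfile.TemporaryDirectory() as demo_root:
    for name in ("index.php", "view.phtml", "style.css"):
        open(os.path.join(demo_root, name), "w").close()
    print(count_php_files(demo_root))
```

The count is a lower bound for the number of keys, so leave headroom above it.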

Scaling Magento Series

Performance and scaling of a Magento website are often confused. For a store owner who may not be technical, a close analogy with real life helps when talking to hosting providers and other experts. It is no coincidence that hits to a website are called traffic!

Performance of a website is like a car – higher performance cars drive faster and can cover a distance in a shorter time. Similarly, a higher performing website will serve a page faster. This is often measured as page load speed. A critical component of page load that the server is responsible for is server response time. Like measuring the performance of a car, measuring page load speed is done in test mode with little or no traffic. Sometimes the performance is measured at a random time without regard to other traffic on the site. That is like test driving a car through traffic.

Core infrastructure is like an engine – a better CPU with L2/L3 cache, faster memory and better disk performance will improve the engine you use. There are simple Linux commands for finding this information. Refer here for CPU performance, memory performance and disk performance.

Good code is like good fuel – Just as bad fuel will hurt the performance of a car, bad code will hurt the performance of a website. Refer here for identifying bad code in Magento.

Page load time is like transmission – the mechanism that delivers speed to the page. This is achieved by optimizing above-the-fold content in code, using a CDN, using the browser cache, using gzip on all text content, optimizing images, delivering appropriately sized images, minifying CSS and JS, and optimizing the use of marketing pixels.

Scaling is like building highways and roads for the cars to move on. Highways are resources – CPU, memory, network that the hits to a website will utilize. The task of a Magento scaling expert is to architect a system – caches, servers and sizes, services to run on each server, connectivity of the servers and access from internet, etc.

Hits to a website are like the traffic of random cars on the highway. Each vehicle seems to have a mind of its own, joining and leaving the highway. Each visitor to the website will take their own journey, visiting different pages.

Observation : Just as a car cannot drive at its highest possible speed at all times because of traffic, a website cannot perform at its best all the time. Understanding the factors that keep the website performing at its optimal level is the task of both the developer and the server architect.

Observation : Just as traffic has vehicles of different performance, not all URLs of a website perform equally. A category page may not perform the same way as a product detail page, for example.

Observation : Better throughput is achieved with the same resources if vehicle performance is improved, because some bottlenecks can be avoided when the vehicles move faster. Similarly, a better performing website is likely to scale better.

Observation : As with traffic, in order to scale one has to find the bottleneck in the highway that is causing the current slowdown, fix it and then look for the next bottleneck. This is a change in the hosting infrastructure and architecture, which is different from website performance.

Observation : A traffic designer's job is to ensure that the maximum number of vehicles can pass along the highway at the best speed for each vehicle. A hosting designer's job is to ensure that the maximum traffic is handled in a way that serves each hit best.

What lessons can we learn from traffic management

Lesson : To manage traffic better, the highway system has to be designed to be scalable. Bottleneck analysis mostly tells us what needs to be done. For example, is the database a bottleneck, is file system access a bottleneck, etc.

Lesson : When traffic increases, possibly beyond the capacity of the highway, traffic management has to account for one more variable – starvation, the amount of time a vehicle has to wait at a metered light to enter the highway. The longer the wait, the more frustrated the drivers, who will find a better route to their destination.

Lesson : On a highway, lanes are drawn. Better hosting will make lanes – a thought we think is unique to our style of hosting Magento. The way most hosting providers take traffic is analogous to not having lanes, in the hope that maximum throughput will be achieved by letting hits contend for resources. The operating system is left to decide which process gets to use resources.

What are the recommended steps to achieve scaling?

As a first step to server side scaling, we move the database layer out to another instance or server. The main reason is that it is better to allocate resources in a single server when the workload is similar.

In general these techniques can be used for scaling

Caching. Simply stated, do not recompute results that can be reused. The results could be HTML (a complete page, a part of a page, or a page with holes to be filled), JSON (data returned to an API call), SQL query results, etc. Cache entries require either an expiry time or an invalidation event. Caches work better when stale content is not a major problem.
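The two ways a cache entry can end its life, an expiry time and an invalidation event, can be sketched in a few lines. This is a simplified in-memory illustration and not how Magento implements its caches. The class, the key and the render function are hypothetical.

```python
import time


class ExpiringCache:
    """Tiny in-memory cache: entries expire after ttl_seconds or on invalidate()."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self.entries = {}  # key -> (value, time stored)

    def get_or_compute(self, key, compute):
        now = self.clock()
        cached = self.entries.get(key)
        if cached is not None and now - cached[1] < self.ttl_seconds:
            return cached[0]  # hit: reuse the earlier result
        value = compute()  # miss or expired: do the expensive work once
        self.entries[key] = (value, now)
        return value

    def invalidate(self, key):
        """Invalidation event: drop the entry so the next request recomputes it."""
        self.entries.pop(key, None)


render_log = []  # records each time the expensive render actually runs


def render_category_page():
    # Stand-in for expensive page generation (hypothetical).
    render_log.append("rendered")
    return f"<html>category page, render #{len(render_log)}</html>"


ttl_cache = ExpiringCache(ttl_seconds=300)
first = ttl_cache.get_or_compute("category/123", render_category_page)
second = ttl_cache.get_or_compute("category/123", render_category_page)  # cache hit
ttl_cache.invalidate("category/123")
third = ttl_cache.get_or_compute("category/123", render_category_page)   # recomputed
print(len(render_log))
```

Three requests cause only two renders. The longer the expiry time, the fewer renders, at the price of possibly serving stale content until an invalidation arrives.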

Queuing. Another powerful technique is putting work in a queue instead of doing it in real time. A queue then needs either a poll that picks up results when they are ready or a trigger that updates them. Magento makes it easy to write event triggers, and many are used for 3rd party integrations. Unfortunately, events are fired inline, while the visitor to the website waits for the response. It is better to use a queuing system. Another popular event trigger is sending email.
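The difference between inline and queued work can be illustrated with a small sketch. It uses an in-process queue and a worker thread as a stand-in for a real queuing system, and a list as a stand-in for the mail server. All names here are hypothetical.

```python
import queue
import threading

email_queue = queue.Queue()
sent = []  # stands in for the mail server in this sketch


def send_email(message):
    # Hypothetical slow call to an external mail service.
    sent.append(message)


def email_worker():
    """Runs outside the request path and drains the queue."""
    while True:
        message = email_queue.get()
        if message is None:  # sentinel: stop the worker
            email_queue.task_done()
            break
        send_email(message)
        email_queue.task_done()


def place_order(order_id):
    """Request handler: enqueue the email and return immediately."""
    email_queue.put(f"Order {order_id} confirmation")
    return f"Order {order_id} placed"


worker = threading.Thread(target=email_worker, daemon=True)
worker.start()

responses = [place_order(order_id) for order_id in (1001, 1002)]
email_queue.put(None)  # tell the worker to stop after the queued messages
email_queue.join()     # wait only so the example can show the result
print(responses)
print(sent)
```

The visitor gets the order response without waiting for the email to be sent. In production the queue would normally live outside the web process so that queued work survives a restart.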

Tuning. Monitoring and tuning to improve scale is a continuous process. If you are not monitoring, you cannot improve. Monitoring does not mean only CPU and RAM. It means measuring actual response times, cache hit / miss ratios and queue lengths, and alerting on and analysing these parameters.
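As a small illustration of the kind of numbers worth tracking, the sketch below computes a cache hit ratio and a nearest-rank percentile of response times. The sample values are invented and the helper names are hypothetical.

```python
import math


def hit_ratio(hits, misses):
    """Fraction of lookups served from the cache."""
    total = hits + misses
    return hits / total if total else 0.0


def percentile(values, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


# Invented response times in milliseconds, including two slow outliers.
response_times_ms = [120, 135, 110, 900, 140, 125, 130, 2400, 115, 128]

print(hit_ratio(hits=850, misses=150))
print(percentile(response_times_ms, 50))
print(percentile(response_times_ms, 95))
```

Here the median looks healthy while the 95th percentile exposes the slow hits, which is why percentiles are usually a better alerting signal than averages.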

Sharding. If the database is the bottleneck, sharding – either vertically or horizontally – can help reduce the load. This works equally well on MySQL and NoSQL databases, but requires code to be reworked. A properly sharded database allows parts of the application to be split, giving greater app level scalability.
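Horizontal sharding needs a rule that always sends the same record to the same database. A minimal sketch of one such rule is shown below, assuming rows are split by customer id across four databases. The shard names and the choice of customer id as the key are hypothetical and not something Magento provides out of the box.

```python
import zlib

SHARDS = ["db-shard-0", "db-shard-1", "db-shard-2", "db-shard-3"]  # hypothetical names


def shard_for_customer(customer_id):
    """Pick a shard from a stable hash of the customer id."""
    digest = zlib.crc32(str(customer_id).encode("utf-8"))
    return SHARDS[digest % len(SHARDS)]


for customer_id in (17, 42, 1001):
    print(customer_id, shard_for_customer(customer_id))
```

A stable hash keeps a customer's data on one shard across requests and restarts. Note that changing the number of shards with this simple modulo rule moves most keys, so the shard count should be planned ahead.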

Laning services. This is another powerful technique, achievable with stateless APIs, SOA, microservice architecture and other design patterns. It allows easy scaling as busier lanes can be scaled out independently. Along with database sharding, lanes can be made very deep and can then be scaled out physically and geographically.

In our multi part series we take you through achieving scaling. The series is aimed at a store owner who need not be technical but is ultimately responsible for decisions about the store. Until now you had to depend on an expert. However, there are no clear answers, and the expert is making judgement calls based most likely on prior experience. As a matter of fact, no two webstores are alike, so the results of these efforts vary. This series will make you better informed.

We start by looking at a popular form of scaling – using FPC or Full Page Cache.

To help with scale, another important aspect is code quality, specifically as it relates to scaling. Scaling is difficult to achieve reliably if any externally dependent blocking service is executed as part of the hit. Examples include sending email directly to a recipient or an external service, or sending information from the server to an external service. All such processing should be done through some form of queue handled by a different process, such as a cron job. Until Magento 1.9.2.4, for example, the default email sending was inline, which slowed down the display of the order success page.

Autoscaling adds and removes servers (and hence resources) – something traffic managers cannot do with highways. This gives website scaling an advantage to be more elastic.

Part 1 : Full Page Cache (FPC)

Part 2 : Other Magento Caches

Part 3 : Code quality

Part 4 : Optimizing Checkout Flow

Part 5 : Hardware

Part 6 : Hosting

Part 7 : Auto scaling

Full Page Cache (FPC): Scaling Magento Part 1

Introduction

The most important and often ignored factor to scaling is the quality of code. Well written code will scale better. The next most important factor is perhaps caching. There are many types of caches that developers, managers and store owners need to understand. Full Page Cache (FPC) is seen by store owners as a magic solution to speed issues. Understanding the benefits and compromises of a caching mechanism is important to understand scaling.

FPC Options

Magento Enterprise 1.x and Magento Open Source & Commerce 2.x both have an FPC module built in.

There are many plugins available for Magento Community 1.x. Some hosting providers will help set up a Varnish based FPC with appropriate hole punching.

Magento 2 has two mechanisms for FPC: PHP based (called FPC) and Varnish. Varnish is the preferred option for production due to its architecture and speed of response.

The discussion below applies to all these mechanisms as well as to Magento 1.x and Magento 2.

What is Full Page Cache (FPC)?

FPC is a cache of a full HTML page, except for variable content such as login status, items in the cart, the stock status of a product or sometimes even the price of a product. On a hit, i.e. when the required page is in the cache, FPC returns the page very fast compared to a miss, i.e. when the page is not in the cache and the content has to be regenerated. FPC may store the cache in files but, for maximum benefit, it is more likely stored in memory.

FPC affects resource utilization – memory and CPU. As with all caches, we trade memory for CPU time.

Traditional FPC stores the page in Magento cache and is a part of Magento. Varnish stores the page HTML after it has been generated and is not a part of Magento. It is a separate process.

What FPC is not!

Let us understand FPC better

  • Memory needed to store the entire site in FPC
    Let us say each page is 100KB and you have 10000 pages to cache. That would take about 1GB of RAM. The problem is that when the number of pages or the page size rises above this, the RAM requirement goes up. So, if you now had 20000 pages (the result of each option in layered navigation, for example), you would need 2GB, or if each page was 120KB the 20000 pages would need about 2.4GB. Pages are not just products – they are category pages as well. If layered navigation is added, the pages multiply fast, as each combination is unique and needs to be stored independently. If you start exceeding the RAM available, you need to decide what to do when you hit the memory limit.
  • Cache warming.
    Cache warming is the process of automatically adding pages to the cache before a real visitor requests them. When a cache is cleared, you may need to warm it to make FPC effective early. Cache warming uses a crawler to artificially visit pages of a site. A typical crawler will recursively crawl the site starting from the home page. This sounds logical, but here are some things to think through:

    • If possible, find the pages you most likely need in the cache and warm the cache with only those pages. This will give the maximum benefit.
    • If you cannot fit all the pages in memory, using a crawler to warm the cache becomes a problem: it will push pages out of memory at random, not based on how popular the pages are with end users.
    • While the cache is being warmed, your CPU requirement will rise, as both the crawler and real traffic are being served.
    • If possible, crawl the site in parallel. The earlier the pages get cached, the more likely it is that a page a visitor requests will already be in the cache (scoring a hit).
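The memory sizing described above is simple multiplication, and it is worth redoing whenever the page count or page size changes. A rough sketch, assuming decimal units and ignoring any per-entry overhead of the cache backend:

```python
def fpc_memory_gb(page_count, page_size_kb):
    """Rough FPC memory estimate in GB, using decimal units (1 GB = 1,000,000 KB)."""
    return page_count * page_size_kb / 1_000_000


for pages, size_kb in [(10_000, 100), (20_000, 100), (20_000, 120)]:
    print(f"{pages} pages x {size_kb} KB = {fpc_memory_gb(pages, size_kb):.1f} GB")
```

Real usage will differ, since cache backends add overhead per entry and may compress content, so treat the result as a first estimate to compare against the RAM you have.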

Performance degradation on FPC full invalidation

The above figure shows the poor response immediately after an FPC that had grown to 1.5GB was cleared completely. The top image is the Redis usage graph from Munin and the one below is AWS CloudWatch latency (time to serve a page) averaged per minute. The latency came down as AWS Autoscale added more instances, which costs money.

  • Invalidating the cache :
    Magento automatically invalidates FPC (internal or Varnish) by tagging or hashing the content with keys that refer to the type of content. For example, it may generate a tag / hash CATEGORY_123 if the page depends on category 123. When category 123 changes, Magento sends out an invalidate message that says "all pages that have tag / hash CATEGORY_123 should be invalid". Magento has an elaborate tag convention.
  • FPC and robotic crawlers (BOTS)
    Even if you do not use a crawler for warming, robotic crawlers on the internet (such as Google’s indexer Googlebot) will start filling the FPC with pages they happen to crawl. Our advice is that a site with FPC should have robots.txt and a front end processor (nginx, WAF) restricting BOTs.
  • CPU and time needed to re-generate a page
    An FPC can be fully invalidated (cleared) when a (p)html or css file changes, or partially invalidated by a data change such as a product update. A miss from FPC results in the page being regenerated. The CPU requirement for a miss is much higher than for a hit. If a crawler is used to warm the cache, or if traffic is high, the CPU requirement can be quite high as the FPC fills up. Meanwhile, the visitor experience is not good during this period. With autoscale, this performance degradation can be contained to some extent, as additional instances are launched to handle the high CPU requirement.
  • Discipline when using FPC – know when invalidation happens
    It is important to be disciplined about code updates, as they have the worst effect on user experience.

    • Code update should be done at low traffic times.
    • Category changes should be carefully planned at low traffic times.
    • Magento indexing should be set to manual (M1) or on schedule (M2) with a cron running the indexer.
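The tag based invalidation described above can be sketched as a cache that remembers which pages were stored under which tag. This is a simplified illustration of the idea and not Magento's actual implementation. The class, the URLs and every tag other than CATEGORY_123 are made up.

```python
from collections import defaultdict


class TaggedPageCache:
    """Page cache where each page is stored with the tags it depends on."""

    def __init__(self):
        self.pages = {}                      # url -> html
        self.urls_by_tag = defaultdict(set)  # tag -> urls that depend on it

    def store(self, url, html, tags):
        self.pages[url] = html
        for tag in tags:
            self.urls_by_tag[tag].add(url)

    def get(self, url):
        return self.pages.get(url)  # None means a miss

    def invalidate_tag(self, tag):
        """Drop every cached page that was stored with this tag."""
        for url in self.urls_by_tag.pop(tag, set()):
            self.pages.pop(url, None)


page_cache = TaggedPageCache()
page_cache.store("/shoes", "<html>shoes</html>", tags=["CATEGORY_123"])
page_cache.store("/shoes/red-sneaker", "<html>sneaker</html>",
                 tags=["CATEGORY_123", "PRODUCT_9"])
page_cache.store("/hats", "<html>hats</html>", tags=["CATEGORY_456"])

page_cache.invalidate_tag("CATEGORY_123")  # category 123 changed
print(page_cache.get("/shoes"), page_cache.get("/shoes/red-sneaker"), page_cache.get("/hats"))
```

A single category change removes every page tagged with that category while leaving unrelated pages cached. This is why a change to a large category at peak time can cause many misses at once.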

Our recommendation for FPC

  • Do not use a random crawler to warm the FPC cache. Use a page popularity based crawler to warm the cache if necessary.
  • Avoid using a crawler during high traffic – the crawler will compete for system resources with live traffic
  • If possible update code during low traffic times as it causes FPC to invalidate
  • If your site is horizontally scaled, pre-launch instances to your load balancer before invalidating FPC, either explicitly or indirectly, so the latency of starting an instance does not further worsen the user experience
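A popularity based warm list can be derived from the web server access log. The sketch below assumes log lines in the common or combined log format, where the request appears in double quotes, and counts only GET requests. The function name and the sample log lines are invented for illustration.

```python
from collections import Counter


def most_requested_paths(log_lines, top_n):
    """Return the top_n most requested paths from access-log lines.

    Assumes the request is the first double-quoted section of each line,
    in the form: "GET /path HTTP/1.1".
    """
    counts = Counter()
    for line in log_lines:
        parts = line.split('"')
        if len(parts) < 2:
            continue  # no quoted request section
        request = parts[1].split()
        if len(request) >= 2 and request[0] == "GET":
            counts[request[1]] += 1
    return [path for path, _ in counts.most_common(top_n)]


sample_log = [
    '203.0.113.5 - - [01/Jul/2020:10:00:00 +0000] "GET /shoes HTTP/1.1" 200 5120',
    '203.0.113.6 - - [01/Jul/2020:10:00:01 +0000] "GET /shoes HTTP/1.1" 200 5120',
    '203.0.113.7 - - [01/Jul/2020:10:00:02 +0000] "GET /hats HTTP/1.1" 200 4096',
    '203.0.113.5 - - [01/Jul/2020:10:00:03 +0000] "POST /checkout HTTP/1.1" 302 0',
    '203.0.113.8 - - [01/Jul/2020:10:00:04 +0000] "GET /shoes HTTP/1.1" 200 5120',
    '203.0.113.8 - - [01/Jul/2020:10:00:05 +0000] "GET /hats HTTP/1.1" 200 4096',
    '203.0.113.9 - - [01/Jul/2020:10:00:06 +0000] "GET /belts HTTP/1.1" 200 2048',
]
print(most_requested_paths(sample_log, top_n=2))
```

The resulting list can be fed to the crawler in order, so the pages visitors actually request are cached first. Filtering out bot traffic and static assets before counting would make the list more representative.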

Magento 1.x FPC Plugins

  1. Free Lesti FPC : https://github.com/GordonLesti/Lesti_Fpc. Use this guide to install
  2. Magento connect search results for FPC

Should FPC be a part of scaling strategy?

FPC is concerned with speed. Scale is concerned with the process that helps the site add resources when needed. FPC helps in scalability by reducing the use of resources per hit to the website under certain conditions. It changes the dynamics of when and how many resources will be needed.

FPC has to be considered to be part of scaling strategy - but as one of many parts.

Read part 2 where we discuss other Magento caches.

Read the overview of our Magento scaling series here.
