Author - Pradip Shah

Full Page Cache (FPC): Scaling Magento Part 1

Introduction

The most important and often ignored factor to scaling is the quality of code. Well written code will scale better. The next most important factor is perhaps caching. There are many types of caches that developers, managers and store owners need to understand. Full Page Cache (FPC) is seen by store owners as a magic solution to speed issues. Understanding the benefits and compromises of a caching mechanism is important to understand scaling.

FPC Options

Magento Enterprise 1.x and Magento Open Source & Commerve 2.x, both have a FPC module inbuilt.

There are many plugins available for Magento Community 1.x. Some hosting providers will help setup a Varnish based FPC with appropriate hole punching.

Magento 2 has two mechanisms for FPC – php based (called FPC) and varnish. Varnish is the preferred option for production due to the architecture and speed of response.

The discussion below applies to all these mechanisms as well as to Magento 1.x and Magento 2.

What is Full Page Cache (FPC)?

FPC is a cache of a full HTML page – except variable content such as a login status or items in cart or stock status of a product or sometimes even price of a product. When a hit is encountered – i.e. the page required is in the cache, FPC will return very fast compared to when there is a miss i.e. the page is not in the cache, which will require re-generating the content. FPC may store the cache in files, but more likely for maximum benefit, it will be stored in memory.

FPC affects resource utilization – memory and CPU. As with all caches, we trade memory for CPU time.

Traditional FPC stores the page in Magento cache and is a part of Magento. Varnish stores the page HTML after it has been generated and is not a part of Magento. It is a separate process.

What FPC is not!

Let us understand FPC better

  • Memory needed to store the entire site in FPC
    Let us say each page is 100KB and you have 10000 pages to cache. That would take about 1GB of RAM. The problem is when the number of pages or page size starts rising above this, the RAM requirement goes up. So, if you now had 20000 pages (result of each option in layered navigation for example), you would need 2GB or if each page was 120KB the 20000 pages would need  4GB. Pages are not just products – they are category pages as well. If layered navigation is added the pages multiply fast as each combination is unique and needs to be stored independently. If you start exceeding the RAM available, you need to decide what to do when you hit the memory limit.
  • Cache warming.
    Cache warming is the process of automatically adding pages to the cache before a real visitor hit comes to the cached page. When a cache is cleared, you may need to warm the cache to make FPC effective early. Cache warming uses a crawler to artificially visit pages of a site. A typical crawler will recursively crawl the site starting from the home page. This sounds logical but here are some things to think through

    • If possible find the most likely pages you need to be in the cache and warm the cache with only those pages. This will give the maximum benefit.
    • If you cannot fit all the pages in memory, the use of crawling to warm the caches becomes a problem – they will recycle pages out of memory at random, not based on the end user popularity of the pages.
    • When the cache is being warmed your resource requirement in terms of CPU will rise as both the crawler and real traffic are being served.
    • If possible crawl the site in parallel – the earlier the pages get cached the more likely a visitor to a page will already be in the cache (scoring a hit).

Performance degradation on FPC full invalidation

The above figure shows the bad response immediately when a FPC that had built to 1.5GB was cleared completely. The top image is from redis usage graph from munin and the one below is AWS cloudwatch latency (time to serve a page) averaged per minute. The latency came down as AWS Autoscale added more instances, costing money.

  • Invalidating the cache :
    Magento automatically invalidates FPC (internal or varnish) by tagging or hashing the content with keys that refer to the type of content. For example, it may generate a tag / hash CATEGORY_123 if the page depends on category 123. Now, when category 123 changes, Magento sends out a invalidate message that says “all pages that have tag / hash CATEGORY_123 should be invalid”. Magento has a elaborate tag convention.
  • FPC and robotic crawlers (BOTS)
    Even if you do not use a crawler for warming, robotic crawlers on the internet (such as google’s indexer Googlebot) will start filling the FPC cache with pages they happen to crawl. It is our advice that a site with FPC should have robots.txt and a front end processor (nginx, WAF) restricting BOTs.
  • CPU and time needed to re-generate a page
    A FPC can fully invalidate (clear) due to a (p)html or css file changing or partially due to a data change such as a product update. A miss from FPC results in the page being regenerated. The CPU requirement for a miss is much higher than a hit. If a crawler is used to warm the cache or if traffic is high, CPU requirement can be quite high as the FPC fills up. Yet, the visitor experience is not good during this period. Using autoscale, this performance degradation can be contained to some extent as additional instances are launched to handle the high CPU requirement.
  • Discipline when using FPC – know when invalidation happens
    It is important to add discipline for code update as it has the worst effect on user experience.

    • Code update should be done at low traffic times.
    • Category changes should be carefully planned at low traffic times.
    • Magento indexing should be set to manual (M1) or on schedule (M2) with a cron running the indexer.

Our recommendation for FPC

  • Do not use a random crawler to warm the FPC cache. Use a page popularity based crawler to warm the cache if necessary.
  • Avoid using a crawler during high traffic – the crawler will compete for system resources with live traffic
  • If possible update code during low traffic times as it causes FPC to invalidate
  • If your site is horizontally scaled, pre-launch instances to your load balancer before invalidating FPC, either explicitly or indirectly, so the latency of starting an instance does not further worsen the user experience

Magento 1.x FPC Plugins

  1. Free Lesti FPC : https://github.com/GordonLesti/Lesti_Fpc. Use this guide to install
  2. Magento connect search results for FPC

Should FPC be a part of scaling strategy?

FPC is concerned with speed. Scale is concerned with the process that helps the site add resources when needed. FPC helps in scalability by reducing the use of resources per hit to the website under certain conditions. It changes the dynamics of when and how many resources will be needed.

FPC has to be considered to be part of scaling strategy – but as one of many parts.

Read part 2 where we discuss other Magento caches.

Read the overview of our Magento scaling series here.

We can analyze your site for free

Schedule a call

Not happy with your website performance and want an expert to look at it?

  • We will analyze your site using public information.
  • We will ask you to give us a 1 day web server log file.
  • We will try to identify what steps if any you should take to improve your sites performance goals.

Securing your Magento site

Executive Summary

Owning a Magento website means you have a resource you own and control on the internet. Keeping it secure in all aspects is a responsibility. You need to ensure

  • your business data is secure.
  • your customer data and privacy is secured.
  • your customer’s computers are protected when accessing your website.

Your privacy policy indicates your desire to keep your customers’ data secure. You would intentionally not violate the policy but a cyber attack on your infrastructure can cause a violation.

It is not required to be a famous to be in the eyes of the attacker – most attackers today prowl the internet looking for vulnerable sites and automatically get to work. New types of attacks are being designed and hence keeping a site protected is a continuous battle.

As a person ultimately responsible for your website, you need to periodically review your security processes and see they are enhanced for new vulnerabilities.

This article has an overview of the aspects of security you need to worry about. In addition it goes in depth as well to make it actionable.

Aspects of security infrastructure for a Magento Website

  • Protection of the web edge infrastructure. Ensure you keep the bad guys out and the good guys in. Using a Web Application Firewall (WAF) with appropriate configuration will help.
  • Server protection. Ensure access to the server restricted, preferrably without a password, limit port access and protect files and folders with permissions that are just needed. Also ensure the server has the latest security patches installed. Need to know access should enforced. Code updates should be automated. A staging site should be used for updates from vendors. Plugins should be acquired from known and reputable vendors.
  • Code protection. Ensure your code is patched with the latest Magento (or any platform and plugins). In addition train developers to use safe practices to not leave holes.
  • User data protection in transit. Ensure https access to the entire site so all access is secured against in transit hacking as users use your website. Ensure all admin access is restricted either with dual password or two factor authentication or IP restricted.
  • Review all access periodically Change passwords and admin URLs regularly.
  • Run vulnerability scanners Vulnerability scanners are available for testing many aspects of security. Blackbox scanners such as Trustwave, Secure Works, help in checking if they can find external vulnerabilities. Static analyzers such as Fortify scan the code to find code level vulnerabilities.

Protecting Web Edge Infrastructure

The key role of this protection is to keep the bad guys out and allow the traffic you want in.

At the minimum configure nginx or apache (your webserver) to

  • rate limit hits from a single IP address
    However, if you get legitimate hits from behind a firewall these may generate false positives. We recommend to log the IPs that get restricted and review. Whitelist the ones that are ok.
  • rate limit admin server access if you cannot IP limit the access to admin
    Many bots are trying passwords to get acsess to the admin panel
  • Have a mechanism in place to block IPs.

As with any protection of this nature, detecting bad is a key. OWASP is an online community which has come up with a core rule set (CRS). A Magento site needs a WAF protection either on the nginx server level or with an external service such as Webscale Networks, Cloudflare, Sucuri.
The OWASP ModSecurity Core Rule Set (CRS) provide protections against the following category of attacks.

  • SQL Injection (SQLi)
  • Cross Site Scripting (XSS)
  • Local File Inclusion (LFI)
  • Remote File Inclusion (RFI)
  • Remote Code Execution (RCE)
  • PHP Code Injection
  • Metadata/Error Leakages
  • Project Honey Pot Blacklist
  • GeoIP Country Blocking, etc
System Admin Note

mod_security is a open source project available for apache and nginx. Installation is not difficult. However, you need to tune the rules to your environment. Some rule tuning may require developer or marketing input.

Protect the admin panel login

  • Change the admin URL periodically. Do not use the same name as in test or development environemnts
  • IP restrict admin access to only your known IPs
  • Put a simple http access in front of the admin URL – so there are 2 levels of usernames and passwords and it makes it non standard for automatic hacks
  • 2 factor authentication to admin panel – may need a plugin
  • Change all admin user passwords frequently
  • Monitor log file for frequent unsuccessful attempts to login

Server Protection

A hacked or broken in server – popularly shown in movies – gives access to your server to the hacker.  Web prowlers are continuously trying to see what ports are open in the server and what access they can get. Being a very traditional form of hacking, it is highly automated. By the very same count, protecting a linux server is not very difficult either. Follow some guidelines and review process.

  • Keep only known ports open and limit which IPs can access these ports
  • Disable any form of password access (like ftp, ssh). Use keys instead.
  • Keep server patched with latest security patches
  • Run only services that you need. For example never run ftp as it gives password based access to the server.
  • Follow very strict rules on file / folder permissions as well as linux groups and users
  • Periodically scan servers for viruses signatures
  • Periodically review access keys
  • Automate routine processes and restrict with keys including code update
  • If a db server is a separate server further restrict access to this server . If using a autoscale architecture, further restrict access to app servers as well.
  • Allow no exceptions – even for a critical fix do not give access to a developer for example.

The following system admin note has technical details for linux system administrators to execute the strategies above

System Admin Note

Depending on the configuration we recommend the following incoming ports to be open :

On the only app server without load balancer
Port of external interface Protocol / usage
80 http, main site open to the world if site not fully secure.
443 https, main site open to the world
22 ssh, possibly open only to specific IP addresses
On the db server
Interface / port Protocol / usage
External / 22 Only if you must to restricted IP addresses
Internal / 6603 Restricted to internal IPs that run app servers
Internal / 22 Ssh access from the other servers
Constellation of app servers, a db server and a nginx load balancer
Host / Interface / port Protocol / usage
Load balancer / external / 80, 443 http and https open to the world
Load balancer / external / 22 ssh, possibly open only to specific IP addresses
Load balancer / internal /2049 Nfs server
Db server / internal / 6603 Mysql, for app server access
App server / internal /1110 Nfs, for file access across all app servers

Note : If using a load balancer device or service, port 22 access should be given to only the main app server, possibly the nfs host.

System Admin Note : ssh configuration for secure access

ssh needs to be configured to a more secure mode. (Refer linux man page)

  • Change default port to non standard port (say) 2020. This will thwart an obvious attempt to breakin. However, an attacker can find the port using a port scan.
  • Disable root ssh access. This is crucial. This means that someone can never login to the server directly as root.
  • Disable password access. Weak passwords are the most common way a breakin is successful. It is common that a server being attacked will have multiple password generators trying to access the server.
  • Enable only key access. With this setting only a private key holder who has a public key stored in the .ssh/authorized_keys file will be allowed access.

/etc/ssh/sshd_config :

...
# set port to non standard 2020
Port 2020
...

# Authentication:
#LoginGraceTime 2m
#PermitRootLogin yes
PermitRootLogin no

# To disable tunneled clear text passwords, change to no here!
#PasswordAuthentication yes
#PermitEmptyPasswords no
PasswordAuthentication no

# GSSAPI options
#GSSAPIAuthentication no
GSSAPIAuthentication yes

Note : First enable key access, exchange keys, ensure ssh to keys works before disabling password

Note : When changing ssh configuration ensure a ssh session to the root server is separately running. This will prevent from being locked out of the server when changing ssh configuration.

System Admin Note : What users are needed

We recommend the following users

  • Login user
    • This user has sudo access (without password)
    • Ssh key exchanged
    • Admin logs in as this user
  • Site user (production)
    • Website will be hosted under this users (/home/production/www NOT /var/www/html)
    • This user may have ssh access to deploy code
    • This user does not have sudo access
    • If using multiple app servers, this users id will be shared across all app servers
  • Web server user (apache)
    • Nginx and php processes will be run as this user
System Admin Note : File and Directory Permission settings
  • Only media/pub and var folders should have 770 permissions
  • All other folders in should have 750
  • All files under var should have 660
  • All other files should have 640
  • production user should be in apache group

Protect your code

  • Ensure Magento code is patched to the latest. Subscribe to Magento updates and setup process to check for patch releases on a monthly basis.
  • When the plugin vendor. Many simple looking community plugins can be weak on security and leave the site prone to attacks.
  • Train developers to use safe practices.
  • Keep the code repository (svn / git) access secure and passwordless for developer access.
  • Ensure external respository provider web access is secure and with 2-factor authentication.
  • Ensure backup servers are protected with access to write only given to automatic backup process

User Data Protection in Transit

Customer data in transit is the protection of data sent and received from your website to their browser. Internet connections to websites are not direct – the data hops from server to server in between. A server compromised on the way can tap into your data without anyone knowing it.

  • Using https for the complete site is an obvious requirement. Using higher security options in https is even better.
  • Emails sent should not contain sensitive data. A recent Magento patch fixed such an issue – passwords were earlier sent in emails.

Review all access periodically

Setup a security task force that reviews and reports on security once a month.
The charter of the task force would be evaluate the following

  • Who has access the the server and do they yet require this access. The access may be via keys and when a employee or contractor leaves, the key should be removed
  • Who has access to development environment. Have required access been withdrawn appropriately.
  • Who has access to version control systems. Have required access been withdrawn appropriately.
  • Admin access Are all admin users yet needing access?
  • Were admin passwords reset at agreed frequency?
  • Magento Patch status
  • OS patch and update status
  • External security scan (if arranged) status
  • Action taken report for the last period on any security issues
  • Backup and restore process review to ensure data and code is appropriately backed up

Conclusion

Protecting a eCommerce asset like a website in these times of automated hacks is a major challenge. Periodic reviews is a must as the attacks and defenses are quickly evolving. Being a small site is no excuse – the attackers, typically robots do not care.

Watch our webinar on performance and scaling in Magento

Its free!

Using analogy to vehicular traffic we explain performance and scaling in Magento.
Key takeaways

  • Know how to compare hosting options
  • Importance of good code
  • How to scale
  • Tuning Magento

We can analyze your site for free

Schedule a call

Not happy with your website performance and want an expert to look at it?

  • We will analyze your site using public information.
  • We will ask you to give us a 1 day web server log file.
  • We will try to identify what steps if any you should take to improve your sites performance goals.

Transactional Email Deliverability of your Magento Store

The internet started with email and email continues to be a very important means of communication for a Magento site. Emails that are sent directly in relation to an activity on the website such as a registration or purchase – are called transactional emails. Transactional emails occupy a different place in the email marketing category and are governed by less strict rules worldwide.

Importance of Transactional Email for Magento stores

Deliverability of transactional emails is a key to customer satisfaction and loyalty. If a customer requesting a password reset does not get an email in time in the the inbox would result in possibly loosing the customer.

Why is Transaction Email Deliverability a problem?

If email is fundamental to internet why is email deliverability a issue?
In order to protect email infrastructure from spammers, many services created spam lists – IP addresses that have previously been used to spam and are blacklisted. There is no single authority with such lists, leading to the deliverability problem. The IP you get assigned by your cloud provider may not be the clean in all the lists and it is too difficult to find and much more difficult to get cleared. Transactional email providers come to the rescue – their business is to increase deliverbility.

What can be classified by a transactional email?

Newsletters, even opted in, do not classify as transactional email. If you do not send newsletters through Magento, all emails that go out will be transactional.
However, you maybe crossing the line if you send out upsell / crosssell in your email order confirmation for example.

Third party providers

There are many providers and it is a very competitive market a search on google for transactional email will get you many results and comparisons.
Here are a few recently updated comparisons

How to get started with transactional email for Magento?

  • Check if you have an existing subscription to a transactional email service – indirectly. For example, if you are hosted on softlayer, you may get sendgrid credits. If you use Mailchimp to send newsletters, you may have mandrill credits.
  • Signup for the service – most of them have a free tier
  • We think having a Magento plugin is not a requirement if you are self hosted on a VPC or better. Read on, we think using the SMTP service is better option than a plugin or code integration.

Before you install the Magento plugin, read this!

  1. Plugins add a drag to the system – like it or not, each plugin you add, contributes to a slowdown of Magento due to the architecture. Many plugin authors are guilty of passing in additional features into the plugin.
  2. Plugins for transactional emails are “inline” i.e. the email is sent while the purchaser is waiting for a confirmation. That is a dependency on an external system. Occasionally the service may have slowed down and that delay will be added to the wait for the customer.
  3. Local email systems are automatically configured to retry in case of upstream infrastructure failure. If configured at the system level, the email is sent only to the local system, form where it goes into a queue which the systems email service will relay. If for some reason the remote email service is not responding, the queue will remain active and a retry will be attempted after sometime.
  4. Do not select the service based on the availability of a Magento plugin – that is the least important part of the evaluation

How to setup

All providers use TLS for SMTP communication on port 587. It will be required to open port 587 in the firewall to ensure emails be sent out.

Note : Some cloud services notably Google Cloud Platform does not allow communication on ports 25 or 587. For such services you need to use a transactional email service provider that allows SMTP communication over a non standard port.

Use the guide below to get your username and password and then use the steps to setup postfix

For Mandrill

Username : mandrill username
Password : Get Key (Dashboard->Get API Keys->NewAPI Key)
Domain : smtp.mandrillapp.com

For Amazon SES

Username & Password : https://docs.aws.amazon.com/ses/latest/DeveloperGuide/smtp-credentials.html
Domain (as per region, this is for US West) : [email-smtp.us-west-2.amazonaws.com]:
Domain verification : http://docs.aws.amazon.com/ses/latest/DeveloperGuide/verify-domain-procedure.html

For Sendgrid :

Get this certificate and store in /etc/postfix/ssl
wget https://certs.godaddy.com/repository/gd_bundle-g2-g1.crt
username : sendgrid account username
password : sendgrid account password

postfix setup

    1. Ensure SASL authentication package like cyrus is installed.
    2. Ensure you have a FQDN (Fully Qualified Domain Name). The command hostname -f should report a host.domain type of name. It is preferred you use the domain you are sending from
    3. Ensure postfix is installed (and sendmail is not)
    4. Edit /etc/postfix/sasl_passwd and enter SMTP_DOMAIN, username and password as per the transactional email platform.
    5. chmod 600 /etc/postfix/sasl_passwd
    6. psotmap /etc/postfix/sasl_passswd
    7. edit /etc/postfix/main.cf and add the following to the bottom of the file
# enable SASL authentication
smtp_sasl_auth_enable = yes
# tell Postfix where the credentials are stored
smtp_sasl_password_maps = hash:/etc/postfix/sasl_passwd
smtp_sasl_security_options = noanonymous
# use STARTTLS for encryption
relayhost =<refer platform info>
## For mandrill
smtp_use_tls = no
## For sendgrid
smtpd_tls_security_level = may
smtp_tls_CAfile = /etc/postfix/ssl/gd_bundle-g2-g1.crt
## For Anazon SES
smtp_use_tls = yes
smtp_tls_security_level = encrypt
smtp_tls_note_starttls_offer = yes
  1. restart the postfix service
  2. test by sending an email and watching the result in /var/log/maillog

5 Questions to ask before selecting a Magento Managed Hosting Provider

5 Questions to ask your Magento Managed Service Provider

  1. Does the managed provider help you with setup of a CDN?
    Content Delivery Networks or CDN is the best way to get a boost of your page load time. A CDN takes away the load on your server by caching commonly used files that do not change frequently. Such files include css, js and image files. For a Magento store, images constitute a large amount of time and bandwidth of a page.
    Large CDNs have an additional advantage of having multiple end points or servers near your customers. This reduces latency and images tend to appear immediately.
    The good news is that CDNs are now a commodity – they are aggressively priced with some large CDNs having unlimited free plans.
    The only catch is that you need to “invalidate” or “purge” cached content from the CDN if you were to change the same file on the live server. Until automated, this procedure can be a burden.
    Your Magento Managed Hosting Provider should provide you with practical and complete options for a real CDNs, including help in configuring your Magento site to use your selection
  2. Does the provider have a disaster recovery plan – not just a backup?
    Backup is as good as the restore process.
    Yes! Just having backup is a essential but not sufficient.
    If your site were to be hit by a disaster, how would you be able to bring up your site on another server? Would all services start as desired?
    Funnily we have seen many magento servers do not come back to full service on a simple reboot, far less a disaster.
    Moreover, backups should be taken “off-site” – not within the same zone or provider you currently have.
    And your Managed Magento Hosting provider should give you a written disaster recovery plan based on different types of scenarios and that you can put into motion if the time arises.
  3. Does your provider regularly evaluate and tune for optimal cache performance?
    Magento site performance is quite often driven by the condition of its various caches. Caches store results in RAM vs either on disk or having to compute the results. As a quick look at the Cache Management page on Magento’s admin panel shows there are many application level caches. In addition, mysql stores query results in a in-memory cache, however requires the cache to be sized and enabled in a configuration file. The php code can be stored in a opcode cache in the php server. This also requires memory allocation and enablement.
    Since no two magento stores are the same, there is no standard configuration. It requires observation and tuning.
    Your Magento Managed Hosting provider should give you information and tune the cache, possibly even advising you with server sizing in the process.
  4. What scalability plan has the provider made for you?
    As hits increase, it may be required for your site to scale. But, how will you even know the time has come?
    In todays world, a slowdown of your Magento site is equivalent of downtime. Consider this – each drop of a second in your load time results in drop of visitors.
    If traffic spurted due to a online or offline ad, how would you know the server capacity needed to handle the spurt? Do you need autoscaling and loadbalancing technologies?
    Your Magento Managed Service provider should be on your side in advising
  5. Do you know the deliverability of your transactional emails?
    An eCommerce site sends many emails – new user registration, password change, order confirmation, abandoned cart. These emails need high rate of deliverability. If emails are directly delivered from the host, there are 2 problems – It is not known which emails are delivered and which are not. amd secondly mail servers are constantly upgrading their needs of the senders ability to send, resulting in increased non deliverabilityMany services are available to handle such emails – Amazon SES, Mandrill, Sendgrid to name a few. Each give a varying level of granularity of reports and pricing plans to support your volume.Most of these services have Magento Plugins, some premium. However, a email plugin will attempt to send the email “inline” – for example the order email is sent while the customer is waiting for the success page. A better solution is to send the email to the local system queue, and let the system process the queue by sending the email to the transaction email provider.

    Your Managed Magento Service provider should be able to configure your server’s email system to use your chosen transaction email provider?

 

Think before you execute

How system admins can use planning and checklists to improve quality

Introduction

Sounds like a right statement to make an obvious but in this article I will look at how it applies practically to a system admin

It starts at the beginning

When a task (in the form of a incident request or a service request) comes to a system admin, this is when the principle starts – how deep do you think. My observation is that a natural thing to do is to take a superficial view of the task and then either a google search is performed or a quick think is done to find commands will resolve the task.

The fundamental issue with this approach is that it is solution centric not problem centric. i.e. not enough time is spent in understanding the problem and coming up with a solution. Another problem with the approach is that there is a rush to execute as being in front of the system and executing is what will make things work.

Thinking before executing addresses both of these

  • Has the problem well understood?
    If the problem is not well understood, the advice on google would not be understood and only coincidentally will you solve the problem. Quite likely the problem will resurface. If on the other hand the problem is well understood, the solution quite often will be obvious or the advice on the internet will make sense.
  • Once the problem is understood, a detailed plan is needed.
    Most tickets require multiple steps to solve. If each step is not well documented, you are forced to plan while executing – in our experience very difficult.
  • Our approach is to come up with a detailed plan before executing. It is a modified and more comprehensive checklist. We have SOP (Standard Operating Procedures) for common tasks, but this works even if we do not have a SOP.
    We use a spread sheet.

    • At the top we have information we will need before we start executing. This includes information such as IP (internal or external) of the systems we wiil use, passwords if any that we will need, etc.
      Here is a sample of a sheet we use for rehosting.
      info_sheet
    • Below that is a table that is both a planning as well as a execution tool. The table has the following headers :

      • Function – overall heading for a set of commands
      • Step details / reference – SOP reference and commands to be executed
      • Command – the actual command that will be executed. Note the commands are created before they are executed.
      • Status – marked upon execution

    audit_trail

    Conclusion

    Checklists help being purposeful in execution and the result is a audit trail that can be used to debug issues later on.

minify css js offline

Minify css and js for Magento as a build process

How to improve page load speed without server overhead so you can serve more pages.

Need for a build?

Magento being written in php, an interpretive language, the need to build is not essential for deployment. Moreover, since many small store owners are not technical or do not have a full time technical team, solutions that just work inline are preferred. For example, using plugins for css and js minify, or transfer to CDN as and when needed inline, or even use Google’s excellent pagespeed plugin.
Unfortunately, each one of these inline steps though improve page load speeds, result in a ever-so-slight slow down of the server each time. On a high traffic site, this results in inconsistent performance and user experience. We even zip the static content in .gz files so the web server (nginx in our case) does not have to spend a few milliseconds each time – assuming ofcourse you do not have a CDN that can zip.

Grunt, the task builder

We have used Grunt ( http://gruntjs.com/) as a task builder. Grunt is a popular javascript task builder written in nodejs. We use grunt to do many release oriented activities – packaging a release, installing a release, minify css, js, etc. In this article – a first of a series we plan – we will go through the process of installation of grunt and offer a solution to minify js and css flles as well as optimize images in the skin directory.

Installing grunt

  • Install nodejs and npm
    curl -sL https://raw.githubusercontent.com/nodesource/distributions/master/rpm/setup_4.x | sudo bash -
    sudo yum install nodejs npm
  • Install grunt
    sudo npm install -g grunt-cli
  • Download our Gruntfile.js and related code
    mkdir /scripts
    cd /scripts
    git clone https://github.com/luroconnect/gruntformagento.git
    cp –r gruntformagento/src/* .

Run grunt to minify css and js (and more)


cd /scripts
grunt optimize
Typical output :
Running "copy:skin" (copy) task
Created 229 directories, copied 1769 files

Running "copy:js" (copy) task
Created 197 directories, copied 893 files

Running "uglify:skin" (uglify) task
>> 30 files created.

Running "uglify:js" (uglify) task
>> 301 files created.

Running "cssmin:skin" (cssmin) task
>> 149 files created. 2.34 MB ? 1.77 MB

Running "imagemin:skin" (imagemin) task
Minified 1412 images (saved 400.26 kB)

Running "compress:skinjs" (compress) task
>> Compressed 32 files.

Running "compress:skincss" (compress) task
>> Compressed 155 files.

Running "compress:js" (compress) task
>> Compressed 332 files.

Done, without errors.

What is done by optimize :

  • Create 2 directories skin.min and js.min initially with identical content as skin and js respectively
  • Run the minifyfor css and js on the skin.min and js.min directories. .min.js files are not minified.
  • Run image optimizer on skin (png,jpeg)
  • Generate .gz gzipped files – for static delivery of gzip. See note below on nginx configuration.

Update Magento Web URLs

Update the Magento unsecure and secure skin and js URLs to point to skin.min and js.min respectively where minified content is kept.

Update nginx configuration

nginx configuration to load .gz static content if it exists

#/* static content can have expiry set to long */
location ~* \.(jpg|jpeg|gif|png|css|js|ico|swf|woff|woff2|svg|TTF)$ {
gzip_static on;
#access_log off;
log_not_found off;
expires 360d;
}

Gzip_static on tells nginx to serve the .gz file of a static file it exists rather than nginx compressing it.

Run optimizer on images in media/wysiwyg

grunt media
copy optimized images from media.min/wysiwyg to media/wysiwyg manually

Conclusion

We firmly believe in creating a documented release process. And Grunt with our Gruntfile.js goes a long way in making this a reality. In this article we have introduced the minfication, image optimization and gzip compression of static files. Try it and let us know if you have any suggestions.

This script can be run directly on the live server, but make sure you do it at a low traffic time.

How to evaluate a hosting service for Magento?

Introduction

With so many choices in hosting service it is difficult to decide what to use for hosting of a production Magento server. The first question most commonly asked is – should one use a physical (or bare metal) server or a virtual machine (or cloud server)? Many people think the obvious answer is Virtual Machines. After all this is the way the world is thinking and all cannot be wrong. But, let us take a closer look for Magento hosting. Magento is typified by 2 factors – high CPU utilization for php interpretation and mysql performance limited by both CPU and disk writes typically for operations such as reindexing and high order volumes.
 
In this article we talk about these factors and how to help evaluate your preferred platform.

CPU speed

Most cloud service providers like AWS, Azure or Softlayer do not define what you get when you ask for say 2GHz CPU – do you get 100% of the power of the CPU or is the CPU shared?
Hypervisor technology used to create Virtual Machines, easily allows one to overcommit CPU. Overcommit means that the number of CPUs on a physical hardware can be lesser than the number of CPUs in all the VMs running on that server. Emperical studies such as here(https://www.datadoghq.com/the-top-5-ways-to-improve-your-aws-ec2-performance/) have proven that overcommit does happen.
 
This means that test results and live site performance are subject to current usage of your neighbours.
In addition, VMs need hypervisors to run beneath the VM – this adds to overhead as well as latency.
 
In a study by Forrester a case was made for using bare metal infrastructure, now that some leading cloud providers have made it easy to spin a new bare metal server.
 
On a physical or bare metal server, processor power, disk space, memory and other system resources are not shared with noisy neighbours so there is high correlation between test results and live performance.

Disk speed

High speed disks vs network access disks. A SSD or a 15K SAS for example can give the boost you need when upload products and reindex or you get many orders such as during a holiday season.

  • Unthrottled performance. When you buy guaranteed IOPs for example, what happens when you exceed the limit? What if you have a burst need that exceeds the provisioned IOPs?
  • Use a locally attached SSD where available vs a network storage. Local storage will be faster than network storage by orders of magnitude.
  • Consider RAID configurations for better performance

Here is some raw disk performance metrics1 we got when testing some popular hosting providers

Provider	Softlayer  Softlayer  Softlayer	Azure	Azure   AWS  Ukfast    DO
Type		Physical   Physical   VM	VM-D2V2	VM-D2V2 EC2  Physical  VM
Disk		Magnetic   SSD	      Local	Default	Blob    EBS  SSD       Default
unbufferred 	144	   451	      200	20.6	65      57   130       150
bufferred 	155	   602	      217	1000	65      263  1500      300

Notes

  • All data in MB/sec reported by the linux dd command/li>
  • Unbuffered dd if=/dev/zero of=/tmp/test bs=256M count=4 oflag=dsync
  • bufferred dd if=/dev/zero of=/tmp/test bs=256M count=4
  • Digital Ocean performance varied in a wide range 55 MB/s to 150MB/s

This simple test shows relative disk performance on various platforms without a RAID configuration.
(Refer Roman’s wiki)

Which hosting service is preferred?

  • Physical (or bare metal) servers give the best “performance” of Magento production hosting. The key reason is that they scale in a predictable way when traffic peaks.
  • You can scale horizontally with cloud servers – many hosting providers now give the option to mix VMs and bare metal on the same subnet. >We prefer such vendors.
  • Not all VMs and servers are made equal – test before you commit.

Nginx or Apache : Best server for Magento

Introduction

Apache server has been for years been the default http server linux hosts use. However, recently there have been many newer “lighter” http servers. This blog article focuses on Magento hosting. Magento is a php based web eCommerce framework. Nginx requires php-fpm to process php requests. So, this comparison is really apache vs nginx + php-fpm. Apache offers MPM (Multi-Processing Module) configurations pre-fork, worker and event. In this discussion we will use the “event” MPM.
This discussion is very popular. Examples include this. We focus on Magento here.

Key Differences between apache and nginx

There are some differences architecturally that make nginx look slightly better for Magento hosting.
(more…)

Secure access to a Magento server

Today the biggest threat to your Magento production server are external threats – of being hacked. While you may not be a high value target, hackers run crawlers on the internet to discover servers with weak security and attack. In this article we discuss secure access to a Magento server. An OS level attack if successful can only be fully repelled by re-imaging the server. But preventing a OS level attack is easier than you think – if you follow some simple guidelines.

A Magento production server should have restricted access for all. Insecure, password based access should be disabled. If more than one server is used in a constellation, ssh access to the setup should be restricted to only one server.
(more…)

Will HTTP/2 help my Magento Store?

Introduction

HTTP/2 is the new http standard. Most browsers, including Chrome, Opera, Firefox, Internet Explorer 11, Safari, Amazon Silk and Microsoft Edge support HTTP/2. Nginx and other web servers too support HTTP/2. Magento 1.x and Magento 2 work very well with HTTP/2. In this article we see the benefit of HTTP/2 and give some configuration recommendations for Magento store owners or administrators.

What are key differences of HTTP/2 ?

At a high level, HTTP/2:

  • is binary, instead of textual
  • is fully multiplexed, instead of ordered and blocking
  • can therefore use one connection for parallelism
  • uses header compression to reduce overhead
  • allows servers to “push” responses proactively into client caches (more…)