MageHost - Reducing load with simple optimisations

Modified on Tue, 20 Dec 2022 at 02:29 PM

Bots

Not every environment gets its high load from real traffic; sometimes bots are a large part of it.

We take a lot of measures on our side, such as a global block list and a block list based on https://www.abuseipdb.com/. However, some optimisations have to happen on the application side. Below are the most important ones.


Rate limiting SEO bots

robots.txt is your friend when it comes to optimising how bots approach your web application.

For bots like Bing and Yandex, you can set a crawl-delay directive. When you do this, keep in mind that you still want the bots to be able to crawl your whole website in a reasonably short time. You can calculate this yourself: the example below limits a bot to at most 8640 pages a day (86400/$crawl-delay).

User-agent: *
Crawl-delay: 10

Do not forget to validate your robots.txt after making the change.
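
The calculation above can be sketched in Python. The helper names pages_per_day and max_crawl_delay are illustrative and not part of any tool, and the shop size of 50,000 URLs is a made-up figure. The parsing step uses urllib.robotparser from the standard library as a quick sanity check that the directive is read as intended; it does not replace a proper robots.txt validator.

```python
from urllib.robotparser import RobotFileParser

SECONDS_PER_DAY = 86400


def pages_per_day(crawl_delay: float) -> int:
    """Upper bound on pages one bot can fetch per day at this crawl-delay."""
    return int(SECONDS_PER_DAY // crawl_delay)


def max_crawl_delay(total_pages: int, days: float) -> float:
    """Largest crawl-delay (in seconds) that still lets a bot fetch
    total_pages within the given number of days."""
    return SECONDS_PER_DAY * days / total_pages


# The robots.txt from the example above.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 10
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# "bingbot" has no group of its own here, so the "*" group applies.
delay = parser.crawl_delay("bingbot")
print(delay, pages_per_day(delay))

# Hypothetical shop with 50,000 URLs that should be crawlable within 7 days.
print(max_crawl_delay(50_000, 7))
```

If the second number is lower than the crawl-delay you had in mind, a bot that respects the directive will need longer than the chosen period to see every page.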


An important sidenote:

crawl-delay is not an official parameter and as such not all SEO bots respect this setting.

Search engines like Baidu or Google have their own settings per website. You can check their respective help centers for more information.


Blocking non-desired (SEO) bots

Some customers do not care about their rankings and visibility in certain search engines and want to block those bots altogether. Malicious bots, or bots that only exist to scrape your content, also tend to ignore robots.txt, so it is handy to be able to block them as well. You can do this based on the user agent in the .htaccess of your application.

# Replace $USERAGENT with (part of) the user agent you want to block
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} "$USERAGENT"
RewriteRule ^ - [F,L]

Before applying this, please make sure to test the effect on a dev server or with a .htaccess tester.
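
As a rough pre-check before you test on a dev server, the matching can be approximated in Python. This is only a sketch: Apache uses PCRE while Python uses its own re module, so complex patterns may behave differently. The pattern below and the helper name would_block are assumptions chosen for illustration.

```python
import re


def would_block(user_agent: str, pattern: str) -> bool:
    """Approximate an unanchored, case-sensitive RewriteCond regex match."""
    return re.search(pattern, user_agent) is not None


# Hypothetical pattern; replace it with the bots you actually want to block.
PATTERN = r"AhrefsBot|SemrushBot"

SAMPLES = [
    "Mozilla/5.0 (compatible; AhrefsBot/7.0; +http://ahrefs.com/robot/)",
    "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36",
]

for ua in SAMPLES:
    # "403" mirrors the [F] flag, which answers with 403 Forbidden.
    print("403" if would_block(ua, PATTERN) else "ok ", ua)
```

Running the pattern against a handful of user agents from your own access logs helps to spot a pattern that is too broad and would lock out real visitors.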


Layered navigation

Layered navigation can not only be bad for your SEO ranking (duplicate content), it can also cause more load because SEO bots crawl every option. These filtered pages are mostly uncached, so each of them has to be rendered by the application. If your layered navigation is built with links, consider adding rel="nofollow" to those links. A lot of custom layered navigation extensions also avoid links altogether by using JavaScript instead of anchors.
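
To get a feeling for how many extra pages "every option" can mean, here is a small sketch. It assumes that each filter is either unset or set to exactly one option; multi-select filters would produce far more combinations. The function name and the filter sizes are made up for illustration.

```python
from math import prod
from typing import Sequence


def filter_url_count(options_per_filter: Sequence[int]) -> int:
    """Distinct filtered listing URLs for one category, assuming every
    filter is either unset or set to exactly one of its options."""
    # Each filter contributes (options + 1) states; subtract the single
    # state in which no filter is set, which is the plain category page.
    return prod(n + 1 for n in options_per_filter) - 1


# Hypothetical category with three filters: 10 colours, 8 sizes, 5 brands.
print(filter_url_count([10, 8, 5]))
```

Multiply the result by the number of categories to estimate how many uncached URLs a bot can find by following filter links.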


Caching bots

Most e-commerce applications set a vary cookie on the second hit on any page. This can be leveraged for many purposes (currency, language switching, customer segmentation, ...). This cookie is included in the vcl_hash section of the default VCL. Most (SEO) bots do not support cookies, so they always hit the application instead of your full page cache: regular visitors only warm the cache entries whose hash includes this cookie, never the entry without it. This can be solved in many ways, both in the code and in the VCL. It is a useful optimisation, but it should be handled with care because it can break the intended segmentation.
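
The effect can be illustrated with a toy model in Python. This is not VCL and not how Varnish is implemented; it only mimics a cache whose key is the URL plus the vary cookie value. The class, the function and the "default-segment" value are assumptions made for the sketch.

```python
from typing import Dict, Optional, Tuple

CacheKey = Tuple[str, Optional[str]]


class PageCache:
    """Toy full page cache keyed on URL plus the vary cookie value."""

    def __init__(self) -> None:
        self.store: Dict[CacheKey, str] = {}

    def fetch(self, url: str, vary: Optional[str]) -> str:
        key = (url, vary)
        if key in self.store:
            return "HIT"
        self.store[key] = "rendered page"  # the application renders the page
        return "MISS"


def normalise_vary(vary: Optional[str], default_segment: str) -> str:
    """Treat a request without a vary cookie as the default segment."""
    return vary if vary is not None else default_segment


cache = PageCache()
DEFAULT = "default-segment"  # hypothetical value of the vary cookie

cache.fetch("/category", DEFAULT)  # a visitor with the cookie warms the cache
print(cache.fetch("/category", None))  # bot without cookie: different key
print(cache.fetch("/category", normalise_vary(None, DEFAULT)))  # normalised key
```

Normalising like this is only safe when a cookieless request really should see the default segment, which is exactly why the optimisation has to be handled with care.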


Indexing

Reducing the fields you index for search

A lot of customers want to index as many attributes and as much text as possible. This can lead to a very large index, and indexing as much as possible does not necessarily improve your search results. If you have a slow search engine or long indexing times, re-evaluate what you index. This is an easy but often overlooked improvement.


Disabling flat tables

According to the Magento 2 help center, it is advised to disable flat tables starting from Magento 2.1. See: https://support.magento.com/hc/en-us/articles/360034631192


So far we have seen varying results: some customers with a lot of attributes saw a decline in frontend performance. Please test this on your staging environment before changing it on production.


If you do leave flat tables enabled, consider analysing your catalog and try to decrease the number of indexed attributes.


MySQL indexes

Sometimes indexing and the frontend are slowed down by badly optimised queries or missing MySQL indexes. You can check how many rows your query is expected to examine by putting EXPLAIN in front of your SELECT query.
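
The idea can be tried out without touching a production database. The sketch below uses SQLite from the Python standard library as a stand-in, purely because it is self-contained; SQLite's EXPLAIN QUERY PLAN output looks different from MySQL's EXPLAIN, but the before and after comparison works the same way. The table, column and index names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 100, i * 1.5) for i in range(1000)],
)

QUERY = "SELECT * FROM orders WHERE customer_id = 42"


def plan(connection: sqlite3.Connection, query: str) -> str:
    """Return the human-readable query plan (last column of each row)."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + query).fetchall()
    return "; ".join(row[-1] for row in rows)


before = plan(conn, QUERY)  # no index yet: the whole table is scanned
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
after = plan(conn, QUERY)   # with the index: a targeted lookup

print(before)
print(after)
```

In MySQL you would look at the type, key and rows columns of the EXPLAIN output instead: a full table scan with a high row estimate on a frequently executed query is a typical sign of a missing index.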


Cron optimisation

Timing

Make sure you analyse your traffic and run heavier crons, such as product imports, at moments with as little traffic as possible. The graphs for this can be provided by our support or consulted in Google Analytics.
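
Once you have hourly traffic numbers, picking the slot is simple. The sketch below assumes you exported requests per hour yourself; the numbers, the helper names and the script path are all made up, and the output is a plain system crontab line rather than an application-specific cron configuration.

```python
from typing import Dict


def quietest_hour(requests_per_hour: Dict[int, int]) -> int:
    """Hour of the day (0-23) with the fewest requests; earliest hour wins ties."""
    return min(requests_per_hour, key=lambda hour: (requests_per_hour[hour], hour))


def cron_line(hour: int, command: str) -> str:
    """Crontab entry that runs the command daily at minute 0 of the given hour."""
    return f"0 {hour} * * * {command}"


# Hypothetical average requests per hour, for example from an analytics export.
TRAFFIC = {
    0: 1200, 1: 800, 2: 450, 3: 300, 4: 350, 5: 700,
    6: 1500, 12: 5200, 18: 6100, 21: 4800, 23: 2000,
}

hour = quietest_hour(TRAFFIC)
# Placeholder path; point this at your own import job.
print(cron_line(hour, "/path/to/product-import.sh"))
```

Keep in mind that the server's time zone may differ from the one used in your analytics, so check both before scheduling.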
