Kitt Cache Warmup Crawler for Shopware 6 (LCSW)

The fastest Cache Crawler for successful eCommerce


  • Up to 10 times faster than the built-in Cache Warmer
  • Dynamic Load Control
  • GUI Version and Version for Command Line Interface (Cron)
  • Tuneup Settings for Power Users
  • Cache Inventory
  • Route Name Selection
Note: Kitt Cache Crawler is not a Shopware extension. It is a standalone application that runs in parallel to Shopware, so do not try to install it as an extension in Shopware. This will fail! Download the .zip file, unzip it locally, and read the install notes carefully.


Faster and flexible Cache Warmup for Shopware 6

With the release of Shopware 6, a lot has changed in terms of caching. In version 5 you could still configure the cache behavior fairly flexibly; in version 6 the only option is to cache either everything or nothing. The same applies to cache warmup: there is no way to warm up the cache for a specific URL or URL group.

The Kitt Cache Crawler offers a way out of this dilemma. Kitt is a standalone application that is installed in parallel to Shopware 6 and works completely independently of it, yet it is developed specifically for the needs of Shopware 6.

Kitt has numerous features that make warming up the Shopware cache much faster, and it includes dynamic server load management. The latter ensures that crawling does not overload the server or affect shop operation. The Kitt Cache Crawler can therefore be run regardless of the current visitor volume.

The Kitt Cache Crawler also enables flexible crawling: you can, for example, select which route names should be crawled, or crawl only products and categories released or updated since a specific date.

All in all, the Kitt Cache Crawler has numerous setting options to flexibly adapt the crawl behavior to individual needs.

Requirements to run Kitt Cache Crawler for Shopware 6

Kitt Cache Crawler is a standalone application that has to be installed in parallel to your Shopware installation. Installation is simple and can be done by anyone. If you were able to install Shopware, you can install Kitt as well.

Requirements:

  • Shopware 6
  • Apache Web Server (all other web servers are unsupported)
  • PHP version >= 7.4 and < 8.2
  • cURL
  • ionCube Loader Version > 11
  • /proc/stat must be readable
  • /proc/cpuinfo must be readable
  • PHP Function sys_getloadavg must exist
  • DirectAdmin is unsupported

All requirements are mandatory! If any of the listed requirements is not met, Kitt will not run.
If you do not know whether your server meets all requirements, ask your hosting provider. All listed requirements are normally standard, even on shared hosting servers.

The fastest and safest way to check the requirements is the Kitt Tester Script. Download the tester script, upload it to the root directory of the Shopware installation, and open it in your browser.
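
For orientation, some of the listed requirements can be checked by hand. The following is a minimal sketch in Python, not the Kitt Tester Script: the function names are made up for this example, and Python's os.getloadavg stands in for the PHP function sys_getloadavg. It only covers the version range and the /proc entries; Apache, cURL, and the ionCube Loader are not checked.

```python
import os

def php_version_supported(version: str) -> bool:
    """Return True if the version is >= 7.4 and < 8.2, as the requirements list states."""
    major, minor = (int(part) for part in version.split(".")[:2])
    return (7, 4) <= (major, minor) < (8, 2)

def check_host_requirements() -> dict:
    """Check the OS-level requirements that can be verified without PHP."""
    return {
        "/proc/stat readable": os.access("/proc/stat", os.R_OK),
        "/proc/cpuinfo readable": os.access("/proc/cpuinfo", os.R_OK),
        # Assumption: os.getloadavg stands in for PHP's sys_getloadavg here.
        "load average available": hasattr(os, "getloadavg"),
    }

if __name__ == "__main__":
    for name, ok in check_host_requirements().items():
        print(f"{name}: {'ok' if ok else 'MISSING'}")
    print("PHP 8.1.2 supported:", php_version_supported("8.1.2"))
```

A passing result here does not replace the tester script, which checks the PHP environment that Kitt actually runs in.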


Kitt Cache Crawler makes Shopware 6 complete

Kitt Cache Crawler has numerous setting options that allow you to adapt the cache warmup to your personal needs and to the available server resources. However, Kitt is already preconfigured during installation, so it is usually sufficient to accept the default settings. They are suitable for both shared hosting and dedicated servers, so you don't have to spend time figuring out which setting is best.

Each Kitt Cache Crawler release contains two crawler variants that work the same way: a GUI version that runs within the control panel, and a version that can be set up as a cron job. Both use the same configuration.

You should pay special attention to the server load control. Pretty much every cache warmup crawler available on the market crawls URLs regardless of the resulting server load, which can be significant, and this increased load can affect shop operations. To limit the impact on the shop during warmup, administrators usually run the cache warmup at night. This restriction does not apply to the Kitt Cache Crawler, since it not only has a load control, but also automatically adapts the crawl behavior to the current load status. An overload therefore cannot occur.

Crawler Settings


Delay
Delay sets a pause, in microseconds, between requests.
Threads
The Threads setting is one of the most important settings for crawl speed. The larger the value, the more requests are executed at the same time. However, the Threads setting also has the greatest impact on server load: a value that is too high sharply increases the risk that pages cannot be cached. Even on a dedicated server, it is advisable to keep the default value.
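
How Delay and Threads interact can be sketched as follows. This is a minimal illustration in Python, not Kitt's implementation: the function name crawl, the fetch callable, and the default values are assumptions made for the example, and the pause is applied per worker after each request.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def crawl(urls, fetch, threads=4, delay_us=0):
    """Request every URL with `threads` workers, pausing `delay_us` microseconds after each request.

    `fetch` is a hypothetical callable that requests one URL and returns a result.
    """
    def worker(url):
        result = fetch(url)
        time.sleep(delay_us / 1_000_000)  # convert microseconds to seconds
        return url, result

    with ThreadPoolExecutor(max_workers=threads) as pool:
        # map() returns results in the same order as the input URLs.
        return list(pool.map(worker, urls))

if __name__ == "__main__":
    # Stand-in fetch that only reports the URL length instead of doing network I/O.
    print(crawl(["/a", "/b", "/c"], fetch=len, threads=2, delay_us=500))
```

More threads shorten the crawl but put more simultaneous requests on the server; a larger delay does the opposite.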

Max Server Load
Max Server Load control is a core feature of Kitt Cache Crawler and deserves close attention. Especially if you are inexperienced in managing your server, it helps you avoid overloading it. Note that the load on a server is not determined by CPU load or RAM consumption alone, so the load checks built into most server control panels are not very meaningful. The Max Server Load control not only gives you reliable information about the load status, but also regulates it: the crawl speed is dynamically adjusted to the current load. The particular advantage is that you can run Kitt at any time of day, regardless of the number of visitors in the Shopware shop, since the load control prevents an overload and shop operations are therefore not impaired.
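
The idea of adjusting the crawl to the current load can be sketched as a simple feedback rule. This is an illustration under stated assumptions, not Kitt's algorithm: the function names, the halving step, and the 75% headroom threshold are invented for the example, and the 1-minute load average is used as the only load signal.

```python
import os

def adjust_threads(current_threads, load, max_load, max_threads):
    """Return the thread count to use for the next batch of requests."""
    if load >= max_load:
        # Over the limit: halve concurrency, but keep at least one worker.
        return max(1, current_threads // 2)
    if load < 0.75 * max_load:
        # Comfortable headroom: add one worker, up to the configured cap.
        return min(max_threads, current_threads + 1)
    return current_threads  # Close to the limit: hold steady.

def current_load():
    """Return the 1-minute load average (first value of os.getloadavg)."""
    return os.getloadavg()[0]

if __name__ == "__main__":
    threads = adjust_threads(current_threads=8, load=current_load(), max_load=4.0, max_threads=16)
    print("threads for next batch:", threads)
```

Calling adjust_threads between batches lets the crawl slow down while the shop is busy and speed up again once the load drops.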
Network Timeout
Any kind of Internet traffic can be disrupted, so a request may take longer than usual to complete. With a large number of requests, such delays can significantly increase the total crawl time. The network timeout setting limits how long Kitt waits for a request.

Limit
If, for whatever reason, you need to limit the number of URLs to be crawled, use the limit setting. Together with the offset setting, it can also be used to split the total set of URLs into batches. Limit and offset work like the familiar LIMIT and OFFSET clauses in MySQL queries.
Offset
See Limit.
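
As a small illustration of how limit and offset split a URL list, here is a sketch in Python. The function name select_urls and the example URLs are made up for this example; the slicing mirrors the semantics of LIMIT and OFFSET in SQL.

```python
def select_urls(urls, limit=None, offset=0):
    """Return at most `limit` URLs, skipping the first `offset` (like LIMIT/OFFSET in SQL)."""
    end = None if limit is None else offset + limit
    return urls[offset:end]

urls = [f"/page-{i}" for i in range(1, 11)]

# Two runs that together cover all ten URLs.
first_half = select_urls(urls, limit=5, offset=0)   # /page-1 to /page-5
second_half = select_urls(urls, limit=5, offset=5)  # /page-6 to /page-10
```

Splitting a crawl this way allows, for example, two separate runs that each handle one half of the URLs.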

Logging
If logging is enabled, the cache status of each requested URL is recorded in the cache inventory.
cURL Debugging
For expert-level analysis, cURL debugging provides very detailed information about every cURL request.

Follow Redirection
Kitt Cache Crawler takes the URLs to be crawled from the Sitemap.xml. Since the sitemap should be up to date, it should not normally contain URLs that redirect. If it does, Kitt can be configured to follow the redirect and thereby cache the redirect target. Use this function only in exceptional cases, since following redirects can dramatically increase the total crawl time.
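
Reading crawl URLs from a sitemap can be sketched as follows. This is a minimal Python illustration of the standard sitemap format, not Kitt's own parser; the function name and the shop.example URLs are placeholders.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

def urls_from_sitemap(xml_text):
    """Return every <loc> value from a sitemap document, in document order."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip() for loc in root.findall(".//sm:loc", SITEMAP_NS) if loc.text]

example = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://shop.example/</loc></url>
  <url><loc>https://shop.example/category/shoes</loc></url>
</urlset>"""

crawl_list = urls_from_sitemap(example)
```

If every entry in the list is the final address of a page, no request needs to follow a redirect.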
Route Name Selection
To make the rigid cache behavior of Shopware 6 more flexible, the Kitt Cache Crawler lets you select and limit the URLs to be crawled, so that the cache is warmed up only for the URLs that actually need it. This flexibility significantly reduces both the crawl time and the server resources required.
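
A selection by route name, optionally combined with an update date, could look like the sketch below. It is an assumption-laden illustration in Python: the UrlEntry record, the function name, and the route names used in the example (written in the style of Shopware 6 storefront routes) are not taken from Kitt.

```python
from dataclasses import dataclass
from datetime import date, timedelta

@dataclass
class UrlEntry:
    url: str
    route_name: str  # illustrative, e.g. "frontend.detail.page"
    updated: date

def select_by_route(entries, route_names, max_age_days=None, today=None):
    """Keep URLs whose route name is selected and, optionally, updated within max_age_days."""
    today = today or date.today()
    selected = []
    for entry in entries:
        if entry.route_name not in route_names:
            continue  # route type not selected for crawling
        if max_age_days is not None and (today - entry.updated) > timedelta(days=max_age_days):
            continue  # last update is too old
        selected.append(entry.url)
    return selected

entries = [
    UrlEntry("/shoes/sneaker-1", "frontend.detail.page", date(2024, 5, 20)),
    UrlEntry("/shoes", "frontend.navigation.page", date(2024, 1, 2)),
    UrlEntry("/shoes/boot-7", "frontend.detail.page", date(2023, 11, 30)),
]
recent_products = select_by_route(
    entries, {"frontend.detail.page"}, max_age_days=30, today=date(2024, 6, 1)
)
```

Here only product pages updated within the last 30 days remain in the crawl list.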

Server IP Address
To warm up the cache, Kitt simulates conventional user requests and is therefore subject to the same network conditions as normal user requests. However, Kitt requires direct access to the host, bypassing any CDN node, so you must enter the public IP address of the origin host.
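
One common way to reach an origin host directly is to send the request to its IP address while keeping the shop's hostname in the Host header. The Python sketch below illustrates that idea only; it is not how Kitt is known to work, the function name is hypothetical, and 203.0.113.10 is a documentation address. It uses plain HTTP, because HTTPS additionally requires the TLS handshake to use the real hostname.

```python
from urllib.parse import urlsplit, urlunsplit
from urllib.request import Request

def origin_request(url, origin_ip):
    """Build a request aimed at origin_ip that still names the shop's hostname in the Host header."""
    parts = urlsplit(url)
    # Swap the hostname for the origin IP, keep path and query, drop any fragment.
    direct_url = urlunsplit((parts.scheme, origin_ip, parts.path, parts.query, ""))
    return Request(direct_url, headers={"Host": parts.hostname})

req = origin_request("http://shop.example/category/shoes?p=2", "203.0.113.10")
# req.full_url is "http://203.0.113.10/category/shoes?p=2" and the Host header is "shop.example"
```

The request can then be sent with urllib.request.urlopen(req), so the web server still resolves the correct virtual host although no CDN is involved.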
Crawl Output
If Kitt is run manually in the console (CLI), output can be enabled if needed. The output shows each requested URL, its transfer time (TTFB), and the resulting load per URL.

URLs Most Wanted
URLs Most Wanted is a feature unique to the Kitt Cache Crawler. If this function is activated, Kitt does not use the sitemap as the basis for the URLs to be crawled, but rather the most requested URLs. This reduces the number of URLs to be crawled by up to 70%, which significantly reduces server load.

In addition, the Most Wanted function also captures URLs that can never be included in the sitemap. Pagination pages, search result pages, and filter URLs are therefore crawled as well, and the cache is warmed up for them.
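
The principle of crawling the most requested URLs instead of the sitemap can be sketched as a simple frequency count. How Kitt actually collects request statistics is not described here, so the Python example below assumes a plain list of requested paths; the function name and the sample data are invented.

```python
from collections import Counter

def most_wanted(requested_paths, top_n):
    """Return the top_n most frequently requested paths, most requested first."""
    counts = Counter(requested_paths)
    return [path for path, _ in counts.most_common(top_n)]

# Hypothetical request history, including a search and a pagination URL
# that would never appear in a sitemap.
hits = ["/", "/search?q=shoes", "/shoes?p=2", "/", "/search?q=shoes", "/", "/imprint"]
crawl_list = most_wanted(hits, top_n=2)
```

With this sample data the crawl list contains the start page and the search result page, while rarely requested pages are left out.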

Unique Cache Crawler Features

  • Multithread Process Mode: The multithreading process mode allows multiple requests to be executed simultaneously and makes crawling lightning fast without overloading the server.
  • Crawling up to 200,000 URLs within 1 hour: Kitt Cache Crawler can crawl up to 200,000 URLs within 1 hour, even on shared hosting. This is up to 20x faster than the built-in crawler.
  • Dynamic Load Control: Kitt Cache Crawler cares about the load of your server. A dynamic load control checks the current load while crawling. When the load gets too high, Kitt automatically reduces the speed and the number of concurrent requests until the load is back within a predefined limit. This dynamic load control ensures that frontend operation is not adversely affected by crawling.
  • Auto-Adjust Configuration: Kitt Cache Crawler comes with preconfigured settings that fit most cases, so you don't have to learn which settings are best. Each setting also comes with recommendations that give you more confidence when making changes.
  • Control Panel Version and Version for Command Line Interface (CLI): Each Kitt Cache Crawler version can be run in at least two ways: from within the control panel, or from the command line interface. The CLI version is therefore also suitable for use as a cron job.
  • Tune-up Settings: A wealth of tuning settings is available for experts to make Kitt even faster.
  • Cache Inventory: The cache inventory optionally gives a detailed report on the cache status of each URL. This report also includes the cache status for Cache Vary.
  • Route Name Selection: Route Name Selection lets you filter and limit which kinds of routes are crawled. This includes a filter that limits the URLs to be crawled by days.

Kitt Cache Crawler vs Built-in Crawler

Features: Built-in Crawler | Kitt Cache Crawler
Request Method: Serial | MultiThread
Crawling Speed: <10,000/24h | <200,000/h
Delay
Threads: 1 | unlimited
Dynamic Load Control
Cache Inventory
Network Timeout
Limit
Offset
Logging
Curl Debug
Purge Method: fixed
Redirection: on/off
Cache Compression: identity, gzip, deflate, br

Still have questions?

We try very hard to give you as much information as possible. If you can't find the information you need or if you don't understand something, don't hesitate to ask us.