How to save traffic on a web server

Started by musorhik, Sep 05, 2022, 10:03 AM


musorhik (Topic starter)

A heavily loaded web project consumes terabytes of traffic. At that scale, saving 10-20% can noticeably cut costs and help you stay within your quotas.
What should you do if traffic is dangerously approaching the limits of your hosting plan, or has already gone beyond them?



In this topic, we will go over the basic techniques that help save traffic on a server.
Squeeze it!

The easiest way to save traffic is to compress it. This puts extra load on the web server's CPU, but it shrinks responses, so data reaches the client faster and connections can be closed sooner. Deflate-compatible algorithms are used most of the time, but there are more exotic options as well.

Gzip

The most common compression algorithm. It compresses losslessly, with a good compression ratio (the level is configurable from 1 to 9, 6 by default) and fast decompression. Simple and effective, it is suitable in most cases.

Nginx

gzip            on;
gzip_min_length 1000;
gzip_proxied    expired no-cache no-store private auth;
gzip_types      text/plain application/xml;
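
The level mentioned above can be pinned explicitly as well; a small optional addition (6 is already the default):

gzip_comp_level 6;   # 1 = fastest, 9 = best ratio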


Apache

AddOutputFilterByType DEFLATE text/plain
AddOutputFilterByType DEFLATE text/html
AddOutputFilterByType DEFLATE text/xml
AddOutputFilterByType DEFLATE text/css
AddOutputFilterByType DEFLATE application/xml
AddOutputFilterByType DEFLATE application/xhtml+xml
AddOutputFilterByType DEFLATE application/rss+xml
AddOutputFilterByType DEFLATE application/javascript
AddOutputFilterByType DEFLATE application/x-javascript

Zopfli

A modern alternative to gzip: it compresses 3-8% better, but much more slowly (decompression on the client runs at the same speed). Its output is ordinary Deflate, so it is 100% compatible with zlib, and browser support is therefore complete.

git clone https://github.com/google/zopfli.git
cd zopfli
make


Nginx

Zopfli is too slow to compress on the fly, so the files are compressed ahead of time with the zopfli binary and Nginx is told to serve the ready-made .gz versions:

gzip_static     on;
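
A minimal sketch of that pre-compression step, assuming the zopfli binary built above is on the PATH and the static files live under /var/www/html (a placeholder path):

# creates file.gz next to each text asset and keeps the original; gzip_static picks the .gz up automatically
find /var/www/html -type f \( -name '*.html' -o -name '*.css' -o -name '*.js' \) -exec zopfli {} \;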


Brotli

Like Zopfli, it was developed inside Google. It can compress not only static files but also on the fly, like gzip. Unlike the previous algorithms, it does not just look for repetitions in the text; it also matches input against a built-in dictionary full of tags and common code phrases, which makes it extremely effective for HTML/CSS/JS compression:
if Zopfli gains about 8% over gzip, Brotli can add roughly 10-15% more, and some report as much as 23%. However, browsers only accept it over HTTPS, and it is incompatible with zlib/deflate.

Nginx

A stock module is available only in Nginx Plus; regular Nginx has to be built with a third-party module (--add-module=/path/to/ngx_brotli):

git clone https://github.com/google/ngx_brotli.git
git clone https://github.com/bagder/libbrotli.git
cd libbrotli
./autogen.sh
./configure
make



cd /path/to/nginx
./configure --prefix=/etc/nginx --sbin-path=/usr/sbin/nginx --modules-path=/usr/lib/nginx/modules --conf-path=/etc/nginx/nginx.conf --error-log-path=/var/log/nginx/error.log --http-log-path=/var/log/nginx/access.log --pid-path=/var/run/nginx.pid --lock-path=/var/run/nginx.lock --http-client-body-temp-path=/var/cache/nginx/client_temp --http-proxy-temp-path=/var/cache/nginx/proxy_temp --http-fastcgi-temp-path=/var/cache/nginx/fastcgi_temp --http-uwsgi-temp-path=/var/cache/nginx/uwsgi_temp --http-scgi-temp-path=/var/cache/nginx/scgi_temp --user=nginx --group=nginx --with-http_ssl_module --with-http_realip_module --with-http_addition_module --with-http_sub_module --with-http_dav_module --with-http_flv_module --with-http_mp4_module --with-http_gunzip_module --with-http_gzip_static_module --with-http_random_index_module --with-http_secure_link_module --with-http_stub_status_module --with-http_auth_request_module --with-http_xslt_module=dynamic --with-http_image_filter_module=dynamic --with-http_geoip_module=dynamic --with-http_perl_module=dynamic --with-threads --with-stream --with-stream_ssl_module --with-stream_geoip_module=dynamic --with-http_slice_module --with-mail --with-mail_ssl_module --with-file-aio --with-ipv6 --with-http_v2_module --with-cc-opt='-g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2' --with-ld-opt='-Wl,-Bsymbolic-functions -Wl,-z,relro -Wl,--as-needed' --add-module=/path/to/ngx_brotli
make


Config:

brotli_static   on;


In dynamic mode:

brotli      on;
brotli_comp_level   6;
brotli_types   text/plain text/css text/xml application/x-javascript;


Apache

Everything is simpler here: enable mod_brotli and configure it:

<IfModule brotli_module>
 BrotliCompressionLevel 10
 BrotliWindowSize 22
 AddOutputFilterByType BROTLI_COMPRESS text/html text/plain text/css text/xml
 AddOutputFilterByType BROTLI_COMPRESS application/x-javascript application/javascript
 AddOutputFilterByType BROTLI_COMPRESS application/rss+xml
 AddOutputFilterByType BROTLI_COMPRESS application/xml
 AddOutputFilterByType BROTLI_COMPRESS application/json
</IfModule>


Cache it!

You can also relieve the channel between the user and the web server by minimizing the need to re-download resources. Once a file has been cached, the browser serves it from the local cache on the next request instead of fetching it again.

The HTTP headers Cache-Control, Expires and Vary let you design a very flexible caching policy, although you can simply put max-age=2592000 (30 days) everywhere.

Nginx

location ~* ^.+\.(js|css)$ {
   expires max;
}


Apache

<ifModule mod_headers.c>
    <FilesMatch "\.(html|htm)$">
        Header set Cache-Control "max-age=43200"
    </FilesMatch>
    <FilesMatch "\.(js|css|txt)$">
        Header set Cache-Control "max-age=604800"
    </FilesMatch>
    <FilesMatch "\.(flv|swf|ico|gif|jpg|jpeg|png)$">
        Header set Cache-Control "max-age=2592000"
    </FilesMatch>
    <FilesMatch "\.(pl|php|cgi|spl|scgi|fcgi)$">
        Header unset Cache-Control
    </FilesMatch>
</IfModule>
<ifModule mod_expires.c>
    ExpiresActive On
    ExpiresDefault "access plus 5 seconds"
    ExpiresByType image/x-icon "access plus 2592000 seconds"
    ExpiresByType image/jpeg "access plus 2592000 seconds"
    ExpiresByType image/png "access plus 2592000 seconds"
    ExpiresByType image/gif "access plus 2592000 seconds"
    ExpiresByType application/x-shockwave-flash "access plus 2592000 seconds"
    ExpiresByType text/css "access plus 604800 seconds"
    ExpiresByType text/javascript "access plus 604800 seconds"
    ExpiresByType application/javascript "access plus 604800 seconds"
    ExpiresByType application/x-javascript "access plus 604800 seconds"
    ExpiresByType text/html "access plus 43200 seconds"
    ExpiresByType application/xhtml+xml "access plus 600 seconds"
</ifModule>

Distribute it!

A good CDN for a loaded site usually costs decent money, but to start with a free one is enough. Offloading heavy resources to a CDN can cut traffic several times over! Do not neglect that opportunity, especially while it costs nothing. There are plenty of roundup articles on the Internet listing the top free networks, and Cloudflare is invariably put in first place.

Conclusion

If you are not using at least gzip yet — welcome to the Internet, where more than 80% of sites already do. If standard compression at level 9 is not enough for you, use Brotli with Zopfli as a fallback (since Brotli does not yet have 100% browser coverage). You can save a lot of traffic this way:

    gzip: 50-95% compression depending on the content; the average on the web is 65-80%
    Zopfli: +3-8% compression relative to gzip on average, sometimes up to 10%
    Brotli: +10-15% compression relative to gzip depending on the content, with rare outliers of 20% and more.


Cache data on the client: this can cut traffic on repeat visits by up to 99%, depending on the caching policy you choose and how often the site changes.

Use a CDN for content delivery and basic load balancing. The distributing servers take the brunt of the load, while traffic to the origin drops significantly; how much exactly depends on the network, the load and the selected operating mode.
All of the above can be rolled out very quickly and does not require a complex restructuring of the web server architecture, which buys you time to design that architecture properly. Compress, cache, distribute, and keep an eye on your costs so you do not end up with a huge bill.

nisha03

There is also the Modernizr JS library, which checks for WebP support when the page loads and simply adds the webp class to the page if it is supported, or no-webp otherwise.
Then you just write two CSS variants: one under .webp that points to a .webp file, and one under .no-webp that points to the jpg/png fallback.
There is also the <picture> wrapper tag for img, which lets the browser pick the format itself.

Bubunt

It is not spelled out that for gzip_static and brotli_static to work in Nginx you need to do more than just enable the directives: you also have to compress the content yourself. The directives only tell the server to use pre-compressed content if it exists; if there is none, they do nothing by themselves. So you will have to get the brotli and zopfli tools and write a script that walks your content and compresses everything.
If the static files change from time to time, the script has to be re-run so the compressed versions stay current, either via some automatic trigger or through organizational discipline.
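
As an illustration only, a rough sketch of such a script, assuming the stock brotli and zopfli command-line tools are installed and the site root is /var/www/html (a placeholder path):

#!/bin/sh
# put file.br and file.gz next to each text asset; brotli_static and gzip_static
# will then serve them to clients that advertise support for those encodings
find /var/www/html -type f \( -name '*.html' -o -name '*.css' -o -name '*.js' -o -name '*.svg' \) |
while read -r f; do
    brotli -f -k -q 11 "$f"   # writes "$f.br", keeps the original
    zopfli "$f"               # writes "$f.gz", keeps the original
done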

Nothing is said about intermediate caches either, or about the Accept-Encoding header in the request and Vary: Accept-Encoding in the response.
Without that, there is a real chance of shooting yourself in the foot when configuring caching. In fact, the statement that Brotli works only over HTTPS is tied to these very intermediate caches: Brotli is mentioned, but Vary is not.
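
In Nginx the gzip module can emit that header by itself; a one-line sketch:

# adds "Vary: Accept-Encoding" to compressible responses so proxies keep the variants apart
gzip_vary on;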

Nothing is said about the need for versioning either. Say you cache /index.html for 1 day (because it contains a price list that changes daily) and the stylesheet /default.css for 1 year.
Now suppose the page and the style are both changed at the same time. A client coming back a week later will get the new page with the old style, which usually breaks the layout. To prevent this, include an ever-increasing version number in the stylesheet's file name, and do not forget to re-compress the styles after each change.
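
One hedged way to do this in Nginx without renaming files on disk (the .vNN. naming scheme here is just an example): the HTML references /default.v42.css, while the server maps it back to /default.css and lets it be cached forever:

# strip the version segment and serve the real file; bumping the number in the HTML busts the cache
location ~* ^(.+)\.v\d+\.(css|js)$ {
    try_files $uri $1.$2 =404;
    expires max;
}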

And again: minify images, styles, scripts and the pages themselves. For text, drop back from UTF-8 to an 8-bit encoding where possible; every byte counts here.

Piyush

The modern consumer, who knows from experience where free cheese usually sits, sees yet another mousetrap in such offers.
And he is partly right. Many small providers advertise "unlimited" on their cheapest plans to lure customers, but simple arithmetic shows such an offer cannot be honest. To avoid going broke while keeping the ceiling of the advertised unlimited, these providers are forced to cap each client's traffic. Total traffic volume is critical for them too, because they in turn buy it from larger carriers.
Data centers are usually in the opposite position: their high-capacity links do not require counting every kilobyte that passes through them.
So for data centers an unlimited plan is an honest offer that simply spares the client from counting every penny. It is not the cheapest option, though, and it is priced assuming fairly active use of the network.

No hard limits are imposed here, but do not confuse the concepts. For many people "unlimited" is a synonym for "a lot and for free", which is far from the case.
"Unlimited" does not mean the client can freely push terabytes of data per month; that kind of usage would be a real problem for the provider. On the other hand, you would still have to manage to transfer such volumes or attract an audience of a million visitors.