Compressed HTTP Requests with Curl and PHP

source link: https://php.watch/articles/curl-php-accept-encoding-compression

Published On 2021-10-19

PHP Curl - HTTP Compression with ACCEPT_ENCODING
Compression is a vital and effective way to increase the performance of web pages and web apps. For text-based resources such as HTML files, CSS/JS files, SVG files, etc., compressing the resource at the server prior to the transmission, and decompressing it at the browser can greatly reduce the bandwidth and transfer times.

For the server and web browser, this compression step is largely transparent: the server compresses the resources before sending them to the browser, and the browser decompresses them before rendering them. Neither the server-side application code nor the front-end developers need to handle the compression/decompression steps.

A few compression algorithms have been developed over the years, and browsers and servers can negotiate a mutually supported compression algorithm using HTTP headers.

When making an HTTP request, the browser indicates the encoding algorithms it supports in an Accept-Encoding HTTP header. If the server supports any of the specified encoding algorithms, it may decide to compress the response, and indicate that with a Content-Encoding HTTP header. IANA maintains a list of registered encoding algorithms.
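
The negotiation described above can be sketched from the server's point of view. This is only an illustration: pickContentEncoding() is a hypothetical helper, not part of PHP or Curl, and it deliberately ignores q-value weighting (other than q=0) and the * wildcard that real servers handle.

```php
<?php
// Hypothetical helper (not part of PHP or Curl): picks a response encoding
// the way a server might, given the client's Accept-Encoding header value.
// Simplification: q-values other than q=0 and the "*" wildcard are ignored.
function pickContentEncoding(string $acceptEncoding, array $serverSupports): ?string
{
    $clientAccepts = [];
    foreach (explode(',', $acceptEncoding) as $item) {
        // Split "gzip;q=0.8" into the token and its optional parameter.
        $parts = explode(';', $item, 2);
        $token = strtolower(trim($parts[0]));
        $param = isset($parts[1]) ? str_replace(' ', '', strtolower($parts[1])) : '';
        // Skip empty tokens and encodings marked "not acceptable" with q=0.
        if ($token === '' || preg_match('/^q=0(\.0{0,3})?$/', $param) === 1) {
            continue;
        }
        $clientAccepts[] = $token;
    }
    // The server's preference order decides among mutually supported encodings.
    foreach ($serverSupports as $encoding) {
        if (in_array($encoding, $clientAccepts, true)) {
            return $encoding;
        }
    }
    return null; // No match: send the response uncompressed.
}

$encoding = pickContentEncoding('deflate, gzip, br', ['br', 'gzip']);
echo $encoding ?? 'identity', PHP_EOL;
```

The value returned here is what the server would then announce in its Content-Encoding response header.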

The Wikipedia page for PHP is around 549 KB in size, and when it is compressed with Brotli (br), it is only 92 KB. HTML pages, JSON responses, SVG files, CSS/JS files, and other text-based files compress well, and the computational cost of compressing and decompressing resources is small compared to the network transfer time the compression saves.
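
To get a feel for how well text-based content compresses, the following sketch gzip-compresses a string locally. It assumes PHP's zlib extension is loaded, and the sample markup is made up and far more repetitive than a real page, so the ratio it prints is not representative of real-world savings.

```php
<?php
// Assumes PHP's zlib extension is loaded (it provides gzencode/gzdecode).
// Hypothetical sample: repetitive markup standing in for a real HTML page.
$html = str_repeat('<li class="item"><a href="/wiki/PHP">PHP</a></li>' . "\n", 500);

$compressed = gzencode($html, 6); // gzip at compression level 6

printf(
    "original: %d bytes, gzip: %d bytes\n",
    strlen($html),
    strlen($compressed)
);

// Decompression restores the exact original bytes.
var_dump(gzdecode($compressed) === $html);
```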


Although browsers use compression by default when fetching resources, PHP's HTTP clients do not. Curl, the popular HTTP client, supports encoding negotiation and transparent decoding of compressed responses, but this needs to be enabled.

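$ch = curl_init('https://en.wikipedia.org/wiki/PHP');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
$response = curl_exec($ch); // response body as a string, or false on failure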

Above is a simplified example of making an HTTP request to the PHP page on Wikipedia. Curl does not use HTTP compression by default, which results in a larger transfer size and a longer request time:

curl_getinfo($ch, CURLINFO_TOTAL_TIME); // 0.81 sec
curl_getinfo($ch, CURLINFO_SIZE_DOWNLOAD_T); // 548 KB

The CURLINFO_SIZE_DOWNLOAD_T value reveals that Curl downloaded 548 KB of data from the remote server. Because Curl did not indicate any compression support (via the Accept-Encoding header), the server did not compress the response.

Using the CURLOPT_ENCODING option, it is possible to explicitly specify the values for the Accept-Encoding header. It is not necessary to manually decode the response, because Curl does it automatically. Note that setting the CURLOPT_ENCODING option is different from setting the Accept-Encoding header manually. If the header is set manually, Curl does not automatically decode the response.
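
To illustrate what "manually decoding" would involve when the Accept-Encoding header is set by hand, here is a minimal sketch. decodeBody() is a hypothetical helper, not a Curl or PHP API. It assumes the zlib extension is available and only covers gzip and deflate; Brotli or zstd responses would need additional extensions.

```php
<?php
// Hypothetical helper: the kind of decoding application code has to do itself
// when Accept-Encoding is set as a plain header instead of via CURLOPT_ENCODING.
// Assumes the zlib extension; Brotli and zstd would need other extensions.
function decodeBody(string $body, string $contentEncoding): string
{
    switch (strtolower(trim($contentEncoding))) {
        case '':
        case 'identity':
            return $body; // Not encoded.
        case 'gzip':
        case 'x-gzip':
            $decoded = gzdecode($body);
            break;
        case 'deflate':
            // HTTP "deflate" is zlib-wrapped; zlib_decode() also accepts raw deflate.
            $decoded = zlib_decode($body);
            break;
        default:
            throw new RuntimeException("Unsupported Content-Encoding: $contentEncoding");
    }
    if ($decoded === false) {
        throw new RuntimeException('Response body could not be decoded');
    }
    return $decoded;
}

// Simulated gzip-encoded response body; no network request is made.
echo decodeBody(gzencode('{"ok":true}'), 'gzip'), PHP_EOL;
```

With CURLOPT_ENCODING set, none of this is needed, because Curl decodes the body before returning it.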

The CURLOPT_ENCODING option accepts several kinds of values, and their effects are not all intuitive.

  • CURLOPT_ENCODING = null: Resets the value, and disables both the header and automatic decoding.
  • CURLOPT_ENCODING = "": Curl automatically sends the appropriate header based on the supported algorithms, and automatically decodes the response.
  • CURLOPT_ENCODING = "identity": Client expects the server to not encode the response in any way.
  • CURLOPT_ENCODING = "gzip": Client can decode the gzip algorithm.
  • CURLOPT_ENCODING = "br": Client can decode the Brotli (br) algorithm.

The CURLOPT_ENCODING option in fact accepts any string value, and Curl will attempt to decode the response if its Content-Encoding header names a known encoding algorithm.

The most useful and appropriate value for CURLOPT_ENCODING is an empty string (""). It enables encoding, but does not explicitly state the algorithms. This effectively enables all algorithms supported and selected by Curl.

  $ch = curl_init('https://en.wikipedia.org/wiki/PHP');
  curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
+ curl_setopt($ch, CURLOPT_ENCODING, '');
  curl_exec($ch);

With the CURLOPT_ENCODING="" option, Curl now makes HTTP requests with an appropriate Accept-Encoding header, listing all algorithms it supports.

The headers that were sent can later be retrieved using the curl_getinfo function:

$ch = curl_init('https://en.wikipedia.org/wiki/PHP');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_ENCODING, '');
curl_setopt($ch, CURLINFO_HEADER_OUT, true);
curl_exec($ch);

$headers = curl_getinfo($ch, CURLINFO_HEADER_OUT);
var_dump($headers);
  string(103) "GET /wiki/PHP HTTP/2
  Host: en.wikipedia.org
  accept: */*
+ accept-encoding: deflate, gzip, br, zstd
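
If only the advertised encodings are of interest, the raw header string can be parsed. The sketch below uses a hypothetical helper, extractAcceptEncoding(), and a sample string modeled on the output above instead of a live request.

```php
<?php
// Hypothetical helper: extracts the encodings listed in the Accept-Encoding
// line of a raw request-header string such as the CURLINFO_HEADER_OUT value.
function extractAcceptEncoding(string $rawHeaders): array
{
    foreach (preg_split('/\r\n|\n|\r/', $rawHeaders) as $line) {
        $line = trim($line);
        // Header names are case-insensitive; HTTP/2 sends them in lowercase.
        if (stripos($line, 'accept-encoding:') === 0) {
            $value = substr($line, strlen('accept-encoding:'));
            $tokens = array_map('trim', explode(',', $value));
            // Drop empty tokens and reindex the result from zero.
            return array_values(array_filter($tokens, fn ($token) => $token !== ''));
        }
    }
    return []; // The header was not sent.
}

// Sample input modeled on the request headers shown above.
$sample = "GET /wiki/PHP HTTP/2\r\nHost: en.wikipedia.org\r\naccept: */*\r\n"
    . "accept-encoding: deflate, gzip, br, zstd\r\n\r\n";
print_r(extractAcceptEncoding($sample));
```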

The server can opt to compress the response with one of the compression algorithms it supports. This greatly reduces the transfer time and size:

curl_getinfo($ch, CURLINFO_TOTAL_TIME); // 0.31 sec
curl_getinfo($ch, CURLINFO_SIZE_DOWNLOAD_T); // 90 KB
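
As a quick worked calculation, the relative savings can be derived from the figures quoted in this article. percentSaved() is a hypothetical helper used only for this arithmetic, and the numbers come from a single measurement, so they will vary between runs and networks.

```php
<?php
// Hypothetical helper: percentage saved when a value drops from $before to $after.
function percentSaved(float $before, float $after): float
{
    if ($before <= 0.0) {
        throw new InvalidArgumentException('$before must be greater than zero');
    }
    return round((1 - $after / $before) * 100, 1);
}

// Figures quoted in this article (KB and seconds).
printf("size: %.1f%% smaller\n", percentSaved(548, 90));     // size: 83.6% smaller
printf("time: %.1f%% shorter\n", percentSaved(0.81, 0.31));  // time: 61.7% shorter
```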

Curl picks the compression algorithms it advertises based on the options it was compiled with.

In most distributions, PHP Curl is compiled with gzip, deflate, and br (Brotli). However, it is also possible to add support for zstd by compiling libcurl with zstd support, and recompiling PHP with the new libcurl header files.
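
One way to check what a given build supports is to inspect the feature bitmask reported by curl_version(). The following is a sketch under assumptions: curlSupportedEncodings() is a hypothetical helper, and because the CURL_VERSION_BROTLI and CURL_VERSION_ZSTD constants may not exist on older PHP or libcurl builds, each constant is checked with defined() before use.

```php
<?php
// Sketch: maps libcurl feature bits to the content encodings they enable.
// defined() guards are used because CURL_VERSION_BROTLI and CURL_VERSION_ZSTD
// only exist on newer PHP/libcurl builds; exact versions are not assumed here.
function curlSupportedEncodings(int $features): array
{
    $map = [
        'CURL_VERSION_LIBZ'   => ['deflate', 'gzip'], // zlib provides both
        'CURL_VERSION_BROTLI' => ['br'],
        'CURL_VERSION_ZSTD'   => ['zstd'],
    ];
    $encodings = [];
    foreach ($map as $constant => $names) {
        if (defined($constant) && ($features & constant($constant)) !== 0) {
            $encodings = array_merge($encodings, $names);
        }
    }
    return $encodings;
}

// Only query libcurl when the curl extension is actually loaded.
if (function_exists('curl_version')) {
    $features = curl_version()['features'];
    echo implode(', ', curlSupportedEncodings($features)) ?: 'none', PHP_EOL;
}
```

An alternative check is to send a request with CURLOPT_ENCODING set to an empty string and inspect the outgoing Accept-Encoding header, as shown earlier.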

