Carsten's Homepage

Testing compressed HTTP encodings.

What is this about?

HTTP allows content to be compressed before transmission. When incorporating this feature into the yaws web server, I stumbled, like many people before me, upon a lot of browser incompatibilities, and some confusion related to the content encoding deflate.

This page lets you test your browser.

Note, however, that this page is cheating. The content of each test page depends on the content encoding used (probably violating the HTTP standard, although a `Vary: Accept-Encoding' header is included), so that the page itself can tell you whether it was compressed for transmission.

The test.

A simple HTML document

In the following examples, watch the first line of each document (if the document appears at all).

Transmit a file, use gzip, if the user agent allows it: here.

As above, but first try `deflate', if the user agent allows it, and if so, use raw deflate format: here.

As above, but first try `deflate', if the user agent allows it, and if so, use zlib deflate format: here.
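The selection rule behind these three links can be sketched roughly as follows. This is only an illustration in Python, not the code used by yaws: the names `accepted_codings`, `choose_encoding` and `prefer_deflate` are made up here, and the Accept-Encoding parsing is simplified (it honours `q=0`, but ignores `*' and the relative order of q-values).

```python
def accepted_codings(accept_encoding):
    """Return the set of codings a (simplified) Accept-Encoding header allows."""
    accepted = set()
    for item in accept_encoding.split(","):
        parts = [p.strip() for p in item.split(";")]
        coding = parts[0].lower()
        if not coding:
            continue
        q = 1.0  # a coding without an explicit q-value is fully acceptable
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q-value as "not acceptable"
        if q > 0:
            accepted.add(coding)
    return accepted


def choose_encoding(accept_encoding, prefer_deflate=False):
    """Pick 'deflate', 'gzip' or None (meaning: send uncompressed)."""
    accepted = accepted_codings(accept_encoding)
    if prefer_deflate and "deflate" in accepted:
        return "deflate"
    if "gzip" in accepted:
        return "gzip"
    return None


print(choose_encoding("gzip, deflate"))                           # gzip
print(choose_encoding("gzip, deflate", prefer_deflate=True))      # deflate
print(choose_encoding("deflate;q=0, gzip", prefer_deflate=True))  # gzip
print(choose_encoding(""))                                        # None
```

Whether `deflate' then means the raw or the zlib format is a separate decision that the header cannot express; that is exactly the confusion discussed below.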

Other documents

A document referencing a style sheet which we will also try to transmit compressed: here.

A postscript file: here or here.

A PDF file (interesting if it is handled by a plugin): here or here.

What are the sample documents?

Just ignore them.

What is the format of deflated content?

Quoting RFC 2616:

deflate
The "zlib" format defined in RFC 1950 in combination with the "deflate" compression mechanism described in RFC 1951.

Both at first guess and after having read the two RFCs mentioned, this should be the format described in the next subsection.

What is what I call `zlib deflate format'?

The `zlib deflate format' documents have been produced with the following small filter, written in Python using the zlib module. The result works with Mozilla, w3m and Opera.

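#!/usr/bin/env python3

import sys
import zlib

def output(d):
    # compress() may return nothing while zlib is still buffering input
    if d:
        sys.stdout.buffer.write(d)

Z = zlib.compressobj()

input_chunk_length = 4096

# Read and write bytes, not text: use the binary stdin/stdout streams.
data = sys.stdin.buffer.read(input_chunk_length)
while data:
    output(Z.compress(data))
    data = sys.stdin.buffer.read(input_chunk_length)

# Emit whatever is still buffered, followed by the ADLER32 trailer.
output(Z.flush())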

This yields (for comparison with the RFCs)

% deflate < /dev/null | od -t x1
0000000 78 9c 03 00 00 00 00 01
0000010
% echo -n test | deflate | od -t x1
0000000 78 9c 2b 49 2d 2e 01 00 04 5d 01 c1
0000014
% echo -n tttttttttttttttttttt | deflate | od -t x1
0000000 78 9c 2b 29 c1 04 00 5f 3c 09 11
0000013
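To make the comparison with the RFCs concrete, the second dump can be taken apart in Python. The following is a small sketch that only uses the bytes shown above and the standard zlib module; the variable names are mine. It decodes the two header bytes as described in RFC 1950 and checks that the last four bytes are the ADLER32 checksum of the uncompressed data.

```python
import zlib

# Bytes copied from the `echo -n test | deflate | od -t x1` dump above.
stream = bytes.fromhex("78 9c 2b 49 2d 2e 01 00 04 5d 01 c1")

cmf, flg = stream[0], stream[1]
method = cmf & 0x0F                      # CM: 8 means "deflate"
window_bits = (cmf >> 4) + 8             # CINFO + 8: log2 of the window size
header_ok = (cmf * 256 + flg) % 31 == 0  # FCHECK rule from RFC 1950
has_dict = bool(flg & 0x20)              # FDICT: a preset dictionary id follows

payload = zlib.decompress(stream)
# ADLER32 trailer, stored most significant byte first
trailer = int.from_bytes(stream[-4:], "big")

print(method, window_bits, header_ok, has_dict)  # 8 15 True False
print(payload, hex(trailer), trailer == zlib.adler32(payload))
# b'test' 0x45d01c1 True
```

The same check works for the other two dumps: the empty input ends in 00 00 00 01, since the ADLER32 checksum of no data is 1.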

The same is achieved with the following C program, which directly uses zlib without any (convenient) binding in the way.

#include <stdio.h>
#include <zlib.h>

#define input_chunk_length 4096
#define output_chunk_length 4096

int main (void) {
    unsigned char in_buffer[input_chunk_length];
    unsigned char out_buffer[output_chunk_length];
    size_t bytes_read;
    z_stream strm;

    /* Use zlib's default allocator; opaque must be initialized as well. */
    strm.zalloc = Z_NULL;
    strm.zfree  = Z_NULL;
    strm.opaque = Z_NULL;

    if (deflateInit2(&strm, Z_DEFAULT_COMPRESSION,
                     Z_DEFLATED, 15, 8, Z_DEFAULT_STRATEGY) != Z_OK)
        return 1;
    /* equivalent to deflateInit(&strm, Z_DEFAULT_COMPRESSION); */

    strm.next_out  = out_buffer;
    strm.avail_out = output_chunk_length;

    do {
        bytes_read = fread(in_buffer, 1, input_chunk_length, stdin);
        strm.avail_in = (uInt) bytes_read;
        strm.next_in  = in_buffer;
        /* Compress; at end of input (bytes_read == 0) finish the stream.
           Write the output buffer whenever it is full or, while
           finishing, whenever it contains anything at all. */
        while (deflate(&strm, bytes_read ? Z_NO_FLUSH : Z_FINISH),
               (!bytes_read && strm.avail_out != output_chunk_length)
               || ! strm.avail_out) {
            fwrite(out_buffer, 1,
                   output_chunk_length - strm.avail_out,
                   stdout);
            strm.next_out  = out_buffer;
            strm.avail_out = output_chunk_length;
        }
    } while (bytes_read);

    deflateEnd(&strm);
    return 0;
}

What is what I call `raw deflate format'?

It turns out that some browsers expect deflated data without the first two bytes (a kind of header) and the last four bytes (an ADLER32 checksum). This format can of course be produced by simply stripping these off. It can also be produced by changing the 15 in the call to deflateInit2 in the C program above to -15 (or generally the negative of the desired windowBits). This feature of zlib was undocumented when this page was written; current versions of zlib.h do describe negative windowBits values as producing raw deflate.
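Both ways of producing the raw format can be tried from Python as well. This is a minimal sketch using only the standard zlib module, where a negative wbits argument plays the role of the negative windowBits; the variable names are mine, and the stripping variant assumes a stream without a preset dictionary (so that the header is exactly two bytes long).

```python
import zlib

data = b"tttttttttttttttttttt"

# zlib format: 2-byte header, deflate data, 4-byte ADLER32 trailer.
zlib_stream = zlib.compress(data)

# Raw deflate: a negative wbits value suppresses header and trailer.
raw = zlib.compressobj(zlib.Z_DEFAULT_COMPRESSION, zlib.DEFLATED, -15)
raw_stream = raw.compress(data) + raw.flush()

# Stripping header and trailer from the zlib stream also leaves raw deflate.
stripped = zlib_stream[2:-4]

# Both variants decode as raw deflate (again selected by a negative wbits).
print(zlib.decompress(raw_stream, -15) == data)  # True
print(zlib.decompress(stripped, -15) == data)    # True
```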

The browsers mentioned as working with the zlib deflate format will actually accept both formats.
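A receiver that accepts both formats could be sketched as below. This is not a description of how any of these browsers work, just one plausible approach with a made-up function name (`inflate_either`): try the zlib format first and fall back to raw deflate if zlib rejects the data. The test bytes are taken from the `test' dump shown earlier.

```python
import zlib

def inflate_either(body):
    """Decode a `deflate' body that may be zlib-wrapped or raw."""
    try:
        return zlib.decompress(body)                   # zlib format (RFC 1950)
    except zlib.error:
        return zlib.decompress(body, -zlib.MAX_WBITS)  # raw deflate (RFC 1951)

zlib_body = bytes.fromhex("789c2b492d2e0100045d01c1")  # zlib deflate format
raw_body = zlib_body[2:-4]                             # raw deflate format

print(inflate_either(zlib_body), inflate_either(raw_body))  # b'test' b'test'
```

The fallback relies on the first attempt failing: a raw stream normally does not pass the zlib header check, as is the case for the bytes used here.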

