Beyond the mod_deflate fixes, customer experiences with HTTP compression will depend on what type of data is compressed. Different browsers have problems with certain types of data being compressed. Compressing plain HTML works fine on any modern browser, though people have experienced browser problems when HTML with embedded javascript is compressed. There is some indication that in Internet Explorer the decompression path changes the timing of javascript loading and some javascript which would otherwise work will then fail with this changed timing. This has happened not just with mod_deflate but also with mod_gzip, which has been available for Apache 1.3 for a long time.
The Adobe Acrobat plug-in has known problems dealing with PDF files that mod_deflate has compressed. This is not a mod_deflate problem; the same type of problem has occurred with mod_gzip, which is a completely independent implementation of HTTP compression.
A couple of our IHS 2 customers have had problems using mod_deflate to compress javascript. With the last occurrence of this, the customer discovered that their javascript when compressed would run fine with Netscape but not with Internet Explorer. In both cases, the customers gathered traces and the data sent by the server was valid, but for some reason IE would not run the javascript properly if it had arrived compressed. IHS configuration directives would have to be used to disable compression for certain URLs and/or certain browsers.
Here's an interesting article regarding an IE 6 bug (there is a similar one for IE 5.5):
http://support.microsoft.com/default.aspx?scid=kb;[LN];Q312496
Here's another article about an IE bug with javascript in the presence of cache-control: no-cache:
Here's another one, outlining one person's experience with compression changing the timing of javascript execution enough that the javascript no longer worked:
http://lists.over.net/pipermail/mod_gzip/2001-March/001708.html
Common symptoms are blank pages or error messages from the browser, javascript execution failures, or problems displaying PDF files in Adobe Acrobat.
Disabling compression is easy as long as some text in the URI identifies the requests to exclude. Here are examples for all URIs ending in .pdf or .jpg (upper or lower case):
SetEnvIfNoCase Request_URI "\.pdf$" no-gzip
SetEnvIfNoCase Request_URI "\.jpg$" no-gzip
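The same no-gzip environment variable can also be set based on the browser or on other file types, which covers the javascript problems with Internet Explorer described earlier. The following is only a sketch: the "MSIE" pattern and the .js suffix are assumptions about what identifies the affected requests in your environment, so adjust them to match the user agent strings and URIs seen in your own logs.

```
# Sketch only: use whichever of these lines fits the failing case.

# Disable compression for every request from Internet Explorer
# (assumes the User-Agent string contains "MSIE")
BrowserMatch "MSIE" no-gzip

# Disable compression for javascript files, for all browsers
# (assumes the scripts are served from URIs ending in .js)
SetEnvIfNoCase Request_URI "\.js$" no-gzip
```

BrowserMatch and SetEnvIfNoCase are both provided by mod_setenvif, so that module must be loaded for either form to take effect.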
Assuming that the known mod_deflate problems have been corrected with available fixes, the most likely cause is that the browser or plug-in cannot handle compressed data in the specific context. The browser or plug-in may be unable to uncompress certain media types at all, may be unable to uncompress them when they are received in a certain order, or may run into some other limitation. The problem could also depend on whether or not SSL is used.
Some theoretical causes attributable to mod_deflate include
Here are the steps for determining whether or not mod_deflate generated the proper response to deliver to the browser:
On the NetTrace directive, be sure to specify the IP address of the client that you'll use for reproducing the problem, as well as a large senddata value so that the entire response is traced.
Example:
NetTraceFile /tmp/nettrace
NetTrace client 111.222.333.444 dest file event senddata=5000000 event recvdata=1024
(The data sent to the server in the request body isn't normally a factor in compression problems, so we'll only trace the first 1024 bytes of the request body, if any.)
Also, it is recommended that you set LogLevel to Debug and use the DeflateFilterNote directive to log the request, compression ratio, and user agent string from the browser (see the DeflateFilterNote documentation).
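As an illustration of that logging setup, the sketch below follows the pattern shown in the Apache mod_deflate documentation. The note names (instream, outstream, ratio), the log format nickname (deflate), and the log file path are arbitrary choices for this example, and the Input/Output/Ratio type argument assumes a mod_deflate level that supports it.

```
# Sketch only: note names, format nickname, and log path are examples.
LogLevel debug

# Record the uncompressed size, compressed size, and ratio as request notes
DeflateFilterNote Input instream
DeflateFilterNote Output outstream
DeflateFilterNote Ratio ratio

# Log the request line, compressed/uncompressed byte counts, the ratio,
# and the browser's user agent string
LogFormat '"%r" %{outstream}n/%{instream}n (%{ratio}n%%) "%{User-agent}i"' deflate
CustomLog logs/deflate_log deflate
```

Writing this information to its own log file keeps it separate from the normal access log, which makes it easier to match entries against the responses captured in the network trace.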
Move the trace file (/tmp/nettrace in the example above) to another location so that additional browser traffic isn't written to the trace file that we'll examine next.
Here is an example where the input file is /tmp/nettrace and the results of parsing it are to be stored in a new directory called /tmp/nettrace.parsed:
$ java -jar /tmp/ServerDoc.jar ParseNetTrace /tmp/nettrace /tmp/nettrace.parsed
checking gzip integrity of /tmp/nettrace.parsed/127.0.0.1/0/sent.body.0
checking gzip integrity of /tmp/nettrace.parsed/127.0.0.1/1/sent.body.0
checking gzip integrity of /tmp/nettrace.parsed/127.0.0.1/1/sent.body.1
checking gzip integrity of /tmp/nettrace.parsed/127.0.0.1/1/sent.body.2
(In this example trace, there were four compressed response bodies.)
If the compressed data is invalid and could cause a problem for the browser, errors will be encountered and displayed by ServerDoc, as in the following example:
$ java -jar /tmp/ServerDoc.jar ParseNetTrace /tmp/nettrace.bad /tmp/nettrace.bad.parsed
checking gzip integrity of /tmp/nettrace.bad.parsed/127.0.0.1/0/sent.body.0
/tmp/nettrace.bad.parsed/127.0.0.1/0/sent.body.0 is not properly gzipped!
java.util.zip.ZipException: invalid bit length repeat
checking gzip integrity of /tmp/nettrace.bad.parsed/127.0.0.1/1/sent.body.0
checking gzip integrity of /tmp/nettrace.bad.parsed/127.0.0.1/1/sent.body.1
checking gzip integrity of /tmp/nettrace.bad.parsed/127.0.0.1/1/sent.body.2
(For this example, we took a valid network trace generated by mod_net_trace but replaced some of the hex data in the trace file with a different sequence of bytes to simulate a corrupted response.)
Beyond the automatic gzip integrity checking performed by ServerDoc, the response headers and the uncompressed data may need to be examined as well. The response headers will be created by ServerDoc in files called sent.hdr.0, sent.hdr.1, and so on.
The header field Content-Encoding must be present whenever the
response body is compressed. ServerDoc will not try to check the gzip
integrity of responses that did not contain the Content-Encoding
header field, so gzipped bodies that weren't checked by ServerDoc
possibly have invalid or missing header information.
It is possible that the response body was incomplete when it was gzipped, so that the response is valid from a gzip encoding perspective yet is missing information once the browser uncompresses it (e.g., a truncation occurred). To uncompress the response bodies and see what content was sent, use the gunzip utility:
$ gunzip < /tmp/nettrace.parsed/127.0.0.1/1/sent.body.0 > /tmp/uncompressed
The uncompressed data in file /tmp/uncompressed will have to be examined by someone who knows what is expected in order to determine if the data is truncated or is otherwise malformed.
If a problem is discovered in the data written by mod_deflate or a
problem is suspected in the HTTP header, the documentation to send to
IBM is