Improve Performance with http compression

March 10, 2011

Over the past few years there has been a huge increase in the number of web services that provide or consume data. As developers optimize their web services for performance and scalability, one aspect that often gets overlooked or taken for granted is network bandwidth. Adding http compression as part of that optimization can have a huge impact on response times. In terms of reward-to-effort ratio, it is one of the best things you can do, and any client capable of handling compression benefits from it. Since it degrades gracefully, web service clients that cannot make use of it will continue to work fine.

What to Compress

Most web services expose the data provided by their API as XML or JSON. If the amount of data exceeds a few KB, compression is worth considering. If, on the other hand, the amount of data is fairly small (< 2 KB), you may decide that the additional CPU cycles are not worth it. The answer will vary depending on the specifics of each application and its clients. It also doesn't make sense to compress formats such as jpg/gif, which are already compressed.
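The size threshold is easy to check empirically. The sketch below is a minimal illustration in Python, assuming a made-up JSON payload (the record fields and the make_payload and compression_report helpers are hypothetical, not from any particular API) and Python's gzip module rather than mod_deflate. It shows how the gzipped-to-original ratio changes as the payload grows.

```python
import gzip
import json


def make_payload(n_records):
    """Build a JSON document with n_records similar records (made-up data)."""
    records = [
        {"id": i, "name": "item-%d" % i, "status": "active", "price": 9.99}
        for i in range(n_records)
    ]
    return json.dumps(records).encode("utf-8")


def compression_report(body):
    """Return (original size, gzipped size, gzipped/original as a percentage)."""
    compressed = gzip.compress(body)
    return len(body), len(compressed), 100.0 * len(compressed) / len(body)


if __name__ == "__main__":
    # Compare a tiny response with progressively larger ones.
    for n in (1, 10, 100, 1000):
        original, compressed, pct = compression_report(make_payload(n))
        print("%7d bytes -> %6d bytes (%.0f%% of original)" % (original, compressed, pct))
```

Running it against your own representative responses is more informative than the synthetic records used here, since highly repetitive data compresses unusually well.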

Specifics

More often than not, an http server such as apache is involved in serving up the content. In the case of a java/j2ee application, the actual content may be generated by tomcat/glassfish/jboss/websphere, but the app server may sit behind apache. One option in this case is to let apache do the http compression. All that needs to be done is to load mod_deflate and list the mime types that you want apache to compress. Here is a snippet of what to add to the http conf file:

AddOutputFilterByType DEFLATE text/html text/plain text/xml application/json

As long as the client request includes a header such as Accept-Encoding: gzip,deflate, apache will compress the response. The amount of compression taking place can be logged and measured using the DeflateFilterNote directive. Adding something as simple as

DeflateFilterNote Input instream
DeflateFilterNote Output outstream
DeflateFilterNote Ratio ratio
LogFormat "%t \"%r\" %>s %b out=%{outstream}n/in=%{instream}n (%{ratio}n%%)" deflateLog

helps in logging and actually figuring out how much compression is taking place (the deflateLog format only takes effect once a CustomLog directive references it). As the response size increases, the ratio of compressed data to uncompressed data decreases and can go as low as 5% (a saving of 95%).

Here are the log entries for responses of roughly 50K, 15K and 5K, showing the compression ratio for each.

"GET /compression/test.jsp HTTP/1.1" 200 2687 out=2669/in=51635 (5%)
"GET /compression/test.jsp HTTP/1.1" 200 1242 out=1224/in=14361 (8%)
"GET /compression/test.jsp HTTP/1.1" 200 858 out=840/in=5182 (16%)
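The percentage at the end of each entry is simply out divided by in. As a quick check, the following Python sketch parses the three entries above and recomputes the ratio. The parse_deflate_entry and ratio_percent helpers are hypothetical names, and rounding down to a whole percent is an assumption that happens to match all three logged values.

```python
import re

# The three access log entries shown above.
LOG_LINES = [
    '"GET /compression/test.jsp HTTP/1.1" 200 2687 out=2669/in=51635 (5%)',
    '"GET /compression/test.jsp HTTP/1.1" 200 1242 out=1224/in=14361 (8%)',
    '"GET /compression/test.jsp HTTP/1.1" 200 858 out=840/in=5182 (16%)',
]

PATTERN = re.compile(r"out=(\d+)/in=(\d+) \((\d+)%\)")


def parse_deflate_entry(line):
    """Return (out_bytes, in_bytes, logged_ratio), or None if nothing was compressed."""
    match = PATTERN.search(line)
    if match is None:
        return None  # e.g. "out=-/in=- (-%)" when mod_deflate did not run
    return tuple(int(group) for group in match.groups())


def ratio_percent(out_bytes, in_bytes):
    """Compressed size as a whole-number percentage of the uncompressed size."""
    return out_bytes * 100 // in_bytes


if __name__ == "__main__":
    for line in LOG_LINES:
        out_bytes, in_bytes, logged = parse_deflate_entry(line)
        print("in=%d out=%d ratio=%d%% (logged %d%%), saved %d bytes" % (
            in_bytes, out_bytes, ratio_percent(out_bytes, in_bytes),
            logged, in_bytes - out_bytes))
```

In all three entries the %b value is 18 bytes larger than out, which is consistent with the 10-byte gzip header and 8-byte trailer being counted in the bytes sent but not in the deflate output.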

Glassfish

If your topology demands that there be no apache in front of your app server, compression can be enabled on the app server itself. For example, on glassfish (3.0.1), a change from the admin console

Network Config -> Network Listeners -> <listener name> -> HTTP -> Compression (on)

will result in an immediate change.

For tomcat, the change is as simple as adding compression="on" to the http connector in server.xml.

Apache connectors

The above two options take care of compression between the client and the http server. However, in the scenario where apache sits in front of an app server such as tomcat/glassfish (and the two are on physically separate machines), data is also transferred between those machines. Whether or not this transfer uses compression depends on how apache talks to the app server. Two common options for this are mod_jk and mod_proxy.

mod_proxy

When mod_proxy is used, apache talks to the http connector on glassfish (or tomcat). If gzip compression is enabled on the http connector as described above, the data from the app server arrives already compressed and apache mod_deflate does not compress it again. This can be seen from lines such as this one in the apache access log:

"GET /compression/test.jsp HTTP/1.1" 200 1451 out=-/in=- (-%)

When compression is disabled on the app server, data travels uncompressed between apache and the app server, and apache does the compression, which can be verified in the access log. As mentioned above, adding the DeflateFilterNote directives gives a lot of useful information for figuring out what's going on under the hood.
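The behaviour described so far boils down to a simple decision rule: compress only if the client advertises gzip support and the response has not already been encoded upstream. The Python sketch below illustrates that rule. It is not mod_deflate's actual code, the client_accepts_gzip and maybe_compress helpers are hypothetical, and it assumes header names are spelled exactly as shown.

```python
import gzip


def client_accepts_gzip(accept_encoding):
    """True if the Accept-Encoding value lists gzip with a non-zero q-value."""
    for part in (accept_encoding or "").split(","):
        token, _, params = part.strip().partition(";")
        if token.strip().lower() != "gzip":
            continue
        params = params.strip().lower().replace(" ", "")
        if params.startswith("q="):
            try:
                return float(params[2:]) > 0
            except ValueError:
                return False
        return True
    return False


def maybe_compress(body, request_headers, response_headers):
    """Return (headers, body), gzipping only when it is safe and useful."""
    headers = dict(response_headers)
    if "Content-Encoding" in headers:
        return headers, body  # already encoded upstream: pass through untouched
    if not client_accepts_gzip(request_headers.get("Accept-Encoding")):
        return headers, body  # graceful fallback for clients without gzip support
    compressed = gzip.compress(body)
    headers["Content-Encoding"] = "gzip"
    headers["Content-Length"] = str(len(compressed))
    headers["Vary"] = "Accept-Encoding"  # tell caches the response depends on this header
    return headers, compressed


if __name__ == "__main__":
    body = b"<data>" + b"x" * 5000 + b"</data>"
    for request in ({"Accept-Encoding": "gzip,deflate"}, {}):
        headers, payload = maybe_compress(body, request, {"Content-Type": "text/xml"})
        print(request, "->", len(payload), "bytes,", headers.get("Content-Encoding"))
```

The second branch is what makes compression degrade gracefully: a client that never sends Accept-Encoding simply receives the original bytes.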

mod_jk

mod_jk uses a binary protocol (AJP) and doesn't explicitly support enabling or disabling compression. Enabling some logging in mod_jk can give more information about what's going on. To do that, add the JkRequestLogFormat directive to the mod_jk configuration:

JkWorkersFile     comptest/jk/worker1.properties
JkLogFile     comptest/logs/mod_jk.log
……
# add this line
JkRequestLogFormat "%w %V %T %b"

Here is the output from the mod_jk log for client requests with and without the gzip,deflate Accept-Encoding header, and with compression enabled and disabled on apache (using AddOutputFilterByType DEFLATE text/xml text/plain text/html). The text in parentheses is an annotation describing each scenario, not part of the log.

[Wed 23:08:02 2011] worker1 localhost 0.015625 9353 ( compression not enabled on apache, client request with gzip,deflate accept-encoding)
[Wed 23:08:02 2011] worker1 localhost 0.015625 9353 ( compression not enabled on apache, client request without gzip,deflate accept-encoding)
[Wed 23:08:02 2011] worker1 localhost 0.015625 9353 ( compression enabled on apache, client request without gzip,deflate accept-encoding)
[Wed 23:08:06 2011] worker1 localhost 0.015625 1799 ( compression enabled on apache, client request with gzip,deflate accept-encoding)

The last column shows the number of bytes (%b) transferred. Again, note that this log captures the mod_jk traffic details, i.e. the data between apache and tomcat/glassfish. It looks like, as long as compression is enabled on apache and the client request includes the gzip,deflate Accept-Encoding header, compression is involved in the apache-tomcat data transfer. The same holds for glassfish.

ServletFilter

In scenarios where an embedded servlet container (e.g. embedded jetty) is used as opposed to a full web container, the only option may be a ServletFilter that does gzip compression. An example is the gzip filter that is part of jetty.

Client

Double check the http client that you are using to make sure that the right headers are being sent. Requests from javascript libraries are usually pretty good about it, but for java http clients the headers may need to be set explicitly.
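As an illustration of setting the header explicitly, here is a minimal sketch using Python's standard urllib, which neither sends Accept-Encoding: gzip nor decompresses responses on its own. The same idea applies to java http clients. The build_request, decode_body and fetch helpers are hypothetical names, and "webServiceUrl" in the usage note is a placeholder.

```python
import gzip
import urllib.request
import zlib


def build_request(url):
    """Create a GET request that explicitly advertises gzip/deflate support."""
    return urllib.request.Request(url, headers={"Accept-Encoding": "gzip,deflate"})


def decode_body(raw, content_encoding):
    """Undo the Content-Encoding the server reported, if any."""
    encoding = (content_encoding or "").strip().lower()
    if encoding == "gzip":
        return gzip.decompress(raw)
    if encoding == "deflate":
        return zlib.decompress(raw)  # zlib-wrapped deflate, as HTTP specifies
    return raw  # server chose not to compress


def fetch(url):
    """Fetch url and return (decoded body, number of bytes received on the wire)."""
    with urllib.request.urlopen(build_request(url)) as response:
        raw = response.read()
        return decode_body(raw, response.headers.get("Content-Encoding")), len(raw)
```

Calling fetch("webServiceUrl") against your own endpoint returns both the decoded body and the on-the-wire size, so the saving can be compared with the uncompressed length.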

Testing

Testing the api to make sure that the compression setup is right is trivial but often forgotten. Something as simple as

curl --compressed -I "webServiceUrl"

and verifying that the output contains the header

Content-Encoding: gzip

will confirm that the setup is correct.

Another useful variation to measure the bytes returned is to use

curl -i -w "\nsize= %{size_download}\n" "webServiceUrl"

Trade Offs and Conclusion

Anyone considering using compression should start with the apache mod_deflate documentation, which mentions issues to consider with proxy servers as well as with certain browsers. Applications come in various shapes and forms and there really isn't a one-size-fits-all solution here. They vary with respect to clients, servers, hardware and critical scenarios, to mention just a few. While there are certainly bandwidth savings from using compression, they come at the cost of cpu cycles. Given how simple it is to set up compression and try it out, the best way to figure out whether it makes sense for your application is to give it a shot. It might just make a big difference!

Categories: performance