Compression Algorithms and Dictionaries

A compression algorithm is the particular method used to shrink transferred traffic without altering its payload (known as "lossless compression"). Most compression algorithms work by spotting repeated sequences in the data and storing these sequences for quick lookup later. A compression dictionary is the term Packeteer uses for a specific compression algorithm combined with a dictionary size.

About Compression Algorithms

Many of Xpress' compression algorithms use a predictive approach that is both fast and extremely effective. Xpress learns which sequences typically follow others and predicts subsequent content based on precedent. When a prediction turns out to be valid, Xpress needs only to transfer an indicator of successful prediction, rather than the actual content, to its compression partner on the receiving end. Using the same dictionary of predictive sequences, the partner notes a positive indicator and restores the original data.

For a simplistic example, examine the following tongue twister: "How much wood could a wood chuck chuck, if a wood chuck could chuck wood?" Using Xpress' predictive capabilities, we might create dictionary entries such as: "ou" is frequently followed by "ld" and "ould" is frequently followed by "chuck." Or we might get more complex, longer predictions from a second pass through the data, such as: "a wood chuck" is frequently followed by "could chuck."
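The predict-and-confirm idea can be illustrated with a small sketch. This is not Xpress' actual algorithm, whose internals are not described here. It is a generic predictor in which the previous two bytes (the `order` parameter, an assumption chosen for the illustration) predict the next byte. Both sides build the same table as they go, so a correct prediction is sent as an indicator only. The names `compress`, `decompress`, and `Token` are hypothetical.

```python
from typing import Dict, List, Optional

# A token is either None ("prediction was right") or a literal byte value.
Token = Optional[int]


def compress(data: bytes, order: int = 2) -> List[Token]:
    """Replace correctly predicted bytes with a hit indicator (None)."""
    table: Dict[bytes, int] = {}  # context -> byte that last followed it
    tokens: List[Token] = []
    for i, byte in enumerate(data):
        context = data[max(0, i - order):i]
        if table.get(context) == byte:
            tokens.append(None)   # hit: the partner can predict this byte itself
        else:
            tokens.append(byte)   # miss: send the literal byte
        table[context] = byte     # learn what followed this context
    return tokens


def decompress(tokens: List[Token], order: int = 2) -> bytes:
    """Rebuild the data by maintaining the same table as the compressor."""
    table: Dict[bytes, int] = {}
    out = bytearray()
    for token in tokens:
        context = bytes(out[max(0, len(out) - order):])
        byte = table[context] if token is None else token
        out.append(byte)
        table[context] = byte     # stay in step with the compressor
    return bytes(out)


text = b"How much wood could a wood chuck chuck, if a wood chuck could chuck wood?"
tokens = compress(text)
hits = sum(token is None for token in tokens)
assert decompress(tokens) == text
print(f"{hits} of {len(text)} bytes were predicted")
```

A real implementation would pack the hit indicators into bits and bound the table to the configured dictionary size; the sketch leaves both out to keep the mechanism visible.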

Applications often benefit more from one compression algorithm than from another. For example, email and messenger traffic achieve the highest compression with the ICNA algorithm. Xpress offers several compression algorithm options. In addition, Packeteer provides further choices through plug-in modules that you can download from the same Packeteer site where you get software upgrades and classification plug-ins. The options currently available are:

About Compression Dictionaries

A compression dictionary is defined by a specific compression algorithm (Pred2, Pred1, CNA, ICNA, Zlib) along with an associated dictionary size, such as 512K or 4M.

By default, all compressed services on an Xpress unit share the same compression dictionary. The default dictionary that is in effect depends on the version of PacketWise software you are using, the installed compression plug-ins, the Xpress model, the amount of memory in the Xpress unit, and whether you have manually set a different default. For example, the default algorithm in PacketWise v6.1 is Pred2 and in v7.2 is CNA.

Traffic must be decompressed with the same dictionary with which it was compressed. Before compressing, the Xpress unit will check with its partner to make sure that dictionary is available. If that dictionary is not available, it will look for one that they both have and will use that common dictionary to compress the data. Note that if an Xpress unit is compressing with ICNA and the ICNA plug-in isn't installed on a partner, the Xpress unit will compress with CNA, or if CNA is not available, it will use Pred2.
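The fallback behavior described above can be sketched as a simple search for the first dictionary that both units have. This is an illustration only: the function name `choose_dictionary`, the dictionary labels, and the idea of passing an explicit preference list are assumptions, and returning `None` when nothing is shared is a placeholder, since the text does not say what the unit does in that case.

```python
from typing import Optional, Sequence, Set


def choose_dictionary(preference: Sequence[str],
                      local: Set[str],
                      partner: Set[str]) -> Optional[str]:
    """Return the first preferred dictionary available on both units."""
    common = local & partner
    for name in preference:
        if name in common:
            return name
    return None  # assumption: no shared dictionary was found


# Preference order follows the ICNA example in the text: ICNA, then CNA, then Pred2.
# The sizes attached to each name are hypothetical.
preference = ["ICNA-4M", "CNA-4M", "Pred2-1M"]
local = {"ICNA-4M", "CNA-4M", "Pred2-1M"}
partner_without_icna = {"CNA-4M", "Pred2-1M"}

print(choose_dictionary(preference, local, partner_without_icna))  # CNA-4M
```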

Assigning Private Dictionaries to Classes

If different services have different types of content and their repeated sequences vary, then a shared dictionary might not be the most effective alternative. Predictions that are valid for some services would not hold for others.

Giving a class its own dictionary means that the predictions will be tailored to just that one class's type of content. But you can't give every class its own dictionary because memory is limited. The more memory is used for individual dictionaries, the less is available to support many tunnels.

A good strategy is to give a very critical traffic class with voluminous traffic its own algorithm/dictionary. Then, if there is another class whose traffic content is similar (and would therefore benefit from the same predictions), you can make the two classes share one dictionary. (Complete instructions on how to tune compression are not in this background-information file; they're in Increase Bandwidth Capacity — Enabling and Tuning Compression.)

Note that you would only want to assign private dictionaries to Outbound classes, since it's the outbound traffic that gets compressed.

You can choose:

You could think of it like a party full of guests (classes) that all want pie (algorithm/dictionary).

Dictionary Assignments in Diffserv Environments

Diffserv environments use packet marking to indicate the service level that packets should receive as they traverse the network. You can have Xpress compression use a single dictionary for all packets with the same TOS (Type of Service) bits. With this feature, each TOS value seen on your network is automatically associated with a unique group dictionary, and all services with this TOS value use the same dictionary. For example, suppose SAP and Oracle are each assigned a DSCP of 2 (a TOS value of 8); these two services will be assigned the same group dictionary.
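As a rough illustration of the grouping, the sketch below collects services by TOS value so that each distinct value maps to one group dictionary. The function name `group_by_tos` and the third service with its TOS value are hypothetical; only the SAP and Oracle pairing comes from the example above.

```python
from collections import defaultdict
from typing import Dict, List


def group_by_tos(service_tos: Dict[str, int]) -> Dict[int, List[str]]:
    """Map each TOS value to the services that would share its group dictionary."""
    groups: Dict[int, List[str]] = defaultdict(list)
    for service, tos in service_tos.items():
        groups[tos].append(service)
    return dict(groups)


# "Citrix" and its TOS value of 16 are invented for the illustration.
groups = group_by_tos({"SAP": 8, "Oracle": 8, "Citrix": 16})
print(groups)  # {8: ['SAP', 'Oracle'], 16: ['Citrix']}
```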

Compression Dictionary Sizes

Your decisions regarding the size and number of distinct dictionaries involve tradeoffs:

Bear in mind that compression is faster than decompression. In other words, it takes less time for an Xpress unit to compress a packet than the partner unit takes to decompress the same packet.

Each compression algorithm's dictionary has associated size ranges that you can choose and assign. The range of sizes depends on the amount of physical memory you have in your unit. You can find out which dictionary sizes for which algorithms are supported on your unit with the CLI command setup compression show types. (Remember that sometimes you can also download plug-in modules for additional choices.) The current maximum ranges for each algorithm are:

Pred1: 64 KB to 4 MB

Pred2: 64 KB to 32 MB
A very large dictionary does not yield big compression payoffs for Pred1 and Pred2. For example, once Pred2 has sufficient space to identify and store the most common sequences, additional space doesn't improve results very much. A 1 MB dictionary yields much better results than a 128 KB dictionary, but it is unlikely that a 16 MB dictionary would yield substantially better results than a 1 MB dictionary.

CNA: 64 KB to 32 MB
CNA makes good use of as much memory as you can give it. The most effective size seems to be around 4 MB, but 1 MB is usually sufficient for good gains. Packeteer recommends CNA-4M as a very good choice for compression's default dictionary.

ICNA: 64 KB to 32 MB
As with CNA, ICNA's most effective dictionary size is 4 MB. If sufficient RAM isn't available, a 1 MB dictionary still provides significant gains.

Zlib: 32 KB
Its small dictionary size makes Zlib an excellent choice when memory is short. For example, when a main site unit connects to many small branches, it forms a large number of tunnels to service the many locations, and it uses more of compression's memory. When memory runs too short to assign Pred2 or CNA with their associated dictionaries, Zlib is a good choice.
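The maximum ranges listed above can be captured in a small lookup table for planning purposes. The sketch below checks whether a requested size, written in the "512K" or "4M" style, falls inside the listed range for an algorithm. The helper names `parse_size` and `in_listed_range` are hypothetical, and the check says nothing about what a particular unit supports, which depends on its physical memory.

```python
from typing import Dict, Tuple

KB = 1024
MB = 1024 * KB

# Maximum ranges as listed in this section: (smallest, largest) in bytes.
LISTED_RANGES: Dict[str, Tuple[int, int]] = {
    "Pred1": (64 * KB, 4 * MB),
    "Pred2": (64 * KB, 32 * MB),
    "CNA": (64 * KB, 32 * MB),
    "ICNA": (64 * KB, 32 * MB),
    "Zlib": (32 * KB, 32 * KB),
}


def parse_size(text: str) -> int:
    """Convert a size such as '512K' or '4M' to bytes."""
    units = {"K": KB, "M": MB}
    unit = text[-1].upper()
    if unit not in units:
        raise ValueError(f"unknown size unit in {text!r}")
    return int(text[:-1]) * units[unit]


def in_listed_range(algorithm: str, size_text: str) -> bool:
    """True if the size lies within the listed maximum range for the algorithm."""
    low, high = LISTED_RANGES[algorithm]
    return low <= parse_size(size_text) <= high


print(in_listed_range("CNA", "4M"))    # True
print(in_listed_range("Pred1", "8M"))  # False
```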

Estimating Memory Usage

Sometimes, it's useful to understand how much and when compression memory is used. Maybe you are planning your compression deployment and want to check to see if your plan will fit within your memory allotment. Maybe you're planning to assign several custom private algorithm/dictionaries to individual classes and wondering if you have room. Or maybe you're wondering if you're running out of compression memory, and if so, why.

First, you can find out how much memory for compression you have (and have left) with the CLI command setup compression show. Then, you add the memory used for compression and for decompression for each algorithm/dictionary your unit uses.

Compression memory usage:

Multiply the algorithm's assigned dictionary size by the number of compression tunnels fanning out from the unit you are configuring. (You don't need to consider how many classes use the tunnel or any other factors.)

Note: You won't be able to accurately estimate compression memory usage for the Pred2 algorithm. Pred2 does not always execute a second pass of compression. If it does, it requires twice the memory for each compression tunnel. If it doesn't do a second pass, it allocates memory the same way as other algorithms.

Decompression memory usage:

Multiply the algorithm's assigned dictionary size by the number of compression tunnels that deliver inbound compressed traffic to the unit you are configuring.

Examples:
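As a hypothetical illustration of the arithmetic described above, the sketch below multiplies dictionary size by tunnel count for each direction. The tunnel counts and dictionary choices are invented for the example, and the function names are assumptions. Because Pred2 may or may not run a second pass, its compression figure is returned as a range from one to two times the basic estimate.

```python
from typing import Tuple

MB = 1024 * 1024


def compression_memory(algorithm: str, dict_size: int,
                       outbound_tunnels: int) -> Tuple[int, int]:
    """Return (low, high) bytes used for compression on this unit."""
    base = dict_size * outbound_tunnels
    if algorithm == "Pred2":
        return base, 2 * base  # a second pass doubles the per-tunnel memory
    return base, base


def decompression_memory(dict_size: int, inbound_tunnels: int) -> int:
    """Return bytes used for decompressing inbound compressed traffic."""
    return dict_size * inbound_tunnels


# Hypothetical unit: CNA-4M with 10 outbound and 10 inbound compression tunnels.
low, high = compression_memory("CNA", 4 * MB, 10)
inbound = decompression_memory(4 * MB, 10)
print((low + inbound) // MB, "MB in total")  # 80 MB in total

# Hypothetical unit: Pred2-1M with 10 outbound tunnels.
low, high = compression_memory("Pred2", 1 * MB, 10)
print(low // MB, "to", high // MB, "MB for compression")  # 10 to 20 MB
```

Compare the totals against the figure reported by setup compression show to see whether a plan fits within the unit's compression memory.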

Complete instructions for tuning compression, including assigning different compression dictionaries, are available in Increase Bandwidth Capacity — Enabling and Tuning Compression.

        

PacketGuide™ for PacketWise® 7.4