
ave length, best compression etc - Was Re: [idn] THis WG derailed ?



As with all compression algorithms, there is an 'entropy', which is the
minimum length (or maximum compression) you can achieve by eliminating
redundant information. Beyond that, you can achieve better compression
for some particular strings only by causing other strings to expand,
i.e., you have two sets: compressed strings and expanded strings. It is
a give-and-take.
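
A rough way to see the give-and-take is to count strings. The sketch
below is only an illustration: the helper name count_strings and the
choice of 8-bit inputs are my own assumptions, not anything from a
proposal on this list. It shows that there are fewer bit strings shorter
than n bits than there are strings of exactly n bits, so a lossless
encoding cannot shrink every input.

```python
def count_strings(max_len, alphabet_size=2):
    """Count all strings of length 0..max_len over the given alphabet."""
    return sum(alphabet_size ** k for k in range(max_len + 1))

# Hypothetical example size: all bit strings of exactly n = 8 bits.
n = 8
inputs = 2 ** n                   # number of possible n-bit inputs
shorter = count_strings(n - 1)    # number of strings strictly shorter than n bits

# There are fewer short outputs than inputs, so at least one input
# must map to an output that is not shorter than itself.
print(inputs, shorter, shorter < inputs)
```

Whatever the algorithm, shortening some strings has to be paid for by
leaving others the same length or longer.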

Thus, the idea is that if we do have a compression algorithm, then we
want the more frequently occurring strings to be in the compressed
set. Thus, a rearrangement algorithm like LSB, which basically rearranges
certain characters so that they can be compressed better, is generally
a good idea.

OTOH, we do not know whether this rearrangement will produce better
compression in the long run. It may turn out that the strings which
fall in the expanded set are the ones used more often in future.
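
To make that risk concrete, here is a small sketch. It is not LSB or any
proposed encoding: the four symbols, both probability tables, and the
helper names are made up for illustration. Code lengths are chosen as
ceil(-log2 p) from an assumed frequency table, then the average length
is measured against the assumed usage and against a different future
usage.

```python
import math

def code_lengths(probs):
    """Pick a code length of ceil(-log2 p) bits for each symbol."""
    return {s: math.ceil(-math.log2(p)) for s, p in probs.items()}

def average_length(lengths, probs):
    """Expected bits per symbol when symbols occur with the given probabilities."""
    return sum(probs[s] * lengths[s] for s in probs)

# Hypothetical frequencies: what we assume today vs. what is used later.
assumed = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
future = {"a": 0.125, "b": 0.125, "c": 0.25, "d": 0.5}

lengths = code_lengths(assumed)                # tuned to the assumed table
avg_now = average_length(lengths, assumed)     # cost if the guess is right
avg_future = average_length(lengths, future)   # cost if usage shifts
fixed = 2                                      # plain 2 bits per symbol, no compression

print(lengths, avg_now, avg_future, fixed)
```

With these made-up numbers the tuned code beats the fixed 2 bits per
symbol while usage matches the guess, and does worse than 2 bits once
the rarely expected symbols become the common ones.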

There is always a holy grail of compression. And we could spend donkey's
years arguing over it and never get to our goal, i.e., IDN. Let's not
forget that, and hopefully we can get IDN in a timely fashion.

-James Seng