Fast Lossless Color Image Compression

In the research field of compression, the usual goal is to achieve the best possible compression rate. The best known algorithms, however, are very slow, and sometimes impractical for real-world applications.

When dealing with images, the situation is made worse by ever-increasing photo resolutions. Image sizes of 4500×3000 pixels (or more) are not uncommon. Uncompressed, this amounts to about 40 megabytes at 24 bits/pixel, and having to wait several seconds for such an image to load or save is annoying.

London Bridge (Tower Bridge) : Reflection on the River Thames
Photo by Anirudh Koul at Flickr

JPEG, the de-facto standard for image compression, while still one of the fastest methods available, has one major drawback: it is lossy, in other words, whenever the image is saved, some fidelity is lost. For situations where multiple load/save cycles are to be expected, for example in photo manipulation applications, this is unacceptable.

In recent months, I have been evaluating the available lossless image compression algorithms to select one that is fast, but also offers good compression efficiency.

Looking at the Lossless Photo Compression Benchmark, I was a bit disappointed to find that PNG, while quite fast at decompression and offering moderate compression ratios, is very slow when compressing images.

The JPEG-LS format, being significantly faster at compression than PNG, and even offering about 15% better compression ratios, seemed a good choice at first. It has, however, several drawbacks: not only is it slower than PNG at decompression, but HP also has a patent on the context modeling it uses. A no-go.

What does a programmer do when he is not satisfied with what is available? Right, he re-invents the world, only to make it better: last week, I released the initial version of ImageZero (iz, pronounced “easy”), a high-performance lossless RGB photo codec.

Being twice as fast as PNG when decompressing (and more than 20 times faster when compressing), it achieves compression ratios that are near or better than PNG for natural photos, sometimes even better than JPEG-LS for very high quality photos. ImageZero is not intended for grayscale or artificial images, but I may improve the algorithms for those image types.

Method         File Size   Compression   Decompression
uncompressed   46380 KB    -             -
JLS            14984 KB    6.6 s         7.3 s
PNG            16256 KB    42.4 s        2.4 s
IZ             15496 KB    1.2 s         1.3 s
Results for the full-resolution “London Bridge” photo (4507×3512 pixels)

The algorithm used is really simple: the per-pixel compression and decompression is done without any branches (except for a single branch in the bit coder), so it can be executed very fast on today’s superscalar CPUs.
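
To illustrate why the per-pixel loop can stay nearly branch-free, here is a minimal Python sketch of the coding idea (not the actual ImageZero code; the averaging predictor and the zigzag residual mapping are placeholder assumptions): each pixel gets a single length code telling how many bits its three channel residuals need, and the residual bits themselves are stored raw.

```python
def predict(left, up):
    # placeholder 2-neighbor predictor: average of left and upper sample
    return (left + up) // 2

def encode_pixel(rgb, left, up):
    # map signed residuals to non-negative integers (zigzag style)
    residuals = []
    for c, l, u in zip(rgb, left, up):
        d = c - predict(l, u)
        residuals.append(2 * d if d >= 0 else -2 * d - 1)
    # one length code for the whole pixel: bits needed by the largest residual
    # (ImageZero entropy-codes this value; the residual bits are stored raw)
    nbits = max(r.bit_length() for r in residuals)
    return nbits, residuals

# encode one pixel given its left and upper neighbors
nbits, res = encode_pixel((120, 130, 125), (118, 128, 124), (122, 131, 126))
```

Because the residual bits are emitted verbatim, only the per-pixel length code needs entropy coding, which is what keeps the hot loop almost free of branches.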

Possible uses include:
– image/thumbnail databases
– backing store for image processing apps
– future icon format

If you are interested, please help shape the library API and file format for .iz files. We can use the kde-imaging mailing list for discussion.


44 thoughts on “Fast Lossless Color Image Compression”

  1. Wow, I am amazed. You needed something better, you started coding, and you got it, with great results.

    I wonder, though, in which aspects .iz files would be worse than .png, other than the general unavailability (which is to be expected at first). I mean, do they have all PNG features, only better? Or are they just for a specific purpose, with PNG being better for general-purpose things?

    For example, could this new compression algorithm replace PNG in the web in the future? Why or why not?

  2. How about running the whole benchmark?

    And don’t count on people pronouncing iz (i zee or i zett) as easy. They mostly won’t, just like with Qt.

  3. @Gallaecio, ImageZero is worse than PNG for non-photographic images, making it unsuitable for the Web. Also, PNG supports an alpha channel and 48-bit depth, which have not been implemented in IZ yet.

    @smudo, the complete LPCB results are in the forum, where I made the initial release.

  4. How about JBIG compression? It is used in printers where they download 1200 dpi full page images pretty fast.

  5. Okay, WebP offers a lossless mode in a separate “experimental” branch. I will check that out later today.

    @Rob, JBIG is for black&white images only.

  6. Any chance of a short explanation of your algorithm? In the forum you mention a 3-pixel predictor; how does this compare to WebP’s predictor?

    Also, please add a way to attach an optional alpha channel to your pictures. It might not be the main use, but as a web designer I have often needed natural (or close to natural) photographs with alpha.

  7. I’m very impressed with your results and interested in looking at your code. I’m from a statistics/information theory background, so I will be very pleased if I can be useful (although I can offer very little time).

  8. Did you try it with RAW images like DNG? The pixel layout is different, as they use a Bayer matrix; I wonder if this changes your results. Photographers would welcome any faster compression algorithm for them.

  9. Talking about containers: what prevents you from using the existing PNG container for the iz bitstream? For photographic content, metadata handling is as important as the pixels themselves, and you surely don’t want to reinvent the wheel in this area…

  10. Hi,

    I just checked out and compiled your iz on the Mac. Works great. You could add a makefile, though.
    A test photo of mine (Canon 7D, 3849×2566, CR2 file) showed the best compression ratio with iz compared to TIFF (with various compression options like LZW etc.). iz even beat bzip2 --best. Wow. And, yes, it’s fast.

    ppm (uncompressed): 29.6 MB
    bzip2 --best: 16.4 MB
    tiff: 15 MB
    iz: 11.3 MB

  11. I’m curious how this compares to JPEG 2000 in terms of speed and overall compression ratio. Are there any constraints on how iz scales to gigapixel+ images?

  12. Can’t wait to see how the timings look when webp with lossless mode is added to the mix.

  13. What about PGF (Progressive Graphics File)? On the digiKam mailing list it was advertised as the better alternative to JPEG and PNG for lossless saving.

  14. Wow, that is a dumb algorithm; that’s a good thing, I think. I’m impressed. I would never even have tried, assuming that image compression is so well studied that it would take something really clever to beat what is already available.

  15. Thanks for your comments and interest (even Disney is discussing it 😉

    I could only test WebP lossless compression: 797 s for the above image; the file size is 12699 KB. Decompression to an uncompressed format (PPM or BMP) is not supported by the current webpll command line tools.

    As for all other formats, they are all designed for better (and slower) compression, see the LPCB link above how they compare to PNG.

    For those interested, the state-of-the-art codecs from Alexander Ratushnyak reach 8871 KB in 54.0 s (Gralic), and 10962 KB in 9.1 s (Flic), but they are closed-source.

    I will answer more comments tomorrow.

  16. @ben, GPU-based compression requires OpenCL/CUDA, something my Intel 945 based system does not have.

    @Afiefh, read the source, it really is a dumb algorithm (thanks to manianlift for finding the right word). The basic idea is to store one Huffman code (not three) for each RGB pixel stating how many bits are needed for storing the residuals after prediction. Then the residual bits are stored uncompressed.

    The algorithm spends half of its time in the prediction, which means anyone with some MMX/SSE knowledge could make it much faster by computing the RGB predictions in parallel, and I hope someone contributes an MMX/SSE implementation.

    You could also make it faster by using a simpler prediction (e.g. using only 2 pixels). Here, using “predict2” instead of “predict3”, it needs only 0.65 s for compression, but the file size is bigger (16414 KB). In practice, however, the disk speed limits the “perceived” time, so trading size for speed might even have a negative effect (if it did not, you would simply store uncompressed).

    As for alpha channel support, I lack any photos with alpha channel, so I cannot test any implementation. Hopefully Krita or GIMP users make lots of them available.

    @Fabrice, no, I do not plan to implement RAW modes, but 48 bit RGB support is on the todo list.

    @Alexander Bokovoy, if the PNG container format allows aligning to 4, 8, or 16 bytes, it certainly is an option. The bit coder will be slower with byte-aligned streams.

    @Sam, bzip2 is not an image compressor, so you cannot compare it to IZ.

    @Rich, I doubt that you want to store gigapixel images losslessly, given that you can only expect a 1:3 ratio. Anyway, IZ is currently limited to 16384×16384 pixels; for anything larger you would need to compress tiles separately, but there is no API yet to support tiling.

    @lzap, lossy formats are inherently faster, because they have to code far fewer bits.
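
    The “predict2” vs. “predict3” trade-off mentioned above can be sketched as follows (these are common textbook predictors given as an assumption; the actual IZ predictors may differ):

```python
def predict2(left, up):
    # hypothetical 2-neighbor predictor: average of the left and upper samples
    return (left + up) // 2

def predict3(left, up, upleft):
    # hypothetical 3-neighbor gradient predictor (the basis of the JPEG-LS
    # median predictor): left + up - upleft, clamped to the 8-bit range
    return max(0, min(255, left + up - upleft))
```

    The gradient form follows smooth intensity ramps and therefore tends to produce smaller residuals, at the cost of one extra neighbor fetch and the clamp.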

  17. Christoph, you’ll just need to select a new chunk type and dump your stream into it. A chunk begins with 4 (length) + 4 (type) bytes, then the data stream, then 4 bytes of CRC. It is up to the decoder how to process the data stream in (non-standard) chunks, so you can select whatever alignment you like.
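
    Wrapping a foreign bitstream in a private PNG chunk is indeed only a few lines of code. A minimal sketch (the chunk type “izDt” is a made-up example, not a registered PNG chunk type):

```python
import struct
import zlib

def png_chunk(chunk_type: bytes, data: bytes) -> bytes:
    # PNG chunk layout: 4-byte big-endian data length, 4-byte chunk type,
    # the data itself, then a CRC-32 computed over type + data (not the length)
    return (struct.pack(">I", len(data))
            + chunk_type
            + data
            + struct.pack(">I", zlib.crc32(chunk_type + data) & 0xFFFFFFFF))

# hypothetical private chunk carrying an IZ bitstream (8 zero bytes here)
chunk = png_chunk(b"izDt", b"\x00" * 8)
```

    Note that the CRC covers the type and data fields but not the length field, so a decoder can skip unknown chunks by reading the length alone.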

  18. Folks, what are you doing here?

    1) LS is *definitely* a lot faster if implemented properly. The algorithm is very, very simple, and will have a faster throughput than PNG for certain, whereas PNG uses a quite inappropriate dictionary approach for compression.

    2) The HP patent is available royalty-free, as is any other patent that applies to a JPEG (baseline) standard. Yes, Weinberger at HP Labs designed LS, and yes, HP did agree to provide free access to LS. This is recorded at the ISO level, so there is really nothing to worry about. Rather the reverse: who has patents on your algorithm, actually?

  19. > LS is *definitely* a lot faster if implemented properly.

    From what I see at the LPCB, the fastest implementation of JPEG-LS is the CharLS library, which is still 50% slower than PNG for decompression.

    Since IZ in its default mode is twice as fast compared to PNG, IZ is *three* times as fast as CharLS.

    If your “proper” implementation is faster, show me the source code. I am certainly interested.

    Anyway, if you read correctly, I already wrote that performance wasn’t really the issue that ruled out JPEG-LS for my use case…

    > The HP patent […] nothing to worry about

    “To be licensed, a party must submit a Registration which is completely and accurately filled-out.
    Licensee may not assign the license granted hereunder to any party at any time without the prior explicit written consent of HP, except to an acquirer of substantially all of Licensee’s assets with written notice to HP.”

    In other words, if I release a codec, which uses JPEG-LS technology, every (commercial) user of that codec would also have to register with HP to obtain the license.

    I have never seen such a thing for JPEG-1. Adoption of JPEG-LS (as well as JPEG 2000) failed because of those obscure license terms.

    > Rather the reverse, who has patents on your algorithm, actually?

    Patents on Huffman codes? Don’t be ridiculous.

  20. OK, so here are the times. This is the plain, stupid reference code on a nice huge image:

    thor@aroree:~/src/jpeg_ls_v2.2/Decoder> time locod -ir2000.jls -oout.pgm

    real 0m44.762s
    user 0m26.406s
    sys 0m1.164s

    …and this is from png:

    thor@aroree:~/src/jpeg_ls_v2.2/Decoder> time pngtopnm out.png >out2.pgm

    real 1m26.761s
    user 0m21.385s
    sys 0m1.580s

    As for the patents: if you have never seen this for JPEG-1, you haven’t actually read the standard. I recommend looking specifically into Annex L; you can get a free copy of ITU-T T.81 from the ITU, AFAIK, if you know where to look. It clearly states (as for all other JPEG standards, actually) that patents apply to this algorithm and that you need to obtain a license, just that you do not need to pay for it. Of course, nobody did. And probably nobody will for LS, but formally, yes, you have to do this as much for JPEG-LS as for JPEG 2000 and for JPEG; the rules are exactly the same.

    You should also note that probably every nonsense is patented today (look, for example, at the “Mr. Sid” patent if you don’t believe me), so having only Huffman coding does not make your code “patent free”. But at least I have reason to believe that HP will have a bit more power to fight for free access to an ISO standard (as it did for JPEG-1 against Forgent, aka “Compression Labs”, if anyone remembers) than a small private entity, so this is hardly a good argument to bring up here.

    If you care about higher speed than plain JPEG-LS, I can certainly provide a couple of papers where this algorithm has been streamlined, if that is in your line of interest. The email is valid. It’s more of an academic interest, of course, as likely nobody is going to apply such modifications that no longer adhere to the standard.

  21. > nice huge image

    How big before/after compression? Is this a natural photo? To me, it looks like you are using an artificial image, where the run-length mode of JPEG-LS is used very often, which is a lot faster than regular coding; but without seeing the image, this is only a blind guess.

    > It states there clearly – as for all other JPEG standards actually – that patents apply to this algorithm

    It states the patents apply to the arithmetic coding mode of JPEG, which nobody used (because of the patents). And here is the difference to JPEG-LS and JPEG-2000: Even the baseline mode is patented in both.

    Anyway, thanks for your offer of possible JPEG-LS optimizations. But using a modified version could be an infringement of the patent license HP offers (which requires that you follow the standard exactly).

  22. Alexander, I just checked the PNG specification. It explicitly states: “The chunk data length can be any number of bytes up to the maximum; therefore, implementors cannot assume that chunks are aligned on any boundaries larger than bytes.”

  23. Christoph, this is implied for standard chunks. It is up to the compressor of a non-standard chunk to pad the data as needed, because all non-standard chunks will be ignored by existing implementations that do not understand them.

  24. Yes, you can force any alignment by prepending pad bytes. This means, however, that the result differs depending on the alignment of the previous chunks.

  25. The image here is a natural image (not an artificial one), specifically an aerial photo of a landscape. The original size is about 400 MB; the compressed size is 230 MB for PNG and a bit below 200 MB for LS. There aren’t any “flat” areas in this image (no single-tone regions).

    The fine part about LS is that it is pretty cache-friendly and that its history is rather limited.

    If you look into the patents, you will also find that it is not only AC-coding patents that are covered; the DPCM coder is covered as well. The difference is just that some patents are available royalty-free; the AC coder was not.

    The baseline in JPEG 2000 and LS is patented, but as stated, it would cost you nothing to get access to the patent, so this is a strange argument to come up with.

  26. Looks very interesting! I have basic knowledge of data compression from a university course, but I would like to gain more practical experience in this area, therefore I am thinking about writing an FPGA and/or GPGPU implementation (open source, of course) of your algorithm. 🙂

  27. Very interesting. I was working on a JPEG 2000 hardware compressor about a year ago; do you have any detailed comparison between the two, especially in timing and compression rate (in lossless mode)?

    I’ll try to get more information about possible hardware implementation, could be fun!

  28. IZ files are about 20% bigger than JPEG 2000 in lossless mode. I cannot speak for hardware implementations (never done any beyond 7400 NANDs 😉 But if software complexity is an indication of hardware complexity, then implementations of IZ should be similar in speed to JPEG-LS implementations, but simpler, because of the primitive algorithm IZ uses.

  29. Is there working code for VS2010? I tried to make a project for it, but it required all kinds of missing Linux-to-MS routines. I managed to get it to start, but I get a weird exception right at the start.


  30. OpenCL on x86 has the niche benefit that it supports multi-core CPUs out of the box. Victor Oliveira implemented OpenCL in GIMP/GEGL with very nice results, even on systems without a GPU.

  31. Can you send me the algorithm, please? I’m writing a paper on color image compression methods, so it would help me.
