Compression for uzlib

Discussion about programs, libraries and tools that work with MicroPython. Mostly these are provided by a third party.
Target audience: All users and developers of MicroPython.
Maksym Galemin
Posts: 10
Joined: Mon May 28, 2018 11:48 pm

Compression for uzlib

Post by Maksym Galemin » Thu Jan 31, 2019 9:29 am

I'm wondering why the uzlib module doesn't support compression. Granted, in the current implementation of the tgzip example (uzlib ver. 2.9.2) the whole input file has to be copied into RAM, but looking at the source code I can't find anything that makes it impossible to split the input file into a number of data chunks and compress them separately into gzip "members" (or "compressed data sets"; see RFC 1951 and RFC 1952 for details). The final compression ratio won't be as good as with a single huge gzip "member", but even with 8192-byte data chunks per gzip "member" the results don't look that bad. In any case, I think it's better to have somewhat less efficient compression in the uzlib module than not to have it at all.
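The "members" idea above is easy to check outside of uzlib: RFC 1952 allows a gzip file to be a plain concatenation of independent members, and a conforming decompressor yields the concatenation of their payloads. A minimal sketch using CPython's gzip module as a stand-in (uzlib is not involved here):

```python
import gzip

# Two independently compressed gzip members...
member1 = gzip.compress(b"hello ")
member2 = gzip.compress(b"world")

# ...concatenated back to back form one valid multi-member gzip stream.
stream = member1 + member2
assert gzip.decompress(stream) == b"hello world"
```

This is exactly why compressing chunk-by-chunk works: each chunk becomes its own self-contained member, and no decompressor-side support beyond RFC 1952 is required.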

Example main() code from tgzip.c (just a quick hack):

Code:

#include <stdio.h>
#include <string.h>
#include "uzlib.h"

/* exit_error() is the small helper from uzlib's tgunzip example. */

int main(int argc, char *argv[])
{
    FILE *fin, *fout;
    unsigned int len;

    printf("tgzip - example from the uzlib library\n\n");

    if (argc < 3) {
        printf("Syntax: tgzip <source> <destination>\n\n"
               "Both input and output are kept in memory, so do not use this on huge files.\n");
        return 1;
    }

    /* -- open files -- */

    if ((fin = fopen(argv[1], "rb")) == NULL) exit_error("source file");
    if ((fout = fopen(argv[2], "wb")) == NULL) exit_error("destination file");

    /* -- determine source length -- */

    fseek(fin, 0, SEEK_END);
    len = ftell(fin);
    fseek(fin, 0, SEEK_SET);

    unsigned crc;
    size_t size_remaining = len;
    unsigned char source[8192];
    unsigned int hash_bits = 12;
    size_t hash_table_size = sizeof(uzlib_hash_entry_t) * (1 << hash_bits);
    uzlib_hash_entry_t hash_table[1 << hash_bits];

    while (size_remaining > 0) {
        size_t bytes_to_read = (size_remaining > sizeof(source)) ? sizeof(source) : size_remaining;
        if (fread(source, 1, bytes_to_read, fin) != bytes_to_read) exit_error("read");

        /* -- compress this chunk into its own deflate stream -- */

        struct uzlib_comp comp = {0};
        comp.dict_size = 32768;
        comp.hash_bits = hash_bits;
        comp.hash_table = hash_table;
        memset(comp.hash_table, 0, hash_table_size);

        zlib_start_block(&comp.out);
        uzlib_compress(&comp, source, bytes_to_read);
        zlib_finish_block(&comp.out);

        /* -- write one gzip member: header, deflate data, trailer -- */

        putc(0x1f, fout); // ID1
        putc(0x8b, fout); // ID2
        putc(0x08, fout); // CM = deflate
        putc(0x00, fout); // FLG
        int mtime = 0;
        fwrite(&mtime, sizeof(mtime), 1, fout); // MTIME
        putc(0x04, fout); // XFL
        putc(0x03, fout); // OS = Unix
        fwrite(comp.out.outbuf, 1, comp.out.outlen, fout);

        crc = ~uzlib_crc32(source, bytes_to_read, ~0);
        unsigned int isize = bytes_to_read; /* ISIZE is 4 bytes; don't fwrite a size_t here */

        fwrite(&crc, sizeof(crc), 1, fout);     // CRC32 of the uncompressed chunk
        fwrite(&isize, sizeof(isize), 1, fout); // ISIZE

        size_remaining -= bytes_to_read;
    }

    fclose(fin);
    fclose(fout);

    return 0;
}
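The loop above can be mirrored in CPython, using zlib's raw-deflate mode (negative wbits, i.e. no zlib wrapper) as a stand-in for uzlib_compress, and writing the same RFC 1952 header and trailer around each chunk. This is a sketch of the chunking scheme, not of uzlib's API:

```python
import gzip
import struct
import zlib

def gzip_member(chunk: bytes) -> bytes:
    """Wrap one chunk as a self-contained gzip member (RFC 1952)."""
    # Raw deflate stream, no zlib wrapper (wbits = -15).
    co = zlib.compressobj(9, zlib.DEFLATED, -15)
    deflated = co.compress(chunk) + co.flush()
    # ID1, ID2, CM=deflate, FLG=0, MTIME=0, XFL, OS=Unix.
    header = b"\x1f\x8b\x08\x00" + struct.pack("<I", 0) + b"\x04\x03"
    # Trailer: CRC32 and ISIZE of the *uncompressed* chunk, little-endian.
    trailer = struct.pack("<II", zlib.crc32(chunk) & 0xFFFFFFFF,
                          len(chunk) & 0xFFFFFFFF)
    return header + deflated + trailer

payload = b"abc" * 1000
# Compress in 256-byte chunks, one gzip member per chunk.
stream = b"".join(gzip_member(payload[i:i + 256])
                  for i in range(0, len(payload), 256))
# A standard decompressor reads the concatenated members as one payload.
assert gzip.decompress(stream) == payload
```

The per-member overhead (10-byte header plus 8-byte trailer, and a fresh deflate dictionary each time) is the price of bounded RAM usage, which matches the compression-ratio trade-off described above.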

pfalcon
Posts: 1126
Joined: Fri Feb 28, 2014 2:05 pm

Re: Compression for uzlib

Post by pfalcon » Fri Feb 01, 2019 8:51 am

Maksym Galemin wrote:
I'm wondering why uzlib module doesn't support compression?
Obvious reason would be that nobody did that?

And doing that properly would require elaborating the API of the underlying C library (also called uzlib).
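For a sense of what "elaborating the API" typically means, CPython's zlib exposes a streaming compressobj: callers feed data incrementally and flush at the end, rather than passing one in-memory buffer. This is CPython's API shown as a reference point, not uzlib's:

```python
import zlib

# Incremental compression into a raw deflate stream (wbits = -15, no container).
co = zlib.compressobj(9, zlib.DEFLATED, -15)
parts = [co.compress(b"chunk one "), co.compress(b"chunk two"), co.flush()]
deflated = b"".join(parts)

# The pieces reassemble into one valid deflate stream.
assert zlib.decompress(deflated, -15) == b"chunk one chunk two"
```

An equivalent incremental interface on the C side (feed/flush, caller-supplied output buffer) is the kind of design work being referred to.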
Awesome MicroPython list
Pycopy - A better MicroPython
MicroPython standard library for all ports and forks -
More up to date docs -
