Thread: Any money in data compression?

  1. #1
    Member
    Join Date
    Feb 2010
    Location
    Grenada
    Posts
    1
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Any money in data compression?

    Hi there,

    I was just wondering. Is there any money in independent data compression these days?

    I mean, if I was to develop a compression method, with a ratio, say, fairly better than paq, who would be interested (enough to spend money on it)?

    Interested in your views.

  2. #2
    Member
    Join Date
    Apr 2009
    Location
    The Netherlands
    Posts
    49
    Thanks
    0
    Thanked 3 Times in 2 Posts
    I think there is money in data compression, if you can create an efficient compressor.
    It should have a better compression ratio, be faster in compression and decompression, and use fewer resources than comparable algorithms... and be especially good at compressing audio, video or images. Maybe you will get noticed by a big company?
    Some of the algorithms listed here are better than commercial algorithms, but too slow or too memory consuming to be practical.

  3. #3
    Member
    Join Date
    May 2008
    Location
    Germany
    Posts
    410
    Thanks
    37
    Thanked 60 Times in 37 Posts
    i think yes!

    but there must be a free (open) decompressor
    (for example like unrar does)

    or
    better the compression-format is a known, well accepted format
    like 7z with full directory-tree-support

    because for example
    i want to use compression for archival purposes
    and i must have the guarantee that i can uncompress the archive

    ---

    interesting imho seems for example to have

    a program which can read an uncompressed 7z-file and recompress it

    or

    a program which uses the bzip2 method within 7z-format
    and implements a massively multithreaded variant
    with the result of having good and very fast compression

    best regards
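A minimal sketch of the multithreaded bzip2 idea from the post above, assuming plain concatenated bzip2 streams rather than the 7z container the poster asks for. The function name `parallel_bzip2` and its parameters are hypothetical. Each block is compressed independently in a thread pool, and the resulting streams are joined; Python's `bz2.decompress` accepts such multi-stream input.

```python
import bz2
from concurrent.futures import ThreadPoolExecutor


def parallel_bzip2(data: bytes, block_size: int = 900_000, workers: int = 4) -> bytes:
    """Compress fixed-size blocks independently and concatenate the streams.

    Hypothetical helper for illustration; the output is a sequence of
    ordinary bzip2 streams, not a 7z archive.
    """
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in input order, so the streams stay in sequence.
        streams = list(pool.map(bz2.compress, blocks))
    return b"".join(streams)


if __name__ == "__main__":
    sample = b"the quick brown fox jumps over the lazy dog\n" * 100_000
    packed = parallel_bzip2(sample)
    # bz2.decompress handles several concatenated streams in one call.
    assert bz2.decompress(packed) == sample
    print(len(sample), "->", len(packed))
```

Splitting into independent blocks costs a little ratio, because each block starts with no shared history, but it lets every core work at once. Whether that trade-off suits an archival format is a separate question.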

  4. #4
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,255
    Thanks
    306
    Thanked 779 Times in 486 Posts
    Yes, but not by selling your program. There is already a lot of high quality compression software that is free and open source. It is hard to compete against that.

    The way I made money in compression is by giving my software away as open source. This got the notice of companies that are willing to pay me good money to write software for them. That software is usually specific to specialized or obscure formats where you can do better than general purpose programs like PAQ or SR2 (depending on whether you want size or speed).

  5. #5
    Programmer michael maniscalco's Avatar
    Join Date
    Apr 2007
    Location
    Boston, Massachusetts, USA
    Posts
    113
    Thanks
    11
    Thanked 88 Times in 26 Posts
    Quote Originally Posted by Matt Mahoney View Post
    The way I made money in compression is by giving my software away as open source. This got the notice of companies that are willing to pay me good money to write software for them.
    Matt, you beat me to it. This is exactly correct.

    You won't likely make money from your work. But you can dramatically increase your value as a developer. And this leads to opportunities for a better salary and better career. At least, it did for me.

    - Michael Maniscalco

  6. #6
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,255
    Thanks
    306
    Thanked 779 Times in 486 Posts
    Quote Originally Posted by joerg View Post
    i think yes!

    but there must be a free (open) decompressor
    (for example like unrar does)
    Or make the compressor free and charge money for the decompressor. (Or to be even more evil, be sure the compressor deletes the original.)

  7. #7
    Member biject.bwts's Avatar
    Join Date
    Jun 2008
    Location
    texas
    Posts
    449
    Thanks
    23
    Thanked 14 Times in 10 Posts

    Quote Originally Posted by Matt Mahoney View Post
    Yes, but not by selling your program. There is already a lot of high quality compression software that is free and open source. It is hard to compete against that.

    The way I made money in compression is by giving my software away as open source. This got the notice of companies that are willing to pay me good money to write software for them. That software is usually specific to specialized or obscure formats where you can do better than general purpose programs like PAQ or SR2 (depending on whether you want size or speed).
    I like your view. I had offers a few years ago but it wasn't clear if they were for a real job or a prank. I think that you offer good advice. I wish a real company would offer me a job where I could stay at home on my or their PC and write one or two specialized routines a year. It hasn't happened but it might.

    The way I made money was to get noticed at work as a strange person always playing with code on a machine. I tended to take stuff people wrote in Fortran, COBOL or Pascal (it was a phase for a couple of years) and then made it better and faster in machine code. I say it made me money because they did not fire me even with my antisocial skills and kept giving me raises.

    I don't do much machine coding any more but it was fun and a great start. I was not liked by much of management but in a crunch I always had work during the cold war. My interest was more in the crypto field than in compression. I was known for it and was used to break schemes at work.

    I had hoped when I retired the NSA would hire me but alas that did not happen. But compression is so close to encryption that it is what got my interest. Maybe the NSA will take a second look at my stuff. I think that money is not everything. Changing Matt's code allowed me to reach some fame. And Mark Nelson was a major help in getting me to do the BWTS thing.

    Which means they might hire me, kill me or ignore me. If you see me posting every so often you can be pretty sure option 3 (ignore me) is in effect.

    Have fun you only live once

    David A. Scott

  8. #8
    Programmer
    Join Date
    May 2008
    Location
    PL
    Posts
    309
    Thanks
    68
    Thanked 173 Times in 64 Posts
    IMHO compression is very popular (especially lossy audio, image, and video) and there are some companies willing to pay for new, better algorithms. You can get some money by selling licenses for your compressor if it stands out with high speed or high compression, but it must be open-source (e.g. free for non-commercial usage) as nowadays the competition is tough and it's hard to sell a closed-source compressor. A good example is LZO, which can be used with GPL (with a UCL compression library) or with a commercial license (with a more powerful NRV compression library). On the other hand, Christian Martelock wrote some very good, but closed-source programs (ccm, rzm, slug), which are the best or among the best in their class. Currently he is trying to earn some money on them. I wish him the best, but I think it will be very hard because of the closed-sourceness.

    The second possibility is to sell not just an implementation of an algorithm, but the algorithm itself (or a patent for it). I can imagine that somebody will make up something more powerful than MPEG-4 AVC/H.264 or JPEG-2000 and will get a huge amount of money for this.
    Last edited by inikep; 1st February 2010 at 19:44.

  9. #9
    Member biject.bwts's Avatar
    Join Date
    Jun 2008
    Location
    texas
    Posts
    449
    Thanks
    23
    Thanked 14 Times in 10 Posts

    Quote Originally Posted by michael maniscalco View Post
    Matt, you beat me to it. This is exactly correct.

    You won't likely make money from your work. But you can dramatically increase your value as a developer. And this leads to opportunities for a better salary and better career. At least, it did for me.

    - Michael Maniscalco
    If he is exactly correct, why can't you take his advice to use in your own life? I could be wrong, but isn't a lot of the stuff you wrote closed source?
    Such as M03 or M99. I could be wrong, maybe the source is available. After all, I was wrong about Scott; it looks like he is your Senator.

  10. #10
    Member m^2's Avatar
    Join Date
    Sep 2008
    Location
    Ślůnsk, PL
    Posts
    1,611
    Thanks
    30
    Thanked 65 Times in 47 Posts
    Quote Originally Posted by Kaw View Post
    Some of the algorithms listed here are better than commercial algorithms, but too slow or too memory consuming to be practical.
    So they are not quite "better".

  11. #11
    Member biject.bwts's Avatar
    Join Date
    Jun 2008
    Location
    texas
    Posts
    449
    Thanks
    23
    Thanked 14 Times in 10 Posts
    Here is an idea to make money or fame in the compression game. I have recently looked at how well DNA compression works. Someone showed me a page of the compression efforts on E. coli; no one compresses it very well. Well, I have a goal to compress it better. Most of my approaches will not work; in fact I tried a BWTS and got some compression, but it got better when the BWTS part was not done. I may go back to using it, but it has given me other ideas. I am not going to be very fast in that I am lazy, but I think I am zeroing in on the problem. It's great fun learning more about DNA, the code of life, so it's a learning experience.

    When I get code that works well, I will offer it to a company. Of course they will most likely not answer. Then I will release it, knowing it's better than what the company currently has. This could lead to a job. But at the very least, if the company you wrote to was making money on DNA compression software, it's possible their customers will find out about it and use your free software. It's a win-win situation.


  12. #12
    Programmer michael maniscalco's Avatar
    Join Date
    Apr 2007
    Location
    Boston, Massachusetts, USA
    Posts
    113
    Thanks
    11
    Thanked 88 Times in 26 Posts
    Quote Originally Posted by biject.bwts View Post
    If he is exactly correct, why can't you take his advice to use in your own life? I could be wrong, but isn't a lot of the stuff you wrote closed source?
    Such as M03 or M99. I could be wrong, maybe the source is available. After all, I was wrong about Scott; it looks like he is your Senator.
    I wasn't speaking about open source. I was arguing that you make money by demonstrating your worth as a developer through your achievements. No one needs to see the source code for M03 in order to understand the achievement itself.

    Besides, MSufSort is open source and has been since day one (most of DivSufSort's improvements of the last few years have come from adopting the algorithms from the source code of MSufSort version 3.x).

    I've documented how M03 works in my own words as well as to others on more than one occasion. I even spent a great deal of time documenting the technique for Dr Juergen Abel and it will eventually appear in a paper of his. But I haven't shared my personal implementation of it because I'm not done playing with it yet. (^:

    - Michael

  13. #13
    Member biject.bwts's Avatar
    Join Date
    Jun 2008
    Location
    texas
    Posts
    449
    Thanks
    23
    Thanked 14 Times in 10 Posts

    Quote Originally Posted by michael maniscalco View Post
    I wasn't speaking about open source. I was arguing that you make money by demonstrating your worth as a developer through your achievements. No one needs to see the source code for M03 in order to understand the achievement itself.

    Besides, MSufSort is open source and has been since day one (most of DivSufSort's improvements of the last few years have come from adopting the algorithms from the source code of MSufSort version 3.x).

    I've documented how M03 works in my own words as well as to others on more than one occasion. I even spent a great deal of time documenting the technique for Dr Juergen Abel and it will eventually appear in a paper of his. But I haven't shared my personal implementation of it because I'm not done playing with it yet. (^:

    - Michael
    I am sorry, I get confused with English. I guess it's like the signs that say FREE PHONE. I go in and bug them and I never get the free phone. Matt in the reply was talking about open source and you stated you "exactly agreed" with him. I didn't realize that "exactly" has different meanings. I guess that's why I have trouble communicating with others in English.

    Dave

  14. #14
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,378
    Thanks
    215
    Thanked 1,025 Times in 546 Posts
    > I was just wondering. Is there any money in independent
    > data compression these days?

    Compression is already used in many applications - it
    allows cutting costs on any kind of storage or transfer of
    large volumes of data. And popular algorithms are far from
    perfect (ppmd and lzma are nearly 10 years old already),
    so any noticeable improvements would be welcome.
    For example: file hosting/online backups, any large DBs/archives,
    program installers, game resources, etc.
    But it's a bit unlikely to make millions in this area,
    unless you'd create a consulting company or something -
    specific applications are too different, so you'd have to
    do some additional work for any new customer, and although
    the money saved by improved compression is measurable,
    there's usually not that much of it.

    Well, there might still be a chance with video compression,
    but it's really a lot of work, hardly for a single person.

    > I mean, if I was to develop a compression method, with a
    > ratio, say, fairly better than paq, who would be
    > interested (enough to spend money on it)?

    ...Who, well, ... Marcus Hutter? http://prize.hutter1.net/
    (Also http://www.mailcom.com/challenge/)

    But talking about "paq compression ratio" in general
    is actually a bad idea. Paq8 is kind of a bundle of 6-7
    unrelated compressors, and most of these are not especially good.
    Only the text model is the world's #1 or #2, but even if you'd
    beat that, it's unlikely that somebody (including Hutter)
    would pay for that, as plaintext volume is basically nothing
    compared to other content (and enwiki is not plaintext).

    However, there're different applications for CM
    compressors (including paq), unlike other approaches like
    LZ or BWT. Lossless CM can be rather easily turned to lossy,
    and its also possible to use it for prediction and data
    generation for many tasks. Basically, all the traditional
    applications for neural networks etc, but with data entropy
    as an objective model quality measure.
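To illustrate the point above about data entropy as an objective measure of model quality, here is a rough sketch that scores a predictive model by the ideal code length it assigns to the data (the sum of -log2 p over all symbols). The adaptive order-N byte model with add-one smoothing is an assumption chosen for brevity; it is far simpler than the context mixing used in paq, and the name `code_length_bits` is hypothetical.

```python
import math
from collections import defaultdict


def code_length_bits(data: bytes, order: int) -> float:
    """Ideal code length, in bits, of data under an adaptive order-N byte model.

    Probabilities use add-one (Laplace) smoothing over a 256-symbol alphabet.
    This is an illustrative toy model, not the model used by any real compressor.
    """
    counts = defaultdict(lambda: defaultdict(int))  # context -> symbol -> count
    totals = defaultdict(int)                       # context -> total count
    bits = 0.0
    for i, sym in enumerate(data):
        ctx = data[max(0, i - order):i]             # up to `order` preceding bytes
        p = (counts[ctx][sym] + 1) / (totals[ctx] + 256)
        bits += -math.log2(p)                       # cost of coding this symbol
        counts[ctx][sym] += 1                       # update the model afterwards
        totals[ctx] += 1
    return bits


if __name__ == "__main__":
    text = b"ab" * 5000
    for order in (0, 1, 2):
        print(order, round(code_length_bits(text, order)))
```

A model that predicts the data better produces a smaller total, which is the size an arithmetic coder driven by that model would approach. The same number can rank models for prediction tasks where no file is ever written.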

  15. #15
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,255
    Thanks
    306
    Thanked 779 Times in 486 Posts
    I think publishing any good software whether it is open source or closed source will get you noticed by companies that need something similar. There is no disadvantage to an open source license like GPL because if a company wants to use the code in their product they will probably want to keep the software proprietary so they still need to buy a license. The most practical thing for them to do is hire you, because nobody can modify or maintain the code better than the person who wrote it, even if it is well documented. When Ocarina hired me, they also got to use all of the code I wrote earlier without the GPL restrictions.

    The advantage of open source is that others can modify it and improve it. PAQ would not have gotten where it is without about 20 other people contributing to the code. Ocarina hired some of them too.

    It will take patience. PAQ is about 10 years old (going back to the neural network compressor P5 in 2000). It takes years to develop good algorithms. PAQ1 (Jan. 2002) was not that great a compressor until Serge Osnach added SSE to it in May 2003. The idea had never occurred to me. Then there was a burst of interest starting in Jan. 2004 when a variation of PAQ6 won the Calgary challenge.

    BTW there has not been a Calgary challenge winner since May 2006. I think there is room for improvement. The fame will be worth a lot more than the prize money, I think.

  17. #16
    Member
    Join Date
    Sep 2008
    Location
    France
    Posts
    863
    Thanks
    461
    Thanked 257 Times in 105 Posts
    I can only concur with Matt's view.
    As an unexpected event, it just happened to me this year; while my contribution in this area has been, to be fair, very small compared to most other contributors in this forum (it was just a hobby to begin with...), it proved enough to get noticed.
    Last edited by Cyan; 4th February 2010 at 19:45.

  18. #17
    Member
    Join Date
    Aug 2018
    Location
    United States
    Posts
    7
    Thanks
    1
    Thanked 3 Times in 3 Posts
    We are currently about 4 months away from our completed prototype for compression. Let's say hypothetically we can compress nearly any size/type of data to less than 1% of its original size (the more data, the better the compression ratio). We would be looking to license this tech to the giants. Does anyone know if this is something they would be interested in licensing, as we do not plan to sell? Right now it fully works on paper and manual input. We just need to automate a few things.

  19. #18
    Member
    Join Date
    Sep 2018
    Location
    Philippines
    Posts
    38
    Thanks
    22
    Thanked 0 Times in 0 Posts
    Quote Originally Posted by Blindtech View Post
    We are currently about 4 months away from our completed prototype for compression. Let's say hypothetically we can compress nearly any size/type of data to less than 1% of its original size (the more data, the better the compression ratio). We would be looking to license this tech to the giants. Does anyone know if this is something they would be interested in licensing, as we do not plan to sell? Right now it fully works on paper and manual input. We just need to automate a few things.
    1%. That's impossible, I think. But today's compression algorithms are indeed mostly symbol/string replacement algorithms.

    I also thought of such an algorithm that compresses at a 99% compression ratio. But (maybe) no decoder could be created. Then it became a 90% ratio. Then it degenerated to LZ77, or simple enumeration of strings into codes. Then your compression functions are maybe just equivalent to circular/ring buffers or MTF, and so on.

    ***

    With regard to the open source comments in this thread, e.g. Matt Mahoney's, I believe you should not give the source code away to the public if you have a "super" compression algorithm. You keep and guard the source code first, contact the software tech giants you choose, and sell your compressor to them. Let them buy your executable and source code or algorithm.

    Many things could go wrong too: you plug in your USB flash drive with your program, and they have actually copied it already. They're the experts, they can even hack your internet connection and steal your code. So do finish your programming on a computer not connected to the internet. And maybe, if you really have that super compression program, the buyer will "show the money first."
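The "1% of nearly any data" claim discussed above can be checked with a standard counting argument, sketched here under the assumption that the compressor is lossless and so must give every input a distinct output. The function names are hypothetical. There are fewer than 2^(k+1) bit strings of length at most k, so only a tiny fraction of n-bit inputs can shrink to k bits or fewer.

```python
def count_outputs(target_bits: int) -> int:
    """Number of distinct bit strings of length 0..target_bits."""
    return 2 ** (target_bits + 1) - 1


def log2_fraction_bound(n_bits: int, target_bits: int) -> int:
    """log2 of an upper bound on the fraction of n-bit inputs that any
    lossless compressor can map to outputs of at most target_bits bits.

    Uses count_outputs(target_bits) < 2**(target_bits + 1) and the fact
    that a decodable scheme needs one distinct output per input.
    """
    return min(0, target_bits + 1 - n_bits)


if __name__ == "__main__":
    # A 1000-byte input (8000 bits) squeezed to 1% of its size (80 bits).
    print(count_outputs(2))                   # "", 0, 1, 00, 01, 10, 11
    print(log2_fraction_bound(8000, 80))      # exponent of the bound
```

For a 1000-byte input and a 1% target, the bound is 2^-7919 of all possible inputs. Real compressors work because the files people care about are a highly structured, very small subset, not because arbitrary data can be shrunk.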

  20. #19
    Member
    Join Date
    Aug 2018
    Location
    United States
    Posts
    7
    Thanks
    1
    Thanked 3 Times in 3 Posts
    Thank you for the insight. We made an awesome breakthrough tonight which is key to the decompression side of things. We are finishing and tweaking the compressor this week. On the current test we were able to compress a 5.14 GB PST file to 13 bytes. Very exciting, especially after tonight's epiphany. It did take about 3.5 minutes to compress though, so we are working through possible performance increases.
