
Thread: AtomBeam compression / reduction

  1. #1
    Member SolidComp's Avatar
    Join Date
    Jun 2015
    Location
    USA
    Posts
    353
    Thanks
    131
    Thanked 54 Times in 38 Posts

    AtomBeam compression / reduction

    Does anyone know how AtomBeam works? I can't tell what they're talking about:
    AtomBeam is a cutting-edge new AI driven data reduction technique to significantly decrease the volume of data (the number of individual bits needed to store data) required for storage and transmission.

    Wait, isn’t that just compression?


    No, AtomBeam uses a significantly different approach to typical algorithm-based compression techniques. Instead of using a mathematical approach to compress the data, we break the data down into ‘chunks’ (Called Sourceblocks) determined by our proprietary AtomIzer AI engine. These Sourceblocks are not sent directly, instead a shortened Codeword is sent that corresponds to that Sourceblock. And all of this information is housed in a Codebook. This results in faster, more efficient data transmission as the volume of data being transmitted is extremely low.
    Is this just a dictionary?

    I found them when I was searching for compression for small data.
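
    If I'm reading their FAQ right, this is just a pre-shared codebook: break data into blocks, replace each block with a short index, ship the index. A minimal sketch in Python (all names here are illustrative, not AtomBeam's actual design):

        def build_codebook(training_blocks):
            """Map each distinct 'sourceblock' to a short integer codeword."""
            codebook = {}
            for block in training_blocks:
                if block not in codebook:
                    codebook[block] = len(codebook)  # next free codeword
            return codebook

        def encode(blocks, codebook):
            """Sender side: replace each block with its codeword."""
            return [codebook[b] for b in blocks]

        def decode(codewords, codebook):
            """Receiver side: invert the same pre-shared codebook."""
            inverse = {cw: b for b, cw in codebook.items()}
            return [inverse[cw] for cw in codewords]

        training = [b"temp=21", b"temp=22", b"temp=21", b"door=open"]
        book = build_codebook(training)
        sent = encode([b"temp=21", b"door=open"], book)  # [0, 2]
        assert decode(sent, book) == [b"temp=21", b"door=open"]

    Which looks exactly like a dictionary to me.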

  2. #2
    Member JamesWasil's Avatar
    Join Date
    Dec 2017
    Location
    Arizona
    Posts
    78
    Thanks
    80
    Thanked 13 Times in 12 Posts
    Sounds like they are using "AI" (insert buzzwords here) to do what a regular program already does with a dictionary, yes. Kind of like a light switch using "AI" to sense light, when a simple motion detector already does that.

    Or using "AI" to sense carbon monoxide or smoke detection, when a photoelectric method or older ionization technique does that without anything else required.

    It is them trying to use terms that do not need to be there.

  3. #3
    Member
    Join Date
    Jul 2014
    Location
    Mars
    Posts
    195
    Thanks
    133
    Thanked 13 Times in 12 Posts
    Interesting to see how it works in the real world. So I guess companies should buy their services to lower latency? It's aimed at companies, right?

  4. #4
    Member
    Join Date
    Aug 2008
    Location
    Planet Earth
    Posts
    987
    Thanks
    96
    Thanked 396 Times in 276 Posts

  5. Thanks:

    JamesWasil (20th June 2020)

  6. #5
    Member JamesWasil's Avatar
    Join Date
    Dec 2017
    Location
    Arizona
    Posts
    78
    Thanks
    80
    Thanked 13 Times in 12 Posts
    Quote Originally Posted by Sportman View Post
    Thanks for sharing the PDF, Sportman.

    But oh my. That company "AtomBeam" is rife with marketing garbage.

    "According to IDC Research, IoT will constitute 90 zettabytes, or 51%, of all data generated in 2025. For small IoT messages, compression is ineffective."

    Hmm. Interesting choice of words for attempted marketing. How are arithmetic coding, modified Huffman, and other statistical encoding methods that DO work on small messages "ineffective"?
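
    To make the point concrete: the standard trick for small messages is to agree on the model (here, a preset dictionary) ahead of time, so no per-message statistics ever travel with the data. Even plain zlib supports this; a minimal sketch, with a made-up IoT-ish payload and dictionary:

        import zlib

        # The dictionary is agreed on out-of-band, much like their "codebook".
        shared_dict = b'{"device_id":,"temperature":,"humidity":,"status":"ok"}'
        msg = b'{"device_id":17,"temperature":21,"status":"ok"}'

        comp = zlib.compressobj(level=9, zdict=shared_dict)
        packed = comp.compress(msg) + comp.flush()

        decomp = zlib.decompressobj(zdict=shared_dict)
        assert decomp.decompress(packed) == msg
        print(len(msg), "->", len(packed))  # smaller, despite zlib's 6-byte overhead

    So "ineffective" only describes compressors run with no shared context.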

    "Compression is ineffective on small IoT data units, but AtomBeam identifies and indexes patterns in training data in advance, and so live data is encoded and sent with virtually no added latency."

    Marketing garbage and lies.

    "...patented AI handles any size of data and excels at processing small IoT data packets as small as 30 bits. AtomBeam’s patented technology reduces the amount of data to be transmitted by IoT by 3–5 times. No other technology can do this"

    Do they have any shame when outright lying to try and market their product?

    "No other technology can do this". They have not bothered to try any of the other compression technologies that exist before making that claim, and depending on how structured the data is, MOST modern algorithms are capable of doing this unless they are designed for big data / require a large header.


    Why do they lie and hype it up like this, thinking that a "downplay this to prop up that" approach (which makes no sense when comparing data compression methods, and is actually false) will sell like hotcakes? Or are they expecting their customers and target audiences not to know better?

  7. #6
    Member JamesWasil's Avatar
    Join Date
    Dec 2017
    Location
    Arizona
    Posts
    78
    Thanks
    80
    Thanked 13 Times in 12 Posts
    Also, last time I checked, intelligent grouping of messages did not require "AI" to compress and send short data or messages with a binary table of 1's and 0's based on statistics.

    Frequency models built around the structure of those messages are easily doable. Even bit packing or RLE could help if those 30-to-60-bit sequences have mostly leading 0's before the message.

    Simply doing that on the messages will reduce the latency and can be done for free without them.
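
    For example, a minimal bit-packing sketch (assuming unsigned messages in fixed 60-bit slots; the framing here is made up):

        def pack(value: int) -> str:
            """6-bit length prefix, then only the significant bits,
            so leading zeros cost nothing."""
            bits = value.bit_length() or 1
            return format(bits, "06b") + format(value, f"0{bits}b")

        def unpack(stream: str) -> int:
            bits = int(stream[:6], 2)
            return int(stream[6:6 + bits], 2)

        m = 0b101                                  # a 60-bit slot holding a tiny value
        assert unpack(pack(m)) == m
        print(len(pack(m)), "bits instead of 60")  # 9 bits

    No "AI" required.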

    That company is something else...what's next?

    "Rice coding is "ineffective"...BUT...Our SUPER AI steam-powered macaroni and pasta-bits coding does what no one else can! All other solutions are unavailable!"

    "First we divide the rice co errr I mean pasta-bits...into codewor ERR I mean "source blocks", then we add a patented AI waterflow to the pan while turning up the heat and boil the bits away! We then add our patented sauce and the IoT result is delicious!

    "Corporate licensing for additional AI flavors and pasta bit seasoning now available for additional cost!"

    "It's magic! Buy it today!"


  8. #7
    Member
    Join Date
    Aug 2008
    Location
    Planet Earth
    Posts
    987
    Thanks
    96
    Thanked 396 Times in 276 Posts
    Compare against zstandard and zlib:
    https://2jm9pf3xm7e146il9l5kk6ji-wpe...ta_graphic.jpg

  9. #8
    Member
    Join Date
    Jun 2015
    Location
    Switzerland
    Posts
    845
    Thanks
    241
    Thanked 309 Times in 184 Posts
    Quote Originally Posted by Sportman View Post
    Compare against zstandard and zlib:
    https://2jm9pf3xm7e146il9l5kk6ji-wpe...ta_graphic.jpg
    Funny comparison where zlib performs better than zstandard. Perhaps they didn't try very hard to make it fair.

  10. #9
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,919
    Thanks
    291
    Thanked 1,277 Times in 723 Posts
    1) I think they're comparing independently compressed data blocks with their system that uses an external dynamic dictionary.

    2) Their "AI" may be some heuristic used instead of parsing optimization.
    Like we can train NN on data processed with usual LZ parsing optimizer, to learn where token edges are usually located.
    Point is that such heuristic is much cheaper to implement in hardware than real parsing optimizer, and they're talking about IoT.
    Same applies to entropy coding since IoT uses low-power devices.
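
    Roughly like this toy sketch (assuming boundary positions come from an offline optimizer; a real version would use an actual NN and contexts longer than two bytes):

        from collections import defaultdict

        def train(parses):
            """parses: (data, boundary_positions) pairs produced offline
            by a real (expensive) LZ parsing optimizer."""
            seen, cut = defaultdict(int), defaultdict(int)
            for data, boundaries in parses:
                marks = set(boundaries)
                for i in range(2, len(data)):
                    ctx = data[i-2:i]          # 2-byte context before position i
                    seen[ctx] += 1
                    if i in marks:
                        cut[ctx] += 1
            return {ctx: cut[ctx] / seen[ctx] for ctx in seen}

        def segment(data, model, threshold=0.5):
            """Cheap run-time segmenter: cut wherever the learned
            boundary probability exceeds the threshold."""
            out, start = [], 0
            for i in range(2, len(data)):
                if model.get(data[i-2:i], 0.0) > threshold:
                    out.append(data[start:i])
                    start = i
            out.append(data[start:])
            return out

        model = train([(b"abcabcabd", [3, 6])])
        print(segment(b"abcabcabc", model))    # [b'abc', b'abc', b'abc']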

  11. #10
    Member Gotty's Avatar
    Join Date
    Oct 2017
    Location
    Switzerland
    Posts
    486
    Thanks
    330
    Thanked 315 Times in 171 Posts
    AtomBeam Technologies Assigned Patent
    Patent number: 10680645

    The abstract of the patent published by the U.S. Patent and Trademark Office states: "A system and method for data storage, transfer, synchronization, and security using recursive encoding, wherein data is deconstructed into chunklets, and is processed through a series of reference code libraries that reduce the data to a sequence of reference codes, and where the output of each reference library is used as the input to the next."
    Source: https://www.storagenewsletter.com/20...gned-patent-2/

    The first claim from the patent:

    What is claimed is:

    1. A system for storing, retrieving, and transmitting data in a highly compact format, comprising: a count sketch engine comprising a first plurality of programming instructions stored in a memory and operable on a processor of a computing device, wherein the programming instructions, when operating on the processor, cause the processor to: receive a set of training data comprising a plurality of data chunklets, each chunklet possessing an estimated frequency of occurrence; for each data chunklet within the set of training data, perform the following: update a set of hash tables corresponding to the data chunklet and a corresponding count of hash table entries; if the data chunklet is within a heap of data containing a set of the most frequent data chunklets, increment a count corresponding to the data chunklet; and if the data chunklet is not within the heap of data, estimate a frequency of occurrence for a value of the data chunked, and add the data chunklet to the heap of data while evicting a data chunklet having a lowest frequency of occurrence in the heap of data when the estimated frequency of occurrence is greater than the lowest frequency of occurrence; and generate a set of frequency data for the set of the most frequent data chunklets in the set of training data using data stored in heap of data; a Huffman tree creator generating a reference code library using set of frequency data for a plurality of the most frequent data chunklets in the set of training data, comprising a second plurality of programming instructions stored in the memory and operable on the processor of the computing device, wherein the programming instructions, when operating on the processor, cause the processor to: create a first Huffman binary tree based on the frequency of occurrences of each word in the set of training data; assign a Huffman codeword to each data chunklet in the set of training data according to the first Huffman binary tree; and construct the reference code library, wherein the reference library stores the reference codes and their corresponding words as key-value pairs in the library of key-value pairs; wherein the reference code library comprises data chunklets and reference codes corresponding to the data chunklets; a data deconstruction engine comprising a third plurality of plurality of programming instructions stored in the memory and operable on the processor of the computing device, wherein the programming instructions, when operating on the processor, cause the processor to: receive run-time data; deconstruct the run time data into a run time set of data chunklets; retrieve the reference code for each chunklet from the reference code library; where there is no reference code for a given chunklet, create a reference code, and store chunklet and its newly-created reference code in the reference code library; and create a plurality of warplets representing the data, each warplet comprising a reference code to a chunklet in the reference code library; and a data reconstruction engine comprising a fourth plurality of programming instructions stored in the memory and operable on the processor of the computing device, wherein the programming instructions, when operating on the processor, cause the processor to: receive the plurality of warplets representing the data; retrieve the chunklet corresponding to the reference code in each warplet from the reference code library; and assemble the chunklets to reconstruct the data.
    If you have difficulties reading it, you are not alone. This is a single sentence.
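
    De-legalesed, the claim describes: count chunklet frequencies in training data (via a count sketch plus a top-k heap), build a Huffman code over the frequent chunklets, then swap chunklets for codewords at run time. A toy version (an exact Counter stands in for their count sketch):

        import heapq
        from collections import Counter

        def chunklets(data: bytes, size: int = 4):
            return [data[i:i+size] for i in range(0, len(data), size)]

        def huffman_codebook(freqs):
            """Textbook Huffman construction over (chunklet, count) pairs."""
            heap = [[count, i, [chunk, ""]]
                    for i, (chunk, count) in enumerate(freqs.items())]
            heapq.heapify(heap)
            n = len(heap)
            while len(heap) > 1:
                lo, hi = heapq.heappop(heap), heapq.heappop(heap)
                for pair in lo[2:]:
                    pair[1] = "0" + pair[1]
                for pair in hi[2:]:
                    pair[1] = "1" + pair[1]
                n += 1
                heapq.heappush(heap, [lo[0] + hi[0], n] + lo[2:] + hi[2:])
            return {chunk: code for chunk, code in heap[0][2:]}

        training = b"temp=21 temp=22 temp=21 temp=21 "
        book = huffman_codebook(Counter(chunklets(training)))
        # The "warplets": codewords standing in for chunklets.
        encoded = "".join(book[c] for c in chunklets(b"temp=21 temp=21 "))
        print(book, encoded)    # 128 bits of input down to a few bits

    In other words: a dictionary plus Huffman coding, which predates the patent by several decades.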

  12. Thanks:

    SolidComp (28th June 2020)

  13. #11
    Member SolidComp's Avatar
    Join Date
    Jun 2015
    Location
    USA
    Posts
    353
    Thanks
    131
    Thanked 54 Times in 38 Posts
    Jesus, that is one sentence. Why do people do this? Why would the patent office ever grant this crap?

  14. #12
    Member JamesWasil's Avatar
    Join Date
    Dec 2017
    Location
    Arizona
    Posts
    78
    Thanks
    80
    Thanked 13 Times in 12 Posts
    Quote Originally Posted by SolidComp View Post
    Jesus, that is one sentence. Why do people do this? Why would the patent office ever grant this crap?
    It's basically the longest sentence ever made to describe a Huffman table arranged with statistical frequencies. Rather than having trees and blocks, they have "chunklets" and "key-value pairs" and "warplets" and other fancy names that mean nothing.

    The data read from a file or stream isn't "input" anymore; it's now "source blocks".

    It might use another table to keep track of recents and rep matches, and call that "AI training" (which nothing else does, lol).

    "reconstruction engine comprising a fourth plurality of programming instructions stored in the memory and operable on the processor of the computing device, wherein the programming instructions, when operating on the processor, cause the processor to: receive the plurality of warplets representing the data; retrieve the chunklet corresponding to the reference code in each warplet from the reference code library; and assemble the chunklets to reconstruct the data."

    ^ This means compressor and decompressor on a computer reading from a file lol

    The USPTO never turns money away even when granting toilet paper like that.

  15. #13
    Member Gotty's Avatar
    Join Date
    Oct 2017
    Location
    Switzerland
    Posts
    486
    Thanks
    330
    Thanked 315 Times in 171 Posts
    On their site ...
    [...] AtomBeam Technologies will be unveiling its patented technology at the upcoming Oct. 22nd – 24th Mobile World Congress (MWC) event in Los Angeles [...]
    Source: https://atombeamtech.com/2019/10/18/...orld-congress/

    I searched for it, and they really were there.
    https://www.youtube.com/watch?v=Jy4w-Sn-hEk

    The video was uploaded 4 months ago. Already 3 views. I'm the 4th one.
    Quote from the video: "We are the only game in town. There is no other way to reduce that data." [...] "There is no other way to reduce the size except for AtomBeam."
    OK.
    I have never ever disliked any video on YouTube. This is my first.

  16. #14
    Member
    Join Date
    Aug 2008
    Location
    Planet Earth
    Posts
    987
    Thanks
    96
    Thanked 396 Times in 276 Posts
    "about 100 bits at a time"
    "400 times faster than standard compression resulting in up to 75% less bandwidth and storage utilization"
    https://www.youtube.com/watch?v=m-3BNenuX_Q

    So "AI" create from every about 100 bits 40-25 bits codewords and send that over the network + one time codebook with sourceblocks (size unknown).

    Sounds like an AI-trained custom dictionary for every data type.
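
    The quoted numbers invite a back-of-envelope check (assuming codewords are dense indexes into the codebook, which the video doesn't say):

        block_bits = 100
        for code_bits in (25, 40):
            ratio = block_bits / code_bits          # their claimed 2.5-4x range
            entries = 2 ** code_bits                # addressable sourceblocks
            codebook_mb = entries * block_bits / 8 / 1e6
            print(f"{code_bits}-bit codewords: {ratio:.1f}x reduction, "
                  f"codebook up to {codebook_mb:,.0f} MB")

    25-bit codewords give 4x but can address a codebook of up to ~420 MB of 100-bit blocks; 40-bit codewords can address terabytes. So either the codebook is small (and covers only a few patterns), or it doesn't fit on an IoT device.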

  17. #15
    Member
    Join Date
    Apr 2009
    Location
    The Netherlands
    Posts
    80
    Thanks
    6
    Thanked 18 Times in 11 Posts
    It really sounds like LittleBit with a fancy description. Static Huffman tree with variable word size. Although I bet that LittleBit outperforms them on size.

  18. #16
    Member
    Join Date
    Feb 2015
    Location
    United Kingdom
    Posts
    172
    Thanks
    28
    Thanked 73 Times in 43 Posts
    Quote Originally Posted by Sportman View Post
    "about 100 bits at a time"
    "400 times faster than standard compression resulting in up to 75% less bandwidth and storage utilization"
    https://www.youtube.com/watch?v=m-3BNenuX_Q

    So "AI" create from every about 100 bits 40-25 bits codewords and send that over the network + one time codebook with sourceblocks (size unknown).

    Sounds like an AI-trained custom dictionary for every data type.
    Interesting how they say their solution is 400x faster than compression; it's almost like their solution isn't compression at all. I'm getting a whiff of Sloot from reading this.

    Their 4-byte compression claims are incredibly dubious; it's almost like they don't care that UDP and TCP header sizes would become the bottleneck in such networks.

    E.g.: 100 UDP packets containing <=4 bytes of data each would send 2.94x more data over the wire than a single UDP packet with 400 bytes of payload, and TCP (which they propose using in the patent for a distributed compression network) would be 5.71x larger than a single TCP packet with 400 bytes of payload.
    And not once do they mention anything about "buffering" in their system, which would be needed to make this claim of actually compressing 4 bytes hold up.
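
    Those ratios check out if you count just the transport headers (8 bytes for UDP, 20 for a minimal TCP header, IP layer ignored):

        UDP_HDR, TCP_HDR = 8, 20        # bytes per packet header
        payload, pieces = 400, 100      # 400 bytes sent as 100 x 4-byte packets

        for name, hdr in (("UDP", UDP_HDR), ("TCP", TCP_HDR)):
            fragmented = pieces * (hdr + payload // pieces)
            single = hdr + payload
            print(f"{name}: {fragmented / single:.2f}x more bytes on the wire")
        # UDP: 2.94x, TCP: 5.71x

    With real IPv4 headers on top it gets even worse.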

    To me this just appears to be a pump-and-dump company set up to rip off investors.

  19. #17
    Member
    Join Date
    Aug 2008
    Location
    Planet Earth
    Posts
    987
    Thanks
    96
    Thanked 396 Times in 276 Posts
    Quote Originally Posted by Lucas View Post
    I'm getting a whiff of Sloot from reading this.
    I thought the same, only the Sloot specs were more the opposite: 4x faster, 400x smaller.

  20. #18
    Member
    Join Date
    Jun 2009
    Location
    Puerto Rico
    Posts
    233
    Thanks
    127
    Thanked 46 Times in 35 Posts
    Quote Originally Posted by Lucas View Post
    Interesting how they say their solution is 400x faster than compression; it's almost like their solution isn't compression at all. I'm getting a whiff of Sloot from reading this.

    Their 4-byte compression claims are incredibly dubious; it's almost like they don't care that UDP and TCP header sizes would become the bottleneck in such networks.

    E.g.: 100 UDP packets containing <=4 bytes of data each would send 2.94x more data over the wire than a single UDP packet with 400 bytes of payload, and TCP (which they propose using in the patent for a distributed compression network) would be 5.71x larger than a single TCP packet with 400 bytes of payload.
    And not once do they mention anything about "buffering" in their system, which would be needed to make this claim of actually compressing 4 bytes hold up.

    To me this just appears to be a pump-and-dump company set up to rip off investors.
    A few years ago, there was a cloud service called Bitcasa. They seemed to have a hash table of some sort. Files uploaded would be split into chunks (I believe 512 KB chunks), and a chunk would only be uploaded if it didn't already exist on their servers. Basically deduplicating data for stuff like legally purchased music or videos, where several users would store the same data over and over again. This saved bandwidth on the user side as well, but required significant I/O due to the file chunking. Not to mention the NTFS Master File Table growing a lot! There was no compression performed.
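
    That scheme is plain content-addressed deduplication; a minimal sketch (the dict stands in for Bitcasa's server-side store, and the 512 KB chunk size is the one mentioned above):

        import hashlib

        CHUNK = 512 * 1024
        server = {}                  # hash -> chunk, shared across all users

        def upload(path):
            manifest = []
            with open(path, "rb") as f:
                while chunk := f.read(CHUNK):
                    digest = hashlib.sha256(chunk).hexdigest()
                    if digest not in server:    # only new data crosses the wire
                        server[digest] = chunk
                    manifest.append(digest)
            return manifest                     # enough to rebuild the file

        def download(manifest):
            return b"".join(server[d] for d in manifest)

    As noted, it's deduplication, not compression.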
