Page 2 of 3 FirstFirst 123 LastLast
Results 31 to 60 of 67

Thread: Fastest decompressor!?

  1. #31
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    4,013
    Thanks
    406
    Thanked 403 Times in 153 Posts
    BTW, can you give me that 'osho.txt'? So I will able to test a new LZSS/ULZ...

  2. #32
    Member Sanmayce's Avatar
    Join Date
    Apr 2010
    Location
    Sofia
    Posts
    57
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Another blunder of mine, I forgot to salute for your kingship:
    More than 659+8+ songs went through my mind, but I couldn't choose which one to use to salute you, anyway my pick:

    A masterpiece: two variants of "Cat People" by David Bowie:
    '... I can stare for a thousand years ... I've been putting out fire with gasoline ...'

    The-test-file-link is above in this thread.
    And for a closer look what Blackbird is, here is a photo of this state-of-craft titanium machine: born to be the fastest flyer high in the sky.

  3. #33
    Member Vacon's Avatar
    Join Date
    May 2008
    Location
    Germany
    Posts
    523
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Hello everyone,

    to be more precisely: in your posting from June, 7th
    in rar-container... *sigh*
    Edit 1: I didn't see it too at first glance. Your hint made me look after it, but it was not really obvious.
    Edit 2: FreeArc 0.67 alpha made it 33.235.794 Bytes in less than 7 minutes by just using the defaults (right-clicking on folder and choosing "Add to Folio VIP - Osho's Books on CD-ROM Part 1-5.arc" from the context-menu.
    Edit 3: this wasn't very precise... So here are the true outputs:
    timer arc.exe a test *.txt
    Timer 9.01 : Igor Pavlov : Public domain : 2009-05-31
    FreeArc 0.67 (June 26 2010) creating archive: test.arc
    Compressed 5 files, 206,908,949 => 33,235,353 bytes. Ratio 16.0%
    Compression time: cpu 232.41 secs, real 234.39 secs. Speed 883 kB/s
    All OK

    Kernel Time = 3.484 = 1%
    User Time = 229.796 = 97%
    Process Time = 233.281 = 99%
    Global Time = 234.835 = 100%

    Best regards!
    Last edited by Vacon; 27th June 2010 at 22:45.

  4. #34
    Member Sanmayce's Avatar
    Join Date
    Apr 2010
    Location
    Sofia
    Posts
    57
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Path to perfection...

    I can't help it: wanna salute everybody(especially my brother) with newest video-clip Mossano - Indianotech today I came across.


    Words don't come easy to me... you guys must see this AMAZINGLY precious cure for worries!
    And two groovy out-of-worry djs riding in ready-to-fall-apart beetle engulfed in flames ...
    The little dancer - a magician, so fluid and without show-offness ...
    The indian she-dancer supported by Ganesha ...
    This I call spirit: music, dance, no-artificalities in one, a perfect harmony: BESTEST.

  5. #35
    Member Skymmer's Avatar
    Join Date
    Mar 2009
    Location
    Russia
    Posts
    688
    Thanks
    41
    Thanked 174 Times in 88 Posts
    You really think so? Well, please don't treat me harsh and I know that its better not to argue about tastes but... Those guys are not DJs. I can't name people DJs when they are playing on cheap CD decks. Better they use 900 series at least. Pioneer Mixer althought is nice.
    Turntables - its da power. And music... For me it sounds like "lollypop-radio-rotation-once-heard-never-return-cheap-pop-house". Complete scum for me. Check this one:
    http://www.youtube.com/watch?v=BxEb2FrQUbE

  6. #36
    Member Vacon's Avatar
    Join Date
    May 2008
    Location
    Germany
    Posts
    523
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Hello everyone,

    Quote Originally Posted by Skymmer View Post
    [...]Complete scum for me. Check this one:
    http://www.youtube.com/watch?v=BxEb2FrQUbE
    Well... Let me say this -> tastes are *very* different... Ready to rumble?


    Best regards!
    P.S.: a little bit off-topic though...
    Last edited by Vacon; 28th June 2010 at 23:08.

  7. #37
    Member Skymmer's Avatar
    Join Date
    Mar 2009
    Location
    Russia
    Posts
    688
    Thanks
    41
    Thanked 174 Times in 88 Posts
    Quote Originally Posted by Vacon View Post
    Hello everyone,


    Well... Let me say this -> tastes are *very* different... Ready to rumble?


    Best regards!
    P.S.: a little bit off-topic though...
    Sure. http://www.youtube.com/watch?v=5Az_7U0-cK0 He-he-he.

  8. #38
    Member Vacon's Avatar
    Join Date
    May 2008
    Location
    Germany
    Posts
    523
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Hello everyone,

    maybe not *quite* my taste, but nearer to "music" than the first link

    Best regards!
    P.S.: How do we get the turn back to decompression...?

  9. #39
    Member Sanmayce's Avatar
    Join Date
    Apr 2010
    Location
    Sofia
    Posts
    57
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Stop flooding this thread please, as far as I know this is not a chatroom...

    Hi Skymmer,
    your post is so dense with contempt, starting with mocking emoticon ...

    "Pure is his eye, and no loathing lurketh about his mouth. Goeth he not along like a dancer?"
    Reading the book containing above verses is a possible cure for abomination.

    The eventual appearance of new fastest decompressor has elated me, so I wanted to share my joy by congratulating all.

    Anyway, the plot in my eyes is this: two guys, being in hell, manage to find way to a heavenly garage: a realm of music, dance and brilliant cars, thankfully to three already-harmonic beings(deities): Ganesha, female, and a kid who guide these two lost human-beings via simple-but-mantric-refrain.
    The clip you proposed in my opinion is a funny one but shows a path to madhouse not to perfection.

    You didn't like it/me/them/equipment, ok, let's not inflate this balloon: twitter is elsewhere.

    Regards.

  10. #40
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    4,013
    Thanks
    406
    Thanked 403 Times in 153 Posts

  11. #41
    Member Skymmer's Avatar
    Join Date
    Mar 2009
    Location
    Russia
    Posts
    688
    Thanks
    41
    Thanked 174 Times in 88 Posts
    Sanmayce, I just expresed my opinion and nothing more. If you like it, then go with it. But my 14 years experience with electronic music tells me that you'll realize your wrong choice one day. Sure, we're off-topic here. Moderators and Creators, feel free to remove it.

  12. #42
    Member Skymmer's Avatar
    Join Date
    Mar 2009
    Location
    Russia
    Posts
    688
    Thanks
    41
    Thanked 174 Times in 88 Posts
    Encode Check my 2001 year version of Porcelain. Fruity Loops 3.4.1 have been used, as far as I remember.
    http://narod.ru/disk/22309506000/Por...y%5D).mp3.html

    And this one too, created with one guy who unfortunately felt into drug oblivion.
    http://narod.ru/disk/22309659000/12.MPC.html
    Last edited by Skymmer; 29th June 2010 at 23:55.

  13. #43
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    4,013
    Thanks
    406
    Thanked 403 Times in 153 Posts

    Cool

    Ragarding Moby, check out my first YouTube video (live button pushing):
    http://www.youtube.com/watch?v=2hgkjINADzM

  14. #44
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    4,013
    Thanks
    406
    Thanked 403 Times in 153 Posts

    Lightbulb

    Wrote a draft PPM (max order-4), it compresses osho.txt to 40,743,886 bytes in 30 sec... Version with max order-5 compresses to ~37,000,000 bytes in 40 sec! Since such stuff is draft, compression and processing time can be much improved!

  15. #45
    Member Sanmayce's Avatar
    Join Date
    Apr 2010
    Location
    Sofia
    Posts
    57
    Thanks
    0
    Thanked 0 Times in 0 Posts
    No problema Skymmer, forget it, the only aftermath is that I dare not to salute you again.

    This thread is all about pushing-the-limits by utilizing-the-basics i.e. applying simplicity in life, here as a etude of text-processing-craft.

    Our revered prophet Ванга had revealed that (Орфей)Orpheus' myth is not myth at all but truth itself:
    He has had a long neglected hair and was dressed in a dusty ragged tunic, he was a genuine and enlightened human-being with such a spirit that even birds and animals stood still when he played on his
    simplistic whistle.

    So, the simplicity rules since ancient times, I think it is greatly underrated even despised, which is JUST-WORST.

    Having purged the disturbance... A hintlike question:
    What is the maximum compression ratio for text-data using LZ[SS] derivates? That is by using an unaware-of-needed-time compression(parsing). Let's say even multi-hours. Achieving it at a WHATEVER cost(Blackbird costs 1,000,000,000+ nowayears dollars, which was half-billion-dollars in 70's).

    So far, my superficial comprehension of what formula(skeleton), for achieving best Com/Decom Ratio, must be is this:
    - State of the art optimal parsing(as Hamid Buzidi stated for his LZTURBO);
    - As simple and low-CPU-load decoding as possible(done by Encode), the LZ strength, with one purpose: MEGA-threading, thus future-ready;
    - Fine Block/Window tuning via variables(options); personally I look after a regime calibrated for texts using 7bits alphabet;

    I wonder what are the best chunk(block)/window(dictionary-size) pairs for text-data!
    No code has to be inserted here.No code has to be inserted here.And again frustration due to buggy nul, thus no decompression time available.

    Hamid did a great job with this best-so-far parsing! Other opinions?
    Also, obviously 1MB chunks deliver very nice crunch, yummy.

  16. #46
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    4,013
    Thanks
    406
    Thanked 403 Times in 153 Posts

    Exclamation

    Here are the test results of a DRAFT version of my new LZ encoder. Decompression is as fast (or even faster) as with LZSS/ULZ.

    No code has to be inserted here.

    Note that the LZ output is as simple as it possible!

  17. #47
    Member Sanmayce's Avatar
    Join Date
    Apr 2010
    Location
    Sofia
    Posts
    57
    Thanks
    0
    Thanked 0 Times in 0 Posts

    AN ETUDE: How simpler-than-simple gives more speed ...

    To Encode: Very good, you did an additional improvement regarding compression ratio also, right? If the block is still 32MB and the window 512KB, result was 61,439,540, now 56,888,532. Do you plan to make the block size adjustable and time-disregarding-parsing in order to create really FASTEST-TEXT-DECOMPRESSOR not paying attention to compression time? I wish you good luck.
    I am very-very impressed what Hamid did, I am still stunned, LZ77 to achieve: 51,203,616 for 1MB block, it is just-SUPER.

    To all: Here I want to share another thing that is on my mind...
    Can 'strstr()' be outperformed(even the 'Boyer-Moore' variants) on short strings?

    I consider myself a dummy-and-amateurish C programmer i.e. unskilled to such a degree that I feel shame to publish my code.
    I still receive warnings like:
    Leprechaun.c(1001) : warning C4311: 'type cast' : pointer truncation from 'char *' to 'unsigned long'
    Leprechaun.c(128 : warning C4312: 'type cast' : conversion from 'unsigned long' to 'char *' of greater size

    which I don't know how to fix, because I use addresses and variables interchangeably, when comes to pointer arithmetic.

    Nevertheless, the romantic part in me, which is a force I reckon with, dictates me to illustrate my words concerning simplicity with a tiny code fragment, with two goals in mind:
    - first, to inspire amateurish programmers alongside with more skilled ones to take not things for granted;
    - second, to remind that building a house(project) with trashy bricks(functions) leads to a hovel i.e. slower-and-slower code in a long run.

    My text-obsession has three layers: text decompressing(being an interest), text searching(being a fondness), text suggesting(being a mania).
    Two years ago, before dawn, in a calm mood I spent 2-3 hours in a mini-quest for faster-than-any-I-knew string-search-function for small different patterns(strings to be searched for, with lengths less than 12 bytes) and sentences(strings to be searched into, with lengths less than 960 bytes).

    Luckily, after only few adjustments, the boosted 'Karp_Rabin' appeared(by chance):
    'Karp_Rabin_Kaze', for pattern 'underdog' took 70 seconds to look for into 400+ million sentences, compared to 'strstr'(85s) & 'Boyer-Moore-Horspool'(86s), is ((85-70)/70)*100%=21% faster when running on sentences NOT-ON-WHOLE-BLOCKS. It is handy because this opens the way for wildcard-searching, which needs short sequences of text since heavy recursion is time devouring.

    I hope it will be beneficial in a way of showing that we can't say 'fast' when relying on non-optimized basic functions(approaches).
    Attached Files Attached Files
    Last edited by Sanmayce; 3rd July 2010 at 12:57.

  18. #48
    Member m^2's Avatar
    Join Date
    Sep 2008
    Location
    Ślůnsk, PL
    Posts
    1,610
    Thanks
    30
    Thanked 65 Times in 47 Posts
    Quote Originally Posted by Sanmayce View Post
    I still receive warnings like:
    Leprechaun.c(1001) : warning C4311: 'type cast' : pointer truncation from 'char *' to 'unsigned long'
    Leprechaun.c(128 : warning C4312: 'type cast' : conversion from 'unsigned long' to 'char *' of greater size

    which I don't know how to fix, because I use addresses and variables interchangeably, when comes to pointer arithmetic.
    Simply: don't use them interchangeably. It is incorrect.

  19. #49
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,257
    Thanks
    307
    Thanked 797 Times in 489 Posts
    Because on some systems a pointer is 64 bits and unsigned long is 32 bits.

  20. #50
    Member Sanmayce's Avatar
    Join Date
    Apr 2010
    Location
    Sofia
    Posts
    57
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Thanks m^2, thanks Matt for your readiness to help,

    I didn't expect anyone to be interested in helping me on such a trivial, for experienced programmers, matter.
    I was, rather, excusing-for-my-skillessness not complaining.
    For that reason, the explanation of my problem was stated unclearly.
    By 'interchangeably' I meant two things:
    - first, using pointer and non-pointer in an expression i.e. mixing them;
    - second, to utilize one variable as a pointer and non-pointer.

    I know that it is a dirty style of writing but allowed, though.
    Also, I have had my subjective reasons to do it in a such a dummy way, namely achieving better performance by eliminating the need of additional pointer/variable usage.

    Obviously, warnings are the ugly result from bad planning, but, once the draft code is done, the vigor is gone, so rewriting in order to clean language-dependent impurities is no longer of real value to me.

    Again, this is an example of how bad-leads-to-worse when migration from code using 32bit dependencies to 64bit must be done.
    Here lies my concern and shame, I am not able to write clean 32bit code and as a consequence: 64bit too.
    Never mind, my passion is burst-speed-text-processing, not first-class programming skills.

    For warning
    Leprechaun.c(1001) : warning C4311: 'type cast' : pointer truncation from 'char *' to 'unsigned long'

    the C code is
    unsigned long MAXusedBuffer[32];
    char *bufend[ 806 ];
    1001 MAXusedBuffer[k] = (unsigned long)bufend[i*31+k-1]-MAXusedBuffer[k];

    the Assembly code says don't worry be happy
    ; Line 1001
    mov ecx, DWORD PTR _MAXusedBuffer$[esp+esi+892436]
    mov eax, DWORD PTR _bufend$[esp+esi+895532]
    sub eax, ecx
    mov DWORD PTR _MAXusedBuffer$[esp+esi+892436], eax

    For warning
    Leprechaun.c(128 : warning C4312: 'type cast' : conversion from 'unsigned long' to 'char *' of greater size

    the C code is
    unsigned long PseudoLinkedPointer;
    1286 else if (memcmp(PseudoLinkedPointer+4+4+4,wrd,wrdlen) < 0) // go RP or MP
    { // RW existence check - line below:
    1288 if ( *(char *)(PseudoLinkedPointer+4+4+4+wrdlen) != 0 ) // RW exists
    { // Here all 'P W P' section is repeated; the way of handling case when dynamic number of words in leaf

    the Assembly code says OKAY
    ; Line 1286
    jmp $L2030
    $L2020:
    mov ecx, ebx
    lea edi, DWORD PTR _wrd$[esp+892432]
    mov esi, ebp
    xor edx, edx
    repe cmpsb
    je SHORT $L2735
    sbb edx, edx
    sbb edx, -1
    $L2735:
    test edx, edx
    jge SHORT $L2022
    ; Line 1288
    mov cl, BYTE PTR [eax+ebx+12]
    test cl, cl
    lea ebp, DWORD PTR [eax+ebx+12]
    je SHORT $L2024

    To m^2:
    Not so, otherwise an error will occur not a warning.
    You see, I use a variable as a pointer too, for me pointers/variables are part of memory - I treat them alike but C not.

    To Matt:
    My compiler is 'Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 13.10.3077 for 80x86'.
    I tried, speaking of 1288 line, *(unsigned long *) too, as a result: the same warning:
    Leprechaun.c(128 : warning C4312: 'type cast' : conversion from 'unsigned long' to 'unsigned long *' of greater size
    In fact, I don't care of compilers warnings so much.

    Enough with my attempts to code properly, now I like to share few principal thoughts with you.

    I consider myself having not an idealistic or perfectionist-like attitude, on the contrary, I have a pragmatic and in the same time romantic attitude.
    How many tools(programs) have the 'insane' in addition to 'max' and 'extreme' levels? I am implying that in case of a dificult-to-tackle-heavy-problem when all approaches failed to yield satisfying results, why a massively(using let's say 3GB of RAM, and why not 300GB) hashed approach doesn't take place? If this is too insane for you take a look at ioDrive Duo 320GB with Write Bandwidth: 1.4 GB/s (32k packet size) to fall in dismay how software lags behind. Really a mind-blowing product as the advertisement goes, I would add MIND-CHANGING.
    In a few words: a visionary tool(function) must provide always the 'insane' level of resource utilization either RAM or CPU, or both.
    That's why I admire such declaration for perfection as Hamid did with LZTURBO.

    Regarding Leprechaun.c(my attempt) to implement a kind-of-insane regime of parsing(in sense of extracting distinct words) incoming text, I would be very glad if you agree to take a look and evaluate whether such a tool worth an attention. I'm ready to send you via email Leprechaun_r13++.c.zip 22,689 bytes if you are interested. My wish is to attach here at Encode's forum(with his permission) the cleansed version by a skilled C programmer in order a beneficial open-source console tool to reach C&TXT fans.

    Not-by-the-way, you as a man with a diverse experience, can you name a function faster than my KarpRabin variant for short(<1KB) strings?
    Also, can you recommend to me a comprehensive [e]book(or web-resource) about MAX fast search(the power of hashing has mesmerized me) algorithms? Except R.Sedgewick' books, I don't like his style.
    In case of bothering you, I will not again.
    I hesitated for a moment whether or not, but being a decisive old-boy I salute you with a new song I was listening to during writing this 6+KB: Miley Cyrus - Can't Be Tamed, I hope you will see the bond and mostly: enjoy it.

    Regards.

    P.S.
    The number of 's' letters in 'skillessness' has intrigued me.
    Since my English must be massively depoored(a la debugged), he-he, a little quiz:
    How many words with 5+ 's' letters are plausible(might exist)?
    To explore this I invoked my tool Kazuya to search for '*s*s*s*s*s*s*' into a big-English-spellcheck-wordlist(354,939 words):

    C:\WorkTemp\Leprechaun_r13++>Kazuya.exe -krknd english.dic.txt.lst
    Input Pattern1(hit only 'Enter' to skip): *s*s*s*s*s*s*
    Processing english.dic.txt ...
    Doing SEARCH for Pattern1 at once and flushing hit-sentences ...
    000,000,001 assassinatress
    000,000,002 classlessness
    000,000,003 dispossesses
    000,000,004 expressionlessness
    000,000,005 masslessness
    000,000,006 nonpossessiveness
    000,000,007 passionlessness
    000,000,008 possessedness
    000,000,009 possessingness
    000,000,010 possessionless
    000,000,011 possessionlessness
    000,000,012 possessiveness
    000,000,013 possessoress
    000,000,014 possessoriness
    000,000,015 prepossessingness
    000,000,016 resistlessness
    000,000,017 senselessness
    000,000,018 stresslessness
    000,000,019 subsensuousness
    000,000,020 successlessness
    000,000,021 supersensuousness
    000,000,022 supersuspiciousness
    000,000,023 unassessableness
    000,000,024 unpossessedness
    000,000,025 unpossessiveness
    000,000,026 unprepossessingness
    Found 26 case-insensitive and unexact matches(hits), so far.

    Funny words.
    Last edited by Sanmayce; 5th July 2010 at 18:14.

  21. #51
    Member m^2's Avatar
    Join Date
    Sep 2008
    Location
    Ślůnsk, PL
    Posts
    1,610
    Thanks
    30
    Thanked 65 Times in 47 Posts
    Quote Originally Posted by Sanmayce View Post
    Thanks m^2, thanks Matt for your readiness to help,

    I didn't expect anyone to be interested in helping me on such a trivial, for experienced programmers, matter.
    I was, rather, excusing-for-my-skillessness not complaining.
    For that reason, the explanation of my problem was stated unclearly.
    By 'interchangeably' I meant two things:
    - first, using pointer and non-pointer in an expression i.e. mixing them;
    - second, to utilize one variable as a pointer and non-pointer.

    I know that it is a dirty style of writing but allowed, though.
    Also, I have had my subjective reasons to do it in a such a dummy way, namely achieving better performance by eliminating the need of additional pointer/variable usage.

    Obviously, warnings are the ugly result from bad planning, but, once the draft code is done, the vigor is gone, so rewriting in order to clean language-dependent impurities is no longer of real value to me.

    Again, this is an example of how bad-leads-to-worse when migration from code using 32bit dependencies to 64bit must be done.
    Here lies my concern and shame, I am not able to write clean 32bit code and as a consequence: 64bit too.
    Never mind, my passion is burst-speed-text-processing, not first-class programming skills.

    For warning
    Leprechaun.c(1001) : warning C4311: 'type cast' : pointer truncation from 'char *' to 'unsigned long'

    the C code is
    unsigned long MAXusedBuffer[32];
    char *bufend[ 806 ];
    1001 MAXusedBuffer[k] = (unsigned long)bufend[i*31+k-1]-MAXusedBuffer[k];

    the Assembly code says don't worry be happy
    ; Line 1001
    mov ecx, DWORD PTR _MAXusedBuffer$[esp+esi+892436]
    mov eax, DWORD PTR _bufend$[esp+esi+895532]
    sub eax, ecx
    mov DWORD PTR _MAXusedBuffer$[esp+esi+892436], eax

    For warning
    Leprechaun.c(128 : warning C4312: 'type cast' : conversion from 'unsigned long' to 'char *' of greater size

    the C code is
    unsigned long PseudoLinkedPointer;
    1286 else if (memcmp(PseudoLinkedPointer+4+4+4,wrd,wrdlen) < 0) // go RP or MP
    { // RW existence check - line below:
    1288 if ( *(char *)(PseudoLinkedPointer+4+4+4+wrdlen) != 0 ) // RW exists
    { // Here all 'P W P' section is repeated; the way of handling case when dynamic number of words in leaf

    the Assembly code says OKAY
    ; Line 1286
    jmp $L2030
    $L2020:
    mov ecx, ebx
    lea edi, DWORD PTR _wrd$[esp+892432]
    mov esi, ebp
    xor edx, edx
    repe cmpsb
    je SHORT $L2735
    sbb edx, edx
    sbb edx, -1
    $L2735:
    test edx, edx
    jge SHORT $L2022
    ; Line 1288
    mov cl, BYTE PTR [eax+ebx+12]
    test cl, cl
    lea ebp, DWORD PTR [eax+ebx+12]
    je SHORT $L2024

    To m^2:
    Not so, otherwise an error will occur not a warning.
    You see, I use a variable as a pointer too, for me pointers/variables are part of memory - I treat them alike but C not.
    I call the code incorrect, because it's behaviour is undefined. And can vary from machine to machine, from compiler to compiler. C treats them both as a part of memory, but (purposefully) doesn't define sizes of these parts. They may vary and conversion may truncate one of them, f.e. with MS, x64 and addresses farther than 4 GB away, the code will do something else than it does in your case (and something else than you wanted it to).
    So simply: don't do it. Would it be a problem if the array stored pointers? If PseudoLinkedPointer was a pointer?

  22. #52
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,257
    Thanks
    307
    Thanked 797 Times in 489 Posts
    I had a similar problem in ZPAQ when I wanted to align allocated memory on 64 byte boundaries to decrease cache misses. So I did:

    Code:
      T *data;  // allocated memory
      int offset;
      ...
      offset=64-int((long)data&63);
      data=(T*)((char*)data+offset);  // adjust to 64 byte boundary
    This works as long as (long) and (char*) are the same size. For 64 bit machines, g++ changed long from 32 to 64 bits, so it all works. However, 64 bit MSVC still uses 32 bit longs, so the cast loses the high bits (which should be OK but still gives an error). One fix is:

    Code:
    // return low 6 bits of ptr
    int p2i6(void *ptr) {
      union {unsigned int i[2]; void *p;} u;
      int j=0x03040506;
      u.p=ptr;
      return u.i[sizeof(void*)==8 && (*(char*)&j)==3]&63;
    }
      ...
      offset=64-p2i6(data);
      data=(T*)((char*)data+offset);
    The code is not entirely correct because it depends on int being 32 bits. The C/C++ spec only says 16 bits or more. However it should work with 32 bit int, 32 or 64 bit pointers, and big-endian or little-endian hardware. On a big-endian machine the low 32 bits of a 64 bit pointer are in u.i[1], otherwise u.i[0].

    I would use <stdint.h> but Borland can't find the file.

  23. #53
    Programmer toffer's Avatar
    Join Date
    May 2008
    Location
    Erfurt, Germany
    Posts
    587
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Just a single hint: use ptrdiff_t from stddef.h (or cstddef).

    http://www.cplusplus.com/reference/c...def/ptrdiff_t/

  24. #54
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,257
    Thanks
    307
    Thanked 797 Times in 489 Posts
    Thanks. ptrdiff_t works in g++, Borland, and Mars.

  25. #55
    Member Sanmayce's Avatar
    Join Date
    Apr 2010
    Location
    Sofia
    Posts
    57
    Thanks
    0
    Thanked 0 Times in 0 Posts

    My pride and joy: The master-rhymer elf

    Thanks m^2,
    you are right, no doubt, but explicitly, while my statement is correct implicitly, so the bottom-line is:
    in order to be said correctly in the both cases: 'incorrect' must be replaced with 'dirty' or as you said 'with undefined behaviour', right!?
    I am aware of truncation, big/little-endian issues and other traps which could stumble me, so I avoid the mess by not entering this mine-field for me.
    Of course 'PseudoLinkedPointer' could be a pointer, but I think that I clearly stated the problem already.
    The point is that a well-written(i.e. OS/Compiler/CPU independent) tool must look like this, but I am far from this:

    For reference:
    Markus Oberhumer: lzo-2.03: miniacc.h

    OS platform: "freestanding", "embedded", "cygwin", "emx", "beos", "lynxos", "os400", "qnx", "dos32", "dos16", "os216", "os2", "win64", "win32", "win16", "palmos", "tos", "macclassic", "vms", "console", "aix", "freebsd", "hpux", "interix", "irix", "linux", "macosx", "minix", "netbsd", "openbsd", "osf", "solaris", "sunos", "ultrix", "unicos", "unknown".

    Compiler platform: "Cilly", "sdcc", "Pathscale C", "Intel C", "Pelles C", "llvm-gcc", "gcc", "Amsterdam Compiler Kit C", "Aztec C", "Borland C", "Cray C", "Digital Mars C", "DEC C", "MetaWare High C", "IAR C", "IBM C", "Keil C", "lcc-win32", "Microsoft C", "Metrowerks C", "Microway NDP C", "Pacific C", "Portland Group PGI C", "Pure C", "Symantec C", "SunPro C", "Tiny C", "TopSpeed C", "Watcom C", "Turbo C", "Zortech C", "unknown".

    CPU architecture platform: "i086", "alpha", "amd64", "arm", "avr", "blackfin", "c166", "cris", "ez80", "h8300", "hppa", "i386", "ia64", "m16c", "m32r", "m68k", "mcs251", "mips", "msp430", "powerpc", "s390", "sh", "sparc", "spu", "z80", "cray_sv1", "cray_t90", "cray_ymp", "cray_xmp", "unknown".

    Ok, now my precious dirty-C-style word-ripper designed for above in bold platforms.

    In context of my previous posts I romantically hope that my two etudes on kind-of-match-finding will inspire programmers to do 'insane' exhaustive-search, at-least & at-last. Also, I would be glad to see/receive improved lines/hashes/fragments/approaches of me greedy Leprechaun.
    Because of maximum-minimum word comparisions, due to combined forces of 4_layers_of_HASH & 3+_million_B-trees, rip-rate is almost at drive's Burst-Read-Speed.

    As trivial as might be, the simple distinct-word-ripping out of the incoming text, taught me a lesson: simple tasks need simple, calm and OPEN mind.

    I have had for a week a lot of fun and thrill during coding the dedicated B-tree order 3 i.e. per se, in order to avoid variables(cycles).
    I wish everyone to experience such a sensation: how effortlessness brings happiness.

    Enjoy!
    Attached Files Attached Files
    Last edited by Sanmayce; 12th July 2010 at 21:29.

  26. #56
    Member Sanmayce's Avatar
    Join Date
    Apr 2010
    Location
    Sofia
    Posts
    57
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Wow, another catastrophe of mine,
    'Karp-Rabin-Kaze' appears to be inferior, in some cases, compared to GNU 'strstr'.
    Back to school... no, no, back to kindergarten!
    'Karp-Rabin-Kaze' bends a knee(not knees, he-he) before 'strstr_GNU_C_Library', bravo Stephen R. van den Berg, I will have fun reading-understanding-learning it, for sure, thanks.

    Here a lite testbed(OSHO.TXT) is set, the other(26GB with 400+ million lines) will be tested off-line later, he-he.

    Thanks a lot also to,
    http://www-igm.univ-mlv.fr/~lecroq/string/index.html
    amazing site and GREAT/RICH resource on search technique.

    Below is a benchmark program for at-a-glance showdown:
    Code:
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <time.h>
    
    #ifndef NULL
    #define NULL ((void*)0)
    #endif
    
    clock_t clocks1, clocks2;
    double TotalRoughSearchTime = 0;
    
    #define ASIZE 256
    
    // ### Boyer-Moore-Horspool algorithm [
    long HORSPOOL(y, x, n, m)
        char *y, *x;
        long n;
        int m;
       {
        long i;
        int a, j, bm_bc[ASIZE];
        unsigned char ch, lastch;
       
        /* Preprocessing */
        for (a=0; a < ASIZE; a++) bm_bc[a]=m;
        for (j=0; j < m-1; j++) bm_bc[x[j]]=m-j-1;
       
        /* Searching */
        lastch=x[m-1];
        i=0;
        while (i <= n-m) {
           ch=y[i+m-1];
           if (ch ==lastch)
              //if (memcmp(&y[i],x,m-1) == 0) OUTPUT(i);
              if (memcmp(&y[i],x,m-1) == 0) return(i);
           i+=bm_bc[ch];
        }
        return(-1);
       }
    // ### Boyer-Moore-Horspool algorithm ]
    
    
    // ### Brute force 'Dummy' algorithm [
       long Brute_Force_Dummy(char *y, char *x, long n, int m) {
        long i, j;
      
        /* Searching */
        for (i=0; i <= n-m; i++) {
           j=0;
           while (j < m && y[i+j] == x[j]) j++;
           if (j >= m) return(i);
        }
        return(-1);
       }
    // ### Brute force 'Dummy' algorithm ]
    
    
    // ### Karp-Rabin algorithm [
       #define REHASH(a, b, h) ((((h) - (a)*d) << 1) + (b))
       long Karp_Rabin(char *y, char *x, long n, int m) {
       int d, hx, hy, i, j;
    
       /* Preprocessing */
       /* computes d = 2^(m-1) with
          the left-shift operator */
       for (d = i = 1; i < m; ++i)
          d = (d<<1);
    
       for (hy = hx = i = 0; i < m; ++i) {
          hx = ((hx<<1) + x[i]);
          hy = ((hy<<1) + y[i]);
       }
    
       /* Searching */
       j = 0;
       while (j <= n-m) {
          if (hx == hy && memcmp(x, y + j, m) == 0) return(j);
          hy = REHASH(y[j], y[j + m], hy);
          ++j;
       }
        return(-1);
       }
    // ### Karp-Rabin algorithm ]
    
    
    // ### Karp-Rabin-Kaze algorithm [
    long KarpRabinKaze (char * pbTarget,
         char * pbPattern,
         unsigned long cbTarget,
         unsigned long cbPattern)
    {
        unsigned int    i;
        char *  pbTargetMax = pbTarget + cbTarget;
        char *  pbPatternMax = pbPattern + cbPattern;
        unsigned long  ulBaseToPowerMod = 1;
        register unsigned long  ulHashPattern = 0;
        unsigned long  ulHashTarget = 0;
    long hits = 0;
    //unsigned long count;
        //char *  buf1;
        //char *  buf2;
    
        if (cbPattern > cbTarget)
            return(-1);
    
        // Compute the power of the left most character in base ulBase
        //for (i = 1; i < cbPattern; i++) ulBaseToPowerMod = (ulBase * ulBaseToPowerMod);
    
        // Calculate the hash function for the src (and the first dst)
        while (pbPattern < pbPatternMax)
        {
            // Below lines give 366KB/clock for 'underdog':
            //ulHashPattern = (ulHashPattern*ulBase + *pbPattern);
            //ulHashTarget = (ulHashTarget*ulBase + *pbTarget);
            pbPattern++;
            pbTarget++;
        }
            // Below lines give 436KB/clock for 'underdog' + requirement pattern to be 4 chars min.:
            //ulHashPattern = ( (*(long *)(pbPattern-cbPattern)) & 0xffffff00 ) + *(pbPattern-1);
            //ulHashTarget = ( (*(long *)(pbTarget-cbPattern)) & 0xffffff00 ) + *(pbTarget-1);
            // Below lines give 482KB/clock for 'underdog' + requirement pattern to be 2 chars min.:
            //ulHashPattern = ( (*(unsigned short *)(pbPattern-cbPattern)) | *(pbPattern-1) );
            //ulHashTarget = ( (*(unsigned short *)(pbTarget-cbPattern)) | *(pbTarget-1) );
            // Below lines give 482KB/clock for 'underdog' + requirement pattern to be 2 chars min.:
            //ulHashPattern = ( (*(unsigned short *)(pbPattern-cbPattern)) & 0xff00 ) + *(pbPattern-1);
            //ulHashTarget = ( (*(unsigned short *)(pbTarget-cbPattern)) & 0xff00 ) + *(pbTarget-1);
            // Below lines give 605KB/clock for 'underdog' + requirement pattern to be 2 chars min.:
            //ulHashPattern = ( (*(unsigned short *)(pbPattern-cbPattern))<<8 ) + *(pbPattern-1);
            //ulHashTarget = ( (*(unsigned short *)(pbTarget-cbPattern))<<8 ) + *(pbTarget-1);
            // Below lines give 668KB/clock for 'underdog':
            ulHashPattern = ( (*(char *)(pbPattern-cbPattern))<<8 ) + *(pbPattern-1);
            ulHashTarget = ( (*(char *)(pbTarget-cbPattern))<<8 ) + *(pbTarget-1);
    
        // Dynamically produce hash values for the string as we go
        for ( ;; )
        {
            if ( (ulHashPattern == ulHashTarget) && !memcmp(pbPattern-cbPattern, pbTarget-cbPattern, (unsigned int)cbPattern) )
           // if ( ulHashPattern == ulHashTarget ) {
           // 
           //  count = cbPattern;
           //  buf1 = pbPattern-cbPattern;
           //  buf2 = pbTarget-cbPattern;
           //  while ( --count && *(char *)buf1 == *(char *)buf2 ) {
           //          buf1 = (char *)buf1 + 1;
           //          buf2 = (char *)buf2 + 1;
           //  }
           //                 
           //  if ( *((unsigned char *)buf1) - *((unsigned char *)buf2) == 0) hits++;
           //  }
                return((long)(pbTarget-cbPattern));
                //hits++;
                                                                 
            if (pbTarget == pbTargetMax)
                return(-1);
    
            // Below line gives 482KB/clock for 'underdog' + requirement pattern to be 2 chars min.:
            //ulHashTarget = ( (*(unsigned short *)(pbTarget+1-cbPattern)) | *pbTarget );
            // Below line gives 436KB/clock for 'underdog' + requirement pattern to be 4 chars min.:
            //ulHashTarget = ( (*(long *)(pbTarget+1-cbPattern)) & 0xffffff00 ) + *pbTarget;
    //; Line 696
    //        movsx   esi, BYTE PTR [ebx]
    //        mov     ecx, DWORD PTR [edx+1]
    //        and     ecx, -256                               ; ffffff00H
    //        add     ecx, esi
            // Below line gives 482KB/clock for 'underdog' + requirement pattern to be 2 chars min.:
            //ulHashTarget = ( (*(unsigned short *)(pbTarget+1-cbPattern)) & 0xff00 ) + *pbTarget;
    //; Line 691
    //        movsx   esi, BYTE PTR [ebx]
    //        xor     ecx, ecx
    //        mov     cx, WORD PTR [edx+1]
    //        and     ecx, 65280                              ; 0000ff00H
    //        add     ecx, esi
            // Below line gives 605KB/clock for 'underdog' + requirement pattern to be 2 chars min.:
            //ulHashTarget = ( (*(unsigned short *)(pbTarget+1-cbPattern))<<8 ) + *pbTarget;
            // Below line gives 668KB/clock for 'underdog':
            ulHashTarget = ( (*(char *)(pbTarget+1-cbPattern))<<8 ) + *pbTarget;
    //; Line 718
    //        movsx   ecx, BYTE PTR [eax+1]
    //        movsx   edx, BYTE PTR [ebp]
    //        shl     ecx, 8
    //        add     ecx, edx
            // Below line gives 366KB/clock for 'underdog':
            //ulHashTarget = (ulHashTarget - *(pbTarget-cbPattern)*ulBaseToPowerMod)*ulBase + *pbTarget;
            pbTarget++;
        }
    }
    // ### Karp-Rabin-Kaze algorithm ]
    
    
    char * strstr_Microsoft (
            const char * str1,
            const char * str2
            )
    {
            char *cp = (char *) str1;
            char *s1, *s2;
    
            if ( !*str2 )
                return((char *)str1);
    
            while (*cp)
            {
                    s1 = cp;
                    s2 = (char *) str2;
    
                    while ( *s1 && *s2 && !(*s1-*s2) )
                            s1++, s2++;
    
                    if (!*s2)
                            return(cp);
    
                    cp++;
            }
            return(NULL);
    }
    
    char *
    strstr_GNU_C_Library (phaystack, pneedle)
         const char *phaystack;
         const char *pneedle;
    {
      const unsigned char *haystack, *needle;
      char b;
      const unsigned char *rneedle;
    
      haystack = (const unsigned char *) phaystack;
    
      if ((b = *(needle = (const unsigned char *) pneedle)))
        {
          char c;
          haystack--;        /* possible ANSI violation */
    
          {
        char a;
        do
          if (!(a = *++haystack))
            goto ret0;
        while (a != b);
          }
    
          if (!(c = *++needle))
        goto foundneedle;
          ++needle;
          goto jin;
    
          for (;;)
        {
          {
            char a;
            if (0)
            jin:{
            if ((a = *++haystack) == c)
              goto crest;
              }
            else
              a = *++haystack;
            do
              {
            for (; a != b; a = *++haystack)
              {
                if (!a)
                  goto ret0;
                if ((a = *++haystack) == b)
                  break;
                if (!a)
                  goto ret0;
              }
              }
            while ((a = *++haystack) != c);
          }
        crest:
          {
            char a;
            {
              const unsigned char *rhaystack;
              if (*(rhaystack = haystack-- + 1) == (a = *(rneedle = needle)))
            do
              {
                if (!a)
                  goto foundneedle;
                if (*++rhaystack != (a = *++needle))
                  break;
                if (!a)
                  goto foundneedle;
              }
            while (*++rhaystack == (a = *++needle));
              needle = rneedle;    /* took the register-poor aproach */
            }
            if (!a)
              break;
          }
        }
        }
    foundneedle:
      return (char *) haystack;
    ret0:
      return 0;
    }
    
    
    int main()
      {
    FILE *fp_inLINE;
    int Bozan;
    long ThunderwithL=0, ThunderwithR=0;
    char *Strng;
    long Strnglen;
    char Pattern[13];
    int Patternlen;
    long LinesEncountered=0;
    long BruteForceDummyhits=0;
    long KarpRabinKazehits=0;
    long KarpRabinhits=0;
    long HORSPOOLhits=0;
    long strstrMicrosofthits=0;
    long strstrGNUCLibraryhits=0;
    
    long FoundIn;
    
    printf("strstr_SHORT-SHOWDOWN, revision 1, written by Kaze.\n");
    Pattern[0]=0x00;
    Strng = (char *)malloc( 200*1024*1024 );
    if( Strng == NULL )
    { puts( "strstr_SHORT-SHOWDOWN: Needed memory allocation denied!\n" ); return( 1 ); }
    if( ( fp_inLINE = fopen( "OSHO.TXT", "rb" ) ) == NULL )
    { printf( "strstr_SHORT-SHOWDOWN: Can't open 'OSHO.TXT' file.\n" ); return( 1 ); }
    
       fseek(fp_inLINE, 0, SEEK_END);
       Strnglen = ftell(fp_inLINE);
       fseek(fp_inLINE, 0, SEEK_SET);
       fread(Strng, 1, Strnglen, fp_inLINE);
    
    printf("Input Pattern(up to 12 chars): "); gets(Pattern); // char * __cdecl gets(char *);
    Patternlen = strlen(&Pattern[0]);
    
    // Replacing CR with NULL i.e. 13->0
        for (ThunderwithL=0; ThunderwithL<Strnglen; ThunderwithL++)
            if (Strng[ThunderwithL] == 13) Strng[ThunderwithL] = 0;
        ThunderwithL=0;
    
    printf( "Doing Search for Pattern(%dbytes) into String(%dbytes) line-by-line ...\n", Patternlen, Strnglen);
    
    // 1[
    clocks1 = clock();
        for (Bozan=0; Bozan < (1<<4); Bozan++) // 16 times, at end >>4
    {
    //Search area is between Strng[0] .. Strng[n-1]
        for (;;)
        {
        while (Strng[ThunderwithR] != 10 && ThunderwithR < Strnglen-1) {ThunderwithR++;}
        FoundIn = Brute_Force_Dummy(&Strng[ThunderwithL], &Pattern[0], ThunderwithR - ThunderwithL, Patternlen);
        if ( FoundIn != -1) BruteForceDummyhits++;
        LinesEncountered++;
        ThunderwithR++;    ThunderwithL=ThunderwithR;
        if (ThunderwithR >= Strnglen-1) break;
        }
    ThunderwithL=0;ThunderwithR=0;
    if (Bozan != (1<<4)-1) LinesEncountered=0;
    }
    clocks2 = clock(); TotalRoughSearchTime = clocks2 - clocks1; TotalRoughSearchTime++;
    printf( "LinesEncountered: %lu\n", LinesEncountered);
    printf( "Brute_Force_Dummy_hits/Brute_Force_Dummy_clocks: %lu/%lu\n", BruteForceDummyhits>>4, (long)(TotalRoughSearchTime)>>4);
    printf( "Brute_Force_Dummy performance: %luKB/clock\n", (Strnglen/((long)(TotalRoughSearchTime)>>4))>>10);
    // 1]
    
    // 2[
    clocks1 = clock();
        for (Bozan=0; Bozan < (1<<4); Bozan++) // 16 times, at end >>4
    {
    //Search area is between Strng[0] .. Strng[n-1]
        for (;;)
        {
        while (Strng[ThunderwithR] != 10 && ThunderwithR < Strnglen-1) {ThunderwithR++;}
        FoundIn = KarpRabinKaze(&Strng[ThunderwithL], &Pattern[0], ThunderwithR - ThunderwithL, Patternlen);
        if ( FoundIn != -1) KarpRabinKazehits++;
        LinesEncountered++;
        ThunderwithR++;    ThunderwithL=ThunderwithR;
        if (ThunderwithR >= Strnglen-1) break;
        }
    ThunderwithL=0;ThunderwithR=0;
    if (Bozan != (1<<4)-1) LinesEncountered=0;
    }
    clocks2 = clock(); TotalRoughSearchTime = clocks2 - clocks1; TotalRoughSearchTime++;
    printf( "LinesEncountered: %lu\n", LinesEncountered);
    printf( "KarpRabinKaze_hits/KarpRabinKaze_clocks: %lu/%lu\n", KarpRabinKazehits>>4, (long)(TotalRoughSearchTime)>>4);
    printf( "KarpRabinKaze performance: %luKB/clock\n", (Strnglen/((long)(TotalRoughSearchTime)>>4))>>10);
    // 2]
    
    // 3[
    clocks1 = clock();
        for (Bozan=0; Bozan < (1<<4); Bozan++) // 16 times, at end >>4
    {
    //Search area is between Strng[0] .. Strng[n-1]
        for (;;)
        {
        while (Strng[ThunderwithR] != 10 && ThunderwithR < Strnglen-1) {ThunderwithR++;}
        FoundIn = Karp_Rabin(&Strng[ThunderwithL], &Pattern[0], ThunderwithR - ThunderwithL, Patternlen);
        if ( FoundIn != -1) KarpRabinhits++;
        LinesEncountered++;
        ThunderwithR++;    ThunderwithL=ThunderwithR;
        if (ThunderwithR >= Strnglen-1) break;
        }
    ThunderwithL=0;ThunderwithR=0;
    if (Bozan != (1<<4)-1) LinesEncountered=0;
    }
    clocks2 = clock(); TotalRoughSearchTime = clocks2 - clocks1; TotalRoughSearchTime++;
    printf( "LinesEncountered: %lu\n", LinesEncountered);
    printf( "Karp_Rabin_hits/Karp_Rabin_clocks: %lu/%lu\n", KarpRabinhits>>4, (long)(TotalRoughSearchTime)>>4);
    printf( "Karp_Rabin performance: %luKB/clock\n", (Strnglen/((long)(TotalRoughSearchTime)>>4))>>10);
    // 3]
    
    
    // 4[
    clocks1 = clock();
        for (Bozan=0; Bozan < (1<<4); Bozan++) // 16 times, at end >>4
    {
    //Search area is between Strng[0] .. Strng[n-1]
        for (;;)
        {
        while (Strng[ThunderwithR] != 10 && ThunderwithR < Strnglen-1) {ThunderwithR++;}
        FoundIn = HORSPOOL(&Strng[ThunderwithL], &Pattern[0], ThunderwithR - ThunderwithL, Patternlen);
        if ( FoundIn != -1) HORSPOOLhits++;
        LinesEncountered++;
        ThunderwithR++;    ThunderwithL=ThunderwithR;
        if (ThunderwithR >= Strnglen-1) break;
        }
    ThunderwithL=0;ThunderwithR=0;
    if (Bozan != (1<<4)-1) LinesEncountered=0;
    }
    clocks2 = clock(); TotalRoughSearchTime = clocks2 - clocks1; TotalRoughSearchTime++;
    printf( "LinesEncountered: %lu\n", LinesEncountered);
    printf( "Boyer-Moore-Horspool_hits/Boyer-Moore-Horspool_clocks: %lu/%lu\n", HORSPOOLhits>>4, (long)(TotalRoughSearchTime)>>4);
    printf( "Boyer-Moore-Horspool performance: %luKB/clock\n", (Strnglen/((long)(TotalRoughSearchTime)>>4))>>10);
    // 4]
    
    // 7[
    clocks1 = clock();
        for (Bozan=0; Bozan < (1<<4); Bozan++) // 16 times, at end >>4
    {
    //Search area is between Strng[0] .. Strng[n-1]
        for (;;)
        {
        while (Strng[ThunderwithR] != 10 && ThunderwithR < Strnglen-1) {ThunderwithR++;}
        FoundIn = (long)strstr_Microsoft(&Strng[ThunderwithL], &Pattern[0]);
        if ( FoundIn != 0) strstrMicrosofthits++;
        LinesEncountered++;
        ThunderwithR++;    ThunderwithL=ThunderwithR;
        if (ThunderwithR >= Strnglen-1) break;
        }
    ThunderwithL=0;ThunderwithR=0;
    if (Bozan != (1<<4)-1) LinesEncountered=0;
    }
    clocks2 = clock(); TotalRoughSearchTime = clocks2 - clocks1; TotalRoughSearchTime++;
    printf( "LinesEncountered: %lu\n", LinesEncountered);
    printf( "strstr_Microsoft_hits/strstr_Microsoft_clocks: %lu/%lu\n", strstrMicrosofthits>>4, (long)(TotalRoughSearchTime)>>4);
    printf( "strstr_Microsoft performance: %luKB/clock\n", (Strnglen/((long)(TotalRoughSearchTime)>>4))>>10);
    // 7]
    
    // 8[
    clocks1 = clock();
        for (Bozan=0; Bozan < (1<<4); Bozan++) // 16 times, at end >>4
    {
    //Search area is between Strng[0] .. Strng[n-1]
        for (;;)
        {
        while (Strng[ThunderwithR] != 10 && ThunderwithR < Strnglen-1) {ThunderwithR++;}
        FoundIn = (long)strstr_GNU_C_Library(&Strng[ThunderwithL], &Pattern[0], ThunderwithR - ThunderwithL, Patternlen);
        if ( FoundIn != 0) strstrGNUCLibraryhits++;
        LinesEncountered++;
        ThunderwithR++;    ThunderwithL=ThunderwithR;
        if (ThunderwithR >= Strnglen-1) break;
        }
    ThunderwithL=0;ThunderwithR=0;
    if (Bozan != (1<<4)-1) LinesEncountered=0;
    }
    clocks2 = clock(); TotalRoughSearchTime = clocks2 - clocks1; TotalRoughSearchTime++;
    printf( "LinesEncountered: %lu\n", LinesEncountered);
    printf( "strstr_GNU_C_Library_hits/strstr_GNU_C_Library_clocks: %lu/%lu\n", strstrGNUCLibraryhits>>4, (long)(TotalRoughSearchTime)>>4);
    printf( "strstr_GNU_C_Library performance: %luKB/clock\n", (Strnglen/((long)(TotalRoughSearchTime)>>4))>>10);
    // 8]
    return(0);
      }
    C:\WorkTemp\_KAZE_~1\KAZUYA~1>strstr_SHORT-SHOWDOWN.exe
    strstr_SHORT-SHOWDOWN, revision 1, written by Kaze.
    Input Pattern(up to 12 chars): an
    Doing Search for Pattern(2bytes) into String(206908949bytes) line-by-line ...
    LinesEncountered: 2459508
    Brute_Force_Dummy_hits/Brute_Force_Dummy_clocks: 1212509/668
    Brute_Force_Dummy performance: 302KB/clock
    LinesEncountered: 2459508
    KarpRabinKaze_hits/KarpRabinKaze_clocks: 1212509/511
    KarpRabinKaze performance: 395KB/clock
    LinesEncountered: 2459508
    Karp_Rabin_hits/Karp_Rabin_clocks: 1212509/631
    Karp_Rabin performance: 320KB/clock
    LinesEncountered: 2459508
    Boyer-Moore-Horspool_hits/Boyer-Moore-Horspool_clocks: 1212509/850
    Boyer-Moore-Horspool performance: 237KB/clock
    LinesEncountered: 2459508
    strstr_Microsoft_hits/strstr_Microsoft_clocks: 1212509/702
    strstr_Microsoft performance: 287KB/clock
    LinesEncountered: 2459508
    strstr_GNU_C_Library_hits/strstr_GNU_C_Library_clocks: 1212509/492
    strstr_GNU_C_Library performance: 410KB/clock

    C:\WorkTemp\_KAZE_~1\KAZUYA~1>strstr_SHORT-SHOWDOWN.exe
    strstr_SHORT-SHOWDOWN, revision 1, written by Kaze.
    Input Pattern(up to 12 chars): amazing
    Doing Search for Pattern(7bytes) into String(206908949bytes) line-by-line ...
    LinesEncountered: 2459508
    Brute_Force_Dummy_hits/Brute_Force_Dummy_clocks: 319/827
    Brute_Force_Dummy performance: 244KB/clock
    LinesEncountered: 2459508
    KarpRabinKaze_hits/KarpRabinKaze_clocks: 319/573
    KarpRabinKaze performance: 352KB/clock
    LinesEncountered: 2459508
    Karp_Rabin_hits/Karp_Rabin_clocks: 319/758
    Karp_Rabin performance: 266KB/clock
    LinesEncountered: 2459508
    Boyer-Moore-Horspool_hits/Boyer-Moore-Horspool_clocks: 319/641
    Boyer-Moore-Horspool performance: 315KB/clock
    LinesEncountered: 2459508
    strstr_Microsoft_hits/strstr_Microsoft_clocks: 319/911
    strstr_Microsoft performance: 221KB/clock
    LinesEncountered: 2459508
    strstr_GNU_C_Library_hits/strstr_GNU_C_Library_clocks: 319/606
    strstr_GNU_C_Library performance: 333KB/clock

    C:\WorkTemp\_KAZE_~1\KAZUYA~1>strstr_SHORT-SHOWDOWN.exe
    strstr_SHORT-SHOWDOWN, revision 1, written by Kaze.
    Input Pattern(up to 12 chars): underdog
    Doing Search for Pattern(8bytes) into String(206908949bytes) line-by-line ...
    LinesEncountered: 2459508
    Brute_Force_Dummy_hits/Brute_Force_Dummy_clocks: 4/759
    Brute_Force_Dummy performance: 266KB/clock
    LinesEncountered: 2459508
    KarpRabinKaze_hits/KarpRabinKaze_clocks: 4/562
    KarpRabinKaze performance: 359KB/clock
    LinesEncountered: 2459508
    Karp_Rabin_hits/Karp_Rabin_clocks: 4/769
    Karp_Rabin performance: 262KB/clock
    LinesEncountered: 2459508
    Boyer-Moore-Horspool_hits/Boyer-Moore-Horspool_clocks: 4/621
    Boyer-Moore-Horspool performance: 325KB/clock
    LinesEncountered: 2459508
    strstr_Microsoft_hits/strstr_Microsoft_clocks: 4/829
    strstr_Microsoft performance: 243KB/clock
    LinesEncountered: 2459508
    strstr_GNU_C_Library_hits/strstr_GNU_C_Library_clocks: 4/537
    strstr_GNU_C_Library performance: 376KB/clock

    C:\WorkTemp\_KAZE_~1\KAZUYA~1>strstr_SHORT-SHOWDOWN.exe
    strstr_SHORT-SHOWDOWN, revision 1, written by Kaze.
    Input Pattern(up to 12 chars): skillessness
    Doing Search for Pattern(12bytes) into String(206908949bytes) line-by-line ...
    LinesEncountered: 2459508
    Brute_Force_Dummy_hits/Brute_Force_Dummy_clocks: 0/772
    Brute_Force_Dummy performance: 261KB/clock
    LinesEncountered: 2459508
    KarpRabinKaze_hits/KarpRabinKaze_clocks: 0/560
    KarpRabinKaze performance: 360KB/clock
    LinesEncountered: 2459508
    Karp_Rabin_hits/Karp_Rabin_clocks: 0/756
    Karp_Rabin performance: 267KB/clock
    LinesEncountered: 2459508
    Boyer-Moore-Horspool_hits/Boyer-Moore-Horspool_clocks: 0/617
    Boyer-Moore-Horspool performance: 327KB/clock
    LinesEncountered: 2459508
    strstr_Microsoft_hits/strstr_Microsoft_clocks: 0/876
    strstr_Microsoft performance: 230KB/clock
    LinesEncountered: 2459508
    strstr_GNU_C_Library_hits/strstr_GNU_C_Library_clocks: 0/583
    strstr_GNU_C_Library performance: 346KB/clock
    No code has to be inserted here.
    The conclusion: the things can get only better, and digging-for-better-etudes is a never-ending-quest.

  27. #57
    Member Sanmayce's Avatar
    Join Date
    Apr 2010
    Location
    Sofia
    Posts
    57
    Thanks
    0
    Thanked 0 Times in 0 Posts
    As an addition here is the WHOLE-BLOCK(as one string) test:
    No code has to be inserted here.
    Of course 'Boyer-Moore-Horspool' has a MONSTROUS advantage in speed when HayStack is 1++KB and Needle is not small, no doubt about it.
    For big strings(blocks) 'KarpRabinKaze' falls far(by ((1727-347)/347)*100%=397+%) behind the MONSTER 'Boyer-Moore-Horspool'.

    Here arises the question "What has 'KarpRabin' to do with compression(match-finding) i.e. heavy(multi-pattern)-searching?".

    Rabin–Karp is inferior for single pattern searching to Knuth–Morris–Pratt algorithm, Boyer–Moore string search algorithm and other faster single pattern string searching algorithms because of its slow worst case behavior. However, Rabin–Karp is an algorithm of choice for multiple pattern search. /From Wikipedia, the free encyclopedia/
    Last edited by Sanmayce; 12th July 2010 at 22:29.

  28. #58
    Member Sanmayce's Avatar
    Join Date
    Apr 2010
    Location
    Sofia
    Posts
    57
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Another Amateurish Suggestion

    The charming side of being dummy is the freedom to think without any bias and predefined limitations, i.e. to be able to speculate freely.

    There is a rich-in-emotion Bulgarian saying:
    "I never leave me donkey behind(in the mud)", a la "Semper Fidelis" (English: Always Faithful), he-he.

    The most feared block size in all the kingdom: 256KB ...
    Let's see how HORRIBLE is this Compress-Ratio-Degradation when such [very-]small chunks are used.

    Again, the testbed is from post #19 in this thread:

    No code has to be inserted here.By sacrificing the C.R. the goal is to achieve SMOOTH(CONSTANT)-FEEDING of decompression threads, during BURST-READ of incoming concatenated 256KB(or 1MB or 4MB) CHUNKS, as a trade-off.

    No code has to be inserted here.As to desired from me Super-Lazy-Exhaustive-Match-Finding, or whatever they call it,
    here is another preposterous proposition of mine:

    Speaking of LZSS, with these settings(regardless of obvious fact that following settings are intended for larger block):
    Dictionary_Size = 18bit
    Match_Size = 6bit
    Min_Match_Threshold = 3 chars

    in my view the FULLY-thorough match-finding must be done with the help of B-trees(no mercy for needed GBs) or/and tuned Karp-Rabin,
    seeing the process of parsing as a task to find patterns of length 3-63chars in a string(dictionary) of length 262144chars.
    The precedent(a proven monster for such tasks) is Leprechaun: here 61 B-trees of fixed order would come in handy.
    Or 61 slots with total of (262144-3+1)+(262144-4+1)+...+(262144-63+1) = (63-3+1)*(262144)-(63*64/2-2*3/2)+(63-3+1)*1 = 15,988,832 elements(possible matches) to be inserted into respective 61 B-trees.
    In case of 32MB block: (33554432-3+1)+(33554432-4+1)+...+(33554432-63+1) = (63-3+1)*(33554432)-(63*64/2-2*3/2)+(63-3+1)*1 = 2,046,817,392 elements.

    Fear not! What are some 2 billions matches to be searched for, moreover the total number is less since repetitions(duplicate elements are not inserted again) are only counted, i.e. leafs contain only distinct possible-matches.

    Needless to say, but anyway, I don't care about parsing speed, I do care of revealing all usable matches.

    The process of filling the dictionary is still puzzling me, so the above is just an outline.
    My primary intention is to give some hints as a by-stander not as a compression-craft learner.

    I repeat myself only because the fear-of-this-and-that must be no obstacle when the goal(thread's name) is so definite.

    During the confirmation for correctness of 'be a no ' a sentence appeared:
    000,000,002 To be a no one is so tremendously blissful; one gets really spaced.
    That's the point.

    I want to salute all LZ developers with another greatest Video-Clip-Masterpiece I came across Radio Killer - Be Free and to remind to:
    "I care about Love, and how I feel
    I care about dreams, and make them real
    I care about you, I care about me
    No no worry, yes I?m able to be free!"
    Last edited by Sanmayce; 29th July 2010 at 19:14. Reason: Reason for edit: Added Gribok's new archiver.

  29. #59
    Member Sanmayce's Avatar
    Join Date
    Apr 2010
    Location
    Sofia
    Posts
    57
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Some Heavy Compression + Heavy Thoughts

    For me PPM(regardless how beautiful in its complexity might be) is a beaten card,
    I bet all my chips on LZ derivates as plan A and on BWT derivates as plan B in my pursuit for fastest decompressor.

    Monstrous PPMII compressor based on PPMd var.I vs PPMX 0.06P4 Fast PPM compressor vs bsc, Block Sorting Compressor, Version 2.2.6:

    No code has to be inserted here.And Gribok your 'fair' sounds to me like 'spitting-against-the-wind' and 'throwing-stones-on-the-runner', the comparison is a good thing, OK, but where goes the intricate beauty of each algorithm i.e. the uniqueness!? And 'speed' - maybe it's an empty word, huh!
    If we follow such 'fair' logic we must use also '-tT' in order to be 'fair'-enough in tripping the crippler.

    I have been mildly accused of comparing decompression performance of different compression ratios yielding codecs, stressing on this-and-that, as incorrect. I keep answering: Decompression ratio is [the biggest part of] everything, I cannot say it simpler.
    By the way: my height and weight are identical with those of Usain Bolt, but my time on 100m is twice as slow.

    And when the infamous for its decompression speed PPM has become a rival to asymmetric-in-favour-of-decompression-speed BWT prodigy.
    Because e.g. "there is no such thing as fastest white woman on 100m sprint, there is just fastest sprinter" - from a fastest Bulgarian he-sprinter comment about fastest Bulgarian she-sprinter, implying 'fastest' is the only meaningful thing and not 'woman/hermaphrodite/man/white/red/black/...'.

    I just don't see the point your sub-super tool to run crippled on one feet, it is like underestimating it, I am serious, my brother tells me the same all the time regarding works I'm involved into, and I think this is a kind-of-disease.
    I believe that the work is something bigger than A(not the) worker, in other words: worker is a small part(vessel) of WORK.

    During my life I have seen enough "specialists" in different areas considering themselves "masters" not seeing the simple truth: identifying oneself with the truth/skill, as ancients have declared, is an ILLNESS i.e. no one is even remotely close to the mastership.
    Please treat my fringy words in context of a fan(not-criticizer)-talking.

    Currently bsc is the console-archiver-of-choice for my archives, bsc is a piece of beauty.
    Not saying 'beauty' to beauty is ugly, that's the point.

    Regards

    P.S.
    Pre-yesterday watching a movie about the signs of the 'apocalypse', a beautiful word popped in my mind: 'apocalyptor' in sense of 'revealer', which describes well 'master/artist' by linking revelation(discovering or more precisely disclosing) and abstract-visions. Or more simply, one who grabs ideas and put them into practice, or more romantically: one who animates the un-born.
    By adding SERIOUSNESS as the main facet to 'master/artist' and REVEALING(not creating), 'apocalyptor' is a beautiful word.

    abstract:
    Having an intellectual and affective artistic content that depends solely on intrinsic form rather than on narrative content or pictorial representation.


    My limited awareness tells me that the UNIVERSE has only 3 layers(uniting all multi-dimensional & multi-times worlds):
    - realm of THE UNFATHOMABLE - THE CREATOR;
    - realm of AWARENESS - track-to-perfection - many a god/devastator/dominator/apocalyptor/harmonic-creature; creatures being a conductor/conveyer/vessel of GREAT/UNITY;
    - realm of confusion - track-to-perdition - so called 'vanity fair' where hypocrisy rules; creatures refusing to connect to GREAT/UNITY;
    Sadly(non-funly), I consider myself still being in 'vanity fair' realm. My way, however, currently connects(is in-between) the second and the third track.

    The above-said illustrates my goal to show that REAL things come from REAL(confusionless) worlds through REAL beings i.e. openminded-mediators(part of humans in particular, ha-ha).

    P.P.S.

    No code has to be inserted here.Defending my claim 'Decompression ratio is [the biggest part of] everything':
    For cached-data i.e. less(see next paragraphs) of cases:
    lzturbo is ahead of bsc with about ((3.2%-0.8%)/0.8%)*100%=300%

    Affirming the usefulness of Com/Decom Ratio:
    For un-cached-data i.e. more of cases:
    lzturbo is ahead of bsc with about ((2037%-665%)/665%)*100%=206%

    Since Com/Decom Ratio takes into account all aspects except I/O-read-speed,
    regarding the better performer, the profit(boosting) i.e. Delivering(Loading+Decompressing) lies between 206% .. 300%.

    How about the case when I/O-read-speed exceeds even that of system ram!

    The incoming CPU-RAM revolution(commenced but just-started-to-unfold) will turn many benchmarks into falsity.
    The precedent is already here: the ioDrive Duo - The World's Fastest and Most Innovative SSD:

    With the ioDrive Duo, it is now possible for application, database and system administrators to get previously unheard-of levels of performance, protection and capacity utilization from a single server. Performance for multiple ioDrive Duos scales linearly, allowing any enterprise to scale performance to six gigabytes per-second (Gbytes/sec) of read bandwidth and over 500,000 read IOPS by using just four ioDrive Duos.
    > Sustain over a GB/sec of bandwidth
    > Easily RAID multiple ioDrive Duo's
    > OS support for Windows, Linux & Solaris

    Some offspring of this MONSTROUS family: ioXtreme: http://community.fusionio.com/media/p/463.aspx Extreme Performance for Extreme Users: with humble(compared to rest of the family) 600MB/s burst read.

    How about PC with 80 http://www.legitreviews.com/article/460/1/ cores CPU!
    How about Low-End-SERVER with 512 http://www.anandtech.com/show/3768/s...-consumption/2 CPUs!

    OK, let's do the dummy-math:
    - with 65nm process 3 years ago: 80 cores;
    - with 32nm process nowadays: ?(320 cores);
    - with 20nm process(IBM has done it already) some years later: ?(720+ cores).

    It is not as speculative as it may seem, for a core with consumption [far] less than 1W(Intel's 80-Core Teraflops Research Chip consumes only 62 watts) I wonder what you/I gonna do if the tomorrow-which-never-comes comes: a PC with 320 cores and no readiness/openmindedness to utilize such a might. Whether such a product(how profanely sounds here) will be available is not important at all, but that, as Nathan Kirsch wrote, "the 'Era Of Tera' is really on the way".
    Last edited by Sanmayce; 5th August 2010 at 19:51. Reason: Enrichment & Beautifying & error-fixing

  30. #60
    Member Sanmayce's Avatar
    Join Date
    Apr 2010
    Location
    Sofia
    Posts
    57
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Best string-hash-function or an-open-letter to pro-programmer

    I firmly believe that fastest decompression/search/hash etudes share one common thread: simplicity(as base of any further multiplicity).

    Hi Mr. Noll,

    despite my intention not to disturb you with my dummy(but precious for me) C etudes, I cannot avoid saying 'THANKS' for Fowler/Noll/Vo hash which dethroned(replaced) my inferior '2in1' hash function in Leprechaun r.13++, thus giving way to r.13+++ which provides increased by up to ((6,329,353W/s - 5,974,513W/s)/5,974,513W/s)*100% = 5.9% performance due to 0.6%-1.1% better dispersion and faster execution.

    _FNV1A_Hash_SHIFTless_XORless PROC NEAR
    ; Line 721
    mov edx, DWORD PTR _str$[esp-4]
    mov cl, BYTE PTR [edx]
    test cl, cl
    mov eax, -2128831035
    je SHORT $L1582
    npad 1
    $L1580:
    ; Line 722
    movzx ecx, cl
    xor ecx, eax
    imul ecx, 16777619
    inc edx
    mov eax, ecx
    mov cl, BYTE PTR [edx]
    test cl, cl
    jne SHORT $L1580
    $L1582:
    ; Line 726
    and eax, 8191
    ; Line 727
    ret 0
    _FNV1A_Hash_SHIFTless_XORless ENDP
    The link for Leprechaun_r13+++++_EXE+ELF.zip is here, there are: source, PDF booklet, EXE.
    The Fowler/Noll/Vo hash:
    http://www.isthe.com/chongo/src/fnv/fnv-5.0.2.tar.gz

    Again and again I am in position of the 'absolute-beginner-for-ever', or as my brother keeps on saying "Kaze must read more".
    Making my '2in1' hash took me more than 3-6 hours, and what: a good only to remain commented, in contrast: FNV proves that primes rule i.e. simplicity rules. I know the latter very well but it takes some science-alchemy-magic-revelation to bring perfection into our world. I am kind of dummy-dumb but foxy enough to see that being weak in science I must focus my intent in a higher sphere, namely revelation, why to waste time in betweens, ha-ha.

    As a virtual token of my gratitude allow me to salute you, this time, with a masterpiece movie: 'INK' made 2009.
    The scene where Ink talks, asks and finally understands is truly outstanding - I like it infinitely, in one word: An Apocalypse.

    Now, if I'm not lying(one never knows) I will not waste your time - I did fall in dismay in how many activities you are taking part.

    P.S.
    Allow me to share some thoughts of mine concerning my attitude towards ready-to-enter ideas.
    If you walk in Japan(not necessarily in cities but also in remote villages) and ask people on the street whether dolls are vivid( i.e. have an anima(soul)), immediately without any hesitation the answer will be YES. Not so in countries without reverence to NATURE and compassion towards souls of all living creatures.
    In that way, similarly, I treat many things including even tools(programs), in particular Leprechaun - the master-rhymer elf, a spirit from the forest, who fascinated me with his playfulness, charming greediness(his way to enjoy) and direct approach, as having personality.

    Best regards,
    Georgi 'Kaze' 'Sanmayce'

    Supplemental #1:
    Code:
    UINT FNV1A_Hash_WHIZ(const char *str, SIZE_T wrdlen)
    {
        const UINT PRIME = 1607;
     
        UINT hash32 = 2166136261;
        const char *p = str;
     
        for(; wrdlen >= sizeof(DWORD); wrdlen -= sizeof(DWORD), p += sizeof(DWORD)) {
            hash32 = (hash32 ^ *(DWORD *)p) * PRIME;
        }
        if (wrdlen & sizeof(WORD)) {
            hash32 = (hash32 ^ *(WORD*)p) * PRIME;
            p += sizeof(WORD);
        }
        if (wrdlen & 1) 
            hash32 = (hash32 ^ *p) * PRIME;
     
        return hash32 ^ (hash32 >> 16);
    }
    the source and some other variants of FNV1A alongside with light&heavy tests at:
    http://www.strchr.com/hash_functions

    Supplemental #2:
    http://encode.su/threads/1160-Fastes...-hash-function!
    Last edited by Sanmayce; 15th November 2010 at 17:53. Reason: New revision added, fastest revision FNV1A_WHIZ added; added FNV1A_Jesteress

Page 2 of 3 FirstFirst 123 LastLast

Similar Threads

  1. Fastest Compressors
    By LovePimple in forum Forum Archive
    Replies: 0
    Last Post: 1st November 2006, 05:36

Tags for this Thread

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •