
Thread: Paq8sk

  1. #1
    Member
    Join Date
    Aug 2015
    Location
    indonesia
    Posts
    238
    Thanks
    29
    Thanked 24 Times in 22 Posts

    Paq8sk

    PAQ8SK v1
    This is forked from paq8pxv182fix1, with a tweaked text model and memory usage increased from 4 GB up to about 7 GB. Here are the source code and the binary. Results for the xml file using the -9eta option:
    paq8pxv182fix1 250750 bytes
    paq8sk 249237 bytes

    enwik8 using paq8sk -9eta:
    16289679 bytes in 30307.92 sec

  2. #2
    Member
    Join Date
    Aug 2015
    Location
    indonesia
    Posts
    238
    Thanks
    29
    Thanked 24 Times in 22 Posts
    @Shelwien, could you add mod_sse.cpp to this version, please? Thank you.

  3. #3
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,834
    Thanks
    287
    Thanked 1,239 Times in 695 Posts
    You can do it, the changes are very simple:
    Code:
    diff -rw paq8pxd-82/paq8pxd.cpp paq8pxd-82s/paq8pxd.cpp
    13380a13382,13383
    > #include "mod_sse.h"
    13388a13392
    >   SSE* sse;
    13393c13397,13399
    <     p+=p==0;
    ---
    > p = sse->Predict(p);
    > sse->Perceive(i);
    13409c13415,13416
    <     p+=p==0;
    ---
    > p = sse->Predict(p);
    > sse->Perceive(predictor.x.y);
    13462c13470
    >      delete sse;
    13467a13476,13477
    >   sse = new SSE;

  4. #4
    Member
    Join Date
    Aug 2015
    Location
    indonesia
    Posts
    238
    Thanks
    29
    Thanked 24 Times in 22 Posts
    In which part do I have to insert the above code?

  5. #5
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,834
    Thanks
    287
    Thanked 1,239 Times in 695 Posts
    Look at the pxd source I posted and compare it vs original v82.
    Basically #include can be added anywhere and I inserted Predict/Perceive calls into entropy coder functions.
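
    A minimal sketch of where such calls could sit in a paq-style binary arithmetic coder. The `SSE` struct below is a hypothetical stand-in, not the contents of mod_sse.h; only the call names `Predict` and `Perceive` come from the diff. The 12-bit probability range, the bucket table, and the `Encoder` layout are assumptions for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the class in mod_sse.h (the real one is not shown
// in this thread). Probabilities are assumed 12-bit: P(bit==1) in 1..4095.
struct SSE {
    std::vector<uint16_t> t;  // one adaptive probability per input bucket
    int idx = 0;              // bucket chosen by the last Predict() call
    SSE() : t(33) {
        for (int i = 0; i < 33; ++i) {
            int v = i * 128;  // start as an identity map
            t[i] = static_cast<uint16_t>(v < 1 ? 1 : (v > 4095 ? 4095 : v));
        }
    }
    // Refine the model probability p. The result is never 0, which is the
    // case the removed "p+=p==0;" line handled.
    int Predict(int p) {
        assert(p >= 0 && p < 4096);
        idx = (p + 64) >> 7;  // nearest of 33 buckets
        return t[idx];
    }
    // Move the chosen bucket toward the bit that was actually coded.
    void Perceive(int bit) {
        int target = bit ? 4095 : 1;
        t[idx] = static_cast<uint16_t>(t[idx] + (target - t[idx]) / 32);
    }
};

// One encode step of a carry-less binary arithmetic coder (assumed layout):
// Predict before the interval split, Perceive once the bit is known.
struct Encoder {
    uint32_t x1 = 0, x2 = 0xffffffffu;  // current interval [x1, x2]
    std::vector<uint8_t> out;           // compressed bytes
    SSE sse;
    void encodeBit(int modelP, int bit) {
        int p = sse.Predict(modelP);
        uint32_t xmid = x1 + static_cast<uint32_t>(
            (static_cast<uint64_t>(x2 - x1) * p) >> 12);
        if (bit) x2 = xmid; else x1 = xmid + 1;
        sse.Perceive(bit);
        // Shift out leading bytes that x1 and x2 already share.
        while (((x1 ^ x2) & 0xff000000u) == 0) {
            out.push_back(static_cast<uint8_t>(x2 >> 24));
            x1 <<= 8;
            x2 = (x2 << 8) | 255;
        }
    }
};
```

    The decoder would make the same two calls in the same order, passing the decoded bit to `Perceive`, so that both sides keep identical SSE state.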

  6. #6
    Member
    Join Date
    Aug 2015
    Location
    indonesia
    Posts
    238
    Thanks
    29
    Thanked 24 Times in 22 Posts
    Paq8sk2
    This is forked from paq8pxd83, with some improvements to the text model: it uses 2 context maps and an experimental hash function.
    Results for the dickens file using -s6 -w:
    paq8pxd83 1917517 bytes
    paq8sk2 1916707 bytes

  7. #7
    Member
    Join Date
    Jun 2009
    Location
    Puerto Rico
    Posts
    208
    Thanks
    98
    Thanked 27 Times in 20 Posts
    Do you have the source code for your modified version on GitHub or somewhere else?

  8. #8
    Member
    Join Date
    Aug 2015
    Location
    indonesia
    Posts
    238
    Thanks
    29
    Thanked 24 Times in 22 Posts
    Here are the source and the binary. By the way, these are the enwik8 results using the -x9 -w and -x10 -w options:

    paq8sk2 using -x9 -w enwik8
    Total 100000000 bytes compressed to 15964967 bytes.
    Time 23532.64 sec, used 5734 MB (1717753330 bytes) of memory
    paq8sk2 using -x10 -w enwik8
    Total 100000000 bytes compressed to 15886109 bytes.
    Time 24567.55 sec, used 9566 MB (1440929250 bytes) of memory
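
    Note that the byte counts in parentheses do not match the MB figures: 5734 MB is about 6.0e9 bytes, not 1717753330. The numbers are consistent with the byte count having wrapped modulo 2^32. That is an inference from the figures, not something confirmed in this thread. A small sketch that reproduces the wrap; `wrap32`, `toMiB` and `unwrap` are illustrative helper names, and "MB" is assumed to mean 2^20 bytes:

```cpp
#include <cassert>
#include <cstdint>

// Illustrative helpers (not from the paq8sk source).

// Truncate a 64-bit byte count to 32 bits, as happens when it is stored
// or printed as a 32-bit integer.
constexpr uint32_t wrap32(uint64_t bytes) { return static_cast<uint32_t>(bytes); }

// Whole MiB in a byte count (assumes "MB" in the log means 2^20 bytes).
constexpr uint64_t toMiB(uint64_t bytes) { return bytes >> 20; }

// Find the smallest byte count that wraps to `shown` and equals `mib`
// whole MiB. Returns 0 if none is found within the first 16 wraps.
inline uint64_t unwrap(uint32_t shown, uint64_t mib) {
    for (uint64_t k = 0; k < 16; ++k) {
        uint64_t candidate = shown + (k << 32);
        if (toMiB(candidate) == mib) return candidate;
    }
    return 0;
}
```

    Under these assumptions, `unwrap(1717753330u, 5734)` gives 6012720626 (one wrap) and `unwrap(1440929250u, 9566)` gives 10030863842 (two wraps), so the MB figure is the one to trust.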

  9. #9
    Member
    Join Date
    Aug 2015
    Location
    indonesia
    Posts
    238
    Thanks
    29
    Thanked 24 Times in 22 Posts
    @Sportman, could you test the -x15 -w option on enwik9? Thank you.

  10. #10
    Member
    Join Date
    Aug 2015
    Location
    indonesia
    Posts
    238
    Thanks
    29
    Thanked 24 Times in 22 Posts
    Paq8sk5

    The result for the dickens file using the -s6 -w option is 1910704 bytes. enwik9 is in progress.

  11. #11
    Member
    Join Date
    Aug 2015
    Location
    indonesia
    Posts
    238
    Thanks
    29
    Thanked 24 Times in 22 Posts
    paq8sk5, enwik9 using -x10 -w:
    Total 1000000000 bytes compressed to 125720675 bytes.
    Time 219889.84 sec, used 10435 MB (2352516682 bytes) of memory

    @Sportman / @Darek, could you test it using the -x15 -w option, please?

  12. #12
    Member
    Join Date
    Aug 2015
    Location
    indonesia
    Posts
    238
    Thanks
    29
    Thanked 24 Times in 22 Posts
    paq8sk5, enwik8 using the -x10 -w option:
    Total 100000000 bytes compressed to 15863844 bytes.
    Time 20171.72 sec, used 10435 MB (2352516648 bytes) of memory.

  13. #13
    Member
    Join Date
    Aug 2008
    Location
    Planet Earth
    Posts
    933
    Thanks
    95
    Thanked 358 Times in 250 Posts
    enwik8
    15,795,668 bytes, 8,568.886 sec., paq8sk5 -x15 -w

    paq8sk5 displays the input (XML text) on the console during the start phase.

  14. Thanks:

    suryakandau@yahoo.co.id (24th April 2020)

  15. #14
    Member
    Join Date
    Aug 2015
    Location
    indonesia
    Posts
    238
    Thanks
    29
    Thanked 24 Times in 22 Posts
    How about enwik9 using the -x15 -w option? Could you test it, please?

  16. #15
    Member
    Join Date
    Dec 2008
    Location
    Poland, Warsaw
    Posts
    1,097
    Thanks
    675
    Thanked 431 Times in 329 Posts
    @suryakandau - yes, I could test it.

    As I wrote before, I need to finish my running test plan first. It will take the next 3-4 days, maybe 5.

    Then I'll test this version; that takes about 1.5 days for me. Compared to the latest paq8pxd version, it could achieve 124'4xx'xxx bytes.

  17. #16
    Member
    Join Date
    Aug 2015
    Location
    indonesia
    Posts
    238
    Thanks
    29
    Thanked 24 Times in 22 Posts
    Paq8sk7
    - improve textmodel
    - implement new experimental hash function
    - improve wordmodel
    Result for the dickens file using -s6 -w:
    Total 10192446 bytes compressed to 1908246 bytes.
    Time 2012.67 sec, used 1452 MB (1522754301 bytes) of memory

  18. #17
    Member
    Join Date
    Dec 2008
    Location
    Poland, Warsaw
    Posts
    1,097
    Thanks
    675
    Thanked 431 Times in 329 Posts
    OK, should I switch my test from paq8sk5 to paq8sk7?

  19. #18
    Member
    Join Date
    Aug 2015
    Location
    indonesia
    Posts
    238
    Thanks
    29
    Thanked 24 Times in 22 Posts
    yes thank you

  20. #19
    Member
    Join Date
    Aug 2008
    Location
    Planet Earth
    Posts
    933
    Thanks
    95
    Thanked 358 Times in 250 Posts
    enwik8:
    15,787,302 bytes, 9,795.359 sec., paq8sk7 -x15 -w

    paq8sk7 also displays the input (XML text) on the console during the start phase (this pushes older data out of the console buffer).
    paq8sk7 crashed just before shutting down.

  21. #20
    Member
    Join Date
    Aug 2015
    Location
    indonesia
    Posts
    238
    Thanks
    29
    Thanked 24 Times in 22 Posts
    I guess it is caused by the -x15 option. Please try the -x14 option.

  22. #21
    Member
    Join Date
    Aug 2008
    Location
    Planet Earth
    Posts
    933
    Thanks
    95
    Thanked 358 Times in 250 Posts
    enwik8:
    15,787,454 bytes, 8,760.268 sec., paq8sk7 -x14 -w

    without crash.

  23. Thanks:

    suryakandau@yahoo.co.id (27th April 2020)

  24. #22
    Member
    Join Date
    Dec 2008
    Location
    Poland, Warsaw
    Posts
    1,097
    Thanks
    675
    Thanked 431 Times in 329 Posts
    Ok, should I test -x15 or -x14 then? Maybe -x14 would be safer.

  25. #23
    Member
    Join Date
    Aug 2015
    Location
    indonesia
    Posts
    238
    Thanks
    29
    Thanked 24 Times in 22 Posts
    It would be better to test with -x14.

  26. #24
    Member
    Join Date
    Aug 2015
    Location
    indonesia
    Posts
    238
    Thanks
    29
    Thanked 24 Times in 22 Posts
    paq8sk7, enwik9 using -x10 -w:
    Total 1000000000 bytes compressed to 125565180 bytes.
    Time 214767.55 sec, used 11459 MB (3426335285 bytes) of memory

  27. #25
    Member
    Join Date
    Dec 2008
    Location
    Poland, Warsaw
    Posts
    1,097
    Thanks
    675
    Thanked 431 Times in 329 Posts
    @suryakandau - sorry for the delay.

    Score for enwik9 on paq8sk7 -x14 -w => 123'723'399 bytes, time 110'022.69s. A very good score for this version, without an external dictionary and with the -x14 option.
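
    To compare runs like this one across file sizes and machines, it can help to convert them to bits per input byte and input bytes per second. A small sketch; `bitsPerByte` and `bytesPerSecond` are illustrative helper names, not part of any paq8 tool:

```cpp
#include <cassert>
#include <cstdint>

// Illustrative helpers for comparing results posted in a thread like this.

// Bits of output per input byte: 8 * compressed / original.
inline double bitsPerByte(uint64_t compressed, uint64_t original) {
    assert(original > 0);
    return 8.0 * static_cast<double>(compressed) / static_cast<double>(original);
}

// Input bytes processed per second of reported time.
inline double bytesPerSecond(uint64_t original, double seconds) {
    assert(seconds > 0.0);
    return static_cast<double>(original) / seconds;
}
```

    For the figures above, `bitsPerByte(123723399, 1000000000)` is about 0.99 and `bytesPerSecond(1000000000, 110022.69)` is about 9089, that is roughly 9 KB of input per second.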

  28. Thanks:

    suryakandau@yahoo.co.id (30th April 2020)

  29. #26
    Member
    Join Date
    Aug 2015
    Location
    indonesia
    Posts
    238
    Thanks
    29
    Thanked 24 Times in 22 Posts
    If I use an external dictionary, can it get below 123.xxx.xxx bytes?

  30. #27
    Member
    Join Date
    Dec 2008
    Location
    Poland, Warsaw
    Posts
    1,097
    Thanks
    675
    Thanked 431 Times in 329 Posts
    I don't know. If this version scales similarly to Kaitz's and the same idea is used, then it should be about 122.8xx.xxx - 122.9xx.xxx bytes.

  31. #28
    Member
    Join Date
    Aug 2015
    Location
    indonesia
    Posts
    238
    Thanks
    29
    Thanked 24 Times in 22 Posts
    Paq8sk9


    Now this version uses an external dictionary. The result for enwik9 using -x10 -w -e1,english.dic:
    Total 1000000000 bytes compressed to 124758108 bytes.
    Time 227725.05 sec, used 11672 MB (3649474565 bytes) of memory


    paq8sk10 is running now for enwik9

  32. #29
    Member
    Join Date
    Dec 2008
    Location
    Poland, Warsaw
    Posts
    1,097
    Thanks
    675
    Thanked 431 Times in 329 Posts
    Ok, then we wait for paq8sk10.

  33. #30
    Member
    Join Date
    Aug 2008
    Location
    Planet Earth
    Posts
    933
    Thanks
    95
    Thanked 358 Times in 250 Posts
    enwik8:
    15,774,291 bytes, 8,947.975 sec., paq8sk9 -x15 -w

  34. Thanks (2):

    Darek (2nd May 2020),suryakandau@yahoo.co.id (1st May 2020)
