
Thread: FARSH: hashing 30 GB/s!

  1. #61
    Programmer Bulat Ziganshin
    Join Date
    Mar 2007
    Location
    Uzbekistan
    Posts
    4,572
    Thanks
    780
    Thanked 687 Times in 372 Posts
    JamesB, I have added an anti-optimization trick to the loop that measures speed for the first column. Please try it with the compiler/options that produced the strange results.

    Overall, it seems that newer Intel CPUs from Ivy Bridge to Skylake have pretty close speed per unit of clock frequency.

    We are still missing Nehalem, Sandy Bridge, low-power CPUs (two generations of Atoms, Bobcat, Jaguar), Pentium 4, the original Athlon 64 and any older CPUs.
    Last edited by Bulat Ziganshin; 16th July 2016 at 22:32.

  2. #62
    Member
    Join Date
    Jan 2015
    Location
    Hungary
    Posts
    12
    Thanks
    30
    Thanked 7 Times in 3 Posts
    AMD Sempron 3000+ Socket A (462) 2000 MHz (real clock 1988 MHz) http://www.cpu-world.com/CPUs/K7/AMD...A3000BOX).html

    Code:
    C:\teszt\farsh02-benchmark>for %e in (*.exe) do @start /b /wait  /realtime %e 1
    ..
    aligned-farsh-x86         |   2.057 GB/s =  1.916 GiB/s  |   2.536 GB/s =  2.362 GiB/s
    AMD Sempron(TM) 3000+: SSE
    ..
    farsh-x86                 |   2.022 GB/s =  1.884 GiB/s  |   2.348 GB/s =  2.187 GiB/s
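
Each result column reports the same throughput in two units: GB/s (decimal, 10^9 bytes) and GiB/s (binary, 2^30 bytes). A minimal sketch of that conversion follows, checked against the figures above; the function name `gb_to_gib` is made up for illustration and is not part of the FARSH benchmark.

```python
GIB = 2 ** 30  # bytes in one gibibyte


def gb_to_gib(gb_per_s: float) -> float:
    """Convert decimal GB/s (10^9 bytes/s) to binary GiB/s (2^30 bytes/s)."""
    return gb_per_s * 1e9 / GIB


# First column of the aligned-farsh-x86 line above: 2.057 GB/s
print(f"{gb_to_gib(2.057):.3f}")  # prints 1.916
```

The two units differ by a constant factor of about 1.074, so either column can be used for comparisons as long as the same one is used throughout.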

  3. #63
    Member
    Join Date
    Nov 2014
    Location
    California
    Posts
    158
    Thanks
    51
    Thanked 44 Times in 33 Posts
    For Sandy Bridge:

    for %e in (*.exe) do @start /b /wait /realtime %e 1
    aligned-farsh-x64-nosimd  |  12.477 GB/s = 11.620 GiB/s  |  14.035 GB/s = 13.071 GiB/s
    aligned-farsh-x64         |  19.638 GB/s = 18.289 GiB/s  |  25.462 GB/s = 23.714 GiB/s
    aligned-farsh-x86-sse2    |  22.342 GB/s = 20.808 GiB/s  |  26.134 GB/s = 24.339 GiB/s
    aligned-farsh-x86         |   5.088 GB/s =  4.738 GiB/s  |   5.834 GB/s =  5.433 GiB/s
    Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz: AVX
    farsh-x64-nosimd          |  12.061 GB/s = 11.233 GiB/s  |  13.780 GB/s = 12.834 GiB/s
    farsh-x64                 |  21.969 GB/s = 20.460 GiB/s  |  25.183 GB/s = 23.454 GiB/s
    farsh-x86-sse2            |  20.794 GB/s = 19.366 GiB/s  |  24.705 GB/s = 23.008 GiB/s
    farsh-x86                 |   5.098 GB/s =  4.748 GiB/s  |   6.012 GB/s =  5.599 GiB/s
    Last edited by hexagone; 16th July 2016 at 21:08. Reason: formatting

  4. #64
    Member
    Join Date
    Dec 2011
    Location
    Cambridge, UK
    Posts
    506
    Thanks
    187
    Thanked 177 Times in 120 Posts
    A very quick test of FARSH on an AVX-512 system. There's no dedicated AVX-512 version, so I tried the normal AVX2 build and also the non-SIMD variant with arch=core2 vs arch=native. Odd how it's so much slower with icc.

    Code:
    gcc 6.2.0:
    farsh-x64-avx2            |  10.529 GB/s =  9.805 GiB/s  |  12.352 GB/s = 11.504 GiB/s
    farsh-x64<core2>          |   2.514 GB/s =  2.341 GiB/s  |   2.416 GB/s =  2.250 GiB/s
    farsh-x64-<native>        |   7.415 GB/s =  6.905 GiB/s  |  29.285 GB/s = 27.274 GiB/s
    
    icc 16.0.2:
    farsh-x64-avx2            |   6.712 GB/s =  6.251 GiB/s  |   7.776 GB/s =  7.242 GiB/s
    farsh-x64-<native>        |   3.998 GB/s =  3.724 GiB/s  |   4.262 GB/s =  3.969 GiB/s
    
    
    
    model name    : Intel(R) Xeon Phi(TM) CPU 7210 @ 1.30GHz

  5. Thanks (2):

    Bulat Ziganshin (8th September 2016),Cyan (8th September 2016)

  6. #65
    Programmer Bulat Ziganshin
    Join Date
    Mar 2007
    Location
    Uzbekistan
    Posts
    4,572
    Thanks
    780
    Thanked 687 Times in 372 Posts
    It seems gcc successfully optimized the main loop for AVX-512? With 72 cores, the overall speed should be about 2.1 TB/s.

  7. #66
    Member
    Join Date
    Dec 2011
    Location
    Cambridge, UK
    Posts
    506
    Thanks
    187
    Thanked 177 Times in 120 Posts
    It wasn't multi-threaded

    I'm not sure at what speed the memory can be driven, but I believe there is distributed memory per core so in theory you can parallelise that too. Getting data into the per-core memory would then be the bottleneck. I think we're over-doing it a bit though.

    Anyway, this machine "only" had 64 cores (x4 hyperthreads).
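
The 2.1 TB/s estimate above appears to come from multiplying the fastest single-threaded figure (29.285 GB/s, second column of the gcc native build) by the core count. A rough sketch of that arithmetic for both 72 and 64 cores follows. It assumes perfectly linear scaling with no memory bottleneck, which, as noted above, is unlikely to hold in practice; `aggregate_tb_per_s` is a hypothetical helper name.

```python
def aggregate_tb_per_s(per_core_gb_per_s: float, cores: int) -> float:
    """Upper-bound aggregate throughput in TB/s, assuming ideal linear scaling."""
    return per_core_gb_per_s * cores / 1000.0


PER_CORE_GB_PER_S = 29.285  # single-threaded figure from the Xeon Phi 7210 run

for cores in (72, 64):
    print(f"{cores} cores: {aggregate_tb_per_s(PER_CORE_GB_PER_S, cores):.2f} TB/s")
```

With 64 cores the same assumption gives roughly 1.87 TB/s rather than 2.1 TB/s.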

  8. #67
    Member Razor12911
    Join Date
    May 2016
    Location
    South Africa
    Posts
    32
    Thanks
    58
    Thanked 56 Times in 16 Posts
    Intel(R) Core(TM)2 Extreme CPU X9650 @ 3.00GHz: SSE 4.1
    Code:
    aligned-farsh-x64-nosimd  |   9.655 GB/s =  8.992 GiB/s  |   7.473 GB/s =  6.960 GiB/s
    aligned-farsh-x64         |  15.810 GB/s = 14.725 GiB/s  |  19.617 GB/s = 18.269 GiB/s
    aligned-farsh-x86-sse2    |  17.291 GB/s = 16.104 GiB/s  |  19.806 GB/s = 18.446 GiB/s
    aligned-farsh-x86         |   4.640 GB/s =  4.321 GiB/s  |   5.449 GB/s =  5.075 GiB/s
    farsh-x64-nosimd          |   5.314 GB/s =  4.949 GiB/s  |   5.551 GB/s =  5.170 GiB/s
    farsh-x64                 |   6.789 GB/s =  6.322 GiB/s  |   7.865 GB/s =  7.325 GiB/s
    farsh-x86-sse2            |   6.872 GB/s =  6.400 GiB/s  |   8.898 GB/s =  8.287 GiB/s
    farsh-x86                 |   3.930 GB/s =  3.660 GiB/s  |   4.725 GB/s =  4.401 GiB/s
    Intel(R) Celeron(R) CPU 1005M @ 1.90GHz: SSE 4.2
    Code:
    aligned-farsh-x64-nosimd  |   8.419 GB/s =  7.841 GiB/s  |   8.984 GB/s =  8.367 GiB/s
    aligned-farsh-x64         |  13.252 GB/s = 12.342 GiB/s  |  16.139 GB/s = 15.031 GiB/s
    aligned-farsh-x86-sse2    |  12.614 GB/s = 11.747 GiB/s  |  15.991 GB/s = 14.893 GiB/s
    aligned-farsh-x86         |   2.941 GB/s =  2.739 GiB/s  |   3.318 GB/s =  3.090 GiB/s
    farsh-x64-nosimd          |   8.207 GB/s =  7.644 GiB/s  |   8.985 GB/s =  8.368 GiB/s
    farsh-x64                 |  13.514 GB/s = 12.586 GiB/s  |  15.595 GB/s = 14.524 GiB/s
    farsh-x86-sse2            |  12.100 GB/s = 11.269 GiB/s  |  15.822 GB/s = 14.736 GiB/s
    farsh-x86                 |   2.894 GB/s =  2.695 GiB/s  |   3.467 GB/s =  3.229 GiB/s
    Intel(R) Atom(TM) CPU Z3735G @ 1.33GHz: AES-NI
    Code:
    aligned-farsh-x86-sse2    |   2.713 GB/s =  2.526 GiB/s  |   3.260 GB/s =  3.036 GiB/s
    aligned-farsh-x86         |   1.336 GB/s =  1.244 GiB/s  |   1.621 GB/s =  1.509 GiB/s
    farsh-x86-sse2            |   3.178 GB/s =  2.960 GiB/s  |   3.734 GB/s =  3.478 GiB/s
    farsh-x86                 |   1.384 GB/s =  1.289 GiB/s  |   1.632 GB/s =  1.520 GiB/s
    Last edited by Razor12911; 17th October 2016 at 10:03.
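
To compare CPUs running at different clocks, as in the earlier remark about speed per frequency, throughput can be normalized to bytes per cycle. The sketch below uses first-column figures from this page and the nominal clock from each CPU name string. That clock is an assumption: the actual frequency during the run (turbo, throttling) is not reported, so the results are only approximate.

```python
def bytes_per_cycle(gb_per_s: float, clock_ghz: float) -> float:
    """GB/s divided by GHz gives bytes per clock cycle (the 10^9 factors cancel)."""
    return gb_per_s / clock_ghz


# (label, first-column GB/s, nominal clock in GHz taken from the CPU name string)
samples = [
    ("Core i7-2600, farsh-x64", 21.969, 3.40),
    ("Celeron 1005M, farsh-x64", 13.514, 1.90),
    ("Atom Z3735G, farsh-x86-sse2", 3.178, 1.33),
]

for label, gb_per_s, clock_ghz in samples:
    print(f"{label}: {bytes_per_cycle(gb_per_s, clock_ghz):.2f} bytes/cycle")
```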
