Not much better on Phenom 2:
Code:
pcbsd-8973% please ./fsbench -i3 zlib,1 zlib,2 zlib,3 zlib,4 zlib,5 zlib,6 zlib,7 zlib,8 zlib,9 ~/bench/scc1.tar
Codec  version           args  C.Size   (C.Ratio)  E.Speed    D.Speed   E.Eff.  D.Eff.
zlib   2014-06-16 Intel  1     5583776  (x 2.256)  40.6 MB/s  213 MB/s  22e6    118e6
zlib   2014-06-16 Intel  2     5452748  (x 2.310)  37.7 MB/s  221 MB/s  21e6    125e6
zlib   2014-06-16 Intel  3     5320826  (x 2.367)  32.2 MB/s  229 MB/s  18e6    132e6
zlib   2014-06-16 Intel  4     5248936  (x 2.400)  29.0 MB/s  223 MB/s  16e6    129e6
zlib   2014-06-16 Intel  5     5142273  (x 2.449)  22.5 MB/s  228 MB/s  13e6    134e6
zlib   2014-06-16 Intel  6     5069744  (x 2.484)  16.4 MB/s  231 MB/s  9e6     137e6
zlib   2014-06-16 Intel  7     5056512  (x 2.491)  14.0 MB/s  234 MB/s  8560e3  140e6
zlib   2014-06-16 Intel  8     5041030  (x 2.499)  9594 KB/s  236 MB/s  5754e3  141e6
zlib   2014-06-16 Intel  9     5039048  (x 2.500)  8680 KB/s  235 MB/s  5207e3  140e6
Codec  version           args  C.Size   (C.Ratio)  E.Speed    D.Speed   E.Eff.  D.Eff.
done... (3*X*1) iteration(s)).
pcbsd-8973% please ./fsbench -i3 zlib,1 zlib,2 zlib,3 zlib,4 zlib,5 zlib,6 zlib,7 zlib,8 zlib,9 ~/bench/scc1.tar
Codec  version  args  C.Size   (C.Ratio)  E.Speed    D.Speed   E.Eff.  D.Eff.
zlib   1.2.8    1     5583512  (x 2.256)  40.8 MB/s  217 MB/s  22e6    120e6
zlib   1.2.8    2     5451648  (x 2.310)  37.6 MB/s  226 MB/s  21e6    128e6
zlib   1.2.8    3     5315631  (x 2.369)  31.3 MB/s  235 MB/s  18e6    135e6
zlib   1.2.8    4     5244572  (x 2.402)  28.8 MB/s  229 MB/s  16e6    133e6
zlib   1.2.8    5     5132183  (x 2.454)  21.6 MB/s  233 MB/s  12e6    137e6
zlib   1.2.8    6     5069744  (x 2.484)  15.9 MB/s  239 MB/s  9733e3  142e6
zlib   1.2.8    7     5056513  (x 2.491)  13.4 MB/s  238 MB/s  8225e3  142e6
zlib   1.2.8    8     5041027  (x 2.499)  9104 KB/s  240 MB/s  5460e3  143e6
zlib   1.2.8    9     5039044  (x 2.500)  8249 KB/s  240 MB/s  4949e3  143e6
Codec  version  args  C.Size   (C.Ratio)  E.Speed    D.Speed   E.Eff.  D.Eff.
done... (3*X*1) iteration(s)).
With the Intel version, compression is slightly faster, especially in the stronger modes, but decompression is slower. Maybe I'm doing something wrong?
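To put rough numbers on that comparison, here is a minimal Python sketch that computes the per-level speed difference between the Intel build and stock zlib 1.2.8. The speeds are copied from the two runs above; the names `intel`, `stock`, `percent_change` and `compare` are made up for this illustration and are not part of fsbench. With only 3 iterations per run, small differences may well be noise.

```python
# level -> (compression speed, decompression speed), copied from the runs above.
# Compression is in MB/s for levels 1-7 and KB/s for levels 8-9 in BOTH runs,
# so the per-level ratios do not depend on the unit.
intel = {1: (40.6, 213), 2: (37.7, 221), 3: (32.2, 229),
         4: (29.0, 223), 5: (22.5, 228), 6: (16.4, 231),
         7: (14.0, 234), 8: (9594, 236), 9: (8680, 235)}
stock = {1: (40.8, 217), 2: (37.6, 226), 3: (31.3, 235),
         4: (28.8, 229), 5: (21.6, 233), 6: (15.9, 239),
         7: (13.4, 238), 8: (9104, 240), 9: (8249, 240)}


def percent_change(new, old):
    """Relative change of `new` versus `old`, in percent."""
    return (new / old - 1.0) * 100.0


def compare(candidate, baseline):
    """Return level -> (compression change %, decompression change %)."""
    rows = {}
    for level in sorted(candidate):
        enc = percent_change(candidate[level][0], baseline[level][0])
        dec = percent_change(candidate[level][1], baseline[level][1])
        rows[level] = (enc, dec)
    return rows


if __name__ == "__main__":
    for level, (enc, dec) in compare(intel, stock).items():
        print(f"zlib,{level}: compression {enc:+.1f}%, decompression {dec:+.1f}%")
```

On these figures the Intel build is marginally slower at level 1, about 5% faster at level 9, and slower at decompression on every level.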
ADDED: benchmark code is here:
https://chiselapp.com/user/Justin_be.../dir?type=tree
If you want to check for yourself, you need to build 2 versions of the program: one stock, and one where you edit CMakeLists.txt and change
Code:
set(USE_ZLIB 1)
set(USE_ZLIB_INTEL 0)
to
Code:
set(USE_ZLIB 0)
set(USE_ZLIB_INTEL 1)
ADDED:
If I enable SSE2, the code gets faster:
Code:
pcbsd-8973% please ./fsbench -i3 zlib,1 zlib,2 zlib,3 zlib,4 zlib,5 zlib,6 zlib,7 zlib,8 zlib,9 ~/bench/scc1.tar
Codec  version           args  C.Size   (C.Ratio)  E.Speed    D.Speed   E.Eff.  D.Eff.
zlib   2014-06-16 Intel  1     5583776  (x 2.256)  52.4 MB/s  215 MB/s  29e6    119e6
zlib   2014-06-16 Intel  2     5452748  (x 2.310)  47.7 MB/s  222 MB/s  27e6    126e6
zlib   2014-06-16 Intel  3     5320826  (x 2.367)  39.1 MB/s  233 MB/s  22e6    134e6
zlib   2014-06-16 Intel  4     5248936  (x 2.400)  34.4 MB/s  227 MB/s  20e6    132e6
zlib   2014-06-16 Intel  5     5142273  (x 2.449)  25.6 MB/s  231 MB/s  15e6    136e6
zlib   2014-06-16 Intel  6     5069744  (x 2.484)  18.1 MB/s  236 MB/s  10e6    140e6
zlib   2014-06-16 Intel  7     5056512  (x 2.491)  15.1 MB/s  237 MB/s  9225e3  142e6
zlib   2014-06-16 Intel  8     5041030  (x 2.499)  9.87 MB/s  240 MB/s  6061e3  144e6
zlib   2014-06-16 Intel  9     5039048  (x 2.500)  9125 KB/s  240 MB/s  5474e3  143e6
Codec  version           args  C.Size   (C.Ratio)  E.Speed    D.Speed   E.Eff.  D.Eff.
done... (3*X*1) iteration(s)).
It has options to use newer x86 extensions, but I don't have a CPU to test them on. Also, fsbench's support for such things is quite bad and needs a redesign.
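For anyone who wants to quantify the SSE2 gain from the Intel runs above, here is a rough Python sketch. fsbench mixes KB/s and MB/s in the E.Speed column, so the strings are normalised first. The helper names `to_kb_per_s` and `speedups` are invented for this example, and the 1024 KB per MB factor is an assumption (fsbench may use 1000); it only affects level 8, where the two runs use different units.

```python
def to_kb_per_s(text, kb_per_mb=1024):
    """Convert a speed string such as '9.87 MB/s' or '9594 KB/s' to KB/s.

    kb_per_mb=1024 is an ASSUMPTION about how fsbench scales its units.
    """
    value, unit = text.split()
    factor = {"KB/s": 1, "MB/s": kb_per_mb}[unit]
    return float(value) * factor


# E.Speed for zlib,1 .. zlib,9, copied from the Intel runs above.
plain = ["40.6 MB/s", "37.7 MB/s", "32.2 MB/s", "29.0 MB/s", "22.5 MB/s",
         "16.4 MB/s", "14.0 MB/s", "9594 KB/s", "8680 KB/s"]
sse2 = ["52.4 MB/s", "47.7 MB/s", "39.1 MB/s", "34.4 MB/s", "25.6 MB/s",
        "18.1 MB/s", "15.1 MB/s", "9.87 MB/s", "9125 KB/s"]


def speedups(before, after, kb_per_mb=1024):
    """Per-level ratio after/before; values above 1.0 mean `after` is faster."""
    return [to_kb_per_s(a, kb_per_mb) / to_kb_per_s(b, kb_per_mb)
            for b, a in zip(before, after)]


if __name__ == "__main__":
    for level, ratio in enumerate(speedups(plain, sse2), start=1):
        print(f"zlib,{level}: x{ratio:.3f}")
```

On these figures the SSE2 gain is largest in the fast modes (roughly 29% at level 1) and shrinks to about 5% at level 9.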