On Linux I prefer to use perf (from the Linux perf tools), as it gives much more information, including important things like clock speed variations.
Code:
@ seq3a[/tmp]; perf stat gzip enwik8
Performance counter stats for 'gzip enwik8':
7418.784633 task-clock # 0.997 CPUs utilized
626 context-switches # 0.000 M/sec
0 CPU-migrations # 0.000 M/sec
269 page-faults # 0.000 M/sec
16264693567 cycles # 2.192 GHz [83.31%]
8801104276 stalled-cycles-frontend # 54.11% frontend cycles idle [83.37%]
6272475913 stalled-cycles-backend # 38.56% backend cycles idle [66.63%]
19361014083 instructions # 1.19 insns per cycle
# 0.45 stalled cycles per insn [83.31%]
3888002759 branches # 524.075 M/sec [83.41%]
144395301 branch-misses # 3.71% of all branches [83.33%]
7.439664200 seconds time elapsed
@ seq3a[/tmp]; gunzip enwik8.gz
@ seq3a[/tmp]; perf stat gzip enwik8
Performance counter stats for 'gzip enwik8':
6459.695532 task-clock # 0.998 CPUs utilized
547 context-switches # 0.000 M/sec
0 CPU-migrations # 0.000 M/sec
269 page-faults # 0.000 M/sec
17658879966 cycles # 2.734 GHz [83.35%]
10214032205 stalled-cycles-frontend # 57.84% frontend cycles idle [83.31%]
7496824886 stalled-cycles-backend # 42.45% backend cycles idle [66.65%]
19356436112 instructions # 1.10 insns per cycle
# 0.53 stalled cycles per insn [83.35%]
3885518453 branches # 601.502 M/sec [83.37%]
144184551 branch-misses # 3.71% of all branches [83.33%]
6.475480539 seconds time elapsed
This system has automatic CPU frequency scaling enabled in the BIOS, which makes timings a complete nightmare. The elapsed and CPU times reported by "time" are highly variable, but we see from perf that the number of instructions is essentially the same in both runs. The cycle count still differs a bit (8.6% between the 2 runs), but this is much lower than the variation in CPU time (15%, task-clock 7418 ms vs 6460 ms), because the clock ran at 2.192 GHz in the first run and 2.734 GHz in the second. I can run it a few times until I see the CPU MHz has stabilised, which gives me more confidence the answer is correct. On the next two runs I got:
Code:
Performance counter stats for 'gzip enwik8':
5560.004486 task-clock # 0.998 CPUs utilized
472 context-switches # 0.000 M/sec
3 CPU-migrations # 0.000 M/sec
269 page-faults # 0.000 M/sec
16309805185 cycles # 2.933 GHz [83.33%]
8850957237 stalled-cycles-frontend # 54.27% frontend cycles idle [83.36%]
6365518735 stalled-cycles-backend # 39.03% backend cycles idle [66.67%]
19354319475 instructions # 1.19 insns per cycle
# 0.46 stalled cycles per insn [83.33%]
3887668162 branches # 699.220 M/sec [83.36%]
144299232 branch-misses # 3.71% of all branches [83.32%]
5.572425708 seconds time elapsed
@ seq3a[/tmp]; perf stat gzip enwik8
Performance counter stats for 'gzip enwik8':
5617.336945 task-clock # 0.998 CPUs utilized
477 context-switches # 0.000 M/sec
2 CPU-migrations # 0.000 M/sec
269 page-faults # 0.000 M/sec
16751214185 cycles # 2.982 GHz [83.30%]
9285739462 stalled-cycles-frontend # 55.43% frontend cycles idle [83.32%]
6746852365 stalled-cycles-backend # 40.28% backend cycles idle [66.65%]
19344892064 instructions # 1.15 insns per cycle
# 0.48 stalled cycles per insn [83.36%]
3886052970 branches # 691.796 M/sec [83.39%]
144311360 branch-misses # 3.71% of all branches [83.37%]
5.629982425 seconds time elapsed
I could do that with straight "time" of course, but it's nice to know *why* the variation is happening.
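To make the comparison concrete, here is a rough sketch in Python that recomputes the spreads from the four perf runs above. The task-clock, cycles and instructions figures are copied from the output; the helper names (spread, ghz) are just made up for this illustration and are not part of perf. It assumes "variation" means (max - min) / min.

```python
# Figures copied from the four `perf stat gzip enwik8` runs above:
# (task-clock in ms, cycles, instructions)
runs = [
    (7418.784633, 16264693567, 19361014083),
    (6459.695532, 17658879966, 19356436112),
    (5560.004486, 16309805185, 19354319475),
    (5617.336945, 16751214185, 19344892064),
]


def spread(values):
    """Relative spread (max - min) / min, as a percentage (assumed definition)."""
    values = list(values)
    return 100.0 * (max(values) - min(values)) / min(values)


def ghz(task_clock_ms, cycles):
    """Average clock speed: cycles per nanosecond of CPU time is GHz."""
    return cycles / (task_clock_ms * 1e6)


def report(label, subset):
    """Print the spread of each counter over the given runs."""
    print(label)
    print("  task-clock   spread: %.2f%%" % spread(r[0] for r in subset))
    print("  cycles       spread: %.2f%%" % spread(r[1] for r in subset))
    print("  instructions spread: %.3f%%" % spread(r[2] for r in subset))


report("first two runs:", runs[:2])
report("all four runs:", runs)

for task_clock_ms, cycles, instructions in runs:
    print("%.3f GHz, %.2f insns per cycle"
          % (ghz(task_clock_ms, cycles), instructions / cycles))
```

For the first two runs this should give roughly the 15% and 8.6% figures quoted earlier, with the instruction count varying by a tiny fraction of a percent. The GHz column is simply cycles divided by task-clock, which is how the clock speed perf prints can be cross-checked.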