
Thread: Code snippet to compute CPU frequency

  1. #1
    Programmer Bulat Ziganshin's Avatar
    Join Date
    Mar 2007
    Location
    Uzbekistan
    Posts
    4,553
    Thanks
    767
    Thanked 685 Times in 371 Posts

    Code snippet to compute CPU frequency

    Note that you absolutely need to compile with optimizations on (-O2) in order to get a correct result. The algorithm assumes that the CPU is superscalar and executes each ADD and each XOR in a single cycle, so a chain of dependent ADD/XOR operations advances by one operation per cycle while the loop overhead runs in parallel.


    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/time.h>


    // Perform CYCLES simple in-order (dependent) operations:
    // each iteration is 5 ADDs + 5 XORs, so CYCLES/10 iterations in total
    unsigned loop(int CYCLES)
    {
        // rand() keeps the operands unknown at compile time
        unsigned a = rand(), b = rand(), x = rand();
        for (int i = 0; i < CYCLES/10; i++)
        {
            x = (x + a) ^ b;
            x = (x + a) ^ b;
            x = (x + a) ^ b;
            x = (x + a) ^ b;
            x = (x + a) ^ b;
        }
        return x;
    }


    int main(int argc, char *argv[])
    {
        int CYCLES = 100*1000*1000;
        unsigned x = loop(CYCLES/10);  // warm up the CPU

        struct timeval tm, tn;
        gettimeofday(&tm, NULL);

        x += loop(CYCLES);

        gettimeofday(&tn, NULL);
        double t1 = (tn.tv_sec - tm.tv_sec) +
                    (tn.tv_usec - tm.tv_usec) / 1e6;

        // Testing x keeps the compiler from discarding the loops as dead code
        if (x)
            printf("Time: %.6f s, CPU freq %.2f GHz\n", t1, (CYCLES/1e9)/t1);
        return 0;
    }

  2. Thanks:

    Lucas (7th May 2020)

  3. #2
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,842
    Thanks
    288
    Thanked 1,244 Times in 697 Posts
    1. VS doesn't have gettimeofday: https://stackoverflow.com/questions/...ay-for-windows

    2. Your code didn't work correctly for me:
    Code:
    Time: 0.022079 s, CPU freq 4.53 GHz; RDTSC result: 2.01 Ghz
    3. There's this: https://github.com/google/benchmark/...leclock.h#L116
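Regarding point 1, one possible replacement for gettimeofday is the C11 function timespec_get, which to my knowledge is provided by both glibc and recent Visual Studio runtimes. The sketch below is only an illustration and not code from this thread; now_seconds is a made-up helper name.

```c
#include <assert.h>
#include <time.h>

/* Hypothetical helper (not from the thread): wall-clock time in seconds,
   read through C11 timespec_get instead of POSIX gettimeofday. */
double now_seconds(void)
{
    struct timespec ts;
    int base = timespec_get(&ts, TIME_UTC);
    assert(base == TIME_UTC);   /* timespec_get returns 0 on failure */
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}
```

With this, the two gettimeofday calls in the first post would become `double t0 = now_seconds();` before the loop and `double t1 = now_seconds() - t0;` after it. TIME_UTC is a wall clock, so a clock adjustment during the run would distort the result.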

  4. Thanks:

    Bulat Ziganshin (7th May 2020)

  5. #3
    Programmer Bulat Ziganshin's Avatar
    Yeah, my code measures the frequency directly, based on some assumptions about instruction execution on modern CPUs, rather than asking the OS or the CPU for this info.

    My goal was to write a small, portable piece of code measuring the CURRENT frequency of the CORE executing it. My best bet is that 2 GHz is the base freq of your CPU, since that's what RDTSC measures, while my code measured 4.5 GHz, which is the turbo freq for a single core on your CPU. If you can disclose your CPU model, we can check my assumption.

    In any case, the code performed 100 million (hopefully) dependent operations in 1/45 sec, and it's really strange to do that at 2 GHz: at one operation per cycle, it would need at least 1/20 sec.
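The arithmetic behind this argument can be written down as a small sketch. It relies on the one-operation-per-cycle model from the first post; implied_ghz and min_seconds are illustrative names, not part of the original program.

```c
#include <assert.h>

/* Frequency in GHz implied by `ops` dependent one-cycle operations
   finishing in `seconds` (one op per cycle is the first post's assumption). */
double implied_ghz(double ops, double seconds)
{
    assert(seconds > 0.0);
    return ops / seconds / 1e9;
}

/* Minimum run time in seconds for `ops` one-cycle operations at `ghz`. */
double min_seconds(double ops, double ghz)
{
    assert(ghz > 0.0);
    return ops / (ghz * 1e9);
}
```

For the numbers in post #2, implied_ghz(100e6, 0.022079) comes out at about 4.53, while min_seconds(100e6, 2.0) is 0.05, more than twice the time actually measured.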

    My own measurement on i7-8665 (Task Manager reports a base freq of 2.1 GHz and current freqs up to ~4 GHz):
    ====
    Time: 0.026001 s, CPU freq 3.85 GHz; RDTSC result: 2.08 Ghz
    ====

    Note that the times differ by 18% while the RDTSC-measured freqs differ by only a few percent. I.e., according to RDTSC, our CPUs spent different numbers of CPU cycles executing the same cache-local code. Moreover, on my own CPU we can get pretty different results:
    ====
    D:\Downloads\003>1.exe
    Time: 0.027003 s, CPU freq 3.70 GHz; RDTSC result: 2.10 Ghz

    D:\Downloads\003>1.exe
    Time: 0.023000 s, CPU freq 4.35 GHz; RDTSC result: 2.16 Ghz
    ====

    I think the conclusion is obvious.


    PS: I should try a longer warm-up to reach the maximum possible freq, or at least a more stable one.
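One way to approach the longer warm-up is to repeat the timed run several times and keep the shortest time, so that the earlier runs double as warm-up. This is only a sketch of that idea: taking the minimum is my own choice, shortest_run is a hypothetical helper, and C11 timespec_get is swapped in for gettimeofday.

```c
#include <assert.h>
#include <stdlib.h>
#include <time.h>

/* Wall-clock seconds via C11 timespec_get (swapped in for gettimeofday). */
static double now_seconds(void)
{
    struct timespec ts;
    timespec_get(&ts, TIME_UTC);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/* Same dependent ADD/XOR chain as in the first post. */
static unsigned loop(int cycles)
{
    unsigned a = rand(), b = rand(), x = rand();
    for (int i = 0; i < cycles / 10; i++) {
        x = (x + a) ^ b;
        x = (x + a) ^ b;
        x = (x + a) ^ b;
        x = (x + a) ^ b;
        x = (x + a) ^ b;
    }
    return x;
}

/* Hypothetical helper: time `runs` repetitions of loop(cycles) and return
   the shortest one in seconds. The loop results are accumulated in *sink
   so the compiler cannot drop the work. */
double shortest_run(int cycles, int runs, unsigned *sink)
{
    assert(runs > 0);
    double best = 0.0;
    for (int r = 0; r < runs; r++) {
        double t0 = now_seconds();
        *sink += loop(cycles);
        double t = now_seconds() - t0;
        if (r == 0 || t < best)
            best = t;
    }
    return best;
}
```

The frequency estimate is then (cycles/1e9) divided by the returned time, exactly as in the original printf. It still has to be built with -O2.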

  6. #4
    Programmer Bulat Ziganshin's Avatar
    A quick check at godbolt shows that all compilers generate exactly the expected code. For example, clang on POWER64:

    Code:
    .L3:
            add 3,31,3
            xor 3,30,3
            add 3,3,31
            xor 3,30,3
            add 3,3,31
            xor 3,30,3
            add 3,3,31
            xor 3,30,3
            add 3,3,31
            xor 3,30,3
            rldicl 3,3,0,32
            bdnz .L3
    Last edited by Bulat Ziganshin; 7th May 2020 at 22:44.

  7. #5
    Administrator Shelwien's Avatar
    > My best bet is that 2 GHz is base freq of your CPU, since it's what RDTSC measures

    Actually it's the idle freq of my CPU.

    > measuring the CURRENT frequency of the CORE executing this code

    TSC does exactly that - it's per-core and it increments at the actual current clock speed.
    Turbo tricks make it hard to use for time measurement, but it's still the perfect
    tool for measuring the CPU clocks taken by some code - by design.

    Also TSC equivalents exist on all modern architectures (including GPU) -
    google benchmark header linked in my previous post includes TSC-like code
    for ARM and PPC.
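For reference, a TSC-style read around a short chain might look like the sketch below. It is an illustration rather than the code used for the numbers in this thread: read_ticks and ticks_for_lines are made-up names, the x86 branch uses the __rdtsc intrinsic, and the ARM branch reads the ARMv8 virtual counter, which is a fixed-rate timer rather than a core-clock counter.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#if defined(__x86_64__) || defined(__i386__)
#include <x86intrin.h>          /* GCC/Clang header; MSVC uses <intrin.h> */
static uint64_t read_ticks(void) { return __rdtsc(); }
#elif defined(__aarch64__)
/* ARMv8 virtual counter: a fixed-rate timer, not a core-clock counter. */
static uint64_t read_ticks(void)
{
    uint64_t v;
    __asm__ volatile("mrs %0, cntvct_el0" : "=r"(v));
    return v;
}
#else
#include <time.h>
/* Fallback so the sketch still builds elsewhere: processor time from clock(). */
static uint64_t read_ticks(void) { return (uint64_t)clock(); }
#endif

/* Hypothetical helper: counter ticks spent on `lines` dependent
   "x = (x + a) ^ b" steps. The result is stored through `out` so the
   compiler has to keep the chain. */
uint64_t ticks_for_lines(int lines, unsigned *out)
{
    assert(lines >= 0);
    unsigned a = rand(), b = rand(), x = rand();
    uint64_t t0 = read_ticks();
    for (int i = 0; i < lines; i++)
        x = (x + a) ^ b;
    uint64_t t1 = read_ticks();
    *out = x;
    return t1 - t0;
}
```

RDTSC is not a serializing instruction, and the compiler is free to move the arithmetic relative to the counter reads, so very short measurements from a sketch like this are only rough. The cost of the counter reads themselves also has to be subtracted.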

    > If you can disclose your CPU model, we can check my assumption.

    It's a 7820X.

    > By any means, the code performed 100 millions of (hopefully) dependent operations in 1/45 sec,
    > and it's really strange to do that at 2 GHz.

    Ok, I made a new version which measures clocks taken by _one_ instance of your code:
    test1:
    .text:04015F7 call _Z5rdtscv
    .text:04015FC mov [rsp+38h+var_10], rax
    .text:0401601 call _Z5rdtscv

    test2:
    .text:040158A call _Z5rdtscv
    .text:040158F mov [rsp+38h+var_10], rax
    .text:0401594 lea eax, [r8+r11]
    .text:0401598 xor eax, r9d
    .text:040159B add eax, r11d
    .text:040159E xor eax, r9d
    .text:04015A1 add eax, r11d
    .text:04015A4 xor eax, r9d
    .text:04015A7 add eax, r11d
    .text:04015AA xor eax, r9d
    .text:04015AD add eax, r11d
    .text:04015B0 xor r9d, eax
    .text:04015B3 call _Z5rdtscv


    Code:
    subsequent RDTSC calls: 13 clk
    RDTSC calls around 5 xor-adds: 14 clk
    So I suspect that you underestimated the IPC in this case (I disabled HT).

    > PS: I should try longer warm-up to reach maximum freq possible, or at least more stable one

    Problem is, not all compression algorithms manage to warm up the cpu properly.
    Also "warm up" with AVX2/AVX512 usage could reduce cpu freq rather than increasing it.

  8. #6
    Administrator Shelwien's Avatar
    Interesting... I tried adding more "x = (x + a) ^ b" lines:
    Code:
     0 lines -> 13 clk
     5 lines -> 14 clk
    10 lines -> 15 clk
    15 lines -> 19 clk
    20 lines -> 23 clk
    25 lines -> 27 clk
    30 lines -> 31 clk
    35 lines -> 37 clk
    40 lines -> 41 clk
    45 lines -> 45 clk
    50 lines -> 51 clk
    Ok, so I modified the first program according to this; somehow it's a better match for TSC now.
    GCC doesn't actually unroll the loop though.

    // Perform CYCLES simple in-order operations
    unsigned loop( int CYCLES ) {
        unsigned i, j, a = rand(), b = rand(), x = rand();
        for( i=0; i<CYCLES/45; i++ ) {
            // GCC ignores this spelling; newer GCC versions accept "#pragma GCC unroll 50" instead
            #pragma unroll(50)
            for( j=0; j<50; j++ ) x = (x + a) ^ b;
        }
        return x;
    }

    Code:
    Time: 0.049632 s, CPU freq 2.01 GHz; RDTSC result: 1.99 Ghz
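One way to read the CYCLES/45 divisor is through the slope of the table earlier in this post. That reading is my assumption, since the post does not spell out the derivation. The sketch below copies the table and computes the average clocks per added line between two rows; clk_per_line is an illustrative name.

```c
#include <assert.h>

/* Measurements copied from the table in this post. */
static const int LINES[] = {  0,  5, 10, 15, 20, 25, 30, 35, 40, 45, 50 };
static const int CLK[]   = { 13, 14, 15, 19, 23, 27, 31, 37, 41, 45, 51 };

/* Average clocks per added line between table rows `from` and `to`. */
double clk_per_line(int from, int to)
{
    assert(0 <= from && from < to && to < 11);
    return (double)(CLK[to] - CLK[from]) / (LINES[to] - LINES[from]);
}
```

Between the 10-line and 50-line rows, clk_per_line(2, 10) is 36/40 = 0.9 clk per line, so an inner loop of 50 lines would take about 45 clk. The first 10 lines cost only 0.2 clk each, as clk_per_line(0, 2) shows.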

  9. #7
    Programmer Bulat Ziganshin's Avatar
    https://ark.intel.com/content/www/us...-4-30-ghz.html :

    Intel® Turbo Boost Max Technology 3.0 Frequency ‡
    4.50 GHz
