QueryThreadCycleTime, QueryProcessCycleTime: what do these number mean?

    peterchen (post #1):

    I've tried e.g. QueryThreadCycleTime as follows:

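    #include <windows.h>   // QueryThreadCycleTime, GetCurrentThread
    #include <iostream>

    int main()
    {
        for (int i = 0; i < 10; ++i)
        {
            // Two back-to-back reads: the difference approximates the cost of one call.
            ULONG64 n1 = 0, n2 = 0;   // ULONG64 is the type the API expects
            BOOL ok1 = QueryThreadCycleTime(GetCurrentThread(), &n1);
            BOOL ok2 = QueryThreadCycleTime(GetCurrentThread(), &n2);

            if (ok1 && ok2)
                std::cout << n2 - n1 << "\n";
            else
                std::cout << "n/a\n";
        }
    }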

    Typical results are:

    1036
    1114
    1734
    748
    706
    670
    652
    716
    652
    666

    The numbers vary widely - ok, maybe these are expensive calls - but what is the point then? If I replace the thread cycles with process cycles, numbers get even weirder:

      BOOL ok1 = QueryProcessCycleTime(GetCurrentProcess(), &n1);
      BOOL ok2 = QueryProcessCycleTime(GetCurrentProcess(), &n2);
    

    With typical results:

    39666
    39520
    304964
    145932
    47486
    287156
    191528
    208652
    196176
    288642

    This blog post[^] suggests that it should be no worse than QueryPerformanceCounter, but that's not what I'm seeing. Anyone with insights?
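
A single pair of reads is a noisy sample, since the thread can be interrupted between the two calls. Below is a minimal sketch of one way to read such numbers, assuming it is acceptable to take many samples and report the minimum and median instead of individual deltas. `DeltaSummary`, `summarizeDeltas` and `sampleThreadCycleDeltas` are hypothetical names introduced here, not part of the Windows API, and the sampling function only compiles on Windows.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper type (not part of any Windows API).
struct DeltaSummary {
    std::uint64_t minimum;
    std::uint64_t median;
    std::uint64_t maximum;
};

// Summarizes a non-empty list of back-to-back deltas.
// The vector is taken by value because it is sorted internally.
DeltaSummary summarizeDeltas(std::vector<std::uint64_t> deltas) {
    assert(!deltas.empty());   // precondition: at least one sample
    std::sort(deltas.begin(), deltas.end());
    const std::size_t n = deltas.size();
    const std::uint64_t median = (n % 2 == 1)
        ? deltas[n / 2]
        : (deltas[n / 2 - 1] + deltas[n / 2]) / 2;
    return {deltas.front(), median, deltas.back()};
}

#ifdef _WIN32
// Windows only: collects up to `count` back-to-back QueryThreadCycleTime deltas.
// Failed reads are skipped, so the result may hold fewer than `count` entries
// (check for an empty result before calling summarizeDeltas).
std::vector<std::uint64_t> sampleThreadCycleDeltas(std::size_t count) {
    std::vector<std::uint64_t> deltas;
    deltas.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        ULONG64 first = 0, second = 0;
        if (QueryThreadCycleTime(GetCurrentThread(), &first) &&
            QueryThreadCycleTime(GetCurrentThread(), &second)) {
            deltas.push_back(second - first);
        }
    }
    return deltas;
}
#endif
```

Fed the ten thread-cycle deltas from the first run above, `summarizeDeltas` gives a minimum of 652 and a median of 711, which suggests the call itself costs several hundred cycles on that machine and that the larger values are outliers.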

      Lost User (post #2):

      As quoted at QueryProcessCycleTime function (realtimeapiset.h) - Win32 apps | Microsoft Learn[^]:

      The number of CPU clock cycles used by the threads of the process. This value includes cycles spent in both user mode and kernel mode.

      So it just shows which threads are consuming what. That may allow you to tune your application if it is largely compute-bound.
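
For that tuning use case, the counter is read around a piece of work rather than back to back. Here is a minimal sketch, assuming a small wrapper is acceptable. `readCurrentThreadCycles` and `cyclesSpentIn` are hypothetical helpers introduced for illustration. Only `QueryThreadCycleTime` and `GetCurrentThread` are real Windows calls, and the wrapper around them compiles on Windows only.

```cpp
#include <cstdint>

#ifdef _WIN32
#include <windows.h>

// Windows only: reads the cycles charged to the calling thread so far.
// Returns false if QueryThreadCycleTime fails.
inline bool readCurrentThreadCycles(std::uint64_t& cycles) {
    ULONG64 raw = 0;
    if (!QueryThreadCycleTime(GetCurrentThread(), &raw))
        return false;
    cycles = raw;
    return true;
}
#endif

// Hypothetical helper: runs `work` between two counter reads and stores the
// difference in `spent`. `readCounter` is any callable of the form
// bool(std::uint64_t&); on Windows, readCurrentThreadCycles fits.
template <typename ReadCounter, typename Work>
bool cyclesSpentIn(ReadCounter readCounter, Work work, std::uint64_t& spent) {
    std::uint64_t before = 0, after = 0;
    if (!readCounter(before))
        return false;
    work();
    if (!readCounter(after))
        return false;
    spent = after - before;
    return true;
}
```

On Windows this would be called as `cyclesSpentIn(readCurrentThreadCycles, [] { /* code under test */ }, spent)`. The result includes the cost of the reads themselves, so it is only meaningful when the work is much larger than the back-to-back deltas reported in this thread.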

        CPallini (post #3):

        Your code on my machine:

        2716
        2793
        2768
        2617
        2977
        2708
        2686
        2795
        2731
        2651

        Well, not perfect, but... Let's try again:

        2703
        2891
        52734
        2611
        2623
        2613
        2611
        2609
        2608
        2592

        Absolute rubbish laddie!

        "In testa che avete, Signor di Ceprano?" -- Rigoletto

          Lost User (post #4):

          CPallini wrote:

          Absolute rubbish laddie!

          I always suspected you were Scottish and not Italian. :laugh:

            CPallini (post #5):

            I am... Pink[^]!

            "In testa che avete, Signor di Ceprano?" -- Rigoletto

              Lost User (post #6):

              peterchen wrote:

              This blog post[^] suggests that it should be no worse than QueryPerformanceCounter, but that's not what I'm seeing. Anyone with insights?

              My translation of what you are asking:

              peterchen should have asked:

              QueryProcessCycleTime uses the RDTSC instruction. QueryPerformanceCounter historically also used the RDTSC instruction. Why aren't they giving similar outputs?

              As you probably already know, the TSC is superseded by [the HPET](https://en.wikipedia.org/wiki/High_Precision_Event_Timer). There are a dozen reasons why RDTSC provides inaccurate results. The Meltdown/Spectre mitigations were probably the nail in the coffin, so to speak. The answer to your question is that [QueryPerformanceCounter](https://learn.microsoft.com/en-us/windows/win32/api/profileapi/nf-profileapi-queryperformancecounter) uses the HPET/APIC clock and QueryProcessCycleTime is still using the old [rdtsc instruction](https://learn.microsoft.com/en-us/cpp/intrinsics/rdtsc?view=msvc-170). It's apparently a huge mess: @HaroldAptroot says *sometimes* QueryPerformanceCounter uses HPET and sometimes it doesn't, depending on whether or not the TSC is invariant.

                Lost User (post #7):

                Randor wrote:

                As you probably already know the TSC is superseded by the HPET.

                Not actually true, though. QPC is based on HPET *only when necessary*, which is basically if you have a CPU without Invariant TSC (and you don't, unless your CPU is from the mid-2000s). QPC is based on the TSC on every reasonable computer.
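
Whatever source backs it, QPC reports ticks at the rate given by `QueryPerformanceFrequency`, not CPU cycles, so a QPC delta has to be converted before it can be set beside the cycle counts above. Here is a minimal sketch of that conversion. `ticksToMicroseconds` and `sampleQpcDelta` are hypothetical helper names, and the sampling function compiles on Windows only.

```cpp
#include <cassert>
#include <cstdint>

#ifdef _WIN32
#include <windows.h>
#endif

// Converts a QPC tick delta to whole microseconds (rounded down).
// Splitting into whole seconds and a remainder keeps the multiplication
// by 1,000,000 from overflowing for large deltas.
std::uint64_t ticksToMicroseconds(std::uint64_t ticks, std::uint64_t ticksPerSecond) {
    assert(ticksPerSecond > 0);   // precondition: a valid frequency
    const std::uint64_t wholeSeconds = ticks / ticksPerSecond;
    const std::uint64_t remainder = ticks % ticksPerSecond;
    return wholeSeconds * 1000000u + (remainder * 1000000u) / ticksPerSecond;
}

#ifdef _WIN32
// Windows only: measures the gap between two back-to-back
// QueryPerformanceCounter reads and reports the counter frequency.
bool sampleQpcDelta(std::uint64_t& deltaTicks, std::uint64_t& ticksPerSecond) {
    LARGE_INTEGER frequency, first, second;
    if (!QueryPerformanceFrequency(&frequency) ||
        !QueryPerformanceCounter(&first) ||
        !QueryPerformanceCounter(&second))
        return false;
    deltaTicks = static_cast<std::uint64_t>(second.QuadPart - first.QuadPart);
    ticksPerSecond = static_cast<std::uint64_t>(frequency.QuadPart);
    return true;
}
#endif
```

With an assumed frequency of 10,000,000 ticks per second, one tick is 100 ns, so a back-to-back QPC delta of a few ticks would round down to zero microseconds. That coarser unit is one reason QPC deltas and cycle deltas should not be compared number for number.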

                  Lost User (post #8):

                  Hmmm, do you know where I can find a list of processors with an invariant TSC? I see that CPUID leaf 80000007H indicates support, but where can I find a list of processors that support it?

                    Lost User (post #9):

                    Unfortunately, I don't know of such a list and I couldn't find one either. There are lists of CPUID dumps, which is not very convenient.
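
Without such a list, the flag can still be read at run time on the machine in question. Here is a minimal sketch, assuming an x86 or x64 build with MSVC, GCC or Clang. It queries the CPUID leaf 80000007H mentioned above and tests bit 8 of EDX, which is where the Intel and AMD manuals document the Invariant TSC flag. `invariantTscBitSet`, `cpuHasInvariantTsc` and the `SKETCH_HAVE_X86_CPUID` macro are names made up for this example.

```cpp
#include <cstdint>

#if defined(_MSC_VER) && (defined(_M_IX86) || defined(_M_X64))
#include <intrin.h>    // __cpuid
#define SKETCH_HAVE_X86_CPUID 1
#elif (defined(__GNUC__) || defined(__clang__)) && (defined(__i386__) || defined(__x86_64__))
#include <cpuid.h>     // __get_cpuid
#define SKETCH_HAVE_X86_CPUID 1
#endif

// CPUID leaf 80000007H reports Invariant TSC in bit 8 of EDX.
constexpr std::uint32_t kInvariantTscLeaf = 0x80000007u;
constexpr std::uint32_t kInvariantTscBit = 1u << 8;

// Pure helper: interprets the EDX value returned by leaf 80000007H.
constexpr bool invariantTscBitSet(std::uint32_t edx) {
    return (edx & kInvariantTscBit) != 0;
}

// Returns true if the CPU running this code reports an invariant TSC.
// Returns false on non-x86 builds or when the leaf is not available.
bool cpuHasInvariantTsc() {
#if defined(SKETCH_HAVE_X86_CPUID) && defined(_MSC_VER)
    int regs[4] = {0, 0, 0, 0};                       // EAX, EBX, ECX, EDX
    __cpuid(regs, static_cast<int>(0x80000000u));     // highest extended leaf
    if (static_cast<std::uint32_t>(regs[0]) < kInvariantTscLeaf)
        return false;
    __cpuid(regs, static_cast<int>(kInvariantTscLeaf));
    return invariantTscBitSet(static_cast<std::uint32_t>(regs[3]));
#elif defined(SKETCH_HAVE_X86_CPUID)
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
    if (!__get_cpuid(kInvariantTscLeaf, &eax, &ebx, &ecx, &edx))
        return false;                                 // leaf not supported
    return invariantTscBitSet(edx);
#else
    return false;
#endif
}
```

Note that a virtual machine may report different CPUID values than the underlying hardware, so the result describes the environment the program runs in rather than the processor model.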
