Code Project: The Lounge

Programming gut feeling - ** Update! **

  megaadam wrote:

    I am not asking for absolute knowledge, I am not asking you to Google for me. And I know I could measure the answer myself. I just wonder whether you share my gut feeling on this. During a review yesterday I came across: (The language here is Go, but I would argue the same way in C etc. And uint64() is compile-time)

    gopNr = reqGopNr - entryStart + uint64(entry.Offset)
    if gopNr >= uint64(entry.assetLen) { // Avoid mod operation every time since wrap is unusual
        gopNr = gopNr % uint64(entry.assetLen)
    }

    I commented:

    I do not think % can have a measurable cost for small divisors. I would skip the if. It is a single IDIV operation in X86. If that is expensive, there might even be an if in the operator already...

    Shooting from the hip, what is your gut feeling?

    ** Update! **

    Thanks for all the interesting feedback. So I did measure it (code on the Go Playground). For some reason the code always measures zero or times out on that playground, but it measures fine locally.

    The verdict: running with the if is in fact faster, if we stick to the original assumption that the dividend is almost always smaller than the divisor. The difference is a blazing 10 nanoseconds or, if you prefer, a factor of ~4x on an old x86 laptop. I was wrong in thinking that it would not be measurable. But this will run on a monster server, and this is not the most frequently visited code. So I still vote to remove the if for the sake of readability.

    "If we don't change direction, we'll end up where we're going"
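For readers who want to try the measurement themselves, here is a minimal sketch of such a comparison. It is not the code from the Playground link; the names `wrapIf`, `wrapMod`, `timeIt`, the length of 1000 and the iteration count are all assumptions made up for illustration. Each result is fed into the next call, so the division (when it happens) sits on the dependency chain, and only about one input in a thousand wraps.

```go
package main

import (
	"fmt"
	"time"
)

// wrapIf reduces n into [0, length) and only divides when n is out of range.
// length must be non-zero.
func wrapIf(n, length uint64) uint64 {
	if n >= length {
		n %= length
	}
	return n
}

// wrapMod always performs the modulo. length must be non-zero.
func wrapMod(n, length uint64) uint64 {
	return n % length
}

// sink keeps the compiler from discarding the loop results.
var sink uint64

// timeIt calls f iters times on inputs that rarely wrap and returns the elapsed time.
func timeIt(f func(n, length uint64) uint64, iters int) time.Duration {
	const length = 1000 // hypothetical asset length, chosen for illustration
	var acc uint64
	start := time.Now()
	for i := 0; i < iters; i++ {
		// Feed the previous result back in so each call depends on the last one.
		// acc+1 reaches length only once per 1000 iterations, so wrapping is unusual.
		acc = f(acc+1, length)
	}
	sink = acc
	return time.Since(start)
}

func main() {
	const iters = 10_000_000
	fmt.Println("with if:   ", timeIt(wrapIf, iters))
	fmt.Println("always mod:", timeIt(wrapMod, iters))
}
```

The absolute numbers depend heavily on the CPU, and the indirect call through `f` adds overhead to both variants. For anything beyond a gut check, benchmarks written with Go's `testing` package are probably the better tool.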

    OriginalGriff wrote (#4):

    I'm in two minds: part of me agrees with Carlos - it depends on the target processor. ARM for example has conditional execution on almost every instruction, so the if becomes a "skip" rather than a full-on jump. But ... the modulus operator is an integer divide with knobs on (unless the divisor is always a power of two), and they aren't cheap, so it could be that it's worth the comparison cost even if it breaks branch prediction. And since the condition requires address calculation as well as a comparison, I'd probably say "dump it" even then. Optimization may improve it if it's in a tight loop, but I'd want to look at the assembly code before making a final decision.

    "I have no idea what I did, but I'm taking full credit for it." - ThisOldTony "Common sense is so rare these days, it should be classified as a super power" - Random T-shirt AntiTwitter: @DalekDave is now a follower!
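The power-of-two exception mentioned above can be sketched as follows. For unsigned values, the remainder by a power of two equals a bit mask, so no divide is needed. The helper names `modPow2` and `isPow2` are made up for this illustration; whether the divisor in the original snippet (`entry.assetLen`) is ever a power of two is not stated in the thread.

```go
package main

import "fmt"

// modPow2 computes n % d for unsigned n using a bit mask instead of a division.
// It is only correct when d is a power of two.
func modPow2(n, d uint64) uint64 {
	return n & (d - 1)
}

// isPow2 reports whether d is a non-zero power of two.
func isPow2(d uint64) bool {
	return d != 0 && d&(d-1) == 0
}

func main() {
	const d = 8
	for _, n := range []uint64{0, 5, 8, 13, 1 << 40} {
		// The last two columns should always agree.
		fmt.Println(n, n%d, modPow2(n, d))
	}
	fmt.Println(isPow2(8), isPow2(12))
}
```

With a constant power-of-two divisor, compilers generally make this substitution on their own; with a divisor only known at run time, as in the snippet under review, they cannot.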

      honey the codewitch wrote (#5), replying to OriginalGriff:

      Way more in depth than I've ever gotten. I just avoid ifs in tight code! But maybe I shouldn't be. I do know that if you can make, say, an entire DFA traversal without conditional branching (and I think it's possible?) it should be significantly faster than the traditional method, which requires a ton of branching, but then you probably wouldn't be using idiv instructions in the first place with such a beast. So I guess ultimately it depends, as you suggest. I did not know that about the ARMs. I've been mostly dealing with Tensilica Xtensa LX chips, but I'm getting sick of them. The trouble with ARMs is they're as rare as hen's teeth. Out of stock everywhere for the ones I want.

      To err is human. Fortune favors the monsters.

        Greg Utas wrote (#6), replying to megaadam:

        I wouldn't care one way or the other unless this code was executed frequently. Very frequently.

        Robust Services Core | Software Techniques for Lemmings | Articles
        The fox knows many things, but the hedgehog knows one big thing.

          Lost User wrote (#7), replying to megaadam:

          The conditional branch should be slower on the latest desktop CPUs. Have a look at this table: [Instruction tables](https://www.agner.org/optimize/instruction_tables.pdf). Scroll down to the Intel 11th generation Tiger Lake. The IDIV only costs 4 µops. The JGE and two MOVs for the conditional will exceed that. It depends on the CPU; older architectures benefit from the branch.

            PIEBALDconsult wrote (#8), replying to megaadam:

            I suppose both operands are known to always be positive? If not, you understand that the meaning of the % operator varies between languages? I probably wouldn't use the if, though I might test both ways just out of curiosity. It looks like a sophomoric inclusion. A freshman doesn't know an issue may exist. A sophomore thinks an issue may exist -- and adds protection. A master knows the suspected issue doesn't exist. A little knowledge is a dangerous thing. It's like when junior developers test an index value every time rather than simply catching an Exception (C#) when something goes awry.
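The point about `%` differing between languages can be illustrated in Go, the language of the snippet. Go's `%` on signed integers truncates toward zero, so the result takes the sign of the dividend, whereas some other languages (Python, for instance) give a result with the sign of the divisor. The `floorMod` helper below is a hypothetical name used only for this sketch. With the `uint64` operands from the original post the question does not arise.

```go
package main

import "fmt"

// floorMod returns a remainder that takes the sign of the divisor m,
// unlike Go's built-in %, whose result takes the sign of the dividend.
// m must be non-zero.
func floorMod(a, m int64) int64 {
	r := a % m
	// Shift the remainder into the divisor's sign range when the signs differ.
	if r != 0 && (r < 0) != (m < 0) {
		r += m
	}
	return r
}

func main() {
	a, m := int64(-7), int64(3)
	fmt.Println(a % m)          // truncated remainder: -1
	fmt.Println(floorMod(a, m)) // floored remainder: 2
}
```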

              Dan Neely wrote (#9), replying to megaadam:

              My gut feeling is that code that's clever at the expense of readability should only be allowed when it has a demonstrated performance impact. Show me something that indicates that it will significantly increase application performance or the PR gets rejected.

              Did you ever see history portrayed as an old man with a wise brow and pulseless heart, weighing all things in the balance of reason? Is not rather the genius of history like an eternal, imploring maiden, full of fire, with a burning heart and flaming soul, humanly warm and humanly beautiful? --Zachris Topelius

                Lost User wrote (#10), replying to megaadam:

                I'll put the `if` there, especially if wrapping is unusual.

                Even on modern chips with so-called "fast division", 64-bit `div` (surely we're talking about unsigned numbers here?) takes over a dozen cycles *at best*. Sure it has only a few µops today, but they're µops with a high latency (or at least one of them is anyway). Further in the past, `div` only gets worse. Computers with "slow division" are still extremely common. Cascade Lake still had slow division; those are high-end computers that are only a couple of years old.

                By contrast, a branch *can* be bad, but this one won't be, if the comment is to be believed. If wrapping is unusual, then the branch will usually be correctly predicted non-taken. The comparison (and associated loads, if any) that happens before the branch is also nearly irrelevant in that case, because that dependency chain ends in the branch. Code after it does not need to wait until the comparison is done.

                In the normal case where there is no wrapping, an instruction that uses the new value of `gopNr` may be able to execute back-to-back with the instruction that produced it (doesn't mean it *will*, but it could). That is of course impossible if there was a `div` between them.

                > If that is expensive, there might even be an if in the operator already...

                Doesn't happen on any compiler I'm familiar with. I'm not familiar with the Go compiler, but still. It's not really a thing.

                  Forogar wrote (#11), replying to Dan Neely:

                  Yeah! If you can't do it in VB then don't bother. :wtf:

                  - I would love to change the world, but they won’t give me the source code.

                    Lost User wrote (#12), replying to Lost User (#7):

                    CMP+JGE (these macro-fuse) and two MOVs are only 3 µops, and they're fast µops. Why are there two MOVs anyway? Only `entry.assetLen` should be getting loaded here; we already have `gopNr`, and I don't see any immediate reason to copy it to another register. These µops are also not in the dependency chain of `gopNr`; they're only there for the compare-and-branch, so following code could execute at the same time as this condition is being evaluated (subject, of course, to throughput limitations). At least one of the µops in IDIV has a bad latency and moderately bad throughput, and they're in the dependency chain from computing `gopNr` to using it (not shown).

                      Lost User wrote (#13), replying to Lost User (#12):

                      Yeah, if it makes you feel better, I would probably use the 'if statement' simply because not everyone is on Ice/Tiger Lake. :)

                        jsc42 wrote (#14), replying to megaadam:

                        megaadam wrote:

                        blazing 10 nanoseconds

                        That's about 10 feet (3 meters) at light speed. Whenever I see 'nanoseconds', I am reminded of USN Rear Admiral Grace Hopper, one of the greats of early computing history. See, for example, Grace Hopper's Nanoseconds.
