How can I avoid a for-loop and speed up my code

  • ComCoderCsharp wrote:

    Hi coders, I'm doing some advanced calculations, mostly on sound, but I'm relatively new to C#, so I was wondering if anyone has advice on how I can (if possible) speed up the more primitive parts of my code, e.g. when I want to multiply two arrays element-wise. In Matlab this would look like Z=X.*Y, but I cannot figure out a smart way to do this in C# without using a for-loop like

        for (int i = 0; i < length; i++) { Z[i] = X[i] * Y[i]; }

    As you can see, this takes a long time if the array consists of a sound segment of e.g. 3 seconds, which is 132300 samples. Do you know of any better way to do this? Best regards, AL

    Judah Gabriel Himango wrote (#6):

    One way to speed up things like this on multi-processor or multi-core machines is to use multiple threads to do the processing. Say you have one of those new Intel Quad Core processors: you've got 4 hardware threads available. Split the array into 4 segments, then spawn 3 new threads, each of which processes a segment of its own. Then have the current thread process the remaining segment (4 threads total). In the best case, you've increased performance close to 4x. Of course, this only helps on machines with more than one hardware thread: hyperthreaded processors (i.e. 2 hardware threads per physical processor core), multiple processors, or multiple cores. p.s. one should always question optimizing code like this unless you're absolutely, positively certain that there is a performance bottleneck here and the current performance is not acceptable.
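The segment-splitting idea described in this post could look roughly like the following. This is only a sketch: the class name `SegmentedMultiply`, the `workers` parameter, and the choice of `double` samples are assumptions made for illustration, and whether it actually beats the plain loop has to be measured.

```csharp
using System;
using System.Threading;

public static class SegmentedMultiply
{
    // Hypothetical helper: multiplies x and y element by element, splitting
    // the index range into `workers` contiguous segments.
    public static double[] Multiply(double[] x, double[] y, int workers)
    {
        if (x.Length != y.Length)
            throw new ArgumentException("Arrays must have equal length.");
        if (workers < 1)
            throw new ArgumentOutOfRangeException(nameof(workers));

        var z = new double[x.Length];
        int chunk = (x.Length + workers - 1) / workers; // ceiling division

        // Spawn workers - 1 extra threads; the calling thread does the first segment.
        var threads = new Thread[workers - 1];
        for (int t = 0; t < threads.Length; t++)
        {
            int start = (t + 1) * chunk; // per-iteration copy captured by the lambda
            threads[t] = new Thread(() => MultiplyRange(x, y, z, start, chunk));
            threads[t].Start();
        }

        MultiplyRange(x, y, z, 0, chunk);

        // Wait for every segment before handing the result back.
        foreach (var thread in threads)
            thread.Join();

        return z;
    }

    // Each thread writes to its own slice of z, so no locking is needed.
    private static void MultiplyRange(double[] x, double[] y, double[] z, int start, int count)
    {
        int end = Math.Min(start + count, x.Length);
        for (int i = start; i < end; i++)
            z[i] = x[i] * y[i];
    }
}
```

A call such as `SegmentedMultiply.Multiply(x, y, Environment.ProcessorCount)` uses one segment per logical processor. Creating threads has a cost of its own, so for short arrays the plain loop may well be faster.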

    Tech, life, family, faith: Give me a visit. I'm currently blogging about: Check out this cutie The apostle Paul, modernly speaking: Epistles of Paul Judah Himango


      Luc Pattyn wrote (#7):

      For a job as simple as an element-wise product, I expect bus bandwidth limitations will dominate over CPU limitations, so not much help from multi-threading... I do fully agree with the p.s. though :)

      Luc Pattyn

      • Judah Gabriel Himango wrote:

        Unsafe C# can get around bounds checking as well, and would be a lot easier than having to code up a separate C++ lib and invoke that from C#.
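For readers who have not seen unsafe C#, a minimal sketch of the pointer-based loop might look like this. The class name `PointerMultiply` and the use of `double` samples are assumptions for illustration; the code only compiles when unsafe code is enabled for the project (the `/unsafe` compiler option, or `AllowUnsafeBlocks` in the project file).

```csharp
using System;

public static class PointerMultiply
{
    // Hypothetical helper: element-wise product using raw pointers,
    // so the loop body performs no array bounds checks.
    public static double[] Multiply(double[] x, double[] y)
    {
        if (x.Length != y.Length)
            throw new ArgumentException("Arrays must have equal length.");

        var z = new double[x.Length];
        int n = x.Length;

        unsafe
        {
            // `fixed` pins the arrays so the garbage collector cannot move
            // them while the pointers are in use.
            fixed (double* px = x, py = y, pz = z)
            {
                for (int i = 0; i < n; i++)
                    pz[i] = px[i] * py[i];
            }
        }

        return z;
    }
}
```

The length check up front matters more here than in safe code: with pointers, reading past the end of an array is not caught by the runtime. The JIT can often remove bounds checks from a simple `for (i < array.Length)` loop on its own, so the gain should be measured rather than assumed.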


        Dan Neely wrote (#8):

        True, but IIRC native code is still faster than unsafe.

        -- Rules of thumb should not be taken for the whole hand.


          Judah Gabriel Himango wrote (#9):

          I'd be surprised at that. And given the P/Invoke overhead that would be required with a C++ lib on the side, unsafe C# may quite well outperform it.


            Judah Gabriel Himango wrote (#10):

            The job is simple, but the bottleneck is the linear time it takes to compute one operation, move on to the next, and so on until finished. Meanwhile, potentially 1 or more cores are idle and could be doing these operations in the meantime. I'm quite certain you'd see a good speedup here. The MS Robotics team that built the CCR (Concurrency and Coordination Runtime, a .NET library for threading and coordination among threads) found big speedups, often near a multiple of the number of cores in a machine, by utilizing multiple threads to do this kind of thing. Granted, they are doing lots of I/O, however. Joe Duffy, a CLR architect, is busy working on the PLinq (Parallel Language Integrated Query) project that will allow devs to easily parallelize queries and transformations on data. This is essentially the technique I described above: using 1 thread per hardware thread to parallelize queries and transformations on data. Eric Sink has an article on his blog showing how C# can do Map, which utilizes this idea.
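PLINQ has since shipped as part of `System.Linq` (starting with .NET Framework 4), so the Map-style approach mentioned here can be sketched as below. The class name `PlinqMultiply` and the `double` sample type are assumptions for illustration. For a body this small, the per-element delegate call may cost more than the multiplication itself.

```csharp
using System;
using System.Linq;

public static class PlinqMultiply
{
    // Hypothetical helper: maps each index to x[i] * y[i] in parallel.
    public static double[] Multiply(double[] x, double[] y)
    {
        if (x.Length != y.Length)
            throw new ArgumentException("Arrays must have equal length.");

        return ParallelEnumerable.Range(0, x.Length)
            .AsOrdered()               // keep results in index order
            .Select(i => x[i] * y[i])  // the "map" step, run across cores
            .ToArray();
    }
}
```

PLINQ decides how to partition the index range across worker threads, so no segment bookkeeping appears in the calling code.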


              Luc Pattyn wrote (#11):

              Sure, I believe multithreading can be great for any single job that is compute bound (including Map, the CCR stuff, and much more), as well as for most situations where a multitude of jobs come together. But my point is that multiplying two arrays isn't much more than a data mover. And I expect the loop overhead will mostly be dealt with by the CPU's out-of-order capabilities. So let's wait and see. :)
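"Wait and see" can be made concrete with a small timing harness. The sketch below is an assumption-laden illustration: the names `LoopTiming`, `MultiplyLoop`, and `AverageMilliseconds` are hypothetical, and the samples are taken to be `double`. Under that assumption, one pass over 132300 samples touches 3 arrays × 132300 × 8 bytes, roughly 3.2 MB of memory traffic, which is the "data mover" aspect raised in this post.

```csharp
using System;
using System.Diagnostics;

public static class LoopTiming
{
    // The plain loop from the original question, used as the baseline.
    public static double[] MultiplyLoop(double[] x, double[] y)
    {
        if (x.Length != y.Length)
            throw new ArgumentException("Arrays must have equal length.");

        var z = new double[x.Length];
        for (int i = 0; i < x.Length; i++)
            z[i] = x[i] * y[i];
        return z;
    }

    // Returns the average wall-clock time per call in milliseconds.
    public static double AverageMilliseconds(
        Func<double[], double[], double[]> multiply, int length, int repeats)
    {
        var x = new double[length];
        var y = new double[length];
        for (int i = 0; i < length; i++) { x[i] = i; y[i] = 0.5; }

        multiply(x, y); // warm-up call so JIT compilation is not timed

        var watch = Stopwatch.StartNew();
        for (int r = 0; r < repeats; r++)
            multiply(x, y);
        watch.Stop();

        return watch.Elapsed.TotalMilliseconds / repeats;
    }
}
```

Calling `LoopTiming.AverageMilliseconds(LoopTiming.MultiplyLoop, 132300, 100)` gives a baseline. Any threaded or pointer-based variant with the same signature can be passed in the same way and compared on the same machine.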


                Dan Neely wrote (#12):

                I'm not sure if this counts or not: there are //unsafe comments in the C# source, but I thought you needed an unsafe keyword as well, which I didn't see. The linked benchmarks do have C++ scoring better in most math-related tests.


                  Insincere Dave wrote (#13):

                  There is a library at Microsoft Research called Accelerator that might be worth investigating. Accelerator provides a high-level data-parallel programming model as a library that is available for all .NET programming languages. The library translates the data-parallel operations on-the-fly to optimized GPU pixel shader code and API calls. Future versions will target multi-core CPUs. Download, Channel9 Video


                    Judah Gabriel Himango wrote (#14):

                    Yeah, it looks like he commented out the unsafe portions, meaning he's back to normal array bounds checking and all that. I wonder why?


                      lost in transition wrote (#15):

                      Bet you didn't know you would get all this 'feedback', did you? :) Just remember that no matter what, you will still have the loop. So think of it like this: do I have a loop here with a little code, or do I have a loop somewhere else with extra code to get there? Your call, but these guys are giving some real good knowledge that you should definitely try to learn. Good luck, Jason

                      Programmer: A biological machine designed to convert caffeine into code. * Developer: A person who develops working systems by writing and using software.


                        Ennis Ray Lynch Jr wrote (#16):

                        You can multithread the multiplication if the arrays are large enough. Another option is to drop into unsafe code and use pointer arithmetic. Another option is to use 64-bit multiplication, but you might need a bitwise transform on the result. (Check out the assembly for strlen for a non-application example.) I don't know about Intel chips, but some processors provide op codes for array-based operations. Check the MMX instruction set and you may be able to multiply the entire set in two or three op codes. http://web.cs.wpi.edu/~matt/courses/cs563/talks/powwie/p3/mmx.htm
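The "op codes for array-based operations" idea is what is now usually called SIMD. The sketch below swaps in `System.Numerics.Vector<T>`, the portable SIMD type in .NET, rather than MMX op codes, so it illustrates the idea and not the exact instructions this post refers to. The class name `SimdMultiply` and the `float` sample type are assumptions for illustration.

```csharp
using System;
using System.Numerics;

public static class SimdMultiply
{
    // Hypothetical helper: multiplies several elements per operation
    // using Vector<float>, then finishes the tail with a scalar loop.
    public static float[] Multiply(float[] x, float[] y)
    {
        if (x.Length != y.Length)
            throw new ArgumentException("Arrays must have equal length.");

        var z = new float[x.Length];
        int width = Vector<float>.Count; // elements per vector on this machine
        int i = 0;

        // Vectorized part: process `width` elements at a time.
        for (; i <= x.Length - width; i += width)
        {
            var vx = new Vector<float>(x, i);
            var vy = new Vector<float>(y, i);
            (vx * vy).CopyTo(z, i);
        }

        // Scalar tail for the elements that do not fill a whole vector.
        for (; i < x.Length; i++)
            z[i] = x[i] * y[i];

        return z;
    }
}
```

`Vector.IsHardwareAccelerated` reports whether the runtime maps these operations to SIMD instructions. When it returns false, the code still runs correctly but without the speed benefit.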


                        On two occasions I have been asked [by members of Parliament], 'Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?' I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question. - Charles Babbage


                          ComCoderCsharp wrote (#17):

                          Hi coders, just wanted to say thanks for all your input. As expected, there seems to be no magic answer to this question, but I will take a closer look at using pointers. I did not know this was possible in C#, so thanks a lot... AL
