Any one using Nvidia's CUDA

  • S Stuart Dootson

    kmg365 wrote:

    It's the second one.

    Kinda thought it might be ;) All I know about SAS is that my cousin used to be a SAS contractor.

    Java, Basic, who cares - it's all a bunch of tree-hugging hippy cr*p

    Mycroft Holmes
    wrote on last edited by
    #14

    Stuart Dootson wrote:

    my cousin used to be a SAS contractor

    Made a lot of money and went insane.

    Never underestimate the power of human stupidity RAH

      Stuart Dootson
      wrote on last edited by
      #15

      Ah, you know him...

      Java, Basic, who cares - it's all a bunch of tree-hugging hippy cr*p

      • K kmg365

        Or is planning to use it. Just wondering.

        Rama Krishna Vavilala
        wrote on last edited by
        #16

        No but I am planning to use OpenCL. I am not really sure how much they differ.

          martin_hughes
          wrote on last edited by
          #17

          At a guess, one's called CUDA and the other OpenCL.

          print "http://www.codeproject.com".toURL().text Ain't that Groovy?

            Lost User
            wrote on last edited by
            #18

            OpenCL should work on ATIs as well

            • K kmg365

              Or is planning to use it. Just wondering.

              Lost User
              wrote on last edited by
              #19

              It's used by Google in their search engines. I have an interesting post about GPUs and Amdahl's Law on Nvidia: How can CUDA break Amdahl's Law?[^] If you are thinking of using CUDA, you might want to read the post first; it may or may not help your team decide whether to use CUDA. The truth about CUDA is revealed. ~TheArch :cool:

              modified on Wednesday, July 22, 2009 10:44 PM
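For readers who have not met it, Amdahl's Law bounds the overall speedup by the share of the work that stays serial. A minimal sketch, assuming a fixed problem size; the function names and sample numbers are illustrative and are not taken from the linked post:

```python
def amdahl_speedup(parallel_fraction: float, n_processors: float) -> float:
    """Overall speedup when only `parallel_fraction` of the run time
    can be spread across `n_processors` (Amdahl's Law)."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if n_processors < 1:
        raise ValueError("n_processors must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_processors)


def amdahl_limit(parallel_fraction: float) -> float:
    """Speedup ceiling as the processor count grows without bound."""
    serial_fraction = 1.0 - parallel_fraction
    return float("inf") if serial_fraction == 0.0 else 1.0 / serial_fraction


if __name__ == "__main__":
    # Hypothetical workload: 90% of the run time parallelises.
    for n in (2, 16, 240):
        print(n, round(amdahl_speedup(0.9, n), 2))
    print("ceiling", round(amdahl_limit(0.9), 2))
```

The ceiling depends only on the serial fraction, which is why adding more GPU cores stops helping once the serial part dominates.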

              • L Lost User

                Sure, but I don't really like the API. Someone make LINQ to CUDA please? :)

                Lost User
                wrote on last edited by
                #20

                If you can find me the CUDA Language Spec (Op Code) I will make an IL -> CUDA IL assemble linker for it. I looked around for it a few weeks ago, but could not find it. The alternative is to use DirectML, but I have no idea when it's coming out?!?! ~TheArch

                • L Lost User

                  OpenCL should work on ATIs as well

                  Lost User
                  wrote on last edited by
                  #21

                  DirectML(sp?) will do it also. A mathematics lib in the next version of DirectX.

                  • L Lost User

                    If you can find me the CUDA Language Spec (Op Code) I will make an IL -> CUDA IL assemble linker for it. I looked around for it a few weeks ago, but could not find it. The alternative is to use DirectML, but I have no idea when it's coming out?!?! ~TheArch

                    Lost User
                    wrote on last edited by
                    #22

                    Like this? PTX specs[^] Sadly it doesn't seem to specify the binary encoding..

                    • L Lost User

                      Sure, but I don't really like the API. Someone make LINQ to CUDA please? :)

                      Daniel Grunwald
                      wrote on last edited by
                      #23

                      Something like 'Brahma'[^]?

                        Lost User
                        wrote on last edited by
                        #24

                        Yeah, I tried that some time ago, but while it is nice, it only works in 32-bit mode and/because it uses DirectX. It's convenient, but the performance is not great, and in 64-bit mode it just dies because the DirectX DLLs die (MS's fault). That makes it impossible to use in a plugin for Paint.NET without making it multi-process, which would completely kill the performance. And it seems a bit abandoned.

                        • L Lost User

                          Like this? PTX specs[^] Sadly it doesn't seem to specify the binary encoding..

                          Lost User
                          wrote on last edited by
                          #25

                          Good enough for government work. I will start in the a.m. ~TheArch

                          • D Daniel Grunwald

                            Something like 'Brahma'[^]?

                            Lost User
                            wrote on last edited by
                            #26

                            Daniel Grunwald wrote:

                            Something like 'Brahma'[^]?

                            Kind of. There is also a .NET plug-in for CUDA. You don't have to write C code to get to the GPU. I have not tested it though. It can be found on the Nvidia CUDA applications portal.

                            • L Lost User

                              Good enough for government work. I will start in the a.m. ~TheArch

                              Lost User
                              wrote on last edited by
                              #27

                              Sweet, seems a bit large for a 1 man project though, need any help? :)

                                Lost User
                                wrote on last edited by
                                #28

                                :thumbsup:

                                harold aptroot wrote:

                                Sweet, seems a bit large for a 1 man project though, need any help?

                                That would be great. Pick your flavor! Prototype AutoParalizer Architecture:

                                1. Tool to parse the target .NET IL. (Reflect onto IL using Reflection.Emit)
                                2. Assembly Table Linker (HashTable to translate the IL into PTX)
                                3. CUDA integration:
                                a. Generate parallel mini functions in CUDA from #2
                                b. Mark up .NET IL with 3.b
                                4. Recompile everything 3.a & 3.b

                                'Did I miss something?' ~TheArch

                                  Lost User
                                  wrote on last edited by
                                  #29

                                  Hm I dunno, but some observations:
                                  - MSIL uses a read-only* stack so essentially it's equivalent to SSA (source: cr88192)
                                  - Due to that there is no direct mapping of MSIL onto PTX, at the very least you need register allocation, but since it's SSA the interference graph will be chordal, so graph colouring has a polynomial running time (so optimal register usage isn't even hard)
                                  - local/shared/shared-at-other-level/etc could be a problem, defaulting to global-ish is extremely slow but determining where it should go is probably hard (sounds like escape analysis to me, which is hard)
                                  - Different kinds of loads/stores, might take some additional analysis to figure out which one to use (mov vs ld vs tex etc)
                                  - There is no GC in CUDA - but classes wouldn't be all that useful there anyway, seems ok to me to limit it to structs
                                  - There is probably more to this than might be seen at a first glance..
                                  - ????
                                  - Profit

                                  * ok let me clarify, I don't really mean that it doesn't write to the stack - I mean that it doesn't overwrite things deep down in the stack, it just pushes things onto it. Well except in some rare and creepy cases anyway.
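The point about the operand stack can be illustrated on straight-line code: every pushed result gets a fresh virtual register, so each register is written exactly once. A rough sketch, assuming a made-up mini-IL (the `ldarg`/`ldc` tuples, the `%arg` names and `stack_to_ssa` are hypothetical) and PTX-like mnemonics that have not been run through a real assembler; branches and local-variable stores would need more work than this:

```python
from itertools import count

# Hypothetical opcode table: stack-IL arithmetic -> PTX-style mnemonic.
BINARY_OPS = {"add": "add.s32", "sub": "sub.s32", "mul": "mul.lo.s32"}


def stack_to_ssa(il):
    """Translate straight-line stack code into three-address code in which
    every virtual register (%r0, %r1, ...) is assigned exactly once."""
    fresh = (f"%r{i}" for i in count())
    stack = []   # simulated operand stack, holds register names
    out = []     # emitted three-address instructions
    for op, arg in il:
        if op == "ldarg":
            # Arguments are only named here; real PTX would load them
            # from parameter space first.
            stack.append(f"%arg{arg}")
        elif op == "ldc":
            dst = next(fresh)
            out.append(f"mov.s32 {dst}, {arg};")
            stack.append(dst)
        elif op in BINARY_OPS:
            rhs = stack.pop()
            lhs = stack.pop()
            dst = next(fresh)
            out.append(f"{BINARY_OPS[op]} {dst}, {lhs}, {rhs};")
            stack.append(dst)
        elif op == "ret":
            out.append(f"// result in {stack.pop()}")
        else:
            raise ValueError(f"unsupported opcode: {op}")
    return out


if __name__ == "__main__":
    # (arg0 + arg1) * 2
    program = [("ldarg", 0), ("ldarg", 1), ("add", None),
               ("ldc", 2), ("mul", None), ("ret", None)]
    print("\n".join(stack_to_ssa(program)))
```

The operands never appear in the stack code itself; they are recovered by simulating the stack, which is the sense in which the form is SSA only implicitly.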

                                    Lost User
                                    wrote on last edited by
                                    #30

                                    harold aptroot wrote:

                                    - MSIL uses a read-only stack so essentially it's equivalent to SSA (source: cr88192)

                                    Hmm, okay I'll take your word on it. But the emitted IL won't be read-only after decomposing it and saving to a new file.

                                    harold aptroot wrote:

                                    - Due to that there is no direct mapping of MSIL onto PTX, at the very least you need register allocation, but since it's SSA the interference graph will be chordal, so graph colouring has a polynomial running time (so optimal register usage isn't even hard)

                                    Yeah, I thought as much. From my first 20 sec glance at it, I think about 45% map directly; the others will have to use some different translation logic.

                                    harold aptroot wrote:

                                    - local/shared/shared-at-other-level/etc could be a problem, defaulting to global-ish is extremely slow but determining where it should go is probably hard (sounds like escape analysis to me, which is hard)

                                    Hmm, I don't know much about this; I will have to research it in the morning.

                                    harold aptroot wrote:

                                    - Different kinds of loads/stores, might take some additional analysis to figure out which one to use (mov vs ld vs tex etc)

                                    Yeah, this is similar to #2 on your list. I envision the translation lib will have custom functions in PTX, i.e. when we run into 'String s = new String("something")' we translate it to the PTX function 'CUDA.String s = new CUDA.String((CUDA.String)"something")'.

                                    harold aptroot wrote:

                                    - There is no GC in CUDA - but classes wouldn't be all that useful there anyway, seems ok to me to limit it to structs

                                    Correct I think?!?!? This is where it becomes very important to correctly use the shared 16k memory space. We can clean it up on the .NET side.

                                    harold aptroot wrote:

                                    - There is probably more to this than might be seen at a first glance..

                                    Yeah, google LABVIEW. This will break down most of our barriers.

                                    harold aptroot wrote:

                                    - Profit

                                    I suggest a good prototype here on The Code Project. Then, after we get some interest and more help, a professional version with better features. We won't handicap the prototype, but the pro version would have many performance enhancements and

                                      Lost User
                                      wrote on last edited by
                                      #31

                                      Harold wrote:

                                      - ???? - Profit

                                      Never seen that one before? :)

                                      On the SSA/MSIL thing - it's the operand stack that would be read-only, making the opcodes "implicit SSA" as I'll call it now (because the operands are not listed but inferred from the stack, and that also ensures that it really is SSA form - albeit implicit)

                                      Ok I saw LABVIEW, what am I supposed to see though? *bookmarks project page*
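On the register-allocation remark made earlier in the thread: for straight-line SSA code the live ranges are intervals, and handing out registers greedily in definition order uses the minimum number of them. A minimal sketch under that assumption (the `(dst, srcs)` tuple format and `allocate_registers` are hypothetical; code with branches would need to follow the dominator tree instead, and spilling is not handled):

```python
import heapq


def allocate_registers(instrs):
    """Greedy register assignment for straight-line SSA code.

    `instrs` is a list of (dst, [srcs]) tuples in program order. Values
    that are used but never defined here (function inputs) are ignored.
    Returns (mapping from virtual name to physical index, registers used).
    """
    # Index of the last instruction that reads each value.
    last_use = {}
    for i, (_, srcs) in enumerate(instrs):
        for s in srcs:
            last_use[s] = i

    assignment = {}
    free = []       # min-heap of physical registers available for reuse
    n_regs = 0
    for i, (dst, srcs) in enumerate(instrs):
        # A value read for the last time here dies, so its register can
        # be reused immediately, even by this instruction's result.
        for s in set(srcs):
            if s in assignment and last_use[s] == i:
                heapq.heappush(free, assignment[s])
        if free:
            assignment[dst] = heapq.heappop(free)
        else:
            assignment[dst] = n_regs
            n_regs += 1
    return assignment, n_regs


if __name__ == "__main__":
    code = [("t0", ["a", "b"]),    # t0 = a + b
            ("t1", []),            # t1 = constant
            ("t2", ["t0", "t1"]),  # t2 = t0 * t1
            ("t3", ["t2", "a"])]   # t3 = t2 - a
    print(allocate_registers(code))
```

Four virtual registers fit into two physical ones here because at most two of the defined values are live at the same time.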

                                        Lost User
                                        wrote on last edited by
                                        #32

                                        Let's move this discussion to the project thread. We will have a little more privacy, and once the article goes public the thread will be erased.

                                        • L Lost User

                                          It's used by Google in their search engines. I have an interesting post about GPUs and Amdahl's Law on Nvidia: How can CUDA break Amdahl's Law?[^] If you are thinking of using CUDA, you might want to read the post first; it may or may not help your team decide whether to use CUDA. The truth about CUDA is revealed. ~TheArch :cool:

                                          modified on Wednesday, July 22, 2009 10:44 PM

                                          kmg365
                                          wrote on last edited by
                                          #33

                                          Thanks! And thanks for all your input. Good thread, very helpful.
