
I had been looking for ideas for a code generator

Code Project - The Lounge
Tags: design, csharp, visual-studio, com, graphics
18 Posts, 10 Posters
honey the codewitch (#1)

And I think I found my most ambitious idea yet: training models to make LLMs spit out code for input specs where the code looks hand written. So like parser generators, DAL generators, etc. A different model for each. Each model comes in a NuGet package along with a C# source generator that invokes it. The only thing is it will require hosting your own LLM. I have two 4080s across two machines, so it's not a problem for me - part of why I bought them - but I wonder how practical it is in general.

Check out my IoT graphics library here: https://honeythecodewitch.com/gfx And my IoT UI/User Experience library here: https://honeythecodewitch.com/uix
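
[Editor's note: to make the moving parts concrete, here is a minimal sketch of what such a package's source generator could look like. It assumes a local model served behind an OpenAI-style completions endpoint at http://localhost:8080 and a grammar shipped as an AdditionalFiles item; the endpoint, model name, file extension, and prompt are illustrative assumptions, not the actual design.]

    using System.Linq;
    using System.Net.Http;
    using System.Text;
    using System.Text.Json;
    using Microsoft.CodeAnalysis;
    using Microsoft.CodeAnalysis.Text;

    [Generator]
    public class LlmParserGenerator : ISourceGenerator
    {
        public void Initialize(GeneratorInitializationContext context) { }

        public void Execute(GeneratorExecutionContext context)
        {
            // The input spec: a context-free grammar included as an AdditionalFile.
            var grammar = context.AdditionalFiles
                .FirstOrDefault(f => f.Path.EndsWith(".ebnf"))
                ?.GetText(context.CancellationToken)?.ToString();
            if (grammar is null) return;

            // Prompt the locally hosted model (OpenAI-compatible API assumed).
            using var http = new HttpClient();
            var request = JsonSerializer.Serialize(new
            {
                model = "parser-gen-small", // hypothetical fine-tuned model
                prompt = "Generate a C# recursive descent parser for this grammar:\n" + grammar,
                max_tokens = 4096
            });
            var body = http.PostAsync("http://localhost:8080/v1/completions",
                    new StringContent(request, Encoding.UTF8, "application/json"))
                .Result.Content.ReadAsStringAsync().Result;
            var code = JsonDocument.Parse(body)
                .RootElement.GetProperty("choices")[0]
                .GetProperty("text").GetString();

            // Inject the synthesized parser into the compilation.
            context.AddSource("Parser.g.cs", SourceText.From(code!, Encoding.UTF8));
        }
    }

A real generator would presumably cache the model's reply so builds stay deterministic and offline-friendly; doing network I/O on every compile would be painful.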

In reply to honey the codewitch (#1)

Daniel Pfeffer (#2)

honey the codewitch wrote:

I have two 4080s across two machines, so it's not a problem for me - part of why I bought them, but I wonder how practical it is in general.

While it might work, I suspect that at the current state of the art it would not be cost-effective. The costs of hardware, collecting the training data, classifying it, and so on are likely to outweigh the coding time you'd save.

Freedom is the freedom to say that two plus two make four. If that is granted, all else follows. -- 6079 Smith W.

In reply to honey the codewitch (#1)

Jo_vb net (#3)

I'm wondering whether you'll have enough patience in case waiting for the results of a training task takes longer than one or two days :confused:

In reply to Jo_vb net (#3)

honey the codewitch (#4)

I mean, stable-diffusion runs pretty quickly on my machine.

In reply to Daniel Pfeffer (#2)

honey the codewitch (#5)

I mean that I intend to release NuGet packages with pretrained models, integrated as C# source generators that prompt a local LLM: a (relatively) small model trained to undertake a specific type of coding task, like generating a parser given a context-free grammar. I am not looking to make an all-purpose code generator or anything like that. My interest is in code synthesis, by which I mean generating "hand written" code. The differences between a generated parser and a hand-rolled parser are far deeper than merely cosmetic. The details of how they work are different, even if the principles are the same. Mainly, a generated left-recursive parser with fixed lookahead will always greedy match, while a recursive descent parser such as hand rolling would produce can switch between lazy and greedy matching, leading to more efficient and often much smaller code.
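
[Editor's note: for anyone who hasn't hand-rolled one, the greedy/lazy distinction looks roughly like this. A hand-written recursive descent parser can decide per call site whether a repetition consumes everything it can or stops as soon as what follows is in view; the grammar and helper names here are illustrative only.]

    using System;
    using System.Collections.Generic;

    class Parser
    {
        private readonly string[] _tokens;
        private int _pos;
        public Parser(string[] tokens) => _tokens = tokens;

        private string? Peek() => _pos < _tokens.Length ? _tokens[_pos] : null;
        private string Next() => _tokens[_pos++];

        // item* terminator, greedy: consume items while the next token matches,
        // even if that swallows the token the terminator needed.
        public List<string> RepeatGreedy(Func<string, bool> isItem, string terminator)
        {
            var items = new List<string>();
            while (Peek() is string t && isItem(t)) items.Add(Next());
            Expect(terminator);
            return items;
        }

        // item* terminator, lazy: stop the moment the terminator is in view.
        public List<string> RepeatLazy(Func<string, bool> isItem, string terminator)
        {
            var items = new List<string>();
            while (Peek() != terminator)
            {
                if (Peek() is string t && isItem(t)) items.Add(Next());
                else throw new Exception($"unexpected '{Peek() ?? "<eof>"}'");
            }
            Expect(terminator);
            return items;
        }

        private void Expect(string token)
        {
            if (Peek() != token) throw new Exception($"expected '{token}'");
            Next();
        }
    }

If the item set overlaps the terminator (say every token counts as an item), RepeatGreedy eats the terminator and fails, while RepeatLazy returns cleanly; a generator with fixed lookahead bakes one behavior in for the whole grammar.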

In reply to honey the codewitch (#4)

0x01AA (#6)

Define 'pretty quickly' ;)

In reply to 0x01AA (#6)

honey the codewitch (#7)

Stable diffusion takes minutes at most, even for the largest renders it can do in 16GB on my card. Usually under a minute to render my prompts. Edited: That's on my laptop's "4090", which is actually a 4080 die. But it is not as fast as my desktop's 4080. I haven't run SD on my desktop yet.

In reply to honey the codewitch (#4)

Jo_vb net (#8)

Sure, but we should not compare the time a trained model needs to finish a given job with the time it takes to train the model (and then find/optimize the right parameters and run the training again and again).

In reply to Jo_vb net (#8)

honey the codewitch (#9)

I'm training the model once to do a specific task, and releasing that trained model. I am not building models as part of a code generator. I don't even know why that would come up.

In reply to honey the codewitch (#9)

pkfox (#10)

Where will you get the training data?

In a closed society where everybody's guilty, the only crime is getting caught. In a world of thieves, the only final sin is stupidity. - Hunter S Thompson - RIP

In reply to pkfox (#10)

honey the codewitch (#11)

That's the part I don't know enough about yet. From a 1000-foot view, I'd like to train it using traditional code generators. "Hey ChatGPT, see this? This is the result of this input grammar. Now can you improve it?" Except actual training, not prompting. I only prompted just now to give you an idea of what I want. I have no idea how to use training data, or what it even really looks like. I've never done anything related to "AI" or LLMs. I've barely even asked ChatGPT anything, and last time I did it tried to dox me. :~
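
[Editor's note: a guess at what that could look like mechanically, for anyone equally new to it. Fine-tuning data is typically just prompt/completion pairs, so pairing each grammar with a cleaned-up reference parser and emitting JSONL might be a starting point; the file layout and naming here are hypothetical.]

    using System.IO;
    using System.Text.Json;

    class TrainingSetBuilder
    {
        static void Main()
        {
            // One JSON object per line: the common JSONL fine-tuning layout.
            using var output = new StreamWriter("parser-train.jsonl");
            foreach (var grammarPath in Directory.GetFiles("grammars", "*.ebnf"))
            {
                string grammar = File.ReadAllText(grammarPath);
                // The target: a hand-polished parser stored next to each grammar,
                // e.g. a traditional generator's output cleaned up by hand.
                string parser = File.ReadAllText(Path.ChangeExtension(grammarPath, ".cs"));

                output.WriteLine(JsonSerializer.Serialize(new
                {
                    prompt = "Grammar:\n" + grammar + "\nParser:\n",
                    completion = parser
                }));
            }
        }
    }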

In reply to honey the codewitch (#5)

11917640 Member (#12)

nuget packages - pretrained models - LLM - coding task - generating a parser - context free grammar ...

Perfect candidates to extend the Word List in the Makebullshit - Tech Bullshit Generator[^]. This wonderful site hasn't been updated with the new AI buzzwords yet. Maybe it's time to do that.

In reply to pkfox (#10)

bryanren (#13)

Scraping SO?

In reply to honey the codewitch (#5)

Payton Byrd 2023 (#14)

This seems like the worst idea ever. Not only do you have no insight into the training of the model in the NuGet package, but you also need to capture the generated source to see what's being compiled into your code. Throw a build pipeline and obfuscation on top and you have a perfectly opaque platform for distributing just about any kind of malware.

In reply to Payton Byrd 2023 (#14)

honey the codewitch (#15)

I don't see how that wouldn't be true of any code generator that someone for some reason obfuscated the output of.
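
[Editor's note: for what it's worth, the "capture the generated source" step is at least mechanically supported. Roslyn can persist every source generator's output to disk for review via two standard MSBuild properties; this is a real compiler feature, shown as a minimal csproj fragment, and is independent of the idea discussed here.]

    <PropertyGroup>
      <!-- Persist all source-generator output under obj/ for inspection. -->
      <EmitCompilerGeneratedFiles>true</EmitCompilerGeneratedFiles>
      <CompilerGeneratedFilesOutputPath>$(BaseIntermediateOutputPath)Generated</CompilerGeneratedFilesOutputPath>
    </PropertyGroup>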

In reply to honey the codewitch (#15)

Sharp Ninja (#16)

The use of a pretrained model to create code generators is bad enough. It's not impossible to see what was created, but it's not directly easy, either. And how many people would bother to even try? For those who do care about what code generators are putting into their code and WHY, being able to see the algorithm being injected via the source code of the generator is helpful, but here all you have is a collection of tensors that are impossible to reverse engineer. If stuff like this becomes common, we are doomed.

In reply to honey the codewitch (#1)

jochance (#17)

Having done image gen stuff, I'd bet you could get away with skinnier metal and still not have excruciating waits for this use case. One of the bigger limiters will be whether the model fits in VRAM. My guess is these are going to be far smaller models owing to greater specificity and not trying to encompass every picture humanity has ever made.
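
[Editor's note: rough numbers behind the fits-in-VRAM question, as back-of-the-envelope math only. The 7B parameter count and quantization levels are assumptions, and KV cache and runtime overhead are ignored.]

    using System;

    class VramEstimate
    {
        static void Main()
        {
            const long parameters = 7_000_000_000; // e.g. a 7B-parameter model
            foreach (var (name, bitsPerWeight) in
                     new[] { ("fp16", 16.0), ("int8", 8.0), ("int4", 4.0) })
            {
                // Weight memory = parameter count * bits per weight / 8 bytes.
                double gigabytes = parameters * bitsPerWeight / 8 / 1e9;
                Console.WriteLine($"{name}: ~{gigabytes:F1} GB of weights");
            }
            // fp16: ~14.0 GB (tight on a 16 GB card); int4: ~3.5 GB (easy fit).
        }
    }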

In reply to jochance (#17)

honey the codewitch (#18)

That's what I was hoping. As I told Daniel, my primary interest is in code synthesis, so I'd be working with well-defined processes for generating the code, but looking to generate it in a more refined manner.
