
I had been looking for ideas for a code generator

Code Project - The Lounge
Tags: design, csharp, visual-studio, com, graphics
18 Posts, 10 Posters
honey the codewitch (#1)

And I think I found my most ambitious idea yet: training models to make LLMs spit out code for input specs where the code looks hand written. So like parser generators, DAL generators, etc. A different model for each. Each model comes in a NuGet package along with a C# source generator that invokes it. The only thing is it will require hosting your own LLM. I have two 4080s across two machines, so it's not a problem for me - part of why I bought them - but I wonder how practical it is in general.

Check out my IoT graphics library here: https://honeythecodewitch.com/gfx And my IoT UI/User Experience library here: https://honeythecodewitch.com/uix
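
[Editor's note: to make the moving parts concrete, here is a minimal sketch of what such a package's source generator could look like. It assumes a local model served behind an OpenAI-style completions endpoint at http://localhost:8080 and a grammar shipped as an AdditionalFiles item; the endpoint, model name, file extension, and prompt are illustrative assumptions, not the actual design.]

    using System.Linq;
    using System.Net.Http;
    using System.Text;
    using System.Text.Json;
    using Microsoft.CodeAnalysis;
    using Microsoft.CodeAnalysis.Text;

    [Generator]
    public class LlmParserGenerator : ISourceGenerator
    {
        public void Initialize(GeneratorInitializationContext context) { }

        public void Execute(GeneratorExecutionContext context)
        {
            // The input spec: a context-free grammar included as an AdditionalFile.
            var grammar = context.AdditionalFiles
                .FirstOrDefault(f => f.Path.EndsWith(".ebnf"))
                ?.GetText(context.CancellationToken)?.ToString();
            if (grammar is null) return;

            // Prompt the locally hosted model (OpenAI-compatible API assumed).
            using var http = new HttpClient();
            var request = JsonSerializer.Serialize(new
            {
                model = "parser-gen-small", // hypothetical fine-tuned model
                prompt = "Generate a C# recursive descent parser for this grammar:\n" + grammar,
                max_tokens = 4096
            });
            var body = http.PostAsync("http://localhost:8080/v1/completions",
                    new StringContent(request, Encoding.UTF8, "application/json"))
                .Result.Content.ReadAsStringAsync().Result;
            var code = JsonDocument.Parse(body)
                .RootElement.GetProperty("choices")[0]
                .GetProperty("text").GetString();

            // Inject the synthesized parser into the compilation.
            context.AddSource("Parser.g.cs", SourceText.From(code!, Encoding.UTF8));
        }
    }

A real generator would presumably cache the model's reply so builds stay deterministic and offline-friendly; doing network I/O on every compile would be painful.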

In reply to honey the codewitch (#1)

Daniel Pfeffer (#2)

honey the codewitch wrote:

I have two 4080s across two machines, so it's not a problem for me - part of why I bought them, but I wonder how practical it is in general.

While it might work, I suspect that at the current state of the art it would not be cost-effective. The costs of hardware, collecting the training data, classifying it, and so on are likely to outweigh the coding time you'd save.

Freedom is the freedom to say that two plus two make four. If that is granted, all else follows. -- 6079 Smith W.

In reply to honey the codewitch (#1)

Jo_vb net (#3)

I'm wondering whether you'll have enough patience in case waiting for the results of a training task takes longer than one or two days :confused:

In reply to Jo_vb net (#3)

honey the codewitch (#4)

I mean, stable-diffusion runs pretty quickly on my machine.

In reply to Daniel Pfeffer (#2)

honey the codewitch (#5)

I mean that I intend to release NuGet packages with pretrained models, integrated as C# source generators that prompt a local LLM: a (relatively) small model trained to undertake a specific type of coding task, like generating a parser given a context-free grammar. I am not looking to make an all-purpose code generator or anything like that. My interest is in code synthesis, by which I mean generating "hand written" code. The differences between a generated parser and a hand-rolled parser are far deeper than merely cosmetic. The details of how they work are different, even if the principles are the same. Mainly, a generated left-recursive parser with fixed lookahead will always greedy match, while a recursive descent parser such as hand rolling would produce can switch between lazy and greedy matching, leading to more efficient and often much smaller code.
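
[Editor's note: for anyone who hasn't hand-rolled one, the greedy/lazy distinction looks roughly like this. A hand-written recursive descent parser can decide per call site whether a repetition consumes everything it can or stops as soon as what follows is in view; the grammar and helper names here are illustrative only.]

    using System;
    using System.Collections.Generic;

    class Parser
    {
        private readonly string[] _tokens;
        private int _pos;
        public Parser(string[] tokens) => _tokens = tokens;

        private string? Peek() => _pos < _tokens.Length ? _tokens[_pos] : null;
        private string Next() => _tokens[_pos++];

        // item* terminator, greedy: consume items while the next token matches,
        // even if that swallows the token the terminator needed.
        public List<string> RepeatGreedy(Func<string, bool> isItem, string terminator)
        {
            var items = new List<string>();
            while (Peek() is string t && isItem(t)) items.Add(Next());
            Expect(terminator);
            return items;
        }

        // item* terminator, lazy: stop the moment the terminator is in view.
        public List<string> RepeatLazy(Func<string, bool> isItem, string terminator)
        {
            var items = new List<string>();
            while (Peek() != terminator)
            {
                if (Peek() is string t && isItem(t)) items.Add(Next());
                else throw new Exception($"unexpected '{Peek() ?? "<eof>"}'");
            }
            Expect(terminator);
            return items;
        }

        private void Expect(string token)
        {
            if (Peek() != token) throw new Exception($"expected '{token}'");
            Next();
        }
    }

If the item set overlaps the terminator (say every token counts as an item), RepeatGreedy eats the terminator and fails, while RepeatLazy returns cleanly; a generator with fixed lookahead bakes one behavior in for the whole grammar.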

In reply to honey the codewitch (#4)

0x01AA (#6)

Define 'pretty quickly' ;)

In reply to 0x01AA (#6)

honey the codewitch (#7)

Stable diffusion takes minutes at most, even for the largest renders it can do in 16GB on my card. Usually under a minute to render my prompts. Edited: That's on my laptop's "4090", which is actually a 4080 die. But it is not as fast as my desktop's 4080. I haven't run SD on my desktop yet.

In reply to honey the codewitch (#4)

Jo_vb net (#8)

Sure, but we should not compare the time a trained model needs to finish a given job with the time it takes to train the model (and then find/optimize the right parameters and run the training again and again).

In reply to Jo_vb net (#8)

honey the codewitch (#9)

I'm training the model once to do a specific task, and releasing that trained model. I am not building models as part of a code generator. I don't even know why that would come up.

In reply to honey the codewitch (#9)

pkfox (#10)

Where will you get the training data?

In a closed society where everybody's guilty, the only crime is getting caught. In a world of thieves, the only final sin is stupidity. - Hunter S Thompson - RIP

In reply to pkfox (#10)

honey the codewitch (#11)

That's the part I don't know enough about yet. From a 1000-foot view, I'd like to train it using traditional code generators. "Hey ChatGPT, see this? This is the result of this input grammar. Now can you improve it?" Except actual training, not prompting. I only prompted just now to give you an idea of what I want. I have no idea how to use training data, or what it even really looks like. I've never done anything related to "AI" or LLMs. I've barely even asked ChatGPT anything, and last time I did it tried to dox me. :~
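
[Editor's note: a guess at what that could look like mechanically, for anyone equally new to it. Fine-tuning data is typically just prompt/completion pairs, so pairing each grammar with a cleaned-up reference parser and emitting JSONL might be a starting point; the file layout and naming here are hypothetical.]

    using System.IO;
    using System.Text.Json;

    class TrainingSetBuilder
    {
        static void Main()
        {
            // One JSON object per line: the common JSONL fine-tuning layout.
            using var output = new StreamWriter("parser-train.jsonl");
            foreach (var grammarPath in Directory.GetFiles("grammars", "*.ebnf"))
            {
                string grammar = File.ReadAllText(grammarPath);
                // The target: a hand-polished parser stored next to each grammar,
                // e.g. a traditional generator's output cleaned up by hand.
                string parser = File.ReadAllText(Path.ChangeExtension(grammarPath, ".cs"));

                output.WriteLine(JsonSerializer.Serialize(new
                {
                    prompt = "Grammar:\n" + grammar + "\nParser:\n",
                    completion = parser
                }));
            }
        }
    }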

In reply to honey the codewitch (#5)

11917640 Member (#12)

nuget packages - pretrained models - LLM - coding task - generating a parser - context free grammar ...

Perfect candidates to extend the Word List in the Makebullshit - Tech Bullshit Generator[^]. This wonderful site hasn't been updated with the new AI buzzwords yet. Maybe it's time to do that.

In reply to pkfox (#10)

bryanren (#13)

Scraping SO?

In reply to honey the codewitch (#5)

Payton Byrd 2023 (#14)

This seems like the worst idea ever. Not only do you have no insight into the training of the model in the NuGet package, but you also need to capture the generated source to see what's being compiled into your code. Throw a build pipeline and obfuscation on top and you have a perfectly opaque platform for distributing just about any kind of malware.

In reply to Payton Byrd 2023 (#14)

honey the codewitch (#15)

I don't see how that wouldn't be true of any code generator that someone for some reason obfuscated the output of.
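
[Editor's note: for what it's worth, the "capture the generated source" step is at least mechanically supported. Roslyn can persist every source generator's output to disk for review via two standard MSBuild properties; this is a real compiler feature, shown as a minimal csproj fragment, and is independent of the idea discussed here.]

    <PropertyGroup>
      <!-- Persist all source-generator output under obj/ for inspection. -->
      <EmitCompilerGeneratedFiles>true</EmitCompilerGeneratedFiles>
      <CompilerGeneratedFilesOutputPath>$(BaseIntermediateOutputPath)Generated</CompilerGeneratedFilesOutputPath>
    </PropertyGroup>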

In reply to honey the codewitch (#15)

Sharp Ninja (#16)

The use of a pretrained model to create code generators is bad enough. It's not impossible to see what was created, but it's not directly easy, either. And how many people would bother to even try? For those who do care about what code generators are putting into their code and WHY, being able to see the algorithm being injected via the source code of the generator is helpful, but here all you have is a collection of tensors that are impossible to reverse engineer. If stuff like this becomes common, we are doomed.

In reply to honey the codewitch (#1)

jochance (#17)

Having done image gen stuff, I'd bet you could get away with skinnier metal and still not have excruciating waits for this use case. One of the bigger limiters will be whether the model fits in VRAM. My guess is these are going to be far smaller models owing to greater specificity and not trying to encompass every picture humanity has ever made.
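
[Editor's note: rough numbers behind the fits-in-VRAM question, as back-of-the-envelope math only. The 7B parameter count and quantization levels are assumptions, and KV cache and runtime overhead are ignored.]

    using System;

    class VramEstimate
    {
        static void Main()
        {
            const long parameters = 7_000_000_000; // e.g. a 7B-parameter model
            foreach (var (name, bitsPerWeight) in
                     new[] { ("fp16", 16.0), ("int8", 8.0), ("int4", 4.0) })
            {
                // Weight memory = parameter count * bits per weight / 8 bytes.
                double gigabytes = parameters * bitsPerWeight / 8 / 1e9;
                Console.WriteLine($"{name}: ~{gigabytes:F1} GB of weights");
            }
            // fp16: ~14.0 GB (tight on a 16 GB card); int4: ~3.5 GB (easy fit).
        }
    }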

In reply to jochance (#17)

honey the codewitch (#18)

That's what I was hoping. As I told Daniel, my primary interest is in code synthesis, so I'd be working with well-defined processes for generating the code, but looking to generate it in a more refined manner.
