My crazy scientist idea - collaborators wanted!

The Lounge (Code Project) · Tags: sysadmin, data-structures, performance, question · 28 posts, 10 posters
Tomaz Stih (original post):

The other day I was playing with the idea of a generic processor emulator. Would someone like to join the experiment to implement it faster? Here's what I am up to:

1) I am going to produce a gigabyte of random bytes (I know, I know, you volunteer for that :doh: ),
2) I am going to treat these bytes as Z80 op-codes,
3) I am going to feed them into a neural network,
4) I am going to interpret them with an emulator and use the expected processor behavior (i.e. state of registers, PC, flags, memory reads/writes, stack reads/writes) to train the network,
5) I am going to wait for a week.

And if I am lucky I am going to end up with a Z80 neural network. But in reality, if it works, I am going to end up with a generic interpreter generator/duplicator. ...and with it we are going to start "the era of the neuro-device", where software is just a bunch of weights (and info about the network structure)... a trained puppy, really... muwhahaha...
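
Below is a minimal sketch of the loop in steps 1-5, in C like the emulator fragment later in the thread. The helpers z80_reference_step and net_train_step are invented names standing in for the real emulator oracle and for one back-propagation update; here they are empty stubs so the skeleton compiles.

    #include <stdlib.h>

    #define STATE_OUTPUTS 64   /* assumed size of the encoded target state */

    /* Stub oracle: a real version would run one instruction on a reference
       Z80 emulator and encode the resulting registers, PC, flags and
       memory/stack accesses into 'expected'. Invented for illustration. */
    static void z80_reference_step(const unsigned char *code, double *expected)
    {
        for (int i = 0; i < STATE_OUTPUTS; i++) expected[i] = 0.0;
        (void)code;
    }

    /* Stub trainer: a real version would do a forward pass on 'code',
       compare with 'expected' and back-propagate the error. Also invented. */
    static void net_train_step(const unsigned char *code, const double *expected)
    {
        (void)code; (void)expected;
    }

    int main(void)
    {
        unsigned char code[4];           /* longest Z80 instruction: 4 bytes */
        double expected[STATE_OUTPUTS];

        for (long i = 0; i < 1000000L; i++) {        /* step 1, scaled down */
            for (int b = 0; b < 4; b++)
                code[b] = (unsigned char)(rand() & 0xFF);
            z80_reference_step(code, expected);      /* step 4: the oracle  */
            net_train_step(code, expected);          /* steps 3-4: training */
        }
        return 0;
    }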

Balboos GHB (#12):

A long time ago, in a laboratory far far away, I used to do Monte Carlo simulations of molecular surface interactions. Although the data was random (the Monte Carlo part), it had rules to follow. Your proposal seems to be: just feed a ton of random data into a system and . . . well, that's what I don't get. Chaos begets chaos (localized pockets of order are implicit for true chaos as the sample grows large). Perhaps explain how your neural network will digest the information?

    Ravings en masse^

    "The difference between genius and stupidity is that genius has its limits." - Albert Einstein

    "If you are searching for perfection in others, then you seek disappointment. If you are seek perfection in yourself, then you will find failure." - Balboos HaGadol Mar 2010

Kornfeld Eliyahu Peter wrote:

Now you've lost me even more... Do you want to train an ANN from nothing into a Z80 interpreter, or from a Z80 interpreter into an all-purpose interpreter?

      Skipper: We'll fix it. Alex: Fix it? How you gonna fix this? Skipper: Grit, spit and a whole lotta duct tape.

Tomaz Stih (#13):
1) OK. For a start I want to train an ANN from nothing into a Z80 interpreter. That's a limited, not too difficult task. It will provide a proof of concept and I will learn a lot about how to approach such a problem. 2) Then, if the concept is good and it works, it could be used to train networks to interpret, compile, and optimize other programming languages.

Tomaz Stih (#14), replying to Balboos GHB (#12):
1) Every byte is more or less a valid Z80 instruction. 2) Hence, if you have a real Z80 emulator to decode an instruction and produce the outputs as they should be, then you can feed the same byte into the ANN, calculate the network's error from the outputs produced by the real emulator, and back-propagate. 3) So 1 GB is not garbage. It is simply about a billion Z80 instructions (a little fewer, since some are 2 or 3 bytes long). You could limit the feed to only valid instructions. But for the ANN to learn the Z80 alone you do not need to interpret multiple instructions together (i.e. a whole program). You can train the network instruction by instruction, and each byte is a self-contained, perfectly meaningful sample.
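
A small sketch of point 3: walking a random buffer instruction by instruction. The length function below is a crude illustrative stand-in, not an accurate Z80 decoder; a real one would consult the full opcode and prefix tables.

    #include <stdio.h>
    #include <stdlib.h>

    /* Illustrative stand-in for a real Z80 length decoder. A real one
       would inspect prefix bytes (CB/DD/ED/FD) and the opcode tables;
       this stub only demonstrates the slicing idea. */
    static int instr_length(const unsigned char *p)
    {
        switch (p[0]) {
        case 0xCB:            return 2;  /* CB-prefixed: 2 bytes           */
        case 0xDD: case 0xFD: return 3;  /* crude guess for IX/IY forms    */
        case 0x3E: case 0x06: return 2;  /* LD A,n / LD B,n: opcode + imm8 */
        case 0xC3: case 0xCD: return 3;  /* JP nn / CALL nn: opcode + imm16 */
        default:              return 1;  /* most unprefixed opcodes        */
        }
    }

    int main(void)
    {
        unsigned char buf[1024];         /* stand-in for the 1 GB buffer */
        for (size_t i = 0; i < sizeof buf; i++)
            buf[i] = (unsigned char)(rand() & 0xFF);

        size_t pos = 0, count = 0;
        while (pos + 3 < sizeof buf) {   /* leave room for multi-byte ops */
            int len = instr_length(&buf[pos]);
            /* Each (buf[pos..pos+len-1], emulator-state-after) pair is one
               self-contained training sample, per the post above. */
            pos += (size_t)len;
            count++;
        }
        printf("%zu samples from %zu bytes\n", count, sizeof buf);
        return 0;
    }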

Balboos GHB (#15), replying to Tomaz Stih (#14):

I still don't see what would really be learned from a scillion randomized one-byte instructions. If they're all essentially Z80 instructions 'as is', then they would all be 'true' - and so the result for the training set would be, in Caribbean style: "It's All OK".

Kornfeld Eliyahu Peter (#16), replying to Tomaz Stih (#13):

That's clearer now... What you have to understand is that 'nothing' means there is no hidden layer in your ANN, and that means nothing will be built, as the ANN lacks any connection that would transform the input... So the first thing is to set up minimal rules that can be the heart of a Z80 interpreter (like one of the basic rules of the Z80: there is no such thing as an unknown op-code)... And since you want to feed the output errors back as new training material, you have to have a second rule set (a parallel hidden layer) to start with, from that point of view...


            "It never ceases to amaze me that a spacecraft launched in 1977 can be fixed remotely from Earth." ― Brian Cox

Marc Clifton wrote:

              Tomaž Štih wrote:

              I am going to produce a gigabyte of random bytes

              Well, as the saying goes, "garbage in, garbage out" ;) Marc

Imperative to Functional Programming Succinctly · Contributors Wanted for Higher Order Programming Project! · "Learning to code with python is like learning to swim with those little arm floaties. It gives you undeserved confidence and will eventually drown you." - DangerBunny

Pete OHanlon (#17):

              Isn't that the premise behind Man V Food?

              This space for rent

Tomaz Stih (#18), replying to Balboos GHB (#15):

                W∴ Balboos wrote:

I still don't see what would really be learned from a scillion randomized one-byte instructions. If they're all essentially Z80 instructions 'as is', then they would all be 'true' - and so the result for the training set would be, in Caribbean style: "It's All OK".

Not all instructions are one byte long; you could insert 2-byte or 3-byte instructions into the training samples. And there would be multiple outputs, as described earlier. Imagine an emulator like this:

switch (opcode) {
case clear_a_instruction:     /* hypothetical opcode: A <- 0           */
    setflag(z);               /* result is zero, so set the zero flag  */
    clearreg(a);              /* write data value 0 to register a      */
    break;
case clear_z_instruction:     /* hypothetical opcode: only set the flag */
    setflag(z);
    break;
/* ... */
}

The neural network would have the following outputs (each representing 1 bit):
- an output to say that we are writing data,
- 3 outputs, each a float, to say what the destination is (7 possible registers),
- an output to say we want to set the zero flag,
- 8 outputs for the data.

So when our neural net receives a byte with the value "clear_a_instruction", its output would be: a) the setflag(z) output set between 0.5 and 1, which means it fires; b) the write-to-register output set between 0.5 and 1, which means it fires; c) the three register outputs set so that 'A' fires out of them; d) and the data bits set to < 0.5, which gives binary 0. So you would do a write operation with source value 0 and destination register A, and set the zero flag. Which is what this instruction is supposed to do.
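
A sketch of how those 13 outputs (1 write + 1 flag + 3 register-select + 8 data) could be decoded, using the 0.5 firing threshold from the post. The exact layout and the register numbering (code 0 = A) are assumptions made up for this illustration.

    #include <stdio.h>

    /* Decoded form of the output layer described above. */
    struct decoded {
        int write;          /* 1 = a register write happened          */
        int set_z;          /* 1 = set the zero flag                  */
        int reg;            /* 0..7, binary-coded destination register */
        unsigned char data;
    };

    static int fires(double out) { return out >= 0.5; }  /* threshold */

    static struct decoded decode(const double out[13])
    {
        struct decoded d;
        d.write = fires(out[0]);
        d.set_z = fires(out[1]);
        d.reg   = (fires(out[2]) << 2) | (fires(out[3]) << 1) | fires(out[4]);
        d.data  = 0;
        for (int i = 0; i < 8; i++)          /* out[5] is the MSB */
            d.data = (unsigned char)((d.data << 1) | fires(out[5 + i]));
        return d;
    }

    int main(void)
    {
        /* Example: the "clear A" pattern from the post - write fires,
           set-Z fires, register code 0 selects A (assumed numbering),
           and all data bits stay below 0.5, which gives binary 0. */
        double out[13] = {0.9, 0.8, 0.1, 0.2, 0.1,
                          0.1, 0.2, 0.1, 0.3, 0.1, 0.2, 0.1, 0.4};
        struct decoded d = decode(out);
        printf("write=%d set_z=%d reg=%d data=%u\n",
               d.write, d.set_z, d.reg, d.data);
        return 0;
    }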

Balboos GHB (#19), replying to Tomaz Stih (#18):

Well - I admit I'm not following - or it seems that the neural net has already been trained. That is, effectively, a rule set.

                  Tomaž Štih wrote:

Not all instructions are one byte long; you could insert 2-byte or 3-byte instructions into the training samples.

You did say a gigabyte of random data, did you not? One does not deliberately insert things into random data and keep calling it random. Perhaps you're trying to seed code in a sea of random data so that the neural net will learn how to find it? I could see that as a way to search for messages or embedded code. But, as I started out - I guess I'm not following your vision.

Tomaz Stih (#20), replying to Balboos GHB (#19):

My fault, I am not a native speaker. Perhaps it is better to prepare a small sample and come back later. Just one last try: a simplified case of a fictional processor with just four 8-bit registers and only 8-bit instructions that affect these registers and nothing else. So...

1. Neural network outputs can range from 0 to 1. These values are floats. This means that a single neural network output can, for example, represent a value between 0 and 255, or even between 0 and 65535 (if you want to represent values from 0 to 255 as values between 0 and 1, then you use n/255 for value n, and 1 for value 255).
2. This means that a single neural network output can represent an 8-bit or 16-bit register. For simplicity, let us assume our fictional processor has only four 8-bit registers: A, B, C, and D. So we have four outputs.
3. Now let us feed the neural network 10 sequential bytes. Since the values fed to the net are 0-255, we could use one single input for that (...or 8 inputs fed a binary pattern; it is a matter of choice). Structure of the ANN: so let us have 8 inputs, 4 outputs, and (let's invent it right now:) eight hidden layers of 8 neurons each.
4. We now produce emulation software (not a neural network, but a real emulator) for this fictional processor. And we execute these 10 sequential bytes on the emulator. The emulator will interpret the instructions and correctly set the four registers after each instruction.
5. Now we have the possibility to train our network. We feed it one byte. We then feed the same byte to the emulator too. The emulator provides the correct value for each register after the instruction. The neural network provides its output. We calculate the error (as the difference between the correct value and the ANN output) and back-propagate it into the network, i.e. update the weights.
6. Now we feed the neural net 100,000 random instructions (bytes) and do the same with the emulator, back-propagating after each byte fed into the network. The assumption is that after this training the neural network will actually know how to emulate our fictional processor and will produce correct register values (4 outputs) based on the input (8 inputs). T.
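
A compile-and-run sketch of points 1-6, with everything invented for illustration: a toy one-instruction ISA for the fictional processor, a single hidden layer of 8 neurons instead of the eight layers above (to keep the sketch short), and plain per-sample backprop with a sigmoid. It prints the mean squared error, which should fall as the net learns to mimic the toy emulator. Compile with -lm.

    #include <stdio.h>
    #include <stdlib.h>
    #include <math.h>

    #define NIN  8   /* inputs: the 8 bits of the instruction byte (point 3) */
    #define NHID 8   /* one hidden layer of 8 - a simplification             */
    #define NOUT 4   /* outputs: registers A..D scaled as n/255 (points 1-2) */

    static double w1[NHID][NIN + 1];   /* hidden weights, last entry = bias */
    static double w2[NOUT][NHID + 1];  /* output weights, last entry = bias */

    static double sig(double x)  { return 1.0 / (1.0 + exp(-x)); }
    static double frand(void)    { return rand() / (double)RAND_MAX - 0.5; }

    /* Toy reference emulator (point 4): instruction 0bRRVVVVVV loads the
       6-bit value VVVVVV (scaled up to 0..252) into register RR. The ISA
       is invented; each sample starts from reset so it is self-contained. */
    static void emulate(unsigned char op, double reg[NOUT])
    {
        for (int r = 0; r < NOUT; r++) reg[r] = 0.0;
        reg[(op >> 6) & 3] = ((op & 0x3F) << 2) / 255.0;
    }

    static void forward(const double *in, double *hid, double *out)
    {
        for (int h = 0; h < NHID; h++) {
            double s = w1[h][NIN];                       /* bias */
            for (int i = 0; i < NIN; i++) s += w1[h][i] * in[i];
            hid[h] = sig(s);
        }
        for (int o = 0; o < NOUT; o++) {
            double s = w2[o][NHID];
            for (int h = 0; h < NHID; h++) s += w2[o][h] * hid[h];
            out[o] = sig(s);
        }
    }

    int main(void)
    {
        const double lr = 0.5;                 /* learning rate, invented */
        for (int h = 0; h < NHID; h++)         /* small random init */
            for (int i = 0; i <= NIN; i++) w1[h][i] = frand();
        for (int o = 0; o < NOUT; o++)
            for (int h = 0; h <= NHID; h++) w2[o][h] = frand();

        double err = 0.0;
        for (int step = 1; step <= 100000; step++) {       /* point 6 */
            unsigned char op = (unsigned char)(rand() & 0xFF);
            double in[NIN], hid[NHID], out[NOUT], reg[NOUT];
            for (int i = 0; i < NIN; i++) in[i] = (op >> i) & 1;

            emulate(op, reg);          /* correct registers (point 4) */
            forward(in, hid, out);     /* network's attempt (point 5) */

            double dout[NOUT], dhid[NHID] = {0};
            for (int o = 0; o < NOUT; o++) {
                double e = reg[o] - out[o];                /* the error */
                err += e * e;
                dout[o] = e * out[o] * (1.0 - out[o]);     /* sigmoid'  */
                for (int h = 0; h < NHID; h++) dhid[h] += dout[o] * w2[o][h];
            }
            for (int o = 0; o < NOUT; o++) {               /* back-propagate */
                for (int h = 0; h < NHID; h++) w2[o][h] += lr * dout[o] * hid[h];
                w2[o][NHID] += lr * dout[o];
            }
            for (int h = 0; h < NHID; h++) {
                double g = dhid[h] * hid[h] * (1.0 - hid[h]);
                for (int i = 0; i < NIN; i++) w1[h][i] += lr * g * in[i];
                w1[h][NIN] += lr * g;
            }
            if (step % 10000 == 0) {
                printf("step %6d  mean squared error %.5f\n", step, err / 10000);
                err = 0.0;
            }
        }
        return 0;
    }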

Forogar (#21), replying to the original post:

That won't work. It's like yelling random words from the dictionary at a baby and expecting it to learn English! It'll all end in tears!

                      - I would love to change the world, but they won’t give me the source code.

megaadam (#22), replying to Forogar (#21):

                        I have seen a lot of parents do just that and it actually seems to work! ;P

                        ... such stuff as dreams are made on

OriginalGriff wrote:

                          Um...wouldn't it be better to start from a known-working Z80 program, rather than random data? If you're trying to train it to work as a Z80 then starting from stuff that only includes valid Z80 opcodes has got to be better than stuff that will immediately crash if fed to a real Z80?

                          Bad command or file name. Bad, bad command! Sit! Stay! Staaaay...

Mark_Wallace (#23):

                          Funnily enough, random data would probably serve better, especially if backprop were used in place of the neural network. With AI, the best starting point is often not the one that makes logical (programming-type) sense, because it has to learn how to work processes out for itself, not how to take data and reprocess it.

                          I wanna be a eunuchs developer! Pass me a bread knife!

Lost User wrote:

                            A bit like an infinite number of monkeys with typewriters. Let us know when it's finished.

Mark_Wallace (#24):

                            That's the usual definition we use for AI procedures. The thing is, we settle for one line of Shakespeare, and use billions of iterations, rather than infinite monkeys.

Mark_Wallace (#25), replying to Marc Clifton:

Garbage in 1,000,000,000 times: garbage out 999,999,999 times is a good start. Iterate until it's: garbage in 1,000,000,000 times: Whoa!

Mark_Wallace (#26), replying to Balboos GHB (#12):

OK, let's simplify AI training*:

0. You have data. You might have an idea what it means to you, but it's really only ones and zeroes.
1. You stream this data into a neural network/backprop routine/whatever, which does something (it doesn't matter what) with the data and gives you output.
2a. If the output is useless (99.99999% of the first very many iterations, if you've got it right), you tell it "That output is not useful" (in a coded manner, of course).
2b. If the output is useful, you tell it "That output is useful".
3. Rinse and repeat.

Eventually, you get quasi-self-written code that does exactly what you want it to do, and it actually does better than you could code it, because, as in facial recognition, where you'd need a hundred billion lines of code to deal with every possible detail/angle/lighting effect/mm of hair growth, it can just look at a face and tell you who it is.

* As in training an AI, not training people to use it.

                                I wanna be a eunuchs developer! Pass me a bread knife!
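
A toy, non-neural illustration of steps 0-3 above: generate random output, judge it useful or not, keep only what helps, repeat. The "usefulness" oracle here is simply distance to one line of Shakespeare, in the spirit of post #24; a real system would substitute a task-specific judgment.

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    static const char *target = "to be or not to be";

    /* The "that output is useful" judgment, reduced to a toy: how many
       characters already match the target line. Invented for illustration. */
    static int score(const char *s)
    {
        int c = 0;
        for (size_t i = 0; target[i]; i++) c += (s[i] == target[i]);
        return c;
    }

    int main(void)
    {
        size_t n = strlen(target);
        char cand[64];

        for (size_t i = 0; i < n; i++)              /* step 0: random data */
            cand[i] = (char)(' ' + rand() % 95);
        cand[n] = '\0';

        long iter = 0;
        while (score(cand) < (int)n) {              /* step 3: repeat */
            size_t pos = rand() % n;
            char old = cand[pos];
            int before = score(cand);
            cand[pos] = (char)(' ' + rand() % 95);  /* step 1: new output  */
            if (score(cand) < before)               /* step 2a: not useful */
                cand[pos] = old;                    /* ...so discard it    */
            /* step 2b: otherwise keep it */
            iter++;
        }
        printf("matched \"%s\" after %ld iterations\n", cand, iter);
        return 0;
    }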

Tomaz Stih (#27), replying to Mark_Wallace (#26):

                                  Mark_Wallace wrote:

                                  OK, let's simplify AI training

That's pretty much how genetic algorithms do auto-programming. Only they combine useful programs in the hope that they'll become even more useful.

Mark_Wallace (#28), replying to Tomaz Stih (#27):

                                    Tomaž Štih wrote:

Only they combine useful programs in the hope that they'll become even more useful.

                                    Ha! Not twenty years ago, they didn't, and nor do they have to now. (Some of us old buggers have been in the game a while, you know)
