Speech to text.
-
I didn't say it's impossible. I said that code cannot be treated as a dialect of English. I think that the bigger obstacle for the development of coding text to speech is economic. I doubt that there are enough coders who need or would want such a system.
Freedom is the freedom to say that two plus two make four. If that is granted, all else follows. -- 6079 Smith W.
At my first job, they actually had a blind "intern" who programmed in Braille on his special typewriter. I don't remember how we got his program onto "cards", but I was asked to review his code. I can't help but think that some "Braille to speech" would have helped his comprehension. (My issue is "slow" talkers.) "Too much work" depends on the recipient.
"Before entering on an understanding, I have meditated for a long time, and have foreseen what might happen. It is not genius which reveals to me suddenly, secretly, what I have to say or to do in a circumstance unexpected by other people; it is reflection, it is meditation." - Napoleon I
-
Do you know of, and can you recommend, a speech-to-text program that would work for coding? Or is it just impossible to do that sort of thing?
VB6 conversions to C#, VB.net
It is possible, just not very viable if other solutions exist. So for someone who is mobility limited, it can be done. But someone who can type, even with just a couple of fingers, should do it that way. I searched Google for the following and found multiple solutions:
typing with speech to text code writer
-
Given enough time, anything is possible. Your code or someone else's? I use a dictionary to validate every word in my text-to-speech program. I use "markup" to indicate words that need to be spoken via phonetics. [RecognizedWordUnit.Pronunciation Property (System.Speech.Recognition) | Microsoft Learn](https://learn.microsoft.com/en-us/dotnet/api/system.speech.recognition.recognizedwordunit.pronunciation?view=netframework-4.8.1)
Gerry Schmitz wrote:
I use a dictionary to validate every word in my text-to-speech program.
At least that can give a recognition quality comparable to word-by-word translation from one language to another, with no concern about context or grammar. :-) (I suspect that you intended to write "... in my speech-to-text program". If you really meant text-to-speech, that is a different, although related, problem. Are you then referring to a pronunciation dictionary? How do you handle homographs?)
-
jschell wrote:
So with someone that is mobility limited it can be done.
I have been doing some work with visually handicapped youth. You would be surprised to see how tolerant they are of their tools, learning very fast which mistakes the tools make. Text that looks like gibberish to me makes perfect sense to them. Like "It always confuses 'I scream' and 'ice cream', that's no problem". They have no problems with, say, a course named "LOL Introduction to programming". Handling zillions of such misinterpretations is like a "survival technique" for them, a way to manage even with mediocre tools. The negative side of it is that they do not care to report the problems to the programmers, so that the tools can be improved. When I 'catch' them with such problems and ask them "Don't you want to report it, so it can be fixed?", the answer usually is something along the lines of "Naaah ... I understand what is meant. It isn't necessary." We, as users (whether we are programmers or not), really should be much more eager to report bugs and problems to the developers. That is the best way to have the tools improved. As programmers, we know what information a software developer needs, and in which format. Some years ago, I received a Christmas greeting from one developer of a tool I was using professionally: He wanted to express his gratitude for all the error reports I had delivered through the year: always clear, to the point, with all unnecessary parts shaved off. He really wished that other users could learn to provide similar error reports :-)
-
Aside: here is a Speech Recognition joke of the previous millennium - A smart programmer went to a college classroom and proudly claimed that "My speech recognition software is so advanced that it can run voice commands on my DOS machine; you are free to test it now", and ran it. Immediately, a smarter student from the last bench shouted - "FORMAT C COLON ENTER". It is left to your imagination about what happened next.
xkcd: Listening. I have done something similar to this at a place where I knew they had Alexa. It didn't work (I suppose I used the wrong formulation, or Amazon changed the way to do it), but the owner got :elephant:ing frightened and almost banned me from the house. The other guests were ROFLing for half an hour.
M.D.V. ;) If something has a solution... Why do we have to worry about?. If it has no solution... For what reason do we have to worry about? Help me to understand what I'm saying, and I'll explain it better to you Rating helpful answers is nice, but saying thanks can be even nicer.
-
Thanks! One of those where you read, but register something else. Yes; I was thinking "Text to speech". I use a dictionary to scan 3rd party text for words not in the dictionary (mostly English; including well known proper names). Spelling mistakes. Wrong "title case". Weird punctuation. Initials. "Item numbers". Abbreviations. Things that will cause issues with the speech engine. I run it through my parser until it "speaks" well; patching or adding "markup" as I go. I have programmed a "speech to menu item". :-\
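A minimal sketch of the kind of pre-scan described above, assuming a plain word list and a few regular expressions. The names `KNOWN_WORDS` and `flag_for_review` are hypothetical and are not taken from the poster's program; the three checks shown (item numbers, initials, words missing from the dictionary) are only a subset of the problems listed.

```python
import re

# Tiny stand-in word list (assumption); a real one would be loaded from a dictionary file.
KNOWN_WORDS = {"the", "part", "ships", "on", "to", "smith"}


def flag_for_review(text, known_words=KNOWN_WORDS):
    """Return (token, reason) pairs for tokens likely to trip a speech engine."""
    flagged = []
    # Tokens start with a letter or digit and may contain dots, hyphens and apostrophes.
    for token in re.findall(r"[A-Za-z0-9][A-Za-z0-9.\-']*", text):
        word = token.rstrip(".")  # drop sentence-final dots before the dictionary lookup
        if re.search(r"\d", word):
            flagged.append((token, "item number"))
        elif re.fullmatch(r"(?:[A-Z]\.)+[A-Z]?\.?", token):
            flagged.append((token, "initials"))
        elif word.lower() not in known_words:
            flagged.append((token, "not in dictionary"))
    return flagged


if __name__ == "__main__":
    for token, reason in flag_for_review("The part AB-1234 ships to J.R. Smith on Mondy."):
        print(f"{token}: {reason}")
```

Each flagged token would then be patched or given markup by hand before the text is passed to the speech engine.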
-
Daniel Pfeffer wrote:
I doubt that there are enough coders who need or would want such a system.
You are really underestimating what people get up to. GitHub has 370+ million repositories and 28 million public ones. They don't create them based on need but rather want. If you google for the following you will find at least some solutions.
speech to code
-
I saw the first speech-to-text program for doctors many years ago. The system recognized medical terms only, not general chitchat, so it was quite reliable within its domain. Code also has a limited vocabulary and a strict grammar. Assume that the program knows the syntax, and maintains a parse tree and a current position within the parse tree. If a spoken word may have two or more interpretations, chances are that some of the alternatives will give a parse error, so they are not likely to be correct. In most cases, there will be one parseable interpretation. Your examples: If there is a declared variable or method named 'SumOfSquares', and it is syntactically legal at the current position, then it is one word. If you are in the middle of a literal string constant, it is more likely to be three words (with no camel casing). If you have just opened an 'if' or 'while' condition, then it goes as x==5. If you have just completed the previous statement, and an assignment to x is a legal next statement, then it goes as x = 5. I am sure you could find examples where two entirely different interpretations of the speech would both be syntactically legal. But for the vast majority of code, that is not the case. Side remark: I have a hobby of giving hell to speech synthesis, from text to speech. Even though it turns the problem upside down, there is a lot of common handling. I collect all sorts of words of differing meanings and pronunciations, but written identically. (The first time I read "Lead guitar: ..." on a vinyl cover, I thought it was a joke on the bass guitar. Heavy!) I have gathered a handful of sentences which have two very different meanings, both grammatically correct. For 99% of the words, if you analyze the sentence, syntactically and semantically, only one interpretation and pronunciation gives a meaning. (But most speech generators do not do sufficiently deep analysis to get it right.)
Unfortunately, for this forum: My 'homograph' collection is in Norwegian, so the examples I could present would make no sense to most of you.
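The context-based choice described in the post above can be sketched roughly as follows. This is a toy illustration, not a real parser: the context labels (`string_literal`, `condition`, `statement_start`), the `NUMBER_WORDS` table and the `interpret` function are all assumptions made up for the example, standing in for what an editor with a live parse tree would supply.

```python
# Spoken number words mapped to digits (tiny illustrative table, an assumption).
NUMBER_WORDS = {"zero": "0", "one": "1", "two": "2", "three": "3", "four": "4", "five": "5"}


def interpret(spoken, context, declared_names=()):
    """Pick a textual rendering of a spoken phrase from the syntactic context.

    `context` is a hypothetical label the editor's parser would provide:
    'string_literal', 'condition' or 'statement_start'.
    """
    words = spoken.lower().split()

    # Inside a string literal, keep the phrase as ordinary separate words.
    if context == "string_literal":
        return " ".join(words)

    # A phrase matching a declared identifier is emitted as that identifier.
    by_spoken_form = {name.lower(): name for name in declared_names}
    joined = "".join(words)
    if joined in by_spoken_form:
        return by_spoken_form[joined]

    # "<name> equals <value>": comparison inside a condition, assignment otherwise.
    if len(words) == 3 and words[1] == "equals":
        left = words[0]
        right = NUMBER_WORDS.get(words[2], words[2])
        if context == "condition":
            return f"{left}=={right}"
        if context == "statement_start":
            return f"{left} = {right}"

    # Fall back to the plain words when no rule applies.
    return " ".join(words)


if __name__ == "__main__":
    print(interpret("sum of squares", "statement_start", ["SumOfSquares"]))
    print(interpret("sum of squares", "string_literal", ["SumOfSquares"]))
    print(interpret("x equals five", "condition"))
    print(interpret("x equals five", "statement_start"))
```

A real system would generate several candidate renderings and discard those that fail to parse at the current position; the fixed rules here only mimic that outcome for the four cases mentioned.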
Heh, I wire wrapped a board for my computer (S100 bus) back in the 80s to interface with the Votrax speech synthesis chip. Designing and wire wrapping the board was the easy part! Writing a simple program to make the computer 'talk' wasn't too hard; it was words like 'read' and 'lead' that caused problems. I didn't really have the chops to programmatically determine the sentence context, so I ended up having a list of words that had special code that attempted to determine the correct pronunciation. Eventually, the program got too big for the amount of memory I had at that time (16K). It was definitely a fun home project.
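A list of special-cased words like the one described above might look something like this in a modern language. It is only a sketch under loud assumptions: the cue words and the respelled pronunciations ("red"/"reed", "led"/"leed") are invented for illustration and say nothing about how the original Votrax program worked.

```python
# Neighbouring words that hint at a particular reading (illustrative guesses only).
PAST_CUES = {"have", "has", "had", "was", "already"}
METAL_CUES = {"pipe", "paint", "weight"}


def pronounce(words, i):
    """Return a rough respelling for words[i], using its neighbours as context."""
    word = words[i].lower()
    prev = words[i - 1].lower() if i > 0 else ""
    nxt = words[i + 1].lower() if i + 1 < len(words) else ""

    if word == "read":
        # Past participle after an auxiliary, present tense otherwise.
        return "red" if prev in PAST_CUES else "reed"
    if word == "lead":
        # The metal when followed by a typical noun, otherwise 'to lead' / 'lead guitar'.
        return "led" if nxt in METAL_CUES else "leed"
    # Every other word is passed through unchanged.
    return word


if __name__ == "__main__":
    sentence = "I have read about lead guitar".split()
    print([pronounce(sentence, i) for i in range(len(sentence))])
```

Rules like these fail quickly on sentences they were not written for, which matches the experience in the post: every new homograph needs its own special code, and the list keeps growing.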
-
You are the first person I have talked to who has used (and even built a board for) an actual S100 machine! I guess that 3 out of 4 CP members do not know what it is! BYTE magazine had a number of articles in those days on DIY speech synthesis and, what the original post was about, speech recognition. There were several articles about a speech recognition board that could be trained to understand 64 words. As far as I remember from what the authors wrote, it would be reasonably reliable only with the voice of the person who had trained it, and the 64 words should be as acoustically different as possible. Alexa is somewhat more sophisticated :-) When I read about people who worked with S100 machines, I'm itching to go down to my basement to pick up those BYTE magazines from the late 1970s and early 1980s to let my mind wander back to the days when you could understand every single bit in a computer. :-) About 15 years ago, I went into embedded programming on 8051 chips; that was sort of a return to the old days. When we picked up the ARM M0 (with our own monitor), I still had the feeling of being in control, but when we progressed to M4 and an external OS (Zephyr), and further on to M33, again something was slipping out of my hands...
-
I built a fair number of boards for my S100 bus system. Besides the Votrax board, I built a 4K RAM board, a dual port serial board, a cassette tape storage interface board and a Selectric mechanism control board (no dot matrix for me!!). My whole career was embedded programming (retired in 2019). I just loved it. I really loved the control.
-
jschell wrote:
They don't create them based on need but rather want.
True. But commercial (as opposed to freeware/shareware) packages require maintenance, support, etc., which IMO would be uneconomical for such a niche product.