Speech to text.

  Slow Eddie wrote (#1):

    Do you know of, and can you recommend, a speech-to-text program that would work for coding? Or is it just impossible to do that sort of thing?

    VB6 conversions to C#, VB.net

    Daniel Pfeffer wrote (#2):

    Speech-to-text programs are pretty good these days, but code is not English. You would need a special language module for each computer language. For example, how would you enter a variable 'SumOfSquares'? Should it be one word or three? Is 'x equals 5' 'x = 5' or 'x == 5'? Other examples are easy to find. While I can see the utility of such a program for people who have lost the use of their arms/fingers, I doubt that there are enough programmers in that state to make development commercially viable.

    Freedom is the freedom to say that two plus two make four. If that is granted, all else follows. -- 6079 Smith W.

      trønderen wrote (#3, in reply to Daniel Pfeffer):

      I saw the first speech-to-text program for doctors many years ago. The system recognized medical terms only, not general chitchat, so it was quite reliable within its domain.

      Code also has a limited vocabulary and a strict grammar. Assume that the program knows the syntax and maintains a parse tree and a current position within it. If a spoken word has two or more interpretations, chances are that some of the alternatives will give a parse error, so they are unlikely to be correct. In most cases, there will be one parseable interpretation.

      Your examples: If there is a declared variable or method named 'SumOfSquares', and it is syntactically legal at the current position, then it is one word. If you are in the middle of a literal string constant, it is more likely to be three words (with no camel casing). If you have just opened an 'if' or 'while' condition, then it goes as x==5. If you have just completed the previous statement, and an assignment to x is a legal next statement, then it goes as x = 5.

      I am sure you could find examples where two entirely different interpretations of the speech would both be syntactically legal. But for the vast majority of code, that is not the case.

      Side remark: I have a hobby of giving hell to speech synthesis, that is, text to speech. Even though it turns the problem upside down, there is a lot of common handling. I collect all sorts of words with differing meanings and pronunciations that are written identically. (The first time I read "Lead guitar: ..." on a vinyl cover, I thought it was a joke on the bass guitar. Heavy!) I have gathered a handful of sentences which have two very different meanings, both grammatically correct. For 99% of the words, if you analyze the sentence syntactically and semantically, only one interpretation and pronunciation makes sense. (But most speech generators do not perform sufficiently deep analysis to do it correctly.)

      Unfortunately, for this forum: My 'homograph' collection is in Norwegian, so the examples I could present would make no sense to most of you.
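
The context-driven disambiguation described in this post could be sketched roughly as follows. This is only an illustration in Python, not any existing tool: the `Context` enum, `resolve_words`, and the `declared` table are all hypothetical names, and a real system would derive the context from an actual parse tree rather than receive it as an argument.

```python
from enum import Enum, auto


class Context(Enum):
    """Hypothetical parser positions; a real tool would derive these from a parse tree."""
    STATEMENT_START = auto()  # a new statement may begin here
    CONDITION = auto()        # inside the condition of an 'if' or 'while'
    STRING_LITERAL = auto()   # inside a quoted string


def resolve_words(words, context, declared_names):
    """Turn a list of recognized words into source text for the given context."""
    if context is Context.STRING_LITERAL:
        # Inside a string, keep the words as ordinary text.
        return " ".join(words)

    out = []
    i = 0
    while i < len(words):
        # Prefer the longest run of words that matches a declared identifier.
        match = None
        for j in range(len(words), i, -1):
            candidate = "".join(words[i:j]).lower()
            if candidate in declared_names:
                match = (declared_names[candidate], j)
                break
        if match:
            out.append(match[0])
            i = match[1]
        elif words[i].lower() == "equals":
            # Comparison inside a condition, assignment elsewhere.
            out.append("==" if context is Context.CONDITION else "=")
            i += 1
        else:
            out.append(words[i])
            i += 1
    return " ".join(out)


# Declared identifiers, keyed by their lower-cased, space-free spoken form.
declared = {"sumofsquares": "SumOfSquares", "x": "x"}

print(resolve_words(["x", "equals", "5"], Context.CONDITION, declared))        # x == 5
print(resolve_words(["x", "equals", "5"], Context.STATEMENT_START, declared))  # x = 5
print(resolve_words(["sum", "of", "squares"], Context.STRING_LITERAL, declared))  # sum of squares
```

The same three spoken words produce different text depending only on the position passed in, which is the point the post makes about parse position settling most ambiguities.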

        Amarnath S wrote (#4, in reply to Slow Eddie):

        Aside: here is a speech recognition joke from the previous millennium. A smart programmer went to a college classroom and proudly claimed, "My speech recognition software is so advanced that it can run voice commands on my DOS machine; you are free to test it now", and ran it. Immediately, a smarter student from the last bench shouted, "FORMAT C COLON ENTER". What happened next is left to your imagination.

          Daniel Pfeffer wrote (#5, in reply to trønderen):

          trønderen wrote:

          Code also has a limited vocabulary, and a strict grammar. Assuming that the program knows the syntax, and maintains a parse tree and a current position within the parse tree. If a spoken word may have 2+ interpretations, chances are that some of the alternates will give a parse error, so they are not likely to be correct. In most cases, there will be one parsable interpretation.

          Which is what I said - one would need to build an appropriate parser for the language. I never said it was impossible.

            BernardIE5317 wrote (#6, in reply to Amarnath S):

            Somewhat off topic but perhaps amusing, though you may have heard this story before. Re: early machine translation technology, English to Russian and back again: "The spirit is willing but the flesh is weak." -> Russian -> English "The vodka is strong but the meat is rancid."

              Daniel Pfeffer wrote (#7, in reply to BernardIE5317):

              "Out of sight, out of mind" -> Russian -> English "Invisible insanity". :)

                Amarnath S wrote (#8, in reply to BernardIE5317):

                Yes. Prof. M R Chidambara, a professor in the School of Automation at the Indian Institute of Science, Bengaluru, told this story sometime in the 1980s. I heard it directly from him.

                  trønderen wrote (#9, in reply to BernardIE5317):

                  There used to be "services" on the net where you could set up a list of languages, such as English -> Russian -> Greek -> German -> English, and give it a text that would be passed through the specified series of translations. At least one of the services could even iterate the sequence until the result was stable (or in some cases, oscillated between two alternatives). I saved printouts of a few such iterations in my scrapbook, but didn't save the URL. It most likely would be dead today anyway. Does anyone know of any such service in existence today?

                    trønderen wrote (#10, in reply to BernardIE5317):

                    For quite a few years (it has been fixed now), Google Translate claimed that Norwegian 'postoppkrav' (charge on delivery) would translate to Swedish 'TORSK' (codfish). Regardless of source and target language, everything was first translated to English, and then from English to the target language. So the Norwegian word for charge on delivery became COD, and COD in Swedish, maintaining the capitalization, is TORSK. Both steps make perfect sense. Google could also translate English numbers to French: forty - quarante, fortyone - quarante-et-un, fortytwo - 42, fortythree - 43 ... It was a mystery to me why it stopped at 41, and not at some "round" number. Maybe it was because "42" has an iconic value. But we are sidetracking from the subject "Speech to text".

                      trønderen wrote (#11, in reply to Amarnath S):

                      The next thing was another person jumping up, yelling "Yes" to answer the question "Are you sure?" This was regularly claimed to be a "true" story from Microsoft's first demonstration of their speech recognition. Lots of people did believe that the story was true. In Norwegian, we have a saying that goes "Well, if it ain't true, it sure is a good lie!"

                        Jeremy Falcon wrote (#12, in reply to Slow Eddie):

                        If you're determined not to use your hands, you can always have macros/code snippets to handle the parts that a program wouldn't get correct and have voice dictation run those. Then you can use the normal functionality for the parts that it will. Technology is a long way away from making this a worthwhile pursuit though. You'd be better off having ChatGPT code your crap and using speech to text to give it prompts.
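
As a rough illustration of the macro/snippet idea, a dictation front end could route utterances that start with a trigger word to stored templates and pass everything else through as ordinary dictated text. The sketch below is in Python and entirely hypothetical: the `snippet` trigger word, the `SNIPPETS` table, and `expand` are made-up names, and the templates are C# only because that is the thread's topic.

```python
# Hypothetical spoken triggers mapped to code templates (C# here, given the thread's topic).
# Doubled braces are literal braces; {0} is filled with the spoken argument.
SNIPPETS = {
    "for loop": "for (int i = 0; i < {0}; i++)\n{{\n}}",
    "null check": "if ({0} == null)\n{{\n    return;\n}}",
}


def expand(utterance):
    """Expand 'snippet <trigger> <argument>' into code; pass anything else through unchanged."""
    words = utterance.split()
    if not words or words[0].lower() != "snippet":
        return utterance
    rest = words[1:]
    # Try the longest trigger first so multi-word triggers are matched whole.
    for n in range(len(rest), 0, -1):
        trigger = " ".join(w.lower() for w in rest[:n])
        if trigger in SNIPPETS:
            # Remaining words are joined into one identifier; "_" marks a missing argument.
            argument = "".join(rest[n:]) or "_"
            return SNIPPETS[trigger].format(argument)
    return utterance


print(expand("snippet for loop count"))
print(expand("this is ordinary dictation"))
```

Anything the recognizer is likely to get wrong (braces, operators, indentation) lives in the template, so the speaker only has to say the trigger and the variable part.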

                        Jeremy Falcon

                          0x01AA wrote (#13, in reply to Slow Eddie):

                          System.Speech.Recognition Namespace | Microsoft Learn

                            Jeremy Falcon wrote (#14, in reply to Amarnath S):

                            Amarnath S wrote:

                            A smart programmer went to a college classroom and proudly claimed that "My speech recognition software is so advanced that it can run voice commands on my DOS machine; you are free to test it now", and ran it. Immediately, a smarter student from the last bench shouted - "FORMAT C COLON ENTER".

                            That says nothing of the quality of the medium in which someone delivered that command. It could've been done with a keyboard just as easily, rendering the intended point moot. I thought this dude was supposed to be smart in the example?

                              k5054 wrote (#15, in reply to Amarnath S):

                              Obligatory xkcd: [xkcd: Listening](https://xkcd.com/1807/)

                              "A little song, a little dance, a little seltzer down your pants" Chuckles the clown

                                Lost User wrote (#16, in reply to Daniel Pfeffer):

                                I'd generate pseudo-code: set x to 5.
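
One way to read this suggestion: have the speaker dictate unambiguous pseudo-code phrases and translate those to code with simple patterns. The sketch below is in Python and is only an assumption about how that might look; the `PATTERNS` table and `pseudo_to_code` are hypothetical, and the C-style output is chosen just for illustration.

```python
import re

# Hypothetical spoken pseudo-code patterns and the code they map to.
PATTERNS = [
    (re.compile(r"^set (\w+) to (.+)$", re.IGNORECASE), r"\1 = \2;"),
    (re.compile(r"^if (\w+) equals (.+)$", re.IGNORECASE), r"if (\1 == \2)"),
]


def pseudo_to_code(phrase):
    """Translate one spoken pseudo-code phrase; return None if no pattern matches."""
    text = phrase.strip()
    for pattern, template in PATTERNS:
        match = pattern.match(text)
        if match:
            # Fill the template's group references from the matched phrase.
            return match.expand(template)
    return None


print(pseudo_to_code("set x to 5"))       # x = 5;
print(pseudo_to_code("if x equals 5"))    # if (x == 5)
```

Because "set ... to" can only mean assignment and "if ... equals" can only mean comparison, the 'x equals 5' ambiguity never reaches the recognizer.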

                                "Before entering on an understanding, I have meditated for a long time, and have foreseen what might happen. It is not genius which reveals to me suddenly, secretly, what I have to say or to do in a circumstance unexpected by other people; it is reflection, it is meditation." - Napoleon I

                                  Lost User wrote (#17, in reply to Slow Eddie):

                                  Given enough time, anything is possible. Your code or someone else's? I use a dictionary to validate every word in my text-to-speech program. I use "markup" to indicate words that need to be spoken via phonetics. [RecognizedWordUnit.Pronunciation Property (System.Speech.Recognition) | Microsoft Learn](https://learn.microsoft.com/en-us/dotnet/api/system.speech.recognition.recognizedwordunit.pronunciation?view=netframework-4.8.1)

                                    Daniel Pfeffer wrote (#18, in reply to Lost User):

                                    I didn't say it's impossible. I said that code cannot be treated as a dialect of English. I think that the bigger obstacle to the development of speech to text for coding is economic. I doubt that there are enough coders who need or would want such a system.

                                      Lost User wrote (#19, in reply to Daniel Pfeffer):

                                      At my first job, they actually had a blind "intern" who programmed in Braille on his special typewriter. I don't remember how we got his program onto "cards", but I was asked to review his code. I can't help but think that some "Braille to speech" would have helped his comprehension. (My issue is "slow" talkers). "Too much work" depends on the recipient.

                                        jschell wrote (#20, in reply to Slow Eddie):

                                        It is possible, just not very viable if other solutions exist. For someone with limited mobility, it can be done. But someone who can type, even with just a couple of fingers, should do it that way. I searched Google with the following and found multiple solutions:

                                        typing with speech to text code writer

                                          trønderen wrote (#21, in reply to Lost User):

                                          Gerry Schmitz wrote:

                                          I use a dictionary to validate every word in my text-to-speech program.

                                          At least that can give a recognition quality comparable to word-by-word translation from one language to another, with no concern about context or grammar. :-) (I suspect that you intended to write "... in my speech-to-text program". If you really meant text-to-speech, that is a different, although related, problem. Are you then referring to a pronunciation dictionary? How do you handle homographs?)
