Compare recorded speech (with FFT?)
-
For language teaching, a student hears a word and then repeats it. That gives me two wave files: the first is my own, the second is the one recorded by the student. The student's recording will probably start and end with noise of unknown amplitude. The noise may also contain peaks (for example, a pen dropped on the desk). I can't determine the exact start and end of the spoken word, both because of those possible peaks and because the word may start and end 'under' the noise level.

I must compare the two waves, which are held in two short-int arrays, to see whether the student pronounced the word correctly, judged on rhythm, pitch and stress. So 'rotor' in reply to 'motor' would be correct, but 'rotter' would not.

Comparing the raw sample content of the waves doesn't work, so now I am wondering whether using FFT might be the solution. I have tried WaveInFFT, processing the two arrays the same way and displaying them the same way, but I do not see a resemblance between the two graphs.

I hope someone can point me in the right direction.

Ronald Wilmink (Netherlands)
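To make the question concrete, here is a minimal sketch of one way the two short-int arrays could be processed identically: cut each into short overlapping frames, apply an FFT to every frame, and keep a per-frame energy (a rough stand-in for stress and rhythm) and the strongest frequency (a very crude stand-in for pitch). This is only an illustration under stated assumptions, not what WaveInFFT does internally. The names `fft` and `frame_features`, the frame size of 512, the hop of 256 and the 16-bit mono input are all assumptions made for the example, and the sketch does not address the noise or start/end detection problem.

```python
import cmath
import math


def fft(values):
    """Recursive radix-2 FFT; len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2])
    odd = fft(values[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def frame_features(samples, sample_rate, frame_size=512, hop=256):
    """Return a list of (energy, peak_frequency_hz), one entry per frame.

    samples is a sequence of 16-bit signed integers (mono, assumed).
    frame_size must be a power of two because of the simple FFT above.
    """
    # Hann window to reduce spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / (frame_size - 1))
              for i in range(frame_size)]
    features = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        # Scale to roughly [-1, 1) and apply the window.
        frame = [samples[start + i] / 32768.0 * window[i]
                 for i in range(frame_size)]
        spectrum = fft(frame)
        # Only the first half is needed for real-valued input.
        mags = [abs(c) for c in spectrum[:frame_size // 2]]
        energy = sum(m * m for m in mags)
        # Skip bin 0 (DC offset) when looking for the strongest component.
        peak_bin = max(range(1, len(mags)), key=mags.__getitem__)
        features.append((energy, peak_bin * sample_rate / frame_size))
    return features


# Usage: a synthetic 440 Hz tone sampled at 8000 Hz stands in for a recording.
rate = 8000
tone = [int(10000 * math.sin(2 * math.pi * 440 * t / rate)) for t in range(2048)]
features = frame_features(tone, rate)
```

Running the same function on both recordings gives two sequences of frame features that could then be plotted or compared. The two sequences will generally differ in length and alignment, so some further alignment step would still be needed before a frame-by-frame comparison makes sense.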
modified on Friday, April 10, 2009 9:32 AM