Music Notation Parsing
-
I may be totally wrong, but it seems that there should be a fairly good piece of software that can "listen" to a WAV or MP3 of a song, and transcribe (polyphonically) the various parts into staves of notes for that instrument. Anyone know of such an application? Anyone aware of one being developed using machine learning? Thanks in advance.
I plan to take the trial versions of the software titles recommended in this thread, and try them out this weekend. Then I can report on how well they worked, given the same MP3. Trolls will be allowed to eat at that meal. :)
-
I think this is usually done in two steps: audio>midi and then midi>notation. Some of the audio>midi software works OK for a single-note (monophonic) line, but the recognition of midi note pitches from polyphonic (multi-note, multi-instrument) audio is inevitably a lot less accurate (to put it mildly). A lot depends on the instrumentation in the audio file. Complex instruments with lots of harmonics are harder to work with than e.g. a simple pure flute sound. Maybe Intelliscore, Wavemid or WIDI?
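As a rough illustration of the audio>midi step, here is a minimal Python sketch of how a detected frequency could be mapped to a MIDI note. It assumes equal temperament with A4 = 440 Hz, and the function names are invented for this example rather than taken from any of the programs mentioned. It also hints at why harmonics cause trouble: the third harmonic of an A looks like a different note.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def freq_to_midi(freq_hz, a4_hz=440.0):
    """Nearest MIDI note number for a frequency (assumes equal temperament)."""
    return round(69 + 12 * math.log2(freq_hz / a4_hz))

def midi_to_name(midi_note):
    """Sharp-only spelling using the convention that MIDI 60 is C4."""
    octave = midi_note // 12 - 1
    return f"{NOTE_NAMES[midi_note % 12]}{octave}"

print(midi_to_name(freq_to_midi(440.0)))   # A4, the fundamental
print(midi_to_name(freq_to_midi(1320.0)))  # E6, the third harmonic of that same A4
```

A real transcriber would have to decide which detected peaks are played notes and which are only overtones, which is where the polyphonic case gets hard.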
I agree. As an amateur musician (full time software engineer), if the software can get me 80% or better on the notes, I can sort out the instruments by ear where the software can't. As for the key signature, if I can't spot a key signature and key change from the accidentals in a C key signature, then I should not be arranging music.
-
GenJerDan wrote:
isolating individual instruments
Easy in a three piece band perhaps, in an orchestra?
GenJerDan wrote:
map them to whatever
And how do you determine the key? There are only 12 notes. Working out the key is very difficult.
You would need to count the notes of each frequency of the whole piece, assume most are not accidentals, then compare the highest counts to a table of notes per key. If there is a question, you can look at the first and last notes of the piece to decide.
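A minimal Python sketch of that counting idea, under some loud assumptions: the notes are already available as MIDI numbers, only major keys are considered (a relative minor has the same note set and would look identical), and the tie-break prefers the last note and then the first. The name `guess_major_key` is hypothetical.

```python
from collections import Counter

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
MAJOR_SCALE_STEPS = (0, 2, 4, 5, 7, 9, 11)  # semitones above the tonic

def guess_major_key(midi_notes):
    """Pick the major key whose scale covers the most notes of the piece."""
    if not midi_notes:
        raise ValueError("need at least one note")
    counts = Counter(n % 12 for n in midi_notes)
    scores = {}
    for tonic in range(12):
        scale = {(tonic + step) % 12 for step in MAJOR_SCALE_STEPS}
        scores[tonic] = sum(c for pc, c in counts.items() if pc in scale)
    best = max(scores.values())
    candidates = [t for t in range(12) if scores[t] == best]
    # If several keys fit equally well, look at the last and first notes.
    for note in (midi_notes[-1], midi_notes[0]):
        if note % 12 in candidates:
            return NOTE_NAMES[note % 12]
    return NOTE_NAMES[candidates[0]]

# C D E G A G E C fits C, F and G major equally; the final C settles it.
print(guess_major_key([60, 62, 64, 67, 69, 67, 64, 60]))  # C
```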
-
GenJerDan wrote:
isolating individual instruments
Easy in a three piece band perhaps, in an orchestra?
GenJerDan wrote:
map them to whatever
And how do you determine the key? There are only 12 notes. Working out the key is very difficult.
In a musical score you don't really show the key (the music could be in a mode instead) - you just apply the convenient number of sharps or flats after the clef sign. You'd just need a table of flats and sharps in the order they appear on the circle of 5ths. The program can notice that there are way more b-flats and e-flats than naturals, and put them by the clef. You'd want the table so you wouldn't end up with non-western key signatures with just an a-flat and a c-sharp. Modulation would be more difficult, but noticing that now there are all b and e-naturals, with c and f-sharps would be an indication. With instruments that throw out lots of overtones it would be difficult to determine which notes are actually being played. A Mixture Stop on a pipe organ or an old Hammond drawbar would be pretty difficult to deal with.
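The table idea could be sketched in Python roughly like this. It is only an illustration: the names are made up, the input is assumed to be MIDI note numbers, and the tables stop at five sharps or flats because signatures beyond that need E-sharp or C-flat, which this simple comparison cannot tell apart from F or B.

```python
from collections import Counter

# (name, altered pitch class, natural pitch class), in circle-of-fifths order
SHARP_ORDER = [("F#", 6, 5), ("C#", 1, 0), ("G#", 8, 7), ("D#", 3, 2), ("A#", 10, 9)]
FLAT_ORDER = [("Bb", 10, 11), ("Eb", 3, 4), ("Ab", 8, 9), ("Db", 1, 2), ("Gb", 6, 7)]

def _leading_accidentals(order, counts):
    """Walk the table in order; stop at the first note that is mostly natural."""
    chosen = []
    for name, altered, natural in order:
        if counts[altered] <= counts[natural]:
            break
        chosen.append(name)
    return chosen

def guess_key_signature(midi_notes):
    """Return the sharps or flats to put by the clef (empty list for none)."""
    counts = Counter(n % 12 for n in midi_notes)
    sharps = _leading_accidentals(SHARP_ORDER, counts)
    flats = _leading_accidentals(FLAT_ORDER, counts)
    return sharps if len(sharps) > len(flats) else flats

print(guess_key_signature([58, 60, 62, 63, 65, 67, 69, 70]))  # ['Bb', 'Eb']
```

Because the table is walked in order, a piece with only a-flats and c-sharps gets no signature at all instead of an odd one; those notes would simply stay as accidentals.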
-
You would need to count the notes of each frequency of the whole piece, assume most are not accidentals, then compare the highest counts to a table of notes per key. If there is a question, you can look at the first and last notes of the piece to decide.
Yes, getting complex, isn't it?
-
In a musical score you don't really show the key (the music could be in a mode instead) - you just apply the convenient number of sharps or flats after the clef sign. You'd just need a table of flats and sharps in the order they appear on the circle of 5ths. The program can notice that there are way more b-flats and e-flats than naturals, and put them by the clef. You'd want the table so you wouldn't end up with non-western key signatures with just an a-flat and a c-sharp. Modulation would be more difficult, but noticing that now there are all b and e-naturals, with c and f-sharps would be an indication. With instruments that throw out lots of overtones it would be difficult to determine which notes are actually being played. A Mixture Stop on a pipe organ or an old Hammond drawbar would be pretty difficult to deal with.
Take the intro to Sweet Child in Time. What key is that in? :)
-
Thanks, Ravi. Unfortunately, the app does not offer a trial that reads WAV or MP3 files without a charge card on file. Before I hand over card info, I need to see it work. I do appreciate your quick response.
I thought they had a free version that would let you notate by singing?
/ravi
My new year resolution: 2048 x 1536 Home | Articles | My .NET bits | Freeware ravib(at)ravib(dot)com
-
Munchies_Matt wrote:
And how do you determine the key? There are only 12 notes. Working out the key is very difficult.
The technology to do this has greatly advanced over the last 20+ years. There are several $100 pedals that do this very accurately. Many of them (Digitech, TC Helicon, BandInaBox) license the same software from a Canadian company (I forget the name). /ravi
-
Munchies_Matt wrote:
Dream on, it is impossible.
MIT is already working on it - isolating individual instruments - in videos, anyway. From there you just need to do some FFT to get the notes, map them to whatever, then transcribe that. I wouldn't want to do it, but people already are.
We won't sit down. We won't shut up. We won't go quietly away. YouTube, VidMe and My Mu[sic], Films and Windows Programs, etc. and FB
As hinted at above -- it's not even as simple as identifying the pitch (frequency) of the note of a particular instrument. For example, a particular pitch could be considered a D# or an Eb, depending on the context. As for the rhythm, one example is that swing 8ths are notated on the page exactly the same way as Bach or Mozart's straight 8ths (let's see... those are quavers on the other side of the pond, I think.... two quavers per crotchet, isn't it?). And is it in 3/4 time or 6/8? That depends on where you put the emphasis on those crotchets and quavers.
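To make the D#/Eb point concrete, here is a deliberately crude Python sketch that picks a spelling from the key signature alone. The function name and the `fifths` parameter (positive for sharps, negative for flats) are assumptions for this example. Real spelling also depends on melodic direction and harmony, and this ignores names like E# or Cb entirely.

```python
SHARP_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
FLAT_NAMES = ["C", "Db", "D", "Eb", "E", "F", "Gb", "G", "Ab", "A", "Bb", "B"]

def spell(midi_note, fifths):
    """Spell a pitch with flats in flat keys and sharps otherwise."""
    names = FLAT_NAMES if fifths < 0 else SHARP_NAMES
    return names[midi_note % 12]

print(spell(63, 4))   # D# (four sharps, e.g. E major)
print(spell(63, -3))  # Eb (three flats, e.g. Eb major)
```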
-
'Sounds' like a great concept! The problem would be isolating each instrument from a mixed track. You mean like this: Flying Colors - Infinite Fire - YouTube (skip to 5:25 to see what I'm referring to)
"Go forth into the source" - Neal Morse
kmoorevs wrote:
The problem would be isolating each instrument from a mixed track.
You mean like this? https://gizmodo.com/mits-new-ai-powered-software-can-extract-individual-ins-1827372032
If you think 'goto' is evil, try writing an Assembly program without JMP.
-
Ravi Bhavnani wrote:
I thought they had a free version that would let you notate by singing?
My intended use is to transcribe (within a reasonable percentage) a polyphonic musical WAV or MP3 file. Monophonic would not let me know how well the software works for what I want. Thanks
-
Munchies_Matt wrote:
And how do you determine the key? There are only 12 notes. Working out the key is very difficult.
The technology to do this has greatly advanced over the last 20+ years. There are several $100 pedals that do this very accurately. Many of them (Digitech, TC Helicon, BandInaBox) license the same software from a Canadian company (I forget the name). /ravi
A guitar effects pedal? Why would that need to know what key the guitar is playing in?
-
Munchies_Matt wrote:
A guitar effects pedal? Why would that need to know what key the guitar is playing in?
To generate a harmony line and/or harmony vocals. Here are some examples:
Check out the videos on YouTube - they're pretty compelling! I used the DigiTech Harmony Man to create the harmony lead on this. /ravi
-
Munchies_Matt wrote:
A guitar effects pedal? Why would that need to know what key the guitar is playing in?
To generate a harmony line and/or harmony vocals. Here are some examples:
Check out the videos on YouTube - they're pretty compelling! I used the DigiTech Harmony Man to create the harmony lead on this. /ravi
Adding harmonics isn't the same as decoding music and working out the key it is in.
-
Adding harmonics isn't the same as decoding music and working out the key it is in.
The device needs to know the key you're playing in, in order to generate harmonies. /ravi
-
The device needs to know the key you're playing in, in order to generate harmonies. /ravi
No it doesn't. It takes the input sine wave and, commonly, adds thirds and fifths.
-
No it doesn't. It takes the input sine wave and, commonly, adds thirds and fifths.
Munchies_Matt wrote:
It takes the input sine wave
The control input to these devices is a chord played on a guitar, which (as I'm sure you know) is not a pure sine wave. The hardware (actually the software running on the hardware) determines the root note from the complex input waveform, and uses that to generate the selected harmonics. /ravi
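For what it's worth, here is a small Python sketch of why the key could matter, assuming the goal is a diatonic (in-key) harmony rather than a fixed interval. It says nothing about how any actual pedal works, and `diatonic_third_above` is a name invented for this example. Only major keys are handled.

```python
MAJOR_SCALE_STEPS = (0, 2, 4, 5, 7, 9, 11)  # semitones above the tonic

def diatonic_third_above(midi_note, tonic_pc):
    """Note two scale degrees above midi_note in the major key on tonic_pc."""
    offset = (midi_note - tonic_pc) % 12
    if offset not in MAJOR_SCALE_STEPS:
        raise ValueError("note is not in the key")
    degree = MAJOR_SCALE_STEPS.index(offset)
    target = degree + 2
    # Wrap around the octave when the third crosses the top of the scale.
    interval = MAJOR_SCALE_STEPS[target % 7] + 12 * (target // 7) - offset
    return midi_note + interval

print(diatonic_third_above(62, 0))  # 65: D harmonised with F in C major
print(diatonic_third_above(62, 2))  # 66: the same D gets F# in D major
```

The same input note gets a minor third in one key and a major third in the other, so a fixed "add a third" cannot cover both cases.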
-
Pulling instruments out is impossible IMO, but the issue of key is crucial. Regardless of the same notation being used for various keys, if you can't tell the key, you can't score it in a standard way.
Working out the key is a trivially simple task that the human can do, either as an input to the program or later when editing its output. In fact, since the computer already knows the *pitches* of every note, a good heuristic would be to go through all keys and see which produces the least amount of accidentals in the resulting score. The really hard bit, as you pointed out first, is isolating the sounds of different instruments.
-
kholsinger wrote:
As hinted at above -- it's not even as simple as identifying the pitch (frequency) of the note of a particular instrument. For example, a particular pitch could be considered a D# or an Eb, depending on the context.
If the machine could just identify the pitches of each note, that would be a massive leap forward and would simplify transcription greatly. Fixing up enharmonic equivalents later wouldn't present any kind of meaningful problem to the user and is a piece of cake compared to trying to transcribe a whole song by ear. That said, all automated transcription software I've tried to date has indeed done a really bad job at rhythmic dictation and tends to produce a mess of rapid notes. This is more time-consuming to fix and badly obscures understanding of the music.
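One common cleanup for that kind of rhythmic mess is quantization: snapping detected note onsets to a grid. A minimal Python sketch, assuming the tempo is already known and constant and that an eighth-note grid is fine enough; the function name and parameters are hypothetical and not from any of the software discussed.

```python
def quantize_onsets(onsets_sec, tempo_bpm, divisions_per_beat=2):
    """Snap onset times (seconds) to the nearest grid slot, returned in beats."""
    grid_sec = 60.0 / tempo_bpm / divisions_per_beat
    return [round(t / grid_sec) / divisions_per_beat for t in onsets_sec]

# Slightly sloppy eighth notes at 120 bpm land on clean half-beat positions.
print(quantize_onsets([0.02, 0.27, 0.49, 0.77], tempo_bpm=120))
# [0.0, 0.5, 1.0, 1.5]
```

Too coarse a grid flattens swing or triplets and too fine a grid keeps the jitter, so this only moves the rhythm problem into choosing the grid.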