Software Mix of Sound

Datacrime

I'd like to mix two or more sound streams. I don't want to use a hardware mixer or DirectSound, for two reasons:

i) I want to save the output to a file.
ii) I want absolute control of the timing (e.g. mix 3463 ms of source 1 with 3463 ms of source 2).

Of course, if you know a way to do both of the above with DirectSound, it would be welcome.

What I tried was, for each WORD of sound1 and sound2, to take the difference, divide it by 2 and add it to the lower value. What I got was terrible. :(

A second thought is to multiply the two WORDs and divide the result by 65536 (or shift right by 16 bits). Although it sounds like it would be faster, I think it's going to give the same result. Am I right?

And now comes the really hard stuff. What I want to do in the end is mix two or more sounds, each with a weight, like:

sound1(60%) + sound2(31%) + sound3(9%)

Any ideas?

Memory leaks is the price we pay
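On the timing requirement: once the mixing is done in software, "3463 ms" just becomes a sample count. The following is a minimal sketch of that conversion. The function names `ms_to_frames` and `ms_to_samples` are made up for illustration, and the 44100 Hz rate in the usage note is an assumption, not something stated in the question.

```c
#include <stdint.h>

/* Number of sample frames covering `ms` milliseconds at `sample_rate` Hz.
   Truncates toward zero; the 64-bit intermediate avoids overflow. */
uint32_t ms_to_frames(uint32_t ms, uint32_t sample_rate)
{
    return (uint32_t)(((uint64_t)ms * sample_rate) / 1000u);
}

/* Number of individual 16-bit samples to process: frames times channels
   (hypothetical helper for interleaved PCM data). */
uint32_t ms_to_samples(uint32_t ms, uint32_t sample_rate, uint32_t channels)
{
    return ms_to_frames(ms, sample_rate) * channels;
}
```

For example, `ms_to_frames(3463, 44100)` gives 152718 frames, so for mono data exactly 152718 samples of each source would be mixed.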

RockNix

To mix several sounds with specific gains, scale every sound by its factor and then simply add all the sounds together. It could look like this:

    float gain1 = 0.6f;  // 60%
    float gain2 = 0.31f; // 31%
    float gain3 = 0.09f; // 9%
    int mix;             // your mixdown

    mix = (int)(gain1*(float)sample1+gain2*(float)sample2 ....)

But BEWARE: if you add too many sounds you will surely get an overflow, so it is better to pre-scale all sounds by the same factor.

Hope it helps.

Greetings,
Mario

www.klangwerker.de
mario@klangwerker.de
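The scale-and-add idea above can be sketched as a small function over whole buffers. This is only an illustration: `mix_weighted` and `clamp16` are hypothetical names, and saturating the result (instead of letting it wrap) is an added safety net for the overflow case, not part of the original suggestion.

```c
#include <stddef.h>

/* Clamp a mixed value into the 16-bit signed range instead of wrapping. */
short clamp16(long v)
{
    if (v > 32767) return 32767;
    if (v < -32768) return -32768;
    return (short)v;
}

/* Mix `nsrc` streams of `n` samples each:
   out[i] = sum over k of gain[k] * src[k][i], rounded and saturated. */
void mix_weighted(const short *const *src, const float *gain,
                  size_t nsrc, size_t n, short *out)
{
    size_t i, k;
    for (i = 0; i < n; i++) {
        float acc = 0.0f;
        for (k = 0; k < nsrc; k++)
            acc += gain[k] * (float)src[k][i];
        /* round to nearest before converting back to an integer sample */
        out[i] = clamp16((long)(acc + (acc >= 0.0f ? 0.5f : -0.5f)));
    }
}
```

With gains of 0.6, 0.31 and 0.09 the weights sum to 1, so the clamp should never trigger; it only matters when the gains add up to more than 1.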

Datacrime

Thanks a lot. After making several changes I realised that the values in the streams of .wav files are of type 'short' and not WORD (a.k.a. 'unsigned short'). In other words, the values are between -32k and +32k, not 0 to 64k. After reading them correctly, I turn them positive by adding 32k, mix them, and turn the result back into a real sample by subtracting 32k. For mixing I use the way you just mentioned. If I scale all gains so that their sum is 1.00, I don't have to worry about overflows.

I also heard about mixing using the L2-norm, which is equal to:

sqrt(S1^2+S2^2+..+Sv^2)

But in this case you get overflow, and have to scale the result by 1/sqrt(2)=0.707 (for two sources). Also, I don't know how to fit different gains into this formula (yet). I will try the L2-norm and let you know which one gives the best result.
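A side note on the add-32k / subtract-32k step: when the gains sum to 1, the offset cancels out, because g1*(s1+B) + g2*(s2+B) - B = g1*s1 + g2*s2 + (g1+g2-1)*B. The sketch below compares the two routes for a pair of samples. The function names are hypothetical and the two-source form is chosen only to keep the example short.

```c
/* Round a double to the nearest int (halves away from zero). */
int round_nearest(double x)
{
    return (int)(x + (x >= 0.0 ? 0.5 : -0.5));
}

/* Mix two signed samples directly. */
int mix_signed(short s1, short s2, double g1, double g2)
{
    return round_nearest(g1 * s1 + g2 * s2);
}

/* Same mix via the offset route: shift to 0..65535, mix, shift back. */
int mix_offset(short s1, short s2, double g1, double g2)
{
    double u1 = s1 + 32768.0;
    double u2 = s2 + 32768.0;
    return round_nearest(g1 * u1 + g2 * u2 - 32768.0);
}
```

With g1 = 0.6 and g2 = 0.4 both functions give -200 for the samples 1000 and -2000. If the gains do not sum to 1, the offset route leaves a constant (DC) shift behind: two silent inputs with gains 0.5 and 0.25 come out as -8192 instead of 0, so mixing the signed values directly looks like the simpler option.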

Joe Moldovan

I assume you are mixing 16-bit PCM file streams. If that's the case then you have two problems:

1. The sound is stored as 16-bit SIGNED values, not WORD (unsigned short) values. The amplitude is between -32768 and 32767. You reduce the amplitude by going towards 0.

2. Loudness is not linear in amplitude. If you halve the amplitude the sound will not be half as loud. It follows a logarithmic law, something to do with 20*Log(A) where A is the amplitude (20*log10 of an amplitude ratio gives decibels, so halving the amplitude is a drop of about 6 dB). (I don't know the theory real well...life is too short.)

The library should be full of books which will help you. Hope you get it working. I love playing around with audio but don't have the time.

PS: I made time! The following is a very primitive C mixer I threw together. It works but probably has bugs. I used CoolEdit to capture raw PCM files. If you use WAV files then just strip the headers first to get to the PCM data. But you probably already know that. You can also use the API to get at the data a lot more neatly.

    #include "stdafx.h"
    #include <stdio.h>

    #define BUF_SAMPLES 0xFFFF

    /* static: three buffers of this size are a lot to put on the stack */
    static short buffer1[ BUF_SAMPLES ];
    static short buffer2[ BUF_SAMPLES ];
    static short buffer3[ BUF_SAMPLES ];

    int APIENTRY WinMain(HINSTANCE hInstance, HINSTANCE hPrevInstance, LPSTR lpCmdLine, int nCmdShow)
    {
        FILE *f1;
        FILE *f2;
        FILE *f3;
        size_t result1;
        size_t result2;
        size_t count;
        size_t i;

        f1 = fopen( "sound1.pcm", "rb" );
        f2 = fopen( "sound2.pcm", "rb" );
        f3 = fopen( "sound3.pcm", "wb" );
        if ( !f1 || !f2 || !f3 )
            return 1;   /* could not open one of the files */

        do
        {
            result1 = fread( buffer1, sizeof( short ), BUF_SAMPLES, f1 );
            result2 = fread( buffer2, sizeof( short ), BUF_SAMPLES, f2 );

            /* mix only what was actually read; the shorter file counts as silence */
            count = result1 > result2 ? result1 : result2;
            for ( i = 0; i < count; i++ )
            {
                short s1 = i < result1 ? buffer1[ i ] : 0;
                short s2 = i < result2 ? buffer2[ i ] : 0;
                buffer3[ i ] = s1 / 2 + s2 / 2;   /* halve each so the sum fits in 16 bits */
            }

            fwrite( buffer3, sizeof( short ), count, f3 );
        }
        while ( count > 0 );

        fclose( f1 );
        fclose( f2 );
        fclose( f3 );

        return 0;
    }
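On stripping the WAV header: the header is not guaranteed to have a fixed length, because a RIFF/WAVE file is a sequence of chunks and other chunks may sit before the PCM data. A hedged sketch of walking the chunks until the "data" chunk is shown below. `wav_seek_to_data` and `read_le32` are hypothetical helper names, and the sketch does not look at the "fmt " chunk, so it assumes you already know the file holds 16-bit PCM.

```c
#include <stdio.h>
#include <string.h>

/* Read a 32-bit little-endian value; returns 0 on success, -1 on a short read. */
static int read_le32(FILE *f, unsigned long *out)
{
    unsigned char b[4];
    if (fread(b, 1, 4, f) != 4) return -1;
    *out = (unsigned long)b[0] | ((unsigned long)b[1] << 8) |
           ((unsigned long)b[2] << 16) | ((unsigned long)b[3] << 24);
    return 0;
}

/* Position `f` at the first PCM byte of a RIFF/WAVE file.
   Returns the data chunk size in bytes, or -1 if no data chunk is found. */
long wav_seek_to_data(FILE *f)
{
    char id[4];
    unsigned long size;

    if (fseek(f, 0, SEEK_SET) != 0) return -1;
    if (fread(id, 1, 4, f) != 4 || memcmp(id, "RIFF", 4) != 0) return -1;
    if (read_le32(f, &size) != 0) return -1;          /* overall RIFF size, unused */
    if (fread(id, 1, 4, f) != 4 || memcmp(id, "WAVE", 4) != 0) return -1;

    while (fread(id, 1, 4, f) == 4 && read_le32(f, &size) == 0) {
        if (memcmp(id, "data", 4) == 0)
            return (long)size;
        /* skip this chunk; chunk bodies are padded to an even length */
        if (fseek(f, (long)(size + (size & 1)), SEEK_CUR) != 0) return -1;
    }
    return -1;
}
```

After a successful call, the next `fread` on the file returns sample data, so the same read loop as in the mixer above could be pointed at .wav files directly.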

Datacrime

I had already found out that it's 16-bit signed, but I didn't know that I have to use a log for reducing the volume. Actually I never got to the point of handling volume and never thought about it. I assume this has to do with mixing too. For example, will (Sound1*0.33+Sound2*0.66) give me Sound2 and half the volume of Sound1?

What I am up to is a mixer where you could create filters using JScript. Something like:

    function OnGetStereoPaningAt(TimePosition)
    {
        // sound always in the middle
        return 0.5;
    }

    // alternative version of the same handler
    function OnGetStereoPaningAt(TimePosition)
    {
        // from left to right every 1 sec
        return ( (TimePosition%44100)/44100 );
    }

    function OnGetWaveAt(TimePosition)
    {
        // return layer5 with a 3 sec echo
        var w1=Document.layer5.Wave.GetValueAt(TimePosition);
        var w2=Document.layer5.Wave.GetValueAt(TimePosition-44100*3);
        return Mix(w1,w2,1,0.3);
    }

Of course none of that is implemented right now. :rolleyes:
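The echo filter in the last handler boils down to mixing a signal with a delayed copy of itself. Here is a rough C sketch of that idea on a plain sample buffer. `add_echo` and `saturate16` are hypothetical names, samples before the start of the buffer are treated as silence, and the saturation step is an assumption added because gains of 1 and 0.3 sum to more than 1.

```c
#include <stddef.h>

/* Clamp a value into the 16-bit signed range instead of wrapping. */
short saturate16(long v)
{
    if (v > 32767) return 32767;
    if (v < -32768) return -32768;
    return (short)v;
}

/* out[i] = dry * in[i] + wet * in[i - delay];
   positions before the start of `in` count as silence. */
void add_echo(const short *in, short *out, size_t n,
              size_t delay, float dry, float wet)
{
    size_t i;
    for (i = 0; i < n; i++) {
        float delayed = (i >= delay) ? (float)in[i - delay] : 0.0f;
        float acc = dry * (float)in[i] + wet * delayed;
        /* round to nearest, then saturate */
        out[i] = saturate16((long)(acc + (acc >= 0.0f ? 0.5f : -0.5f)));
    }
}
```

For the 3-second echo from the script, the call would be something like `add_echo(in, out, n, 44100 * 3, 1.0f, 0.3f)`, assuming mono data at 44100 Hz.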