I noticed that when I view waveforms output from MAME, they have a very strong positive DC offset. However, when I play TMS5220 speech from the M1 app (which uses MAME code), the output has a mean of 0 and therefore no offset. In this screenshot, Gauntlet is speaking "Traps make walls disappear", with MAME on the left and M1 on the right. The waveform on the right is much "better": http://tinyurl.com/pwr2bba
I tried setting all the K parameters to 0 to see the raw excitation filter, and I see this: http://tinyurl.com/n2zbl38
The voiced frames are completely positive, but the unvoiced frames have a zero mean. Every place where the waveform goes negative consists of samples created by the random noise generator (unvoiced frames with a pitch of 0).
I am wondering why this offset is there, and whether anyone knows an easy way to get rid of it. I cannot simply subtract the average of the samples, because the unvoiced ones do not have the offset...
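One thing that might work instead of subtracting a global average is a one-pole DC-blocking (high-pass) filter run over the final output, since it tracks the offset locally and so should leave the zero-mean unvoiced frames mostly alone. The sketch below is only an illustration under assumptions: the names (dc_blocker, dc_blocker_step, dc_blocker_process) are hypothetical and not part of MAME or M1, the 16-bit sample format is assumed, and the pole value r (something like 0.995) is a guess that would need tuning by ear. I do not know whether M1 does anything like this.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not MAME code: state for a one-pole DC blocker. */
typedef struct {
    double prev_in;   /* x[n-1] */
    double prev_out;  /* y[n-1] */
    double r;         /* pole radius, 0 < r < 1; closer to 1 = lower cutoff */
} dc_blocker;

static void dc_blocker_init(dc_blocker *f, double r)
{
    assert(f != NULL);
    assert(r > 0.0 && r < 1.0);
    f->prev_in = 0.0;
    f->prev_out = 0.0;
    f->r = r;
}

/* Filter one sample: y[n] = x[n] - x[n-1] + r * y[n-1] */
static double dc_blocker_step(dc_blocker *f, double x)
{
    double y = x - f->prev_in + f->r * f->prev_out;
    f->prev_in = x;
    f->prev_out = y;
    return y;
}

/* Filter a buffer of signed 16-bit samples (assumed format), with clamping. */
static void dc_blocker_process(dc_blocker *f, const short *in, short *out, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++) {
        double y = dc_blocker_step(f, (double)in[i]);
        if (y > 32767.0) y = 32767.0;
        if (y < -32768.0) y = -32768.0;
        /* round to nearest integer */
        out[i] = (short)(y >= 0.0 ? y + 0.5 : y - 0.5);
    }
}
```

The filter state would have to be kept across buffers (initialize once, then call dc_blocker_process on each chunk of output). There will be a short transient wherever the offset appears or disappears, for example at a voiced/unvoiced boundary, and a smaller r shortens that transient at the cost of removing more low-frequency content.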