Joined: Oct 2024
Posts: 43
Likes: 2
Vag
Member
Hi everyone,
I've been searching and trying stuff for days, but without any good results...
Here is some assorted information:

At one point I really thought it was G.721 ADPCM, but Cool Edit Pro supports G.721, and it seems that's not it.

Here are some ADPCM flavors that do produce audible sound in the game, though with a lot of noise:
Dialogic ADPCM
DVI-IMA ADPCM 4
IMA ADPCM
Yamaha ADPCM

This is interesting: https://ethw.org/Oral-History:Takao_Nishitani

I have also found this ("Real Time Implementation of 32Kbps ADPCM CODEC on NEC uPD7720D DSP") here:
https://www.dbpia.co.kr/journal/articleDetail?nodeId=NODE01328118
and here:
https://koreascience.kr/article/CFKO198403977717771.page
It's in Korean, but it explains some things about the algorithm. It doesn't have any source code, though; only 4 pages are shown and it's all text. In the first link, only the first page is readable (if you try to buy it, you buy these 4 pages). The second link shows all 4 pages clearly; I wish I could find the original.

But I just got an email reply from a Greek girl with a similar problem, who pointed me to this page: https://wiki.muc.ccc.de/millennium:voiceware
It also links to this decoder: https://github.com/muccc/millennium/tree/master/code/voiceware-decoder
I just started reading it; it says there are opcodes (e.g. for echo, looping, fade out) and also that you can have PCM sounds instead of ADPCM sounds.
Well, if we're not able to produce new ADPCM sounds but we can use PCM sounds, I will do that ;-)

I need some time to experiment with the opcodes and the PCM; there's still hope :-)

Vag
In Alien Syndrome, the repeating opcode byte that means 128 bytes of sound data will follow is 59 (hex), not 53 as in Golden Axe. According to the information in that wiki, this also means the frequency is 6000 Hz. So imagine if you could clean up and enhance the original sounds with an AI tool and then convert them to NEC ADPCM at 8000 Hz: you would have the same sounds, just with better quality. I must say that there is a lot of empty space in the sound ROMs, enough for many sounds, if not all of them, to be converted to 8000 Hz ;-)
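For reference, the relation between that repeating byte and the playback rate can be sketched in C. This is only a sketch under assumptions: it follows one reading of the MAME uPD7759 source, in which a header of the form 01xxxxxx announces 256 nibbles and its low six bits n give one sample every 4 x (n + 1) clock cycles, and it assumes a 640 kHz chip clock. The function names are made up for illustration.

```c
#include <stdint.h>

/* Assumed uPD7759 clock in hertz (an assumption; check the actual board). */
#define UPD7759_CLOCK_HZ 640000u

/* Returns 1 if `header` has the form 01xxxxxx, which is read here as
   "256 nibbles (128 bytes) of ADPCM data follow"; otherwise returns 0. */
int is_256_nibble_block(uint8_t header)
{
    return (header & 0xC0) == 0x40;
}

/* Sample rate implied by the low six bits of the header, assuming one
   sample is produced every 4 * (n + 1) clock cycles. */
double block_sample_rate_hz(uint8_t header)
{
    unsigned divider = 4u * ((header & 0x3Fu) + 1u);
    return (double)UPD7759_CLOCK_HZ / (double)divider;
}
```

Under those assumptions, 0x53 works out to exactly 8000 Hz and 0x59 to roughly 6154 Hz, which is in line with the 6000 Hz figure from the wiki.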

Vag
I have found some more things...

The Sega AI computer uses the uPD7759 and has a ROM with NEC ADPCM sounds. Here is an interesting thread about the Sega AI Computer and the dumps (page 2) :-))
https://www.smspower.org/forums/15244-SegaAIComputerReverseEngineeringThread

Here's a pdf with a few new details: https://portal.etsi.org/stq/workshop2007presentations/Wideband%20Speech%20Telephony%201984-200.pdf
It calls it "WB-ADPCM" (for wideband) and includes a diagram. It also says "a combination of the French and the Japanese Codec candidates was standardized: G.722 (1988)".

I found this page https://milkchoco.info/archives/9324 that refers to this thread. It says "At the moment, I'm struggling to make software that can output NEC ADPCM."!

As I was searching for the uPD7720, I found this very interesting page: https://pawozniak.com/
"The aim of this project was to develop a fully functional emulator of the Speech Plus CallText 5010 hardware voice synthesiser used by Professor Stephen Hawking".
That device included a uPD7720 chip. They made a uPD7720 emulator, based on higan, which emulates the uPD7725.
I contacted Mr. Pawel Wozniak and Mr. Peter Benie, who were both helpful and extremely polite.

Joined: Mar 2013
Posts: 82
D
Member
It seems that the μPD7730 or μPD77C30 might be the only chip that functions as a dedicated ADPCM encoder; the other chips are either decoders or general purpose DSPs. I can only find a 2-page datasheet for this chip, although I have not looked in the product catalogs yet.

I also found a diagram on page 5 of this 7751 datasheet that shows how the audio samples are mapped to ADPCM codes.

Last edited by Dodg; 12/01/24 03:16 PM.
Vag
I found this question, 2 days ago: https://retrocomputing.stackexchange.com/questions/30984/what-adpcm-algorithm-did-nec-use-in-their-%CE%BCpd775x-chips
I hope someone will manage to write an encoder. The information there helps a lot!


As for the responses from Mr. Pawel Wozniak and Mr. Peter Benie, I will show everyone Peter Benie's message:

"Hi,
In our reverse engineering project, the original authors had simply discarded any code written by the people who built the CallText interface, and replaced it with their own byte-code interpreter for KlattTalk and supplied their own routines to generate the wave forms.
There was some digital processing done on the 7720, but I doubt it was any standard encoding, because there was no need to match any standard. The 7720 doesn’t have any sound-encoder, so it has no special support for ADPCM.
The two things the 7720 has that make it special are:
The 16 x 16-bit multiplier – one multiplication per clock cycle, and
On each cycle, the operation code tells all the functional blocks what to do, not just one block.
This is how it gets high throughput, and is what makes it a DSP chip rather than a general purpose CPU.
That doesn’t help you though.

Something that stands out from the datasheet is that ADPCM is not a single standard – it is a description of a technique for encoding. It is a bit like saying a number is encoded as Floating Point; it may be true but that leaves open several arbitrary choices for the encoding of mantissa and exponent.
I couldn’t find a manual with details for the 7759, but looking at
https://github.com/mamedev/mame/blob/master/src/devices/sound/upd7759.cpp#L494
your repeated byte does indeed mean 128 bytes come next, though the comments in the code more correctly say 256 nibbles. These are fed, one nibble at a time, into the ADPCM state machine in lines 327-365.
You have to stare at the state machine for a while to figure out what it does, but when you see it, it turns out to be quite simple. It matches the block diagram on page 5 in this paper:
https://people.cs.ksu.edu/~tim/vox/dialogic_adpcm.pdf
with the exception that on the 7759, the step size and output size are 9 bits, not 12, so scale everything down accordingly.

The key thing to note is that the values in the input data represent the differences between successive output values.
In order to get both fidelity and range, these differences are scaled according to one of the curves in this diagram.

[Linked Image from i.ibb.co]

To calculate the next output value, the decoder looks up the input value on the selected curve and adds that amount to the previous output value.
It then picks a new curve, depending on that input value. For small input values, it picks a shallower curve to use next time. For large input values, it chooses a steeper curve.
That’s all ADPCM is.

In more detail:
Let curves[0..15] be the selection of curves in the above diagram, each mapping [-7..-0, +0..+7] to [-255 .. +255]
Let selected_curve = 0
Let output_value = 0
For each input_value: # input_value is in the range [-7..-0, +0..+7]
    # Step 1 – calculate the next output value.
    output_value += curves[selected_curve][input_value]
    # Step 2 – pick the next curve
    If |input_value| in {0, 1}: selected_curve -= 1 # pick a shallower curve
    If |input_value| in {2, 3}: do nothing
    If |input_value| == 4: selected_curve += 1 # pick a steeper curve
    If |input_value| in {5, 6}: selected_curve += 2
    If |input_value| == 7: selected_curve += 3 # pick a much steeper curve
    Clamp selected_curve between 0 and 15 inclusive.

The input values are encoded as sign-and-magnitude, so -0 and +0 are distinct values, both with magnitude 0. The steeper curves treat -0 and +0 differently.

An interesting feature is that the Dialogic algorithm and the 7759 algorithm both use 4-bit sign-and-magnitude data for the input stream, which raises the question: what would happen if you were to put the data intended for one algorithm into the other?
In general, positive values would yield a rising slope, and negative values would yield a falling slope. Furthermore, large values would result in a steep slope and small values would result in a shallow slope. But if the slopes were chosen differently, the shape would be wrong and sound would be distorted.
There’s a very high chance that the fundamental mode would be clearly audible and at the correct pitch (modulo sample rate); you’d recognise which sound it was, even though it would sound ‘off’.
You might also run the risk of overflow if the output values were larger than intended. In the mame emulator, output value overflow results in undefined behaviour (only unsigned integers have defined overflow behaviour), but in practice it is likely to result in wraparound, which you would hear as a very loud pop.
You would almost certainly get a d.c. offset in the output, but you normally wouldn’t be able to hear that except on the transition to a silent block. This offset would look like a random walk – varying but spending most of the time going nowhere. But if you waited long enough, it would eventually be large enough to make normal data cause an overflow, which you would definitely hear.
Based on your description and on the above, I’m convinced that what is going on in the noisy sounds is that you are feeding in data intended for a different set of curves.
In the ksu paper, page 4 shows a block diagram of the simplest possible encoder, which will let you make new sounds.
The encoder can’t produce a perfect output – it will have some quantisation error. The important thing to note from the diagram is that the differences, d(n), are not the differences between successive input values; they are the differences between the current input and the immediately previous quantised output. This has the effect of carrying any error forward to the next calculation; the error is never lost, but it might take several cycles for it to be incorporated, depending on which curve is selected.

I don’t know if you’ve come across the z-transform notation before. In the block diagrams, most of the boxes are assumed to take zero time and have no state. The z^-1 boxes are one-position shift registers, so they do have state; on each operation, a new value goes in and the old value is pushed out. This means that when you turn on the “machine”, there must already be a value in the shift registers. For both of them, use the value 0 to match the implementation of the 7759 decoder.
I hope that’s of some help.
Peter"
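The decoder loop described in the message can be sketched in C as follows. It is a sketch under stated assumptions, not the chip's documented behaviour: the 16 x 16 curve table is supplied by the caller (for example copied from the MAME source), bit 3 of each nibble is taken as the sign, and the high nibble of each byte is assumed to be decoded first. `fill_toy_curves` is a hypothetical helper that builds a made-up table purely for testing; its values are NOT the real 7759 curves.

```c
#include <stddef.h>
#include <stdint.h>

/* Curve adjustment per nibble magnitude (low 3 bits), from the rules above:
   {0,1} -> -1, {2,3} -> 0, 4 -> +1, {5,6} -> +2, 7 -> +3. */
static const int CURVE_DELTA[8] = { -1, -1, 0, 0, 1, 2, 2, 3 };

typedef struct {
    int output; /* running output value, starts at 0 */
    int curve;  /* selected curve, 0..15, starts at 0 */
} AdpcmState;

/* Decode one 4-bit sign-and-magnitude code. `curves` is a flat table of
   16 curves x 16 codes, indexed as curves[curve * 16 + code]. */
int adpcm_decode_nibble(AdpcmState *s, const int16_t *curves, uint8_t nibble)
{
    nibble &= 0x0F;
    s->output += curves[s->curve * 16 + nibble]; /* step 1: next output  */
    s->curve += CURVE_DELTA[nibble & 7];         /* step 2: next curve   */
    if (s->curve < 0) s->curve = 0;
    if (s->curve > 15) s->curve = 15;
    return s->output;
}

/* Decode n_bytes of packed data into 2 * n_bytes samples.
   Assumption: the high nibble of each byte comes first. */
size_t adpcm_decode(const int16_t *curves, const uint8_t *in,
                    size_t n_bytes, int *out)
{
    AdpcmState s = { 0, 0 };
    size_t n = 0;
    for (size_t i = 0; i < n_bytes; i++) {
        out[n++] = adpcm_decode_nibble(&s, curves, (uint8_t)(in[i] >> 4));
        out[n++] = adpcm_decode_nibble(&s, curves, (uint8_t)(in[i] & 0x0F));
    }
    return n;
}

/* Hypothetical test table: magnitude * (curve + 1), negated when bit 3
   is set. These are placeholder values, not the chip's curves. */
void fill_toy_curves(int16_t *curves)
{
    for (int c = 0; c < 16; c++) {
        for (int code = 0; code < 16; code++) {
            int magnitude = (code & 7) * (c + 1);
            curves[c * 16 + code] = (int16_t)((code & 8) ? -magnitude : magnitude);
        }
    }
}
```

The accumulator is a plain `int` and is not clamped, so feeding it data meant for a different set of curves will let the output drift, much as the message describes.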

D
Originally Posted by Vag
...with the exception that on the 7759, the step size and output size are 9 bits, not 12, so scale everything down accordingly.

As I understand it, this would mean that the estimated (or predicted) value must be clamped to a two's complement 9-bit range (-256 to +255, or -255 to +255 if kept symmetric), and the values in the step table also cannot exceed that range. So, does that mean that the original sample value must be converted to 9 bits as well before the difference is calculated?
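One possible reading of this, offered only as a sketch: if the encoder takes its differences against the decoder's own reconstructed output, as Peter Benie's message describes, then the input would have to be brought into the same range as that output before comparing. The C code below assumes a 9-bit output range and scales 16-bit input by dividing by 128, which is a guess. It also swaps the quantiser from the block diagram for a brute-force search over all 16 codes, which is simpler to write but is not the method shown in the paper. The curve table is again supplied by the caller, and all names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h> /* abs */

static const int ENC_CURVE_DELTA[8] = { -1, -1, 0, 0, 1, 2, 2, 3 };

/* Assumed scaling of a 16-bit sample into the 9-bit range -256..255. */
static int to_9bit(int16_t sample)
{
    return sample / 128;
}

/* Greedy encoder: for each sample, try all 16 codes and keep the one whose
   decoded value lands closest to the target. The running `predicted` value
   is exactly what a decoder would reconstruct, so quantisation error is
   carried forward to the next sample instead of being lost.
   `curves` is a flat 16 x 16 table indexed as curves[curve * 16 + code].
   Writes one 4-bit code per output byte (unpacked) and returns n. */
size_t adpcm_encode(const int16_t *curves, const int16_t *pcm, size_t n,
                    uint8_t *codes)
{
    int predicted = 0; /* matches the decoder's initial output value */
    int curve = 0;     /* matches the decoder's initial curve */

    for (size_t i = 0; i < n; i++) {
        int target = to_9bit(pcm[i]);
        int best = 0;
        int best_err = abs(target - (predicted + curves[curve * 16]));

        for (int code = 1; code < 16; code++) {
            int err = abs(target - (predicted + curves[curve * 16 + code]));
            if (err < best_err) {
                best_err = err;
                best = code;
            }
        }

        /* Update state exactly as the decoder would. */
        predicted += curves[curve * 16 + best];
        curve += ENC_CURVE_DELTA[best & 7];
        if (curve < 0) curve = 0;
        if (curve > 15) curve = 15;

        codes[i] = (uint8_t)best;
    }
    return n;
}
```

Whether the real encoder worked on 9-bit, 12-bit or some other intermediate range is still an open question in this thread; only `to_9bit` would need to change to try a different assumption.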

Vag
To tell you the truth, that confused me. What I know is that the encoder needed 12-bit PCM as its source, which it converted to 4-bit ADPCM. The decoder would then convert that to 9-bit PCM; that's how I understand it. If that's the case, we must ignore the 9-bit part.
As for the noise, maybe the source sound must be converted to 12-bit first, and then converted to ADPCM with the Dialogic algorithm. I mean, maybe it's simple apart from that initial conversion. After all, if you import the ADPCM file into Audacity as raw VOX, the sound is clear, and VOX is really Dialogic ADPCM.
Unfortunately I have no time these days. I can't wait to try some more things, hopefully next week.

Vag
Hi everyone,
I haven't really done anything about the sounds for at least a week, but yesterday I decided to finish everything else.
I fixed all issues and finished all remaining details. The last remaining thing was the memory test. I found the checksum that is used in the memory test in the service menu and I updated it with the new one; the memory test finally passed. But then I saw that the game does have protection after all. The sound IDs are all reduced by one, so you hear the wrong sounds, and the players' controls are swapped! There are probably more surprises. So if you alter the ROMs, you either see the memory test fail, or you have a damaged game.
Well, I am hoping I can bypass this, by using balancing bytes. I'll try writing values to unused zero bytes that would change the new checksum, so that it remains the same as the original.
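The balancing idea can be sketched in C, with one big assumption: that the memory test uses a plain 16-bit sum of big-endian words. The actual routine is not known here and may differ (another word size, or separate sums per ROM chip when the data is split into even/odd ROMs), so `word_sum` and `balance_checksum` are hypothetical helpers showing the principle only.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed checksum: sum of big-endian 16-bit words, modulo 65536.
   A trailing odd byte, if any, is ignored. */
uint16_t word_sum(const uint8_t *rom, size_t len)
{
    uint16_t sum = 0;
    for (size_t i = 0; i + 1 < len; i += 2)
        sum = (uint16_t)(sum + (uint16_t)((rom[i] << 8) | rom[i + 1]));
    return sum;
}

/* Write a balancing word into an unused slot so that word_sum() returns
   `target` again. The slot must be at an even offset and currently hold
   two zero bytes. Returns 0 on success, -1 if the slot cannot be used. */
int balance_checksum(uint8_t *rom, size_t len, size_t offset, uint16_t target)
{
    if ((offset & 1) || offset + 1 >= len || rom[offset] || rom[offset + 1])
        return -1;

    /* The slot contributes 0 now, so the missing amount is the difference. */
    uint16_t fix = (uint16_t)(target - word_sum(rom, len));
    rom[offset] = (uint8_t)(fix >> 8);
    rom[offset + 1] = (uint8_t)(fix & 0xFF);
    return 0;
}
```

With an additive checksum a single free word is enough, since any 16-bit difference can be made up in one place. A different checksum type (XOR, CRC) would need a different balancing calculation.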

Anyway, that has nothing to do with the sounds... I can't stop thinking that the sound is clear when you import it as VOX (which is Dialogic ADPCM). I believe the key is that the encoder uses 12-bit PCM sounds and not 16-bit ones, like all the sounds I had been trying to convert. ChatGPT suggested that instead of converting a sound to 12-bit PCM, I could normalize it (at least for testing), so that the PCM values stay within the range of a 12-bit sound (4096 levels, i.e. -2048 to +2047 for signed samples). So yesterday I used Audacity to normalize a sound; in fact I reduced the volume by 24 dB (ChatGPT suggested 24 dB). Obviously the sound in the game is quiet, but it had significantly less hiss and noise! Maybe -24 dB wasn't enough and there were still values outside that range, and that's why there is still some noise. Most probably, this is the reason for the noise and the encoder must use 12-bit PCM sounds.
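As a quick check on the -24 dB figure: a gain of -24 dB is a factor of 10^(-24/20), about 0.0631, whereas dropping 4 bits is a factor of 1/16 = 0.0625, about -24.08 dB. A full-scale 16-bit sample therefore lands near 2067 after -24 dB, just outside a signed 12-bit range of -2048 to +2047. The C sketch below counts how many samples would still fall outside that range for a given gain; `count_outside_12bit` is a hypothetical helper, and truncation toward zero is an arbitrary choice.

```c
#include <stddef.h>
#include <stdint.h>

/* Count the 16-bit samples that, after multiplying by `gain`, fall outside
   the signed 12-bit range -2048..2047. The scaled value is truncated
   toward zero before the comparison. */
size_t count_outside_12bit(const int16_t *pcm, size_t n, double gain)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        int scaled = (int)(pcm[i] * gain);
        if (scaled < -2048 || scaled > 2047)
            count++;
    }
    return count;
}
```

Calling it with a gain of 0.0631 on a file that reaches full scale reports a few stragglers, while 0.0625 reports none, so a slightly larger reduction than 24 dB (or a plain shift by 4 bits) would be needed to stay strictly inside 12 bits.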
I'll search for existing tools that you can use to convert to 12-bit, for now :-)

D
Originally Posted by Vag
I'll search for existing tools that you can use to convert to 12-bit, for now :-)

In the example codecs I have looked at, one of the OKI encoders converts 16-bit samples to 12-bit like this:

Code
if (sample < 0x7ff8) {
    sample += 8; // round to nearest; skipped near full scale so the 16-bit value cannot overflow
}

sample >>= 4; // drop the low 4 bits: 16-bit to 12-bit

Processing the original samples with low pass or high pass filters will help, but the cause of the problem seems to be with the PCM values not being converted to the correct step table values. I saw in the datasheet that the 7730 can accept 16-bit samples via a separate ADC chip so converting to 12-bit may solve some problems but there may still be the "wraparound" effect that Peter Benie mentioned. I noticed that if I really bitcrushed the predicted values to a stupid value (7-bit?) then it actually played back in M1 without any noise, although the sample sounded very flat (bandwidth limited, like being played down a telephone line).

I am still looking into this, but I got distracted trying to fix some PowerPC code. I am having difficulty understanding how to determine the correct step table values, apart from guessing what the values might be and how many there are. This has always puzzled me: why does OKI use 49 values, but IMA use 89? These values all trace out a curve of some type, so it might be possible to work out an equation for a new curve and then try different combinations of values that fit it.

I think you should release the translation as soon as you have fixed the remaining issues, because the work you have done is very good, and go back to replace the samples later. I understand that it is important to have everything finished when you release the ROM, but the game is complete without it, and if you have limited time to work on ROM hacks it would be better to spend your time on the next System 16 game. By that time, maybe we will have been able to make some progress.
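On the step table question, one observation that may help: the OKI/Dialogic table appears to be a plain geometric curve, floor(16 x 1.1^n) for n = 0..48, which is how MAME's OKI ADPCM code generates it as far as I can tell. It tops out at 1552, enough for a 12-bit signal. The IMA table grows at roughly the same 10% per step but runs from 7 up to 32767 to cover 16 bits, which would explain why it needs 89 entries instead of 49. A minimal C sketch of such a generator follows; `geometric_steps` is a made-up name, and whether the NEC curves follow a similar rule is not known.

```c
#include <stddef.h>

/* Fill out[0..count-1] with floor(first * ratio^k) for k = 0..count-1.
   The cast to int acts as floor() because all values are positive. */
void geometric_steps(double first, double ratio, size_t count, int *out)
{
    double value = first;
    for (size_t k = 0; k < count; k++) {
        out[k] = (int)value;
        value *= ratio;
    }
}
```

Calling `geometric_steps(16.0, 1.1, 49, table)` gives a table that starts 16, 17, 19, 21 and ends at 1552. Trying other starting values, ratios and lengths is one way to explore candidate curves for a 9-bit output range.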

Vag
Hello everyone,
I still haven't done anything about the sound encoding... Instead, I managed to change the service menu and the settings submenu. I added three settings that are saved in NVRAM, so they are permanent. You can move the lever up/down to select a setting, or left/right to change its value. The three settings are the starting stage, the violence level and the music. The starting stage is operational: you select a stage and then you start a new game at that stage :-) The other two settings are not operational yet; I'm hoping this will change soon. In the Japanese version, you see blood dripping from the SELECT PLAYER letters, and also Ax Battler decapitating an enemy. In the violence setting, you can select none, one of them, or both. I haven't worked out yet how the game determines whether it's the Japanese or the English version. The last setting is for music during attract mode: the options will be no music, all songs, a random song, or a selected song. There is already a DIP switch setting for attract mode sounds; this will work independently.

I started looking at the music and I'm able to play the selected song. But I just realized that when I start another sound (like the sound I want in the logo), the music stops. On the other hand, the other sound effects in the attract mode don't affect the song. The way I play a sound is to write its ID (the number M1 also uses) to address FFECFC (FF6CFC or FF2CFC have the same effect). If I do that with a song, the song will play simultaneously with the other attract mode sounds. The problem is that I can't play another sound that way, because it will stop the song. It's as if they are on the same channel, and there can only be one. Does anyone know how I should play the sound independently?
I will also try to determine when the sound stops (so I can repeat the song, or go to the next one).

Moderated by  R. Belmont 