One is a music keyboard, the other is a game console. Let's keep it on-topic.
The topic was "uPD77X chips", and these are definitely part of that family, although it may be that only the release date of the NEC chips determined their number, since the Casiotone patent suggests a gate logic implementation rather than a von Neumann architecture with address and data bus. The pin count rarely tells much, since in many MCUs that share the same silicon die the unused pads in smaller package versions simply remain unbonded to reduce cost. (Pin order can be a stronger hint to the employed die.) At least I see none of the NEC uPD77X type numbers used for both a game and a keyboard CPU. I have examined these music LSI chips in considerable detail. This is what I found out.
Unusual is that in the Casiotone 201 the keyboard matrix is polled by both CPUs, which are mostly wired in parallel. Both CPUs do exactly the same but have (in the manner of SIMD vector computing) different sound ROM data. So each CPU outputs its polyphonic digital audio (which per channel is already made from 2 subvoices, each with a digital volume envelope) through its own 14 bit DAC and a switchable fixed analogue filter to modify the timbre. Finally the analogue outputs of both filters are mixed together to form the sound signal. The filter settings are static and do not change during envelopes (likely to avoid dependencies between polyphony channels). Apparently one CPU can use the normal and the other a spread chromatic tone scale to produce a chorus effect (phasing) when layered.
Most waveforms are composed of symmetrical straight and ramp sections and look quite geometric. Only the sine wave looks as round as the coarse step resolution permits. As with the later D931C (which was researched in detail by Robin Whittle), the sound generator apparently can not only mirror the waveform in hardware, but also skip either the positive or negative halves or even pass only every n-th wave cycle (i.e. a wave is followed by multiple wavelengths of silence), which creates the typical buzzy bass range known from squarewave based instruments. Unfortunately the waveforms employed in this mother of all Casiotones are mainly simple symmetric ramp patterns that don't sound too great. (Successors sound better.) I don't understand why Casio didn't use more asymmetric waveforms like a real sawtooth to imitate a trumpet.
The sound generation for the 8 polyphony channels is time multiplexed; thus, like in most later 1980s Casio keyboards, all register contents of the sound and envelope hardware are actually stored in 8 stage circular multi-bit shift registers. As a form of lightning fast hardware multitasking, they cycle to the next entry every clock step after processing each channel, so a task runs every 8th step and outputs its audio increment to an accumulator that finally sums them into a 14 bit DAC output value. This is also a Casio speciality - who knows if a foreign patent prevented them from using address counters for cycling through sound channels, but shift registers may also just be a proven concept from calculator design that they were most familiar with.
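The shift register multitasking described above can be sketched in Python. This is only a minimal illustration, not the actual chip logic: the per-channel fields (`phase`, `step`, `amplitude`) and the trivial half-wave pulse are hypothetical placeholders, and only the 8 channels, the rotation per clock step and the 14 bit accumulator come from the description.

```python
from collections import deque

NUM_CHANNELS = 8   # polyphony channels (from the text)
DAC_BITS = 14      # accumulator / DAC width (from the text)

def make_registers():
    """Build the circular register; the field names are hypothetical."""
    return deque(
        {"id": i, "phase": 0, "step": i + 1, "amplitude": 100}
        for i in range(NUM_CHANNELS)
    )

def run_frame(registers):
    """Run 8 clock steps (one full rotation) and return the summed DAC value."""
    accumulator = 0
    for _ in range(NUM_CHANNELS):
        channel = registers[0]  # entry currently at the register output
        # Placeholder sound logic: a pulse that is on for the first 16 of 32 steps.
        channel["phase"] = (channel["phase"] + channel["step"]) % 32
        if channel["phase"] < 16:
            accumulator += channel["amplitude"]
        registers.rotate(-1)    # shift: the next channel moves to the output
    return min(accumulator, (1 << DAC_BITS) - 1)  # clamp to the DAC range

registers = make_registers()
first = run_frame(registers)
second = run_frame(registers)
```

After each full rotation the register is back in its starting order, which is why no address counter is needed to know which channel comes next.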
The concept of this instrument is described in US patent 4283983 (particularly focusing on the user interface with tone memory). It is based on logic gates and not software controlled. The circuit for sound selection through keyboard keys (with waveforms and envelope hardware) is explained in detail in US patent 4348932, and the part for sounding a demo note (and improvements like layered sounds) in 4387619.
In the nicely detailed reference implementation of patent 4348932 the keyboard input from the key matrix decoder is demultiplexed, and then one line per key runs into a code converter (a simple sort of ROM without address decoder) which outputs for each key number a 6 bit note frequency and the 12 bit sound definition data for the preset sound selectable through that key. During sound selection the sound definition data is written into a register that controls the waveform and envelope generator. And the described envelope generator is truly bizarre, because for lack of multipliers it can not(!) change volume and waveform independently. Each waveform consists of straight and ramp sections (like sawtooth) of fixed steepness, so the amplitude can increase only by making that ramp either grow row by row ("fixed mode", like building a brick pyramid bottom-up) or rise peak-first vertically out of the zero line ("floating mode"). A waveform is always 32 steps long and up to 15 steps high. The 4 bit volume envelope consists of 5 linear sections {increase, transform, decrease 1..3} with different clock rates to roughly approximate a logarithmic shape. All envelope clocks are derived from the pitch clock. During "increase" (attack) the waveform always grows in fixed mode (like opening a voltage limiter); during "transform" at full height it morphs into another waveform (square, ramp) with intermediate shapes looking like one waveform cut out of the other (like y=min(wave1(t), wave2(t)) ). The "decrease" steps can either make the waveform shrink vertically like closing a voltage limiter (fixed mode), or make it sink into the zero line (floating mode, peaks stick out last); a square wave pulse can even shrink horizontally as well (floating mode, of course making the blank section longer to keep the same frequency).
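To make the two growth modes and the transform step more concrete, here is a rough Python sketch. It is my reading of the description and not the patent's circuit: the function names, the example triangle and the example square are hypothetical, and "fixed mode" is modelled as clipping (voltage limiter) while "floating mode" is modelled as shifting the shape down and clamping at zero.

```python
STEPS = 32        # waveform length (from the text)
MAX_HEIGHT = 15   # maximum waveform height (from the text)

def triangle():
    """Example shape: ascending ramp in the first half, descending in the second."""
    up = list(range(16))          # 0..15
    return up + up[::-1]          # 32 steps in total

def square():
    """Example shape: first half zero, second half at full height (hypothetical width)."""
    return [0] * 16 + [MAX_HEIGHT] * 16

def fixed_mode(wave, level):
    """Clip at `level`: the shape grows row by row from the bottom."""
    return [min(s, level) for s in wave]

def floating_mode(wave, level):
    """Shift down so that only the top `level` rows stick out of the zero line."""
    return [max(s - (MAX_HEIGHT - level), 0) for s in wave]

def transform(wave_a, wave_b):
    """Intermediate morph shape: y = min(wave1(t), wave2(t))."""
    return [min(a, b) for a, b in zip(wave_a, wave_b)]

tri = triangle()
clipped = fixed_mode(tri, 4)      # flat-topped trapezoid, 4 rows high
sunk = floating_mode(tri, 4)      # narrow peak, 4 rows high
morphed = transform(tri, square())
```

At the same envelope level both modes reach the same peak height, but the fixed mode shape stays wide with a flat top while the floating mode shape is only a narrow tip, so the two sound different although the "volume" value is identical.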
The whole morphing waveform generator works by adding/subtracting steps at a certain clock rate to the amplitude; what is done in which section is switched by gates at certain waveform step numbers (of 0..31) and by a comparator that compares the actual amplitude with the waveform step number and so switches between addition and subtraction of clock pulses to the amplitude counter. Both halves (the part before and after step 16) of a waveform can morph independently, but a ramp in the first half is always ascending and in the 2nd half descending, so they can form a triangular wave. A square in the 1st half stays zero and has the pulse at the start of the 2nd. The described preset sound definition uses 2 bits for 4 settings of the fixed analogue filter, 3 bits to select 5 different envelopes, 5 bits to select 18 different waveforms and 2 bits to select 3 octave shifter settings. The output DAC has only 7 bits.
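The field widths listed above add up to 2 + 3 + 5 + 2 = 12 bits, which matches the 12 bit sound definition data. A small Python sketch of such a packed word follows. Note that the bit order (filter in the top bits, octave in the bottom bits) and all names are assumptions made for illustration only; the order actually used in the patent is not stated here.

```python
from typing import NamedTuple

class Preset(NamedTuple):
    filter_setting: int  # 2 bits, 4 settings
    envelope: int        # 3 bits, 5 of the 8 codes used
    waveform: int        # 5 bits, 18 of the 32 codes used
    octave: int          # 2 bits, 3 of the 4 codes used

def unpack_preset(word: int) -> Preset:
    """Split a 12 bit word into its fields (assumed bit order, see above)."""
    if not 0 <= word < (1 << 12):
        raise ValueError("preset word must fit in 12 bits")
    return Preset(
        filter_setting=(word >> 10) & 0b11,
        envelope=(word >> 7) & 0b111,
        waveform=(word >> 2) & 0b11111,
        octave=word & 0b11,
    )

def pack_preset(p: Preset) -> int:
    """Inverse of unpack_preset, using the same assumed bit order."""
    return (p.filter_setting << 10) | (p.envelope << 7) | (p.waveform << 2) | p.octave

example = unpack_preset(0b10_011_10001_01)
```

The unused codes (3 envelope codes, 14 waveform codes, 1 octave code) show how much headroom such a format would leave for additional sounds.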
But this reference implementation differs substantially from the finished instrument. E.g. it lacks vibrato, sustain pedal and tone memory, and supports only 48 keys. The "Tone" switch is digital (i.e. 2 preset sounds for each key are stored in the code converter, for up to 96 sounds in total). The "Tone" and sound select switches are outside the keyboard matrix. And instead of one demo note it even sounds a sequence of 3 (C4, C#4, D4) - a gimmick that was not implemented until the (technically very different) CT-8000 of the Symphonytron stage organ. The reference implementation even seems to use only positive half waves and may lack the mirroring mode for symmetric waveforms. The actual Casiotone 201 IC supports 8 instead of only 2 filter control outputs, and its envelope generator definitely can change amplitude (likely logarithmically, using a ROM lookup table) without morphing the waveform. It doesn't seem to distort all attacks with a simulated voltage limiter envelope, but rather adds a 2nd waveform (symmetric, i.e. a quarter-wave definition read out of ROM?) with a short decay envelope to imitate things like the string pluck noise of acoustic instruments. Several of these short waveform blips are easter eggs on the black keys, as is an unused sine wave, which may even be there for internal computation, because US patent 4453440 mentions a fast multiplication method based on subtracting 2 phase shifted versions of the same sine wave read from a ROM lookup table. Possibly the bit shift multiplication network from US patent 4590838 (D931C predecessor) is also used. While I don't see the pure morphing envelopes from patent 4348932, various preset sounds employ trapezoid waveforms that may indeed be based on a triangular wave truncated by a vertically mirrored static version of said "fixed mode". In other sounds it even morphs a triangular wave by the "floating mode" sinking motion.
So it may be that Casio indeed layered a modified version of their morphing waveform generator as the "vowel" with a technically different, ROM based attack waveform as the "consonant", hence the name "Consonant Vowel Synthesis".
The concept of geometric waveforms that gradually change shape by growing out of the floor or being truncated from one side to modify timbre has very strong similarities with the "paper sound" technique, which exposed paper cut shapes as waveforms on film to be played as the sound track in a film projector. Here too, timbres were modified by gradually cutting shapes by means of stop-motion cartoon animation (e.g. by moving a black shade over parts of the bright waveform). Based on the same idea, the Russian Evgeny Sholpo created in 1930 the optical synthesizer Variophone - a mechanical contraption with waveforms on exchangeable spinning cardboard tonewheels that was used to compose polyphonic music for cartoon movies that sounded surprisingly similar to chiptunes.
Even though a Casiotone 201 doesn't sound overly great, it would be fascinating to simulate the original morphing waveform envelope generator of the patent 4348932 prototype in software (or perhaps even FPGA) to explore what it sounded like. This was the mother of all Casio keyboards, and this bizarre piece of minimalistic gate logic design without multiplication is so Pong-age - a weird chip invention like Atari Video Music that deserves to be preserved.
US patent 4387619 poorly describes a later variant of said reference implementation; the sound and envelope generator here supports 84 keys, an 8 bit DAC, vibrato and sustain. Its implementation is much more complicated (twice the schematics size) with plenty of multiplexing, involving e.g. instead of the code converter a ROM followed by a bunch of gate logic to translate its 8 bit output into 13 control bits. The main reason for this was likely to implement a so-called "staggered multi-performance mode", i.e. a preset sound can consist of multiple layered subvoices occupying 2 or 4 polyphony channels ("duet", "quartet" - a feature that was not released until the much later "unison" modes in the CPU controlled Casio CT-6000), which needs independent management of key presses and polyphony channels. In an LSI chip without software control the routing is quite a mess - involving plenty of additional shift registers to memorize which key press belongs to which sounding note and such stuff. The master clock steps the shift registers at 1 MHz, which permits rapid 8 µs polyphony multitasking. The preset sound definition in ROM here uses 2 bit attack, 2 bit release, 2 bit period ( = pitch?), 1 bit delay, 3 bit waveshape designation (1 bit = fixed/floating, 2 bits select sawtooth, rectangle, triangle), 1 bit vibrato, 1 bit octave. But this system looks even more restricted than the first implementation - allowing only linear attack-decay envelopes without held notes (this resembles the MT-30), and apparently the "floating" triangular wave is gone. Worst of all, it lacks the transform effect. At least waveforms here have 64 steps (center at 30) with 30 step height, and the delay bit can make the triangular wave asymmetric by slowing down the attack. Unfortunately this patent text doesn't explain much, but rather describes wiring in lengthy sentences and omits details (like the ROM data format), which makes it hard to understand.
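The 8 µs figure follows directly from the 1 MHz clock and the 8 polyphony channels, as this small Python calculation shows. The function names are made up for illustration, and the assumption that each channel gets exactly one clock step per rotation is taken from the shift register scheme described earlier, not from the patent text itself.

```python
MASTER_CLOCK_HZ = 1_000_000   # shift register clock (from the text)
NUM_CHANNELS = 8              # polyphony channels (from the text)

def slot_time_us(clock_hz):
    """Duration of one clock step (one channel slot) in microseconds."""
    return 1e6 / clock_hz

def channel_period_us(clock_hz, channels):
    """Time until the same channel is processed again."""
    return channels * slot_time_us(clock_hz)

def channel_rate_hz(clock_hz, channels):
    """How often per second each single channel gets its turn."""
    return clock_hz / channels

period = channel_period_us(MASTER_CLOCK_HZ, NUM_CHANNELS)   # 8.0 microseconds
rate = channel_rate_hz(MASTER_CLOCK_HZ, NUM_CHANNELS)       # 125000.0 per second
```

So under this assumption each channel is serviced 125,000 times per second.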
The key feature of this circuit is apparently the "staggered multi performance" mode, which loads 1 or 3 additional preset sound definitions from ROM as subvoices; 2 bits "minute difference" (+1/64, -1/64) apparently allow these to be detuned for a chorus effect, and 2 other bits set their octave range. This sound generator supports 7 octaves, but the highest one cheats by only repeating the 6th octave (foldback) with the waveform changed to "floating sawtooth". The delay bit here apparently can delay a subvoice, and when multiple subvoices set this bit in "quartet" mode, the delays accumulate to make them sound one after another. According to the text, the circuit is apparently multitimbral enough that selecting a new preset sound would not affect held notes (the 201 can't do this), and other parts (mentioning "12 scales", perhaps a poor translation) sound as if it can even transpose. The demo note (named "sample tone") here is indeed a single note. The digital vibrato is implemented by making a frequency derived from the clock add and subtract 1/64 (i.e. 1 waveform step?) to the waveform.
But this implementation does not describe a finished instrument either. E.g. it supports 84 keys, does not have preset sound selection through note keys (only mentioned as an optional variant), has no filters, and has switches for vibrato, sustain (aka hold pedal?) and demo note on/off placed outside the keyboard matrix. Interesting is that the "minute difference" effect in layered preset sounds uses for chorus the same addition of +/-1/64 as the vibrato circuit. In the actual 201 the phasing disappears (at least on one channel on my scope) when vibrato or spread scale is on, so they obviously share the same internal resources. However the 201 does not reduce polyphony in any preset sound, which proves that it is not based on the "staggered multi performance mode". Interesting is also that Casio refers to internal polyphony channels as "lines" - a term that was later used for PD synthesis.
At this time Casio patented a lot of things that didn't make it into the final instrument. E.g. US patent 4476766 (priority date 1980) describes a key split mode with a simple "any key play" sequencer to manually step with 2 key groups through a previously recorded main and "accompaniment" (rather obbligato) voice, one per group. The note data is stored here as key matrix signals (12 keys in each of 4 octave groups) in 2 RAMs. Although with timers it could recognize simultaneous notes (e.g. chords) to be stored as one step, it did not store note or pause duration and thus could not do autoplay, nor was there an edit function. The illustration drawings blatantly resemble the 201. US patents 4522100 and 4594931 were a simpler variant without 2 simultaneous timbres; US patent 4361067 even describes a variant with playback volume change using different keys (instead of velocity) during "any key play". These 3 could also set the keysplit point through a simultaneous switch + key press (not bad for simple gate logic). Although these patents already mention loading note data from external means like barcode or RAM packs, the first remotely similar feature in a finished product was the "one key play" in the monophonic Casio VL-1 (which had edit and autoplay), and even the barcode sequencer of the VL-5 was still monophonic.