Chinese translation help needed #115411 06/09/19 04:04 PM
R. Belmont (OP)
Very Senior Member
Joined: Mar 2001
Posts: 16,274
I've got some actual Chinese Apple II-clone software, which came with Chinese filenames. I've done a translation pass with Google Translate's visual translator on my iPhone and most of the translations appear sensible (a few obviously aren't), but I'd like someone to proofread it.

小学语文01.dsk (Primary Language)
小学语文02.dsk
股票咨询系统(000000).dsk (Stock Advisory System)
成龙.dsk (Jackie Chan)
飞斧神童01.dsk (Prodigy)
飞斧神童02.dsk
超级汉字文章编辑.dsk (Super Chinese characters article edit)
跟我学6502汇编.dsk (Learn With Me 6502 Assembly)
软件目录编印工具V2.2.dsk (Software Catalog Published Tools)
福尔摩斯-01.dsk (Sherlock Holmes)
福尔摩斯-02.dsk
生物钟曲线及性格咨询.dsk (Biological and Character Consulting)
中华机打印盒.dsk (Print)
通用数据库管理系统(YES) (1).dsk (Universal Database Management System)
电脑算命.dsk (Computer Fortune)
中华机系统盘cec-dos.dsk (System Disk CEC-DOS)
电子线路计算机辅助设计-01.dsk (Electronic Circuit Computer Aided Design)
电子线路计算机辅助设计-02.dsk
音乐黑板.dsk (Music)
计算机咨询服务.dsk (Computer Consulting Services)

Re: Chinese translation help needed [Re: R. Belmont] #115414 06/10/19 07:48 AM
Vas Crabb
Very Senior Member
Joined: Feb 2004
Posts: 2,055
Well, just glancing at it, there are some obvious poor translations there. For example "小学语文" is roughly "primary school language and writing", "软件目录编印工具" is "software catalog printing tool", "生物钟曲线及性格咨询" is something like "biological clock and personality counseling", and "音乐黑板" is "music blackboard". But we're trying to move away from subjective English translations and use Pinyin for titles primarily aimed at Mandarin-speaking markets and Jyutping for titles primarily aimed at Cantonese-speaking markets. This is how the software lists for Haze's Chinese consoles are being done, and arcade stuff is being changed slowly.

Here's Pinyin for these titles (assuming the Chinese is correct) - for filenames just strip off the diacritics and replace spaces with underscores:
  • 小学语文01.dsk "Xiǎoxué Yǔwén 01"
  • 小学语文02.dsk "Xiǎoxué Yǔwén 02"
  • 股票咨询系统(000000).dsk "Gǔpiào Zīxún Xìtǒng (000000)"
  • 成龙.dsk "Chénglóng"
  • 飞斧神童01.dsk "Fēi Fǔ Shéntóng 01"
  • 飞斧神童02.dsk "Fēi Fǔ Shéntóng 02"
  • 超级汉字文章编辑.dsk "Chāojí Hànzì Wénzhāng Biānjí"
  • 跟我学6502汇编.dsk "Gēn Wǒ Xué 6502 Huìbiān"
  • 软件目录编印工具V2.2.dsk "Ruǎnjiàn Mùlù Biān Yìn Gōngjù V2.2"
  • 福尔摩斯-01.dsk "Fú'ěrmósī-01"
  • 福尔摩斯-02.dsk "Fú'ěrmósī-02"
  • 生物钟曲线及性格咨询.dsk "Shēngwùzhōng Qūxiàn Jí Xìnggé Zīxún"
  • 中华机打印盒.dsk "Zhōnghuá Jī Dǎyìn Hé"
  • 通用数据库管理系统(YES) (1).dsk "Tōngyòng Shùjùkù Guǎnlǐ Xìtǒng (YES) (1)"
  • 电脑算命.dsk "Diànnǎo Suànmìng"
  • 中华机系统盘cec-dos.dsk "Zhōnghuá Jī Xìtǒng Pán CEC-DOS"
  • 电子线路计算机辅助设计-01.dsk "Diànzǐ Xiànlù Jìsuànjī Fǔzhù Shèjì 01"
  • 电子线路计算机辅助设计-02.dsk "Diànzǐ Xiànlù Jìsuànjī Fǔzhù Shèjì 02"
  • 音乐黑板.dsk "Yīnyuè Hēibǎn"
  • 计算机咨询服务.dsk "Jìsuànjī Zīxún Fúwù"
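
The "strip off the diacritics and replace spaces with underscores" step can be sketched in a few lines of Python. This is only an illustration, not a tool MAME ships; the function name `pinyin_to_filename` is made up here. It relies on Unicode NFD decomposition, which splits a letter like "ǎ" into "a" plus a combining mark that can then be dropped.

```python
import unicodedata


def pinyin_to_filename(title):
    """Turn a Pinyin title into a plain-ASCII, underscore-separated label.

    Hypothetical helper for illustration only.
    """
    # NFD splits each accented letter into base letter + combining mark(s).
    decomposed = unicodedata.normalize("NFD", title)
    # Drop every combining mark, keeping the base letters.
    # Caveat: this also removes the diaeresis of "ü", so "nǚ" becomes "nu".
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.replace(" ", "_")


print(pinyin_to_filename("Xiǎoxué Yǔwén 01"))  # Xiaoxue_Yuwen_01
print(pinyin_to_filename("Fú'ěrmósī-01"))      # Fu'ermosi-01
```

Note that the apostrophe in "Fú'ěrmósī" survives: it is 7-bit ASCII, but it is also a shell metacharacter, so it may still need quoting on a command line.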

Re: Chinese translation help needed [Re: R. Belmont] #115415 06/10/19 08:46 AM
xinyingho
Senior Member
Joined: Dec 2013
Posts: 128
小学语文01.dsk (Language at Primary School, presumably Chinese lessons for primary school)
小学语文02.dsk
股票咨询系统(000000).dsk (Equity Investment Advisory System)
成龙.dsk (Jackie Chan)
飞斧神童01.dsk (literally "Flying Ax Child Prodigy", the title of an old Japanese movie from the 60s and an old HK comic from the 90s; I don't know if they're related)
飞斧神童02.dsk
超级汉字文章编辑.dsk (Super Chinese Text Editor)
跟我学6502汇编.dsk (Learn 6502 Assembly With Me)
软件目录编印工具V2.2.dsk (Software Catalog Publication Tool)
福尔摩斯-01.dsk (Sherlock Holmes)
福尔摩斯-02.dsk
生物钟曲线及性格咨询.dsk (Biological Clock Curves and Personality Counseling; a slightly literal translation, as "curves" should be understood as plotted results of maths functions)
中华机打印盒.dsk (Print Cartridges for "Chinese Machine", "Chinese Machine" may refer to the Chinese Apple II-clone so maybe a driver software?)
通用数据库管理系统(YES) (1).dsk (Universal Database Management System)
电脑算命.dsk (Computer Fortune-Telling)
中华机系统盘cec-dos.dsk ("Chinese Machine" System Hard-Disk CEC-DOS)
电子线路计算机辅助设计-01.dsk (Electronic Circuit Computer Aided Design)
电子线路计算机辅助设计-02.dsk
音乐黑板.dsk (Music Blackboard, probably software for learning music theory)
计算机咨询服务.dsk (Computer Consulting Services)

Edit: about Pinyin, I would change a few lines from Vas Crabb's versions (mainly where spaces should be inserted to mark word boundaries):
飞斧神童01.dsk "Fēifǔ Shéntóng 01"
飞斧神童02.dsk "Fēifǔ Shéntóng 02"
软件目录编印工具V2.2.dsk "Ruǎnjiàn Mùlù Biānyìn Gōngjù V2.2"
中华机打印盒.dsk "Zhōnghuájī Dǎyìnhé"
中华机系统盘cec-dos.dsk "Zhōnghuájī Xìtǒngpán CEC-DOS"

Last edited by xinyingho; 06/10/19 08:58 AM.
Re: Chinese translation help needed [Re: R. Belmont] #115416 06/10/19 11:46 AM
R. Belmont (OP)
Thanks to both of you! I'll get that software list put together now.

Re: Chinese translation help needed [Re: R. Belmont] #115418 06/10/19 12:34 PM
xinyingho
This is for a new software list? I don't think it would be useful to stick with Pinyin (the Mandarin Chinese romanisation system) for titles; it would be better to keep the original Chinese titles and add a proposed English translation as metadata.

The practice in China is to translate everything into Chinese, even old games that were never released there. When I talk with Chinese people, they usually only know the Chinese names. For instance, the NES / Famicom is only known as 红白机 Hóngbáijī, literally the "red and white machine", referring to the Famicom's colour scheme, although it was never released in China, which was still a third-world country at the time. As I understand it, the reasoning is to let people who can't speak English refer to things in a native way.

So, the issue with having only Pinyin, particularly when stripped of its diacritics, is that nobody can understand the titles. English-only speakers could read them, badly, but not understand them. Chinese-only speakers would sometimes be able to read and understand them, and at other times would have no clue what they're about. Unlike non-tonal languages such as English, Japanese and Korean, Chinese is a tonal language, and the Pinyin diacritics are important clues for guessing which Chinese characters a Pinyin word refers to. For titles it's even more difficult, because titles are short phrases without context. It can even be very hard to translate a Chinese title into English without some context. So, for Chinese speakers, going from Chinese characters to Pinyin and then removing the diacritics adds two layers of ambiguity.
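
A small Python sketch of the second layer of ambiguity: once the tone marks are removed, words that are distinct in Pinyin collapse onto the same spelling. The four words are common textbook examples chosen for illustration, not titles from this thread, and `strip_tones` is a made-up helper name.

```python
import unicodedata
from collections import defaultdict


def strip_tones(pinyin):
    # Decompose, then drop combining marks (the tone diacritics).
    decomposed = unicodedata.normalize("NFD", pinyin)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# Illustrative word list: hanzi -> Pinyin with tone marks.
WORDS = {
    "买": "mǎi",         # to buy
    "卖": "mài",         # to sell
    "睡觉": "shuìjiào",  # to sleep
    "水饺": "shuǐjiǎo",  # boiled dumplings
}

collisions = defaultdict(list)
for hanzi, pinyin in WORDS.items():
    collisions[strip_tones(pinyin)].append(hanzi)

for toneless, candidates in collisions.items():
    print(toneless, "->", candidates)
# mai -> ['买', '卖']
# shuijiao -> ['睡觉', '水饺']
```

With tones, each Pinyin spelling here still points at one word; without them, the reader has to guess from context, which a short title rarely provides.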

In short, to maximise usability, it's important to keep the titles in Chinese characters somewhere for Chinese people, along with translated English equivalents (however imperfect they may be), as MAME is primarily developed and used by English-friendly people.

Edit: I would add that it's equally important for Japanese titles to be kept in kanji and kana somewhere as well. During the 20th century, China and Japan both had political programmes that would have led to giving up Chinese characters in favour of Latin characters. It never actually happened, for several practical reasons, some of which I listed above.

Last edited by xinyingho; 06/10/19 12:43 PM.
Re: Chinese translation help needed [Re: R. Belmont] #115419 06/11/19 04:12 AM
Vas Crabb
There are practical reasons for using pinyin in software lists. Also, there are three different fields in the software list that you seem to be conflating: the description, the alternate title, and the ROM/image labels. The ROM/image labels are used by MAME as the filenames it searches for, and we restrict these to 7-bit ASCII (this is where we use pinyin with the diacritics stripped). The description is what's displayed in the internal UI, and also by command-line verbs like -listsoftware (we use Pinyin with diacritics intact here, for reasons I'll explain). The alternate title uses the original Chinese name of the software.

For filenames, we want to stick with stuff that's relatively easy for anyone to type and use in a command-line interface. Anything outside 7-bit ASCII will become difficult for someone to deal with. (Even within 7-bit ASCII there are characters that are prohibited by different filesystems and/or operating systems, and characters that are interpreted by shells. We're strict about prohibiting these in system/device ROM labels, but not so strict about prohibiting them in labels in software lists.)
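
As a rough illustration of the kind of check this implies, here is a minimal Python sketch that flags labels outside 7-bit ASCII or containing characters Windows refuses in filenames. The function name `label_problems` and the exact character set are assumptions made for this example; this is not MAME's actual validation rule.

```python
# Characters Windows does not allow in filenames. An assumed set for
# illustration, not the list MAME itself enforces.
WINDOWS_RESERVED = set('<>:"/\\|?*')


def label_problems(label):
    """Return a list of human-readable problems found in an image label."""
    problems = []
    if not label.isascii():
        problems.append("contains non-ASCII characters")
    bad = sorted(set(label) & WINDOWS_RESERVED)
    if bad:
        problems.append("contains reserved characters: " + " ".join(bad))
    if any(ord(ch) < 0x20 or ord(ch) == 0x7F for ch in label):
        problems.append("contains control characters")
    return problems


print(label_problems("xiaoxue_yuwen_01.dsk"))  # []
print(label_problems("小学语文01.dsk"))         # ['contains non-ASCII characters']
print(label_problems("what?.dsk"))             # ['contains reserved characters: ?']
```

Shell metacharacters such as spaces, quotes and parentheses are deliberately not rejected here, which loosely mirrors the "not so strict" stance for software list labels.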

As for the description, we need people who don't understand the language to be able to recognise it, and to be able to refer to it over IRC, over the 'phone, or in a conversation. For better or for worse, Latin is the most-recognised alphabet in the world at the moment. Most people are going to be able to make some sense of a Latin transliteration. You and I can recognise a description written in Chinese Hanzi at a glance, but to the majority of the people on the team, a Hanzi (or Japanese kanji/kana, or Indian or Thai script) description is an impenetrable scribble. We do this for Japanese, Russian, etc. as well. There are also practical considerations for this.

On Linux and Mac (and certain versions of Windows 10 where it's broken), MAME doesn't support per-character font substitution. This means that Chinese descriptions won't display at all in the internal UI unless a font containing Chinese characters is explicitly selected. Also, there's no practical way for us to output Unicode to the Windows command prompt, so -listsoftware etc. won't work with Chinese descriptions. Pinyin with diacritics will get mangled as well, but at least you can see the initials/finals so you have some chance of guessing. (Bletch converted imgtool or something to use wchar_t output streams, and it's more broken on Windows than it was before - you get NUL characters interleaved everywhere. We can't convert MAME itself until we know it's going to work properly.)
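
The "mangled, but you can still see the initials/finals" effect is easy to reproduce. The Python sketch below encodes a title as UTF-8 and then decodes the bytes with a single-byte codepage, which is roughly what a console does when it misinterprets the output. Windows-1252 is assumed as the ANSI codepage purely for the example; the real one depends on the system locale.

```python
def as_seen_in_codepage(text, codepage="cp1252"):
    """Show how UTF-8 output looks when read as a single-byte codepage.

    cp1252 is an assumed default; bytes it does not define become U+FFFD.
    """
    return text.encode("utf-8").decode(codepage, errors="replace")


pinyin = as_seen_in_codepage("Yīnyuè Hēibǎn")
hanzi = as_seen_in_codepage("音乐黑板")

# The unaccented Latin letters pass through untouched...
print("".join(ch for ch in pinyin if ch.isascii()))  # Ynyu Hibn
# ...while nothing readable is left of the Hanzi title.
print(any(ch.isascii() for ch in hanzi))             # False
```

Each accented letter turns into two junk characters, but the surrounding consonants and plain vowels stay legible, whereas every byte of the Hanzi title falls outside ASCII.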

The (kind of poorly named) alternate title in the alt_title attribute is not subject to these considerations, so you can put the original Chinese (or Japanese, Thai, Khmer, Arabic, etc.) title there. You want an option in your front-end to display alternate titles by default for software.
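
To make the three fields concrete, here is a hedged Python sketch that builds one software list entry with `xml.etree.ElementTree`. The element layout follows the software list format as I understand it; the short name `yinyuehb` and the image label are invented for the example, and a real entry also carries year, publisher, size and hash data that are left out because they aren't known here.

```python
import xml.etree.ElementTree as ET


def make_entry(shortname, description, alt_title, image_label):
    """Build a minimal, illustrative <software> element (not a complete entry)."""
    software = ET.Element("software", name=shortname)
    # Description: Pinyin with diacritics, shown in the UI and by -listsoftware.
    ET.SubElement(software, "description").text = description
    # Alternate title: the original Chinese name.
    ET.SubElement(software, "info", name="alt_title", value=alt_title)
    part = ET.SubElement(software, "part", name="flop1", interface="floppy_5_25")
    dataarea = ET.SubElement(part, "dataarea", name="flop")
    # Image label: 7-bit ASCII, Pinyin with diacritics stripped.
    ET.SubElement(dataarea, "rom", name=image_label)
    return software


# All four argument values below are hypothetical.
entry = make_entry("yinyuehb", "Yīnyuè Hēibǎn", "音乐黑板", "yinyue_heiban.dsk")
print(ET.tostring(entry, encoding="unicode"))
```

A front-end that prefers original titles would read the `alt_title` value and fall back to the description when it is absent.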

You can see practical examples of how we do different languages in existing software lists:


There are others as well, but I'm not going to list them all here.

Re: Chinese translation help needed [Re: Vas Crabb] #115420 06/11/19 07:08 AM
xinyingho
Thanks for the explanations. The current choices are quite sensible.

Originally Posted by Vas Crabb
Bletch converted imgtool or something to use wchar_t output streams, and it's more broken on Windows than it was before - you get NUL characters interleaved everywhere. We can't convert MAME itself until we know it's going to work properly.

wchar_t is compiler-dependent and, on Visual C++, is expected to be used with functions that work with little-endian UTF-16. UTF-16 encodes every character in at least 2 bytes, so ASCII characters are represented with 1 meaningful byte and 1 zeroed byte, while most Chinese characters (hanzi / kanji) use both bytes. I don't know how Unicode characters that need 3-4 bytes to be encoded are handled.
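
For reference, the byte layouts can be checked quickly in Python, which exposes little-endian UTF-16 as the `utf-16-le` codec. This only shows the encoding itself, not what the Visual C++ runtime does with it. Characters outside the Basic Multilingual Plane are encoded as a surrogate pair, i.e. two 2-byte code units.

```python
def utf16le_hex(ch):
    """Hex dump of one character encoded as little-endian UTF-16."""
    return ch.encode("utf-16-le").hex()


samples = {
    "A": "ASCII letter",
    "中": "common hanzi (U+4E2D)",
    "\U00020000": "rare hanzi outside the BMP (U+20000)",
}

for ch, note in samples.items():
    print(f"{note}: {utf16le_hex(ch)}")
# ASCII letter: 4100
# common hanzi (U+4E2D): 2d4e
# rare hanzi outside the BMP (U+20000): 40d800dc
```

The trailing `00` in the first line is the zeroed byte that shows up as a NUL when UTF-16 output is read as byte-oriented text.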

Originally Posted by Vas Crabb
The (kind of poorly named) alternate title in the alt_title attribute is not subject to these considerations, so you can put the original Chinese (or Japanese, Thai, Khmer, Arabic, etc.) title there. You want an option in your front-end to display alternate titles by default for software.

Then it could be useful to have an additional tag, alongside info/@alt_title, for a subjective English translation; maybe an info/@en_translation tag?

Re: Chinese translation help needed [Re: xinyingho] #115422 06/11/19 10:41 AM
Vas Crabb
Originally Posted by xinyingho
wchar_t is compiler-dependent and, on Visual C++, is expected to be used with functions that work with little-endian UTF-16. UTF-16 encodes every character in at least 2 bytes, so ASCII characters are represented with 1 meaningful byte and 1 zeroed byte, while most Chinese characters (hanzi / kanji) use both bytes. I don't know how Unicode characters that need 3-4 bytes to be encoded are handled.

On Windows, wchar_t is UTF-16, and on Linux and most other platforms it's UCS-4. MAME internally uses UTF-8 strings, but we have utility functions for converting to/from wchar_t. If you use std::wcout or std::wcerr, the standard library should do the right thing to get the wide characters onto the terminal or into the output file if redirected. It works properly on Linux and macOS, but for some reason on Windows it doesn't. Due to issues in the MinGW libraries or the Windows console subsystem itself, you get NUL characters all through the output. It's pretty easy to see by redirecting imgtool output to a file. MAME's current behaviour is to just output UTF-8 and allow it to be interpreted using the current ANSI codepage. It's broken, but it's less broken than what happens when you try to do the right thing and use std::wcout/std::wcerr. It also makes the -list* verbs produce valid UTF-8 when redirected to a file.

Re: Chinese translation help needed [Re: R. Belmont] #115423 06/11/19 11:37 AM
R. Belmont (OP)
This is the MS-blessed way to output UTF-16 to the console:

https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/setmode?view=vs-2019

You set the mode to UTF-16 (_O_U16TEXT) with _setmode() and then you can output with wprintf() or the wide iostreams (std::wcout/std::wcerr).

I did something entirely different in M1 to print Japanese characters on the Windows console but I don't remember what.

