Unicode in RISC OS

Unicode is a means of representing characters from many of the world's writing systems so that they can be manipulated in a uniform manner.

RISC OS 4
RISC OS 4 does not support Unicode. Its only addition is an ISO 8859-15 (Latin-9) character set, which provides the Euro symbol at code 0xA4. However, the implementation is not strictly correct: in Latin-1 it also replaces the international currency symbol at 0xA4 with a Euro. The assignments of codes 0x80 and 0xA4 compare as follows:

Character set                   0x80            0xA4
ISO 8859-1 (Latin-1)            undefined       int'l currency symbol
RISC OS 3 Latin-1               undefined (*)   int'l currency symbol
RISC OS 4 Latin-1               Euro            Euro
RISC OS 5 Latin-1               Euro            int'l currency symbol
Microsoft Latin-1 (CP1252)      Euro            int'l currency symbol
ISO 8859-15 (Latin-9)           undefined       Euro
RISC OS 5 Latin-9               undefined       Euro
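The three non-RISC OS rows of this comparison can be checked with Python's standard codecs, as in the minimal sketch below. It is an illustration only: Python has no codecs for the RISC OS alphabets, so those rows are not covered, and the names CODECS, describe and build_table are made up for this example. Note also that Python's ISO 8859 codecs map 0x80 to the C1 control character U+0080, where the table says "undefined".

```python
# Decode the two disputed byte values with Python's built-in codecs.
# CODECS, describe and build_table are illustrative names, not part of RISC OS.
CODECS = {
    "ISO 8859-1 (Latin-1)": "iso8859-1",
    "Microsoft Latin-1 (CP1252)": "cp1252",
    "ISO 8859-15 (Latin-9)": "iso8859-15",
}


def describe(byte_value, codec):
    """Return the Unicode code point a single byte decodes to, as 'U+XXXX'."""
    ch = bytes([byte_value]).decode(codec)
    return f"U+{ord(ch):04X}"


def build_table():
    """Map each character set name to its decoding of bytes 0x80 and 0xA4."""
    return {
        name: {b: describe(b, codec) for b in (0x80, 0xA4)}
        for name, codec in CODECS.items()
    }


if __name__ == "__main__":
    # U+20AC is the Euro sign; U+00A4 is the international currency sign.
    for name, row in build_table().items():
        print(f"{name:30} 0x80 -> {row[0x80]}   0xA4 -> {row[0xA4]}")
```

In this sketch U+20AC appears at 0x80 only for CP1252 and at 0xA4 only for Latin-9, which matches the corresponding rows above.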

RISC OS 5
RISC OS 5 provides a Unicode Font Manager which can display Unicode characters and accepts text in UTF-8, UTF-16 and UTF-32. Other parts of the RISC OS kernel and core modules support text encoded in UTF-8. A Japanese Input Method Editor is available, as is a specification for other languages.

On currently released versions of RISC OS 5, printing in Unicode is broken. John-Mark Bell writes on the zap-users list (20 Jan 2007):

There are two issues:

1) Printing Unicode to a PostScript printer will break as PDriverPS just embeds the Fonts:Encodings.UTF8 encoding file directly in the PS output. This file is not valid PostScript.

2) Printing UTF-16 or UTF-32 to any printer driver will fail as they're not expecting anything other than an 8 bit encoding. Therefore, they pass the string to the FontManager without specifying that it's UTF<16,32> and the FontManager ends up interpreting the individual bytes of each character code as individual characters. This can result in the FontManager seeing control codes in the text string. The bug's in the Printer Drivers as they don't pass the encoding information on. UTF8 shouldn't be affected in this case, however, as FontManager control characters can't occur as continuation bytes.

Issue 1 can be avoided by using the PostScript 3 printer driver, developed by John Tytgat and Martin Würthner, instead of the native RISC OS PostScript printer driver. The ROOL project has released updated Printer Manager software which fixes issue 2.
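The byte-level cause of issue 2 can be illustrated with a short Python sketch. It rests on two assumptions not stated in the quoted message: that the control codes in question are byte values below 0x20, and that the UTF-16 and UTF-32 strings are little-endian. CONTROL_LIMIT and low_bytes are names invented for this example. The sketch encodes a string that contains no control characters and lists the bytes that an 8-bit consumer would nonetheless see as control codes.

```python
# Assumption: control codes are byte values below 0x20.
CONTROL_LIMIT = 0x20


def low_bytes(text, encoding):
    """Return the bytes below CONTROL_LIMIT in the encoded form of text.

    These are the bytes that would look like control codes to a consumer
    reading the encoded string one byte at a time.
    """
    return [b for b in text.encode(encoding) if b < CONTROL_LIMIT]


# 'A', 'ā' (U+0101) and '€' (U+20AC): no control characters in the text itself.
sample = "A\u0101\u20ac"

if __name__ == "__main__":
    # Little-endian byte order is an assumption made for this illustration.
    for enc in ("utf-8", "utf-16-le", "utf-32-le"):
        found = low_bytes(sample, enc)
        print(f"{enc:10} {[hex(b) for b in found]}")
```

For this sample the UTF-8 form contains no such bytes, because every byte of a multi-byte UTF-8 sequence is 0x80 or above, while the UTF-16 and UTF-32 forms contain several zero and 0x01 bytes.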