<p>17th October 2002. © Peter Naulls.<br>

8th January 2003 - brought up to date.</p>

<p>No reproduction without permission. Information in this document is given in good faith, but may be subject to errors
or omissions, for which I cannot be responsible.</p>
<hr>
<dl>
<dd>
<p><i>There has been a great deal of confusion over 32-bit in RISC OS
over recent months, partly because of its technical nature, and
partly because of the spin put on it by various parties. This
document is in the same vein as my previous document,
[http://www.riscos.info/unix/browser.html The RISC OS Browser Issue].
In short, it aims to bring together information which is already available, and to cut through
the issues and make them clear and easily understood by everyone.</i></p>

<p><i>In order to make this document more palatable to non-programmers, certain
technical issues have been simplified. I fully expect that programmers will be
able to identify where this has been done, and appreciate the issue fully when
it might concern them. Just be aware it is not my intention to simplify matters
to the point of not doing them proper justice.</i></p>
</dd>
</dl>

<hr>
<h2>How it came about</h2>

<p>Once upon a time, when Acorn designed the original range of ARM chips, certain
design decisions were made. One of these was to limit the range of addresses in
which code can be executed to 64MB. This comes about because the ARM's program counter
(PC), which contains the location of the currently executing instruction, is limited
to 26 bits of the 32-bit ARM register 15. (ARM has 16 registers normally accessible,
numbered R0 to R15.) The remainder of the bits in this register reflect processor
status.</p>
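<p>As a rough illustration of how R15 is shared in 26-bit mode, the following Python sketch splits a register value into its address and status fields. The function and constant names are invented for this example, and the sample value is made up; only the bit positions reflect the classic 26-bit layout.</p>

```python
# Layout of R15 in 26-bit mode:
#   bits 31-28  N Z C V condition flags
#   bit  27     I (IRQ disable)
#   bit  26     F (FIQ disable)
#   bits 25-2   program counter (word-aligned address)
#   bits 1-0    processor mode
PC_MASK = 0x03FFFFFC
MODE_NAMES = {0: "USR", 1: "FIQ", 2: "IRQ", 3: "SVC"}

# 26 address bits give 2**26 bytes of executable address space.
MAX_CODE_ADDRESS_SPACE = 1 << 26


def decode_r15_26bit(r15):
    """Split a 26-bit-mode R15 value into its address and status fields."""
    flag_bits = ((31, "N"), (30, "Z"), (29, "C"), (28, "V"))
    return {
        "pc": r15 & PC_MASK,
        "mode": MODE_NAMES[r15 & 0x3],
        "flags": "".join(name for bit, name in flag_bits if r15 & (1 << bit)),
        "irq_disabled": bool(r15 & (1 << 27)),
        "fiq_disabled": bool(r15 & (1 << 26)),
    }


# A made-up sample: Z and C set, supervisor mode, executing at &8000.
example = decode_r15_26bit(0x60008003)
```

<p>Because only bits 2 to 25 hold the address, no amount of cleverness lets code run above 64MB in this mode; the upper bits are simply not available to the PC.</p>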

<p>At this point you may be wondering "hang on, but the RiscPC can use more than 64MB?".
And yes it can. The reason is two-fold. First off, the 64MB limitation only
applies to addresses where code is executed. All other registers can access data
in the full 4GB range of memory. More importantly, all modern machines use a layer
of memory mapping so that any <i>physical</i> address can appear at any <i>logical</i>
location in RAM at any given time. This is why applications always appear at address
&8000, even though many applications may be running at once.</p>
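<p>The idea of logical and physical addresses can be sketched with a toy page table. This is purely illustrative: the class, the 4KB page size and the physical addresses are assumptions made for the example, and real machines do this in memory-management hardware rather than in software.</p>

```python
PAGE_SIZE = 4096      # assumed page size for this sketch
APP_BASE = 0x8000     # where every application sees itself


class AddressSpace:
    """Toy per-task mapping from logical pages to physical pages."""

    def __init__(self):
        self.page_table = {}  # logical page number -> physical page number

    def map_page(self, logical_addr, physical_addr):
        self.page_table[logical_addr // PAGE_SIZE] = physical_addr // PAGE_SIZE

    def translate(self, logical_addr):
        page, offset = divmod(logical_addr, PAGE_SIZE)
        return self.page_table[page] * PAGE_SIZE + offset


# Two tasks, both "at" &8000, backed by different (made-up) physical memory.
task_a = AddressSpace()
task_a.map_page(APP_BASE, 0x10000000)
task_b = AddressSpace()
task_b.map_page(APP_BASE, 0x10400000)
```

<p>Both tasks use the same logical address, but <code>translate</code> sends each to its own physical page, which is how the code of many applications can all sit below the 64MB limit.</p>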

<p>So the RiscPC (and all previous RISC OS machines) really are 32-bit machines in
every sense of the word except the way their program counter works (and as we'll see
shortly, for ARM6 onwards this isn't the whole story). But we do refer to all current
versions of RISC OS as 26-bit, as they require the program counter to act in this
fashion. And although not strictly correct, it is sometimes useful to refer to
current machines as 26-bit to differentiate them from any new machines which use
a 32-bit RISC OS.</p>

<hr>
<h2>So why 32-bit?</h2>

<p>From ARM6 onwards, a 32-bit mode was added to ARM chips. In this mode, all<sup>1</sup> the bits of the PC are dedicated to the address
of the currently executing instruction. RISC OS continued not to take advantage
of this mode, deferring the conversion to 32-bit until the present time. However,
the Unix variants of ARM Linux and NetBSD/arm32 developed for RiscPCs shortly afterwards
did take advantage of this mode.</p>

<p>In this mode, the status bits present in the program counter in 26-bit mode are
now available in a new register - the current program status register, or CPSR, which
is accessed via new instructions. On previous processors, such as ARM3, these instructions
are NOPs and do nothing. The instructions in question, MSR and MRS, work equally well
in 26-bit and 32-bit modes.</p>
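<p>For comparison with the shared R15 of 26-bit mode, here is a small sketch that decodes a CPSR value as it would be read with MRS. The function name and sample values are invented for illustration; the bit positions and mode numbers are those of the ARM6-era processors that support both 26-bit and 32-bit modes.</p>

```python
# CPSR layout: flags in bits 31-28, I in bit 7, F in bit 6,
# and a 5-bit mode field in bits 4-0. Bit 4 of the mode field
# distinguishes the 32-bit modes from the older 26-bit ones.
MODE_NAMES = {
    0b00000: "USR26", 0b00001: "FIQ26", 0b00010: "IRQ26", 0b00011: "SVC26",
    0b10000: "USR32", 0b10001: "FIQ32", 0b10010: "IRQ32", 0b10011: "SVC32",
    0b10111: "ABT32", 0b11011: "UND32",
}


def decode_cpsr(cpsr):
    """Break a CPSR value into flags, interrupt masks and mode."""
    flag_bits = ((31, "N"), (30, "Z"), (29, "C"), (28, "V"))
    mode_bits = cpsr & 0x1F
    return {
        "flags": "".join(name for bit, name in flag_bits if cpsr & (1 << bit)),
        "irq_disabled": bool(cpsr & (1 << 7)),
        "fiq_disabled": bool(cpsr & (1 << 6)),
        "mode": MODE_NAMES.get(mode_bits, "unknown"),
        "is_32bit_mode": bool(mode_bits & 0x10),
    }


svc32 = decode_cpsr(0x80000013)   # N set, 32-bit supervisor mode
svc26 = decode_cpsr(0x20000003)   # C set, 26-bit supervisor mode
```

<p>With the status held separately like this, the whole of R15 is free to hold an address.</p>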

<p>So why do RISC OS machines need to move to 32-bit? Well, the StrongARM processor
used in RiscPCs supports 26-bit mode, as do many of the variants in the same
generation, but the current generation of ARM chips - in particular XScale, ARM9
and so on - do not, as the manufacturers (ARM and Intel) have removed the 26-bit compatibility
in preference for other logic such as Thumb support and various other embedded uses.</p>

<p>Therefore, if RISC OS is to move to newer processors and wants a full hardware
solution, it must move to a 32-bit mode (the alternative is emulation on other
hardware such as x86, which will be much slower).</p>


<p>
<font size=-1>
<sup>1</sup> Well, not all the bits. As in 26-bit modes, the bottom 2
bits are assumed to be zero, so all addresses lie on a multiple of 4. The use of these 2
bits is beyond the scope of this article.
</font>
</p>


<p></p>

<hr>
<h2>Moving to 32-bit</h2>

<p>It's been suggested that a move to a 32-bit RISC OS would lose a large amount
of the existing software base. For many reasons, this isn't true.</p>

<p>For a 32-bit RISC OS, such as [http://www.iyonix.com/ RISC OS 5] on the Iyonix, the problem is that a given program may not immediately run on this
OS, because of differences in processor mode.</p>

<p>Whether it will or not depends upon the type of program it is:</p>

<ul>
<li><b>BASIC and other interpreted languages</b> - As long as they don't contain
any assembler (and the majority will not - as high as 99% has been suggested)
then these programs will run unmodified and without incident, as they only rely
upon the interpreter acting correctly.

<p></p>

<li><b>C Programs and other compiled languages</b> - A large number of RISC OS programs
are written in C. These will also work correctly if recompiled to 32-bit using
[http://www.drobe.co.uk/riscos/artifact449.html Castle's new 32-bit tools]
or [[GCCSDK|a 32-bit GCC]], where appropriate.
The programs will remain backwards compatible with 26-bit versions of RISC OS, as
we'll see shortly.

<p></p>

<li><b>Programs written in assembler or programs producing ARM output</b> - these
programs will need manual (or semi-automated) modification before they will run
on a 32-bit RISC OS. This modification is not difficult, but it can be tedious.
Also, some C programs can contain ARM code, and will need the same changes.
</ul>

<p>At this point you might groan at the effort required to convert all these applications
to 32-bit. But hold on there a moment - these are most certainly not the only ways
existing applications can run on a 32-bit RISC OS. For the moment we'll stick with
conversion since it's a logical place to start, and come to the other ways shortly.</p>

<p></p>

<hr>
<h3>APCS</h3>

<p>APCS, or ARM Procedure Call Standard, is at the heart of the issue of conversion
to 32-bit. APCS defines a number of things, most of which we won't go into here;
the part that matters for this discussion concerns the saving of processor flags. If you wish to know
more, then refer to PRM 4.</p>

<p>There are quite a number of APCS variants, even within the same processor mode.
I'll simplify the matter somewhat, by just talking about "APCS-26" and "APCS-32"
which will refer to the two variants we care about. The former is used by
existing 26-bit RISC OS, and the latter by a 32-bit RISC OS.</p>

<p>There are a number of specifics to conversion from APCS-26 to APCS-32, which
are covered in the documentation with Castle's tools, and which I'll briefly summarise
here. These are simplifications, but serve for illustrative purposes:</p>

<ul>
<li>APCS-32 does not expect function calls to save processor flags. To do so
would be an onerous burden, and in many cases, it isn't actually needed anyway.
APCS-26 requires saving of flags, and this is facilitated by instructions that work
incorrectly in 32-bit mode.

<p></p>

<li>For C code, this generally means the compiler doesn't attempt to restore
flags with the ^ flag on LDM, or MOVS PC, LR - and instead uses LDM without the ^
or MOV PC, LR.

<p></p>

<li>Instructions which attempt to modify the processor state by using TEQP etc.
are out. Instead, they should modify it via MSR/MRS. The exception here is
for code which must also work on ARM3, which does not have these instructions.
This code is reasonably rare, and can easily be accommodated by checking for 32-bit
mode, and taking advantage of the NOP status of MSR/MRS on these processors.
</ul>
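<p>The rules above can be sketched as a small source-level filter. This is a deliberately simplified illustration, not how Castle's tools or any real converter work: the function name and the regular expressions are invented here, labels at the start of a line are not handled, and the TEQP family is only flagged because replacing it needs an understanding of what the code was trying to do.</p>

```python
import re

COND = r"(?:EQ|NE|CS|HS|CC|LO|MI|PL|VS|VC|HI|LS|GE|LT|GT|LE|AL)?"

# MOVS PC, LR (optionally conditional, e.g. MOVEQS): drop the S.
MOVS_PC_LR = re.compile(rf"^(\s*MOV{COND})S(\s+PC\s*,\s*(?:LR|R14)\s*)$", re.I)
# LDM with PC in the register list and a trailing ^: drop the ^.
LDM_PC_HAT = re.compile(
    r"^(\s*LDM\w*\s+[^{]*\{[^}]*\b(?:PC|R15)\b[^}]*\})\s*\^(\s*)$", re.I
)
# TEQP, TSTP, CMPP, CMNP: write PSR bits through R15.
PSR_WRITE = re.compile(rf"^\s*(?:TEQ|TST|CMP|CMN){COND}P\b", re.I)


def convert_line(line):
    """Return (possibly rewritten line, note or None) for one source line."""
    code, sep, comment = line.partition(";")
    note = None
    if MOVS_PC_LR.match(code):
        code = MOVS_PC_LR.sub(r"\1\2", code)
        note = "return no longer restores flags"
    elif LDM_PC_HAT.match(code):
        code = LDM_PC_HAT.sub(r"\1\2", code)
        note = "return no longer restores flags"
    elif PSR_WRITE.match(code):
        note = "needs manual rewrite using MRS/MSR"
    return code + sep + comment, note


converted = [
    convert_line("        MOVS    PC, LR"),
    convert_line("        LDMFD   R13!, {R4-R6, PC}^"),
    convert_line("        TEQP    PC, #0"),
    convert_line("        CMP     R0, #1"),
]
```

<p>Note that an LDM with ^ but without PC in the register list means something different (a user-bank transfer), which is why the sketch leaves it alone.</p>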


<p>A crucial point I wish to make is that APCS-32 code <b>will work correctly on 26-bit
RISC OS <i>and</i> 32-bit RISC OS from ARM6 to XScale</b> and also ARM3 if care is taken.
This means once a program is 32-bit, there need be only one version. Having a profusion
of versions of programs would certainly be confusing.</p>

<p>So converting a C program to APCS-32 is simple: recompile it, although you will
need to recompile all its libraries too. For programs using Acorn C/C++, a recompile
using the default options with Castle's tools is exactly what you want.</p>

<p></p>

<hr>
<h2>Assembler Conversion</h2>

<p>Manual assembler conversion can be rather tedious. Fortunately Castle's tools
do provide some macros to help you out, and the assembler will produce warnings for
non-32-bit code. It can be a little daunting at first, but with some practice, it
becomes quite easy. In future, I hope to produce a reference guide which shows
what instruction sequences to replace with what.</p>

<p>David Ruck has produced a tool, [http://www.armclub.org.uk/free/ Armalyser],
which is able to produce a disassembly of ARM code - whether it's an application, module
or other RISC OS binary format. Most importantly, it can output a format which lists
non-32-bit instructions, and which can be fed directly back into an assembler once
you have made changes. This is extremely useful if a program's source or libraries cannot
easily be obtained for recompilation, or for checking your existing applications and modules.</p>
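<p>To give a flavour of what checking a binary involves, here is a minimal sketch that scans raw little-endian ARM words for the instruction patterns discussed earlier. It is not Armalyser's method, which I have not examined; the function names are invented, and a real tool must also tell code from data, since a data word can look like an instruction.</p>

```python
import struct


def is_data_processing(word):
    """True if the word decodes as an ARM data-processing instruction."""
    if ((word >> 26) & 0x3) != 0:
        return False
    immediate = (word >> 25) & 1
    # Register form with bits 7 and 4 both set is a multiply or
    # an extra load/store, not data processing.
    if not immediate and (word & 0x90) == 0x90:
        return False
    return True


def classify(word):
    """Describe why a word looks 26-bit-only, or return None."""
    if is_data_processing(word) and (word >> 20) & 1 and ((word >> 12) & 0xF) == 15:
        opcode = (word >> 21) & 0xF
        if 8 <= opcode <= 11:
            return "TSTP/TEQP/CMPP/CMNP (writes PSR bits via R15)"
        return "S bit set with PC as destination (e.g. MOVS PC, LR)"
    if ((word >> 25) & 0x7) == 0b100:
        load, hat, has_pc = (word >> 20) & 1, (word >> 22) & 1, (word >> 15) & 1
        if load and hat and has_pc:
            return "LDM with ^ and PC in the register list"
    return None


def scan(code, base=0x8000):
    """Return (address, word, reason) for each suspicious word in code."""
    count = len(code) // 4
    words = struct.unpack(f"<{count}I", code[: count * 4])
    hits = []
    for index, word in enumerate(words):
        reason = classify(word)
        if reason:
            hits.append((base + 4 * index, word, reason))
    return hits


sample = struct.pack(
    "<4I",
    0xE3A00005,  # MOV   R0, #5
    0xE1B0F00E,  # MOVS  PC, LR
    0xE8FD8000,  # LDMFD R13!, {PC}^
    0xE33FF000,  # TEQP  PC, #0
)
hits = scan(sample)
```

<p>Here <code>hits</code> reports the last three words, at &8004, &8008 and &800C, and leaves the plain MOV alone.</p>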

<p></p>

<hr>

<h2>Other methods</h2>

<p>Converting a program to 32-bit isn't the only way to have it run on a 32-bit RISC OS.
The following is by no means the full list, but these are the most obvious alternatives.</p>

<ul>
<li><b>Emulation</b> - In this case, we provide a program that looks at each instruction
in a program to be executed, and gets the host processor - that is, the 32-bit
processor the machine is running on - to perform some equivalent action, either
directly, or via traditional emulation means. There are quite a number of ways
to implement this, but by and large they act similarly. The great advantage
of emulation is that it is quite reliable, and reasonably easy to implement and
get correct. The disadvantage is that it can be quite slow, with applications
running substantially slower than a native 32-bit application.

<p>I have written my own proposal on how this might work. It actually refers to
emulation of an ARM3 processor on a RiscPC for the purposes of Archimedes emulation
in [http://arcem.sf.net/ ArcEm], but the problem is very similar
and the ideas suggested are sound. Since it's quite short, it fails to address
many of the specific problems, and is certainly far from being a complete specification.
Nevertheless, you can read about [http://web.archive.org/web/20030621031148/www.riscos.info/32bit/ARMEmu.txt ARM on ARM emulation].</p>

<p></p>

<li><b>JITing</b> - Just in time compilation. This means the ARM instructions
are dynamically compiled to a version that is suitable for directly running
on the host machine. This has a significant speed advantage over emulation,
potentially approaching a significant fraction of the speed of a natively run application.
It can also deal with some dynamic sequences that may not be easily handled
in dynamic reassembly.

<p>The problem with JITs is that they are quite difficult to write, with the
author needing a somewhat advanced knowledge of compilation techniques.</p>

<p>[http://www.aemulor.com/ Aemulor] is an example of a JIT.</p>

<p>Jason Tribbeck has also pointed me at a project he was considering
with some JIT aspects to it, called <i>Cloe</i> (Code Lookahead
Optimal Emulator). There are 4 documents:</p>

<ul>
<li>[http://www.chios.org.uk/cloe/cloe1.html cloe1]</li>
<li>[http://www.chios.org.uk/cloe/cloe2.html cloe2]</li>
<li>[http://www.chios.org.uk/cloe/cloe3.html cloe3]</li>
<li>[http://www.chios.org.uk/cloe/3b.html cloe3b]</li>
</ul>

<p></p>

<li><b>Dynamic reassembly</b> - This is a method best suited to programs compiled
from C, as they contain well structured code which is easily converted
automatically. In short, a program is disassembled, and then the APCS-26
only instructions are replaced with APCS-32 ones. There are also some issues
with C runtime fixup, but these are much the same for all programs. Finally,
the program is run.

<p>Armalyser already does most of the work needed for this.
This type of scheme is only really failsafe for pure C programs, and it
could quite reasonably refuse to go on if it encountered any difficult
instruction sequences. But the great advantage is that programs run
at full speed.</p>

<p>You may also notice that this process could be done one off, statically,
then the new binary distributed, but this isn't always appropriate nor
as transparent. However, it may be exactly the ticket for software authors
of some programs.</p>

</ul>
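<p>To make the emulation option more concrete, here is a toy interpreter that executes a handful of ARM instructions with 26-bit R15 semantics. It is a sketch under heavy assumptions, not a description of Aemulor, ArcEm or any other real emulator: the class and function names are invented, only unconditional MOV, ADD, CMP, B and BL are modelled, shifted register operands and R15 as a source are not supported, and the test program is made up.</p>

```python
N, Z, C, V = 1 << 31, 1 << 30, 1 << 29, 1 << 28
PC_MASK = 0x03FFFFFC    # bits 25-2 of R15 hold the address in 26-bit mode
PSR_MASK = 0xFC000003   # flags, I, F and mode bits share R15 in 26-bit mode
WORD = 0xFFFFFFFF


def ror32(value, amount):
    amount %= 32
    return ((value >> amount) | (value << (32 - amount))) & WORD


class Emu26:
    def __init__(self, memory, start):
        self.mem = memory      # dict: address -> instruction word
        self.r = [0] * 15      # R0-R14; R15 is kept split as pc + psr
        self.pc = start
        self.psr = 0

    def operand2(self, word):
        if word & (1 << 25):                      # rotated 8-bit immediate
            return ror32(word & 0xFF, 2 * ((word >> 8) & 0xF))
        if word & 0xFF0:                          # shifted registers not modelled
            raise NotImplementedError(hex(word))
        return self.r[word & 0xF]

    def step(self):
        word = self.mem[self.pc]
        next_pc = self.pc + 4
        if ((word >> 25) & 0x7) == 0b101:         # B / BL
            offset = word & 0xFFFFFF
            if offset & 0x800000:                 # sign-extend the 24-bit offset
                offset -= 1 << 24
            if word & (1 << 24):                  # BL: LR gets address AND flags
                self.r[14] = (next_pc & PC_MASK) | self.psr
            next_pc = (self.pc + 8 + (offset << 2)) & PC_MASK
        elif ((word >> 26) & 0x3) == 0:           # data processing
            opcode, s_bit = (word >> 21) & 0xF, (word >> 20) & 1
            rn, rd = (word >> 16) & 0xF, (word >> 12) & 0xF
            op2 = self.operand2(word)
            if opcode == 0b1101 and rd == 15:     # MOV PC / MOVS PC
                if s_bit:                         # MOVS also restores the PSR bits
                    self.psr = op2 & PSR_MASK
                next_pc = op2 & PC_MASK
            elif opcode == 0b1101:                # MOV Rd, op2
                self.r[rd] = op2
            elif opcode == 0b0100:                # ADD Rd, Rn, op2
                self.r[rd] = (self.r[rn] + op2) & WORD
            elif opcode == 0b1010:                # CMP Rn, op2
                a = self.r[rn]
                result = (a - op2) & WORD
                flags = (result & N) | (Z if result == 0 else 0)
                flags |= C if a >= op2 else 0
                flags |= V if ((a ^ op2) & (a ^ result)) >> 31 else 0
                self.psr = (self.psr & ~(N | Z | C | V)) | flags
            else:
                raise NotImplementedError(hex(word))
        else:
            raise NotImplementedError(hex(word))
        self.pc = next_pc

    def run(self):
        while self.pc in self.mem:                # stop on leaving the program
            self.step()
        return self


def demo(return_instruction):
    program = {
        0x8000: 0xE3A00005,   # MOV  R0, #5
        0x8004: 0xE3500005,   # CMP  R0, #5   (Z set in the caller)
        0x8008: 0xEB000000,   # BL   &8010
        0x8010: 0xE3500000,   # CMP  R0, #0   (Z cleared in the callee)
        0x8014: return_instruction,
    }
    return Emu26(program, 0x8000).run()


with_flags = demo(0xE1B0F00E)      # MOVS PC, LR
without_flags = demo(0xE1A0F00E)   # MOV  PC, LR
```

<p>Running both variants shows the difference that matters for conversion: with MOVS PC, LR the caller's Z flag survives the call, while with MOV PC, LR the callee's flags are left in place. An emulator has to reproduce the first behaviour for 26-bit code; a JIT or a dynamic reassembler has to arrive at the same visible result by translating or rewriting the code instead of interpreting it.</p>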

<p></p>

<hr>

<h2>Conclusion</h2>

<p>To recap, I will restate some of the points made in this document:</p>

<ul>
<li><p>Most programs can be made to work without great difficulty on a 32-bit RISC OS
via one of a number of methods - many of which are mentioned here.</p>

<li><p>Programs compiled to be 32-bit will also work correctly on previous versions
of RISC OS.</p>

<li><p>You can view the conversion to 32-bit a little like the way Y2K was dealt
with. If it's tackled now, it won't be a problem by the time it matters.</p>

<li><p>If you're developing RISC OS applications right now, make them 32-bit, as
that means less effort in the long run, and forward compatibility.</p>

<li><p>All my [http://www.riscos.info/unix/ Unix Ports] are fully 32-bit compatible.</p>
</ul>

<p></p>

<p>As a reminder, the important bits when it comes to running programs on
32-bit RISC OS:</p>


<ul>
<li><p>C programs need to be recompiled with a 32-bit compiler.</p></li>

<li><p>Pure BASIC programs will work unmodified.</p></li>

<li><p>ARM programs will need changes made to work correctly.</p></li>

</ul>

<p></p>

<hr>

<h2>Finally</h2>

<p>Some thoughts of my own that have somewhat less basis in fact:</p>

<ul>
<li><p>Emulation (or emulation-like) 26/32 technologies such as the mentioned emulators
and MicroDigital's Omega may actually hinder or slow the eventual conversion to
32-bit, as there will not be an immediate need to convert applications.</p>

<li><p>Software vendors may well charge for 32-bit upgrades to their software. This
is quite reasonable given the effort involved, and I expect the charges
to be quite minimal. It will also give a helpful cash injection into the
market and encourage developers.</p>
</ul>

<p></p>

<hr>