[Rpcemu] RPCEmu 0.8.7

Jeremy Nicoll - ml roinfo jn.ml.roi.39 at wingsandbeaks.org.uk
Mon Sep 6 04:27:33 PDT 2010


Dave Symes <dave at triffid.co.uk> wrote:

> In article <mpro.l8aegt002aj0f02o0 at wingsandbeaks.org.uk.invalid>,
>    Jeremy Nicoll - ml roinfo <jn.ml.roi.39 at wingsandbeaks.org.uk> wrote:
> > Dave Symes <dave at triffid.co.uk> wrote:
> 
> > > While I'm awaiting some comment from the Select mailing list, I've
> > > done a very nasty hack that gets networking running on Select RO 4.39
> 
> > Nasty is right. Although you've explained the changes you've not
> > explained what they achieve.
> 
> I thought I did make a note Jeremy.

You miss my point.  I understood what you'd commented out, but didn't have a
good understanding of why /you/ thought those /specific/ changes - no more,
no less - were required.  How would you summarise in one sentence what you
did? 

Saying "remove all error checking from network configuration" might be what
you coded, but what you achieved might have been something like "prevent
slow initialisation of network interface from halting boot".

It's a statement like the latter that's more useful, not least because it
hints at what a final solution might be.  


In this case, supposing slow initialisation was a problem, the solution is
NOT to eliminate error checks, because who knows what else may go wrong?
Suppose you have future problems because you removed the error checks - will
you even remember that you removed them?  Will you tell anyone trying to
help you that you removed them, so all their assumptions about what's going
on will be wrong?

What would be needed is to fix the slowness, or introduce a small deliberate
delay so that booting waits a short time before going on.

You have to wonder why something might halt if error checking takes place,
but work if it doesn't.  On a real machine, one can predict reasonably well
how fast a fixed piece of hardware will initialise.  On an emulated one,
multi-tasking of the host OS may mean that elapsed times are greater than
they would be on real hardware, and elapsed times may vary widely depending
on what state the host machine is in and how busy it is.  I'd guess that the
reason your boot process works when error checking is removed is that some
interface has initialised too slowly.  Although it's not ready by the time
the error check would have been made, it probably IS ready by the time any
subsequent command tries to use the interface.

The proper fix might be to change the way the initialisation is handled - eg
to delay the error check by a few milliseconds or even a second or two.  Or
change the command that starts the interface, or have the developers change
the emulation itself.     
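
To illustrate "delay the check rather than delete it", here is a minimal sketch in Python (not RISC OS code).  The names wait_until_ready, start_network and is_ready are hypothetical and the timings are placeholders.  The only point is that the error is still reported if the interface never comes up.

```python
import time

def wait_until_ready(is_ready, timeout=2.0, interval=0.05,
                     clock=time.monotonic, sleep=time.sleep):
    """Poll is_ready() until it returns True or timeout seconds pass.

    Returns True if the interface became ready, False if it never did.
    The error check is kept; it is just given time to succeed.
    """
    deadline = clock() + timeout
    while True:
        if is_ready():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


def start_network(is_ready):
    # Hypothetical boot step: fail loudly rather than carry on blindly.
    if not wait_until_ready(is_ready):
        raise RuntimeError("network interface not ready after timeout")
    return "network up"
```

A slow host just means a few more trips round the loop, and a genuinely broken interface still stops the boot with a message saying why.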



 
> The RPCEmu 4.39 boot gets to... 
> "adfs::Hardisc4.$.!Boot.Choices.Hardware.Disabled.Boot.Predesk.SetUpNet"
> and there's something wrong, because it hangs at that point, and
> there's no way without either deleting that file  (Consequence no
> networking as SetUpNet is meant to load the Net configs) or doing the hack
> I did, to get the boot to complete, and get networking.
> 
> The line. "Run BootResources:!Internet" contained in SetUpNet is not doing
> it, because it can't find it, or some other reason, I obviously have no
> idea, because I'm just a computer user.

Everyone's "just a computer user" unless they make an effort to understand
more.  You post a lot on these mail lists... but do you take the trouble to
read other people's posts?   If you do, then over a period of time, you'd
learn things.


A good way to understand more is to look at the commands that get issued and
see what they do (eg read the help information for commands, ask on the
newsgroups etc).  Even a vague idea of what a command does is better than no
idea.

 
> I have no idea what the word "BootResources" is meant to be, there's no
> file or directory on my computer called BootResources.

The command was:

  Run BootResources:!Internet

A run command merely takes a filename as an argument.  When you see
something ending in a colon, it's an OS shorthand for, IIRC, a path
definition.  If you were to look at the system variable called

  BootResources$Path

you'd find it was (probably) a list of directories.  It gets set earlier in
!Boot, according to the file structure inside your !Boot.  

What this means is that RO will look in several places for an app called
!Internet, and run the first one it finds.  Also, when *Run is told to run
an app, what that means is that the app's !Run file (normally but not always
an Obey file) gets run, ie Obeyed.  
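
The path lookup just described can be sketched in Python, purely as an illustration.  The function name resolve and the directory names in the example are invented (your BootResources$Path will hold something else), and the sketch only handles the simple Name:leaf form, ignoring the fact that RO filenames are case-insensitive.

```python
def resolve(spec, variables, exists):
    """Resolve 'Name:leaf' by searching the prefixes in 'Name$Path'.

    variables: dict of system variable names to values (assumed contents).
    exists:    callable saying whether a full filename exists.
    Returns the first matching full filename, or None.
    """
    name, sep, leaf = spec.partition(":")
    if not sep:
        return None  # not of the form Name:leaf; out of scope here
    path = variables.get(name + "$Path")
    if path is None:
        return None
    # A path variable is a comma-separated list of prefixes, each of
    # which normally ends in "." so the leafname can simply be appended.
    for prefix in path.split(","):
        candidate = prefix.strip() + leaf
        if exists(candidate):
            return candidate
    return None


# Hypothetical values, for illustration only.
variables = {
    "BootResources$Path": "ADFS::HardDisc4.$.!Boot.Resources.,"
                          "ADFS::HardDisc4.$.Extras.",
}
files = {"ADFS::HardDisc4.$.!Boot.Resources.!Internet"}
found = resolve("BootResources:!Internet", variables, files.__contains__)
```

Note that nothing called "BootResources" has to exist on disk: the name only ever appears as the first part of a variable name.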


A REALLY GOOD WAY to see all of this is to use !Reporter and log booting,
and browse the logs.  

!Reporter logs everything in Obey files, not just the commands that get
executed.  It also logs the commented-out commands and explanatory comments.
The latter help with understanding - supplied files sometimes contain
explanations of what they do - and those appear in the log beside the
commands.

[I do hope that when you commented lines out of some files, you also put a
date-stamped comment in to them saying what you were doing and why...]    
    
 
> Doing a *Show BootResources in a Task window presents nothing.

That's because the significant variable doesn't have that name.  If you issue
a 

  *show bootres*

for example, it will list all vars whose names /start/ with "bootres".


> Doing a *Show in a Task window then searching it for "BootResources" just
> turns up a couple of references, but they are nothing to do with network
> configuration.

Indeed.  It's a path locating places in the boot file structure where
resources (for all sorts of things) may be found.  The internet parts of
that are in !Internet inside one of those directories.

 
> Why, if the line is in an Obey file, is it not doing it?

You have no evidence that it's not doing it.  You just know that while
getting ready to do it or while doing it something seems to have gone wrong.

What went wrong?  Can't tell without seeing detailed logs.

> 
> My hack forces the OS to take note of the networking configs in those
> explicitly named files, two of which contain a "CheckError" reference
> which, if not barred out will prevent the files from being actioned.

Yes I know.  But how did you know that those configs were meant to be
executed in that order at that time?

 
> Again No idea what/why I'm just a computer user.

Claiming you're a naive computer user, but at the same time being prepared
to tinker with the insides of files you don't fully understand is really not
very sensible.

Commenting out 'checkerror' lines, which are clearly meant to check for
errors, strongly suggests that an error is happening.  I'd have been more
inclined to find out what that was than remove the error check.


> I'm sure you are right, unfortunately I'm just a computer user with a
> veneer of knowledge, so most of what you've written is just words, and
> while I may, from many years of hanging around here, understand individual
> words, put together with other script/coding words... I have no idea.

Why not make the effort then?





> One last thought...
> While Reporter is a wonderful app, I can't see how it could report on a
> Boot Sequence that has only just started, the SetUpNet stall is so early
> in the 4.39 boot sequence... Reporter isn't even a twinkle in the OS's eye
> at that point.

Wrong.

When you start a RO machine there are various hardware booting things that
get done first which you can't trace.  They (normally) culminate in !Boot
being run. 

If you Shift-boot your machine, at the point where !Boot would have run the
/machine/ is up - you have a star prompt and can issue commands.  If
something goes wrong before that (eg there's a failed podule card, or duff
memory, or a motherboard fault) then diagnosing and fixing that is harder.

But once the basic machine is going, any star command can be issued, and
that includes the commands that can start !Reporter.
  
!Boot is just a set of files on the selected boot disk and the machine will
certainly work without any of those commands being executed. It has to,
because all that !Boot is, is a lot of pre-defined commands to set up the
environment you're used to using, and the machine has to be working to be
able to run those commands.

If !Reporter is installed and configured to log booting, the module that
actually does the work gets loaded right at the very start of !Boot being
run, long before any of the 'boot sequence' starts.  If you have this
installed and have told it to log all commands, then the log file will show,
for example, each command in Obey files being considered for execution,
conditions evaluated in them, and commands being run or not.

So for example if a line in an Obey file says

  IfThere <somethingorother> Then do something

the log will show the value (ie filename) of <somethingorother> as well as
whether RO decided to execute the resultant command.  If the command was to
be executed, it will be traced too. 


Most *If commands look at the values of system variables.  These tend either
to have been set explicitly by *set commands in earlier parts of the boot
sequence (and will therefore also have been logged), or will have been set
by programs run within the boot sequence.  The latter aren't necessarily
logged, but one can introduce extra commands in supplied Obey files to
record the before- and after- states of things, once one knows where to
look.  


For example, in my VRPC boot sequence, near the start, the Reporter log
contains:

09:46:56.06 [022168C8/] /Boot:Library.ScanLibs

09:46:56.11 [022168C8/] @RunType_FFB HostFS::HardDisc4.$.!boot.Library.ScanLibs

09:46:56.11 [022168C8/] Basic -quit "HostFS::HardDisc4.$.!boot.Library.ScanLibs"

09:46:56.15 [BASIC/] Set LibraryArchive$Path HostFS::HardDisc4.$.!boot.Library.Archive.

09:46:56.17 [BASIC/] Set LibraryGraphics$Path HostFS::HardDisc4.$.!boot.Library.Graphics.

 ...    

I've configured !Reporter to show time and task info so each line starts
with the time something happened and then the internal RO taskname that
executed the command.  The first line showed:

  [022168C8/] /Boot:Library.ScanLibs

and several subsequent lines also have the [022168C8/] part.  That means
they all got issued by the same process - which in this case is the Obey
command processor running commands from an Obey file.  Later on in !Boot
when multiple wimp tasks are started it becomes important to be able to see
which lines in the log came from which task, because they're all jumbled up
together.  One tends to have to search logs looking for lines which have the
same task-id number.   
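
Searching by task-id can be done mechanically.  A minimal sketch in Python, assuming every log line has the "time [task/] command" layout of the extract above, with wrapped lines joined back up; the function name group_by_task is my own, and other !Reporter layouts are not handled.

```python
import re
from collections import defaultdict

# Matches "HH:MM:SS.cc [task/] command"; anything else is skipped.
LINE = re.compile(r"^(\d\d:\d\d:\d\d\.\d\d) \[([^\]/]*)/\] (.*)$")

def group_by_task(lines):
    """Return {task: [(time, command), ...]}, preserving log order."""
    tasks = defaultdict(list)
    for line in lines:
        m = LINE.match(line.strip())
        if m:
            when, task, command = m.groups()
            tasks[task].append((when, command))
    return dict(tasks)


log = [
    "09:46:56.06 [022168C8/] /Boot:Library.ScanLibs",
    "09:46:56.11 [022168C8/] @RunType_FFB "
    "HostFS::HardDisc4.$.!boot.Library.ScanLibs",
    '09:46:56.11 [022168C8/] Basic -quit '
    '"HostFS::HardDisc4.$.!boot.Library.ScanLibs"',
    "09:46:56.15 [BASIC/] Set LibraryArchive$Path "
    "HostFS::HardDisc4.$.!boot.Library.Archive.",
    "09:46:56.17 [BASIC/] Set LibraryGraphics$Path "
    "HostFS::HardDisc4.$.!boot.Library.Graphics.",
]
groups = group_by_task(log)
```

Here groups["022168C8"] holds the three lines issued from the Obey file and groups["BASIC"] the two issued by the BASIC program, each in the order they were logged.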


The command

  /Boot:Library.ScanLibs

starts with / which is a RO shorthand for Run.  The command tells RO to run
something called Library.ScanLibs, from somewhere in a path called Boot (ie
from a set of dirs defined by Boot$Path which will have been set a short
time earlier).  You then see:

  09:46:56.11 [022168C8/] 
   @RunType_FFB HostFS::HardDisc4.$.!boot.Library.ScanLibs

This tells you that RO has found a specific Library.ScanLibs file at: 

  HostFS::HardDisc4.$.!boot.Library.ScanLibs

and determined its filetype which is FFB.  The way RO knows what to do with
any file of type FFB is to execute the command (previously defined) in
system variable Alias$@RunType_FFB.  You'll see all of those vars if you do
a *show.  The next line in the log shows the specific filename plugged into
the template command for FFB (Basic) files:

  09:46:56.11 [022168C8/] 
    Basic -quit "HostFS::HardDisc4.$.!boot.Library.ScanLibs"

so that's what actually gets executed to run the ScanLibs program.  The next
few lines are commands issued by the BASIC program, eg:

  09:46:56.15 [BASIC/] 
   Set LibraryArchive$Path HostFS::HardDisc4.$.!boot.Library.Archive.

The [BASIC/] part of the line shows that it's the BASIC interpreter that's
issuing these commands rather than the Obey file command processor.  In this
case the BASIC program is asking RO to set the first of a list of system
variables.
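
The run-type step described a few paragraphs up, where a filename is plugged into a template command, can be sketched in Python.  This is a simplification, and the template string used below is my assumption of what the FFB run-type variable typically contains; check the real value with *show on your own machine.  The function name expand_alias is invented.

```python
import re

def expand_alias(template, args):
    """Expand %0..%9 and %*N in a command template (simplified).

    %N  is replaced by argument N (empty if there is no such argument).
    %*N is replaced by all arguments from N onwards, space-separated.
    """
    def sub(m):
        if m.group(1) is not None:          # the %*N form
            return " ".join(args[int(m.group(1)):])
        n = int(m.group(2))                 # the %N form
        return args[n] if n < len(args) else ""
    return re.sub(r"%\*(\d)|%(\d)", sub, template).rstrip()


# Assumed template value; the real one lives in a system variable.
template = 'Basic -quit "%0" %*1'
command = expand_alias(
    template, ["HostFS::HardDisc4.$.!boot.Library.ScanLibs"])
```

With that assumed template, command comes out as the same Basic -quit line that appears in the log.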


!Reporter doesn't in this case log all the ins and outs of what the ScanLibs
BASIC program is doing and why, just the RO star commands it asked RO to
execute.  However if it were necessary one could add some special commands
to the BASIC program and see in the log very detailed information about what
it was doing...     




So for example while you wonder why an Obey file appears not to have run,
someone with a !Reporter log will be able to tell whether the Obey file
itself, or a line within it, or a line within a file called by a command in
the Obey file (etc) actually failed.    


I used to let !Reporter log every boot on my RPC and as part of the boot
sequence saved the logfile from every boot.  This meant that when a boot
failed I could look at its log, but also logs from previous successful boots
and compare them.  That wasn't trivial because booting does a lot of stuff
in parallel and there are small timing differences from one boot to another
that mean the exact times that a series of commands may get executed are not
always the same.  However failures to execute things are logged...     
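
One way round the timing differences is to ignore times, task-ids and ordering altogether, and just ask which commands appear in a good boot's log but not in a bad one's.  A minimal sketch in Python, assuming the "time [task/] command" line layout shown earlier; the function names and the second log in the example are invented for illustration.

```python
import re
from collections import Counter

# Strip the leading "HH:MM:SS.cc [task/] " so only the command remains.
PREFIX = re.compile(r"^\d\d:\d\d:\d\d\.\d\d \[[^\]]*\] ")

def commands(log_lines):
    """Return the command part of each non-blank log line."""
    return [PREFIX.sub("", line.strip())
            for line in log_lines if line.strip()]

def missing_from(good, bad):
    """Commands logged in the good boot that the bad boot never reached."""
    left_over = Counter(commands(good)) - Counter(commands(bad))
    return sorted(left_over.elements())


good = [
    "09:46:56.06 [022168C8/] /Boot:Library.ScanLibs",
    "09:46:56.15 [BASIC/] Set LibraryArchive$Path "
    "HostFS::HardDisc4.$.!boot.Library.Archive.",
]
# Invented "failed boot" log: different time and task-id, one line short.
bad = [
    "10:02:11.40 [0221A000/] /Boot:Library.ScanLibs",
]
gap = missing_from(good, bad)
```

Here gap contains just the Set LibraryArchive$Path command, which tells you roughly where to start reading the failed boot's log.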


No-one expects that people who don't understand the log files will be able
to diagnose their own problems.  But if you have a log file, you can always
send part of it to someone who does understand. 

-- 
Jeremy C B Nicoll - my opinions are my own.


