[gccsdk] Threading + Alt-Break = Trashed Computer
bavison at riscosopen.org
Sat Jul 12 18:28:38 PDT 2008
John Tytgat <John.Tytgat at aaug.net> wrote:
> In message <op.ud7brttdl0n5eg at balaptop.ba>
> "Ben Avison" <bavison at riscosopen.org> wrote:
> I don't follow. Currently when the watchdog kills the current task, the
> task's Exit handler gets called by the Wimp doing an OS_Exit. That seems
> to work and I think that's a good approach. Why can't we do the same
> for killing a task which happens not to be the current task at the moment
> Alt-Break is pressed? The only extra thing that needs to be done is
> to first make that task the current one by calling 'pageintask' (and
> dealing with sprite redirection + pdriver is probably not relevant in
> this case).
Calling OS_Exit is fine when you're killing the current task, since all other
tasks are sitting inside Wimp_Poll with their state safely squirreled away
inside their task blocks. However, when you're killing another task, many of
the current USR mode registers for the non-killed task and potentially a
load of other stuff (such as the stack for OS_ReadLine which transient
callbacks can execute inside) are all on the SVC stack. We need that
information in order to resume execution of the non-killed task, and the SVC
stack is thrown away when OS_Exit is called. And OS_Exit would bubble out to
the Wimp's exit handler, which itself exits by calling Wimp_Poll, and the
Wimp_Poll scheduler only knows how to resume a task which is sleeping at a
call to Wimp_Poll (or as a special case, Wimp_StartTask). There's a reason
why Wimp_Poll can only be called with an empty SVC stack. ;-)
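
To make the SVC stack point concrete, here is a minimal sketch in plain C.
It is a toy model, not Wimp or kernel source: svc_stack, model_task,
model_os_exit and state_recoverable are all names invented for illustration,
and the "stack" is just an array. It only shows why a task sleeping in
Wimp_Poll survives an OS_Exit while an interrupted task does not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: these names are invented, not Wimp or kernel symbols. */
enum { SVC_STACK_WORDS = 16 };

typedef struct {
    unsigned words[SVC_STACK_WORDS];
    size_t   depth;                 /* words currently in use */
} svc_stack;

typedef struct {
    bool   saved_in_task_block;     /* true while asleep inside Wimp_Poll */
    size_t svc_words_needed;        /* live state held only on the SVC stack */
} model_task;

/* SWI entry on behalf of the running task pushes state onto the SVC stack. */
void svc_push(svc_stack *s, unsigned word)
{
    assert(s->depth < SVC_STACK_WORDS);
    s->words[s->depth++] = word;
}

/* Stand-in for OS_Exit: the SVC stack is thrown away wholesale. */
void model_os_exit(svc_stack *s)
{
    s->depth = 0;
}

/* Can this task still be resumed from whatever state is left? */
bool state_recoverable(const model_task *t, const svc_stack *s)
{
    if (t->saved_in_task_block)
        return true;                        /* restartable from its task block */
    return s->depth >= t->svc_words_needed; /* otherwise it needs the SVC stack */
}
```

In this model, a task with saved_in_task_block set is recoverable before and
after model_os_exit(), whereas a task that still needs three words on the
SVC stack stops being recoverable the moment model_os_exit() runs.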
Also, 'pageintask' only affects the application space memory mapping, so it
alone is insufficient to achieve the required task switch. We need at least
to switch the environment handlers too (since these are stored in kernel
workspace and the kernel has no concept of separate environment handlers for
each application slot).
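
As a rough illustration of the difference, the sketch below models a kernel
that holds a single set of environment handlers shared by every task. It is
an assumption-laden toy in plain C: task_block, page_in_only, switch_to_task
and the two dummy exit handlers are hypothetical names, and page_in_only
merely stands in for what 'pageintask' does to the memory mapping.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: none of these are real RISC OS or Wimp symbols. */
enum { NUM_HANDLERS = 4 };

typedef void (*handler_fn)(void);

typedef struct {
    int        slot_id;                  /* stands in for the application slot */
    handler_fn handlers[NUM_HANDLERS];   /* this task's saved handlers */
} task_block;

/* Kernel-side state: one mapping and ONE set of handlers for all tasks. */
static int        mapped_slot;
static handler_fn kernel_handlers[NUM_HANDLERS];
static char       last_exit;

/* Two dummy exit handlers so the tasks can be told apart. */
void exit_a(void) { last_exit = 'a'; }
void exit_b(void) { last_exit = 'b'; }

/* Changes the memory mapping and nothing else. */
void page_in_only(const task_block *t)
{
    assert(t != NULL);
    mapped_slot = t->slot_id;
}

/* A fuller switch: save the outgoing handlers, remap, install the new ones. */
void switch_to_task(task_block *from, const task_block *to)
{
    assert(from != NULL && to != NULL);
    for (size_t i = 0; i < NUM_HANDLERS; i++)
        from->handlers[i] = kernel_handlers[i];
    page_in_only(to);
    for (size_t i = 0; i < NUM_HANDLERS; i++)
        kernel_handlers[i] = to->handlers[i];
}
```

After page_in_only() for task B, the model's kernel_handlers[0] still points
at task A's exit handler, so an exit at that point would run the wrong code
against B's memory. switch_to_task() is what leaves both pieces consistent.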
>> It could be argued that this would be a retrograde step, because if an
>> application has a rogue exit handler which refuses to terminate while
>> the task is not the active task, then you would no longer be able to kill
>> it after such a change.
> But if we're considering the case of a task with rogue exit handler you
> won't be able to kill that task when it happens to be the current task.
> And that's the situation today since RISC OS 3.5. So why are we seeing
> this as an argument for killing the not-current task ?
It's true that if the current task has a rogue exit handler then you've
always been stuffed. (Though as part of my suggested change, I think it
would be good to enable the user to override a rogue exit handler for any task,
including the current one.)
What I'm trying to say is that adding a new way for some rogue code to
crash the machine is not a good idea, and the user should at least retain an
option to force a non-current task to be killed (as we have had since
RISC OS 3.5) if they absolutely demand it.
If this sounds far-fetched, imagine an application which installs and
removes some type of handler around calls to Wimp_Poll, and has an exit
handler which removes the handler. Now imagine that the exit handler has a
bug that means it goes into an infinite loop if the handler isn't installed
(maybe it used a non-X SWI to deregister the handler and an error is
generated, which would then call the exit handler again). Now, if the exit
handler is invoked when another task is active, then from the point of view
of the dying task, it's as though OS_Exit was called inside Wimp_Poll -
provoking the error because the handler was removed for the duration of
Wimp_Poll. The error could not have been provoked under any earlier Wimp,
because as far as the task is concerned, Wimp_Poll simply never returned and
the exit handler was never called.
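
The scenario can be sketched as a small simulation. This is plain C with
invented names (app_state, deregister_handler, run_exit_handler), not real
SWI calls; the boolean return stands in for the error a non-X SWI would
raise, and max_attempts is an artificial cap so the "infinite" loop
terminates in the model.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: nothing here is a real RISC OS call. */
typedef struct {
    bool handler_installed;   /* false for the duration of Wimp_Poll */
} app_state;

/* Stands in for a non-X SWI: false means "an error would be generated". */
bool deregister_handler(app_state *app)
{
    if (!app->handler_installed)
        return false;
    app->handler_installed = false;
    return true;
}

/*
 * The buggy exit handler: a failed deregistration raises an error, and the
 * error path enters the exit handler again. Returns the number of
 * invocations it took to exit cleanly, or -1 if the cap was reached.
 */
int run_exit_handler(app_state *app, int max_attempts)
{
    assert(max_attempts > 0);
    for (int attempt = 1; attempt <= max_attempts; attempt++) {
        if (deregister_handler(app))
            return attempt;   /* handler removed, task can exit */
        /* error raised: the exit handler is called again */
    }
    return -1;                /* never got out */
}
```

With handler_installed true (the task was killed while running its own
code), the model exits on the first attempt. With it false (the task was
sitting inside Wimp_Poll), every attempt fails and the cap is hit.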
> I'm not sure if I fully understand your suggestion. Is it something like
> the first Alt-Break for a task gets its Exit handler called, and a 2nd
> Alt-Break the Wimp Exit handler gets called and the task gets killed
> straightaway ?
Yes, that's pretty much it. Maybe the user interface could be different for
the second case - something like "The application is not responding, do you
want to force a quit" as is commonly seen on other OSes. This could even be
triggered automatically from a CallAfter if the exit handler hadn't
completed within 5 seconds, say.
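
One possible shape for that policy is sketched below as a small state
machine. It is only an illustration in plain C: the type and function names
are hypothetical, times are assumed to be in centiseconds, on_timer stands
in for whatever a CallAfter would eventually trigger, and treating a second
Alt-Break as an immediate forced kill is just one of the options discussed.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: names and structure are assumptions, not Wimp code. */
#define EXIT_TIMEOUT_CS 500u   /* 5 seconds, in centiseconds */

typedef enum { TASK_RUNNING, TASK_EXIT_REQUESTED, TASK_GONE } task_state;

typedef enum {
    ACTION_NONE,
    ACTION_CALL_EXIT_HANDLER,   /* first Alt-Break: polite request */
    ACTION_OFFER_FORCE_QUIT,    /* "not responding, force a quit?" */
    ACTION_FORCE_KILL           /* bypass the task's exit handler */
} watchdog_action;

typedef struct {
    task_state state;
    unsigned   requested_at_cs; /* when the exit handler was invoked */
} watchdog_task;

watchdog_action on_alt_break(watchdog_task *t, unsigned now_cs)
{
    assert(t != NULL);
    switch (t->state) {
    case TASK_RUNNING:
        t->state = TASK_EXIT_REQUESTED;
        t->requested_at_cs = now_cs;
        return ACTION_CALL_EXIT_HANDLER;
    case TASK_EXIT_REQUESTED:
        t->state = TASK_GONE;
        return ACTION_FORCE_KILL;
    default:
        return ACTION_NONE;
    }
}

/* Called from the timer; unsigned subtraction copes with counter wrap. */
watchdog_action on_timer(const watchdog_task *t, unsigned now_cs)
{
    assert(t != NULL);
    if (t->state == TASK_EXIT_REQUESTED &&
        now_cs - t->requested_at_cs >= EXIT_TIMEOUT_CS)
        return ACTION_OFFER_FORCE_QUIT;
    return ACTION_NONE;
}

/* The exit handler finished on its own, so nothing more is needed. */
void on_exit_handler_done(watchdog_task *t)
{
    assert(t != NULL);
    t->state = TASK_GONE;
}
```

The point of the design is that a well-behaved exit handler always gets
first refusal, while a rogue one can delay the kill by at most the timeout
plus however long the user takes to answer the prompt.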