Glk/Glulx threads

As the demand for more intuitive games continues to grow, the demands on our poor virtual machines increase too.

This has led me to wonder why we couldn’t, and shouldn’t, introduce some form of threading into Glk/Glulx. Currently the majority of the processing must be squeezed into the brief moment after the player enters a command. Could we usefully use the time the player spends reading the response to their last command to do some forward planning?

On the VM front, what is stopping this from happening? What problems would there be in changing the behaviour of glk_select_poll() such that it returns all input events?

On the library front, what would usefully be served by threading? I never did any deep profiling of Fate (with its ridiculous start-up time) but I suspect that conversation systems in general are slow. This was the case with Alabaster too (which led to the accelerated functions). All conversation systems have some kind of algorithm to determine which “nodes” are accessible next, and that could quite feasibly be run while the player is reading the current node. I don’t know how the new I7 features are implemented, but the new relations or functional-programming features might also be slow if used a lot.

What I’m proposing isn’t actually threading in Glulx… more so in Glk. Glk would need threads, one for Glulx, and the other for the UI it manages. Further proposals for threading purely within Glulx are left for the reader.


While I would love to be able to process some NPC AI while the player’s cursor winks, I believe it is just the amount of work involved that stops your idea from coming to fruition. Interpreters are supposed to be simple enough that they can be ported everywhere, and threading is definitely not simple.

No, threading isn’t simple, but most languages these days come with it built in (excepting perhaps C/C++?) But if we all worked together we could add it to Garglk too.

Threading is possible in most languages (and I think it’s coming to JavaScript).

I suspect it wouldn’t be difficult to adapt most interpreters to work the way you suggest – with a UI thread working in the foreground as the game runs in the background. (For ease of implementation, you’d want to switch into a “locked” mode where the game cannot do any I/O at all, until an event comes in or the game triggers one.)

However, using this capability in a safe way could be rather tricky for the game author.

Web Workers are available in Firefox 3.5+, Safari 4+, and Chrome. They can be emulated with timers in IE, though we’d have to experiment to see if we can still have the benefits of processing while the player reads without noticeably locking the browser’s UI.

Good to hear you think this is feasible. Do you think the VM/Glk would need to enforce that, or could we just say that outputting data while running background tasks would be undefined?

Also, why isn’t this locking needed now with the events that can be polled for?

I see this as more of a library/standard rules kind of feature. A simple use of it would be a task queue, where tasks can be enqueued with a certain priority. Each task would be broken down into small steps, and once it determines that it has finished, it would remove itself from the queue. When input returns, the game can query the queue to determine whether a task it depends on has finished; if it hasn’t, that task would be run non-stop until it finishes, just as it would be without threading.
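
For concreteness, here is a rough sketch in C of the kind of queue I mean (a real version would live in the game’s own language, and every name here is invented for the example):

// A priority-ordered, singly-linked queue of tasks, each of which does its
// work in small steps. All names here are illustrative only.

#include <stddef.h>

typedef struct task {
    int priority;                 // higher-priority tasks run first
    int finished;                 // set by step() when the task is complete
    void (*step)(struct task *);  // performs one small slice of work
    struct task *next;
} task_t;

static task_t *queue = NULL;

// Insert a task into the queue, keeping it sorted by priority.
void task_enqueue(task_t *t)
{
    task_t **p = &queue;
    while (*p && (*p)->priority >= t->priority)
        p = &(*p)->next;
    t->next = *p;
    *p = t;
}

// Run one slice of the highest-priority task, removing it once finished.
// Returns 0 when there is nothing left to do.
int task_run_step(void)
{
    if (!queue)
        return 0;
    queue->step(queue);
    if (queue->finished)
        queue = queue->next;
    return 1;
}

// When input returns and the game depends on a task's result right now,
// run that task to completion and unlink it, just as it would have been
// run without threading.
void task_finish_now(task_t *t)
{
    task_t **p;
    while (!t->finished)
        t->step(t);
    for (p = &queue; *p; p = &(*p)->next) {
        if (*p == t) {
            *p = t->next;
            break;
        }
    }
}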

Designing a queue like that wouldn’t be hard at all. Adapting chat systems, planners or AIs to use it would be harder of course, but the difficulty of the task shouldn’t stop us from setting this system up.

Locking isn’t a problem now because the only pollable event is the timer event. You can compute that without looking at the UI implementation – it doesn’t matter what you’ve printed or what the user has done with the UI since the last glk_select started.

For any UI-related event, it does matter, and specifying that interaction is not worth the effort. (Printing to a window where line input is occurring, e.g.) So yes, the library would have to enforce it.

Please note that I consider this idea, cool as it might be, way down the priority list.

I have considered introducing threading into the Windows Glk implementation before, specifically to have one separate thread handling the computationally expensive task of decoding and scaling graphics images. (This was when looking at Alabaster performance.) This would mean that a Glk graphics call would only put a request on a queue and return straight away, with the image being shown some unspecified time later.

This wouldn’t need any modification to the Glk specification, and would avoid an interpreter pause when graphics draw. In the end I haven’t done this as other optimization work sped up the graphics drawing sufficiently for it to seem not worth the effort.

However, speaking as someone who writes multi-threaded C++ for a living, I think threading is just too difficult to expect game (or even I7 extension) authors to be able to cope with it. It is conceivable that I7, with its somewhat “functional programming” approach to things, could be made to parallelize its internal algorithms, but I suspect that will be scary hard, too: much harder than actually adding threading to the interpreter.

Well, we’ve been talking about having the UI and the VM in separate threads, not having multiple VM threads. VM multithreading is a much, much trickier problem, and I’m not touching it, no sir.

(Some Glk apps are already written with the UI and VM in separate threads. Or do specific UI tasks in threads, as in your image-decoding idea. However, the spec is written to make this unnecessary, and the VM can’t take advantage of it when it is the case. Dannii’s proposal is a simple extension to let the VM take advantage of it.)

Currently you could set up a timer, call glk_request_line_event(), do some processing, call glk_select_poll() to get the timer event, then call glk_cancel_line_event() and check the buffer for a command ending with a period; if it doesn’t end with one, start from the beginning again, recycling the buffer. Essentially the only thing special about evtype_LineInput is that it fires when you press return, rather than requiring you to manually check for a complete command ending with a period. We could write games now which use processor time after asking for line input; we’d just have to change how commands must be entered.
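
As a rough sketch of that trick (assuming a single text-buffer window mainwin, with do_some_planning() standing in for whatever background work the game wants to do):

#include "glk.h"

#define BUFLEN 256

static void do_some_planning(void)
{
    // hypothetical: one small slice of conversation planning, pathfinding, etc.
}

void poll_for_period_command(winid_t mainwin)
{
    char buf[BUFLEN];
    glui32 len = 0;
    event_t ev;

    glk_request_timer_events(100);                   // a timer event roughly every 100ms
    glk_request_line_event(mainwin, buf, BUFLEN, 0);

    while (1) {
        do_some_planning();

        glk_select_poll(&ev);                        // input events are never returned here
        if (ev.type != evtype_Timer)
            continue;

        // Cancel the pending line input so we can look at what has been typed so far.
        glk_cancel_line_event(mainwin, &ev);
        len = ev.val1;                               // number of characters entered so far

        if (len > 0 && buf[len - 1] == '.')
            break;                                   // a "complete" command ends with a period

        // Not complete yet: re-request line input, recycling the partial text.
        glk_request_line_event(mainwin, buf, BUFLEN, len);
    }

    glk_request_timer_events(0);                     // stop the timer
    // buf[0 .. len-1] now holds the entered command.
}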

The Glk spec already states that you can’t print to a window with a pending line input. Are there other potentially dangerous interactions?

To add on to what David said: I think you can assume that Glk libraries will use threads as needed to perform complex computational tasks.

Gargoyle currently uses separate threads for audio playback; OGGs are handed off to a library that spawns a new thread to decode, buffer, and play each sound. For that matter, UI events are generated in an external process with callbacks into the library as events occur.

Conceivably you could have a Glk call that would process an embedded script outside the VM, calculate the result on a separate thread, and then return the result as part of a new event type. The problem would be encapsulating the logic in a standard format and getting a meaningful result to fit into two 32-bit return values, one of which would have to identify which script / invocation was returning a value.

This could work somewhat like shaders for a GPU, but in the AI / pathfinding space instead of with 3D graphics. I have no idea how practical this is, though. One issue is that it could quickly become a mandatory feature, if games relied on it exclusively to calculate NPC movements or quips.

I’d like to keep things moving if I can.

It seems odd to me that the threading should happen in Glk rather than Glulx, but I guess that’s how it goes.
To use Quixe in a Worker thread I guess this would mean we would have Quixe+Glk in the Worker and GlkOte in the main thread (rather than Quixe/Glk+GlkOte as I had thought originally.)

It’s possible that different implementations may want to put the thread division in a slightly different place (in the dispatch layer, for example). I think that is fine, as it shouldn’t have any impact on the user. The precise place where this division occurs won’t be at a level which Glulx can act upon. Because of this my proposal is not specific about how it should be coded, just how it will be used.

My proposal:
A new event function: glk_select_poll_all(), which as its name suggests, will poll and return any waiting event.
Two Glk gestalt selectors: gestalt_PollAll and gestalt_SafeThreading. gestalt_PollAll only indicates that the function exists. gestalt_SafeThreading will return true only if it’s safe to do heavy calculations whilst there are waiting input events. Both gestalts must return true to use Glk threading. I thought about having only one gestalt selector for this, but it’s possible that glk_select_poll_all() could be useful without using the threading functionality. A short usage sketch follows.
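
To illustrate, here is a minimal sketch of how a game might use this; apart from glk_gestalt() and the standard event types, everything in it (the two selectors, glk_select_poll_all(), and planning_step()) is either part of this proposal or a placeholder invented for the example.

#include "glk.h"

// Proposed selectors and function; the numeric values are placeholders only.
#define gestalt_PollAll        0x1400
#define gestalt_SafeThreading  0x1401
extern void glk_select_poll_all(event_t *event);

// One slice of background work; returns nonzero while there is more to do.
extern int planning_step(void);

// Do background work until it finishes or an event arrives. Returns nonzero
// if *ev now holds an event that the caller must handle.
int think_while_player_reads(event_t *ev)
{
    // Only use the new call if the library advertises both capabilities.
    if (!glk_gestalt(gestalt_PollAll, 0) || !glk_gestalt(gestalt_SafeThreading, 0)) {
        while (planning_step())
            ;                                   // no support: do all the work up front
        return 0;
    }

    while (planning_step()) {
        glk_select_poll_all(ev);
        if (ev->type != evtype_None)
            return 1;                           // an event is waiting; resume the work later
    }
    return 0;
}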

Further questions:
Writing to a window with a waiting input request is already forbidden. Are there any other potentially dangerous interactions?
Will other IO systems (FyreVM) have problems with this proposal? Which really means… can they make their own comparable changes to allow concurrent threading?

The disadvantage I see to glk_select_poll_all() is that the interpreter is not able to take advantage of the platform-specific functions that park the calling thread in an idle state while waiting on input.

Forgive me for being dense, but I don’t see how to avoid busy-waiting in this scenario. I thought that was the reason that glk_select_poll() excluded input events in the first place: so authors would not be tempted to incorporate these loops in the game code, where they would be necessarily less efficient.

A different approach would be to give the Glulx interpreter a way to register callback functions, for instance a glk_select_notify(&function, glui32 type) call. Then you could set up the glk_select() loop as normal, and the Glk functions that generated that event type would pass it back to the interpreter directly instead of queuing it for later dispatch.

You would also need a glk_select_cancel() call to break the library out of a running input loop. This could be somewhat harder to do, as the Glk library would have to contrive a means of unblocking itself, having already called the OS function that suspended its thread until input was available. But Glk at least has direct access to OS functions, where the interpreter does not, so it would be easier to tackle the problem at the library level.

At that point, the glk_select() call would end prematurely. It would have to return some event type other than evtype_None, since that is disallowed; perhaps evtype_Cancel instead? The interpreter would see that event, realize it had computation results available, then do something about it before invoking glk_select() again.

I think this would give you the execution model you’re after (sketched in code below the list):

  1. interpreter registers a callback for compute events
  2. interpreter queues up some asynchronous data processing functions
  3. interpreter parks itself in a standard glk_select() loop
  4. library generates a compute event, notifies interpreter
  5. interpreter handles the compute event
  6. interpreter calls glk_select_cancel()
  7. loop ends prematurely
  8. interpreter regenerates game state
  9. interpreter queues up more asynchronous data processing functions
  10. interpreter returns to the glk_select() loop
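
In code, the model might look roughly like this; every Glk call and event type used here (glk_select_notify(), glk_select_cancel(), evtype_Compute, evtype_Cancel) is hypothetical, as are the game_* functions:

// Step numbers refer to the list above; nothing here exists in the current spec.

static void compute_callback(event_t *ev)
{
    game_record_compute_result(ev);   // step 5: handle the compute event
    glk_select_cancel();              // step 6: break the library out of glk_select()
}

void game_loop(void)
{
    event_t ev;

    glk_select_notify(&compute_callback, evtype_Compute);  // step 1
    game_queue_async_work();                                // step 2

    while (1) {
        glk_select(&ev);                                    // step 3: park here
        if (ev.type == evtype_Cancel) {                     // step 7: glk_select() returned early
            game_regenerate_state();                        // step 8
            game_queue_async_work();                        // step 9
            continue;                                       // step 10: back to glk_select()
        }
        game_handle_input(&ev);                             // ordinary input handling
    }
}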

This assumes that callbacks from Glk to the interpreter are possible or even meaningful. Perhaps that’s where the whole idea falls apart; I simply don’t know.

FyreVM’s input model is closer to the Z-machine’s than Glk’s: the game makes a system call to read a line (or character), and control doesn’t return to the game until the input is ready. Something entirely new would have to be designed in order to allow the game to do background processing while waiting for input.

The assumption I’m working with is that the same amount of computation will be needed whether threading is available or not. What this proposal would allow is for that computation to be performed while the user is reading the response to their last command, rather than directly after they enter a command. (This will probably mean that algorithms need to be altered slightly so that they can be pre-empted.) Unless the user has very fast reflexes, the computations would be finished before they enter a command; Glulx would then stop calling glk_select_poll_all() and call glk_select() instead, letting the library set the thread to idle and so on.

I’m not sure what you mean by a busy-waiting scenario. But if the Glk library can’t guarantee that its UI will remain responsive (letting the user enter commands etc.) while Glulx is performing calculations then it must set gestalt_SafeThreading to false. This could even be changed as the program runs… we could suggest that each time before using the threading functionality the gestalt be checked.

That’s a much more complicated proposal. I’m not sure if it’s possible either… if Glulx had interrupts, I think something like this might be.
But it would take a lot more to implement than what I’m suggesting. I think some Glk implementations are already set up in such a way that they would support this threading, so they’d only need to add the functions and gestalts.

What I am worried about are cases where this becomes the default game loop:

event_t event;

do
{
    partial_calculation();
    glk_select_poll_all(&event);
}
while (event.type == evtype_None);

Once the calculation finishes, the loop will continue aggressively polling the OS for events. Absent timers, UI events are unlikely to fire more than once every 10-20 ms, so this becomes a tight loop that spikes the CPU and makes the battery cry. I realize that this isn’t what you intend, but there’s nothing to stop careless game authors or extension writers from abusing this call.

Right now you can’t write a functional game that does this, because in order to actually receive UI events, you need to call glk_select() which will yield CPU time until an event comes in.

I would point you to the way sound notify events work, since I think they are a useful example of a long-running procedure that may end at any time, even during user input. However, I don’t think they work particularly well as it stands, at least not in Gargoyle.

In a multi-threaded library, a notification event can be generated in the background, can even be queued up for dispatch, but the thread that needs to return the event to the VM - the one that called glk_select() - is suspended until a UI event comes in. Once that occurs, the thread wakes up, processes the input, checks the event queue, sees the sound notification, and returns it to the VM.

To see where this breaks down, imagine a sequence of sound effects chained together, e.g. a voice-acted conversation. You don’t want to hold the player hostage, so you call glk_select() so that they can move around, examine things, or even initiate a different conversation - in technical terms, you are receiving and processing line events while the sounds play.

Because of the way the OS suspends the main thread, there will almost always be a delay between the end of one sound and the start of another. The delay is controlled by the user - if he presses a key, clicks the mouse, or resizes the window, the UI thread will unblock, the sound notification will be dispatched, and the VM will play the next one.

This isn’t really a big deal for background, ambient music where you can just fade out to silence and the user may not be consciously aware of when it starts or stops. But it would be really weird for NPC barks.

Hmmm. Even with callbacks / interrupts into the VM, you’d need to be able to unblock the suspended thread somehow. And if that can be done, it probably should be done with sound notify events today. In other words, it’s a library bug. :blush:

So given that glk_select() already returns all events of interest, and does it in a way that doesn’t lead to busy-waiting, it seems like the ideal solution would be to somehow pass off the computation to Glk and wait for the evtype_Compute event to tell you when it’s finished. No?

Glk does have the concept of an interrupt handler, which I’d forgotten about since Gargoyle ignores it.

Which in turn suggests a callback model is at least technically feasible. The interpreter could pass a function to the Glk call that took no arguments, returned no result, and just flipped a volatile bit to indicate that events were waiting.

Then, in the interpreter process space, you would be free to do as much internal processing as you desired, so long as you checked the status of that bit every so often. Once you saw it, or once you ran out of processing to do, you could break out of your processing loop and call glk_select() to fetch the event.

It requires some changes at the library level. You would need a call that would spawn a new thread and return immediately, so that the interpreter could get on with its business of calculating.

That new thread would be pretty basic: it would just immediately call the OS function that suspended it pending new UI events. When it unsuspended, it would queue the event it had retrieved, call the handler, and exit.

The library would need some locks in place around the event queue, and some way to kill off the spawned thread, if the interpreter finished its calculations before a UI event appeared and called glk_select().

You could probably get away with a single new Glk call:
glk_select_notify(void (*func)(void));

Calling it with a NULL value would kill off the current listener thread, if any, then return. Calling it with any other value would kill off the current listener thread and start up a new one. Make it illegal to call glk_select() while a glk_select_notify() call was running, just to be safe. Then your code becomes:

#include "glk.h"

volatile int events = 0;

void listener(void)
{
    events = 1;
}

void glk_main(void)
{
    // set up event listener
    glk_select_notify(&listener);

    // start the work loop
    int result;
    do
    {
        result = partial_calculation();
    }
    while (!events && !result);

    // work finished or event waiting
    glk_select_notify(NULL);

    // fetch next Glk event
    event_t event;
    glk_select(&event);
}

Seems doable to me.

I think you’re objecting to a problem which, although possible, is unlikely ever to occur. How many people do you know who have rewritten the main game loop? Who actually deals with Glk events at such a low level? It’s only the core library or extension writers. If this proposal gets accepted I’ll write a tasks extension, because I’m not proposing this expecting individual authors to figure out how to add support for it themselves; I’d want to give them a complete system to use. And if careless authors do make a silly mistake, then it will most likely be caught later by testers/players. Authors could do a lot of stupid things. Do you check whether they’re resizing huge images in a loop like that?

I have no doubt that an interrupt system could allow threading to be used safely. But again, it’s a huge change.

Every Glk interpreter deals with events at this low level. I’m actually less concerned about the Inform side, for the reasons you mention, but Inform is only one of many possible consumers for a Glk library.

Generally Glk disallows stupid things by declaring them illegal. And you don’t have to look very far in the Glk spec to find the concerns I’ve mentioned. In Section 1.1, after showing code for the most basic Glk program, we find this note:

I guess I also don’t see where new threads come in under your proposal. The library won’t spawn a new thread when glk_select_poll_all() executes, it will just poll the OS once for new events, repeating this whenever your extension relinquishes the CPU long enough to call it. And that decision about when and how often to poll becomes crucial: too frequently, and the efficiency hit is more pronounced; too infrequently, and the user may become frustrated at the unresponsiveness.

By adding a Glk call that explicitly spawns a thread, and a handler function that tracks new events, you also don’t add any new interpreter threads. But you do get full access to the main execution thread, plus a guarantee that a compliant library is listening for events on a secondary thread.

Plus a reasonable expectation that certain other functions (image resizing, sound decoding & playback) are taking place in independent threads, and that the library is responsible for synching up all those threads and maintaining a coherent event queue.

That seems to be about as threaded as you can make it without actually adding threads at the virtual machine level.

Is threading necessary? If the VM and UI were set up properly, a hidden interface could allow a “background” call to be made to the VM so it can continue working. When the player actually enters a command, a “stop working” call is sent to the VM, and once that has completed, the new command is sent to the VM. The state of the VM could have changed dramatically while the player was snoozing.
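
Something like the following three calls would be enough for that hidden interface (all names here are hypothetical):

// Hypothetical calls the UI side would make into the VM side.

// Called whenever the UI is idle: the VM performs one small slice of
// background work and returns promptly.
void vm_background_step(void);

// Called when the player has entered a command: the VM wraps up (or
// abandons) any background work in progress.
void vm_stop_background_work(void);

// Deliver the player's command once background work has stopped.
void vm_send_command(const char *command);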

You don’t need threading for this.

David C.

The number of interpreter/custom-glk-consuming authors must surely be even smaller than the number of Inform/Superglus/Snack/Jacl authors! They would also be more switched on technology-wise. I don’t think we need to be too worried about them making such mistakes.

> Generally Glk disallows stupid things by declaring them illegal. And you don’t have to look very far in the Glk spec to find the concerns I’ve mentioned. In Section 1.1, after showing code for the most basic Glk program, we find this note:

I don’t really see how this is relevant. Once you finish your computations you’d stop using glk_select_poll_all() and just call glk_select(). Also see Glk/Glulx threads - #9 by Dannii: if people wanted to misuse Glk, they could already!

You’re correct, this isn’t a proposal for new threads. But some (many?) Glk implementations already call their respective VMs in a thread. This proposal aims to make it easy to take advantage of that. For Glk implementations which are set up like that it won’t take long to add the new functions. Those that don’t call Glulx etc. in a new thread won’t need to do anything, as the gestalts will automatically return false.

A proposal to add new VM-level threads might be something to aim for in the future, but the advantage of my proposal is that it should be achievable with very little effort, very soon.