Glk sound API plans

I had intended to get into the stylesheet question as my next Glk project. However, it occurs to me that none of the libraries I support have sound. So I can toss in new APIs without doing any hard work. :slight_smile:

By the same token, I don’t exactly know what the implementation issues are. So I’ll start with some questions.

My plan is to add two Glk calls:

schanid_t glk_schannel_create_ext(glui32 rock, glui32 volume);

This is just like glk_schannel_create(), except you can specify a start volume other than the default 0x10000. I don’t anticipate any implementation problems with this.

void glk_schannel_set_volume_ext(schanid_t chan, glui32 vol, glui32 duration, glui32 notify);

If duration and notify are zero, this is just like glk_schannel_set_volume().

If the duration is nonzero, the volume change occurs smoothly over the given number of milliseconds. (Specify 1000 for a one-second volume change.)

If the notify flag is nonzero, you’ll get an evtype_VolumeNotify when the volume change is complete (or shortly thereafter, since Glk events err on the side of late).

At most one volume change can be occurring on a sound channel at any time. If you call a volume-setting function while a previous volume change is in progress, the previous change is interrupted.


Open questions, for people who support sound-capable Glk libraries:

First, is this volume change stuff viable? (Both on the OS side and on the libmodplug side.) It’s clearly what audio I7 extensions need to stop being hacked-up monstrosities – no offense to Damusix, but there’s no good way to do volume changes now, and this would permit it. If it can actually be implemented.

If you interrupt a volume change, can I specify that the new volume change starts from wherever the volume was when it was interrupted? Can you implement that? The other options are to leave the behavior unspecified, or to specify that the volume jumps to the end of the interrupted volume change before starting its new change. The latter is mandating clicks, and the former leaves the possibility of clicks open (which means authors have to avoid the whole situation).
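To make the first option concrete, here is a minimal sketch of how a library might track a fade so that an interrupting change starts from the current level. Everything in it (fade_t, fade_volume_at, fade_begin, the millisecond clock passed in by the caller) is hypothetical and not part of Glk or any existing library; it assumes a simple linear ramp.

```c
#include <stdint.h>

/* Hypothetical per-channel fade state; not part of any Glk library. */
typedef struct {
    uint32_t from_vol;     /* volume when the fade began (0x10000 = full) */
    uint32_t to_vol;       /* target volume */
    uint32_t start_ms;     /* clock reading when the fade began */
    uint32_t duration_ms;  /* 0 means the change is immediate */
} fade_t;

/* Volume at time now_ms, linearly interpolated between from_vol and to_vol. */
uint32_t fade_volume_at(const fade_t *f, uint32_t now_ms)
{
    uint32_t elapsed = now_ms - f->start_ms;  /* unsigned, so clock wrap is harmless */
    if (f->duration_ms == 0 || elapsed >= f->duration_ms)
        return f->to_vol;

    int64_t delta = (int64_t)f->to_vol - (int64_t)f->from_vol;
    return (uint32_t)((int64_t)f->from_vol
                      + delta * (int64_t)elapsed / (int64_t)f->duration_ms);
}

/* Begin a new fade. Any fade in progress is interrupted at its current
 * level, so the audible volume never jumps. */
void fade_begin(fade_t *f, uint32_t now_ms, uint32_t to_vol, uint32_t duration_ms)
{
    f->from_vol = fade_volume_at(f, now_ms);
    f->to_vol = to_vol;
    f->start_ms = now_ms;
    f->duration_ms = duration_ms;
}
```

With this shape, a fade from full volume to zero that is interrupted halfway by a fade back up simply resumes from 0x8000. The "jump to the end first" option would amount to setting from_vol to the old to_vol instead.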

Are there any libraries that support sound but don’t support sound notification events? I’d like to tighten that up, and say that notification events (sound and volume) are a mandatory feature rather than optional. This would make I7 extension code much simpler.

EDIT-ADD: I also considered an API call for “destroy this sound channel, after reducing the volume to zero over N milliseconds”. Clearly this isn’t necessary; you can make it happen using a volume change with notification. However, it might be convenient. If not all libraries wind up supporting volume notifications, it might be necessary after all. But maybe those libraries couldn’t provide the feature at all?

As far as Windows Glk goes, this seems doable. In Windows Glk all the various sound format decoders (including libmodplug) are used to produce raw waveform data, which is then fed into DirectSound (the bit of Windows that plays the sounds) via a rolling buffer. There are two obvious ways of tackling this with DirectSound:

  • Repeatedly change the volume in small increments on the DirectSound buffers directly (as there is a SetVolume() call). This call is what’s used at present, but might that give rise to clicks as mentioned by the recent poster on this subject? Hopefully not.
  • Scale the raw waveform data as it goes into DirectSound. More complicated, but possibly allowing more control.
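As a rough illustration of the second option, here is a sketch of scaling signed 16-bit PCM in place while the gain moves toward a target a little on every sample, with the state carried from one buffer chunk to the next. The names (gain_ramp_t, scale_chunk) are made up for this example, and the sample format and 16.16 fixed-point gain are assumptions, not a description of what Windows Glk does.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical gain state carried from one buffer chunk to the next. */
typedef struct {
    int32_t gain;    /* current gain, 16.16 fixed point (0x10000 = unity) */
    int32_t target;  /* gain to reach */
    int32_t step;    /* largest gain change allowed per sample, > 0 */
} gain_ramp_t;

/* Scale n signed 16-bit samples in place, moving gain toward target. */
void scale_chunk(gain_ramp_t *r, int16_t *samples, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        /* Advance the gain by at most one step, never overshooting. */
        int32_t diff = r->target - r->gain;
        if (diff > r->step)
            r->gain += r->step;
        else if (diff < -r->step)
            r->gain -= r->step;
        else
            r->gain = r->target;

        /* Apply the gain and clamp to the 16-bit range. */
        int64_t v = (int64_t)samples[i] * r->gain / 65536;
        if (v > INT16_MAX) v = INT16_MAX;
        if (v < INT16_MIN) v = INT16_MIN;
        samples[i] = (int16_t)v;
    }
}
```

Because the gain changes per sample rather than per buffer, there is no sudden level jump at chunk boundaries. For interleaved stereo you would probably want to advance the gain once per frame rather than once per sample.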

Anyway, more investigation is needed, but it should be possible one way or the other. I should also have a look to see how Mike’s Windows TADS code handles fades. libmodplug seems to have some sort of volume related interface, but I’m not keen to use it here.

That seems okay.

Windows Glk supports the events, so I’m okay to make them mandatory. It’s nearly always better to make life harder for interpreter authors than for game authors.

Zarf,

I apologize in advance if this request is outside the scope of your post; I don’t know if this is the sort of thing that should / could be called via Glk or indeed if it is even feasible to implement on the interpreter end. (It is also possible that this feature already exists, but I just don’t know how to work it.)

It would be awesome if multiple sound files could be started simultaneously – i.e. played back in sync with one another. This would allow an author to use cross-fading between channels to implement effects (stereo panning being an obvious one, but pseudo-filtering and other special effects would also be possible). I’m not asking the various 'terp authors to implement this themselves, but if the sound libraries currently in use already have this capability, it would be nice to be able to access it from within the various IF languages via a Glk call.

The Windows TADS code just has a thread that periodically calls the DirectSound SetVolume() method, which suggests that what I was imagining for Windows Glk should be fine.

DavidK: Sounds good. I’ll email around the interpreter authors too, but it sounds like this plan will work.

It would be awesome if multiple sound files could be started simultaneously -- i.e. played back in sync with one another.

Good point. Thanks for bringing that up.

The problem of doing truly synchronized sound effects is a large problem, and the solutions generally require a heavy API. You want to be able to schedule samples for precise moments, know exactly when they will end, probably get regular callbacks to schedule new ones. I don’t want to get into that for Glk.

Your proposal covers a few simple cases. I think it’s definitely simple enough to handle; my only question is whether it offers enough to be worthwhile. Your examples are stereo panning, and – I think – sliding between a filtered and unfiltered form of a given sound? Like between outdoor footsteps and indoor, echoey ones?

The cost is that you have to have both variations running all the time (one at zero volume), because you can only start them together, not sync up a new one. My first feeling is that that’s a lot of overhead for a fairly limited trick.

Andrew,

Thank you for looking into this.

I don’t know anything about APIs so my naive thinking was: I request audio file sync, Zarf waves his hands and creates a Glk call with the appropriate parameters (based on whatever timing mechanism Glk uses) and adds it to the spec, interpreter authors wave their hands and hook this call into some existing code already contained in the sound library they’re using, and someone else wraps it all up in I7 phrases and releases it as an extension. Ok, I didn’t really believe it would be quite that simple but I figured it was worth a shot. (I also realize you’re wearing a minimum of two hats in this scenario.)

I agree, although a variety of effects could be bought for the same price.

As background, in the process of beta-testing an early version of Eliuk Blau’s Damusix, I began coding up a sense-passing extension compatible with it. Automatic volume changes when the player moves farther away from a sound source or closes an intervening door were relatively easy to implement, so I started to think about atmospheric effects. (I was specifically thinking of how Eric Eve’s “Nightfall” would feel with sound.) It’s funny you should mention dry vs. reverberant footsteps; it was the first effect that occurred to me. Since it would be normal to expect a break in the rhythm of one’s footfalls when going from an outdoor to an indoor location, this could be implemented by simply changing the file of the sound effect. But then I thought, “What if I wanted to implement a portable radio which plays continuous music?” I’d want the reverb to affect that sound as well.

The ability to crossfade a looped sound with another version of itself would allow for this and other effects. Putting the radio inside a box and then closing the box could result in not just a simple volume change, but a crossfade to a muffled (low pass filtered) version of the sound without interrupting the music. Games using relative directions could use panning to enhance the sense of location in space. As you point out, these effects would only work if both files were started at precisely the same time (with one playing at zero volume until the crossfade was initiated). Although it is extra overhead, Damusix offered 20 sound channels so I figured, why not try it? Unfortunately, there didn’t appear to be any way to control the start times to any usable degree of precision – even multiple replays of the same test script resulted in different timing offsets – so I scrapped the idea.

Since MOD formats are supported I thought that maybe they could be leveraged for this, because MODs play multiple samples using strict timing. However, based on what I’ve read about libModPlug, the library just takes the resources and renders the result to a buffer which is then streamed to the sound device; it doesn’t look like it’s possible to dynamically change the parameters of a MOD while it is playing using this library. (Please correct me if that’s wrong.) I’m not suggesting the library be changed – free, fast, and format-friendly are good things – it just won’t help with this proposal.

I admit that time spent on this by the various people involved would probably be better spent on something else. There’s no way of knowing whether or not authors will even make use of this feature or that players will embrace the use of sound in this way. (Although to be fair, we can’t find that out if the functionality doesn’t exist.) I’m just throwing it out there so it will be in the backs of your minds. Maybe at some point in the future a less painful solution will present itself.

Thanks again,

Hi,

I would like to point out that there is a general problem with volume and Glk: the same volume value produces a different actual loudness (in dB) in different Glk implementations.

I know that is quite out of the scope of Glk, as it depends so much on the different sound libraries used by the implementations. It obviously will also depend on the volume set on the speakers or the OS itself, but it is a problem for a programmer to find that with same OS settings and same speaker volume your game sounds louder or softer depending on interpreter.

I ran into this issue when programming distance-based sounds and cross-fades in Superglus. Once I had everything set up and tested with gargoyleGlk, I tested the game with winglk and it was almost impossible to hear anything except when the volume was at its maximum value (which for this game was 32768, if I recall correctly). Then I tried ZMPP, and the volume was quite different from the other two as well.

There is probably no good solution, but I wonder if anyone here can think of one. I don’t know how the different libraries measure volume, but there should be a way to standardize it.

Damusix already does this.

Damusix already does this, too… since 2008.

(Spanish Damusix Documentation - Sorry, the text is in Spanish)

EDIT:

Damusix also supports volume changes, global volume and Glk sound notifications.

Please “ear” (play) the Spanish adventure “El Anillo III”, the first I7 adventure to use Damusix for I7 (beta). The adventure has a “Random Sound Environment-er” that plays different real-time sounds depending on the location (or region [I7]) where the player is.

Download:
sites.google.com/site/johanilato … edirects=0

Info (English):
groups.google.com/group/rec.arts … 5d294bd594

Info (Spanish):
wiki.caad.es/El_anillo_3

Saludos! :wink:

I’ve been abusing the Glulx machine since 2008 in a piece of code of mine.

Although I had planned to release it in a more complete state, given the number of changes to the Glulx specification and the warm reception Andrew gives to suggestions and contributions from the community, I think it is worth releasing now, even unfinished.

It abuses the machine in several respects: graphics (complex animations), audio (real-time crossfades) and real-time events (DaCronox, also mine: a library that manages as many “virtual” Glk timers as needed).

I hope this demo can help with decision-making, as a use case.

I need to recompile it. I’ll post a link soon.

P.S.: Damusix is not a hack, and neither is this abusive code. They’re just the result of trying to use the current Glk efficiently.

P.S.2: Andrew, I congratulate you on and support your decision to work full-time on Glulx development. Best wishes.

As I said, I mean no offense to Damusix. You built it to work the only way that is currently possible: by requesting very rapid timer events and changing the volume at very short intervals. That’s horribly inefficient and ugly – even though there’s no alternative! Once this API goes in, you’ll be able to make the same features work with a much lower CPU load.

Creating a sound channel with a specific volume is trivial.

The other new API call, lowering the volume over a specified number of milliseconds, poses some difficulties. It’s not a feature that SDL_mixer provides. I’d need to use OS timers for that, which would be workable under Windows and Linux but would mean a complete rewrite of the way I handle timers in OS X to avoid clashing with the Glk timer functionality.

OpenAL may offer a way to accomplish the same thing out of the box, but I’m not quite ready to go there.

Given the small number of IF games that use sound and the potential availability of Damusix for the ones that do, I don’t welcome the idea of more sound API functions. It’s a pile of work with a very small payoff.

I’d like to see an OpenAL implementation rather than a DirectSound one, since this would provide portability. Or better yet, SDL.

An interpreter using SDL for display and sound… This has potential.

Don’t Mix_FadeOutMusic(), Mix_FadeInMusic(), Mix_FadeInChannel(), Mix_FadeInChannelTimed() and Mix_FadeOutChannel() do exactly that? I’ve used them for the timed cross-fade implementation in QTads, and they work OK.

They do allow a timed fade, but not bounded by a specific volume - only to full or zero.
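One workaround would be to step the channel volume yourself from a timer. As a sketch of the arithmetic only: SDL_mixer’s Mix_Volume() takes a level from 0 to MIX_MAX_VOLUME (128), so a bounded fade has at most 128 distinct steps. The helper names below are invented, the constant is a local copy so the snippet compiles without SDL, and the timer that would actually call Mix_Volume() is left out.

```c
#include <stdint.h>

#define SKETCH_MIX_MAX_VOLUME 128  /* local mirror of SDL_mixer's MIX_MAX_VOLUME */

/* Map a Glk volume (0x10000 = full) onto the 0..128 scale, rounding to nearest. */
int glk_to_mix_volume(uint32_t vol)
{
    if (vol >= 0x10000)
        return SKETCH_MIX_MAX_VOLUME;
    return (int)((vol * SKETCH_MIX_MAX_VOLUME + 0x8000) >> 16);
}

/* Milliseconds between one-unit volume steps for a fade between two
 * mixer levels; returns 0 if the levels are equal and no steps are needed. */
uint32_t fade_step_interval_ms(int from_mix, int to_mix, uint32_t duration_ms)
{
    int steps = from_mix > to_mix ? from_mix - to_mix : to_mix - from_mix;
    if (steps == 0)
        return 0;
    return duration_ms / (uint32_t)steps;
}
```

For example, a fade from 25% (32) to 75% (96) over 1280 ms works out to one step every 20 ms. For short fades the interval rounds down to 0, in which case a real implementation would need to take larger steps at some minimum interval.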

Damusix already does that :laughing: It does the following: to assign a sound to a channel …

  1. Channel is previously created.
  2. Damusix changes the channel volume.
  3. Eventually the sound plays.

Damusix stores in its kernel several pieces of information about the channels it manages, including volume, repetitions, sound notifications, etc. So the channels are always initialized with the specified volume at the time of the assignment.

I understand that having a formal Glk call to do this would be optimal, but I want to say that Damusix already does this by itself. :wink:

:smiley: Saludos!

EDITED:

OMG. Damusix already does this. :blush: Damusix’s fades can have arbitrary volumes… 25% to 80%, 100% to 50%, etc. You can even chain one fade after another, producing a lot of fun effects.

:stuck_out_tongue: See ya!

There’s also Mix_RegisterEffect() to do it manually, but I guess this is where it will get hairy as you’d need to modify the sound data directly. Your effects processor would need to modify the amplitude of each waveform chunk it receives, keeping track of where it left off previously. This approach is for the brave though, which is why I never bothered :stuck_out_tongue: It does produce the best results however.

As long as we’re talking about the sound API…

I would like to be able to specify the starting point as an optional parameter for glk_schannel_play_ext. By default, the sound would start at the beginning, otherwise, it would track to the time passed by the parameter. Concomitantly, glk_schannel_stop could return the stopping point, allowing for a sound to be restarted in the same place after stopping (in other words, providing a “pause” functionality).

Another related feature–likely (much?) more work to implement, and not central to anything I’d like to do–would be to allow the notify flag in glk_schannel_play_ext to specify when the sound notification event would be fired: a flag of 0 would request no notification (as it currently does), -1 could request a notification when the sound finishes playing, and a positive value would indicate the point at which the sound notification should be returned.

–Erik

Hi Eliuk! :slight_smile: I realize that Damusix already does all this. It rocks! I think you should just release it as-is; authors who want to use it can put the documentation through Google Translate or ask usage questions here.

From a philosophical perspective I thought that the motivation behind adding Glk API functions was to enable broadly new capabilities, rather than to make existing features more convenient and accessible. I see that focus slipping here and it’s not clear that the benefits to authors outweigh the hassle to library maintainers.

There are many more people creating extensions than working on Glk implementations; if it’s territory that extensions can cover, let them cover it.

This thread isn’t about Damusix, but I beg indulgence to add:

I agree with Ben–just release it now if the code is ready. People do their best to get by w/o reading documentation anyway! You could try putting the documentation up in wiki format at Google Code or IFWiki; maybe folks will be motivated to contribute to a group-mind translation.

–Erik