Glk time and date functions

You don’t happen to remember the name of that Chrome extension, do you? I spent some time googling for it this morning, but didn’t turn up anything. The overall slowness of Quixe makes milliseconds almost, but not quite, good enough to answer most of my questions, but I’d sure appreciate the ability to go down to microseconds if that were available.

–Erik

You can always do what I do, and test Quixe on a really slow machine. :slight_smile:

(Or do what everybody else does, and test a loop of 100 operations, or a display screen ten times as large, or whatever is appropriate.)

Perhaps we could add a separate monotonic time interface (Linux, Mac, Windows) to Glk?

That would allow nanosecond resolution for profiling without the baggage of converting the arbitrary number to a wall clock time.

It’s called chrome.Interval, but you also have to run Chrome with a command-line argument. It’s not an easy replacement for Date, because you’d have to change Glk’s code.

But yes, the main way to do profiling is to perform each task hundreds or even thousands of times. That will also be more accurate (a single run can be skewed by garbage collection and the like).

See the “Some things to consider” section of calendar.perfplanet.com/2010/bul … enchmarks/
It also explains how to detect and then account for the minimum resolution.
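
As a rough illustration of the repeat-and-average idea (this is just a generic sketch in C, not anything from that article, and the task() workload is a made-up stand-in), the pattern looks like this:

#include <stdio.h>
#include <time.h>

#define REPS 1000                /* repeat the operation many times */

static volatile long sink;

static void task(void)           /* stand-in for the operation being profiled */
{
    long i;
    for (i = 0; i < 10000; i++)
        sink += i;
}

int main(void)
{
    clock_t start, end;
    int i;

    start = clock();
    for (i = 0; i < REPS; i++)
        task();
    end = clock();

    /* Average per-call time in milliseconds, computed from a coarse clock. */
    printf("%f ms per call\n",
           (double)(end - start) * 1000.0 / CLOCKS_PER_SEC / REPS);
    return 0;
}

The same approach works from inside a game with glk_current_time: wrap the loop in two time queries and divide the difference by the repeat count.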

I was imagining the finest-resolution return value to be completely divorced from the rest of it. Glk returns the real time by filling in an array the user code provides, so it would be trivial for the user code to allow an extra word or two of memory for any interpreter that might wish to use them.

I’m not asking for any code changes whatsoever here, just a line in the Glk spec that says “an interpreter can return the picosecond+(1000*nanosecond) count in this last array element if it wants to, but may utterly ignore it if it wishes.”

I’d have to make minor code changes to my extension to allow (and expose “for debugging only”) the extra array element, but I’m happy to do so for the one or two interpreters in the world which may actually make use of it. We really only need one 'terp per platform to implement such a thing anyway.

Again, I’m imagining 'terps that are good for debugging games – the kind that would print a call stack trace on a runtime error, which is exactly what a normal 'terp should never do.

I can’t imagine that nanoseconds would ever be useful for profiling IF. Even for the best JITs, which could compile to native code and use single clock cycles for VM operations, surely checking the time would be orders of magnitude slower.

And even if they somehow did get that fast, fractions of a microsecond could still be calculated by repeating tasks.

But that would require code changes; you’d have to change the prototype and the function call to reflect the storage space required for the additional return value.

If you compiled a library with those changes, and used it with a story file that wasn’t expecting the extra data, you’d overrun the arglist buffer and bad things would happen.

At least I think that’s the case; I haven’t ever needed to do much with the dispatch layer code.

Hm. Maybe I’m not thinking things through properly, but my present goal is to break down a long and complex algorithm (A*) to explore which parts of the code are likely to yield significant optimizations at the scale the algorithm actually runs. In other words, I might check to see whether storing a certain set of data in an I7 list (more convenient) or in an I6 array (less overhead) results in a significant speed difference given the kinds of situations the algorithm is likely to deal with. I’m reasonably sure that running the operation 100 times would show that storing the data in an I6 array is faster, but what I want to know is whether the difference when running it just once*, as when the game is actually played, is worth the added inconvenience of storing the data directly in I6. Is this not the right way to think about it?

–Erik

  • I’d repeat the actual operation a number of times to get an average; I’m not running it just once and then making a decision based on that single run.

EDITED to answer myself: Well, what I suppose folks are saying is that I should run each variant of the operation 100 times, then divide the results by 100 to arrive at my comparison. Which makes sense.

Pardon, I meant if an interpreter chose not to support it. The extra array element in the higher levels of code (i.e., game & extension) would remain unused.

Well, I think if profiling is what you’re after, a separate time call that consults a monotonic clock and returns a value in microseconds is a better way to go. (I agree with Dannii that nanosecond-level resolution doesn’t seem practical.)

It is pretty straightforward to do this on Windows using QueryPerformanceCounter; my laptop had a resolution around 2 microseconds. The problem (which David pointed out elsewhere) comes when you try to synchronize that counter time with the wall clock time. Windows can’t really do better than 1 millisecond in that case, and only then by making system-wide changes with a host of negative consequences.

However, for profiling purposes you don’t really care what the system time is. You can just query the API once to obtain the current microsecond counter, do some work, then query for the new value and compare the difference. (You will wind up measuring both the time to do the work and the time to query the microsecond counter, but you can do back-to-back queries of the counter to establish how long that takes and adjust the result accordingly.)
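
For what it’s worth, a minimal sketch of that measurement pattern on Windows (just a standalone illustration, not anything tied to a Glk library) would be:

#include <windows.h>
#include <stdio.h>

int main(void)
{
    LARGE_INTEGER freq, t0, t1, t2;
    LONGLONG overhead;
    double usec;

    QueryPerformanceFrequency(&freq);    /* counter ticks per second */

    /* Back-to-back queries to estimate the cost of reading the counter. */
    QueryPerformanceCounter(&t0);
    QueryPerformanceCounter(&t1);
    overhead = t1.QuadPart - t0.QuadPart;

    QueryPerformanceCounter(&t1);
    /* ... the work being profiled goes here ... */
    QueryPerformanceCounter(&t2);

    usec = (double)(t2.QuadPart - t1.QuadPart - overhead)
           * 1000000.0 / (double)freq.QuadPart;
    printf("elapsed: %.1f microseconds\n", usec);
    return 0;
}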

You could do this with the existing glktimeval_struct. You’d just need a new API function like void glk_counter_time(glktimeval_t *time); that would return the seconds / microseconds elapsed since some arbitrary (undefined) event, most likely system start or program start, but specifically not the Epoch. The resulting value wouldn’t be good for converting to a “real” date and time; it wouldn’t be good for anything but comparing to other counter time values. But it would be ideal for that purpose.
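
Just to make the shape of that concrete: glk_counter_time() doesn’t exist in any Glk library, but on a system with a POSIX monotonic clock it could be little more than a wrapper like the following (a sketch only; an older Mac would need mach_absolute_time instead of clock_gettime):

#include <time.h>
#include "glk.h"   /* glui32, glsi32, glktimeval_t */

/* Hypothetical call: fill the existing glktimeval_t from the monotonic
   clock, i.e. seconds/microseconds since some arbitrary starting point. */
void glk_counter_time(glktimeval_t *time)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    time->high_sec = 0;                      /* uptime won't overflow 32 bits of seconds */
    time->low_sec  = (glui32)ts.tv_sec;
    time->microsec = (glsi32)(ts.tv_nsec / 1000);
}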

You could achieve the same effect with a special Glk library that repurposed glk_local_time() to return the high performance counter value instead. It would be broken for regular use but would offer microsecond resolution for profiling purposes.

In lieu of an API update, I’ll add the code to Gargoyle and send builds to anyone interested (Ron and Erik).

Cool–thanks, Ben!

–Erik

Just to be clear: I do not plan to add further timer functions to the API. I think it is possible to measure performance of anything we need, using what we’ve got.

I guess my simple solution wasn’t so simple after all. Thanks for the clarification, Ben.

I’ll still happily take a modified Gargoyle if you have it done already! :sunglasses:

Would you be willing to reserve a single selector for an undocumented mystery call?

I whipped up the following bit of code to copy-paste into an existing WIP. In the RULES ALL output, it prepends how much time elapsed since the previous rule. Paste it, RELEASE FOR TESTING, and bring it up in the modified Gargoyle. Transcript below. (Has quirks?)


Include (-
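! Three-word arrays matching the glktimeval layout that glk_current_time
! fills in: -->0 is the high 32 bits of the seconds count, -->1 the low
! 32 bits, and -->2 the microseconds.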
Array countofsecondsnew --> 3;  
Array countofsecondsprev --> 3;  
Array countofsecondsdiff --> 3;  

[ DebugRulebooks subs parameter i;
	spaces(2*process_rulebook_count);
	print "[", (RulePrintingRule) subs;
	if (parameter) print " / on O", parameter;
	print "]^";
];

[ DB_Rule R N blocked;
	if (R==0) return;
	glk($0160, countofsecondsnew); !  glk_current_time(timeval);
	countofsecondsdiff-->0 = countofsecondsnew-->0 - countofsecondsprev-->0;
	countofsecondsdiff-->1 = countofsecondsnew-->1 - countofsecondsprev-->1;
	countofsecondsdiff-->2 = countofsecondsnew-->2 - countofsecondsprev-->2;
	countofsecondsprev-->0 = countofsecondsnew-->0;
	countofsecondsprev-->1 = countofsecondsnew-->1;
	countofsecondsprev-->2 = countofsecondsnew-->2;
	print "[", (countofsecondsdiff-->0), ":", (countofsecondsdiff-->1), ":", (countofsecondsdiff-->2), "] ";
	print "[Rule ~", (RulePrintingRule) R, "~ ";
	#ifdef NUMBERED_RULES; print "(", N, ") "; #endif;
	if (blocked == false) "applies.]";
	"does not apply.]";
];
-) instead of "Debugging" in "Rulebooks.i6t".

Transcript is longish. Not sure why I get an occasional negative value. Probably something silly I did.

From the code, it appears you are just subtracting prev_seconds from new_seconds, and prev_micro from new_micro. But if you have values like this in your countofseconds struct:

prev: 3494391 seconds 999999 micro
new: 3494392 seconds 000001 micro

You will calculate a difference of 1 second, -999998 micro. The difference should be 2 microseconds.
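
In C terms (ignoring the high 32 bits of the seconds count, which won’t change over an interval this short), the fix is to borrow across the fields; the same logic maps directly onto the three-word arrays in your code:

#include <stdio.h>

/* Difference of two (seconds, microseconds) pairs with a borrow. */
static void timeval_diff(long new_sec, long new_usec,
                         long prev_sec, long prev_usec,
                         long *out_sec, long *out_usec)
{
    *out_sec  = new_sec  - prev_sec;
    *out_usec = new_usec - prev_usec;
    if (*out_usec < 0) {
        *out_usec += 1000000;   /* borrow one second's worth of microseconds */
        *out_sec  -= 1;
    }
}

int main(void)
{
    long s, us;
    timeval_diff(3494392, 1, 3494391, 999999, &s, &us);
    printf("%ld sec %ld usec\n", s, us);   /* prints: 0 sec 2 usec */
    return 0;
}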

If you want to hack in functionality for testing, you might be better off with an undocumented mystery Glulx opcode. Or @debugtrap – I don’t think anybody’s ever used that.

As to whether I will reserve one… um, I think I’m going to stay hands-off here.

Well, I had the impression you handed out 256 call selectors to Carlos Sanchez for the asking. It doesn’t seem unreasonable to request one, and it would save me the trouble of maintaining separate builds.