I7 Line spacing rules EXPLAINED (with quick reference chart)

otistdog · January 22, 2024, 10:10pm

Since the rules around “paragraph control” are an enduring mystery and the subject of frequent posts on the forum, I spent some time analyzing the system. This post should be everything you need to know to get the line spacing behavior that you seek without unwanted surprises.

(Note that the following was derived via testing in 6M62, but 10.1 should behave largely the same.)

Reasons that line breaks are generated

There are three sources of line breaks in text output:

1. invocation of paragraph control text substitutions, per logic defined in the Standard Rules
2. automatic injection after text segments ending with ./!/? (or grammatical equivalent), per logic defined in I7 compiler code
3. automatic injection before evaluation of a rule (standalone or within a rulebook) after a say statement lacking certain substitutions has occurred, per logic defined in I7 template code

It is a complex system that is difficult to explain briefly. However, if an author is more interested in what the system does instead of why it does those things, then the rules can be laid out in a fairly compact way – which is what’s done in the table at the end of this post.

Core concepts

Text segments

The most important concept to understand is the idea of a text segment. Every say statement is composed of one or more text segments. A new segment is generated for each contiguous run of one or more alphanumeric and/or white space characters within a text (such runs are called strings here), and for each single substitution within the text.

Automatic punctuation line breaks

When the I7 compiler is translating a say statement into I6, it looks at the end of every text segment being created for a string. If the last character(s) of the string are sentence-ending punctuation (i.e. period, question mark or exclamation mark – or one of these followed by a close quote), then the I7 compiler adds a line break at the end of the segment… unless it is overridden by a paragraph control phrase, as described next.
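A minimal sketch of the behavior just described, assuming an otherwise empty project. The room name and the texts are made up for illustration.

```inform7
The Lab is a room.

When play begins:
	[Ends with a period, so an automatic line break follows this segment.]
	say "The first statement ends with a period.";
	[No sentence-ending punctuation, so no automatic line break is added.]
	say "The second statement trails off";
	[This text therefore continues on the same line as the previous one.]
	say " and the third picks up where it left off."
```

The second and third texts should appear together on one line, because only the first and third strings end with sentence-ending punctuation.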

Paragraph control substitutions

Some substitutions are designated as being for paragraph control. These are defined in “Section SR5/1/5 - Saying - Paragraph control” of the Standard Rules. They are: [line break] (LB), [no line break] (NLB), [run paragraph on] (RPO), [paragraph break] (PB), [conditional paragraph break] (CPB), [command clarification break] (CCB) and [run paragraph on with special look spacing] (RPOWSLS).

The to say... phrases for these substitutions are all specified in such a way that they cancel an automatic punctuation line break for any text segment that precedes them within the same say statement. They have no effect on a segment at the end of a previous say statement.
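As a small illustration of that cancellation (the room and texts are invented for the example), a paragraph control substitution placed directly after the final period keeps the next say statement on the same line:

```inform7
The Lab is a room.

When play begins:
	[The period would normally trigger an automatic line break, but the
	paragraph control substitution that follows it cancels that break.]
	say "The lights flicker.[run paragraph on]";
	[Printed on the same line, directly after the previous sentence.]
	say " Then they go out."
```

Moving [run paragraph on] to the start of the second say statement would not have the same effect, since the substitution only cancels a break for a segment that precedes it within the same say statement.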

Other substitutions

No other substitution is capable of canceling an automatic punctuation line break. This includes those related to conditions ([if], [otherwise], [else], [end if]), those related to [one of] constructions, etc.
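A hedged sketch of what this means in practice, with invented texts. Going by the description above, the first say statement should break the line after "shut." even though more text may follow, while the second shows one common way to sidestep the problem by keeping the period out of segment-final position.

```inform7
The Lab is a room.

When play begins:
	[The string segment "The door is shut." ends with a period and is
	followed by a substitution that is not for paragraph control, so its
	automatic line break is not cancelled.]
	say "The door is shut.[if the player is in the Lab] A draft creeps under it.[end if]";
	[Here no string segment ends with a period until the very last one.]
	say "The door is shut[if the player is in the Lab]. A draft creeps under it[end if]."
```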

Author-supplied substitutions can be set up to cancel automatic punctuation line breaks, but only if they do not accept parameters.
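The sketch below assumes that the undocumented `-- running on` designation, which the Standard Rules use for their own paragraph control phrases, is what marks a parameterless say phrase in this way. The phrase name "dramatic pause" and its I6 body are hypothetical and chosen only for illustration.

```inform7
The Lab is a room.

[A hypothetical author-defined substitution. The "-- running on" designation
is assumed to make it cancel a preceding automatic punctuation line break.]
To say dramatic pause -- running on:
	(- print " . . . "; -).

When play begins:
	[Under that assumption, no line break is printed after "hesitates."]
	say "She hesitates.[dramatic pause]Then she answers."
```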

The `say__p` and `PARA_CONTENTEXPECTED` flags

The paragraph control system tracks many boolean flags. The most important of these is say__p, which takes the form of an I6 global variable. Every say statement at the I7 level sets the say__p flag as its first effect – prior to any code related to text segments. This occurs even for say statements that include no text at all (such as say no line break;).

A significant secondary flag is called PARA_CONTENTEXPECTED. At the start of each text segment (whether string or substitution), a routine is run that checks the state of PARA_CONTENTEXPECTED. If it sees this flag set, the routine will set say__p and clear PARA_CONTENTEXPECTED.

Many paragraph control phrases clear say__p. Some paragraph control phrases clear say__p but also set PARA_CONTENTEXPECTED. If one of these occurs as the last segment of a say statement, say__p will be clear at the end of that say statement.

Pre-rule and inter-rule line breaks

Whenever a rule is about to be processed (either standalone or as part of a rulebook), then the state of say__p is checked. If the flag is set, then a line break is printed and the flag is cleared. Line breaks generated in this manner are here called rulebook breaks.

Rulebook breaks do not run the routine that checks the state of PARA_CONTENTEXPECTED. While processing rules, the first generated rulebook break will clear say__p, and it cannot be set again unless one of the rules executes a say statement.
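To make the mechanism concrete, here is a minimal sketch using two every turn rules. The rule names and texts are invented for the example.

```inform7
The Lab is a room.

[First rule: the say statement sets say__p and, lacking final punctuation,
prints no line break of its own.]
Every turn (this is the tick rule):
	say "The clock ticks".

[Before this rule is processed, say__p is still set, so a rulebook break is
expected to separate the two texts.]
Every turn (this is the tock rule):
	say " and tocks."
```

Going by the summary table at the end of this post, ending the first text with [run paragraph on] should suppress that rulebook break and let the two texts join on one line.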

EDIT: In trying to simplify this section, I went a bit too far. The default behavior for rulebooks differs significantly depending on whether the rulebook has a parameter, i.e. whether it is “based” on something other than an action (the implicit default) or explicitly nothing. (See WWI 19.9 Basis of a rulebook for more.) For a rulebook based on any kind other than these two, the default is for all rulebook breaks to be skipped.

Summary table

This table tries to condense all of the above into a visual quick reference:

                TABLE 1: AUTHOR-VISIBLE EFFECTS OF PHRASES

                                      LB   NLB  RPO  PB   CPB  CCB  RPOWSLS  other
unconditional new_line?               +    -    -    +    -    +    -        -
conditional new_line?                 -    -    -    +p   +p   -    -        -
overrides prev segment punct break?   +    +    +    +    +    +    +        -
suppresses rulebook breaks?           -    -    +    Lp   Lp   +    +        -


EFFECT KEY:

    +    = always
    -    = never
    L    = only if occurring at end of most recent say statement
    p    = effect applies only when say__p is set at start of segment
    CCB  = [command clarification break]

Special credit to @neroden, who outlined the idea of text segments in Nathanael Nerode’s Cookbook (https://raw.githubusercontent.com/i7/extensions/10.1/Nathanael%20Nerode/Nathanael’s%20Cookbook-v6.i7x), and to @drpeterbatesuk, who worked out the effect of the undocumented -- running on designation for say phrases (Trouble with paragraph breaks - #8 by drpeterbatesuk). Any errors in the above are mine.

severedhand · January 23, 2024, 1:17am

Good, albeit hard to follow (because it’s inherently hard to follow) work.

In practice I have four kinds of line break problems that recur.

85% are elementary fixes when I see a missing line or an extra line during a big blob of text. One change and it’s fixed.

5% are caused by rulebooks producing extra lines, which can be tedious to work around.

5% are headachey things where many different mechanisms want to share a piece of text, and depending on which mechanism was used, the shared piece of text may appear correctly or not.

5% are black magic moments around the edges of weird stuff where no matter what I try, I end up with either no line break or two line breaks. At such times I try using Conditional Paragraph Break, and sometimes it’s the one magic trick that gets the spacing right. I can see from your column why that probably is: PB and CPB are identical except for that + / - difference in the first row.

-Wade

otistdog · January 23, 2024, 2:48am

Armed with the above information, some new options are available. For example, there’s nothing to stop one from redefining the phrases for following rulebooks:

To follow (RL - a rule), avoiding rulebook breaks:
	(- FollowRulebook({RL}, nothing, {phrase options}); -).

To follow (RL - a nothing based rule), avoiding rulebook breaks:
	(- FollowRulebook({RL}, nothing, {phrase options}); -).

To follow (RL - value of kind K based rule producing a value) for (V - K), including rulebook breaks:
	(- FollowRulebook({RL}, {V}, (~~{phrase options})); -).

These are backwards-compatible with all Standard Rules. If you write a text-emitting rulebook that produces a value and just want it to behave like a “regular” rulebook with respect to line breaks, you can now say:

follow the myspecialrules rules, including rulebook breaks;

Want your “regular” rulebook that doesn’t produce anything to stop emitting stray line breaks when it is called? Just say:

follow the myquietrules rules, avoiding rulebook breaks;

zarf · January 23, 2024, 4:33am

I’ve used definitions like that when rulebook breaks were getting under my skin.

The element of the system that really gripes me:

Whenever a rule is followed (either standalone or as part of a rulebook), then the state of say__p is checked. If the flag is set, then a line break is printed and the flag is cleared. Line breaks generated in this manner are here called rulebook breaks.

We could surely design a logically equivalent system that doesn’t print a linebreak here but merely keeps track of how many line breaks it should print before the next say statement. (That is, delay rulebook breaks until the next say.)

Then (a) we would never print newlines at inopportune times (like when there is no Glk stream active); (b) you could always squash rulebook breaks at print time, rather than having to use customized forms of FollowRulebook.

Dannii · January 23, 2024, 4:47am

How does say__pc fit in? If you ever want to truly avoid all rulebook breaks you have to do something like this:

! Run the glk event handling rules (but disable rules debugging because it crashes if keyboard input events are pending)
@push debug_rules; @push say__p; @push say__pc;
debug_rules = false; ClearParagraphing(1);
FollowRulebook(GLK_EVENT_HANDLING_RB, Glk_Event_Struct_type, true);
@pull say__pc; @pull say__p; @pull debug_rules;

It would be nice if there could be just one value to push/pull, but that’s unlikely to be something that we could change. (Unless they’re single flags, in which case perhaps they could be combined into a bitfield. But Zarf’s idea of actually counting would be even better, and probably precludes a bitfield, unless we split a 32-bit word into parts.)

otistdog · January 23, 2024, 5:39am

say__pc is a bitmap that is used to track five flags:

PARA_COMPLETED
PARA_PROMPTSKIP
PARA_SUPPRESSPROMPTSKIP
PARA_NORULEBOOKBREAKS
PARA_CONTENTEXPECTED

There is an explanation of these in the template files, but based on an inspection of the actual template code, the explanation seems to be at least partly out-of-date.

Rulebook breaks are caused solely by routine RulebookParBreak(), which is a simple routine that conditionally calls DivideParagraphPoint() (aka DPP). DPP has the logic that tries to determine whether a line break is appropriate at that point in the text. It only prints a line break in response to say__p being set, as described above – the state of say__pc is altered by DPP but does not directly affect its choice.

otistdog · January 23, 2024, 6:17am

Yes, this approach would make a lot of sense to me. It seems like each say statement should:

1. process any pending line break requests
2. emit its own text, if any
3. make a request for zero to two line breaks to follow it before additional text

The default request in step 3 would be for zero following line breaks. The same compiler logic that currently checks for sentence-terminating punctuation could stay in place, but instead of injecting a line break it could inject a statement to indicate a request for two following line breaks.

Certain situations (like printing command clarifications or room names) would want a request for one following line break. This could be called [single break] (one conditional line break). The current [command clarification break] could be a synonym.

Authors would also want phrases to execute immediate, mid-say line breaks – to me, these would be [line break] (one unconditional line break) and [paragraph break] (two unconditional line breaks).

It would seem fine to keep [run paragraph on] and have it mean “change the request for following line breaks to zero,” but I think it would still function as desired without doing anything at the I6 level, simply by preventing the say statement from terminating with sentence-ending punctuation. (Likewise [no line break].)

Add logic to erase the current following lines request, and it is easy to cover the special situation of ensuring a blank line before the command prompt. The “special look spacing” case might be handled as easily as First carry out going: say single break.

I wanted to try to put a proof-of-concept of this together, but it depends on changes to compiler logic. It wouldn’t surprise me if this theoretical design had shortcomings when it came to actual application – particularly with respect to how it would interact with printing happening in template code or inclusions.

zarf · January 23, 2024, 6:22am

Yes, it will certainly be a case of “prove your code logically equivalent to the old system, then run the I7 test suite and watch logic start weeping.”

otistdog · January 23, 2024, 1:21pm

On the topic of say__pc, one thing that I’ve noticed is that the substitution [command clarification break] indirectly invokes the I6 routine ClearParagraphing(), which zeroes the say__pc global and therefore clears all of its bitmap flags.

It’s not clear to me why this behavior would be desirable, as it affects flags that are applicable to edge case behavior (the “going look break” and ensuring a blank line before the command prompt) that do not seem like they should be affected by that substitution. I have a suspicion that it is a vestigial call left over from an earlier era of the paragraph control system.

The following definition is an attempt to prevent any unexpected side effects:

To say command clarification break -- running on
	(documented at phs_clarifbreak):
	(- new_line; RunParagraphOn(); -).

It has not been extensively tested, but I have not seen problems in basic testing.

zarf · January 23, 2024, 7:41pm

I suspect that [command clarification break] is, in practice, always followed by action output. Any edge cases that come up would look weird anyway. (E.g., if the action prints nothing at all, neither success nor failure.)