-Format of Inform 6 Debugging Information Files
-
-Version 1.0
-
-0: Introduction
-
-This is a specification of the Version 1 format for the debugging information
-files emitted by the Inform 6 compiler. It replaces Version 0, which is
-documented in Section 12.5 of the Inform Technical Manual.
-
-1: Overview
-
-Debugging information files are written in XML and encoded in UTF-8. They
-therefore begin with the following declaration:
-
- <?xml version="1.0" encoding="UTF-8"?>
-
-Beyond the usual requirements for well-formed XML, the file adheres to the
-conventions that all numbers are written in decimal, all strings are
-case-sensitive, and all excerpts from binary files are Base64-encoded.
-
-2: The Top Level
-
-The root element is given by the tag <inform-story-file> with three attributes:
-the version of the debug file format being used, the name of the program that
-produced the file, and that program's version. For instance,
-
- <inform-story-file version="1.0" content-creator="Inform"
- content-creator-version="6.33">
- ...
- </inform-story-file>
-
-The elements from Sections 3--8 may appear in place of the ellipsis.
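-
-As a minimal sketch of how a tool might read the root element, the Python below
-uses the standard library's xml.etree.ElementTree. The function name read_root
-and the SAMPLE text are assumptions made for illustration; nothing here is
-prescribed by the format.
-
```python
import xml.etree.ElementTree as ET

# Hypothetical sample; a real file would carry child elements in the root.
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<inform-story-file version="1.0" content-creator="Inform"
  content-creator-version="6.33">
</inform-story-file>"""


def read_root(xml_bytes):
    """Parse a debug file; return (root, format version, creator, creator version)."""
    root = ET.fromstring(xml_bytes)
    if root.tag != "inform-story-file":
        raise ValueError("not an Inform debugging information file")
    return (
        root,
        root.get("version"),
        root.get("content-creator"),
        root.get("content-creator-version"),
    )
```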
-
-3: Story File Prefix
-
-The story file prefix contains a Base64 encoding of the story file's first bytes
-so that a debugging tool can easily check whether the story and the debug
-information file are mismatched. For example, the prefix for a Glulx story
-might appear as
-
- <story-file-prefix>
- R2x1bAADAQEACqEAAAwsAAAMLAAAAQAAAAAAPAAIo2Jc
- 6B2XSW5mbwABAAA2LjMyMC4zOAABMTIxMDE1wQAAMA==
- </story-file-prefix>
-
-The story file prefix is mandatory, but its length is unspecified. Version 6.33
-of the Inform compiler records 64 bytes, which seems sufficient.
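-
-The mismatch check described above might be sketched as follows. The helper
-names story_prefix and matches_story are hypothetical, and SAMPLE wraps this
-section's example prefix in an otherwise empty root element.
-
```python
import base64
import xml.etree.ElementTree as ET

# The example prefix from this section inside a hypothetical minimal file.
SAMPLE = b"""<inform-story-file version="1.0" content-creator="Inform"
  content-creator-version="6.33">
 <story-file-prefix>
  R2x1bAADAQEACqEAAAwsAAAMLAAAAQAAAAAAPAAIo2Jc
  6B2XSW5mbwABAAA2LjMyMC4zOAABMTIxMDE1wQAAMA==
 </story-file-prefix>
</inform-story-file>"""


def story_prefix(root):
    """Return the decoded prefix bytes recorded in the debug file."""
    text = root.findtext("story-file-prefix")
    if text is None:
        raise ValueError("mandatory <story-file-prefix> is missing")
    # The Base64 text may be wrapped over several lines; drop all whitespace.
    return base64.b64decode("".join(text.split()))


def matches_story(root, story_bytes):
    """True if the story file begins with the recorded prefix."""
    return story_bytes.startswith(story_prefix(root))
```
-
-Because the prefix length is unspecified, the sketch compares however many
-bytes the file records rather than assuming 64.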
-
-4: Story File Sections
-
-Story file sections partition the story file according to how the data will be
-used. For the Inform 6 compiler, this partitioning is the same as the one that
-the `z' flag prints.
-
-A record for a story file section gives a name for that section, its beginning
-address (inclusive), and its end address (exclusive):
-
- <story-file-section>
- <type>abbreviations table</type>
- <address>64</address>
- <end-address>128</end-address>
- </story-file-section>
-
-The names currently in use include those from Section 12.5 of the Inform
-Technical Manual:
-
- abbreviations table
- header extension (Z-code only)
- alphabets table (Z-code only)
- Unicode table (Z-code only)
- property defaults
- object tree
- common properties
- class numbers
- individual properties (Z-code only)
- global variables
- array space
- grammar table
- actions table
- parsing routines (Z-code only)
- adjectives table (Z-code only)
- dictionary
- code area
- strings area
-
-plus one addition for Z-code:
-
- abbreviations
-
-two additions for Glulx:
-
- memory layout id
- string decoding table
-
-and three additions for both targets:
-
- header
- identifier names
- zero padding
-
-Names may repeat; Glulx story files, for example, sometimes have two zero
-padding sections.
-
-A compiler that does not wish to subdivide the story file should emit one
-section for the entirety and give it the name
-
- story
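-
-One use of these records is to find which section holds a given address. The
-sketch below assumes hypothetical helper names (sections, section_at) and
-illustrative addresses; it relies only on the inclusive beginning and exclusive
-end described above.
-
```python
import xml.etree.ElementTree as ET

# Hypothetical sample with two sections; the addresses are illustrative only.
SAMPLE = b"""<inform-story-file version="1.0" content-creator="Inform"
  content-creator-version="6.33">
 <story-file-section>
  <type>header</type>
  <address>0</address>
  <end-address>64</end-address>
 </story-file-section>
 <story-file-section>
  <type>abbreviations table</type>
  <address>64</address>
  <end-address>128</end-address>
 </story-file-section>
</inform-story-file>"""


def sections(root):
    """Return (type, start, end) triples; end addresses are exclusive."""
    return [
        (s.findtext("type"), int(s.findtext("address")), int(s.findtext("end-address")))
        for s in root.iter("story-file-section")
    ]


def section_at(root, address):
    """Name of the first section containing address, or None if none does."""
    for name, start, end in sections(root):
        if start <= address < end:
            return name
    return None
```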
-
-5: Source Files
-
-Source files are encoded as in the example below. Each file has a unique index,
-which is used by other elements when referring to source code locations; these
-indices count from zero. The file's path is recorded in two forms, first as it
-was given to the compiler via a command-line argument or include directive but
-without any path abbreviations like `>' (the form suitable for presentation to a
-human) and second after resolution to an absolute path (the form suitable for
-loading the file contents). All paths are written according to the conventions
-of the host OS. The language is, at present, either "Inform 6" or "Inform 7".
-More languages may be added in the future.
-
- <source index="0">
- <given-path>example.inf</given-path>
- <resolved-path>/home/user/directory/example.inf</resolved-path>
- <language>Inform 6</language>
- </source>
-
-If the source file is known to appear in the story's Blorb, its chunk number
-will also be recorded:
-
- <source index="0">
- <given-path>example.inf</given-path>
- <resolved-path>/home/user/directory/example.inf</resolved-path>
- <language>Inform 6</language>
- <blorb-chunk-number>18</blorb-chunk-number>
- </source>
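-
-Since other elements refer to source files by index, a reader will probably
-want a lookup table keyed on that index. A minimal sketch follows; the function
-name source_files and the dictionary keys are assumptions, not part of the
-format.
-
```python
import xml.etree.ElementTree as ET

# This section's second example inside a hypothetical minimal file.
SAMPLE = b"""<inform-story-file version="1.0" content-creator="Inform"
  content-creator-version="6.33">
 <source index="0">
  <given-path>example.inf</given-path>
  <resolved-path>/home/user/directory/example.inf</resolved-path>
  <language>Inform 6</language>
  <blorb-chunk-number>18</blorb-chunk-number>
 </source>
</inform-story-file>"""


def source_files(root):
    """Map each source index to a dict of its recorded fields."""
    table = {}
    for src in root.iter("source"):
        chunk = src.findtext("blorb-chunk-number")
        table[int(src.get("index"))] = {
            "given_path": src.findtext("given-path"),
            "resolved_path": src.findtext("resolved-path"),
            "language": src.findtext("language"),
            # Only present when the file is known to appear in the Blorb.
            "blorb_chunk": int(chunk) if chunk is not None else None,
        }
    return table
```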
-
-6: Table Entries; Grammar Lines
-
-Table entries are data defined by particular parts of the source code, but
-without any corresponding identifiers. The <table-entry> element notes the
-entry's type, the address where it begins (inclusive), the address where it ends
-(exclusive), and the defining source code location(s), if any:
-
- <table-entry>
- <type>grammar line</type>
- <address>1004</address>
- <end-address>1030</end-address>
- <source-code-location>...</source-code-location>
- </table-entry>
-
-Version 6.33 of the Inform compiler emits <table-entry> tags only for grammar
-lines; these data are all located in the grammar table section.
-
-7: Named Values; Constants, Attributes, Properties, Actions, Fake Actions,
- Objects, Classes, Arrays, and Routines
-
-Records for named values store their identifier, their value, and the source
-code location(s) of their definition, if any. For instance,
-
- <constant>
- <identifier>MAX_SCORE</identifier>
- <value>40</value>
- <source-code-location>...</source-code-location>
- </constant>
-
-would represent a named constant. Attributes, properties, actions, fake
-actions, objects, arrays, and routines are also names for numbers, and differ
-only in their use; they are represented in the same format under the tags
-<attribute>, <property>, <action>, <fake-action>, <object>, <array>, and
-<routine>. (Unlike in Version 0 of the debug information format, fake actions
-are not recorded as both fake actions and actions.)
-
-The records for constants include some extra entries for the system constants
-tabulated in Section 12.2 of the Inform Technical Manual, even though these are
-not created by Constant directives. Entries for #undefed constants are also
-included, but necessarily without values.
-
-Some records for objects will represent class objects. In that case, they will
-be given with the tag <class> rather than <object> and include an additional
-child to indicate their class number:
-
- <class>
- <identifier>lamp</identifier>
- <class-number>5</class-number>
- <value>1560</value>
- <source-code-location>...</source-code-location>
- </class>
-
-Records for arrays also have extra children, which record their size, their
-element size, and the intended semantics for their zeroth element:
-
- <array>
- <identifier>route</identifier>
- <value>1500</value>
- <byte-count>20</byte-count>
- <bytes-per-element>4</bytes-per-element>
- <zeroth-element-holds-length>true</zeroth-element-holds-length>
- <source-code-location>...</source-code-location>
- </array>
-
-And finally, <routine> records contain an <address> and a <byte-count> element,
-along with any number of the <local-variable> and <sequence-point> elements,
-which are described in Sections 9 and 10. The address is provided because the
-identifier's value may be packed.
-
-Sometimes what would otherwise be a named value is in fact anonymous; unnamed
-objects, embedded routines, some replaced routines, veneer properties, and the
-Infix attribute are all examples. In such a case, the <identifier> subelement
-will carry the XML attribute
-
- artificial
-
-to indicate that the compiler is providing a sensible name of its own, which
-could be presented to a human, but is not actually an identifier. For instance:
-
- <routine>
- <identifier artificial="true">lantern.time_left</identifier>
- <value>1820</value>
- <byte-count>80</byte-count>
- <source-code-location>...</source-code-location>
- ...
- </routine>
-
-Artificial identifiers may contain characters, like the full stop in
-``lantern.time_left'', that would not be legal in the source language.
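-
-The records in this section share enough structure that one loop can read them
-all. The sketch below is an illustration only: named_values, NAMED_VALUE_TAGS,
-the identifier UNDEFINED_EXAMPLE, and the routine's address are hypothetical.
-
```python
import xml.etree.ElementTree as ET

# Tags used for named values in this section.
NAMED_VALUE_TAGS = ("constant", "attribute", "property", "action", "fake-action",
                    "object", "class", "array", "routine")

# Hypothetical sample: a constant, an #undefed constant, an artificial routine.
SAMPLE = b"""<inform-story-file version="1.0" content-creator="Inform"
  content-creator-version="6.33">
 <constant>
  <identifier>MAX_SCORE</identifier>
  <value>40</value>
 </constant>
 <constant>
  <identifier>UNDEFINED_EXAMPLE</identifier>
 </constant>
 <routine>
  <identifier artificial="true">lantern.time_left</identifier>
  <value>1820</value>
  <address>7280</address>
  <byte-count>80</byte-count>
 </routine>
</inform-story-file>"""


def named_values(root):
    """Yield (tag, name, value, artificial) for each named-value record."""
    for record in root:
        if record.tag not in NAMED_VALUE_TAGS:
            continue
        ident = record.find("identifier")
        value = record.findtext("value")
        yield (
            record.tag,
            ident.text,
            # #undefed constants are recorded without a value.
            int(value) if value is not None else None,
            # Artificial names are for display and are not real identifiers.
            ident.get("artificial") == "true",
        )
```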
-
-8: Global Variables
-
-Globals are similar to named values, except that they are not interpreted as a
-fixed value, but rather have an address where their value can be found. Their
-records therefore contain an <address> tag in place of the <value> tag, as in:
-
- <global-variable>
- <identifier>darkness_witnessed</identifier>
- <address>1520</address>
- <source-code-location>...</source-code-location>
- </global-variable>
-
-9: Local Variables
-
-The format for local variables mimics the format for global variables, except
-that a source code location is never included, and their memory locations are
-not given by address. For Z-code, locals are specified by index:
-
- <local-variable>
- <identifier>parameter</identifier>
- <index>1</index>
- </local-variable>
-
-whereas for Glulx they are specified by frame offset:
-
- <local-variable>
- <identifier>parameter</identifier>
- <frame-offset>4</frame-offset>
- </local-variable>
-
-If a local variable identifier is only in scope for part of a routine, its
-scope will be encoded as a beginning instruction address (inclusive) and an
-ending instruction address (exclusive):
-
- <local-variable>
- <identifier>rulebook</identifier>
- <index>0</index>
- <scope-address>1628</scope-address>
- <end-scope-address>1678</end-scope-address>
- </local-variable>
-
-Identifiers with noncontiguous scopes are recorded as one <local-variable>
-element per contiguous region. It is possible for the same identifier to map to
-different variables, so long as the corresponding scopes are disjoint.
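-
-A debugger could resolve identifiers at a given instruction address along the
-lines of the sketch below. It assumes Z-code style records (with <index>), and
-it assumes that a record with no scope elements is in scope for the whole
-routine; locals_in_scope and the sample routine are hypothetical.
-
```python
import xml.etree.ElementTree as ET

# Hypothetical Z-code routine record containing this section's examples.
SAMPLE = b"""<routine>
 <identifier>Example</identifier>
 <value>407</value>
 <address>1628</address>
 <byte-count>80</byte-count>
 <local-variable>
  <identifier>parameter</identifier>
  <index>1</index>
 </local-variable>
 <local-variable>
  <identifier>rulebook</identifier>
  <index>0</index>
  <scope-address>1628</scope-address>
  <end-scope-address>1678</end-scope-address>
 </local-variable>
</routine>"""


def locals_in_scope(routine, pc):
    """Map identifiers visible at instruction address pc to their index."""
    visible = {}
    for var in routine.iter("local-variable"):
        start = var.findtext("scope-address")
        end = var.findtext("end-scope-address")
        # Assumption: no scope elements means in scope throughout the routine,
        # and the two scope elements always appear together.
        if start is None or int(start) <= pc < int(end):
            visible[var.findtext("identifier")] = int(var.findtext("index"))
    return visible
```
-
-Because scopes for the same identifier are disjoint, at most one record per
-identifier can match a given address, so the dictionary never overwrites an
-entry.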
-
-10: Sequence Points
-
-Sequence points are stored as an instruction address and the corresponding
-single location in the source code:
-
- <sequence-point>
- <address>1628</address>
- <source-code-location>...</source-code-location>
- </sequence-point>
-
-The source code location will always be a single position, that is, a location
-whose endpoints coincide.
-
-Sequence points are defined as in Section 12.4 of the Inform Technical Manual,
-but with the further stipulation that labels do not influence their source code
-locations, as they did in Version 0 of the debug information format. For
-instance, in code like
-
- say__p = 1; ParaContent(); .L_Say59; .LSayX59;
- t_0 = 0;
-
-the sequence points are to be placed like this:
-
- <*> say__p = 1; <*> ParaContent(); .L_Say59; .LSayX59;
- <*> t_0 = 0;
-
-rather than like this:
-
- <*> say__p = 1; <*> ParaContent(); <*> .L_Say59; .LSayX59;
- t_0 = 0;
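-
-To map an instruction address back to a source line, a tool might sort the
-sequence points and search them. The sketch below takes the last sequence point
-at or before the address, which is a common debugger convention and an
-assumption here rather than a rule of the format; the function names and the
-second sample point are hypothetical.
-
```python
import bisect
import xml.etree.ElementTree as ET

# Hypothetical routine fragment holding two sequence points.
SAMPLE = b"""<routine>
 <sequence-point>
  <address>1628</address>
  <source-code-location>
   <file-index>0</file-index>
   <line>1024</line>
   <character>4</character>
   <file-position>44153</file-position>
  </source-code-location>
 </sequence-point>
 <sequence-point>
  <address>1640</address>
  <source-code-location>
   <file-index>0</file-index>
   <line>1025</line>
   <character>4</character>
   <file-position>44189</file-position>
  </source-code-location>
 </sequence-point>
</routine>"""


def sequence_points(routine):
    """Return (address, file index, line, character) tuples sorted by address."""
    points = []
    for sp in routine.iter("sequence-point"):
        loc = sp.find("source-code-location")
        points.append((
            int(sp.findtext("address")),
            int(loc.findtext("file-index")),
            int(loc.findtext("line")),
            int(loc.findtext("character")),
        ))
    return sorted(points)


def line_for_address(points, pc):
    """Return (file index, line) of the last sequence point at or before pc."""
    addresses = [p[0] for p in points]
    i = bisect.bisect_right(addresses, pc) - 1
    return None if i < 0 else (points[i][1], points[i][2])
```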
-
-11: Source Code Locations
-
-Most source code locations take the following format, which describes their
-file, the line and character number where they begin (inclusive), the line and
-character number where they end (exclusive), and the file positions (in bytes)
-corresponding to those endpoints:
-
- <source-code-location>
- <file-index>0</file-index>
- <line>1024</line>
- <character>4</character>
- <file-position>44153</file-position>
- <end-line>1025</end-line>
- <end-character>1</end-character>
- <end-file-position>44186</end-file-position>
- </source-code-location>
-
-Line numbers and character numbers begin at one, but file positions count from
-zero.
-
-In the special case where the endpoints coincide, as happens with sequence
-points, the end elements may be elided:
-
- <source-code-location>
- <file-index>0</file-index>
- <line>1024</line>
- <character>4</character>
- <file-position>44153</file-position>
- </source-code-location>
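-
-A reader can treat both forms uniformly by filling elided end elements from the
-starting point. The following is a sketch under that reading; the Location
-tuple and parse_location are hypothetical names.
-
```python
import xml.etree.ElementTree as ET
from collections import namedtuple

Location = namedtuple(
    "Location",
    "file_index line character file_position "
    "end_line end_character end_file_position",
)


def parse_location(elem):
    """Build a Location, filling elided end elements from the start point."""
    def number(tag, default=None):
        text = elem.findtext(tag)
        return int(text) if text is not None else default

    line = number("line")
    character = number("character")
    position = number("file-position")
    return Location(
        number("file-index"), line, character, position,
        number("end-line", line),
        number("end-character", character),
        number("end-file-position", position),
    )


# The two examples from this section.
FULL = ET.fromstring(b"""<source-code-location>
 <file-index>0</file-index>
 <line>1024</line>
 <character>4</character>
 <file-position>44153</file-position>
 <end-line>1025</end-line>
 <end-character>1</end-character>
 <end-file-position>44186</end-file-position>
</source-code-location>""")

POINT = ET.fromstring(b"""<source-code-location>
 <file-index>0</file-index>
 <line>1024</line>
 <character>4</character>
 <file-position>44153</file-position>
</source-code-location>""")
```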
-
-At the other extreme, sometimes definitions span several source files or appear
-in two different languages. The former case is dealt with by including multiple
-code location elements and indexing them to indicate order:
-
- <!-- First Part of Inform 6 Definition -->
- <source-code-location index="0">
- <!-- Assuming file 0 was given with the language "Inform 6" -->
- <file-index>0</file-index>
- <line>1024</line>
- <character>4</character>
- <file-position>44153</file-position>
- <end-line>1025</end-line>
- <end-character>1</end-character>
- <end-file-position>44186</end-file-position>
- </source-code-location>
- <!-- Second Part of Inform 6 Definition -->
- <source-code-location index="1">
- <!-- Assuming file 1 was given with the language "Inform 6" -->
- <file-index>1</file-index>
- <line>1</line>
- <character>1</character>
- <file-position>0</file-position>
- <end-line>3</end-line>
- <end-character>1</end-character>
- <end-file-position>59</end-file-position>
- </source-code-location>
-
-The latter case is also handled with multiple elements. Note that indexing is
-only used to indicate order among locations in the same language.
-
- <!-- Inform 7 Definition -->
- <source-code-location>
- <!-- Assuming file 2 was given with the language "Inform 7" -->
- <file-index>2</file-index>
- <line>12</line>
- <character>1</character>
- <file-position>308</file-position>
- <end-line>12</end-line>
- <end-character>113</end-character>
- <end-file-position>420</end-file-position>
- </source-code-location>
- <!-- Inform 6 Definition -->
- <source-code-location>
- <!-- Assuming file 0 was given with the language "Inform 6" -->
- <file-index>0</file-index>
- <line>1024</line>
- <character>4</character>
- <file-position>44153</file-position>
- <end-line>1025</end-line>
- <end-character>1</end-character>
- <end-file-position>44186</end-file-position>
- </source-code-location>
-
---
-This file is part of Inform.
-
-Inform is free software: you can redistribute it and/or modify it
-under the terms of the GNU General Public License as published by the
-Free Software Foundation, either version 3 of the License, or (at your
-option) any later version.
-
-Inform is distributed in the hope that it will be useful, but WITHOUT
-ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
-FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
-for more details.
-
-You should have received a copy of the GNU General Public License
-along with Inform. If not, see https://gnu.org/licenses/
\ No newline at end of file