Trainz/refs/ACS Text Format


The ACS Text Format is an Auran-defined text format, designed to support international languages, used for storing config.txt files and other generic key/value data. At the simplest level, an ACS Text file is a standard text file encoded in UTF-8.

It was historically necessary that an ACS Text file start with a Unicode BOM (Byte Order Mark) sequence, which was often written by the exporter or importer modules in the process of making an asset. While that is still generally recommended, it has not been strictly required since the 2005 release of TRS2006 and the abandonment of the practice of bundling GMAX with Trainz.
  • A BOM will still turn up in the config.txt of many older downloaded assets, as an invisible code occupying the start of the first line of the file when the asset and its config.txt are opened for edit.
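
A reader that accepts files both with and without the marker might look like the following minimal Python sketch. It assumes the marker in question is the three-byte UTF-8 BOM (EF BB BF); the function name strip_utf8_bom and the sample line are illustrative only and are not part of any Trainz tool.

```python
import codecs


def strip_utf8_bom(raw: bytes) -> str:
    """Decode config.txt bytes as UTF-8, dropping a leading BOM if one is present."""
    if raw.startswith(codecs.BOM_UTF8):
        raw = raw[len(codecs.BOM_UTF8):]
    return raw.decode("utf-8")


# Hypothetical one-line config, once with and once without the BOM.
with_bom = codecs.BOM_UTF8 + b'kind "map"\n'
without_bom = b'kind "map"\n'

# Both inputs decode to the same text.
print(strip_utf8_bom(with_bom) == strip_utf8_bom(without_bom))
```

Python's built-in "utf-8-sig" codec performs the same stripping when passed as the encoding to open().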

Basic File Structure

In programmer jargon
  1. A 'key' (or 'keyword') is a specific human- and machine-readable identifier, unique or enumerated in nature, belonging to a set of legally specifiable identifiers. These keys are the lexical identifiers used to translate meaning from human to machine. If a term is not legal, the software has no way of interpreting it and giving it meaning.[note 1] Put simply, if a term is misspelled, the computer cannot match it and doesn't know what to do with the construct (line). This is a fault or error, which normally halts processing or, at minimum, produces a list documenting the mismatched term.
  2. A 'value': these are usually tightly defined types, from basic machine-dependent data elements to combinations of those into more complex data types. For example, a character is a single 8-bit byte in some coding schemes, but in Unicode encodings it may take more: one to four bytes in UTF-8, two or four in UTF-16. A string is a succession of characters in an array, so the same string can occupy more bytes in one encoding than in another.
  3. 'unordered' lists are lines of data which can be rearranged without adversely affecting meaning. You'll quickly see that in Trainz config files a tag or container can appear anywhere inside its proper container, the config itself being one such container that contains others. The meaning of the data is independent of the order in which it is listed and read.
  4. 'processing scope' means within the subprogram, subroutine or handler dealing with one specific case at that moment (i.e. some predicted and handled context, and so a particular 'process').
  5. "ACS Text files contain an unordered list of key-value pairs. In any given processing scope, each key must be unique." 
  • By this, the N3V programmers mean that some tag names are not unique to a particular Kind or container but are reused in several such contexts, each use being unique within its own context. The context handler software (the subroutine processing such a container and its values) knows which context it is in, and so does not confuse the L-value/R-value pair of one legal context with that of another. New users should bookmark and refer frequently to an Index of Tags and Containers table and the parent-child relationships therein.
  • For a concrete and common example: today the tag 'region' is legal in a config container (config file) ONLY IN a kind map asset (since ca. V2.5 - V2.7, TR06 era), so it is grouped inside KIND map with the other tags specific to, or needed to define, a map. Other config containers will be encountered that also carry region as a tag name, but these will have lower TBVs. Which is to say, if the config/asset is updated to a higher TBV, 'region' will generate a fault instead of a warning.
  • That is because older content used a now-illegal string value as the R-value, like 'United Kingdom', 'United States' or 'Australia', in virtually EVERY KIND OF ASSET of the day. In higher TBVs post-TR06 this creates a fault message, for the keyword is now only legal inside the map config container, where it needs a kuid pointing to a kind region asset. That asset defines things like base (center) latitude, longitude, hemisphere, water color, base altitude, default junction levers, and regional and period-specific lists of road vehicles, so that Italian or Russian cars from 2021 don't show up on a North American road of 1925 (Ooops!), so that winter, fall and spring are not all toast because "Michigan's map folder" is mislocated somewhere south of the equator (Oooops!), and so that sunrise and sunset reflect the upper United States rather than the mislocated region. Oh-my OOooops!

 

Technical geek-speech...
The tricksy phrases here are direct quotes from the N3V Wiki source page linked at the bottom of this page; the interpretations and definitions are ours, and we hope they make the geek-speak comprehensible, so that the lay person can internalize the meaning and master its implications.

On the 'programmer's' so called context-free grammar for lay readers:

  • The EBNF string '::=' reads as 'is defined as': it works like a generic assignment operator, assigning the value on the right side (the 'R-value') to the term (variable key or keyword) on the left (the 'L-value'). (It is used primarily because interpretations of the assignment and equality operators differ between major languages.[note 2])

In the below abstraction, the '|' (pipe character) is used to mean the Boolean 'OR' operator, so the first line below defining the delimiter (separation operator) of a general key can be translated:

The 'singlespace' delimiter is defined as either a space or a tab character in ASCII code. These can be aggregated (added up to a reasonable length) as the EBNF key '<whitespace>'. (In Trainz, assets opened for editing generally display with many spaces ('whitespace padding') between L-value and R-value parameters, such that the first character of each R-value aligns on the 41st column in the unpacked config.txt file.)

A key-value pair can be declared in Extended Backus-Naur Form (EBNF), a metasyntax for describing a context-free grammar, as follows:

  • Observe that the following chain, read top to bottom, defines the terms or syntax used by the lines which follow... the whole is interdependent.
<singlespace> ::= ' ' | <TAB> ;
<whitespace> ::= <singlespace> | { <singlespace> } ;
<line-start> ::= { <whitespace> | <EOL> } ;
<key-value-pair> ::= <line-start> <key> <whitespace> <value> <EOL> ;
<acs-text-file> ::= [ <unicode-bom> ] { <key-value-pair> } <line-start> ;
  • (I've been programming since 1976, and even my eyes glaze over reading and trying to understand that... so we write for you here in the Wikibook!)
  1. It is important to note that the <value> may span multiple lines in some cases. In Trainz, string values are set off by double-quotes (the " glyph) and container values by paired curly-brace glyphs, '{' and '}'; numeric and KUID values are written bare. Trainz values can also be undefined or NIL. Tags that show this are usually optional ones; a few are newer tags for which newer software supplies a default in older content, since no such option existed when that content was created.
  2. Translate '[' ... ']' as optional.
  3. N3V's standard, stated as <acs-text-file> ::= [ <unicode-bom> ] { <key-value-pair> } <line-start> ; says that a file:
    1. Begins with either a blank line, or (OPTIONALLY...) a line with a Unicode byte order marker[note 3]
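
The key-value-pair rule above can be sketched in Python for the simplest case, a pair that fits on one line. This is a hedged illustration, not N3V's parser: the names PAIR and split_pair are hypothetical, and values that span several lines (containers, multi-line strings) are deliberately out of scope.

```python
import re

# <line-start> <key> <whitespace> <value> <EOL>, restricted to single-line values.
# Leading padding is skipped, the key runs up to the first space or tab,
# and the value is whatever follows the separating whitespace.
PAIR = re.compile(r"^[ \t]*(?P<key>[^ \t]+)[ \t]+(?P<value>.*?)[ \t]*$")


def split_pair(line: str):
    """Return (key, value) for a one-line pair, or None if the line has no value."""
    match = PAIR.match(line.rstrip("\r\n"))
    if match is None:
        return None
    return match.group("key"), match.group("value")


print(split_pair("   size                                2"))
print(split_pair("   attachment-points"))
```

A line that holds only a key, such as a container name whose '{' follows on the next line, yields None here and would need the multi-line handling this sketch leaves out.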
In layperson terms

Auran/N3V have set up a system of data structures built from keys and values, in which some values are complex types which can include other simple and complex types.

The latter are what many languages, especially those derived from C, call data structures such as structs, unions, or arrays; in Trainz they are generally just lumped together as 'containers'. When you see a container containing a container, you are dealing with a structure, part of which is another structure type, such as arrays (e.g. passenger lists, product types) or bogeys (trucks, which specify wheels, x, y, z factors of mount points, type of mounting, etc.). All of that data is related, so it 'is grouped together': it is used together, and that's the way we humans think of it best too.

Permitted Keyname characters

This section refers to the creation of key names, or tags, in Trainz. The average user has nothing to do with such, but a Content Creator, particularly a script writer, can, does, and likely will.
  • Nonetheless, the prohibitions also extend to filenames, so a username that violates this standard is likely to fail on validation. But go ahead, if you like Fault messages!

A key (key name) is a sequence of Unicode characters with a maximum size of 511 bytes. Control characters (ASCII 0..31) and the space character (ASCII 32) are not permitted. Uppercase ASCII characters (ASCII 65..90) are not permitted. The open-brace character (ASCII 123) is not permitted as the first character of a key. The close-brace character (ASCII 125) is not permitted.

For future compatibility, it is strongly recommended that implementations limit keys to the following set of characters when constructing a config.txt compliant with this 'ACS Text file' standard. Implementations must accept all valid characters when parsing an ACS Text file.

  • 'a' .. 'z'
  • '0' .. '9'
  • '-'
  • '/'
  • '_'

Note these key omissions: '\', ':', ';', '`', '~', '@', '*', '#', '$', '%', '^', '&', '{', '[', ')', ']'
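
The key-name rules in this section can be collected into a small checker. The following Python sketch is only an illustration: the function names are hypothetical, and treating an empty key as invalid is an assumption, since the text does not say either way.

```python
import re

# The recommended "future compatible" character set: a-z, 0-9, '-', '/', '_'.
RECOMMENDED = re.compile(r"[a-z0-9\-/_]+")


def is_valid_key(key: str) -> bool:
    """Check the hard rules: size, control/space, uppercase ASCII, and braces."""
    if not key or len(key.encode("utf-8")) > 511:  # empty key rejected (assumption)
        return False
    if key[0] == "{" or "}" in key:
        return False
    # ord(c) <= 32 covers control characters and the space character.
    return not any(ord(c) <= 32 or "A" <= c <= "Z" for c in key)


def is_recommended_key(key: str) -> bool:
    """Check the stricter, recommended character set on top of the hard rules."""
    return is_valid_key(key) and RECOMMENDED.fullmatch(key) is not None


for candidate in ["category-region", "Category", "a{b", "{ab", "my key"]:
    print(candidate, is_valid_key(candidate), is_recommended_key(candidate))
```

A writer would apply is_recommended_key before emitting a tag, while a reader only needs is_valid_key, in line with the rule that parsers must accept every valid character.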

Values


Each value may include zero or more UTF-8 characters. Several types of value are available with unique encodings. Value types are automatically identified based on the tags and/or contents of the value; no type information is written into the file.

<value> ::= <null-value> | <numeric-value> | <numeric-array-value> | <string-value> | <kuid-value> | <container-value> ;
Warning:  Except in multiline unprocessed text block tags such as description and license, a string value may not contain a trailing whitespace character.
Such a value will produce both a warning and a fault message such as:

Error: No selection for tag 'category-region' in 'mocrossing'.

Warning: 'US ' is not a valid value for tag 'category-region' (This was written: "US ", single quotes added in error messages). This tag is now empty and a new value must be selected.
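
A pre-flight check for this mistake is easy to sketch. The Python below is a hedged illustration (has_trailing_whitespace is a hypothetical helper, not a Trainz function); it looks only at a single double-quoted value and does not know about the multiline tags that are exempt.

```python
def has_trailing_whitespace(quoted: str) -> bool:
    """True if a double-quoted value ends in a space or tab before the close-quote."""
    text = quoted.strip()
    if len(text) < 2 or text[0] != '"' or text[-1] != '"':
        raise ValueError("expected a double-quoted value")
    body = text[1:-1]
    return body != body.rstrip(" \t")


print(has_trailing_whitespace('"US "'))
print(has_trailing_whitespace('"US"'))
```

Running such a check over every quoted value before committing an asset would catch the 'US ' case quoted above before Trainz reports it.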




Null Value


A null value is literally a zero-length value.

<null-value> ::= [ <whitespace> ] ;

Numeric Values


A numeric value is an integer or decimal representation.

<number> ::= [ "-" ] <digit> { <digit> } [ "." <digit> { <digit> } ] ;
<numeric-value> ::= <number> [ <whitespace> ] ;
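
The number rule translates almost directly into a regular expression. This Python sketch assumes <digit> means the ASCII digits 0 to 9; the names NUMBER and is_numeric_value are illustrative.

```python
import re

# [ "-" ] <digit> { <digit> } [ "." <digit> { <digit> } ]
NUMBER = re.compile(r"-?[0-9]+(?:\.[0-9]+)?")


def is_numeric_value(text: str) -> bool:
    """Match <number> followed by optional trailing spaces or tabs."""
    return NUMBER.fullmatch(text.rstrip(" \t")) is not None


for sample in ["2", "-0.5", "10.25  ", ".5", "1.", "+1"]:
    print(repr(sample), is_numeric_value(sample))
```

Note that the grammar requires at least one digit on each side of the decimal point and has no provision for a leading plus sign or an exponent.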

Numeric Array Values


A numeric array value is a sequence of multiple numbers, separated by commas.

<array-separator> ::= [ <whitespace> ] "," [ <whitespace> ] ;
<numeric-array-value> ::= <number> <array-separator> { <number> <array-separator> } <number> [ <whitespace> ] ;
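
Read literally, the rule requires at least two numbers, with optional spaces or tabs around each comma. A minimal Python sketch under that reading follows; parse_numeric_array is a hypothetical name, and converting every element to float is a simplification made for the example.

```python
import re

NUMBER = r"-?[0-9]+(?:\.[0-9]+)?"
SEPARATOR = r"[ \t]*,[ \t]*"
# <number> <array-separator> { <number> <array-separator> } <number> [ <whitespace> ]
ARRAY = re.compile(rf"{NUMBER}(?:{SEPARATOR}{NUMBER})+[ \t]*")


def parse_numeric_array(text: str):
    """Return the numbers as a list of floats, or None if the text is not an array."""
    if ARRAY.fullmatch(text) is None:
        return None
    # float() ignores the padding left around each piece after the split.
    return [float(piece) for piece in text.split(",")]


print(parse_numeric_array("0, 1.5 ,-2"))
print(parse_numeric_array("5"))
```

A lone number is rejected here because it already matches the plain numeric value rule instead.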


String Values


A string value is a sequence of zero or more UTF-8 characters, surrounded by quotation marks (ASCII 34). The allowed occurrences of the double quote character (ASCII 34) ('"'), the colon (':'), the forward slash (ASCII 47) ('/'), and the backslash character (ASCII 92) ('\') are tightly restricted. All are forbidden as part of a string value except when that value is a Windows OS pathspec; technically, such parameters are Strings, not String Values, which are simpler by comparison.

Examples: Some road intersections can auto-reskin by selection of an enumerated 'string value numeric' that indexes the appropriate texture library asset. The term 'string value numeric' simply means it is a number field passed into the software as a string, because software for determining the type and size of a data element is more complicated than saying 'enter numeric data in between quotes'.

In short, it simplifies intake and communication of values.
 • Others may just take a name, as in signs and general buildings, and most often older Train Station models.
 • Conversely, some assets pass string values to control things far more complex, such as the mix of numeric and textual values used to set up the late Andi06's invisible Train Stations. There, numbers define platform passenger limits and starting numbers, and text defines station names, just as in every interactive station since TR2004, but also: platform height, platform curve, platform length (being invisible, the station has to be covered by a spline or blocky object that matches), and the number of tracks and their designations (in some). No one likes simulated passengers floating over the rails of a curving track, standing ankle deep in paving, or hovering 6-10 inches above the platform decks.


The real-world snippet below has several good examples. Firstly, notice that the string values are used to define VARIABLES in the TRAINZ data model, which can also be classed as values that the programmers and program code are concerned with. Their role is, in fact, to pass information to that code to affect operations, either in a digital model's appearance or, if the asset is in some way programmable, in its behavior.

load4

 {
   size                                2
   initial-count                       0
   
   attachment-points
   {
     0                                 "a.gen_goods2"
     1                                 "a.gen_goods9"
   }
   
   allowed-products
   {
     0                                 <kuid:57344:10003>
   }
   
   conflicts-with-queues
   {
     0                                 "load0"
     1                                 "load1"
     2                                 "load2 "
     3                                 "load3"
   }
 }
}


The quoted values in the load container and its subcontainers above are ALL STRING VALUES, and NONE are meant to follow the STRING processing rules. In fact, one of them (and a second in a related subcontainer elsewhere in this asset) is FAULTY and won't load in Trainz because it breaks the parsing rules of the string value processing routine... can you spot it? The answer is given in this NOTE[note 4]

Strings


Strings are distinct from String Values. Both are quoted values, but of a different nature: they differ in their context, in their expected length, and in the coding (or rules, algorithms, whatever) applied to them. The key differences are the length allowed and the prohibition on String Values enclosing newlines or trailing whitespace characters of any kind. Strings may continue over multiple lines and may include whitespace (including the newline or CR-LF ASCII EOL sequence) and trailing whitespace characters (spaces or tabs) before the close-quote. The delimiters are the enclosing quote marks.

<string-value> ::= <double-quote> { <string-character> } <double-quote> [ <whitespace> ] ;

String Values refinements


Trainz does not directly prohibit the colon, the forward slash and the backslash characters, but Windows does in file and folder names, so any pathspec containing one of the three has to refer to a directory hierarchy.[note 5] Conversely, Trainz won't allow certain characters in filespecs which Windows would be perfectly happy to use in folder or file names. For example, '/', '\', '(' and ')' (parentheses) can be part of usernames; when that is true, the folders opened for edit will never carry them in their OS folder names, but substitute an underscore '_'. When reading files, Trainz's CM also has fits if a filename includes the semi-colon ';', but will tolerate it as part of an intake (originating) folder name like: "File kuid2 56063 101000 1;v2-1;Boxcar traincar;40ftBoxcar_LehighValley1000_LARS-aRus" (just now plucked from an archive folder's name on this system.[note 6])

KUID Values


A KUID value is a single KUID in a valid KUID format.

<kuid-value> ::= <KUID> [ <whitespace> ] ;
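
This page does not define the KUID format itself, so the following Python sketch rests on assumptions: that a KUID is written either as <kuid:user:content> (as in the load4 snippet on this page) or as <kuid2:user:content:version>, and that the user number may be negative. The names KUID and parse_kuid are illustrative only.

```python
import re

# Assumed shapes: <kuid:A:B> and <kuid2:A:B:C>; A may be negative (assumption).
KUID = re.compile(
    r"<kuid:(-?[0-9]+):([0-9]+)>"
    r"|<kuid2:(-?[0-9]+):([0-9]+):([0-9]+)>"
)


def parse_kuid(text: str):
    """Return the numeric fields of a KUID as a tuple, or None if it does not match."""
    match = KUID.fullmatch(text.rstrip(" \t"))  # trailing whitespace is allowed
    if match is None:
        return None
    return tuple(int(group) for group in match.groups() if group is not None)


print(parse_kuid("<kuid:57344:10003>"))
print(parse_kuid("kuid:57344:10003"))
```

The angle brackets are part of the value, which is how a reader can tell a KUID from a numeric array or a string without any type information in the file.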

Container Values


A container value nests additional key-value pairs over multiple lines.

<container-value> ::= [ <EOL> ] "{" <EOL> { <key-value-pair> } <line-start> "}" ;
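
Nesting of this kind is naturally handled by recursion. The Python sketch below is a simplified illustration, not N3V's implementation: parse_container is a hypothetical name, values are kept as raw text, strings that span lines are not handled, and raising an error on a duplicate key is one possible way of enforcing the rule that keys are unique within a scope.

```python
def parse_container(lines, pos=0, nested=False):
    """Parse lines[pos:] into a dict; return (dict, index of the next unread line)."""
    result = {}
    pending = None  # a key seen alone on a line; its value may start on the next line

    def store(key, value):
        if key in result:
            raise ValueError(f"duplicate key in scope: {key}")
        result[key] = value

    while pos < len(lines):
        line = lines[pos].strip(" \t\r\n")
        pos += 1
        if not line:
            continue  # blank lines are part of <line-start>
        if line == "{":
            if pending is None:
                raise ValueError("'{' without a preceding key")
            child, pos = parse_container(lines, pos, nested=True)
            store(pending, child)
            pending = None
            continue
        if pending is not None:
            store(pending, None)  # the lone key had a null value
            pending = None
        if line == "}":
            if not nested:
                raise ValueError("unmatched '}'")
            return result, pos
        parts = line.split(None, 1)
        if len(parts) == 1:
            pending = parts[0]
        elif parts[1] == "{":  # open-brace on the same line as the key
            child, pos = parse_container(lines, pos, nested=True)
            store(parts[0], child)
        else:
            store(parts[0], parts[1])
    if nested:
        raise ValueError("missing '}'")
    if pending is not None:
        store(pending, None)
    return result, pos


sample = """load4
{
  size                 2
  conflicts-with-queues
  {
    0                  "load0"
    1                  "load1"
  }
}
"""
tree, _ = parse_container(sample.splitlines())
print(tree["load4"]["conflicts-with-queues"]["1"])
```

Each recursive call owns its own dictionary, so the same key (such as 0 or 1) can appear in sibling containers without clashing.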

Key Ordering


The 'ACS Text Format' is technically an unordered key-value soup; however, all existing implementations are order-preserving for trivial (read, write) operations. It is recommended that future implementations maintain this convention.

Format Violations


The effect of a format violation detected during parsing is undefined. An implementation which follows this specification must not generate a file which violates any aspect of this specification.