Template:Chset-cell-unified/doc

This template is the metatemplate behind {{chset-ctrl}}, {{chset-ctrl3}}, {{chset-ctrl4}}, {{chset-cell}}, {{chset-cell3}}, and {{chset-cell4}}. The intention is to implement them using this template and thus make it easier to keep them in sync.

Usage

Used with Template:chset-tableformat to indicate a table cell.

  • First row:
    • Parameter char: the character in question. It may link to the relevant article or Wiktionary page. Provide it only for a printing character that is neither a control nor whitespace. Separate alternative characters with a slash; write a sequence of characters with nothing between them.
    • Parameter ctrl: XX, the name of a whitespace, control, format, separator or otherwise non-printing character (e.g., SP, LF, HT, NBSP, ZWNJ, PDO), linked to the appropriate article if one exists. Do not provide it at the same time as char. This parameter simply applies Template:sc2, so you can use that template directly if you need to combine a control with a normal character. Lower-case letters produce smaller text, which helps fit a longer string.
    • Parameter fn: printed in normal (small) size after the letter. This is useful to add a reference or template:efn footnote to the glyph.
  • Second row:
    • Parameter unic: hhhh, the Unicode value in hexadecimal: 4 digits for most code points (those on the Basic Multilingual Plane) and 5 otherwise (e.g., 0020, 1D44A).
      • A little-used feature: if the char field is blank, the matching Unicode character is placed there. This only works when unic is a single hex number.
      • If there are multiple mappings, separate them with a slash (such as 0020/00A0). If the value maps to a series of characters, separate the code points with a space.
      • Set to &nbsp; for a character without a Unicode mapping. Alternatively, if a Private Use Area mapping is in established or documented use for such a character (e.g. the Apple logo in Mac OS Roman), it may be given, but do not invent one.
      • Set to LEAD for a lead byte (rather than a character). Because L is not a hex digit, this is unambiguous. Alternatively, use the hex code to indicate something about which lead byte this is, for example in UTF-8.
  • Subsequent rows:
    • Parameter deci: arbitrary text drawn in bold, for displaying input methods. This is most often a decimal number for the Windows Alt code input.
    • Parameter octl: a second line of arbitrary text drawn in bold. You probably should not use this unless the input method really uses a second form.
    • Parameter kuten: arbitrary text not in bold. For JIS (men)kuten, GB quwei, KS hangyol or equivalent code (English: (plane-)row-cell, or (plane-)section-position).
      • This is an important identifier for characters in CJK DBCSs such as JIS X 0208 (more so than e.g. deci, which is not usually used for a DBCS).
      • (d(d)-)d(d)-d(d) (two or three numbers of up to two digits each, e.g., 91-1, 2-2-1). Generally numbers 1 through 94 correspond with encoding bytes of either 0x21 through 0x7E, or 0xA1 through 0xFE.
      • For a lead byte, put underscores in place of the subsequent numbers, for example 16-_.
      • For visual consistency, may be set to - for a byte which is not within the lead/trail byte range, but which is in the same line as those which are.
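
The correspondence between kuten numbers and encoding bytes described above can be sketched in a few lines of Python. This is only an illustration and not part of the template: the function name `kuten_to_bytes` and its `high` flag are hypothetical, and the sketch assumes a plain 94×94 set in which number n maps to byte 0x20 + n (or 0xA0 + n). Encodings that transform the numbers differently are not covered.

```python
def kuten_to_bytes(kuten: str, high: bool = False) -> bytes:
    """Map a row-cell string such as "91-1" to its two encoding bytes.

    Hypothetical helper. Assumes numbers 1..94 map to bytes 0x21..0x7E,
    or to 0xA1..0xFE when high is True. A leading plane number
    (as in "2-2-1") is accepted but does not affect the bytes.
    """
    parts = [int(p) for p in kuten.split("-")]
    if len(parts) not in (2, 3):
        raise ValueError("expected row-cell or plane-row-cell")
    row, cell = parts[-2:]
    for n in (row, cell):
        if not 1 <= n <= 94:
            raise ValueError("row and cell must be in 1..94")
    offset = 0xA0 if high else 0x20
    return bytes([row + offset, cell + offset])


# Example: the kuten values used in the examples on this page.
low_form = kuten_to_bytes("1-1")               # bytes 0x21 0x21
high_form = kuten_to_bytes("91-1", high=True)  # bytes 0xFB 0xA1
```

Placeholder values such as 16-_ or - are display conventions for the table and are deliberately rejected by this sketch.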

Use the same set of parameters for every cell in a table (or at least in a table row); otherwise the rows inside the cells will not line up horizontally. Use &nbsp; if a field should be blank.
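
One way to check this before saving a table is a small script that compares which row parameters each cell uses. The following Python sketch is an assumption-laden illustration, not an existing tool: the names `rows_used` and `mismatched_cells` are hypothetical, and it assumes that only unic, deci, octl and kuten determine the row layout (char and ctrl share the first row, and fn attaches to the glyph).

```python
import re

# Assumption: these parameters each occupy their own row in a cell.
ROW_PARAMS = ("unic", "deci", "octl", "kuten")
_PARAM = re.compile(r"\|\s*(" + "|".join(ROW_PARAMS) + r")\s*=")


def rows_used(cell_wikitext: str) -> frozenset:
    """Return the set of row parameters named in one template call."""
    return frozenset(_PARAM.findall(cell_wikitext))


def mismatched_cells(cells: list) -> list:
    """Return indexes of cells whose row parameters differ from the first cell."""
    if not cells:
        return []
    expected = rows_used(cells[0])
    return [i for i, cell in enumerate(cells) if rows_used(cell) != expected]


row = [
    "{{chset-cell-unified|unic=3000|ctrl=[[space character|IDSP]]|deci=33|kuten=1-1}}",
    "{{chset-cell-unified|unic=26E3|deci=33|kuten=91-1}}",
    "{{chset-cell-unified|unic=00A1|deci=161}}",  # kuten missing
]
problems = mismatched_cells(row)  # indexes of cells that would not line up
```

The regular expression only looks for known parameter names, so piped links such as [[space character|IDSP]] are not mistaken for parameters.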

Examples

A few examples:

{| {{chset-tableformat}}
<!-- ctrl4 plus kuten -->
|{{chset-color-misc}}|{{chset-cell-unified|unic=3000|ctrl=[[space character|IDSP]]|deci=33|octl=041|kuten=1-1}}
<!-- ctrl4 -->
|{{chset-color-misc}}|{{chset-cell-unified|unic=00A0|ctrl=[[non-breaking space|NBSP]]|deci=160|octl=240}}
<!-- cell4 plus kuten -->
|{{chset-color-graph}}|{{chset-cell-unified|unic=26E3|char=[[⛣]]|deci=33|octl=041|kuten=91-1}}
<!-- cell4 plus kuten -->
|{{chset-color-graph}}|{{chset-cell-unified|unic=26E3|deci=33|octl=041|kuten=91-1}}
<!-- cell4 -->
|{{chset-color-ext-punct}}|{{chset-cell-unified|unic=00A1|char=[[inverted exclamation mark|¡]]|deci=161|octl=241}}
<!-- cell4 -->
|{{chset-color-ext-punct}}|{{chset-cell-unified|unic=00A1|deci=161|octl=241}}
<!-- ctrl3 plus kuten -->
|{{chset-color-misc}}|{{chset-cell-unified|unic=3000|ctrl=[[space character|IDSP]]|deci=33|kuten=1-1}}
<!-- ctrl3 -->
|{{chset-color-misc}}|{{chset-cell-unified|unic=00A0|ctrl=[[non-breaking space|NBSP]]|deci=160}}
<!-- cell3 plus kuten -->
|{{chset-color-graph}}|{{chset-cell-unified|unic=26E3|char=[[⛣]]|deci=33|kuten=91-1}}
<!-- cell3 plus kuten -->
|{{chset-color-graph}}|{{chset-cell-unified|unic=26E3|deci=33|kuten=91-1}}
<!-- cell3 -->
|{{chset-color-ext-punct}}|{{chset-cell-unified|unic=00A1|char=[[inverted exclamation mark|¡]]|deci=161}}
<!-- cell3 -->
|{{chset-color-ext-punct}}|{{chset-cell-unified|unic=00A1|deci=161}}
<!-- ctrl plus kuten -->
|{{chset-color-misc}}|{{chset-cell-unified|unic=3000|ctrl=[[space character|IDSP]]|kuten=1-1}}
<!-- ctrl -->
|{{chset-color-misc}}|{{chset-cell-unified|unic=00A0|ctrl=[[non-breaking space|NBSP]]}}
<!-- cell plus kuten -->
|{{chset-color-graph}}|{{chset-cell-unified|unic=26E3|char=[[⛣]]|kuten=91-1}}
<!-- cell plus kuten -->
|{{chset-color-graph}}|{{chset-cell-unified|unic=26E3|kuten=91-1}}
<!-- cell -->
|{{chset-color-ext-punct}}|{{chset-cell-unified|unic=00A1|char=[[inverted exclamation mark|¡]]|fn={{efn|A footnote next to character}}}}
<!-- cell -->
|{{chset-color-ext-punct}}|{{chset-cell-unified|unic=00A1}}{{efn|A trailing footnote}}
|}
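
Rendered result, one cell per line (rows within a cell are separated by " / "):

  • IDSP / 3000 / 33 / 041 / 1-1
  • NBSP / 00A0 / 160 / 240
  • ⛣ / 26E3 / 33 / 041 / 91-1
  • ⛣ / 26E3 / 33 / 041 / 91-1
  • ¡ / 00A1 / 161 / 241
  • ¡ / 00A1 / 161 / 241
  • IDSP / 3000 / 33 / 1-1
  • NBSP / 00A0 / 160
  • ⛣ / 26E3 / 33 / 91-1
  • ⛣ / 26E3 / 33 / 91-1
  • ¡ / 00A1 / 161
  • ¡ / 00A1 / 161
  • IDSP / 3000 / 1-1
  • NBSP / 00A0
  • ⛣ / 26E3 / 91-1
  • ⛣ / 26E3 / 91-1
  • ¡[note 1] / 00A1
  • ¡ / 00A1 [note 2]

  1. A footnote next to character
  2. A trailing footnote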

Chset family of templates

See ISO 8859-1, Windows-1252, and EBCDIC for examples of usage.

Character row header

Character cell colors

For generating colours for cells by Unicode category, this script may be helpful.

Certain colours are in the process of being phased out:

  • Template:chset-color-ext-punct — Extended punctuation character cell color. Intended to represent non-ASCII punctuation, this is not a category used by Unicode (and there are no corresponding ext-digit, ext-graph etc). Currently renders the same as punct.

In addition to these, boxed and slightly shaded variants of these exist in order to indicate some kind of additional information (depending on the article) like, for example, a derivation from a base codepage, a variance of definition of the corresponding codepage in different sources (to be explained in the article) or in different revisions of a code page:

Do not use the boxed variants when a cell that is not to be marked is surrounded by four cells that are marked, as this would make the central cell appear marked as well. The shaded variants do not exhibit this problem.

When there is no need to differentiate one or a few cells in a group from the other cells in the same group, use the normal (that is, non-"box" and non-"var") templates further above.

Character cell contents
