Tcl Programming/Introduction

Introduction

edit

So what is Tcl?

edit

The name Tcl is derived from "Tool Command Language" and is pronounced "tickle". Tcl is a radically simple open-source interpreted programming language that provides common facilities such as variables, procedures, and control structures as well as many useful features that are not found in any other major language. Tcl runs on almost all modern operating systems such as Unix, Macintosh, and Windows (including Windows Mobile).

While Tcl is flexible enough to be used in almost any application imaginable, it does excel in a few key areas, including: automated interaction with external programs, embedding as a library into application programs, language design, and general scripting.

Tcl was created in 1988 by John Ousterhout and is distributed under a BSD style license (which allows you everything GPL does, plus closing your source code). The current stable version, in February 2008, is 8.5.1 (8.4.18 in the older 8.4 branch).

The first major GUI extension that works with Tcl is Tk, a toolkit that aims to rapid GUI development. That is why Tcl is now more commonly called Tcl/Tk.

The language features far-reaching introspection, and the syntax, while simple, is very different from the Fortran/Algol/C++/Java world. Although Tcl is a string based language there are quite a few object-oriented extensions for it like Snit, incr Tcl, and XOTcl to name a few.

Tcl was originally developed as a reusable command language for experimental computer aided design (CAD) tools. The interpreter is implemented as a C library that could be linked into any application. It is very easy to add new functions to the Tcl interpreter, so it is an ideal reusable "macro language" that can be integrated into many applications.

However, Tcl is a programming language in its own right, which can be roughly described as a cross-breed between

  • LISP/Scheme (mainly for its tail-recursion capabilities),
  • C (control structure keywords, expr syntax) and
  • Unix shells (but with more powerful structuring).

One language, many styles

edit

Although a language where "everything is a command" appears like it must be "imperative" and "procedural", the flexibility of Tcl allows one to use functional or object-oriented styles of programming very easily. See "Tcl examples" below for ideas what one can do.

The traditional, "procedural" approach would be

proc mean list {
   set sum 0.
   foreach element $list {set sum [expr {$sum + $element}]}
   return [expr {$sum / [llength $list]}]
}


Here is yet another style (not very fast on long lists, but depends on nothing but Tcl). It works by building up an expression, where the elements of the lists are joined with a plus sign, and then evaluating that:

proc mean list {expr double([join $list +])/[llength $list]}

From Tcl 8.5, with math operators exposed as commands, and the expand operator, this style is better:

proc mean list {expr {[tcl::mathop::+ {*}$list]/double([llength $list])}}

or, if you have imported the tcl::mathop operators, just

proc mean list {expr {[+ {*}$list]/double([llength $list])}}

Note that all of the above are valid stand alone Tcl scripts.

It is also very easy to implement other programming languages (be they (reverse) polish notation, or whatever) in Tcl for experimenting. One might call Tcl a "CS Lab". For instance, here's how to compute the average of a list of numbers in Tcl (after first writing somewhat more Tcl to implement a J-like functional language - see Tacit programming in examples):

Def mean = fork /. sum llength

or, one could implement a RPN language similar to FORTH or Postscript and write:

 : mean  dup sum swap size double / ;


A more practical aspect is that Tcl is very open for "language-oriented programming" - when solving a problem, specify a (little) language which most simply describes and solves that problem - then go implement that language...

Why should I use Tcl?

edit

Good question. The general recommendation is: "Use the best tool for the job". A good craftsman has a good set of tools, and knows how to use them best.

Tcl is a competitor to other scripting languages like awk, Perl, Python, PHP, Visual Basic, Lua, Ruby, and whatever else will come along. Each of these has strengths and weaknesses, and when some are similar in suitability, it finally becomes a matter of taste.

Points in favour of Tcl are:

  • simplest syntax (which can be easily extended)
  • cross-platform availability: Mac, Unix, Windows, ...
  • strong internationalization support: everything is a Unicode string
  • robust, well-tested code base
  • the Tk GUI toolkit speaks Tcl natively
  • BSD license, which allows open-source use like GPL, as well as closed-source
  • a very helpful community, reachable via newsgroup, Wiki, or chat :)

Tcl is not the best solution for every problem. It is however a valuable experience to find out what is possible with Tcl.

Example: a tiny web server

edit

Before spoon-feeding the bits and pieces of Tcl, a slightly longer example might be appropriate, just so you get the feeling how it looks. The following is, in 41 lines of code, a complete little web server that serves static content (HTML pages, images), but also provides a subset of CGI functionality: if an URL ends with .tcl, a Tcl interpreter is called with it, and the results (a dynamically generated HTML page) served.

Note that no extension package was needed - Tcl can, with the socket command, do such tasks already pretty nicely. A socket is a channel that can be written to with puts. The fcopy copies asynchronously (in the background) from one channel to another, where the source is either a process pipe (the "exec tclsh" part) or an open file.

This server was tested to work pretty well even on 200MHz Windows 95 over a 56k modem, and serving several clients concurrently. Also, because of the brevity of the code, this is an educational example for how (part of) HTTP works.

# DustMotePlus - with a subset of CGI support
set root      c:/html
set default   index.htm
set port      80
set encoding  iso8859-1
proc bgerror msg {puts stdout "bgerror: $msg\n$::errorInfo"}
proc answer {sock host2 port2} {
    fileevent $sock readable [list serve $sock]
}
proc serve sock {
    fconfigure $sock -blocking 0
    gets $sock line
    if {[fblocked $sock]} {
        return
    }
    fileevent $sock readable ""
    set tail /
    regexp {(/[^ ?]*)(\?[^ ]*)?} $line -> tail args
    if {[string match */ $tail]} {
        append tail $::default
    }
    set name [string map {%20 " " .. NOTALLOWED} $::root$tail]
    if {[file readable $name]} {
        puts $sock "HTTP/1.0 200 OK"
        if {[file extension $name] eq ".tcl"} {
            set ::env(QUERY_STRING) [string range $args 1 end]
            set name [list |tclsh $name]
        } else {
            puts $sock "Content-Type: text/html;charset=$::encoding\n"
        }
        set inchan [open $name]
        fconfigure $inchan -translation binary
        fconfigure $sock   -translation binary
        fcopy $inchan $sock -command [list done $inchan $sock]
    } else {
        puts $sock "HTTP/1.0 404 Not found\n"
        close $sock
    }
}
proc done {file sock bytes {msg {}}} {
    close $file
    close $sock
}
socket -server answer $port
puts "Server ready..."
vwait forever

And here's a little "CGI" script I tested it with (save as time.tcl):

# time.tcl - tiny CGI script.
if {![info exists env(QUERY_STRING)]} {
    set env(QUERY_STRING) ""
}
puts "Content-type: text/html\n"
puts "<html><head><title>Tiny CGI time server</title></head>
<body><h1>Time server</h1>
Time now is: [clock format [clock seconds]]
<br>
Query was: $env(QUERY_STRING)
<hr>
<a href=index.htm>Index</a>
</body></html>"

Where to get Tcl/Tk

edit

On most Linux systems, Tcl/Tk is already installed. You can find out by typing tclsh at a console prompt (xterm or such). If a "%" prompt appears, you're already set. Just to make sure, type info pa at the % prompt to see the patchlevel (e.g. 8.4.9) and info na to see where the executable is located in the file system.

Tcl is an open source project. The sources are available from http://tcl.sourceforge.net/ if you want to build it yourself.

For all major platforms, you can download a binary ActiveTcl distribution from ActiveState. Besides Tcl and Tk, this also contains many popular extensions - it's called the canonical "Batteries Included" distribution.

Alternatively, you can get Tclkit: a Tcl/Tk installation wrapped in a single file, which you don't need to unwrap. When run, the file mounts itself as a virtual file system, allowing access to all its parts.

January 2006, saw the release of a new and promising one-file vfs distribution of Tcl; eTcl. Free binaries for Linux, Windows, and Windows Mobile 2003 can be downloaded from http://www.evolane.com/software/etcl/index.html . Especially on PocketPCs, this provides several features that have so far been missing from other ports: sockets, window "retreat", and can be extended by providing a startup script, and by installing pure-Tcl libraries.

First steps

edit

To see whether your installation works, you might save the following text to a file hello.tcl and run it (type tclsh hello.tcl at a console on Linux, double-click on Windows):

package require Tk
pack [label .l -text "Hello world!"]

It should bring up a little grey window with the greeting.

To make a script directly executable (on Unix/Linux, and Cygwin on Windows), use this first line (the # being flush left):

#!/usr/bin/env tclsh

or (in an older, deprecated tricky style):

#! /bin/sh
# the next line restarts using tclsh \
exec tclsh "$0" ${1+"$@"}

This way, the shell can determine which executable to run the script with.

An even simpler way, and highly recommended for beginners as well as experienced users, is to start up tclsh or wish interactively. You will see a % prompt in the console, can type commands to it, and watch its responses. Even error messages are very helpful here, and don't cause a program abort - don't be afraid to try whatever you like! Example:

$ tclsh
info patchlevel
8.4.12
expr 6*7
42
expr 42/0
divide by zero

You can even write programs interactively, best as one-liners:

proc ! x {expr {$x<=2? $x: $x*[! [incr x -1]]}}
! 5
120

For more examples, see the chapter "A quick tour".


Syntax

edit

Syntax is just the rules how a language is structured. A simple syntax of English could say (ignoring punctuation for the moment):

  • A text consists of one or more sentences
  • A sentence consists of one or more words

Simple as this is, it also describes Tcl's syntax very well - if you say "script" for "text", and "command" for "sentence". There's also the difference that a Tcl word can again contain a script or a command. So

if {$x < 0} {set x 0}

is a command consisting of three words: if, a condition in braces, a command (also consisting of three words) in braces.

Take this for example

is a well-formed Tcl command: it calls Take (which must have been defined before) with the three arguments "this", "for", and "example". It is up to the command how it interprets its arguments, e.g.

puts acos(-1)

will write the string "acos(-1)" to the stdout channel, and return the empty string "", while

expr acos(-1)

will compute the arc cosine of -1 and return 3.14159265359 (an approximation of Pi), or

string length acos(-1)

will invoke the string command, which again dispatches to its length sub-command, which determines the length of the second argument and returns 8.

Quick summary

edit

A Tcl script is a string that is a sequence of commands, separated by newlines or semicolons.

A command is a string that is a list of words, separated by blanks. The first word is the name of the command, the other words are passed to it as its arguments. In Tcl, "everything is a command" - even what in other languages would be called declaration, definition, or control structure. A command can interpret its arguments in any way it wants - in particular, it can implement a different language, like expr.

A word is a string that is a simple word, or one that begins with { and ends with the matching } (braces), or one that begins with " and ends with the matching ". Braced words are not evaluated by the parser. In quoted words, substitutions can occur before the command is called:

  • $[A-Za-z0-9_]+ substitutes the value of the given variable. Or, if the variable name contains characters outside that regular expression, another layer of bracing helps the parser to get it right:
puts "Guten Morgen, ${Schüler}!"

If the code would say $Schüler, this would be parsed as the value of variable $Sch, immediately followed by the constant string üler.

  • (Part of) a word can be an embedded script: a string in [] brackets whose contents are evaluated as a script (see above) before the current command is called.

In short: Scripts and commands contain words. Words can again contain scripts and commands. (This can lead to words more than a page long...)

Arithmetic and logic expressions are not part of the Tcl language itself, but the language of the expr command (also used in some arguments of the if, for, while commands) is basically equivalent to C's expressions, with infix operators and functions. See separate chapter on expr below.


The man page: 11 rules

edit

Here is the complete manpage for Tcl (8.4) with the "endekalogue", the 11 rules. (From 8.5 onward there is a twelfth rule regarding the {*} feature).

The following rules define the syntax and semantics of the Tcl language:

(1) Commands A Tcl script is a string containing one or more commands. Semi-colons and newlines are command separators unless quoted as described below. Close brackets are command terminators during command substitution (see below) unless quoted.

(2) Evaluation A command is evaluated in two steps. First, the Tcl interpreter breaks the command into words and performs substitutions as described below. These substitutions are performed in the same way for all commands. The first word is used to locate a command procedure to carry out the command, then all of the words of the command are passed to the command procedure. The command procedure is free to interpret each of its words in any way it likes, such as an integer, variable name, list, or Tcl script. Different commands interpret their words differently.

(3) Words Words of a command are separated by white space (except for newlines, which are command separators).

(4) Double quotes If the first character of a word is double-quote (") then the word is terminated by the next double-quote character. If semi-colons, close brackets, or white space characters (including newlines) appear between the quotes then they are treated as ordinary characters and included in the word. Command substitution, variable substitution, and backslash substitution are performed on the characters between the quotes as described below. The double-quotes are not retained as part of the word.

(5) Braces If the first character of a word is an open brace ({) then the word is terminated by the matching close brace (}). Braces nest within the word: for each additional open brace there must be an additional close brace (however, if an open brace or close brace within the word is quoted with a backslash then it is not counted in locating the matching close brace). No substitutions are performed on the characters between the braces except for backslash-newline substitutions described below, nor do semi-colons, newlines, close brackets, or white space receive any special interpretation. The word will consist of exactly the characters between the outer braces, not including the braces themselves.

(6) Command substitution If a word contains an open bracket ([) then Tcl performs command substitution. To do this it invokes the Tcl interpreter recursively to process the characters following the open bracket as a Tcl script. The script may contain any number of commands and must be terminated by a close bracket (]). The result of the script (i.e. the result of its last command) is substituted into the word in place of the brackets and all of the characters between them. There may be any number of command substitutions in a single word. Command substitution is not performed on words enclosed in braces.

(7) Variable substitution If a word contains a dollar-sign ($) then Tcl performs variable substitution: the dollar-sign and the following characters are replaced in the word by the value of a variable. Variable substitution may take any of the following forms:

$name

Name is the name of a scalar variable; the name is a sequence of one or more characters that are a letter, digit, underscore, or namespace separators (two or more colons).

$name(index)

Name gives the name of an array variable and index gives the name of an element within that array. Name must contain only letters, digits, underscores, and namespace separators, and may be an empty string. Command substitutions, variable substitutions, and backslash substitutions are performed on the characters of index.

${name}

Name is the name of a scalar variable. It may contain any characters whatsoever except for close braces. There may be any number of variable substitutions in a single word. Variable substitution is not performed on words enclosed in braces.

(8) Backslash substitution If a backslash (\) appears within a word then backslash substitution occurs. In all cases but those described below the backslash is dropped and the following character is treated as an ordinary character and included in the word. This allows characters such as double quotes, close brackets, and dollar signs to be included in words without triggering special processing. The following table lists the backslash sequences that are handled specially, along with the value that replaces each sequence.

\a
Audible alert (bell) (0x7).
\b
Backspace (0x8).
\f
Form feed (0xc).
\n
Newline (0xa).
\r
Carriage-return (0xd).
\t
Tab (0x9).
\v
Vertical tab (0xb).
\<newline>whiteSpace
A single space character replaces the backslash, newline, and all spaces and tabs after the newline. This backslash sequence is unique in that it is replaced in a separate pre-pass before the command is actually parsed. This means that it will be replaced even when it occurs between braces, and the resulting space will be treated as a word separator if it isn't in braces or quotes.
\\
Literal backslash (\), no special effect.
\ooo
The digits ooo (one, two, or three of them) give an eight-bit octal value for the Unicode character that will be inserted. The upper bits of the Unicode character will be 0.
\xhh
The hexadecimal digits hh give an eight-bit hexadecimal value for the Unicode character that will be inserted. Any number of hexadecimal digits may be present; however, all but the last two are ignored (the result is always a one-byte quantity). The upper bits of the Unicode character will be 0.
\uhhhh
The hexadecimal digits hhhh (one, two, three, or four of them) give a sixteen-bit hexadecimal value for the Unicode character that will be inserted.

Backslash substitution is not performed on words enclosed in braces, except for backslash-newline as described above.

(9) Comments If a hash character (#) appears at a point where Tcl is expecting the first character of the first word of a command, then the hash character and the characters that follow it, up through the next newline, are treated as a comment and ignored. The comment character only has significance when it appears at the beginning of a command.

(10) Order of substitution Each character is processed exactly once by the Tcl interpreter as part of creating the words of a command. For example, if variable substitution occurs then no further substitutions are performed on the value of the variable; the value is inserted into the word verbatim. If command substitution occurs then the nested command is processed entirely by the recursive call to the Tcl interpreter; no substitutions are performed before making the recursive call and no additional substitutions are performed on the result of the nested script. Substitutions take place from left to right, and each substitution is evaluated completely before attempting to evaluate the next. Thus, a sequence like

set y [set x 0][incr x][incr x]

will always set the variable y to the value, 012.

(11) Substitution and word boundaries Substitutions do not affect the word boundaries of a command. For example, during variable substitution the entire value of the variable becomes part of a single word, even if the variable's value contains spaces.

Comments

edit

The first rule for comments is simple: comments start with # where the first word of a command is expected, and continue to the end of line (which can be extended, by a trailing backslash, to the following line):

# This is a comment \
going over three lines \
with backslash continuation

One of the problems new users of Tcl meet sooner or later is that comments behave in an unexpected way. For example, if you comment out part of code like this:

# if {$condition} {
    puts "condition met!"
# }

This happens to work, but any unbalanced braces in comments may lead to unexpected syntax errors. The reason is that Tcl's grouping (determining word boundaries) happens before the # characters are considered.

To add a comment behind a command on the same line, just add a semicolon:

puts "this is the command" ;# that is the comment

Comments are only taken as such where a command is expected. In data (like the comparison values in switch), a # is just a literal character:

if $condition {# good place
   switch -- $x {
       #bad_place {because switch tests against it}
       some_value {do something; # good place again}
   }
}

To comment out multiple lines of code, it is easiest to use "if 0":

if 0 {
    puts "This code will not be executed"
    This block is never parsed, so can contain almost any code
    - except unbalanced braces :)
}

Data types

edit

In Tcl, all values are strings, and the phrase "Everything is a string" is often used to illustrate this fact. But just as 2 can be interpreted in English as "the number 2" or "the character representing the number 2", two different functions in Tcl can interpret the same value in two different ways. The command expr, for example, interprets "2" as a number, but the command string length interprets "2" as a single character. All values in Tcl can be interpreted either as characters or something else that the characters represent. The important thing to remember is that every value in Tcl is a string of characters, and each string of characters might be interpreted as something else, depending on the context. This will become more clear in the examples below. For performance reasons, versions of Tcl since 8.0 keep track of both the string value and how that string value was last interpreted. This section covers the various "types" of things that Tcl values (strings) get interpreted as.

Strings

edit

A string is a sequence of zero or more characters (where all 16-bit Unicodes are accepted in almost all situations, see in more detail below). The size of strings is automatically administered, so you only have to worry about that if the string length exceeds the virtual memory size.

In contrast to many other languages, strings in Tcl don't need quotes for markup. The following is perfectly valid:

set greeting Hello!

Quotes (or braces) are rather used for grouping:

set example "this is one word"
set another {this is another}

The difference is that inside quotes, substitutions (like of variables, embedded commands, or backslashes) are performed, while in braces, they are not (similar to single quotes in shells, but nestable):

set amount 42
puts "You owe me $amount" ;#--> You owe me 42
puts {You owe me $amount} ;#--> You owe me $amount

In source code, quoted or braced strings can span multiple lines, and the physical newlines are part of the string too:

set test "hello
world
in three lines"

To reverse a string, we let an index i first point at its end, and decrementing i until it's zero, append the indexed character to the end of the result res:

proc sreverse str {
set res ""
for {set i [string length $str]} {$i > 0} {} {
    append res [string index $str [incr i -1]]
} 
set res
}

sreverse "A man, a plan, a canal - Panama"
amanaP - lanac a ,nalp a ,nam A


Hex-dumping a string:

proc hexdump string {
    binary scan $string H* hex
    regexp -all -inline .. $hex
}

hexdump hello
68 65 6c 6c 6f

Finding a substring in a string can be done in various ways:

string first  $substr  $str ;# returns the position from 0, or -1 if not found
string match *$substr* $str ;# returns 1 if found, 0 if not
regexp $substr  $str ;# the same

The matching is done with exact match in string first, with glob-style match in string match, and as a regular expression in regexp. If there are characters in substr that are special to glob or regular expressions, using string first is recommended.

Lists

edit

Many strings are also well-formed lists. Every simple word is a list of length one, and elements of longer lists are separated by whitespace. For instance, a string that corresponds to a list of three elements:

set example {foo bar grill}

Strings with unbalanced quotes or braces, or non-space characters directly following closing braces, cannot be parsed as lists directly. You can explicitly split them to make a list.

The "constructor" for lists is of course called list. It's recommended to use when elements come from variable or command substitution (braces won't do that). As Tcl commands are lists anyway, the following is a full substitute for the list command:

proc list args {set args}

Lists can contain lists again, to any depth, which makes modelling of matrixes and trees easy. Here's a string that represents a 4 x 4 unit matrix as a list of lists. The outer braces group the entire thing into one string, which includes the literal inner braces and whitespace, including the literal newlines. The list parser then interprets the inner braces as delimiting nested lists.

{{1 0 0 0}
 {0 1 0 0}
 {0 0 1 0}
 {0 0 0 1}}

The newlines are valid list element separators, too.

Tcl's list operations are demonstrated in some examples:

set      x {foo bar}
llength  $x        ;#--> 2
lappend  x  grill  ;#--> foo bar grill
lindex   $x 1      ;#--> bar (indexing starts at 0)
lsearch  $x grill  ;#--> 2 (the position, counting from 0)
lsort    $x        ;#--> bar foo grill
linsert  $x 2 and  ;#--> foo bar and grill
lreplace $x 1 1 bar, ;#--> foo bar, grill

Note that only lappend, above is mutating. To change an element of a list (of a list...) in place, the lset command is useful - just give as many indexes as needed:

set test {{a b} {c d}}
{a b} {c d}
lset test 1 1 x
{a b} {c x}

The lindex command also takes multiple indexes:

lindex $test 1 1
x

Example: To find out whether an element is contained in a list (from Tcl 8.5, there's the in operator for that):

proc in {list el} {expr {[lsearch -exact $list $el] >= 0}}
in {a b c} b
1
in {a b c} d
#ignore this line, which is only here because there is currently a bug in wikibooks rendering which makes the 0 on the following line disappear when it is alone 
0

Example: remove an element from a list variable by value (converse to lappend), if present:

proc lremove {_list el} {
  upvar 1 $_list list
  set pos [lsearch -exact $list $el]
  set list [lreplace $list $pos $pos]
}

set t {foo bar grill}
foo bar grill
lremove t bar
foo grill
set t
foo grill

A simpler alternative, which also removes all occurrences of el:

proc lremove {_list el} {
  upvar 1 $_list list
  set list [lsearch -all -inline -not -exact $list $el]
}

Example: To draw a random element from a list L, we first determine its length (using llength), multiply that with a random number > 0.0 and < 1.0, truncate that to integer (so it lies between 0 and length-1), and use that for indexing (lindex) into the list:

proc ldraw L {
   lindex $L [expr {int(rand()*[llength $L])}]
}

Example: Transposing a matrix (swapping rows and columns), using integers as generated variable names:

proc transpose matrix {
   foreach row $matrix {
       set i 0
       foreach el $row {lappend [incr i] $el}
   }
   set res {}
   set i 0
   foreach e [lindex $matrix 0] {lappend res [set [incr i]]}
   set res
}

transpose {{1 2} {3 4} {5 6}}
{1 3 5} {2 4 6}

Example: pretty-printing a list of lists which represents a table:

proc fmtable table {
   set maxs {}
   foreach item [lindex $table 0] {
       lappend maxs [string length $item]
   }
   foreach row [lrange $table 1 end] {
       set i 0
       foreach item $row max $maxs {
           if {[string length $item]>$max} {
               lset maxs $i [string length $item]
           }
           incr i
       }
   }
   set head +
   foreach max $maxs {append head -[string repeat - $max]-+}
   set res $head\n
   foreach row $table {
       append res |
       foreach item $row max $maxs {append res [format " %-${max}s |" $item]}
       append res \n
   }
   append res $head
}

Testing:

fmtable {
   {1 short "long field content"}
   {2 "another long one" short}
   {3 "" hello}
}
+---+------------------+--------------------+
| 1 | short            | long field content |
| 2 | another long one | short              |
| 3 |                  | hello              |
+---+------------------+--------------------+

Enumerations: Lists can also be used to implement enumerations (mappings from symbols to non-negative integers). Example for a nice wrapper around lsearch/lindex:

proc makeEnum {name values} {
   interp alias {} $name: {} lsearch $values
   interp alias {} $name@ {} lindex $values
}

makeEnum fruit {apple blueberry cherry date elderberry}

This assigns "apple" to 0, "blueberry" to 1, etc.

fruit: date
3
fruit@ 2
cherry

Numbers

edit

Numbers are strings that can be parsed as such. Tcl supports integers (32-bit or even 64-bit wide) and "double" floating-point numbers. From Tcl 8.5 on, bignums (integers of arbitrarily large precision) are supported. Arithmetics is done with the expr command, which takes basically the same syntax of operators (including ternary x?y:z), parens, and math functions as C. See below for detailed discussion of expr.

Control the display format of numbers with the format command which does appropriate rounding:

expr 2/3.
0.666666666667
format %.2f [expr 2/3.]
0.67

Up to the 8.4 version (the present version is 8.5), Tcl honored the C convention that an integer starting with 0 is parsed as octal, so

0377 == 0xFF == 255

This changes in 8.5, though - too often people stumbled over "08" meant as hour or month, raised a syntax error, because 8 is no valid octal number. In the future you'd have to write 0o377 if you really mean octal. You can do number base conversions with the format command, where the format is %x for hex, %d for decimal, %o for octal, and the input number should have the C-like markup to indicate its base:

format %x 255
ff
format %d 0xff
255
format %o 255
377
format %d 0377
255

Variables with integer value can be most efficiently modified with the incr command:

incr i    ;# default increment is 1
incr j 2
incr i -1 ;# decrement with negative value
incr j $j ;# double the value

The maximal positive integer can be determined from the hexadecimal form, with a 7 in front, followed by several "F" characters. Tcl 8.4 can use "wide integers" of 64 bits, and the maximum integer there is

expr 0x7fffffffffffffff
9223372036854775807

Demonstration: one more, and it turns into the minimum integer:

expr 0x8000000000000000
-9223372036854775808

Bignums: from Tcl 8.5, integers can be of arbitrary size, so there is no maximum integer anymore. Say, you want a big factorial:

proc tcl::mathfunc::fac x {expr {$x < 2? 1: $x * fac($x-1)}}

expr fac(100)


93326215443944152681699238856266700490715968264381621468592963895217599993229915608941463976156518286253697920827223758251185210916864000000000000000000000000

IEEE special floating-point values: Also from 8.5, Tcl supports a few special values for floating-point numbers, namely Inf (infinity) and NaN (Not a Number):

set i [expr 1/0.]
Inf
expr {$i+$i}
Inf
expr {$i+1 == $i}
1
set j NaN ;# special because it isn't equal to itself
NaN
expr {$j == $j}
#ignore this line, which is only here because there is currently a bug in wikibooks rendering which makes the 0 on the following line disappear when it is alone 
0

Booleans

edit

Tcl supports booleans as numbers in a similar fashion to C, with 0 being false and any other number being true. It also supports them as the strings "true", "false", "yes" and "no" and few others (see below). The canonical "true" value (as returned by Boolean expressions) is 1.

foreach b {0 1 2 13 true false on off no yes n y a} {puts "$b -> [expr {$b?1:0}]"}
0 -> 0
1 -> 1
2 -> 1
13 -> 1
true -> 1
false -> 0
on -> 1
off -> 0
no -> 0
yes -> 1
n -> 0
y -> 1
expected boolean value but got "a"

Characters

edit

Characters are abstractions of writing elements (e.g. letters, digits, punctuation characters, Chinese ideographs, ligatures...). In Tcl since 8.1, characters are internally represented with Unicode, which can be seen as unsigned integers between 0 and 65535 (recent Unicode versions have even crossed that boundary, but the Tcl implementation currently uses a maximum of 16 bits). Any Unicode U+XXXX can be specified as a character constant with an \uXXXX escape. It is recommended to only use ASCII characters (\u0000-\u007f) in Tcl scripts directly, and escape all others.

Convert between numeric Unicode and characters with

set char [format %c $int]
set int  [scan $char %c]

Watch out that int values above 65535 produce 'decreasing' characters again, while negative int even produces two bogus characters. format does not warn, so better test before calling it.

Sequences of characters are called strings (see above). There is no special data type for a single character, so a single character is just a string on length 1 (everything is a string). In UTF-8, which is used internally by Tcl, the encoding of a single character may take from one to three bytes of space. To determine the bytelength of a single character:

string bytelength $c ;# assuming [string length $c]==1

String routines can be applied to single characters too, e.g [string toupper] etc. Find out whether a character is in a given set (a character string) with

expr {[string first $char $set]>=0}

As Unicodes for characters fall in distinct ranges, checking whether a character's code lies within a range allows a more-or-less rough classification of its category:

proc inRange {from to char} {
    # generic range checker
    set int [scan $char %c]
    expr {$int>=$from && $int <= $to}
}
interp alias {} isGreek {}    inRange 0x0386 0x03D6
interp alias {} isCyrillic {} inRange 0x0400 0x04F9
interp alias {} isHangul {}   inRange 0xAC00 0xD7A3

This is a useful helper to convert all characters beyond the ASCII set to their \u.... escapes (so the resulting string is strict ASCII):

proc u2x s {
   set res ""
   foreach c [split $s ""] {
     scan $c %c int
     append res [expr {$int<128? $c :"\\u[format %04.4X $int]"}]
   }
   set res
}

Internal representation

edit

In the main Tcl implementation, which is written in C, each value has both a string representation (UTF-8 encoded) and a structured representation. This is an implementation detail which allows for better performance, but has no semantic impact on the language. Tcl tracks both representations, making sure that if one is changed, the other one is updated to reflect the change the next time it is used. For example, if the string representation of a value is "8", and the value was last used as a number in an [expr] command, the structured representation will be a numeric type like a signed integer or a double-precision floating point number. If the value "one two three" was last used in one of the list commands, the structured representation will be a list structure. There are various other "types" on the C side which may be used as the structured representation. As of Tcl 8.5, only the most recent structured representation of a value is stored, and it is replaced with a different representation when necessary. This "dual-porting" of values helps avoid, repeated parsing or "stringification", which otherwise would happen often because each time a value is encountered in source code, it is interpreted as a string prior to being interpreted in its current context. But to the programmer, the view that "everything is a string" is still maintained.

These values are stored in reference-counted structures termed objects (a term that has many meanings). From the perspective of all code that uses values (as opposed to code implementing a particular representation), they are immutable. In practice, this is implemented using a copy-on-write strategy.

Variables

edit

Variables can be local or global, and scalar or array. Their names can be any string not containing a colon (which is reserved for use in namespace separators) but for the convenience of $-dereference one usually uses names of the pattern [A-Za-z0-9_]+, i.e. one or more letters, digits, or underscores.

Variables need not be declared beforehand. They are created when first assigned a value, if they did not exist before, and can be unset when no longer needed:

set foo    42     ;# creates the scalar variable foo
set bar(1) grill  ;# creates the array bar and its element 1
set baz    $foo   ;# assigns to baz the value of foo
set baz [set foo] ;# the same effect
info exists foo   ;# returns 1 if the variable foo exists, else 0
unset foo         ;# deletes the variable foo

Retrieving a variable's value with the $foo notation is only syntactic sugar for [set foo]. The latter is more powerful though, as it can be nested, for deeper dereferencing:

set foo   42
set bar   foo
set grill bar
puts [set [set [set grill]]] ;# gives 42

Some people might expect $$$grill to deliver the same result, but it doesn't, because of the Tcl parser. When it encounters the first and second $ sign, it tries to find a variable name (consisting of one or more letters, digits, or underscores) in vain, so these $ signs are left literally as they are. The third $ allows substitution of the variable grill, but no backtracking to the previous $'s takes place. So the evaluation result of $$$grill is $$bar. Nested [set] commands give the user more control.

Local vs. global

edit

A local variable exists only in the procedure where it is defined, and is freed as soon as the procedure finishes. By default, all variables used in a proc are local.

Global variables exist outside of procedures, as long as they are not explicitly unset. They may be needed for long-living data, or implicit communication between different procedures, but in general it's safer and more efficient to use globals as sparingly as possible. Example of a very simple bank with only one account:

 set balance 0 ;# this creates and initializes a global variable

 proc deposit {amount} {
    global balance
    set balance [expr {$balance + $amount}]
 }

 proc withdraw {amount} {
    set ::balance [expr {$::balance - $amount}]
 }

This illustrates two ways of referring to global variables - either with the global command, or by qualifying the variable name with the :: prefix. The variable amount is local in both procedures, and its value is that of the first argument to the respective procedure.

Introspection:

info vars ;#-- lists all visible variables
info locals
info globals

To make all global variables visible in a procedure (not recommended):

eval global [info globals]

Scalar vs. array

edit

All of the value types discussed above in Data types can be put into a scalar variable, which is the normal kind.

Arrays are collections of variables, indexed by a key that can be any string, and in fact implemented as hash tables. What other languages call "arrays" (vectors of values indexed by an integer), would in Tcl rather be lists. Some illustrations:

#-- The key is specified in parens after the array name
set         capital(France) Paris

#-- The key can also be substituted from a variable:
set                  country France
puts       $capital($country)

#-- Setting several elements at once:
array set   capital         {Italy Rome  Germany Berlin}

#-- Retrieve all keys:
array names capital    ;#-- Germany Italy France -- quasi-random order

#-- Retrieve keys matching a glob pattern:
array names capital F* ;#-- France

A fanciful array name is "" (the empty string, therefore we might call this the "anonymous array" :) which makes nice reading:

set (example) 1
puts $(example)

Note that arrays themselves are not values. They can be passed in and out of procedures not as $capital (which would try to retrieve the value), but by reference. The dict type (available from Tcl 8.5) might be better suited for these purposes, while otherwise providing hash table functionality, too.

System variables

edit

At startup, tclsh provides the following global variables:

argc
number of arguments on the command line
argv
list of the arguments on the command line
argv0
name of the executable or script (first word on command line)
auto_index
array with instructions from where to load further commands
auto_oldpath
(same as auto_path ?)
auto_path
list of paths to search for packages
env
array, mirrors the environment variables
errorCode
type of the last error, or {}, e.g. ARITH DIVZERO {divide by zero}
errorInfo
last error message, or {}
tcl_interactive
1 if interpreter is interactive, else 0
tcl_libPath
list of library paths
tcl_library
path of the Tcl system library directory
tcl_patchLevel
detailed version number, e.g. 8.4.11
tcl_platform
array with information on the operating system
tcl_rcFileName
name of the initial resource file
tcl_version
brief version number, e.g. 8.4

One can use temporary environment variables to control a Tcl script from the command line, at least in Unixoid systems including Cygwin. Example scriptlet:

set foo 42
if [info exists env(DO)] {eval $env(DO)}
puts foo=$foo

This script will typically report

 foo=42

To remote-control it without editing, set the DO variable before the call:

DO='set foo 4711' tclsh myscript.tcl

which will evidently report

foo=4711

Dereferencing variables

edit

A reference is something that refers, or points, to another something (if you pardon the scientific expression). In C, references are done with *pointers* (memory addresses); in Tcl, references are strings (everything is a string), namely names of variables, which via a hash table can be resolved (dereferenced) to the "other something" they point to:

puts foo       ;# just the string foo
puts $foo      ;# dereference variable with name of foo
puts [set foo] ;# the same

This can be done more than one time with nested set commands. Compare the following C and Tcl programs, that do the same (trivial) job, and exhibit remarkable similarity:

#include <stdio.h>
int main(void) {
  int    i =      42;
  int *  ip =     &i;
  int ** ipp =   &ip;
  int ***ippp = &ipp;
  printf("hello, %d\n", ***ippp);
  return 0;
}

...and Tcl:

set i    42
set ip   i
set ipp  ip
set ippp ipp
puts "hello, [set [set [set [set ippp]]]]"

The asterisks in C correspond to calls to set in Tcl dereferencing. There is no corresponding operator to the C & because, in Tcl, special markup is not needed in declaring references. The correspondence is not perfect; there are four set calls and only three asterisks. This is because mentioning a variable in C is an implicit dereference. In this case, the dereference is used to pass its value into printf. Tcl makes all four dereferences explicit (thus, if you only had 3 set calls, you'd see hello, i). A single dereference is used so frequently that it is typically abbreviated with $varname, e.g.

puts "hello, [set [set [set $ippp]]]"

has set where C uses asterisks, and $ for the last (default) dereference.

The hashtable for variable names is either global, for code evaluated in that scope, or local to a proc. You can still "import" references to variables in scopes that are "higher" in the call stack, with the upvar and global commands. (The latter being automatic in C if the names are unique. If there are identical names in C, the innermost scope wins).

Variable traces

edit

One special feature of Tcl is that you can associate traces with variables (scalars, arrays, or array elements) that are evaluated optionally when the variable is read, written to, or unset.

Debugging is one obvious use for that. But there are more possibilities. For instance, you can introduce constants where any attempt to change their value raises an error:

proc const {name value} {
  uplevel 1 [list set $name $value]
  uplevel 1 [list trace var $name w {error constant ;#} ]
}

const x 11
incr x
can't set "x": constant

The trace callback gets three words appended: the name of the variable; the array key (if the variable is an array, else ""), and the mode:

  • r - read
  • w - write
  • u - unset

If the trace is just a single command like above, and you don't want to handle these three, use a comment ";#" to shield them off.

Another possibility is tying local objects (or procs) with a variable - if the variable is unset, the object/proc is destroyed/renamed away.