T-Write

Programming language based closely on Turing machines with a Turing-complete macro language based on expression rewriting.

Lexy stuff

Whitespace between tokens is ignored.

Names consist of letters, digits, -, ., _, '. Like Haskell, whether a name is capitalized determines how it's interpreted. For this purpose, _ is treated as lowercase, while -, ., ', and digits are treated as uppercase.

Comments start with % and last until the end of the line.

Types and values

Basic types include:

A tuple (value1, value2, ...) is equivalent to a dictionary with keys 0, 1, etc. A list is a linked list of (head, rest) pairs.

Expressions

Expressions include:

lowercase name
The value of a variable bound in the current scope
capitalized name
A symbol literal
{ key : value ; key : value }
A dictionary. There may be any number of semicolon-separated key : value pairs, including zero. A trailing semicolon is allowed and is ignored. The keys are patterns and the values are expressions. If a key pattern matches multiple values, then all relevant values are in the dictionary. If key patterns overlap, the first one listed takes priority.
( value , value )
A tuple. There may be any number of comma-separated values, including zero or one. A trailing comma is allowed and is ignored. The values are all expressions. This is simply a shortcut for the equivalent dictionary literal.
( : value1 , value2 : )
A linked list. Equivalent to (value1, (value2, ()))
( : value1 , value2 : last )
A linked list with a last element specified. Equivalent to (value1, (value2, last))
[ : value , integer : ]
A linked list with the specified number of copies of an item.
expression *
Rewrite the expression (explained below).
expression + dictionary literal
Replace or add entries to a dictionary. It is a type error if expression is not a dictionary.
* pattern
Pattern literal.
= expression
Same as expression (for consistency with patterns).

Patterns

Pattern syntax is a superset of expression syntax.

_
Wildcard
variable @ pattern
Matches pattern and binds variable. A variable cannot be bound twice in the same scope.
variable @
Shortcut for variable @ _
capitalized name
Matches an exact symbol.
{ key : value ; key : value }
Matches a dictionary. Keys and values are both patterns. The dictionary must contain all keys that match the patterns, and it may not contain more. If a key has a pattern that can match multiple things, then the corresponding value cannot bind variables. Variables bound in a key are only in scope for that entry. The key patterns may not involve underscores. Note that a dictionary literal with overlapping keys might not match the equivalent pattern; e.g., {A:1; [A,B]:2} does not match {A:1; [A,B]:2} because that pattern would require A to be both 1 and 2.
{ key : value ; key : value ; _ }
Same as a dictionary pattern, except that it also allows the dictionary to contain more entries.
{ : variable @ optional pattern }
If the key in a dictionary pattern is omitted and the value binds a variable, the key is implied to be the variable name with a . in front of it.
( pattern1 , pattern2 )
( : pattern1 , pattern2 : )
( : pattern1 , pattern2 : end pattern )
[ : pattern , integer : ]
These still expand to the same things as they do in expressions; patterns can be used as expected inside these constructs. (:items:_) can be used to match lists that start a certain way, without requiring that it's the whole list.
[ : pattern , integer # integer : ]
Matches a list of a particular type with a specified range of lengths. The lower bound may be omitted, in which case it is implied to be 0. The upper bound may not be omitted.
[ pattern , pattern ]
Union. Comma-separated list of zero or more patterns. If a variable is bound in one pattern, it must be bound in all of the patterns.
optional integer # optional integer
An integer within the specified range, inclusive.
= expression
If the expression results in a pattern, use that pattern. If the expression results in any other value, test that value for equality (e.g., can be used to escape dictionaries with overlapping keys).
expression *
expression + dictionary literal
Do the same thing as if there were = in front of the expression.
lowercase name
Matches the value of a variable already bound in the current scope, acting the same as if = were in front of the variable name.
* pattern
Same as pattern by itself (for consistency)
expression ~ ( pattern1 , pattern2 )
Matches if expression matches against pattern1, and pattern2 matches the original input. (Intended for simple descriptions of Turing machines, where the output depends on both the current tape symbol and the state.)

Postfix operators have a higher precedence than prefix operators.

If an infinite dictionary (a dictionary with a key containing _) is matched for equality for another dictionary, the test will fail.

Patterns cannot match other patterns except using a wildcard (_).

Ranges

There is a shortcut for lists of similar identifiers, like Item1, Item2, Item3, or A, B, C, D, E. This can be used for capitalized identifiers in comma-separated lists (tuples, linked lists, unions). Include the common part, and replace the changing part with the range in angle brackets.

The following forms are allowed:

<upper>
A through Z
<lower>
a through z
<digit>
0 through 9
<udigit>, <ldigit>
Uppercase or lowercase (respectively) hexidecimal digits
<byte>
0 through 255
<sbyte>
-128 through 127
<ubyte>, <lbyte>
Uppercase or lowercase (respectively) pairs of hexidecimal digits
< start , end >
Start through end, where start and end are numbers, uppercase letters, or lowercase letters
< start , second , end >
Start through end, also specifying the second item (which the step is deduced from). May include numbers with decimal points.

For instance, ['<upper>', '<lower>'] is equivalent to ['A','B','C' ... 'Y','Z','a','b','c' ... 'z'], and (<0,2,20>) is equivalent to (0,2,4,6,8,10,12,14,16,18,20). (Note that the latter is also the "multiply by 2" function, with a limited domain; this is one potential use for this feature.)

You can also use them in a dictionary, in which case both the key and the value must be ranges; corresponding elements of each range are used.

Programs

A program is of the form capabilities_pattern : io_pattern , tape_pattern , state_pattern : dictionary expression. The dictionary expression has the primary rules for the program; it maps states to actions; the * postfix operator looks up values from this dictionary. The capabilities pattern (and its following colon) may be omitted. If the IO pattern is omitted, it is assumed to be the same as the tape pattern; otherwise, it must be a subset of the tape pattern.

The capabilities pattern determines the capabilities of the program. Implementations may reject programs that require capabilities that they don't support, or use the pattern to determine which implementation to choose. Capabilities are described as a dictionary that contains at least the following keys:

Mem: [Tape, Stack, State, Finite]
Determines the type of memory available. Tape may use Move: rules (Turing complete); Stack may use Stack: rules (push-down automata); and State may have _ in its state, IO, and tape patterns (Turing complete). Finite can do none of these (finite state machine). Any of these may use Write: rules (all except Tape have a one-symbol tape).
Nondeterm: [True, False]
Determines if nondeterminism is allowed.
In: [Tape, None, Interact, Both]
Determines how input is received. For Tape, input is on the tape when the program starts (for memory type Tape, the current position is at the start of the input; for other memory types, the input is one symbol); otherwise, the tape is initially empty. For None, no input is received. For Interact, IO:In operations may be used; if both input and output are set to Interact, IO is interactive. If In: Both is specified, the program accepts input both interactively and on its tape.
Out: [Tape, Bool, Interact, Both]
Determines how output is given; similar to input. For Tape, the output is the section of the tape including the current pointer, extending left and right to the first blank encountered (or the current symbol, for memory types other than Tape). For Bool, the output is true if the program uses Halt and false if it uses Reject. For Interact, IO:Out operations may be used; if only output is Interact, this can be used to generate infinite output and still let the user see it. If Out: Both is used, the program gives output both interactively and on its tape.

Implementations should support programs that only accept these keys. If the capabilities pattern does not contain _, options listed first in the program are preferred. Variables bound in the capabilites pattern are available throughout the program.

tape_pattern is the type of symbol on the tape. It must not contain _ (including implicitly). Variables bound here are bound throughout the program. io_pattern is the type of symbol used for input and output, whether interactive or through the tape; it may not contain _ or an unbounded #. Variables bound in io_pattern aren't available elsewhere; variables should be bound in tape_pattern instead. state_pattern is the type of state used by certain built-in rules; it may not bind any variables or contain _. Everything matched by the state_pattern must be matched by a key in the main dictionary.

The variables io_pattern, tape_pattern, and state_pattern are bound throughout the code to refer to the patterns at the beginning of the program.

The initial state is Start. For each state, the current state is looked up in the dictionary literal, and the result is used as the next state, unless it matches one of the built-in rules:

Halt
Halt in an accepting state. Always supported.
Reject
Halt in a rejecting state. Always supported.
{Write: t@tape_pattern; Next: s@state_pattern}
Write symbol t, and go to state s. Always supported.
{Move: n@#; Next: s@state_pattern}
Move n spaces to the right (can be negative), and go to state s. Only supported with Mem: Tape.
{Stack: which@[Push,Pop]; Next: s@state_pattern}
Moves an element between the current tape symbol and the stack. Requires Mem: Stack.
{IO: In; Next: s@state_pattern}
Reads a symbol of input into the current tape cell. Requires In: Interact.
{IO: Out; Next: s@state_pattern}
Outputs the current symbol on the tape. Requires Out: Interact.
{Write: t@tape_pattern; Move: n@#; Next: s@state_pattern}
{Write: t@tape_pattern; IO: Out; Next: s@state_pattern}
{Write: t@tape_pattern; Stack: Push; Next: s@state_pattern}
Writes a symbol to the tape, then performs the specified action. Has the same requirements as performing the action normally.
{Choose: (state1@state_pattern, state2@state_pattern)}
Fork a new instance of the program for state1 and state2. The entire machine accepts the input if any copy of it accepts. Only supported with Nondeterm: True.

Built-in rules take precedence over user-defined rules. Built-in rules are ignored by the * operator. Built-in rules that are not supported by an implementation are treated as ordinary states; this allows programs to work on multiple types of machines. It is a type error if the rewritten expression is not a key in the main dictionary.

If looking up CharCode results in a number for all tape patterns, that number may be used for input and output (if supported).

Note that the rewriting may be done at compile time. If it does not terminate for some (state, tape) combination, the compiler is not guaranteed to terminate either.

Appendix: Brainf*** translation

Skeleton


current @ 0 # 255, #: {
	Incr: (<1,255>,0);
	Decr: (255,<0,254>);
	{Lookup:key@; In:{key:value@; _}}: value;
	{If:0; Then:_; Else:e@}: e;
	{If:_; Then:t@; Else:_}: t;
	{Modify:op@; Next:n@#}: {Write: {Lookup:current; In:op}*; Next:n};
	{LoopFrom:start@#; To:end@#}: {If:current; Then:start; Else:end};
	Start: 0;
	code goes here
	last opnum + 1: Halt
}

Translation:

-opnum: {Modify:Decr; Next:next opnum}
+opnum: {Modify:Incr; Next:next opnum}
<opnum: {Move:-1; Next:next opnum}
>opnum: {Move:1; Next:next opnum}
[ ... ][first opnum,second opnum]: {LoopFrom:first opnum + 1; To:second opnum + 1}
.{IO:Out; Next:next opnum}
,{IO:In; Next:next opnum}

Interpreters

Interpreter written in Haskell (source only; download and run; 49K compressed; also includes TurExp)