Programming language based closely on Turing machines with a Turing-complete macro language based on expression rewriting.
The main data type in T-Write is a combination between a dictionary and a function that can use pattern matching. A T-Write program is one of these dictionaries/functions, and there is an operator that allows looking things up in the main dictionary of the program. This by itself would be Turing-complete, but this dictionary is then used to construct the states for a Turing machine (or a finite-state automaton).
Whitespace between tokens is ignored.
Names consist of letters, digits, -
, .
, _
, '
. Like Haskell, whether a name starts with a capital letter determines how it's interpreted. For this purpose, _
is treated as lowercase, while -
, .
, '
, and digits are treated as uppercase. The name _
is reserved for use as a wildcard.
Comments start with %
and last until the end of the line. Also, in the newer version, if a comment at the beginning of the program starts with %%
(two parentheses and a space), then the rest of the line will be printed when the program starts (this can be used to add a description of what the program does).
Basic types include:
A tuple (value1, value2, ...) is equivalent to a dictionary with keys 0, 1, etc. A list is a linked list of (head, rest) pairs.
Expressions include:
{
key :
value ;
key :
value }
(
value ,
value )
0
, 1
, etc.). There may be any number of comma-separated values, including zero or one. (Note that this means parentheses are not used for precedence disambiguation like in most languages; a single parenthesized expression always creates a 1-tuple.) A trailing comma is allowed and is ignored. The values are all expressions. This is simply a shortcut for the equivalent dictionary literal.
(
:
value1 ,
value2 :
)
(value1, (value2, ()))
(
:
value1 ,
value2 :
last )
(x y . z)
). Equivalent to (value1, (value2, last))
[
:
value ,
integer :
]
*
+
dictionary literal
*
pattern
=
expression
Pattern syntax is a superset of expression syntax.
_
_
can match infinitely many things, there are some situations where _
is not allowed.
@
pattern
@
@ _
{
key :
value ;
key :
value }
_
wildcards. Note that a dictionary literal with overlapping keys might not match the equivalent pattern; e.g., {A:1; [A,B]:2}
does not match {A:1; [A,B]:2}
because that pattern would require A to be both 1 and 2.
{
key :
value ;
key :
value ;
_
}
{
:
variable @
optional pattern }
.
in front of it. That is, {: foo@pattern}
is equivalent to {.foo: foo@pattern}
.
(
pattern1 ,
pattern2 )
(
:
pattern1 ,
pattern2 :
)
(
:
pattern1 ,
pattern2 :
end pattern )
[
:
pattern ,
integer :
]
(:items:_)
can be used to match lists that start a certain way, without requiring that it's the whole list.
[
:
pattern ,
integer #
integer :
]
[
pattern ,
pattern ]
#
optional integer
#
means "any integer". Note that integers are symbols, and there isn't any way for a symbol to exist that isn't in the source code (once all syntactic sugar is expanded), so in any given program, #
can only match a finite number of things.
=
expression
*
+
dictionary literal
=
in front of the expression (e.g., foo*
as a pattern means the same thing as =foo*
).
=
were in front of the variable name.
*
pattern
~
(
pattern1 ,
pattern2 )
Postfix operators have a higher precedence than prefix operators.
If an infinite dictionary (a dictionary with a key containing _
) is matched for equality for another dictionary, the test will fail.
Patterns cannot match other patterns except using a wildcard (_
).
There is a shortcut for lists of similar identifiers, like Item1, Item2, Item3
, or A, B, C, D, E
. This can be used for capitalized identifiers in comma-separated lists (tuples, linked lists, unions). Include the common part, and replace the changing part with the range in angle brackets.
The following forms are allowed:
<upper>
A
through Z
<lower>
a
through z
<digit>
0
through 9
<udigit>
, <ldigit>
<byte>
0
through 255
<sbyte>
-128
through 127
<ubyte>
, <lbyte>
<
start ,
end >
<
start ,
second ,
end >
For instance, ['<upper>', '<lower>']
is equivalent to ['A','B','C' ... 'Y','Z','a','b','c' ... 'z']
, and (<0,2,20>)
is equivalent to (0,2,4,6,8,10,12,14,16,18,20)
. (Note that the latter is also the "multiply by 2" function, with a limited domain; this is one potential use for this feature.)
You can also use them in a dictionary, in which case both the key and the value must be ranges; corresponding elements of each range are used.
A program is of the form capabilities_pattern :
io_pattern ,
tape_pattern ,
state_pattern :
dictionary_expression. The dictionary expression has the primary rules for the program; it maps states to actions, and the *
postfix operator looks up values from this dictionary. The capabilities pattern (and its following colon) may be omitted. If the IO pattern is omitted, it is assumed to be the same as the tape pattern; otherwise, it must be a subset of the tape pattern.
The capabilities pattern determines the capabilities of the program. Implementations may reject programs that require capabilities that they don't support, or use the pattern to determine which implementation to choose. Capabilities are described as a dictionary that contains at least the following keys:
Mem: [Tape, Stack, State, Finite]
Tape
may use Move:
rules (Turing complete); Stack
may use Stack:
rules (push-down automaton); and State
may have _
in its state, IO, and tape patterns (Turing complete). Finite
can do none of these (finite state machine). Any of these may use Write:
rules (all except Tape
have a one-symbol tape). (The current version of the interpreter doesn't support Mem: Stack
.)
Nondeterm: [True, False]
Nondeterm: True
.)
In: [Tape, None, Interact, Both]
Tape
, input is on the tape when the program starts (for memory type Tape
, the current position is at the start of the input; for other memory types, the input is one symbol); otherwise, the tape is initially empty. For None
, no input is received. For Interact
, IO:In
operations may be used; if both input and output are set to Interact
, IO is interactive. If In: Both
is specified, the program accepts input both interactively and on its tape.
Out: [Tape, Bool, Interact, Both]
Tape
, the output is the section of the tape including the current pointer, extending left and right to the first blank encountered (or the current symbol, for memory types other than Tape
). For Bool
, the output is true if the program uses Halt
and false if it uses Reject
. For Interact
, IO:Out
operations may be used; if only output is Interact
, this can be used to generate infinite output and still let the user see it. If Out: Both
is used, the program gives output both interactively and on its tape.
Implementations should support programs that only accept these keys. If the capabilities pattern does not contain _
, options listed first in the program are preferred. Variables bound in the capabilities pattern are available throughout the program.
tape_pattern is the type of symbol on the tape. It must not contain _
(including implicitly). Variables bound here are bound throughout the program. io_pattern is the type of symbol used for input and output, whether interactive or through the tape; it may not contain _
or an unbounded #
. Variables bound in io_pattern aren't available elsewhere; variables should be bound in tape_pattern instead. state_pattern is the type of state used by certain built-in rules; it may not bind any variables or contain _
. Everything matched by the state_pattern must be matched by a key in the main dictionary.
The first possible value for tape_pattern is considered to be a blank value; parts of the tape that don't have input are filled with this value.
The variables io_pattern
, tape_pattern
, and state_pattern
are bound throughout the code to refer to the patterns at the beginning of the program.
The initial state is Start
. For each state, the current state is looked up in the dictionary literal, and the result is used as the next state, unless it matches one of the built-in rules:
Halt
Reject
{Write: t@tape_pattern; Next: s@state_pattern}
{Move: n@#; Next: s@state_pattern}
Mem: Tape
.
{Stack: which@[Push,Pop]; Next: s@state_pattern}
Mem: Stack
.
{IO: In; Next: s@state_pattern}
In: Interact
.
{IO: Out; Next: s@state_pattern}
Out: Interact
.
{Write: t@tape_pattern; Move: n@#; Next: s@state_pattern}
{Write: t@tape_pattern; IO: Out; Next: s@state_pattern}
{Write: t@tape_pattern; Stack: Push; Next: s@state_pattern}
{Choose: (state1@state_pattern, state2@state_pattern)}
Nondeterm: True
.
Built-in rules take precedence over user-defined rules. Built-in rules are ignored by the *
operator. Built-in rules that are not supported by an implementation are treated as ordinary states; this allows programs to work on multiple types of machines. It is a type error if the rewritten expression is not a key in the main dictionary.
Note that the rewriting may be done at compile time. If it does not terminate for some (state, tape) combination, the compiler is not guaranteed to terminate either.
By default, input is expected to be in the form of valid expressions. If you instead want character-based IO, you can define a rule in the main dictionary for CharCode
which maps to a number (Unicode value) for each tape value; this is used to construct a lookup table between characters and values.
Template:
current @ 0 # 255, #: {
Incr: (<1,255>,0);
Decr: (255,<0,254>);
{Lookup:key@; In:{key:value@; _}}: value;
{If:0; Then:_; Else:e@}: e;
{If:_; Then:t@; Else:_}: t;
{Modify:op@; Next:n@#}: {Write: {Lookup:current; In:op}*; Next:n};
{LoopFrom:start@#; To:end@#}: {If:current; Then:start; Else:end};
Start: 0;
code goes here
last opnum + 1: Halt
}
Commands:
- | opnum: {Modify:Decr; Next:next opnum}
|
+ | opnum: {Modify:Incr; Next:next opnum}
|
< | opnum: {Move:-1; Next:next opnum}
|
> | opnum: {Move:1; Next:next opnum}
|
[ ... ] | [first opnum,second opnum]: {LoopFrom:first opnum + 1; To:second opnum + 1}
|
. | {IO:Out; Next:next opnum}
|
, | {IO:In; Next:next opnum}
|
Interpreter written in Haskell (updated 2020; source only, download and run; 56K compressed, 412KiB uncompressed; also includes TurExp)
Original interpreter written in Haskell (not compatible with modern versions of GHC; source only; download and run; 49K compressed; also includes TurExp)
The interpreter can either be run interactively, by simply typing ./twrite
, or with command-line arguments, by typing ./twrite filename tape-args
. Command-line arguments are used for the Tape
input type; if the program uses the Interactive
input type, input comes from standard input. The -d
option can also be specified to show debugging information (note that debug mode ignores CharCode
).