Design goal: To create a programming language using primarily the Miscellaneous Symbols Unicode characters (U+2600 through U+26FF). Also used are Enclosed Alphanumerics (starting at U+2460) and Dingbats (starting at U+2700). No ASCII is used except whitespace. (Basically I just looked at the various characters there and tried to figure out what they could mean in a programming language.)
Version 2.0 makes it easier to program in by adding two jump commands, decimal input/output, and a numeric comparison command; it also gives some more output options that weren't available in the first version.
A program consists of a sequence of instructions and labels, most of which are one character long. There is one accumulator, along with four special-purpose stacks (black call, white call, pointer, and subscript) and twenty-six general-purpose variables, all described below.
All whitespace in the program (including newlines) is ignored. The parenthesized alphanumerics ⑴-⒇ and ⒜-⒵ (U+2474 through U+2487 and U+249C through U+24B5) are also ignored, and can be used to add comments to the program. Starting in version 2.0, variation selectors 1-16 (U+FE00 through U+FE0F) are also ignored.
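The ignore rules above can be sketched as a small pre-pass over the source text. This is only an illustration in Python, not the reference interpreter; the function names `is_ignored` and `strip_ignored` are made up for this example.

```python
def is_ignored(ch):
    """Return True for characters the text above says are skipped."""
    cp = ord(ch)
    return (
        ch.isspace()                  # all whitespace, including newlines
        or 0x2474 <= cp <= 0x2487     # parenthesized numbers (1)-(20)
        or 0x249C <= cp <= 0x24B5     # parenthesized letters (a)-(z)
        or 0xFE00 <= cp <= 0xFE0F     # variation selectors 1-16 (version 2.0)
    )


def strip_ignored(source):
    """Return the significant characters of a program, in order."""
    return [ch for ch in source if not is_ignored(ch)]
```

Note that this only drops ignored characters; grouping two-character commands such as ✂Ⓐ would be a later step.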
The accumulator stores a single non-negative integer.
♮ (U+266E MUSIC NATURAL SIGN)
♯ (U+266F MUSIC SHARP SIGN)
♭ (U+266D MUSIC FLAT SIGN): go to the next ☂ if accumulator is zero; otherwise, subtract 1 from accumulator
♙, ♘, ♗, ♖, ♕, ♔ (U+2659 WHITE CHESS PAWN through U+2654 WHITE CHESS KING)
♟, ♞, ♝, ♜, ♛, ♚ (U+265F BLACK CHESS PAWN through U+265A BLACK CHESS KING): go to the next ☂ if accumulator is not divisible by n; otherwise, divide accumulator by n, where n = 2, 3, 5, 7, 11, 13, depending on the symbol used
⚀, ⚁, ⚂, ⚃, ⚄, ⚅ (U+2680 DIE FACE-1 through U+2685 DIE FACE-6)
☂ (U+2602 UMBRELLA)
☀ (U+2600 BLACK SUN WITH RAYS): ☂. If there is a sunshine between an operation that causes an error and the umbrella it would otherwise jump to, then rather than jumping to the umbrella, execution simply continues with the next statement.
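As a rough illustration of the error behaviour described above, here is a Python sketch of the ♭ and black chess piece commands together with the umbrella lookup. It reads the ☂ fragments as "go to the next ☂", which is an interpretation. The piece-to-divisor mapping follows the order in which the symbols are listed, and the case where no umbrella follows at all is not covered by the text, so the choice made here (run off the end of the program) is purely an assumption. All function names are hypothetical.

```python
UMBRELLA = "\u2602"  # ☂
SUNSHINE = "\u2600"  # ☀

# Assumed mapping, in listing order: pawn=2, knight=3, ..., king=13.
DIVISORS = {
    "\u265f": 2, "\u265e": 3, "\u265d": 5,
    "\u265c": 7, "\u265b": 11, "\u265a": 13,
}


def flat(acc):
    """♭: return (new accumulator, error flag)."""
    if acc == 0:
        return acc, True          # error: accumulator is already zero
    return acc - 1, False


def divide(acc, piece):
    """Black chess piece: return (new accumulator, error flag)."""
    n = DIVISORS[piece]
    if acc % n != 0:
        return acc, True          # error: not divisible by n
    return acc // n, False


def after_error(program, pc):
    """Index where execution resumes after the operation at pc errors."""
    for i in range(pc + 1, len(program)):
        if program[i] == SUNSHINE:
            return pc + 1         # sunshine comes first: just carry on
        if program[i] == UMBRELLA:
            return i              # jump to the umbrella
    return len(program)           # assumption: no umbrella, so stop
```

Here `program` is assumed to be a list of significant characters, with ignored characters already removed.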
Like many programming languages, Symbols has an unconditional jump instruction, and keeps track of jumps on a stack so that calls may return to the caller. However, unlike most other programming languages, there are different sets of commands for forward ("black") and backward ("white") jumps, and there are two separate call stacks as well.
⚐ (U+2690 WHITE FLAG)
☏ (U+260F WHITE TELEPHONE): jumps backward to the nearest preceding ⚐ (or the beginning of the program if there is no such label).
♡ (U+2661 WHITE HEART SUIT)
☜ (U+261C WHITE LEFT POINTING INDEX) [2.0 only]
⚑ (U+2691 BLACK FLAG)
☎ (U+260E BLACK TELEPHONE): jumps forward to the next ⚑ (or the end of the program if there is no such label).
♥ (U+2665 BLACK HEART SUIT)
☛ (U+261B BLACK RIGHT POINTING INDEX) [2.0 only]
☯ (U+262F YIN YANG)
☠ (U+2620 SKULL AND CROSSBONES) [2.0 only]
If an attempt is made to pop a stack that is empty (including with ☯), the result is undefined.
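The label lookup implied by the jump commands can be sketched as follows, assuming that white (backward) commands target the nearest preceding ⚐ and black (forward) commands the nearest following ⚑, with the beginning or end of the program as the fallback. The function names are made up, and resuming at the label itself rather than just after it is an assumption; the call stacks are left out.

```python
WHITE_FLAG = "\u2690"  # ⚐
BLACK_FLAG = "\u2691"  # ⚑


def backward_target(program, pc):
    """Nearest white flag before pc, or the beginning of the program."""
    for i in range(pc - 1, -1, -1):
        if program[i] == WHITE_FLAG:
            return i
    return 0


def forward_target(program, pc):
    """Nearest black flag after pc, or the end of the program."""
    for i in range(pc + 1, len(program)):
        if program[i] == BLACK_FLAG:
            return i
    return len(program)
```

Keeping the two searches separate mirrors the language's split between white and black commands.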
The only type of data that can be stored in a variable is an array. The only type of element that an array can have is an array. The standard way to store a number is as the length of an array. All arrays must be manually allocated and deallocated except for empty (zero-length) arrays. There is a stack called the pointer stack, which pointers to arrays can be pushed onto; however, an array must be in a variable to get an element from it or query its size. The commands for doing so are described in the next section.
✎ (U+270E LOWER RIGHT PENCIL)
♲ (U+2672 UNIVERSAL RECYCLING SYMBOL)
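To make the "number as array length" idea concrete, here is a tiny Python model in which arrays are lists whose elements are themselves lists. The helper names are hypothetical, and what a freshly allocated array contains is not stated above, so filling it with empty arrays is an assumption.

```python
def make_array(length):
    """Model an allocated array of the given length."""
    # Assumption: each element starts out as an empty array.
    return [[] for _ in range(length)]


def as_number(array):
    """Read a number back out of an array: it is simply the length."""
    return len(array)


# A two-element array whose elements encode the numbers 3 and 0.
pair = [make_array(3), make_array(0)]
values = [as_number(element) for element in pair]
```

In this model, a structure like `pair` is how several numbers would be kept inside a single variable.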
There are twenty-six variables, A through Z, that store pointers to arrays. All variables initially point to the empty array.
Access to variables is controlled by the subscript stack, a stack that can hold non-negative integers and a special value called a mark. If the subscript stack is not empty when a variable is accessed, each element of the subscript stack will be popped and the current array subscripted (first element is zero) until a mark (or the bottom of the stack) is reached; the mark is also popped. For instance, if 2, 3, mark, 4, 5 are pushed onto the stack, and the variable A is accessed using any command, that command will actually work on A[5][4], and the subscript stack will become 2, 3.
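The subscripting rule can be sketched in Python like this. It is a model of the description above rather than the actual interpreter: `MARK` and `resolve` are made-up names, arrays are modelled as nested lists, and out-of-range subscripts are not handled.

```python
MARK = object()  # hypothetical sentinel standing in for the "mark" value


def resolve(variables, name, subscripts):
    """Pop subscripts down to the nearest mark and index into the variable.

    Returns the array the command would actually operate on. The list
    `subscripts` is modified in place, with the top of the stack at the end.
    """
    target = variables[name]
    while subscripts:
        top = subscripts.pop()
        if top is MARK:
            break                 # the mark is popped too, then we stop
        target = target[top]      # first element is index zero
    return target


# The example from the text: push 2, 3, mark, 4, 5 and then access A.
inner = []
row = [[], [], [], [], inner]        # row[4] is inner
a = [[], [], [], [], [], row]        # a[5] is row
stack = [2, 3, MARK, 4, 5]
result = resolve({"A": a}, "A", stack)
# result is A[5][4], and the stack is left holding 2, 3.
```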
☃ (U+2603 SNOWMAN)
☁ (U+2601 CLOUD)
Ⓐ-Ⓩ (U+24B6 through U+24CF CIRCLED LATIN CAPITAL LETTER A-Z)
ⓐ-ⓩ (U+24D0 through U+24E9 CIRCLED LATIN SMALL LETTER A-Z)
✂Ⓐ-Ⓩ (U+2702 BLACK SCISSORS followed by U+24B6 through U+24CF CIRCLED LATIN CAPITAL LETTER A-Z)
☢ⓐ-ⓩ (U+2622 RADIOACTIVE SIGN followed by U+24D0 through U+24E9 CIRCLED LATIN SMALL LETTER A-Z)
⚖Ⓐ-Ⓩ (U+2696 SCALES followed by U+24B6 through U+24CF CIRCLED LATIN CAPITAL LETTER A-Z) [2.0 only]
❝ (U+275D HEAVY DOUBLE TURNED COMMA QUOTATION MARK ORNAMENT)
❞ (U+275E HEAVY DOUBLE COMMA QUOTATION MARK ORNAMENT)
①-⑳ (U+2460 CIRCLED DIGIT ONE through U+2473 CIRCLED NUMBER TWENTY) [2.0 only]: ❝ or ❞ statement. ❝ will act as normal (it still pushes an array onto the pointer stack) but will also put the number in the accumulator, or go to the next umbrella if the input isn't a valid number. ❞ outputs the accumulator and doesn't pop the pointer stack.
☽ (U+263D FIRST QUARTER MOON) [2.0 only]: ❞ commands after this until the next ☾.
☾ (U+263E LAST QUARTER MOON) [2.0 only]
☄ (U+2604 COMET) [2.0 only]
☮ (U+262E PEACE SYMBOL) [2.0 only]
♫, ♩, ♪, ♬ (U+2669 QUARTER NOTE through U+266C BEAMED SIXTEENTH NOTES) [2.0 only]: ☮ can be helpful for playing multiple notes in sequence. All notes are silenced when the program ends, so a program that just plays a note and ends without waiting won't work.
⚛ (U+269B ATOM SYMBOL) [2.0 only]
Old Java version (1.0): Downloadable interpreter written in Java