BigINTERCAL (esoteric programming language)

BigINTERCAL is an esoteric programming language that I made inspired by INTERCAL and bitch.

The biggest change from INTERCAL is that, whereas INTERCAL has one-spot variables that are 16 bits and two-spot variables that are 32 bits, BigINTERCAL has infinity-spot variables that are an infinite number of bits (at least, until your computer runs out of memory). The downside is that once I made the first infinity-spot variable, I couldn't fit in any other variables, not even the one-spot or two-spot variables from INTERCAL. Yeah, I could move things around, maybe move all the bits to just the even-numbered bits, but Hilbert's hotel tried that and they're never going to recover from the infinite number of one-star reviews (★) they got, so I decided it just wasn't worth it. Since there's only one variable, it doesn't need a number, so naturally it's notated with two numbers: |1\1.

There are also a number of other changes from INTERCAL, and none of the later extensions are implemented except for COME FROM.

Expressions
Statements
System library
Errors
Compatibility
Top-left wisdom tooth: 2-adic fractions
Top-right wisdom tooth: Complete grammar
Interpreter

Expressions

The following are all the valid expressions in BigINTERCAL. Anything else is not a valid expression. For instance, 🙄 is not a valid expression in BigINTERCAL.

Like in INTERCAL, there is no precedence; ambiguous expressions must be surrounded by sparks (' … ') or rabbit ears (" … ").

Constants

Constants are written in decimal and preceded by a mesh (#). Unlike in INTERCAL, there is no restriction on the range numbers can have.

Please note that BigINTERCAL is not a confusing enough situation to confuse the mesh with the interleave operator.

Variable

Variable is written with an infinity-spot, otherwise known as a spike (|), followed by a fraction. This fraction is written as two numbers separated by a slat (/). The fraction must be in reduced form, and must have an odd denominator. Negative fractions can also be used; these are indicated by replacing the slat with a backslat (\). The fraction is interpreted as a 2-adic number, and the bits from the variable corresponding to the 1 bits in the fraction are selected. That is, something like |1\3 is equivalent to |1\1~#1\3, or would be if fractions were allowed in select expressions, and selects every other bit (including the least-significant bit). (See the wisdom tooth for more information.)

Just like INTERCAL allows a spark and a spot to combine to form a wow (!), BigINTERCAL allows a spark and a spike to combine to form a spoke (¦).

Unary operators

BigINTERCAL has the same three unary operators as INTERCAL: and (&), or (V), and xor (∀). Unary operators are placed directly after the mesh of a constant, the spike of a variable, or the opening spark or rabbit ears of a grouping expression. All of these operators perform the specified bitwise operation on consecutive bits of a number: the least significant bit of the result is the least significant bit of the input combined with the second least significant, the second least significant is the second least significant combined with the third least significant, and so on. The most significant bit of the result is the most significant bit of the input combined with the least significant bit of the input, but that bit is way off at infinity where the computer can't see it, so it doesn't matter.
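As a rough model of the description above, the three unary operators can be sketched in Python on non-negative bigints. The function names are made up for illustration, and the wraparound bit at infinity is simply ignored, as the text suggests.

```python
def unary_and(x: int) -> int:
    # Bit i of the result is bit i of x AND bit i+1 of x.
    return x & (x >> 1)


def unary_or(x: int) -> int:
    # Bit i of the result is bit i of x OR bit i+1 of x.
    return x | (x >> 1)


def unary_xor(x: int) -> int:
    # Bit i of the result is bit i of x XOR bit i+1 of x.
    return x ^ (x >> 1)
```

For example, `unary_xor(0b101)` combines each bit with its higher neighbour and gives `0b111`.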

The xor operator is normally represented by a bookworm (∀), which should not be an issue because we have Unicode now and so there are never going to be any character encoding issues ever again. However, some people may find it inconvenient to type, so the bookworm can be approximated with the why money sign (¥). Furthermore, the why money sign and the backslat (\) are basically just two different ways of writing the same character, so BigINTERCAL treats them as equivalent. Note that this means that the xor of the variable can be written as |\1\1.

Interleave

The interleave operator, represented by a change (¢), rearranges the bits of its operands such that the least significant bit of the result is the least significant bit of the last operand, then the least significant bit of the other operands in reverse order, then the second least significant bit of each operand, and so on. In INTERCAL, interleave was a binary operator; however, BigINTERCAL can count much higher than two so it allows any number of operands. (It's still binary in the sense that it operates on bits, though, regardless of how many operands it has.)
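A minimal Python sketch of that bit layout, assuming non-negative operands (the name `interleave` is just for illustration and is not taken from the interpreter):

```python
def interleave(*operands: int) -> int:
    """Interleave the bits of the operands; the last operand owns bit 0."""
    if any(value < 0 for value in operands):
        raise ValueError("this sketch only handles non-negative operands")
    n = len(operands)
    result = 0
    for k, value in enumerate(operands):
        offset = n - 1 - k  # position of this operand within each group of n bits
        j = 0
        while value:
            if value & 1:
                result |= 1 << (j * n + offset)
            value >>= 1
            j += 1
    return result
```

With two operands, `interleave(3, 0)` puts the bits of the first operand in the odd positions and gives `0b1010`, while `interleave(0, 3)` gives `0b0101`.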

If you don't like change, you can type big money ($) instead, but since dollars and cents aren't the same thing, the next integer in the program (generally either a literal or the numerator of a fraction) must have a spot (.) before its last two digits.

Select

Select is a binary operator, represented by a sqiggle (~). It selects all the bits in its first operand that correspond to one bits in its second operand, and then shifts them over so they're all together in the least significant part of the number. Aside from the fact that this works on bigints, it's exactly the same as in INTERCAL. How boring.
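Select can be sketched the same way. This is an illustrative Python model, assuming a non-negative finite mask (so it does not cover the infinite masks that the variable's fraction syntax allows):

```python
def select(value: int, mask: int) -> int:
    """Pick the bits of value where mask has a 1 and pack them toward bit 0."""
    if mask < 0:
        raise ValueError("this sketch only handles finite, non-negative masks")
    result = 0
    out = 0  # next free bit position in the result
    pos = 0  # bit position being examined in value and mask
    while mask >> pos:
        if (mask >> pos) & 1:
            if (value >> pos) & 1:
                result |= 1 << out
            out += 1
        pos += 1
    return result
```

For instance, `select(179, 201)` picks bits 0, 3, 6, and 7 of 179 (binary 10110011), which are 1, 0, 0, 1 from least to most significant, so the result is 9.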

Do note that using a squiggle (ũ) instead of a sqiggle (~) is a syntax error.

Statements

General form

A BigINTERCAL program is a sequence of statements. Each statement starts with a statement identifier and optionally some qualifiers, as follows:

An optional label, which is a positive integer enclosed in wax-wane pairs (( … )). Labels must be unique within a program. Labels in the range 1000-1999 should be avoided, because these are used by the system library.
A required statement identifier, which is one of the keywords DO, PLEASE, or PLEASE DO. No whitespace is allowed within a statement identifier (other than between PLEASE and DO); because the exception proves the rule, whitespace is allowed absolutely anywhere else, including within a number or inside any other word. Because BigINTERCAL is doing a bigger job than INTERCAL, at least 1/3 of all statement identifiers must include PLEASE (but no more than 2/3).
Optionally the keyword NOT or N'T, which means that the statement won't run unless it's REINSTATEd.
Optionally, a probability to execute the statement, which is an integer preceded by a double-oh-seven (%). Because this is BigINTERCAL, big probabilities (or rather, precise probabilities) are allowed; instead of the number being out of 100, it's out of 10^{number of digits in the number}. If you want an actual percentage that's less than ten, you can use the triple-oh-seven (‰) instead (which is equivalent to a double-oh-seven followed by an oh (0)).

These must be in the order shown, except that NOT and % can be in either order if both are included.
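The probability rule is easy to get wrong by a factor of ten, so here is a small Python sketch of how the digits might be turned into a probability. The function name and the string-based interface are assumptions for illustration only.

```python
from fractions import Fraction


def probability(spec: str) -> Fraction:
    """Convert a qualifier such as '%5' or '‰5' into an exact probability."""
    spec = spec.replace(" ", "")
    if spec.startswith("‰"):
        # The triple-oh-seven is a double-oh-seven followed by an oh.
        spec = "%0" + spec[1:]
    digits = spec[1:]
    if not spec.startswith("%") or not (digits.isascii() and digits.isdigit()):
        raise ValueError("not a probability qualifier: " + spec)
    # The number is out of 10 to the power of its digit count.
    return Fraction(int(digits), 10 ** len(digits))
```

Under this reading, `%5` and `%50` both mean one half, while `%05` and `‰5` both mean five percent.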

Calculate

The most basic statement is calculate, which is just the statement identifier and an expression. Because there's only one variable, the destination variable doesn't need to be (and can't be) specified.

`STASH` and `RETRIEVE`

Like INTERCAL, BigINTERCAL lets you stash a variable (pushing it onto a stack) and retrieve it later. Since there's only one variable, the variable doesn't need to be (can't be) specified; just use DO STASH or DO RETRIEVE.

`IGNORE` and `REMEMBER`

IGNORE prevents some bits of the variable from changing. The word IGNORE should be followed by an expression; bits in the variable corresponding to 1 bits in the expression are prevented from changing. (E.g., IGNORE #3 makes it so that the two least significant bits can't change.) If some bits are already being IGNOREd, those bits stay ignored.

REMEMBER expression causes the bits that are 1 in expression to no longer be ignored.

RETRIEVE acts like an assignment with regards to ignored bits.
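One way to picture IGNORE and REMEMBER is as a mask that protects bits during assignment. The following Python sketch assumes that model; the class and method names are hypothetical.

```python
class Variable:
    """Toy model of the single variable with an ignore mask."""

    def __init__(self) -> None:
        self.value = 0
        self.ignored = 0  # 1 bits mark positions that cannot change

    def assign(self, new_value: int) -> None:
        # Ignored bits keep their old value; all other bits take the new value.
        self.value = (self.value & self.ignored) | (new_value & ~self.ignored)

    def ignore(self, mask: int) -> None:
        # Already-ignored bits stay ignored.
        self.ignored |= mask

    def remember(self, mask: int) -> None:
        self.ignored &= ~mask
```

For example, after assigning `0b1010` and ignoring `#3`, assigning `0b0101` only changes the upper bits and leaves the value at `0b0110`.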

`NEXT`, `RESUME`, and `FORGET`

I put too big of an integer into FORGET and it forgot the entire NEXT stack, so these commands are not in the language. Besides, COME FROM makes them pretty much obsolete, and NEXT is basically the same as call in Intel assembly language, so it's not that weird anymore.

`COME FROM`

COME FROM has the form DO COME FROM (label). After the statement (label) is executed (or skipped due to NOT/ABSTAIN/randomness), control will transfer to the COME FROM statement. If (label) doesn't exist, this is a no-op (this is to make using the system library easier).

Since NEXT doesn't exist in BigINTERCAL, a new form of COME FROM was added to make subroutines easier: DO COME FROM (label) AFTER (label). All COME FROM statements without an AFTER start out active, and all COME FROM AFTER statements start out inactive. If there's a statement DO COME FROM (x) AFTER (y), then after (y) runs or is skipped, that COME FROM will become active and any other COME FROM statements with the same label will become inactive. This means that you can make a subroutine called in multiple places like this:

DO STUFF...
(1) DO SOMETHING
PLEASE COME FROM (3) AFTER (1)
DO MORE STUFF...
(2) DO SOMETHING ELSE
PLEASE COME FROM (3) AFTER (2)
DO MORE STUFF...

PLEASE COME FROM (1)
PLEASE COME FROM (2)
DO THE SUBROUTINE HERE...
(3) DO THE LAST LINE OF THE SUBROUTINE

If more than one active, non-abstained COME FROM refers to the same line, it's a runtime error if that line ever gets executed. Do note that line labels must be unique, which means that subroutines used in multiple places have to have multiple COME FROMs at the beginning.
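The activation rules can be modelled with a small table of COME FROM statements. This Python sketch is only one reading of the rules above: the names are invented, abstaining is left out, and it assumes that activation by AFTER happens before the jump is looked up.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ComeFrom:
    line: int                    # position of the COME FROM statement
    target: int                  # label it comes from
    after: Optional[int] = None  # activating label, or None for a plain COME FROM
    active: bool = True


def make_table(entries) -> List[ComeFrom]:
    table = [ComeFrom(*entry) for entry in entries]
    for cf in table:
        cf.active = cf.after is None  # COME FROM AFTER starts out inactive
    return table


def label_passed(table: List[ComeFrom], label: int) -> Optional[int]:
    """Call after statement `label` runs or is skipped; returns the jump line, if any."""
    for cf in table:
        if cf.after == label:
            for other in table:
                if other.target == cf.target:
                    other.active = False
            cf.active = True
    hits = [cf for cf in table if cf.target == label and cf.active]
    if len(hits) > 1:
        raise RuntimeError("E555: multiple active COME FROMs for one label")
    return hits[0].line if hits else None
```

Numbering the statements of the subroutine example from 0, `make_table([(2, 3, 1), (5, 3, 2), (7, 1), (8, 2)])` describes its four COME FROM statements; passing label 1 jumps to line 7, and the later pass of label 3 returns to line 2.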

`ABSTAIN FROM` and `REINSTATE`

ABSTAIN FROM causes some statements not to execute, same as if they had the NOT qualifier. REINSTATE undoes the effect of ABSTAIN FROM or NOT. Both of these can take a label or an intersection (+)-separated list of gerunds (e.g. CALCULATING, STASHING, etc.). This works the same as in INTERCAL, except that there's no restriction on abstaining from or reinstating GIVING UP (or lines that give up), and you can abstain from or reinstate all syntax errors using the gerund SPLATTING.

There is also a new form of abstaining: DO ABSTAIN FROM expression plural-gerund. This evaluates the expression, and uses that to determine how many statements to skip; for instance, DO ABSTAIN FROM #3 CALCULATINGS abstains from the next three calculate statements that would have been executed. Statements that would already have been skipped due to NOT or another abstain don't count towards the number skipped (although statements skipped due to randomness do). If a statement type has already been number-abstained, number-abstaining it again adds to the number of times it'll be abstained from, e.g., DO ABSTAIN FROM #2 STASHINGS PLEASE ABSTAIN FROM #3 STASHINGS is the same as DO ABSTAIN FROM #5 STASHINGS.

DO REINSTATE expression plural-gerund does the same thing in reverse, only affecting statements that are already abstained from. If a gerund is number-abstained and number-reinstated at the same time, the number-abstain only affects statements that aren't normal-abstained and vice versa; they don't cancel each other out. There is no way to cancel a numbered abstain or reinstate.

For plural gerunds with multiple words, the plural goes on the first word, so COMINGS FROM and GIVINGS UP, not *COMING FROMS or *GIVING UPS.

For COME FROM statements, coming from the specified line and encountering the statement normally (regardless of whether it's active) count towards numbered abstains and reinstates. Abstaining is separate from COME FROM activeness; reinstating an inactive COME FROM will not make it active.

`READ OUT`

DO READ OUT expression outputs the value of expression. Multiple expressions can be given, separated by intersections (+); in that case, the values are read out on the same line, separated by spaces.

In INTERCAL, numeric output used Roman numerals. However, Roman numerals don't extend well to arbitrary bigints, so BigINTERCAL upgrades to positional notation, but using the same alphabet (or rather, its modern version). Numbers are output in bijective base 26, where 1 is A, 2 is B, … 26 is Z, 27 is AA. This means that you can also output words; just take each letter in the word, convert it to a number, and add them all together, multiplying by 26 before each addition, so e.g. "BIG" would be (2 × 26 + 9) × 26 + 7 = #1593, a fairly big number. ("SMALL" would be #8912032, which is pretty small compared to infinity.)
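To work out the number for a word (or the word for a number), a short Python sketch of bijective base 26 may help. The function names are illustrative, and returning an empty string for zero is an assumption, since the text does not say how zero is read out.

```python
def to_letters(n: int) -> str:
    """Bijective base 26: 1 -> A, 26 -> Z, 27 -> AA."""
    letters = []
    while n > 0:
        n, remainder = divmod(n - 1, 26)
        letters.append(chr(ord("A") + remainder))
    return "".join(reversed(letters))


def from_letters(word: str) -> int:
    """Inverse of to_letters: multiply by 26 before adding each letter."""
    value = 0
    for letter in word:
        value = value * 26 + (ord(letter) - ord("A") + 1)
    return value
```

Here `from_letters("BIG")` gives 1593 and `to_letters(8912032)` gives `"SMALL"`, matching the worked examples.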

Unlike INTERCAL, BigINTERCAL! lets "you" use punctuation‽ Before any expression, put WITH symbol name, where symbol name is any symbol from the Tonsil in the INTERCAL reference manual, or any symbol name mentioned here, with any worms (-) in the name removed. (The word "worm" doesn't have to be removed, though.) The symbol name lasts until the end of the statement, so DO READ OUT WITH TAIL #1 + #2 + #3 reads out "A, B, C,"; you can override this by preceding an expression with WITH NOTHING. You can't output multiple punctuation marks for the same number, but some pre-combined punctuation marks exist, like rabbit ears (") + spot (.) = RABBIT, or wow (!) + what (?) = HOW.

You can also use this to overstrike each character output with a punctuation mark (but you can't have both an overstrike and a punctuation); just append an S or ES to the name of the punctuation mark. For instance, WITH FLATWORMS will underline the word (WITH CRAWLING HORRORS is also acceptable for CLC-INTERCAL compatibility). WITH NOTHINGS will put all digits of the number on top of each other, for reasons that should be obvious (but probably aren't actually obvious).

`WRITE IN`

PLEASE WRITE IN variable inputs a value from the user. Multiple variables can't be given, separated by intersections.

Since there's only one regular variable, writing into it would clobber the program's entire state, so that's not allowed. Rather, the variable you write into has to be one of those variables starting with #—you know, #1, #2, etc. That is, once you DO WRITE IN #3, every #3 in the program from then on will refer to the input, rather than to the literal number 3. This includes the #3 in DO WRITE IN #3, so if you want to input in a loop, you'll have to refer to the number indirectly, such as DO WRITE IN #1 ¢ #1.

(Using number literals as variables has precedent, since I think some INTERCAL dialects allow literal overriding and I have my own older esolang based around it. This is okay, though; BigINTERCAL has a goal of avoiding precedents, which means that I can't have a goal of avoiding precedents since INTERCAL is a precedent for that, so I don't have a goal of avoiding precedents and therefore this precedent is okay.)

In INTERCAL, numeric input used spelled-out digits—that is, a number was a sequence of words ZERO (or OH), ONE, TWO, THREE, FOUR, FIVE, SIX, SEVEN, EIGHT, and NINE (or NINER), separated by spaces. However, spelled-out digits do extend well to arbitrary bigints, so BigINTERCAL uses them unchanged.
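A possible Python sketch of the input format follows. The function name is hypothetical, and treating empty input as an E579 error is an assumption.

```python
DIGIT_NAMES = {
    "ZERO": 0, "OH": 0, "ONE": 1, "TWO": 2, "THREE": 3, "FOUR": 4,
    "FIVE": 5, "SIX": 6, "SEVEN": 7, "EIGHT": 8, "NINE": 9, "NINER": 9,
}


def parse_input(line: str) -> int:
    """Parse space-separated digit names into a (possibly very big) integer."""
    words = line.split()
    if not words:
        raise ValueError("E579: no digits given")
    value = 0
    for word in words:
        if word not in DIGIT_NAMES:
            raise ValueError("E579: unknown digit name " + word)
        value = value * 10 + DIGIT_NAMES[word]
    return value
```

So the input `ONE TWO THREE` is read as 123, and `OH NINER` as 9.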

Unicode I/O

In addition to big integers, BigINTERCAL can also handle big characters, and by big characters I mean Unicode, and by Unicode I mean emoji. (I mean, I think there are maybe also some other characters in Unicode, but who cares about those?)

To use emoji, use the ℛℰ𝒜𝒟 𝒪𝒰𝒯 and 𝒲ℛℐ𝒯ℰ ℐ𝒩 statements. These have the same syntax as READ OUT and WRITE IN, respectively, but they interpret the number read out or written into the variable differently. In addition, ℛℰ𝒜𝒟 𝒪𝒰𝒯 does not automatically output any spaces or newlines (but it does still accept punctuation).

Both of these commands use the same format, which is pretty simple: the number n represents the emoji with code n + 65536. Most emoji have codes above 65536, so this should make things easier. For instance, 🙀 (U+1F640) is #63040, 🙈 (U+1F648) is #63048, and 😇 (U+1F607) is #62983. Some emoji are made by combining other emoji with a zero width joiner emoji in between them; to use those, put WITH ZWIDGE on the first emoji, and make sure to cancel it out with WITH NOTHING on the second emoji; for instance, 👩‍💻 (woman technologist, 👩 + 💻) is WITH ZWIDGE #62569 + WITH NOTHING #62651.
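The offset arithmetic is simple enough to check in Python. This sketch uses made-up helper names and only covers codes of 65536 and above:

```python
OFFSET = 65536
ZWIDGE = "\u200d"  # zero width joiner


def emoji_for(n: int) -> str:
    """The number n stands for the character with code n + 65536."""
    return chr(n + OFFSET)


def number_for(emoji: str) -> int:
    """Inverse of emoji_for, for a single character at or above U+10000."""
    return ord(emoji) - OFFSET


# Woman technologist: woman, then the joiner, then laptop.
technologist = emoji_for(62569) + ZWIDGE + emoji_for(62651)
```

For example, `number_for("😇")` is 62983, and `emoji_for(63040)` is 🙀.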

Occasionally, you might want to use emoji with emoji codes below 65536. To do that, you need to output two numbers, in reverse–UTF-16 encoding. The first number should be in the range #55296 through #56319 (U+1D800 through U+1DBFF), and the second #56320 through #57343 (U+1DC00 through U+1DFFF). The bottom ten bits of each number are selected and then concatenated together, and that's used as an emoji code. If the emoji code is over 65536, or if there's any other encoding error, a sign (�) will be output instead.
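A hedged Python sketch of that decoding step is shown here. It assumes the first number supplies the upper ten bits, and it does not model the special handling of U+D800 through U+DFFF described in the next paragraph. The function name is invented.

```python
REPLACEMENT = "\ufffd"  # the sign output on encoding errors


def decode_pair(first: int, second: int) -> str:
    """Combine two numbers into one character with a code below 65536."""
    if not (55296 <= first <= 56319 and 56320 <= second <= 57343):
        return REPLACEMENT
    # Take the bottom ten bits of each number and concatenate them.
    code = ((first & 0x3FF) << 10) | (second & 0x3FF)
    if code > 65536:
        return REPLACEMENT
    return chr(code)
```

For instance, U+263A has code 9786, which splits into the ten-bit halves 9 and 570, so `decode_pair(55296 + 9, 56320 + 570)` yields that character.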

(But what if you actually want to output emoji between U+1D800 and U+1DFFF? Well, the emoji codes U+D800 through U+DFFF aren't being used, so output them instead.)

ℛℰ𝒜𝒟 𝒪𝒰𝒯 and 𝒲ℛℐ𝒯ℰ ℐ𝒩 use the same gerunds as READ OUT and WRITE IN; abstaining by gerund abstains both versions of the statement.

`GIVE UP`

Exits the program. Running to the end of the program without giving up is an error.

System library

The system library is a library that contains subroutines that may occasionally be useful. The system library contains COME FROM statements from labels in the range (1000)-(1499); to use the system library, use one of these labels in your code, followed by a COME FROM statement for the corresponding label in the (1500)-(1999) range. (This should be a COME FROM AFTER, since the system library uses its own subroutines internally.)

The system library will be auto–sufficiently-advanced-technologically included at the end of the program if it contains a COME FROM statement with a label in the range (1500)-(1999), but no statements with labels in that range. Checking "Native syslib" will instead include a native library with the same interface (but only if the system library already would have been included).

Each subroutine takes one extra parameter, called the stash, which is passed through unchanged. This can be a useful place to put the rest of the program's state.

The system library routines are as follows:

Subroutine	Entry point	Exit point	Input	Output	Notes
Add	`(1000)`	`(1500)`	`stash ¢ x ¢ y`	`|2\3` = `stash`, `|1\3` = `x` + `y`
Subtract	`(1010)`	`(1510)`	`stash ¢ x ¢ y`	`|4\7` = `stash`, `|2\7` = (`x` − `y`) max 0, `|1\7` = (`y` − `x`) max 0	Also useful for comparison
Decrement	`(1011)`	`(1511)`	`stash ¢ x`	`|2\3` = `stash`, `|1\3` = `x` − 1	Error if `x` is 0
Multiply	`(1020)`	`(1520)`	`stash ¢ x ¢ y`	`|2\3` = `stash`, `|1\3` = `x` × `y`
Divide/modulo	`(1030)`	`(1530)`	`stash ¢ x ¢ y`	`|4\7` = `stash`, `|2\7` = `x` ÷ `y`, `|1\7` = `x` mod `y`	Error if `y` is 0
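To make the calling convention concrete, here is a Python model of the Add and Subtract rows. It only imitates the input and output layout with bigints; it is not how the system library itself is written, and all names are illustrative.

```python
def interleave(*operands: int) -> int:
    """Interleave non-negative operands; the last operand owns bit 0."""
    n = len(operands)
    result = 0
    for k, value in enumerate(operands):
        j = 0
        while value:
            result |= (value & 1) << (j * n + n - 1 - k)
            value >>= 1
            j += 1
    return result


def deinterleave(value: int, count: int) -> list:
    """Split a non-negative value back into `count` interleaved operands."""
    operands = [0] * count
    pos = 0
    while value >> pos:
        if (value >> pos) & 1:
            operands[count - 1 - pos % count] |= 1 << (pos // count)
        pos += 1
    return operands


def syslib_add(variable: int) -> int:
    stash, x, y = deinterleave(variable, 3)
    return interleave(stash, x + y)


def syslib_subtract(variable: int) -> int:
    stash, x, y = deinterleave(variable, 3)
    return interleave(stash, max(x - y, 0), max(y - x, 0))
```

Calling `syslib_add(interleave(5, 2, 3))` returns a value whose two interleaved parts are the unchanged stash 5 and the sum 5.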

Please do note that the system library assumes that none of its statements are abstained by the main program, that no bits have been ignored, and that the constants #0, #1, and #3 haven't been overwritten.

Errors

E000: A statement that has a syntax error was executed. Because the parser I'm using backtracks by default, I think this error will come up in more situations than E017 does, compared to INTERCAL.
E002: You used the big money form of the interleave operator, but forgot the spot before the last two digits of the next number.
E017: A miscellaneous syntax error occurred.
E079: Not enough statements have PLEASE or PLEASE DO as their statement identifier (must be at least 1/3 of all statements).
E099: Too many statements have PLEASE or PLEASE DO as their statement identifier (must be at most 2/3 of all statements).
E139: A label used in ABSTAIN FROM or REINSTATE doesn't refer to any existing statement.
E182: The same label was used for multiple statements.
E197: You used a label that wasn't a positive integer. Since negative numbers and decimals result in parse errors, this will only occur if you try to use (0) as a label.
E246: A fraction after | has an even denominator. This is not allowed.
E255: Your browser doesn't support bigints, so it can't run the BigINTERCAL interpreter. Try upgrading your browser or using a different one.
E319: A fraction was not reduced.
E436: RETRIEVE was attempted when the stash stack was empty.
E555: Multiple active and non-abstained COME FROMs refer to the same label.
E579: Numeric input was not in the right format, or a digit name was misspelled.
E633: The program got to the end without GIVING UP.
E778: An unknown error occurred in the interpreter.
E987: A symbol name was written with the worm (-) symbol.

Compatibility

BigINTERCAL is not compatible with INTERCAL. If you find a nontrivial program that works in both INTERCAL and BigINTERCAL, please report it as a bug.

Top-left wisdom tooth: 2-adic fractions

(The INTERCAL reference manual had a tonsil, so this needs yet another removable organ.)

2-adic numbers are a type of number where, instead of the digits continuing infinitely to the right of the decimal point, they continue infinitely to the left (and the digits have to be in base 2). This makes them useful for selecting an infinite number of bits from a number. In BigINTERCAL, only rational 2-adic numbers can be used, which means that the digits of every number eventually form a repeating pattern. This section will focus on how to create a fraction with the bits you want; more information about p-adic numbers in general can be found elsewhere.

For non-negative integers, their 2-adic representation is the same as their representation in binary (with infinite zeros to the left). (This means that |n/1 is the same as |1\1 ~ #n.)
The 2-adic representation of −1 is all 1s. (This is why |1\1 (−1/1) selects the entire variable.)
More generally, for negative integers, their 2-adic representation is the same as their two's-complement representation in binary, with 1s extending infinitely to the left.
For any n, −1/(2ⁿ − 1) has a one every n bits (1 one, n−1 zeros, repeating), starting in the least-significant bit. So −1/3 = …01010101, −1/7 = …001001001, −1/15 = …00010001, and so on.

Multiplying the fraction by 2 shifts all the bits left one place. Multiplying by 2ⁿ shifts left by n places. Combining this with the previous point, we can extract any argument to an interleave:

If the variable is…	Then…
`x ¢ y`	`x` = `|2\3`, `y` = `|1\3`
`x ¢ y ¢ z`	`x` = `|4\7`, `y` = `|2\7`, `z` = `|1\7`
`x ¢ y ¢ z ¢ w`	`x` = `|8\15`, `y` = `|4\15`, `z` = `|2\15`, `w` = `|1\15`
`a₀ ¢ … ¢ aₙ₋₁`	`aₖ` = `|`2ⁿ⁻¹⁻ᵏ`\`(2ⁿ − 1)

Dividing the fraction by 2ⁿ shifts all the bits right by n places. This can result in 1 bits past the right end of the number; these correspond to fractions with even denominator, which are not allowed in BigINTERCAL.
If x and y don't have any bits in common in their 2-adic representation, then x + y's representation has a 1 wherever either x's or y's representation has a 1.
If the bits that are 1 in y are a subset of the bits that are 1 in x, then x − y's representation has a 1 wherever x's representation has a 1 and y's has a 0.

Showing that all of these are consistent with each other is left as an exercise to the reader. Due . Don't forget.
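If you want to check a fraction before using it, the low bits of its 2-adic representation can be computed with modular arithmetic. This Python sketch is an assumption about how one might do that (it needs Python 3.8 or later for the modular inverse via `pow`); the function name is made up.

```python
def two_adic_bits(numerator: int, denominator: int, count: int) -> int:
    """Return the lowest `count` bits of numerator/denominator as a 2-adic number."""
    if denominator % 2 == 0:
        raise ValueError("E246: even denominator")
    modulus = 1 << count
    # An odd denominator always has an inverse modulo a power of two.
    inverse = pow(denominator, -1, modulus)
    return (numerator * inverse) % modulus
```

For example, `two_adic_bits(-1, 3, 8)` is `0b01010101` and `two_adic_bits(-2, 15, 8)` is `0b00100010`, in line with the patterns listed above.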

Keep in mind that selected bits stay in the order they're already in. For instance, if the variable is x ¢ y, |7\15 (= −2/15 + −1/3) will select every other bit of x and every bit of y, and they'll be arranged as …yxyyxyyxy, which is different than if you did |2\15 ¢ |1\3.

Top-right wisdom tooth: Complete grammar

(Oh, Grammar, what big teeth you have!)

Preprocessing: the following replacements are made before parsing:

Find	Replace
`a` ⋮ `z`	`A` ⋮ `Z`
`¥`	`\`
`¦`	`'|`
`‰`	`%0`
`DO`	`DO` (single token)
`PLEASE`	`PLEASE` (single token)
`whitespace`	ignored (except within `DO` or `PLEASE`)

In the following grammar, any word other than PLEASE or DO means a sequence of those characters, possibly with whitespace in between.

`int`	→	`digit`+
`int$`	→	`digit`* `.` `digit` `digit`
`unary`	→	`&`
	|	`V`
	|	`∀` | `\`
`primary`	→	`#` `unary`? `int`
	|	`|` `unary`? `int` (`/` | `\`) `int`
	|	`'` `unary`? `expr` `'`
	|	`"` `unary`? `expr` `"`
`primary$`	→	`#` `unary`? `int$`
	|	`|` `unary`? `int$` (`/` | `\`) `int`
	|	`'` `unary`? `expr$` `'`
	|	`"` `unary`? `expr$` `"`
`expr`	→	`primary`
	|	`primary` (`¢` `primary` | `$` `primary$`)+
	|	`primary` `~` `primary`
`expr$`	→	`primary$`
	|	`primary$` (`¢` `primary` | `$` `primary$`)+
	|	`primary$` `~` `primary`
`stmt-start`	→	(`(` `int` `)`)?
		(`PLEASE` | `DO` | `PLEASE` `DO`)
		((`NOT` | `N'T`) (`%` `digit`+)? | `%` `digit`+ (`NOT` | `N'T`)?)?
`stmt-body`	→	`expr`
	|	`STASH` | `RETRIEVE`
	|	(`IGNORE` | `REMEMBER`) `expr`
	|	`COMEFROM` `(` `int` `)` (`AFTER` `(` `int` `)`)?
	|	(`ABSTAINFROM` | `REINSTATE`) `(` `int` `)`
	|	(`ABSTAINFROM` | `REINSTATE`) `gerund` (`+` `gerund`)*
	|	(`ABSTAINFROM` | `REINSTATE`) `expr` `plural-gerund`
	|	(`READOUT` | `ℛℰ𝒜𝒟𝒪𝒰𝒯`) `read-clause` (`+` `read-clause`)*
	|	(`WRITEIN` | `𝒲ℛℐ𝒯ℰℐ𝒩`) `expr`
	|	`GIVEUP`
	|	`any-character`* (until the next `stmt-start` or end-of-file)
`gerund`	→	`CALCULATING`
	|	`STASHING`
	|	`RETRIEVING`
	|	`IGNORING`
	|	`REMEMBERING`
	|	`COMINGFROM`
	|	`ABSTAINING`
	|	`REINSTATING`
	|	`READINGOUT`
	|	`WRITINGIN`
	|	`GIVINGUP`
	|	`SPLATTING`
`plural-gerund`	→	`CALCULATINGS`
	|	`STASHINGS`
	|	`RETRIEVINGS`
	|	`IGNORINGS`
	|	`REMEMBERINGS`
	|	`COMINGSFROM`
	|	`ABSTAININGS`
	|	`REINSTATINGS`
	|	`READINGSOUT`
	|	`WRITINGSIN`
	|	`GIVINGSUP`
	|	`SPLATTINGS`
`read-clause`	→	(`WITH` `char-name` (`S` | `ES`)?)? `expr`
`stmt`	→	`stmt-start` `stmt-body`
`program`	→	`stmt`* `end-of-file`
