Random Word and Sentence Maker-Upper

This is a JavaScript program that makes up words according to a set of rules. I originally wrote it to produce words that are pronounceable and look as if they come from particular natural languages. (The built-in rules are not based on actual statistics, only on my impressions as someone who isn't fluent in anything other than English.) I occasionally used it for fictional character names; later I realized it could be useful for conlanging and added a way to input your own language rules.

It can also generate sentences, and it can generate words from a context-free language given its grammar (although there's a limit to how far it'll recurse).

How to input languages

Type rules in the box under "Language of origin".

As a simple example, the following language generates the words cat, dog, and mouse with equal probability:

WORD = cat | dog | mouse
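
As a rough illustration of what "equal probability" means for the example above, here is a minimal sketch. It is not the program's actual code; the function name pickUniform and the injectable rng parameter are my own assumptions.

```javascript
// Options of the example rule, split on the vertical bar.
const options = "cat | dog | mouse".split("|").map((s) => s.trim());

// Pick one option uniformly. rng defaults to Math.random but can be
// swapped out, which makes the choice reproducible in tests.
function pickUniform(list, rng = Math.random) {
  return list[Math.floor(rng() * list.length)];
}

// Tally many draws; each count should come out close to 1000.
const counts = { cat: 0, dog: 0, mouse: 0 };
for (let i = 0; i < 3000; i++) {
  counts[pickUniform(options)]++;
}
console.log(counts);
```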

WORD, in all upper case, marks the rule that generates the whole word. Vertical bars separate the options (on US keyboards, a vertical bar can be typed as shift-\; on iOS, tap 123 and then #+= and it should be near the left of the middle row; or you can click the | button below the text area). Rules with names other than WORD define parts of the word (consonants, vowels, etc.), which you can then use like this:

vowel = a | e | i | o | u
consonant = p | t | k | m | n
WORD = consonant vowel | consonant vowel consonant

Each rule should be on its own line (explicitly press return/enter). The names can be anything (e.g., you can use C and V instead, if you want), as long as they don't contain spaces or the characters =*|#. When you use the parts of a word, make sure to separate the different parts with spaces (e.g., if you've defined parts C and V, then you can use them as C V, not CV), and you need to make sure to use consistent capitalization (C and c are different). The rules don't have to be in a particular order; you could put the WORD rule first if you want.
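
To make the points about one rule per line, allowed names, and space-separated parts concrete, here is a minimal parsing sketch. It only assumes the syntax described above; parseGrammar is a hypothetical name, and the sketch ignores frequencies and comments.

```javascript
// Hypothetical parser: one rule per line, in the form "name = option | option".
function parseGrammar(source) {
  const rules = {};
  for (const line of source.split("\n")) {
    const eq = line.indexOf("=");
    if (eq < 0) continue; // skip lines that are not rules
    const name = line.slice(0, eq).trim();
    // Names may not contain spaces or any of the characters = * | #
    if (!/^[^\s=*|#]+$/.test(name)) {
      throw new Error(`Invalid rule name: "${name}"`);
    }
    // Options are separated by "|"; parts within an option by spaces.
    rules[name] = line
      .slice(eq + 1)
      .split("|")
      .map((option) => option.trim().split(/\s+/));
  }
  return rules;
}

const parsed = parseGrammar("C = p | t | k\nV = a | i | u\nWORD = C V | CV");
console.log(parsed.WORD); // [["C", "V"], ["CV"]]
```

Note that "CV" comes out as a single part, which matches neither the C rule nor the V rule. That is why the parts need spaces between them.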

Word parts can refer to other word parts, so you can do things like

vowel = a | i | u
consonant = p | t | k
syllable = consonant vowel | consonant vowel consonant
WORD = syllable | syllable syllable | syllable syllable syllable

The program determines whether a word refers to a rule or to some letters to add to the word by whether a rule actually exists; if, for instance, you remove the "vowel" rule, it'll interpret "vowel" as some letters to actually put in the word, and generate words like "pvowelk" and "tvowelpkvowel". You can mix things that refer to other rules and letters to just insert, so for instance consonant vowel n can be used to generate a syllable that ends in the letter "n".
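
The rule-or-letters decision can be sketched as a small recursive function. This is only an illustration under my own assumptions: the rules object is a hypothetical in-memory form of the grammar, and expand is not necessarily how the program itself is written.

```javascript
// Hypothetical grammar: rule name -> list of options,
// each option being a list of space-separated parts.
const rules = {
  vowel: [["a"], ["i"], ["u"]],
  consonant: [["p"], ["t"], ["k"]],
  syllable: [["consonant", "vowel"], ["consonant", "vowel", "n"]],
  WORD: [["syllable"], ["syllable", "syllable"]],
};

function expand(part, grammar, rng = Math.random) {
  // Own-property check, so a part like "constructor" is not mistaken for a rule.
  if (!Object.prototype.hasOwnProperty.call(grammar, part)) {
    return part; // no rule with this name: insert the letters as they are
  }
  const options = grammar[part];
  const option = options[Math.floor(rng() * options.length)];
  // Expand every part of the chosen option and glue the results together.
  return option.map((p) => expand(p, grammar, rng)).join("");
}

console.log(expand("WORD", rules));
```

If the vowel entry is deleted from rules, expand returns the text "vowel" unchanged wherever it appears, which is how words like "pvowelk" come about.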

You can adjust the frequencies of the different options by adding an asterisk (*) and a number, like this:

vowel = a | i | u
consonant = p * 5 | t * 3 | k * 2
WORD = consonant vowel * 3 | consonant vowel consonant

* 2 is twice as common as * 1, * 3 is three times as common as * 1, etc. The numbers don't have to add up to any particular total, and decimals are allowed. An option with no number is equivalent to * 1.
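
Here is one way the frequencies could be turned into a weighted random choice. It is a sketch, assuming only the "* number" syntax described above; parseOptions and pickWeighted are hypothetical names.

```javascript
// Parse "p * 5 | t * 3 | k * 2" into [{ text, weight }]; weight defaults to 1.
function parseOptions(rhs) {
  return rhs.split("|").map((part) => {
    const [text, weight] = part.split("*").map((s) => s.trim());
    return { text, weight: weight === undefined ? 1 : parseFloat(weight) };
  });
}

// Pick an option with probability weight / (sum of all weights).
function pickWeighted(options, rng = Math.random) {
  const total = options.reduce((sum, o) => sum + o.weight, 0);
  let r = rng() * total;
  for (const o of options) {
    r -= o.weight;
    if (r < 0) return o;
  }
  return options[options.length - 1]; // guard against rounding error
}

const consonants = parseOptions("p * 5 | t * 3 | k * 2");
console.log(pickWeighted(consonants).text); // "p" about half the time
```

Because each weight is divided by the total, "p * 5 | t * 3 | k * 2" and "p * 50 | t * 30 | k * 20" behave the same.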

You can use the predefined word part SPACE (all caps) to insert a space, or EMPTY to define a word part that can be empty. For instance, this will sometimes generate words with no consonants:

vowel = a | i | u
consonant = p | t | k | EMPTY
WORD = consonant vowel consonant
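
As a sketch of how SPACE and EMPTY might be handled, they can be treated as parts with a fixed output. The BUILTINS table and renderParts are my own hypothetical names; only the two predefined part names come from the description above.

```javascript
// Assumed built-in parts: SPACE produces a space, EMPTY produces nothing.
const BUILTINS = { SPACE: " ", EMPTY: "" };

// Turn a list of already-chosen parts into the final text.
function renderParts(parts) {
  return parts
    .map((p) =>
      Object.prototype.hasOwnProperty.call(BUILTINS, p) ? BUILTINS[p] : p
    )
    .join("");
}

console.log(renderParts(["EMPTY", "a", "EMPTY"])); // "a" (a word with no consonants)
console.log(renderParts(["pa", "SPACE", "ti"]));   // "pa ti"
```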

If you want to generate a sentence instead of generating a word, you can use SENTENCE in place of WORD; this will automatically put spaces between everything. In addition, in sentence mode only, you can include punctuation, so if there's a rule for generating noun and you type noun., it'll generate a noun and put a period after it. Also, if a word starts or ends with a hyphen, the hyphen and the space next to the hyphen will be removed.
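
The punctuation and hyphen handling in sentence mode could look roughly like this. Both functions are hypothetical sketches, and the exact set of punctuation characters is an assumption on my part.

```javascript
// Split a part such as "noun." into a rule name plus trailing punctuation.
// Which characters count as punctuation is an assumption.
function splitPunctuation(part) {
  const match = /^(.*?)([.,;:!?]*)$/.exec(part);
  return { name: match[1], punctuation: match[2] };
}

// Join generated words with spaces, then remove each hyphen at the edge
// of a word together with the space next to it.
function joinSentence(words) {
  return words.join(" ").replace(/ -|- /g, "");
}

const split = splitPunctuation("noun.");
console.log(split.name, split.punctuation);                // noun .
console.log(joinSentence(["the", "cat", "walk", "-ed."])); // "the cat walked."
console.log(joinSentence(["un-", "do"]));                  // "undo"
```

The hyphen rule is handy for affixes: a suffix can be written as -ed and a prefix as un-, and they attach to the neighboring word.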

Lines starting with # are ignored; this can be used to add comments or to temporarily remove lines that you don't want.
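
Dropping comment lines before parsing is simple to sketch. stripComments is a hypothetical helper, and treating only lines whose very first character is # as comments is my reading of the rule above.

```javascript
// Remove every line that starts with "#" before the rules are parsed.
function stripComments(source) {
  return source.split("\n").filter((line) => !line.startsWith("#"));
}

const source = [
  "# five-vowel system, switched off for now",
  "# vowel = a | e | i | o | u",
  "vowel = a | i | u",
  "WORD = vowel vowel",
].join("\n");

console.log(stripComments(source)); // only the last two lines remain
```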

You can avoid generating certain words using an AVOID rule; for instance, to avoid words containing the sequence yi, use

AVOID yi

This works by first generating the word; then it checks if it contains the characters "yi" and, if so, it discards the word and tries again. You can also replace certain characters using a REPLACE rule. For instance, this changes any occurrence of "np" to "mp":

REPLACE np mp

Or you can delete certain characters using a DELETE rule. For instance, this deletes any hyphens in the generated word:

DELETE -
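
The generate, check, and try again behavior of AVOID described above can be sketched as a retry loop. This is an illustration only: generateAvoiding is a hypothetical name, and the maxTries safety limit is my own assumption, not something this page specifies.

```javascript
// generate: any function that returns a candidate word.
// maxTries is an assumed safety limit so the loop cannot run forever.
function generateAvoiding(generate, avoid, maxTries = 1000) {
  for (let i = 0; i < maxTries; i++) {
    const word = generate();
    if (!word.includes(avoid)) return word; // keep words without the sequence
  }
  return null; // every attempt contained the avoided sequence
}

// A stand-in generator that cycles through fixed candidates.
const candidates = ["yip", "ayi", "pat"];
let next = 0;
const fakeGenerate = () => candidates[next++ % candidates.length];

console.log(generateAvoiding(fakeGenerate, "yi")); // "pat"
```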

AVOID, REPLACE, and DELETE rules can use regular expressions (JavaScript regular expressions, which are similar to Perl regular expressions; don't include the slashes around the regular expressions); in the replacement part, captured groups can be inserted using $1, $2, etc. To match a space, use \x20. For instance, the following replaces all occurrences of np and nb with mp and mb, respectively:

REPLACE n([pb]) m$1

There's also REPLACE1, which works like REPLACE, except it only replaces the first occurrence, and similarly for DELETE1.
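
In JavaScript terms, the difference between REPLACE and REPLACE1 can be modeled with the regular expression's global flag. The sketch below is an assumption about how such rules might map onto String.prototype.replace; applyReplace and applyDelete are hypothetical names.

```javascript
// Apply a REPLACE-style rule; firstOnly = true mimics REPLACE1.
function applyReplace(word, pattern, replacement, firstOnly = false) {
  const flags = firstOnly ? "" : "g"; // "g" replaces every match
  return word.replace(new RegExp(pattern, flags), replacement);
}

// DELETE behaves like a replacement with an empty string.
function applyDelete(word, pattern, firstOnly = false) {
  return applyReplace(word, pattern, "", firstOnly);
}

console.log(applyReplace("anpanba", "n([pb])", "m$1"));       // "ampamba"
console.log(applyReplace("anpanba", "n([pb])", "m$1", true)); // "ampanba"
console.log(applyDelete("ta-ki-no", "-"));                    // "takino"
console.log(applyDelete("ta-ki-no", "-", true));              // "taki-no"
console.log(applyReplace("pa ti", "\\x20", "_"));             // "pa_ti"
```

In the last call the backslash is doubled only because the pattern is written as a JavaScript string literal; typed into the rules box it would be a plain \x20.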

AVOID, REPLACE, and DELETE rules run in the order they appear; for instance, in this example, the AVOID rule does nothing, because all instances of yi have already been replaced by the time the AVOID rule runs:

REPLACE yi i
AVOID yi

...whereas, if the order of the rules is switched, the AVOID rule will discard all words containing yi.
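
The effect of rule order can be checked with a small pipeline sketch that applies the rules one after another. The step objects and postProcess are hypothetical; the real program may store its rules differently.

```javascript
// Apply post-processing steps in the order they were written.
// Returns null when an AVOID step rejects the word.
function postProcess(word, steps) {
  for (const step of steps) {
    const re = new RegExp(step.pattern, "g");
    if (step.type === "AVOID") {
      if (re.test(word)) return null; // discard; the caller would generate again
    } else if (step.type === "REPLACE") {
      word = word.replace(re, step.replacement);
    } else if (step.type === "DELETE") {
      word = word.replace(re, "");
    }
  }
  return word;
}

const replaceThenAvoid = [
  { type: "REPLACE", pattern: "yi", replacement: "i" },
  { type: "AVOID", pattern: "yi" },
];
const avoidThenReplace = [replaceThenAvoid[1], replaceThenAvoid[0]];

console.log(postProcess("kayi", replaceThenAvoid)); // "kai"
console.log(postProcess("kayi", avoidThenReplace)); // null
```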