Neoglyphic: A Non-Spoken Language
Formerly: Neo Cuneiform Writing System
Created Feb 3, 2010
Last Modified Feb 10, 2010
Communicating with the Distant Future

Virtually everything we read today is stored on fragile paper or in digital form that is frighteningly ephemeral compared to the clay tablets of Mesopotamia. Suppose that we realized that our modern civilization was about to fail for one reason or another, and we wished to preserve some portion of human knowledge for the future when the next civilization began to rise to take our place.
We would have to assume that some tens of thousands of years into the future paper documents and digital storage had long since disintegrated. We would also have to assume that no language remained that was recognizably related to English. Now suppose that we had some more permanent medium at our disposal, perhaps even baked porcelain tablets, or engraved sheets of marble, and that we wished to record a selected body of knowledge in a manner that could be read by intelligent humans of the future. What we could assume is that we would be communicating with human beings, and that the nature of planet Earth, and of the characteristic flora and fauna would have changed very little in the intervening thousands of years. We could also assume that our target audience was a civilization which, at a minimum, had rediscovered written language, and understood enough about its own and neighboring languages that it might occur to them that these tablets or sheets were, in fact, documents which could, in principle, be translated into their own language.
Without recourse to any common language, how would we go about constructing a language, and instructing future Earth-bound humans in how to read that language? And how would we go about packing the most information possible into a limited space? Unlike the various proposals advanced for communicating with extraterrestrial civilizations, we could not assume that our target audience understood atomic spectra, electromagnetic wavelengths, or planetary orbits. We might assume that our readers had some limited metalworking ability, but we shouldn't assume they understand something as high-tech as a steam engine. Imagine the goal to be roughly equivalent to leaving a document by means of which Caesar's philosophers and engineers could have been taught how to build a steam-powered horseless chariot and a hang glider.
Assuming a lossless recording method, such as characters deeply cut into plates of stainless steel, or very easily distinguished symbols stamped deeply into wet porcelain clay and then glazed and fired, the first step would be to remove all redundancy from the orthography of the language. We might begin with a basic vocabulary of 1500 to 2000 words, each represented by some distinct symbol.
Neoglyphic
Imagine a writing system designed to be pressed into wet clay or stamped into some kind of durable metal. For maximum density each character should convey as much information as possible. It would not be necessary or desirable to convey any phonological information. The pronunciation of the language would be irrelevant because future civilizations would have no way of knowing which phonetic symbol represented which sound. In addition, alphabetic writing systems take up much more space than pictographic or ideographic systems. The word "language", for example, requires 8 symbols plus an empty space before and after that string of 8 symbols. An ideographic symbol for "language" would be a single glyph, taking up nearly one tenth of the physical space on a page or clay tablet. As a general rule, a text written in an ideographic system could pack five or six times as much information into the same page space. For this reason the language should be ideographic. We would, therefore, need an inventory of at least several thousand possible unique characters within some standard character space.
The standard character space would be a grid, either implied, or perhaps impressed or stamped into the recording medium before any characters are written. It might use a uniform grid of reference dots, for example. The size of the grid could be anything, but the right balance should be found between compact size and inventory of possible unique symbols within a given size grid.
From each cell, or pre-stamped dot in the grid, a line or bar could be pressed or stamped with a stylus or tool, connecting that dot with the dot to the right, or with the dot below, or both. Cells on the rightmost edge of the single-character grid space could only connect to the dot below, and cells on the bottom row could only connect to the dots to their right. The bottom-right cell could not connect either to the right, or below. From these limitations it is a simple matter to compute the number of unique characters that can be pressed or stamped into a grid of any particular size.
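The counting rule just described can be sketched in a few lines of Python. This is only an illustration of the rule, not part of the writing system itself; the function name `count_characters` and its parameters are made up for this example, and a grid is described by its number of dot columns and dot rows.

```python
def count_characters(cols: int, rows: int) -> int:
    """Count the bar patterns possible on a grid of cols-by-rows dots.

    Each dot may connect to the dot on its right and/or the dot below it,
    except that dots in the rightmost column cannot connect right and dots
    in the bottom row cannot connect down.
    """
    total = 1
    for r in range(rows):
        for c in range(cols):
            options = 1
            if c < cols - 1:   # a bar to the right is allowed: present or absent
                options *= 2
            if r < rows - 1:   # a bar downward is allowed: present or absent
                options *= 2
            total *= options
    return total


print(count_characters(2, 2))  # 16
print(count_characters(3, 3))  # 4096
```

A two-by-two grid of dots has only four bar positions, hence 16 patterns, which is far too few for an ideographic vocabulary.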
Diagonal bars would be avoided since, if not accurately stamped, they might lead to ambiguity. In addition, diagonal bars would not be of the same standard length as the stamping tool used for orthogonal lines. And if another reason is needed to avoid them, the characters begin to look cluttered and blotchy when diagonal lines are mixed with orthogonal ones in the same character.
Three particular grid sizes recommend themselves based on these criteria. They are the three-by-three grid, the three-by-four grid and the four-by-four grid.
The Three-by-Three grid
In a three-by-three grid, the upper-left four cells can connect in any one of four different ways (no connection, connect right, connect down, connect both ways), accounting for 4^4 = 256 possible patterns. In addition, the upper two cells in the right-most column and the left-most two cells in the bottom row can each connect in two different ways, accounting for 2^2 = 4 ways each. This gives us a total of 256 * 4 * 4 = 4096 unique characters.
A different but equivalent method of enumerating the possibilities is to note that there are 12 places to put a bar. Each of these places can either contain a bar (1) or an empty space (0). The total number of patterns, therefore, is the same as the set of binary numbers from 000000000000 to 111111111111, or 2^12 = 4096 patterns. Since the two numbers match we can be assured that our calculations are correct.
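The binary-number view also suggests a convenient way to handle characters in software: treat each character as a 12-bit integer and draw it from its bits. The sketch below is one possible way to do that. The ordering of the bar positions, the names `bar_slots` and `render`, and the ASCII drawing style are all assumptions made for illustration.

```python
COLS, ROWS = 3, 3  # dots per row and per column in the three-by-three grid


def bar_slots(cols=COLS, rows=ROWS):
    """List every bar position as (row, col, direction) in a fixed order."""
    slots = []
    for r in range(rows):
        for c in range(cols):
            if c < cols - 1:
                slots.append((r, c, "right"))
            if r < rows - 1:
                slots.append((r, c, "down"))
    return slots


def render(code, cols=COLS, rows=ROWS):
    """Draw the character whose bars are the 1-bits of `code` as ASCII art."""
    bars = {slot for i, slot in enumerate(bar_slots(cols, rows)) if (code >> i) & 1}
    lines = []
    for r in range(rows):
        # A row of dots, with "--" wherever a horizontal bar is present.
        line = ""
        for c in range(cols):
            line += "o"
            if c < cols - 1:
                line += "--" if (r, c, "right") in bars else "  "
        lines.append(line)
        if r < rows - 1:
            # The gap between dot rows, with "|" wherever a vertical bar is present.
            gap = ""
            for c in range(cols):
                gap += "|" if (r, c, "down") in bars else " "
                if c < cols - 1:
                    gap += "  "
            lines.append(gap)
    return "\n".join(lines)


print(len(bar_slots()))   # 12 bar positions
print(render(0b111111111111))
```

Because every integer from 0 to 4095 selects a different set of bars, each one draws a different character, which is the same count reached by the cell-by-cell method.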

In practice some portion of those character patterns might not be usable depending on what aesthetic criteria were established for the characters. The grid, and a few sample characters, might look something like those to the left.
If we assume that something like 80% of the possible characters meet our aesthetic criteria, then we would have something in the neighborhood of 3200 to 3300 unique usable characters. Whether this would be a sufficient lexicon size is something that could be discovered by working out trial translations of the sort of documents we might want to transmit to the future.
In any case, it's probably safe to say that the three-by-three grid, occupying 9 square units of space, is the minimum size character for an ideographic writing system.
The Three-By-Four Grid

The three-by-four grid takes up more real estate on our tablets, plates, or stones, but affords us a greater variety of unique possible characters. Using the same enumeration technique as above, there are 17 places to put a bar, so the inventory of unique characters is 2^17 = 131,072. Even allowing for unusable characters due to aesthetic unsuitability, having over a hundred thousand possible characters is certainly adequate to the task of an ideographic writing system.
The total real estate used by each character is 12 square units, or 33% more space, meaning that the same size tablet, plate, or stone would hold only 75% as many words. On the other hand, having a greater vocabulary would imply fewer circumlocutions, perhaps saving as much space in word count as was lost to word size.
The Four-by-Four Grid
Finally, the four-by-four grid is almost certainly the largest grid one would reasonably consider for this task. With 24 places to put a bar, the inventory of possible unique characters is 2^24 = 16,777,216, providing far more ideographs than would ever be necessary for any reasonably sized language.

Bear in mind that the only means some future civilization would have of learning the language would be by way of whatever materials were provided, in that language, by us. That being the case, the vocabulary would necessarily be limited to those words whose meanings could be adequately conveyed non-linguistically.
The four-by-four grid uses 78% more real estate per character than the three-by-three grid. This means that a given size of tablet, plate, or stone would hold only 56% as many words. It is unlikely that the reduction in circumlocutions would make up for the increased per-character size.
All things considered, a four-by-four grid is probably overkill.
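The space and inventory figures for the three grid sizes can be reproduced with a short script. This is a sketch for checking the arithmetic only. The function name `summarize` is made up for this example, and the bar count uses the formula rows * (cols - 1) + cols * (rows - 1), which is simply another way of counting the places where a bar can go.

```python
BASE_AREA = 3 * 3  # the three-by-three grid is the point of comparison


def summarize(cols: int, rows: int) -> dict:
    """Return area, character inventory and relative space use for a grid."""
    area = cols * rows
    bars = rows * (cols - 1) + cols * (rows - 1)  # horizontal + vertical positions
    return {
        "area": area,
        "characters": 2 ** bars,
        # How much more space one character needs than a three-by-three character.
        "extra_space_pct": round(100 * (area / BASE_AREA - 1)),
        # How many characters fit in the space that holds 100 three-by-three ones.
        "capacity_pct": round(100 * BASE_AREA / area),
    }


for name, (cols, rows) in {"3x3": (3, 3), "3x4": (3, 4), "4x4": (4, 4)}.items():
    print(name, summarize(cols, rows))
```

Each step up in grid size multiplies the inventory far faster than it shrinks the capacity of a tablet, so the choice turns on how many characters the vocabulary can actually use rather than on how many are available.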
Three High, Variable Width
It seems to me that a three-by-four grid, having over a hundred thousand unique glyphs, is ideal for this application. The three-by-three grid, however, is more compact and easier to implement as a font. A good alternative is to combine those two grid sizes by turning the three-by-four grid on its side and making it three units tall and four units wide. Now we can write glyphs that are all three units high, but vary from 1 to 4 units wide. This gives us well over a hundred thousand unique glyphs.
Using the Zegments Font for Neoglyphic
A simple way to display Neoglyphic ideograms is to use the Zegments.ttf font with 31 glyphs representing all 31 possible combinations of the two leftmost vertical segments plus the three horizontal segments in the leftmost column. Wider characters are drawn with two, three, or four consecutive characters from that special font. Taller characters can be drawn by stacking glyphs one above the other. The space bar provides character spacing. Here is a table of how the keystrokes are assigned:

The line of characters above would then be typed as:
WH FOX JZH dFc GX JTH GOP KX BZ Ra YCZ bbX
Which displays like this in MS Word using Zegments.ttf
Or if your browser supports embedded fonts: WH FOX JZH dFc GX JTH GOP KX BZ Ra YCZ bbX
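The figure of 31 glyphs in the font follows from the five segments each glyph can contain: two vertical and three horizontal, each either present or absent, with the all-empty combination left to the space bar. A minimal sketch of that count is below. The segment labels are invented here for readability, and the actual keystroke assignments are not reproduced.

```python
from itertools import product

# Hypothetical labels for the five segments in one column of the font.
SEGMENTS = (
    "upper vertical",
    "lower vertical",
    "top horizontal",
    "middle horizontal",
    "bottom horizontal",
)

# Every on/off combination of the five segments, except the empty one.
combos = [bits for bits in product((0, 1), repeat=len(SEGMENTS)) if any(bits)]

print(len(combos))  # 31
for bits in combos[:3]:
    print([name for name, on in zip(SEGMENTS, bits) if on])
```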
Download the Zegments Font
The font described above, named Zegments, is available from
FontStruct.com.
Or Download Zegments here.
A 50-page MS Word document with 30,761
Neoglyphic (Neo) characters can be downloaded here: Zegments.doc
Use this document with the Zegments font. Here's a sample from the Neo Specimen Book.

>> Next Up >> First steps toward common ground.