Character complexity and redundancy in writing systems over human history

A writing system is a visual notation system in which a repertoire of marks, or strokes, is used to build a repertoire of characters. Are there commonalities across writing systems in the rules governing how strokes combine into characters, commonalities that might help us identify selection pressures on the development of written language? To answer this question, we examined how strokes combine to make characters in more than 100 writing systems from across human history. The systems range in size from about 10 to 200 characters and include numerals, abjads, abugidas, alphabets and syllabaries from five major taxa: Ancient Near-Eastern, European, Middle Eastern, South Asian and Southeast Asian. We discovered underlying similarities in two fundamental respects. (i) The number of strokes per character is approximately three, independent of the number of characters in the writing system; numeral systems are the exception, having on average only two strokes per character. (ii) Characters are ca. 50% redundant, independent of writing system size; intuitively, this means that a character's identity can be determined even when half of its strokes are removed. Because writing systems are under selective pressure to have characters that are easy for the visual system to recognize and for the motor system to write, these fundamental commonalities may be a fingerprint of mechanisms underlying the visuo-motor system.
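As a rough illustration of the two quantities above, the following Python sketch computes the mean number of strokes per character and one possible redundancy measure for a toy writing system. The abstract does not state how redundancy was measured, so the formula used here (one minus the ratio of the bits needed to distinguish the characters to the bits the stroke sequences could carry) is an assumption, as are the function names, the stroke-type labels and the hypothetical eight-character system.

```python
import math

# Hypothetical toy system: 8 characters, each built from 3 strokes drawn
# from 4 stroke types (v = vertical, h = horizontal, d and b = diagonals).
# These characters are invented for illustration, not taken from the study.
TOY_SYSTEM = {
    "c1": ("v", "h", "h"),
    "c2": ("v", "v", "h"),
    "c3": ("d", "b", "h"),
    "c4": ("v", "d", "v"),
    "c5": ("h", "v", "h"),
    "c6": ("d", "h", "b"),
    "c7": ("v", "b", "d"),
    "c8": ("b", "v", "h"),
}


def mean_strokes_per_character(system):
    """Average number of strokes across all characters in the system."""
    return sum(len(strokes) for strokes in system.values()) / len(system)


def combinatorial_redundancy(system):
    """Assumed measure: 1 - log2(C) / (L * log2(S)).

    C is the number of characters, S the number of distinct stroke types,
    and L the mean number of strokes per character. L * log2(S) is the
    capacity of a stroke sequence in bits; log2(C) is what is needed.
    """
    stroke_types = {s for strokes in system.values() for s in strokes}
    capacity_bits = mean_strokes_per_character(system) * math.log2(len(stroke_types))
    needed_bits = math.log2(len(system))
    return 1 - needed_bits / capacity_bits


print(mean_strokes_per_character(TOY_SYSTEM))  # 3.0
print(combinatorial_redundancy(TOY_SYSTEM))    # 0.5
```

In this toy case, three strokes from four types could encode 64 distinct characters (6 bits), while only 8 characters (3 bits) are used, so half of the capacity is redundant under the assumed measure.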
