Phonological information in the Chinese writing system

opening words #

The insights in this post are neither in depth nor new; please see this as a study note of some sort, attempting to organize my intermittent stray learnings on this topic back in 2023 into something more or less coherent.

(Post last modified: 2024-03-04)

intro #

The foundation of the current Old Chinese (OC) model is the philological interpretations of three major kinds of evidence: the rhymes in Shījīng, a.k.a. the Book of Odes; the Chinese characters, or more specifically a large body thereof that formed within the more-or-less correct time-frame; and finally, the documented Middle Chinese (MC) system—how we managed to figure out Middle Chinese from the documents is a topic for another day.

The first two kinds of data are materials from the OC period; the OC phonology we reconstruct should be able to explain their behaviours where relevant. They come with the obligatory caveat that A. neither of them was chronologically flat or socio-geographically uniform, and B. though they are roughly contemporaneous, they do not align perfectly in terms of chronology (general considerations for these points are discussed in e.g. Sagart 1999 § 1.2.1, and point A as it pertains to the Chinese characters is discussed in the sidenote below). The third kind—the MC phonology—was a daughter of OC phonology, at least in broad strokes; the OC model must be reconcilable with it as well. At the core of every OC reconstruction is an attempt to reconcile this tripartite bundle of data phonologically.

While the reconstruction of OC has surely been moving toward a consensus on the general OC structure, the details turn out to be messy under scrutiny. The primary reason for this messiness is that the data involved are systemically highly complex, often with great geographical and chronological uncertainties. This is especially true for Chinese characters. Unfortunately, since Shījīng rhyming provides little insight into anything other than, well, rimes, the reconstruction of OC initials relies largely on the Chinese characters and the Middle Chinese records. This post will focus on that.

xiéshēng #

The majority of the Chinese characters are each composed of some sound-bearing (phonetic, phonophoric, among other terms) element and some meaning-distinguishing (semantic, taxonomic, among other terms) element (hereafter termed the taxogram following Handel 2019). As an example, [Modern Standard Mandarin (pinyin) xiǎo, MC (Baxter’s) xewX, “daybreak”] is comprised of:

The phonetic element is phonetically significant but semantically irrelevant, whereas the taxogram is semantically significant but phonetically irrelevant. I think of it this way: the phonetic element points to a pool of near-homophonous morphemes (recall that in Old Chinese almost all morphemes are monosyllabic), and the taxogram is essentially a classificatory label that offers additional semantic information needed to locate the intended morpheme in that pool.

Traditionally, the relationship among the members of one such pool of phonetic-element-sharing near-homophones is called xiéshēng 諧聲 (XS, literally “sound harmony”), and such pools are referred to as XS series (the term phonetic series is also common). We say there is XS contact between two OC or MC phonological entities (e.g. initials, rimes, etc.) if there exist two morphemes [X] and [Y], each with one of said entities, that can be written with the same phonetic element.
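The definition of XS contact above can be restated as a small counting routine. This is only a minimal sketch: the series labels (`P1`, `P2`, `P3`) and their contents are hypothetical placeholders rather than real Guǎngyùn data, and the function name `xs_contacts` is my own.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: each key stands for one phonetic element, each value
# lists the MC initials of the morphemes written with it. Not real counts.
toy_series = {
    "P1": ["t", "tsy", "dr", "t"],
    "P2": ["m", "m", "x"],
    "P3": ["tr", "tsy"],
}

def xs_contacts(series):
    """For each unordered pair of distinct initials, count the series
    in which both initials occur (i.e. the series attesting XS contact)."""
    counts = Counter()
    for initials in series.values():
        for a, b in combinations(sorted(set(initials)), 2):
            counts[(a, b)] += 1
    return counts

def in_contact(counts, a, b):
    """True if at least one series contains both initials."""
    return counts[tuple(sorted((a, b)))] > 0

contacts = xs_contacts(toy_series)
print(in_contact(contacts, "t", "tsy"))  # both occur in the toy series P1
print(in_contact(contacts, "m", "t"))    # never share a toy series
```

Self-contact (an initial sharing a series with itself) is left out to keep the sketch short. A real investigation would also need to weigh how often a contact occurs, not merely whether it occurs.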

XS and MC in tandem #

The following table contains all phonologically distinct MC initials:

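| labial | dental 1 | retroflex 1 | dental 2 | retroflex 2 | palatal | posterior | |
|---|---|---|---|---|---|---|---|
| p, p | t, t | tr, ʈ | ts, t͡s | tsr, ʈ͡ʂ | tsy, t͡ɕ | k, k | , ʔ |
| ph, | th, | trh, ʈʰ | tsh, t͡sʰ | tsrh, ʈ͡ʂʰ | tsyh, t͡ɕʰ | kh, | |
| b, b | d, d | dr, ɖ | dz, d͡z | dzr, ɖ͡ʐ | dzy, d͡ʑ | g, g | |
| m, m | n, n | nr, ɳ | | | ny, ɲ | ng, ŋ | |
| | | | s, s | sr, ʂ | sy, ɕ | x, x | |
| | | | z, z | zr, ʐ | zy, ʑ | h, ɣ | |
| | l, l | | | | y, j | hj, w | |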

The Chinese character in each cell is the traditional name for that initial. The traditional names differ slightly from the ones shown in Baxter & Sagart (2014: 15): they have /dʑ/ and /ʐ/ [unnamed] while I follow another tradition with /dʑ/ and /ʐ/ . The first in each cell is Baxter’s MC transcription, followed by the phonological value proposed by Huang (1995: 22–34, in Mandarin).

Since there was a large gap of time between OC and MC, by MC times the XS situation had become more opaque, and how one should interpret the MC reflection of XS as an OC phenomenon has provoked much debate, especially when it comes to the initials. I will explain one important example here with my understanding of the present model. Please forgive inevitable oversimplifications.

the becoming of OC coronal stops #

The MC initial stops { t, th, d, tsy, tsyh, dzy, tr, trh, dr } behave as a tight class in XS: contact among them is ordinary and systematic, and in general, series primarily populated by these initials rarely also come into contact with any other initials (except that { y, d, dr, sy, th, trh } form another significant class, which will be discussed later). A typical XS series of this kind is the one with the phonetic element :

The character(s) following each morpheme is the one(s) usually employed for that morpheme in the received classical texts. More examples include the series with

…the series with

…and so on.

(sidenote) Note that the examples of XS I used in this post are all based on the “received” XS, i.e. XS reflected in the inherited Chinese script “of the classics” (Baxter 1992: 347). It is of course indefensible to use received XS indiscriminately as evidence, since the body of this system did not fully stabilize before Hàn times (see e.g. Barnard 1978, Baxter 1991, Sagart 1999 § 1.2.1 for some discussion of this); it can be unclear whether a particular case of XS mapping reflects pre-Qín habits or was a late innovation against inertia. The style and structure of Chinese characters also changed drastically from the pre-clerical to the clerical (lìshū 隸書) and post-clerical scripts, enough so that some XS relations could be obscured in the received XS. When investigating received XS it’s then necessary to single out late inventions wherever possible, and on top of that it’s safer to only speak of significant tendencies, keeping a wary eye on XS behaviours that are statistically scarce and otherwise poorly corroborated.

It’s easy to see how the written evidence from dated excavated texts (e.g. the early-Zhōu bronze inscriptions or Warring-States writings on bamboo slips) has been crucial in clarifying, correcting and confirming conclusions that had previously been drawn (or had failed to be drawn) from received XS; Baxter and Sagart, as a well-known example, have stressed this in their 2014 work. I cite only received XS examples in this post simply for convenience; as demonstrations they suffice. (2024-02-14 addendum, further modified 2024-02-25)

Though the MC values of these initials can be subdivided into three groups with distinct places of articulation, { t, th, d } (dental), { tsy, tsyh, dzy } (alveopalatal) and { tr, trh, dr } (retroflex), this substructure is not reflected in the XS behaviour; while some phonetic elements do appear to be specialized, the three groups come into XS contact with each other with no overall inhibition. The MC dental and retroflex affricates { ts, tsh, dz, tsr, tsrh, dzr } form a similar XS class, with much analogous XS behaviour. As an example, observe the following series with phonetic element :

A statistical investigation of XS among Guǎngyùn items is found in msoeg (2020), on zhihu.com, in Mandarin. (Incidentally, I’m not aware of any academically published investigation of this kind, though that could be my own insufficient or negligent reading; there has to be one already, right?) These observations raise suspicion that the MC three-way place distinction in such series is secondary.

The suspicion grows once we take into account the distribution of these MC initials. MC finals are traditionally arranged into four divisions; here I refrain from discussing their natures, only noting that, in Baxter’s transcription, MC division-III finals are marked by a medial yod -j- (we now have good reason to conclude that this yod is secondary) or -i- as the main vowel. Division-II finals have the vowels -ae- or -ea-, and division-IVs have -e-. The rest is division-I. MC initial coronal stops show an incredibly unbalanced pattern in relation to division of finals (some example items below are taken from Jacques 2006: 27 where the same discussion is offered; also see Baxter 1992: 49–55):

| | dI | dIV | dII | dIII |
|---|---|---|---|---|
| T: t, th, d | dang | deng | x | x |
| Tr: tr, trh, dr | x | x | draeng | drjang |
| Tsy: tsy, tsyh, dzy | x | x | x | dzyang |
| Ts: ts, tsh, dz | dzang | tsheng | x | tsjang |
| Tsr: tsr, tsrh, dzr | x | x | dzraeng | tsrjang |

The systematic gaps are striking. With non-division-III finals, T, Ts and Tr, Tsr are completely complementary: T, Ts never combine with division-II finals, [1] and Tr, Tsr always combine with division-II finals. Additionally, the MC initial palatal affricates (the Tsy-type) only combine with division-III finals (for this reason in Baxter’s transcription the division-III yod is omitted after them).
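The division markers described earlier, and the gaps just noted, can be checked mechanically on the example items. The following is a rough sketch and not a full parser of Baxter’s transcription: the function name `division`, the list of palatal initials, and the `ALLOWED` table (read off the distribution table above) are my own assumptions for illustration.

```python
# Initials after which Baxter's transcription omits the division-III yod
# (assumed list, for illustration only).
PALATAL_INITIALS = ("tsy", "dzy", "sy", "zy", "ny", "y")

def division(syllable):
    """Guess the division of an MC syllable in Baxter's transcription."""
    s = syllable.rstrip("XH")  # drop the tone letters X and H
    if "j" in s or "i" in s or s.startswith(PALATAL_INITIALS):
        return "III"
    if "ae" in s or "ea" in s:
        return "II"
    if "e" in s:
        return "IV"
    return "I"

# Divisions attested for each initial type, per the distribution table.
ALLOWED = {
    "T": {"I", "IV"},
    "Tr": {"II", "III"},
    "Tsy": {"III"},
    "Ts": {"I", "IV", "III"},
    "Tsr": {"II", "III"},
}

EXAMPLES = [
    ("T", "dang"), ("T", "deng"),
    ("Tr", "draeng"), ("Tr", "drjang"),
    ("Tsy", "dzyang"),
    ("Ts", "dzang"), ("Ts", "tsheng"), ("Ts", "tsjang"),
    ("Tsr", "dzraeng"), ("Tsr", "tsrjang"),
]

for initial_type, syllable in EXAMPLES:
    d = division(syllable)
    print(initial_type, syllable, d, d in ALLOWED[initial_type])
```

Under these assumptions every example item falls in an allowed cell, while the exceptional taengX from footnote 1 comes out as a T-type initial with a division-II final.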

This distribution is not entirely straightforward to interpret historically. There has been a long history of scholarship leading us to the present conclusion, which is as follows: the retroflexion in Tr and Tsr is secondary; as is the palatalization of the Tsy group. A syllable feature (which we will call “[+ρ]” at this point, r for retroflex) gave rise to the initial retroflexion, as well as the division-II vowel qualities. A palatalizing syllable feature [2] (“[+π]”, p for palatalizing) gave rise to both the palatal Tsy- initials (from the same source as T-, but palatalized) and yodized division-III finals. [±ρ] and [±π] are orthogonal, partitioning the division III into two. An earlier distribution of these MC initials must have been:

| dI [−ρ, −π] | dIV [−ρ, −π] | [+ρ, −π] | [−ρ, +π] | [+ρ, +π] |
|---|---|---|---|---|
| dang | deng | draeng | dzyang | drjang |
| dzang | tsheng | dzraeng | tsjang | tsrjang |

In the model of Baxter (1992), “[+ρ]” is “presence of a *-r- medial”, and “[+π]” is “presence of a medial yod”; the pre-histories of the initials forming the XS classes { t, th, d, tsy, tsyh, dzy, tr, trh, dr } and { ts, tsh, dz, tsr, tsrh, dzr } are reconstructed as follows:

| | *C- | *Cr- | *Cj- | *Crj- |
|---|---|---|---|---|
| C = T | *t- > t-; *th- > th-; *d- > d- | *tr- > tr-; *thr- > trh-; *dr- > dr- | *tj- > tsy-; *thj- > tsyh-; *dj- > dzy- | *trj- > trj-; *thrj- > trhj-; *drj- > drj- |
| C = Ts | *ts- > ts-; *tsh- > tsh-; *dz- > dz- | *tsr- > tsr-; *tshr- > tsrh-; *dzr- > dzr- | *tsj- > tsj-; *tshj- > tshj-; *dzj- > dzj- | *tsrj- > tsrj-; *tshrj- > tsrhj-; *dzrj- > dzrj- |

It’s obvious that the model adopted by Baxter & Sagart (2014) is isomorphic…

| | *Cˤ- | *Cˤr- | *C- | *Cr- |
|---|---|---|---|---|
| C = T | *tˤ- > t-; *tʰˤ- > th-; *dˤ- > d- | *tˤr- > tr-; *tʰˤr- > trh-; *dˤr- > dr- | *t- > tsy-; *tʰ- > tsyh-; *d- > dzy- | *tr- > trj-; *tʰr- > trhj-; *dr- > drj- |
| C = Ts | *tsˤ- > ts-; *tsʰˤ- > tsh-; *dzˤ- > dz- | *tsˤr- > tsr-; *tsʰˤr- > tsrh-; *dzˤr- > dzr- | *ts- > tsj-; *tsʰ- > tshj-; *dz- > dzj- | *tsr- > tsrj-; *tsʰr- > tsrhj-; *dzr- > dzrj- |

…the only change being that the Division-III yod is not reconstructed back to OC anymore; instead, the “non-division-III-ness” in OC is marked by pharyngealization, and division III is explained to be previously unmarked (for discussion of this see B&S 2014: 68–76).
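To make the correspondences concrete, here is a small sketch that derives the MC onset from an OC coronal onset in the Baxter & Sagart (2014) notation. It simply encodes the table above: the function name `mc_onset` and the feature-flag interface are my own, and a trailing `j` in the output stands for the division-III yod of the transcription.

```python
# MC spellings of the OC base consonants (taken from the table above).
PLAIN = {"t": "t", "tʰ": "th", "d": "d", "ts": "ts", "tsʰ": "tsh", "dz": "dz"}
RETROFLEX = {"t": "tr", "tʰ": "trh", "d": "dr",
             "ts": "tsr", "tsʰ": "tsrh", "dz": "dzr"}
# Only the stops palatalize; the affricates keep their place.
PALATAL = {"t": "tsy", "tʰ": "tsyh", "d": "dzy"}

def mc_onset(base, pharyngealized, r_medial):
    """Return the MC onset for OC *base(ˤ)(r)-, for the coronal
    stops and affricates only."""
    if r_medial:
        onset = RETROFLEX[base]
        # Non-pharyngealized syllables end up in division III (yod).
        return onset if pharyngealized else onset + "j"
    if pharyngealized:
        return PLAIN[base]
    return PALATAL.get(base, PLAIN[base] + "j")

for base in PLAIN:
    print(base, [mc_onset(base, ph, r)
                 for ph in (True, False) for r in (False, True)])
```

Seen this way, the three MC places of articulation within one XS class are just the outcomes of a single OC consonant under two binary features.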

With this reconstruction, the XS behaviour discussed above is explained as follows:

…and, in the case of OC coronal affricates:

As those two examples show, they now simply reflect contact between homorganic stop initials. This is then entirely of the same nature as other much more straightforward stop series. For instance, MC { p, ph, b } form a XS class, which is simply back projected to OC *p, *pʰ and *b. Consider the series with phonetic element :

I’ll now visit two other kinds of important XS classes very briefly, and their significance in the present model.

the voiceless nasals #

In general the MC initial m /m/ only comes into XS contact with itself; the same can (with some complications) be said of ng /ŋ/. However, in primarily m- or ng-populated series a significant number of items with x /x/ are involved; the frequency of contact is statistically meaningful enough that it can’t be waved away as chance interaction. In such cases we reconstruct OC voiceless nasals *m̥ and *ŋ̊ respectively.

The situation with MC n requires more thought; among initials that come into contact with n we find the ny and nr to be the most common, and since their distribution in MC is parallel to that of T, Tsy and Tr, their OC reconstructions are straightforward: *nˤ > n, *n > ny, *nˤr and *nr > nr. But if *m̥ and *ŋ̊ exist, there seems to be no good reason for *n̥ to be absent. Well! In series with MC n, ny and nr, we frequently find th, sy and trh items (sy only occurring with division III finals, and th with I and IV, trh with II and III—you know the drill for the last two), awaiting explanation. It’s natural to reconstruct *n̥ˤ > th, *n̥ > sy, *n̥ˤr and *n̥r > trh.

the voiceless liquids #

The MC liquid l does not co-occur with MC division-II finals, with very few exceptions [3] —that is, it seems to be incompatible with OC *-r-. But in XS series it is often mysteriously tangled up with MC grave (especially velar) initials, despite it being acute (coronal); among those grave initial items the overwhelming majority is division II (which, hey, is linked to an *-r- medial). l also comes into frequent XS contact with sr and trh, both MC retroflexes.

It would seem, as is now generally agreed, that MC l was a rhotic in OC. The lack of MC division-II syllables beginning with l would then reflect an OC phonotactic avoidance of *rr-, and the contact of l items with grave-initial division-II syllables is usually explained (at least at the earliest stage of such series) as XS contact between e.g. *kr- (with initial *k and medial *-r-) and *k-r (with initial *r and some sort of pre-initial *k- subject to later loss; for a recent investigation into one such case, see Nohara 2023 on [*k-rˤon > lwanX, “egg”], where sections 2 and 4 are the most relevant to the discussion here). External evidence supports the rhotic value: see Baxter & Sagart (2014: 110–111) for Hmong-Mien and Vietic evidence of a rhotic OC value for MC l. An interesting tidbit: in early transcriptions of place names we find 烏弋山離 ‘u yik srean lje (< *ʔˤa-lək-sran-ra??) “Alexandria”, where not only is MC l used for foreign r, it is also avoided for foreign l; the foreign l is transcribed instead with a y item.

The XS situation of l is not as clear as the previously discussed cases, even setting aside the complications with the aforementioned contact with grave initials. The XS class with l and sundry is reconstructed by both B&S (2014) and Schuessler (2008) as follows: besides the straightforward *rˤ and *r > l, *r̥ˤ develops into th and *r̥ into trh; apparently both follow Baxter (1992). Since almost all trh items in XS series of this kind are division III, *r̥ > trh is straightforward. On the other hand, the choice of *r̥ˤ > th is tremendously unsatisfactory, since as Sagart (1999: 41) has observed, th very rarely interacts with l in XS, certainly no more frequently than sr does (see again msoeg 2020). msoeg considers sr to be a regular outcome of *r̥ (his *ɻ̥), which must have been undesirable for B&S et al. because wherever s, sr appear in an otherwise resonant series, they reconstruct a complex onset *s-C(-r-); naturally for them, then, sr that comes into XS contact with l is *s-r(ˤ). This leaves no good candidate for the MC outcome of OC *r̥ˤ. The choice of th at least has a point in systematicity, for it parallels the development of *n̥ˤ and *l̥ˤ, but I see no great reason why it should.

The XS situation involving the MC initial yod y is highly complex—the initial is likely not monogenetic in terms of OC origin. Nevertheless, it has now become clear that at least one source of y involves a XS class with the diagnostic members { d, y, dr, th, sy, trh }. d and th, of course, do not appear with division-III finals, while y and sy only appear with division-III finals. This distribution is strikingly familiar—parallel to that of the coronal nasals. We then reconstruct this XS class as coronal resonants, voiced and unvoiced. Since *r is reserved for l and sundry, the most sensible choice here is *l. It is fortified to a stop when pharyngealized and/or followed by *-r-, and weakened to a yod when not: *l- > y-, *lˤ- > d-, *lr- and *lˤr- > dr-; the developments of the voiceless counterparts are *l̥- > sy-, *l̥ˤ- > th-, *l̥r- and *l̥ˤr- > trh-. This lateral value is of course corroborated by the aforementioned 烏弋山離 ‘u yik srean lje “Alexandria”. For further evidence cited by Baxter and Sagart from Hmong-Mien, Vietic, etc., see Baxter & Sagart (2014: 109–110, 115). An interesting tidbit: Vietnamese lúa, Mường lɔː³, Rục alɔː³, Arem alɑːˀ (< Proto-Vietic *C.lɔʔ, Ferlus 2010: 63) “rice paddy” and Tocharian A/B klu “rice” (Peyrot 2018: 245) are believed to be early loans from Chinese [*lˤuʔ, “rice paddy”].
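The parallel between the coronal nasal and lateral classes can be laid out side by side. The sketch below only encodes the developments stated in this post for *n, *n̥, *l and *l̥; the function name `mc_reflex` and the three slot labels are assumptions of mine, and the contested *r̥ˤ is deliberately left out.

```python
# Outcomes by environment: "plain" = no ˤ and no *-r-, "phar" = ˤ without
# *-r-, "r" = with *-r- (whether pharyngealized or not).
VOICED = {
    "n": {"plain": "ny", "phar": "n", "r": "nr"},
    "l": {"plain": "y", "phar": "d", "r": "dr"},
}
# The voiceless nasal and the voiceless lateral share their MC outcomes.
VOICELESS = {"plain": "sy", "phar": "th", "r": "trh"}

def mc_reflex(resonant, voiceless, pharyngealized, r_medial):
    """MC initial for OC *n/*l (voiced or voiceless) under the two features."""
    if r_medial:
        slot = "r"
    elif pharyngealized:
        slot = "phar"
    else:
        slot = "plain"
    return VOICELESS[slot] if voiceless else VOICED[resonant][slot]

for resonant in ("n", "l"):
    for voiceless in (False, True):
        print(resonant, "voiceless" if voiceless else "voiced",
              [mc_reflex(resonant, voiceless, ph, r)
               for ph, r in ((False, False), (True, False), (False, True))])
```

The shared voiceless outcomes are what make the membership of a th, sy or trh item ambiguous without the rest of its series: the same MC initial can point back to either class.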

(sidenote) Baxter and Sagart also cite possibly retained laterals in modern Sinitic dialects, claiming especially systematic attestations in Wǎxiāng 瓦鄉 (B&S 2014: 109)—but it has been pointed out that the voiced-resonant-initial level-tone syllables in Wǎxiāng are dark (yīnpíng), but the Wǎxiāng level-tone /l-/ items that correspond to OC *lˤ-, *lˤr- and *lr- are not; this seems to indicate secondary softening of earlier voiced stops instead (百越閒人 2022, on zhihu.com, in Mandarin). The perfect match with OC lateral items might be purely coincidental then.

closing words #

Wow, I have a lot more to write, but I lost steam quickly. The discussions above are quite elementary in terms of XS and OC reconstruction, perhaps presented in a way that’s filled with oversimplifications and naïve understandings. There are almost certainly mistakes that I didn’t catch.

There are a lot of thorny subtleties of XS that I didn’t mention. Just to throw some out there: though I claimed that the class { t, th, d, tsy, tsyh, dzy, tr, trh, dr } doesn’t normally have outsiders interacting with them, sy actually interacts with the class quite frequently. More importantly, sy items of this kind and sy items coming from voiceless resonants actually have different reflexes in Proto-Mǐn (B&S 2014: 92–93), confirming a distinct source. Or—I haven’t talked about the posterior consonants , k, kh, g, x, h, hj at all, which represent a whole new sort of messiness. Or—some of those posterior obstruents keep interacting with y; what’s up with that? I haven’t even talked about the behaviour of the finals yet.

With the limited discussion above, though, light can be shed on something fundamental. I’ve said that the phonetic element locates an OC near-homophone pool, not a homophone pool. This is intuitively straightforward: since every phonetic element was presumably originally created to represent a specific morpheme (once again recall the monosyllabicity of OC morphemes), and it was then subsequently used solely as an indicator of sound, the writing system sans taxograms is in essence a syllabary. But the syllable count is too large for every syllable to have a unique phonetic element (creating a syllabaire imparfait [imperfect syllabary], in the words of Sagart 2006 [in French] [4]), so compromises must be made. We can observe the basic compromises, or tolerations, that ancient script-users must have made, using our model of OC initials.

the XS principle #

It turns out that in our model, the XS requirements to be met and the XS tolerations that can be made—i.e. what is usually termed the XS principle—are in fact reduced to several natural and straightforward rules. For two morphemes [A] and [B] to appear in the same XS series, below are the requirements:

…and below are the tolerations:

Tentatively one adds: since a majority of complex onsets with pre-initials are reconstructed mainly to explain XS, we have (albeit with worrying circularity):

Moreover, when there is a poverty of phonetic elements, tolerations can be made that are not usually made. Again recall the example at the beginning of this post with phonetic element :

It’s clear that [*nˤrewʔ > nraewX, “to trouble”] with nr is nontypical for such a series. But very few morphemes in its pool are attested in OC times; the only clear ones are [*nˤrewʔ > nraewX, “to trouble, to stir”], [*nˤrews > nraewH, “to be bent, to surrender”] (these two might be cognate?), [*nˤrew > nraew, a kind of bell]. It must have been felt that to create a separate phonetic element for these characters was uneconomic, so , typically writing syllables with velar nasals, was chosen since it was “close enough”: in received texts the former two morphemes are variably written as or , and the last is written as . These kinds of XS exist, but are in the minority, and in most such cases we’ll discover that it was “faute d’un meilleur phonétique” (for the lack of a better phonetic, Sagart 2006).

In excavated early writings, the taxograms (and the presence or absence thereof) are much more variable. Thus you simply can’t, say, recognize a graph as and immediately claim it should read [*ŋˤew, “lofty”], or mechanically equate any phonetic-taxogram combination with its modern reading. Instead, since the morpheme–graphic form correspondence had not been set in stone, a phonetic element (say, ) could potentially stand for any syllable writeable with it; which, from our earlier observation, is in fact any syllable of the shape (imitating Sagart 2006’s notation) *()(ŋ/ŋ̊)(ˤ)(r)ew(), i.e. the set { *ŋew, *ŋewʔ, *ŋews, *ŋˤew, *ŋˤewʔ, *ŋˤews, *ŋrew, *ŋrewʔ, *ŋrews, *ŋˤrew, *ŋˤrewʔ, *ŋˤrews, *ŋ̊ew, *ŋ̊ewʔ, *ŋ̊ews, *ŋ̊ˤew, *ŋ̊ˤewʔ, *ŋ̊ˤews, *ŋ̊rew, *ŋ̊rewʔ, *ŋ̊rews, *ŋ̊ˤrew, *ŋ̊ˤrewʔ, *ŋ̊ˤrews }, with potential pre-initials. That is our “near-homophone” pool. Obviously no phonetic element is used to write all possible members in its pool, but those are the possibilities. Any proposed reading outside of those possibilities is immediately doubtful, unless a compelling explanation can be offered.
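The 24-member set above is just the Cartesian product of the optional slots in the template, so it can be generated rather than listed by hand. A minimal sketch follows, in which the names `xs_domain` and `plausible` are mine and the pre-initial slot is ignored for simplicity.

```python
from itertools import product

NASALS = ["\u014b", "\u014b\u030a"]  # ŋ and voiceless ŋ̊ (ŋ + combining ring)
PHARYNGEAL = ["", "\u02e4"]          # optional ˤ
MEDIAL_R = ["", "r"]                 # optional *-r-
CODAS = ["", "\u0294", "s"]          # no post-coda, ʔ, or s

def xs_domain(rime="ew"):
    """All syllables of the shape *(ŋ/ŋ̊)(ˤ)(r)<rime>(ʔ/s), as a set."""
    return {
        "*" + nasal + ph + r + rime + coda
        for nasal, ph, r, coda in product(NASALS, PHARYNGEAL, MEDIAL_R, CODAS)
    }

domain = xs_domain()

def plausible(reading):
    """True if a proposed reading lies inside the domain."""
    return reading in domain

print(len(domain))                       # 2 * 2 * 2 * 3 combinations
print(plausible("*\u014b\u02e4ew"))      # *ŋˤew, "lofty"
print(plausible("*tew"))                 # wrong initial class
```

A reading that fails `plausible` is not thereby impossible; as noted above, it just needs a compelling explanation.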

The set above can be written as ŊEW; that is, ŊEW = *()(ŋ/ŋ̊)(ˤ)(r)ew(). This all-caps notation has precedent in e.g. Старостин (1989, in Russian) and Hill (2015), and has been used for a while in China, at least among online enthusiasts; see 布之道 (2021, on zhihu.com, in Mandarin) for a compendiously written overview. (What I’ve been referring to as the “near-homophone pool” that a phonetic element points to, the Chinese online enthusiasts call the 諧聲域 xiéshēng domain of that phonetic element, which is also very neat.)

Most importantly, we have abstracted the XS domain from the phonetic element. Writing the phonetic element is then basically equivalent to writing ŊEW, and, by a more tolerant extension, NEW; for , TOK; for , TSA; for , MAŊ; for , LAŊ… and so on. We now actually have a clear and simple way of recognizing what a phonetic element can mean, and what it typically cannot. This is invaluable for the palaeographical interpretation of early texts, and perhaps the most important takeaway from this entire post.

Hey, that was a long “closing words” section. Here are the real closing words: I wish you a happy life. See you later.

footnotes

  1. The only exceptional items are [taengX besides tengX, “to hit”] and [dijH, “ground”].

  2. Or rather, as is actually the case, the absence of a palatalization-blocking feature; see below.

  3. The most common one is [laengX, “chilly”].

  4. The “imperfection” is, of course, twofold, the other aspect being there is no inhibition for a syllable to have more than one phonetic element.