Subdivisions of Phonetics

Phonetics is the science of speech sounds. As spoken conversation is the most frequent means of communication between human beings, the urge to find out more about the exact mechanisms that speech requires is very natural.

The act of spoken communication always involves at least the following three stages. After the speaker has decided which sounds to employ in order to convey his message, he has to produce these sounds using his vocal organs. Next, the sounds need to travel through the air and cover the distance between speaker and listener. Finally, the listener receives sensory input at his ears and has to recover the intended message from those sensory stimuli.

Communications diagram
Source: J. D. O'Connor: Phonetics. p. 10.

Each of these three stages allows for distinct scientific investigations. It is thus along these lines that phonetics may be broken up into the principal branches of articulatory, acoustic, and auditory phonetics.

Characterisation of Speech Sounds

There exists a long tradition of research into the components of speech, dating as far back as the Ancient Greeks. It is the articulatory branch of phonetics that has attracted the greatest interest, since every competent speaker is at least vaguely familiar with the organs involved in speech production, whereas measuring the physical properties of sound waves requires sophisticated physical instruments, and the impact of an auditory stimulus on the ear-brain system is even more elusive to a scientific approach.

Articulatory Phonetics

Producing speech means employing one's vocal organs to modify an egressive airstream created by the lungs. In normal breathing, there are no constrictions to this air flow. However, the organs that form part of the vocal tract, most notably the larynx, velum, tongue, and lips, are subject to conscious muscular activity which modifies their positions and thereby the shape of the cavities through which the air needs to pass. In articulatory phonetics, speech sounds are classified by the relative positions of the vocal organs and by their effects on the air flow.

The major categories into which articulatory phonetics divides speech sounds are stops, fricatives, and vowels. These terms refer to the degree of constriction that is imposed on the airstream. In stops, also called plosives, no air at all may pass out of the vocal tract because the passage is completely blocked off at some point (e.g. [b], [t]). In fricatives, there is a considerable narrowing of the passage which results in audible friction (e.g. [x], [z]). Finally, if the vocal passage is almost unrestricted, the resulting sound is called a vowel (e.g. [a], [o]).


Another important property of the vocal tract is the position and shape of the vocal cords, which are located inside the larynx. In normal breathing, these are relaxed and offer a free passage of air. However, the vocal cords can be tensed to a certain degree, which causes them to vibrate in the egressive airstream. Those vibrations result in a musical sound, which is called voice. Vowels are usually voiced, i.e. the vocal cords vibrate while the air passes through the larynx and mouth. Fricatives may be voiced or voiceless. In stops, the air, by definition, ceases to flow for a short moment, and naturally, the vocal cords do not vibrate. However, stops are still considered to be voiced when the vocal cords vibrate just before or just after the stop, and voiceless otherwise. By tensing the vocal cords to a higher degree than that required for them to vibrate, the air passage can also be completely closed off at the larynx. The resulting speech sound is called a glottal stop.

Stops and Fricatives

Further classifications that are used in articulatory phonetics to distinguish between various stops and fricatives refer to the place of articulation. For stops, one particular place in the vocal tract is highly relevant, namely that location at which the passage is blocked.

The tongue, being the most agile vocal organ, is usually the active agent in constricting the air flow. To specify the exact place of articulation, it is helpful to name the other part of the vocal tract which serves in passive opposition to the tongue. For example, the sound [t] is referred to as an alveolar stop because the tip of the tongue blocks the air passage by touching the alveolar ridge. Other common places of articulation are dental, palatal, and velar, but it may sometimes be found necessary to specify the place of articulation even more exactly. The lips, being positioned outside the oral cavity, can also be used to block the air flow. The phonetician refers to the constriction caused by the closing of the lips as bilabial, e.g. [b]. Sounds that are formed by the lower lip touching the upper teeth are called labiodental, e.g. [f].

As fricatives are closely related to stops, varying only in the degree of constriction imposed on the airstream, the above places of articulation lend themselves equally well to describing fricatives.
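The three-way description used so far (voicing, place of articulation, degree of constriction) can be written down as a small lookup table. The following is only a minimal sketch: the names Consonant, CONSONANTS, describe and same_place are hypothetical, and the inventory is restricted to the symbols mentioned in the text. The places given for [x] (velar) and [z] (alveolar) follow the usual IPA values and are not stated in the essay itself.

```python
from typing import NamedTuple


class Consonant(NamedTuple):
    manner: str   # degree of constriction: "stop" or "fricative"
    place: str    # place of articulation
    voiced: bool  # True if the vocal cords vibrate


# Illustrative inventory only; limited to symbols used in the essay.
# Places for [x] and [z] are the usual IPA values (an assumption here).
CONSONANTS = {
    "b": Consonant("stop", "bilabial", True),
    "t": Consonant("stop", "alveolar", False),
    "f": Consonant("fricative", "labiodental", False),
    "z": Consonant("fricative", "alveolar", True),
    "x": Consonant("fricative", "velar", False),
}


def describe(symbol: str) -> str:
    """Return the conventional three-term label for a consonant."""
    c = CONSONANTS[symbol]
    voicing = "voiced" if c.voiced else "voiceless"
    return f"{voicing} {c.place} {c.manner}"


def same_place(a: str, b: str) -> bool:
    """True if two consonants share their place of articulation."""
    return CONSONANTS[a].place == CONSONANTS[b].place


if __name__ == "__main__":
    for symbol in CONSONANTS:
        print(f"[{symbol}] is a {describe(symbol)}")
```

In this sketch, [t] and [z] share a place of articulation and differ in voicing and in the degree of constriction, which is the sense in which stops and fricatives can be described with the same set of places.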


Sounds that require an unblocked passage of air, most prominently the vowels, cannot be described in the same way as given above. The characteristic differences between the vowels lie in the position and shape of the tongue, on the one hand, and in the shape of the lips, on the other.

The shape of the tongue is commonly given by the position of its highest point, which may move in two dimensions. For front vowels, e.g. [i] and [e], the highest point of the tongue is located close to the hard palate, and for back vowels, it is closer to the velum, as in [u] and [o]. Also, the highest point of the tongue may be quite high, as in [i] and [u], or rather low, as in [a]. These qualities need to be seen on a gradual scale, however, and many different shades of vowels are found in the languages of the world. A neat graphical way of demonstrating the position of the tongue is given by Jones's Trough.

The sound quality of a vowel is greatly affected by the shape of the lips. A vowel pronounced with rounded lips is called a rounded vowel, whereas unrounded vowels are those pronounced with spread lips.
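The vowel dimensions described above (tongue height, front or back position, lip rounding) can be sketched as a feature table from which the differences between two vowels are read off. This is an illustration under simplifying assumptions: the names Vowel, VOWELS and differing_features are hypothetical, the gradual scales are reduced to a few discrete labels, [e] and [o] are treated as "mid", and [a] is placed at the front following its position on the IPA chart, which the essay does not specify.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Vowel:
    height: str    # "high", "mid" or "low" (highest point of the tongue)
    backness: str  # "front" or "back"
    rounded: bool  # True if pronounced with rounded lips


# Discrete labels are a simplification of what is really a gradual scale.
VOWELS = {
    "i": Vowel("high", "front", False),
    "e": Vowel("mid", "front", False),
    "a": Vowel("low", "front", False),  # placement follows the IPA chart
    "o": Vowel("mid", "back", True),
    "u": Vowel("high", "back", True),
}


def differing_features(a: str, b: str) -> set:
    """Names of the features in which two vowels differ."""
    va, vb = VOWELS[a], VOWELS[b]
    return {
        f.name
        for f in fields(Vowel)
        if getattr(va, f.name) != getattr(vb, f.name)
    }


if __name__ == "__main__":
    print(differing_features("i", "u"))
    print(differing_features("i", "e"))
```

A comparison of [i] and [e] yields only the height feature, whereas [i] and [u] differ in both tongue position and lip rounding.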

The above methods for the classification of speech sounds may be extended to cover other types of speech sounds, such as affricates, glides, diphthongs, and liquids. Furthermore, I have omitted several features of speech sounds, for example nasalization and stress. A full coverage of all elements involved in the production of speech sounds is given by the alphabet that the International Phonetic Association recommends for the notation of speech sounds, but explaining or even listing all of them would take me beyond the scope of this essay.

Acoustic Phonetics

A relatively recent approach to phonetics is the study of the physical properties of speech sounds. Because acoustic phonetics requires direct access to the physics of sound waves as these travel through the air (or any less common medium), it could only be established as a science in its own right after acoustic physics provided the framework and appropriately designed measuring instruments for more elaborate studies in the acoustics of speech sounds.

The most prominent instrument involved in the physical quantification of speech sounds nowadays is the acoustic spectrograph. This device analyses the frequencies that combine to create a single sound wave. From earlier research into the acoustics of musical instruments, it is known that the frequency we perceive as the pitch of a musical note is not the only frequency contained within that sound; many other frequencies, called harmonics, overlap to make up one sound. It is the distribution and the intensities of those additional frequencies that give each musical instrument its characteristic quality.
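The idea that one sound consists of a fundamental plus harmonics of varying intensity can be illustrated numerically. The sketch below builds two artificial tones with the same fundamental but different harmonic amplitudes and then measures the amplitude at a chosen frequency, which is a crude stand-in for a single column of a spectrogram. All names (SAMPLE_RATE, synthesize, amplitude_at) and all numerical values are assumptions chosen for illustration; a real spectrograph works on recorded speech and on short time windows.

```python
import math

SAMPLE_RATE = 8000  # samples per second (assumed value)


def synthesize(fundamental, amplitudes, n_samples=SAMPLE_RATE):
    """Sum of sine waves at integer multiples of the fundamental.

    amplitudes[0] belongs to the fundamental, amplitudes[1] to the
    second harmonic, and so on.
    """
    return [
        sum(
            a * math.sin(2 * math.pi * fundamental * (k + 1) * n / SAMPLE_RATE)
            for k, a in enumerate(amplitudes)
        )
        for n in range(n_samples)
    ]


def amplitude_at(signal, freq):
    """Amplitude of the sinusoidal component at freq (one DFT bin).

    Exact only when freq completes a whole number of cycles within
    the signal, as is the case for the values used below.
    """
    n_samples = len(signal)
    re = sum(
        s * math.cos(2 * math.pi * freq * n / SAMPLE_RATE)
        for n, s in enumerate(signal)
    )
    im = sum(
        s * math.sin(2 * math.pi * freq * n / SAMPLE_RATE)
        for n, s in enumerate(signal)
    )
    return 2 * math.hypot(re, im) / n_samples


if __name__ == "__main__":
    # Same pitch (200 Hz), different harmonic intensities (made-up values).
    mellow = synthesize(200, [1.0, 0.1, 0.05])
    bright = synthesize(200, [1.0, 0.8, 0.6])
    for freq in (200, 400, 600):
        print(freq, amplitude_at(mellow, freq), amplitude_at(bright, freq))
```

Both tones share the same fundamental and would be heard at the same pitch, yet the measured amplitudes at the second and third harmonics differ, which is the kind of difference that gives an instrument its characteristic quality.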


From an acoustic point of view, it is convenient to begin with describing sounds that have a musical quality. These sounds are called vocoids in acoustic phonetics, and they comprise mainly the vowels, but also the nasal sounds and some of the other sonorants.

Starting with the fundamental frequency of speech as created by the glottal voice, the cavities of the vocal tract act as resonators or filters to modify the harmonics of the emitted sound wave. On a spectrogram, the fundamental frequency is clearly discernible, but one can also distinguish several bands at higher frequencies which feature higher intensities than the rest of the frequency spectrum. These bands are called formants, and in order to facilitate referring to each of those, they are numbered from the first frequency band above the fundamental upwards, as in F1, F2, F3... Research has not been concluded yet, but it seems that the frequencies of the first two formants suffice to give a rather adequate classification of the vocoids. To quote an example from Brosnahan, p. 94, the sound [i] may be classified as a vocoid with F1 at 250 Hz and F2 at 2500 Hz, contrasting with [e] with F1 at 400 Hz and F2 at 2000 Hz, and with [y] with F1 at 250 Hz and F2 at 2000 Hz. Care needs to be taken in comparing speech sounds this way because the frequency levels vary widely from speaker to speaker and are naturally higher in female voices. Furthermore, such an analysis ignores the actual intensities of the formants, focussing on their frequencies only, and for a full description many other variations need to be taken into account.
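A classification of vocoids by their first two formants can be sketched as a nearest-neighbour lookup. The reference values are the ones quoted above from Brosnahan; everything else is an assumption made for illustration, in particular the names REFERENCE_FORMANTS and classify_vocoid and the use of plain Euclidean distance in Hz. As the caveats above make clear, such fixed reference values would not transfer between speakers without some form of normalisation.

```python
import math

# (F1, F2) in Hz, as quoted in the text from Brosnahan, p. 94.
REFERENCE_FORMANTS = {
    "i": (250, 2500),
    "e": (400, 2000),
    "y": (250, 2000),
}


def classify_vocoid(f1: float, f2: float) -> str:
    """Return the reference vocoid whose (F1, F2) pair lies closest.

    Plain Euclidean distance in Hz is an illustrative choice only;
    formant intensities and speaker differences are ignored.
    """
    return min(
        REFERENCE_FORMANTS,
        key=lambda symbol: math.dist((f1, f2), REFERENCE_FORMANTS[symbol]),
    )


if __name__ == "__main__":
    # Hypothetical measurements, not taken from any source.
    for f1, f2 in [(260, 2450), (390, 2050), (240, 1950)]:
        print(f1, f2, classify_vocoid(f1, f2))
```

Note that [i] and [y] share the same F1 in the quoted figures and are separated by F2 alone, so a measurement error of a few hundred Hz in F2 would already change the result.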


Acoustic phonetics classifies any sound which is not sonorant in nature as a contoid. This group includes fricatives and affricates, which show as noise on a spectrogram, as well as stops, which, technically speaking, do not create any sound at all, but affect the preceding and following sonorant sounds. It seems that research has centred on proper stops, which may be analysed as silence surrounded by short glides in the formants of neighbouring sonorant sounds. It thus becomes obvious that it is difficult to characterise the acoustic properties of a stop, as they will always depend on the qualities of the surrounding sounds. Phoneticians have tried to explain exactly what kind of transitions are required for a stop to be heard as [p], say, or [k]. For [d] as opposed to [b], for example, O'Connor suggests that in both cases the formant F1 at the beginning of the next vocoid starts out a little lower and then glides to the frequency required for that vocoid; in [d], F2 performs a transition from some high frequency towards its final position, whereas for [b] this initial frequency for F2 is somewhat lower.

Although some results such as the one above point the way to fruitful theories, existing accounts have not yet satisfactorily explained the properties characteristic of a given stop. Fricatives and other speech sounds require even more research.

To conclude this short description of acoustic phonetics, it is as yet not very obvious how to understand the spectrograms, which appropriately reflect the properties of the acoustic waves corresponding to speech sounds. Some progress has been made, but phoneticians are still quite remote from being able to understand the incoming physical sounds in as competent a way as the human ear does.

Auditory Phonetics

Auditory phonetics investigates the way the human ear and brain perceive and analyse different speech sounds. Quite a lot is known about the anatomy of the ear, but as it is difficult to obtain any objective measurements from inside a subject's head, and even more so because the brain processes involved in analysing speech are very intricate, this branch of phonetics is arguably the least developed one.

Research into the anatomy of the ear has shown that it consists of several cavities which probably serve as resonance chambers as well as filters. The tympanum, together with the three small bones of the middle ear, acts as an amplifier in transmitting the incoming sound waves to the oval window. The oval window forms an interface to the liquid-filled cavity of the cochlea, which contains the organ of Corti with its many sensitive nerve cells. Current research suggests that this mechanism maps the frequencies of incoming sound waves to different areas inside the cochlea.
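One way to make the mapping from frequency to place inside the cochlea concrete is the Greenwood function, an empirical fit from the hearing literature that is not part of this essay's sources. The sketch below uses the constants commonly quoted for the human ear (165.4, 2.1 and 0.88) and should be read as an illustration of the idea only; the function names are hypothetical.

```python
import math

# Constants commonly quoted for the human cochlea in Greenwood's fit.
# They are brought in from outside the essay as an illustrative assumption.
A = 165.4
ALPHA = 2.1
K = 0.88


def greenwood_frequency(x: float) -> float:
    """Characteristic frequency in Hz at relative position x.

    x runs from 0.0 (apex of the cochlea) to 1.0 (base, next to the
    oval window).
    """
    if not 0.0 <= x <= 1.0:
        raise ValueError("x must lie between 0 and 1")
    return A * (10 ** (ALPHA * x) - K)


def greenwood_position(freq: float) -> float:
    """Inverse mapping: relative position for a frequency in Hz."""
    return math.log10(freq / A + K) / ALPHA


if __name__ == "__main__":
    for x in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(x, round(greenwood_frequency(x)))
```

Under this fit, low frequencies are mapped to the apex and high frequencies to the base near the oval window, with the range running from roughly 20 Hz to roughly 20 kHz.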

The impulses received by the auditory nerve cells travel from the ear to the brain, where they are subject to complicated neural processing, which is as yet almost inaccessible to scientific enquiry. Apart from electroencephalograms, which measure differences in electric potential inside the brain and which are not particularly helpful to the phonetician, the only way to obtain information on the auditory sensations that human beings perceive upon hearing certain sound waves is by asking test subjects to describe their particular sensations. Although such data are obviously subjective, it seems one can classify speech sounds by using pairs of contrasting adjectives, such as hissing and buzzing, dull and sharp, heavy and light (Clark and Yallop, p. 310), and agreement on whether a given sound constitutes a vowel or a consonant is usually considerable.

To me, it seems as if auditory phonetics requires a lot more research before one may properly call it a science, but at present phoneticians lack accessible ways of measuring the auditory qualities of sounds, and it may be altogether impossible to advance the knowledge in auditory phonetics without breaking moral boundaries.

Relations Between the Three Branches of Phonetics

As noted earlier, the oldest research into phonetics has been in the field of articulatory phonetics, and therefore many formerly articulatory terms are also used to refer to related sounds in the other two subdivisions of phonetics. Even today, it is the articulatory branch of phonetics that has the greatest influence on phonetics as a whole, especially because no complicated experimental setup is needed for initial investigation. This prominence of articulatory phonetics becomes obvious as soon as one reads any introductory textbook on phonetics.

Articulatory and Acoustic Phonetics

The relation between articulatory and acoustic phonetics is a rather one-sided one. Acoustic phonetics was dependent on articulatory phonetics in its early days because the physical data collected by scientific measurements seemed so unrelated to the known speech sounds at first that known articulatory observations were welcomed as a guiding hand. Eventually, acoustic phonetics took off as a science in its own right, and it has produced results that are in considerable agreement with articulatory discoveries, particularly concerning the vowels. I would like to quote from Ladefoged an experiment that highlights the relation between resonance chambers in articulatory phonetics and the formant F1 in acoustic phonetics.

"If you place your vocal organs in the positions for making the vowels in each of heed, hid, head, had and then flick your finger against your throat while holding a glottal stop, you will produce a low pitched note for the word heed, and a slightly higher one for each of the words hid, head, had."

P. Ladefoged: Elements of Acoustic Phonetics, p. 102

The dependency of a contoid on its surrounding vocoids is shown by the following observation. In articulatory phonetics, it has been noted that consonants in front of a high front vowel, such as [i], become palatalized, and experiments by Brosnahan confirm that such contoids are indeed different. For example, if one takes the sound waves corresponding to the sounds [sk] in [ski], places them in front of the sounds [u:l] from [sku:l], and plays the resulting sound pattern to a test audience, most listeners will identify the sounds they hear as [spu:l], and not as [sku:l] (Brosnahan, p. 78).

Articulatory and Auditory Phonetics

There is also considerable agreement between articulatory and auditory phonetics, which may seem obvious, as verbal communication would be quite futile otherwise. If asked to group certain speech sounds according to their articulatory characteristics, and then to analyse the sounds with respect to their auditory qualities, almost any speaker will produce highly similar lists. For example, different nasal articulations are frequently classified together as humming sounds.

Furthermore, when a linguist, or indeed any speaker who is attentive to actual speech sounds, hears a sound that he cannot classify on its auditory merits alone, trying to articulate a similar sound himself will help him to establish the characteristics of that particular sound. And anyone who has learnt a foreign language will confirm that new sounds, which his native language does not use, are a lot easier to pronounce once he has managed to notice the auditory difference, and vice versa.

Acoustic and Auditory Phonetics

Strangely enough, the correlations between auditory and acoustic phonetics are not very obvious at all. When we listen to a speaker, we are hardly ever aware of the particular acoustic qualities of his voice, and even if we are, we would hardly be tempted to discuss any of the technical properties mentioned above, such as formants. So the relation between acoustic and auditory phonetics seems to be a very elusive one. Of course there has to be some kind of relationship, as we purport to know that it is the acoustic sound wave that transmits the message from the speaker to the hearer.

One striking correlation between acoustic and auditory phonetics has been confirmed by experiments. Often, when we hear speech in a noisy environment and misunderstand a word, the exact order of the speech sounds gets confused. Brosnahan states that experiments have shown a temporal threshold of hearing, below which a listener cannot state exactly which of two speech sounds given in quick succession came first.

Segmentation of Speech

One question which might have been dealt with at the outset is how we know where one speech sound ends within the constant flow of speech and where the next one begins. The answer may seem quite obvious in the case of stops, but when one considers the gradual changes that are required for pronouncing glides, as well as the considerable amount of preparatory positioning of the vocal organs before any given, well-defined sound may actually be articulated clearly, it is no longer easy to see any clear pattern. However, every competent speaker of any language is aware of clear-cut sound boundaries, and these boundaries are a further point of high overlap between articulatory and auditory phonetics. Acoustic phonetics, on the other hand, accurately reflects the gradual changes that take place during speech production, and articulatory phonetics confirms the gradual movements of the articulatory organs, as seen in real-time X-ray photography.


Overall, the communicative system of speech seems to allow for a very high degree of distortion, and speech tolerates quite massive noise overlay. This shows that it is at once very robust and suitable for communication between humans, and highly redundant in its acoustic and semantic properties and thereby difficult to analyse fully. When I consider the useful applications of articulatory phonetics in instructing learners of foreign languages or the hearing-impaired how to pronounce speech correctly, and of acoustic phonetics in building machines that can produce synthetic speech, phonetics seems to be a fruitful science, and auditory phonetics may lead the way to a better understanding of the human brain, one of the most intricate and least understood structures known to man. On the other hand, there is considerable overlap between any of the three major branches, as outlined above, which justifies the classification of each of them as a subfield within phonetics.


