Tonally Expressive Metapatterns

Tom Hagan
New York University

Citation information: Hagan, Tom. 2017. Tonally expressive metapatterns. The NYU Student Journal of Applied Metapatterns, volume 1, issue 1. Available at:


A musical scale is a collection of notes that divides the octave into a series of intervals. Each of these notes, or tones, is the auditory interpretation of a sound wave with a specific frequency. Two of the most common scales in music, across many borders and cultures, are the standard major and minor scales. This paper uses Tyler Volk’s metapatterns1 to examine why humans across different cultures independently created, and still employ, these same musical scales, and associate them with the same emotions.


Until the early 1990s, the prevailing belief was that the emotions we perceive in certain scales result from cultural conditioning to recognize certain speech patterns; however, there is now evidence that this perception may instead stem from a biological need to discern the intended emotion of speech, an evolutionary advantage for survival. This paper examines both theories and attempts to develop a stronger understanding of the question, as there is still no consensus on why certain combinations of intervals (scales) evoke the same emotion in people regardless of cultural borders and boundaries.

This investigation requires the creation of a new branch of metapatterns: sonic metapatterns. The term “sonic metapatterns” refers to metapatterns that propagate through auditory means. The specific form of sonic metapattern this paper discusses I have taken to calling Tonally Expressive Metapatterns. These are patterns of intervals of sonic frequencies that appear in different applications, such as speech and music, to express a certain emotion. Studies have begun to show that the reason we derive a certain emotion from a specific combination of tones could be that those tones match speech patterns we recognize subconsciously and use to infer the intended emotion. Here I will explore vocal patterns occurring in music to transmit an emotion, and vice versa.


The human ear is capable of distinguishing around 240 different tones within a midrange octave.2 In music, an octave is the interval between a starting note and a note with either half or twice its frequency. An octave above middle C (C4) lies another C note, written as C5, whose frequency is twice that of C4. In many cultures (both Eastern and Western), octaves are broken up into twelve intervals, known as the chromatic scale. Subsets of these notes can be used to create more “focused” scales. In recent centuries, the most common scales formed as subsets of the chromatic scale are the diatonic scales: scales with seven intervals, five whole steps and two half steps. Globally, the most common of these are the Ionian and Aeolian scales, more commonly referred to as the Major and Minor scales, respectively.3
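The relationships above can be sketched in a few lines of code. This is a minimal illustration assuming standard 12-tone equal temperament, where each semitone multiplies frequency by 2^(1/12); the frequency value for middle C is the conventional approximation:

```python
# Sketch of the equal-tempered chromatic scale, assuming 12-tone equal
# temperament: each semitone step multiplies frequency by 2**(1/12).

C4 = 261.63  # approximate frequency of middle C, in Hz

# Twelve semitone steps span exactly one octave: the frequency doubles.
C5 = C4 * (2 ** (1 / 12)) ** 12
print(round(C5 / C4))  # ratio of an octave: 2

# A diatonic major (Ionian) scale is a seven-note subset of the chromatic
# scale, built from five whole steps (2 semitones) and two half steps (1).
major_steps = [2, 2, 1, 2, 2, 2, 1]
print(sum(major_steps))  # 12 semitones, i.e. one full octave
```

Note that the seven diatonic steps add up to the full twelve semitones of the chromatic scale, which is why the pattern returns to the starting note one octave higher.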

Understanding the concept of intervals is key to understanding Tonally Expressive Metapatterns, so I’ll provide a short explanation. As stated in the abstract, each tone in a scale corresponds to a specific frequency. For example, the note known as A, specifically the first A that lies above middle C (known conventionally as A4), has a frequency of 440 Hz.4 The next note in the scale of A major, B, has a slightly higher frequency. This difference in frequency results in a slightly higher note when received by the human ear, and the difference between the two notes is known as an interval. This is where it becomes slightly more complicated: the frequency associated with a note is not the only frequency that the note produces. A sounded note is actually a combination of its main (fundamental) frequency and a number of less prevalent frequencies known as overtones.5 Every note carries these harmonic overtones.
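The two ideas in this paragraph, note frequencies and harmonic overtones, can be sketched as follows. The helper function and the semitone count for B4 are illustrative assumptions; the reference pitch A4 = 440 Hz is the one stated above:

```python
# Sketch of equal-tempered note frequencies and harmonic overtones,
# using the reference pitch A4 = 440 Hz mentioned in the text.

A4 = 440.0  # Hz, the first A above middle C

def freq(semitones_from_a4):
    """Frequency of the note n equal-tempered semitones above A4."""
    return A4 * 2 ** (semitones_from_a4 / 12)

B4 = freq(2)  # B4 lies a whole step (two semitones) above A4
print(round(B4, 2))  # ~493.88 Hz; the gap A4-to-B4 is an interval

# A sounded note also carries harmonic overtones at integer multiples
# of its fundamental frequency.
overtones = [A4 * n for n in range(1, 5)]
print(overtones)  # [440.0, 880.0, 1320.0, 1760.0]
```

The overtone series, fundamental plus integer multiples, is what the next section draws on when comparing musical notes to voiced speech.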

It is known that voiced speech sounds are harmonic in nature.6 This means that every tone emitted vocally while speaking carries overtones along with its main frequency. The ratios between the frequencies of the overtones and the main frequency correspond to those found in music. Statistics show that most of the frequency ratios in the chromatic scale specifically appear in voiced speech across languages that are not closely related. Humans are biologically disposed to extract pertinent information about the emotional state of another person from the attributes of their voice and delivery.7 It is speculated that this would have played a role in the survival of humans as we evolved. The biologically functional differences between excited and subdued speech cause us to articulate differently depending on the emotion we are trying to convey. These differences result in different intervals being used when speaking in an excited manner than those used when speaking in a subdued manner.8
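The claimed correspondence between harmonic overtone ratios and chromatic intervals can be checked numerically. This is a sketch, not an acoustic measurement; the range of harmonics compared is an illustrative assumption:

```python
# Sketch comparing ratios between successive harmonics (as found in
# voiced speech) with equal-tempered chromatic intervals.

# Ratios between successive harmonics of a voiced sound: 2/1, 3/2, 4/3, 5/4.
harmonic_ratios = [(n + 1) / n for n in range(1, 5)]

# Equal-tempered chromatic intervals within one octave: 2**(k/12).
chromatic = [2 ** (k / 12) for k in range(13)]

# Each harmonic ratio lands near a chromatic interval; e.g. 3/2 sits
# within about 0.1% of the equal-tempered perfect fifth, 2**(7/12).
for r in harmonic_ratios:
    nearest = min(chromatic, key=lambda c: abs(c - r))
    print(f"{r:.4f} ~ {nearest:.4f}")
```

The close fit between the harmonic series and chromatic intervals is one concrete sense in which the frequency ratios of voiced speech "correspond to those found in music."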

Music naturally mimics the behavior humans show when expressing emotion. For example, excited music tends to have a quicker tempo and to be more energetic; sad or subdued music, on the other hand, tends to be slower, softer, and more delicate. This leads experts to believe that the way people express emotions socially and the way they express emotions musically are closely linked. Further still, many experts feel it is shortsighted to think that the similarities end with tempo and energy. It follows, then, that our concept of emotions conveyed through music may result from a subconscious association with intervals in speech. Therefore it makes sense to pose the question: are the sonic differences that distinguish major and minor scales paralleled by the sonic differences that distinguish excited and subdued speech?9 If so, then it may make sense that we associate Major scales with excitement and Minor scales with subduedness based on a biological reception of the intervals used in such speech.

Conclusion and findings

The evidence that our interpretation of music is biological, as opposed to culturally imposed on us, seems the most prevalent in my research, and therefore it is my opinion that this theory makes the most sense. There are a couple of points worth raising, however, from my own knowledge of music. Relative minors are minor scales made up of the same notes as their major counterparts. For example, the scale of A minor has all of the same notes as C major; the difference is that they start on different notes, known as the tonic. The tonic for the key of A minor is the note A. The relative minor of any major scale inherently contains the same step intervals, considering it is made up of the same notes; what differs is where the pattern begins. Does the voice speaking in excited or subdued speech, then, have to have some sort of tonic to define it? We learned above that it is the intervals between the tones emitted during speech, among other inflections, that convey whether the speech is subdued or excited. I would therefore speculate that the “tonic” in speech is dictated by the energy of the speech and other inflections, which pass some of the information from speaker to recipient. From there, the basis for excited or subdued speech is set, and the rest of the information can be gained from the musical intervals present in the speech.
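The relative-minor relationship described above can be demonstrated concretely. This sketch builds scales from semitone-step patterns; the helper function and list names are illustrative, while the note spellings and step patterns are the standard ones:

```python
# Sketch of the relative-minor relationship: A minor contains exactly
# the notes of C major, but starts from a different tonic.

CHROMATIC = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
MAJOR_STEPS = [2, 2, 1, 2, 2, 2, 1]  # whole/half-step pattern of a major scale
MINOR_STEPS = [2, 1, 2, 2, 1, 2, 2]  # natural (Aeolian) minor pattern

def scale(tonic, steps):
    """Build a seven-note scale from a tonic using a semitone-step pattern."""
    i = CHROMATIC.index(tonic)
    notes = [tonic]
    for step in steps[:-1]:  # the last step just returns to the tonic
        i = (i + step) % 12
        notes.append(CHROMATIC[i])
    return notes

c_major = scale("C", MAJOR_STEPS)
a_minor = scale("A", MINOR_STEPS)
print(c_major)  # ['C', 'D', 'E', 'F', 'G', 'A', 'B']
print(a_minor)  # ['A', 'B', 'C', 'D', 'E', 'F', 'G']
print(sorted(c_major) == sorted(a_minor))  # True: same notes, different tonic
```

The minor step pattern is simply the major pattern rotated to begin at a different point, which is why the two keys share every note yet sound so different: the tonic sets where the pattern starts.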
