Evolutionary musicology

Evolutionary musicology is a subfield of biomusicology that grounds the psychological mechanisms of music perception and production in evolutionary theory. It covers vocal communication in non-human animal species, theories of the evolution of human music, and cross-cultural human universals in musical ability and processing.

History
The origins of the field can be traced back to Charles Darwin, who wrote in The Descent of Man:
 * "When we treat of sexual selection we shall see that primeval man, or rather some early progenitor of man, probably first used his voice in producing true musical cadences, that is in singing, as do some of the gibbon-apes at the present day; and we may conclude from a widely-spread analogy, that this power would have been especially exerted during the courtship of the sexes,--would have expressed various emotions, such as love, jealousy, triumph,--and would have served as a challenge to rivals. It is, therefore, probable that the imitation of musical cries by articulate sounds may have given rise to words expressive of various complex emotions."

This theory of a musical protolanguage has been revived and re-discovered repeatedly, often without attribution to Darwin.

The origin of music
Two major topics for any subfield of evolutionary psychology are the adaptive function (if any) and the phylogenetic history of the mechanism or behavior of interest. For music, the latter includes when it arose in human ancestry and from what ancestral traits it developed. Current debate addresses each of these.

One part of the adaptive function question is whether music constitutes an evolutionary adaptation or an exaptation (i.e. a by-product of evolution). Steven Pinker, for example, argues in his book How the Mind Works that music is merely "auditory cheesecake": a preference for fat and sugar was evolutionarily adaptive, but cheesecake itself played no role in that selection process. By analogy, music would exploit faculties that were selected for other reasons.

Adaptation, on the other hand, is highlighted in hypotheses such as the one by Edward Hagen and Gregory Bryant which posits that human music evolved from animal territorial signals, eventually becoming a method of signaling a group's social cohesion to other groups for the purposes of making beneficial multi-group alliances.

Another proposed adaptive function is creating intra-group bonding. In this respect music has been seen as complementary to language: it creates strong positive emotions without carrying a specific message that people may disagree on. Music's ability to cause entrainment (synchronization of the behavior of different organisms by a regular beat) has also been pointed out. A different explanation is that music signals the fitness and creativity of the producer or performer in order to attract mates. Still another is that music may have developed from human mother-infant auditory interactions (motherese), since humans have a very long period of infant and child development, infants can perceive musical features, and some infant-mother auditory interactions resemble music.

Part of the problem in the debate is that music, like any complex cognitive function, is not a holistic entity but rather modular – perception and production of rhythm, melodies, harmony and other musical parameters may thus involve multiple cognitive functions with possibly quite distinct evolutionary histories.

The Musilanguage hypothesis
"Musilanguage" is a term coined by Steven Brown to describe his hypothesis of the ancestral human traits that evolved into language and musical abilities. It names both a model of musical and linguistic evolution and a particular stage in that evolution. Brown argues that both music and human language have origins in a "musilanguage" stage of evolution. He argues that the structural features shared by music and language are not the result of mere chance parallelism, nor are they a function of one system emerging from the other. Instead, the model holds that "music and language are seen as reciprocal specializations of a dual-natured referential emotive communicative precursor, whereby music emphasizes sound as emotive meaning and language emphasizes sound as referential meaning." The musilanguage model is a structural model of music evolution, meaning that it views music’s acoustic properties as effects of homologous precursor functions. This can be contrasted with functional models of music evolution, which view music’s innate physical properties as determined by its adaptive roles.

Musilanguage hinges on the idea that sound patterns produced by humans fall at varying places on a single spectrum of acoustic expression. At one end of the spectrum, we find semanticity and lexical meaning, whereby completely arbitrary patterns of sound are used to convey a purely symbolic meaning that lacks any emotional content. This is called the "sound reference" end of the spectrum. At the other end of the spectrum are sound patterns that convey only emotional meaning and are devoid of conceptual and semantic reference points. This is the "sound emotion" end of the spectrum. Both of these endpoints are theoretical: music is seen as falling more towards the latter end of the spectrum, while human language falls more towards the former. Music and language often combine to use this spectrum in distinctive ways. Musical narratives that lack clearly defined meaning, such as those of the band Sigur Rós, where the vocal element is in a made-up language, fall more towards the "sound emotion" end of the spectrum, while lexical narratives such as stories or news articles, which have a greater amount of semantic content, fall more towards the "sound reference" end.

Properties of the musilanguage stage
The musilanguage stage is argued to exhibit three properties which are also found in both music and language: lexical tone, combinatorial phrase formation, and expressive phrasing mechanisms. Many of these ideas have their roots in existing phonological theory in linguistics, but Brown argues that phonological theory has largely neglected the strong mechanistic parallels between melody, phrasing, and rhythm in speech and music (Brown, S. (2000). The 'Musilanguage' of Music Evolution).

Lexical tone refers to the pitch of speech as a vehicle for semantic meaning. The importance of pitch in conveying musical ideas is well known, but the linguistic importance of pitch is less obvious. Tonal languages, in which the lexical meaning of a sound depends heavily on its pitch relative to other sounds, are seen as evolutionary artifacts of musilanguage. According to Brown, the majority of the world’s languages are tonal. Nontonal or “intonation” languages, which do not depend heavily on pitch for lexical meaning, are seen as evolutionary latecomers that have discarded their dependence on tone. Intermediate states, known as pitch accent languages, are exemplified well by Japanese, Swedish, Serbian, and Croatian. These languages exhibit some lexical dependence on tone, but also depend heavily on intonation.

Combinatorial formation refers to the ability to form small phrases from different tonal elements. These phrases must be able to exhibit melodic, rhythmic, and semantic variation, and must be able to be combined with other phrases to create global melodic formulas capable of conveying emotive meaning. Examples in modern speech are the rules for arranging sounds into words and words into sentences. In music, the notes of different scales are combined according to their own rules to form larger musical ideas.

Expressive phrasing is the device by which expressive emphasis can be added to phrases, both at a local level (individual units) and at a global level (whole phrases). This can happen in numerous ways in both speech and music, and the two exhibit many parallels. For instance, an increase in the amplitude of a sound played by an instrument accents that sound in much the same way that an increase in amplitude accents a particular point that a speaker is trying to make. Similarly, speaking very rapidly often creates a frenzied effect that mirrors that of a presto musical passage.

AVID model of music evolution
Joseph Jordania has suggested that music (as well as several other universal elements of contemporary human culture, including dance and body painting) was part of a predator control system used by early hominids. He suggested that loud rhythmic singing and drumming, together with threatening rhythmic body movements and body painting, formed the core of the ancient "Audio-Visual Intimidating Display" (AVID). AVID was also a key factor in putting the hominid group into a specific altered state of consciousness, which he calls "battle trance", in which members would not feel fear or pain and would be religiously dedicated to group interests. Jordania suggested that listening and dancing to loud rhythmic rock music, a practice used in many contemporary combat units before combat missions, is directly related to this. Apart from defense against predators, Jordania suggested that this system was the core strategy for obtaining food via confrontational, or aggressive, scavenging.

Apart from loud rhythmic singing, stomping, and dancing, Jordania also suggested that soft humming could have played an important role in early human (hominid) evolution as a contact call. Many social animals produce seemingly haphazard and indistinct sounds (like the clucking of chickens) when they are going about their everyday business (foraging, feeding). These sounds have two functions: (1) they let group members know that they are among kin and that there is no danger, and (2) their absence serves as an alarm. When any sign of danger appears (suspicious sounds, movements in a forest), the animal that notices it first stops moving, falls silent, and looks in the direction of the danger. Other animals quickly follow suit, and very soon the whole group is silent and scanning the environment for the possible danger. Charles Darwin was the first to notice this phenomenon, in wild horses and cattle. Jordania suggested that for humans, as for many social animals, silence can be a sign of danger, and that this is why gentle humming and musical sounds relax humans (see the use of gentle music in music therapy and lullabies).