Cocktail party effect

The cocktail party effect is the ability to focus one's listening attention on a single talker amid a mixture of conversations and background noise, while ignoring the other conversations. It is one of the more surprising abilities of the human auditory system, and it is what allows us to hold a conversation in a noisy place.

The cocktail party phenomenon appears in two situations: when we deliberately pay attention to one of the sounds around us, and when a stimulus suddenly grabs our attention. For example, when we are talking with a friend at a crowded party, we can still hear and understand what the friend says even if the room is very noisy, and can at the same time ignore what a nearby person is saying. If someone on the other side of the room then calls out our name, we notice that sound too and respond to it immediately. Hearing achieves a noise suppression of about 9 to 15 dB; that is, the acoustic source on which a listener concentrates seems to be roughly three times as loud as the ambient noise. A microphone recording of the same scene makes the difference obvious by comparison.
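The decibel figures can be turned into plain ratios with a short calculation. The sketch below is illustrative only: the function names are invented for this example, and the loudness conversion relies on the common rule of thumb that a 10 dB increase is heard as roughly a doubling of loudness, an assumption that the text above does not state.

```python
def db_to_power_ratio(db):
    """Ratio of sound powers (intensities) for a level difference in dB."""
    return 10 ** (db / 10)


def db_to_amplitude_ratio(db):
    """Ratio of sound pressures (amplitudes) for a level difference in dB."""
    return 10 ** (db / 20)


def db_to_loudness_ratio(db):
    """Perceived loudness ratio, ASSUMING the rule of thumb that
    +10 dB sounds about twice as loud (not taken from the article)."""
    return 2 ** (db / 10)


if __name__ == "__main__":
    for db in (9, 15):
        print(
            f"{db} dB: power x{db_to_power_ratio(db):.2f}, "
            f"amplitude x{db_to_amplitude_ratio(db):.2f}, "
            f"loudness x{db_to_loudness_ratio(db):.2f}"
        )
```

Under that rule of thumb, 15 dB works out to a little under three times as loud and 9 dB to a little under twice as loud, so the "three times" figure corresponds to the upper end of the range.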

The effect is an auditory version of the figure-ground phenomenon. Here, the figure is the sound one pays attention to, while the ground is any other sound ("the cocktail party").

Experiments and theoretical approaches
The effect was first described (and named) by Colin Cherry in 1953. Much of the early work in this area can be traced to problems faced by air traffic controllers in the early 1950s. At that time, controllers received messages from pilots over loudspeakers in the control tower. Hearing the intermixed voices of many pilots over a single loudspeaker made the controller's task very difficult.

Cherry (1953) conducted perception experiments in which subjects were asked to listen to two different messages played at the same time from a single loudspeaker and to try to separate them. His work showed that our ability to separate sounds from background noise depends on characteristics of the sounds, such as the gender of the speaker, the direction from which the sound is coming, the pitch, and the speaking speed.

In the 1950s, Broadbent conducted dichotic listening experiments: subjects were asked to listen to and separate different speech signals presented simultaneously, one to each ear, through headphones. From the results of his experiments, he suggested that "our mind can be conceived as a radio receiving many channels at once": the brain separates incoming sound into channels based on physical characteristics (e.g. perceived location), and submits only certain subsignals for semantic analysis (deciphering meaning). In other words, there exists a kind of audio filter in our brain that selects which channel we should pay attention to from the many kinds of sounds perceived. This is called Broadbent's filter theory. There is some empirical evidence to support this theory, though it has been criticized by some (Norman et al.).
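The filter idea can be pictured as a two-stage pipeline: channels are first selected by a physical characteristic alone, and only the selected channel is passed on for meaning. The following is a deliberately simple toy sketch of that reading, not a cognitive model; the names Channel, semantic_analysis and early_selection are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Channel:
    location: str      # physical characteristic, e.g. "left" or "right" ear
    words: List[str]   # the speech content carried by this channel


def semantic_analysis(words: List[str]) -> str:
    """Stand-in for 'deciphering meaning': here it just joins the words."""
    return " ".join(words)


def early_selection(channels: List[Channel], attended: str) -> Optional[str]:
    """Select a channel by its physical characteristic only.

    The content of the other channels is never inspected, which is the
    key property of a strict filter placed before semantic analysis.
    """
    for channel in channels:
        if channel.location == attended:
            return semantic_analysis(channel.words)
    return None  # nothing arrives from the attended location


if __name__ == "__main__":
    incoming = [
        Channel("left", ["seven", "four", "one"]),
        Channel("right", ["nine", "two", "six"]),
    ]
    print(early_selection(incoming, "left"))
```

In this strict form, the message in the unattended ear never reaches the meaning stage at all.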

There are other theories, including those of Treisman (1960) and of Deutsch and Deutsch (1963).

This phenomenon is still very much a subject of research, both in humans and in computer implementations (where it is typically referred to as source separation or blind source separation). The neural mechanism in the human brain is not yet fully understood.
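To give a feel for the computer-side problem, here is a minimal sketch of blind source separation for two signals recorded by two hypothetical "microphones". It is not a model of how the brain works, and the article does not name a specific algorithm; the technique chosen here is a very simple form of independent component analysis: whiten the mixtures, then search over rotation angles for the one that makes the outputs least Gaussian (largest squared excess kurtosis). The test signals, mixing weights and function names are all assumptions made for the example.

```python
import math


def center(x):
    """Subtract the mean so the signal averages to zero."""
    m = sum(x) / len(x)
    return [v - m for v in x]


def rotate(a, b, angle):
    """Rotate the pair of signals (a, b) by the given angle."""
    cs, sn = math.cos(angle), math.sin(angle)
    return ([cs * u + sn * v for u, v in zip(a, b)],
            [-sn * u + cs * v for u, v in zip(a, b)])


def whiten(x1, x2):
    """Decorrelate two signals and scale each to unit variance."""
    x1, x2 = center(x1), center(x2)
    n = len(x1)
    a = sum(u * u for u in x1) / n
    b = sum(u * v for u, v in zip(x1, x2)) / n
    c = sum(v * v for v in x2) / n
    # Angle of the principal axes of the 2x2 covariance [[a, b], [b, c]].
    p1, p2 = rotate(x1, x2, 0.5 * math.atan2(2 * b, a - c))
    s1 = math.sqrt(sum(v * v for v in p1) / n)
    s2 = math.sqrt(sum(v * v for v in p2) / n)
    return [v / s1 for v in p1], [v / s2 for v in p2]


def excess_kurtosis(y):
    """Zero for a Gaussian; assumes y has zero mean and unit variance."""
    return sum(v ** 4 for v in y) / len(y) - 3.0


def separate(x1, x2, steps=180):
    """Recover two sources (up to order, sign and scale) from two mixtures."""
    z1, z2 = whiten(x1, x2)
    best = None
    for k in range(steps):  # angles in [0, pi/2) cover all distinct solutions
        y1, y2 = rotate(z1, z2, (math.pi / 2) * k / steps)
        score = excess_kurtosis(y1) ** 2 + excess_kurtosis(y2) ** 2
        if best is None or score > best[0]:
            best = (score, y1, y2)
    return best[1], best[2]


def demo_sources(n=2000):
    """Two made-up 'talkers': a sine wave and a sawtooth wave."""
    sine = [math.sin(2 * math.pi * 11 * t / 1000) for t in range(n)]
    saw = [2.0 * ((37 * t) % 1000) / 1000 - 1.0 for t in range(n)]
    return sine, saw


def mix(s1, s2, weights=((0.6, 0.4), (0.3, 0.7))):
    """Each microphone hears a different weighted sum of the sources."""
    return ([weights[0][0] * u + weights[0][1] * v for u, v in zip(s1, s2)],
            [weights[1][0] * u + weights[1][1] * v for u, v in zip(s1, s2)])


def correlation(a, b):
    """Pearson correlation between two signals."""
    a, b = center(a), center(b)
    num = sum(u * v for u, v in zip(a, b))
    return num / math.sqrt(sum(u * u for u in a) * sum(v * v for v in b))


if __name__ == "__main__":
    s1, s2 = demo_sources()
    y1, y2 = separate(*mix(s1, s2))
    for name, s in (("sine", s1), ("sawtooth", s2)):
        match = max(abs(correlation(s, y1)), abs(correlation(s, y2)))
        print(f"{name}: best |correlation| with a recovered signal = {match:.3f}")
```

The separation is "blind" in that separate() sees only the two mixtures, never the mixing weights or the original signals. Note that this setup needs as many microphones as sources, whereas a human listener manages with two ears among many talkers, which is part of why the problem remains hard.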