Melody is Emergent

The visual system is not a camera and the auditory system is not a tape recorder. Not only do we not see what is “there”, we do not hear what is “there” either.  The world that we experience is the result of a creative process, a process that organizes sense data into information that is useful for the kind of animals that we are.  Consider something as simple as three disks laid down in an arbitrary arrangement:

[Figure 1: three disks arranged at the vertices of a triangle]

How many things are in this picture? The instructions a computer needs to draw this figure mention only 3 objects: where they are, how big they are, and what shape they are. But there are not 3 things in this picture. There are 4. The fourth thing in this picture is the triangle. The triangle emerges from the three disks and is part of the information conveyed by the sense data. There is a lot of information in a triangle, much more than you might think if you confined your attention to the disks that form the sense data. The triangle has a boundary, and this leads to its having an interior and an exterior. The inside and outside of the triangle are created; they are not in the disks. Triangles also have angles, and we can see the angles even when the sides are not drawn in. Once we see the angles we also see that triangles point. That is, triangles create impressions of direction. The next figure is a triangle that points to the right:

[Figure 2: a triangle with a base added, forming an arrow that points to the right]

To get this triangle to point unambiguously to the right, a base has been added so that it becomes an arrow. Without the base this equilateral triangle would sometimes appear to point upwards or downwards to the left. In fact you can use your mind to make the triangle point up or down, and when you do that the arrow will stop looking like an arrow and instead look like some kind of odd shape. The interior of this arrow has been painted black. Or has it? If we see the arrow as being painted then it does look like it has been painted black. If we see the arrow as being cut out of a piece of white paper and laid on a surface that is black then the arrow looks like a window; it does not have any color. There literally is more in both of these figures than meets the eye. The “more” is due to perceptual organization, or what psychologists call Gestalt. Gestalt is in part the idea that the whole is greater than the sum of the parts. Here that means that the disks are seen as being part of something: a triangle. The disks exist in the visual world; the triangle is in the mind of the observer. Roger Shepard, an eminent cognitive psychologist, put the situation most clearly: perception is informed hallucination.

The mind not only creates objects in space, it also creates objects in time. Auditory objects do not have names like dog or cat and are generally referred to abstractly as auditory scenes. Some examples of auditory scenes are bird calls, speech sounds, the sound of splashing or eating and so on. The idea here is that the auditory system pieces together discrete pieces of sound into events that span substantial amounts of time. Music is an excellent example of this process. The individual notes are like the vertices of a triangle. And just as the triangle emerges from the entire collection of vertices, the melody in a piece of music arises only when the notes are perceived to be together and part of something bigger. In a melody we hear not only the notes but also the relations between notes. Musicians know all about these relations and use them to create anticipation, resolution, suspension, and drive, to name a few of the Gestalt properties of music.

Auditory scenes are different from visual scenes in that some kind of memory system is implicated in hearing.  In order for an auditory scene to emerge from a sequence of sounds, the sequence has to be kept alive in a memory bank for at least a few seconds.  The memory system that creates auditory scenes in general, and music in particular, is what I am interested in and what the 40 bpm limit on metronomes is ultimately about.  You can experience the activity of your memory systems by pressing the button marked performance.

PERFORMANCE

Hopefully you just heard a little bit of Erik Satie.  This piece is played at 90 bpm.  Press the button marked slow and you can hear the exact same piece played at about 45 bpm.

SLOW

Now the piece sounds a little sketchy. What has changed by changing the tempo? The notes are the same; they have not changed. This piece was created digitally in a synthesizer and the note values and lengths are fixed. The only thing that has changed is the rate at which the notes are fed into the sequencer. Obviously the tempo has changed, but more than that, the contour has changed. It has begun to disintegrate. The individual notes are beginning to stand out in relief and separate from the piece that embeds them. The complete destruction of the contour can be heard by pressing the button marked really slow.

REALLY SLOW

Now the tempo has been reduced by another factor of 3 to about 15 bpm. Nothing of the contour remains. Each note is an island. Evidently the memory system that creates musical contour out of individual notes breaks down when the notes are too far apart in time.
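The three playbacks differ only in the rate at which the same notes are fed to the sequencer. A rough sketch of that idea in Python follows. The note list is a made-up placeholder, not the notes of the Satie piece, and `onset_times` is a hypothetical helper written for this illustration rather than part of any real sequencer.

```python
# Hypothetical note list: (pitch name, length in beats). Placeholder values
# for illustration only; these are NOT the notes of the Satie piece.
NOTES = [("F#4", 1.0), ("A4", 1.0), ("G4", 1.0), ("F#4", 1.0), ("C#4", 2.0)]


def onset_times(notes, bpm):
    """Return the onset time, in seconds, of each note at the given tempo."""
    seconds_per_beat = 60.0 / bpm
    onsets = []
    elapsed_beats = 0.0
    for _pitch, length_in_beats in notes:
        onsets.append(elapsed_beats * seconds_per_beat)
        elapsed_beats += length_in_beats
    return onsets


# Same notes, three tempos: only the spacing in time changes.
for bpm in (90, 45, 15):
    rounded = [round(t, 2) for t in onset_times(NOTES, bpm)]
    print(f"{bpm:>3} bpm onsets (s): {rounded}")
```

The pitches and lengths in beats never change between runs. Only the gaps in seconds between successive onsets grow as the tempo drops.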

The memory system that creates musical experience is only indirectly based on beat velocity measured as beats per minute.  It is more directly based on the time between beats or the beat period.  Metronome designers could have labeled the wand in terms of period and 40 bpm would be relabeled as 1.5 seconds.  The metronome limit of 40 bpm is telling us that composers should not expect conductors to wait 1.5 seconds to move their hand to the next position.  From the Satie example it is also obvious that composers should not expect listeners to wait much more than a second for notes to arrive if they expect their music to be intelligible.  The research that I discuss in this article is about what a couple of seconds means for human information processing.
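The relabeling of the metronome from rate to period is simple arithmetic: the period in seconds is 60 divided by the beats per minute. A minimal sketch, with function names that are my own choice for illustration, applied to the tempos mentioned in this article:

```python
def bpm_to_period(bpm):
    """Convert a tempo in beats per minute to the time between beats in seconds."""
    return 60.0 / bpm


def period_to_bpm(period_seconds):
    """Convert a beat period in seconds back to beats per minute."""
    return 60.0 / period_seconds


# Tempos mentioned in the text: the three playbacks and the metronome limit.
for bpm in (90, 45, 40, 15):
    print(f"{bpm:>3} bpm -> {bpm_to_period(bpm):.2f} s between beats")
```

At 40 bpm the function gives the 1.5 second period quoted above, and at 15 bpm the beats are 4 seconds apart.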
