Reflections on Aspects of Music Interactivity in Performance Situations

School of Art, Design and Media, Nanyang Technological University, Singapore

This paper in its first version was given at the Hong Kong University Research Colloquium Series (September 2006). It has been revised and updated for publication in eContact! 10.4.

Aspects of Music Interactivity
A Definition of Interactivity
The Score
Degrees of Music Interactivity
Where Lies the Creativity of Interactive Systems?
Bibliography | Author Biography


Music interactivity is a sub-field of human-computer interaction studies. Interactive situations have different degrees of structural openness and musical “ludicity” or playfulness. Discussing music seems inherently impossible since it is essentially a non-verbal activity. Music can produce an understanding (or at least prepare for an understanding) of creativity that is of an order neither verbal nor written. A human listener might perceive beauty to be of this kind in a particular music. But can machine-generated music be considered creative and if so, wherein lies the creativity? What are the conceptual limits of notions such as instrument, computer and machine? A work of interactive music might be more pertinently described by the processes involved than by one or several instantiations. While humans spontaneously deal with multiple process descriptions (verbal, visual, kinetic…) and are very good at synthesising, the computer is limited to handling processes describable in a formal language such as computer code. But if the code can be considered a score, does it not make a musician out of the computer? As tools for creative stimulus, composers have created musical systems employing artificial intelligence in different forms since the dawn of computer music. A large part of music interactivity research concerns interface design, which involves ergonomics and concepts from traditional instrument making. I will show examples of how I work with interactivity in my compositions, from straightforward applications as composition tools to more complex artistic work.

Aspects of Music Interactivity

It may seem contradictory at first, but it is clear that the more technology advances, the more influence an anthropomorphic perspective has over the development of computer tools. A phenomenological perspective on the relationships between hand and voice, tool and sight, and of how they formed early man, is developed in (Leroi-Gourhan 1966). Today, Human-Computer Interaction is a major field of study. If you allow, I will let Figure 1 illustrate a certain kind of relationship between humans and machines.

Figure 1. The main patcher window for Leçons hints at a simiothropic perspective on computer creativity.

Can machines listen? We may have to wait some time before a robot is capable of semantic-listening, as defined by Schaeffer, Chion and Smalley, but already today a computer can be trained to mimic reduced listening. There is a lot of ground to cover before computers start taking pleasure in such activity, but who knows what lies ahead? Discussing art and artificial creativity in the foreword to 100,000,000,000,000 de poèmes (1961), Raymond Queneau attributes to Alan Turing the aphorism: “Only a machine can appreciate a sonnet composed by another machine.” Before we can discuss whether machines might one day be capable of actually composing and appreciating music (whichever comes first), we humans need to interrogate the meaning of concepts such as “listen”, “appreciate”, and “understand”. Max Mathews addressed the problem, stating that the fundamental challenge of computer music is not technical, but psychological:

There are no theoretical limitations to the performance of the computer as a source of musical sounds, in contrast to the performance of ordinary instruments. At present, the range of computer music is limited principally by cost and by our knowledge of psychoacoustics. (Mathews 1963)

It is easy to be blinded by technology. A reflection on the tools we use from a phenomenological and semiotic perspective can be a vaccine against being carried away by every novelty. We will therefore embark on a discussion of some of the many aspects of the phenomenon of interactivity, in parallel with brief presentations of recent works. Much as in life, we will alternate informally between practice and reflection. The examples are meant to illustrate three metaphors of the computer’s role in music/sound creation, across stage-like performance, installation and studio analysis-composition. The distinction between these situations is essentially given by the way a user can affect the computer and to what degree. Before we set off, I will let Figure 2 illustrate, in a way both humorous and profound, one kind of relationship between programmer, machine and end user.

Dynamical Systems

Figure 2. Aspects of interactivity and system control being discussed by inventor and end-user in a typically human fashion. From A Bamse Adventure (Andréasson).

Depending on the context and the philosophical inclination of the user, the computer can be thought of as an instrument, an assistant or a machine (or in some entirely different way). With regard to music performance (a discussion of the distinction between sound and music is outside the scope of this article, and I will simply conflate the two and call them “music”), I think that the appropriate perspective depends on the structure of the performance situation and the degree of interactivity in play. Theoretical approaches have nuanced an understanding of the computer as “a universal sonic instrument, a device with which we can model and produce any conceivable sound-object or organisation for sounds” (Wishart 1983, 325) or as a “representation of an instrument” (Cadoz 1999) or even as an “abstract field of potentials” (Ryan 1999). At the same time, more hands-on approaches treat it as an extension of traditional instruments. Alternatively, the computer can creatively assist us in conceiving new sound-objects (that is, material or instruments) and organisations (that is, processes and actions). All of these activities lie in the purview of the composer; for example, a score is a set of action instructions, singular or process-like, essential in the creation of a single sound or a concert performance, both of which are sound-objects. When describing a dynamical system, we call “agents” those system entities that show some degree of independence, such as limited decision-making or a capability for “intelligent” responses. Agents may be natural (human) or synthetic (e.g. a computer program or a robot). We call “interfaces” the devices that agents employ to communicate with their environment. The instrument in interactive music may be thought of as an interface, a function transforming mechanical energy into electrical signals according to some predefined mapping scheme. The interface is the link through which an agent accesses a sound-producing unit.
Designing the interface (physical and/or virtual) is a major part of contemporary composition of interactive music. In many pieces, the interface becomes intimately associated with the musical work itself. In a strongly interactive work, the two might be indistinguishable.

The Computer as Hyper-instrument

In many a piece of contemporary music, the performance situation is typical: a musician plays a (traditional) acoustic instrument and a computer occasionally uses the instrument sound as input to a real-time transformation, creating a sound response. The relation of the transformation output to the original instrument input may range from that of a “shadow” (following the original closely in the time domain but extending it in the spectral domain) or a “tail” (extending it in the time domain but most often less so in the spectral domain) to a “multiplication” (employing various ways of enlarging the original sound in both spectral and time domains). In addition to such real-time transformations, the instrument sound may trigger the playback of prefabricated soundfiles. If the output is time-delayed significantly, possibly separated by silence, it is more convenient to speak of a distinct response. If in addition to being distinct in time the output is significantly transformed, the computer may appear capable of a degree of creative contribution to the whole, and may thus be perceived as at least a semi-independent agent, even if it does not strictly speaking possess agency. Within a context of music-making and performance, convincing illusions and interesting concepts matter more than the actual functioning of the agents. Many systems for real-time music performance are purely reactive (Cardon 2000) in that they show a unidirectional flow of creative input. As Philippe Manoury has pointed out, an appropriate term for this situation is “real-time interpretation”, rather than real-time interaction (Manoury 1991). Obviously, the computer is a very capable effect box, offering manifold possibilities for nuanced feedback over different channels (sonic, visual, haptic with or without physical force feedback) going from computer to performer.
Explored and discussed by Michel Waisvisz and Serge de Laubier, among others, systems with close interaction between acoustic instruments and computer parts are referred to as hyper-instruments, meta-instruments, hybrids… The metaphor of the computer as hyper-instrument is concerned with immediacy of interpretational possibilities, offering the performer rich access to ornamental layers of the compositional structure.

A Hypercymbal for TreGraphie

For TreGraphie (Koh and Lindborg 2006), composer Joyce Beetuan Koh requested a “singing cymbal” to be played by a percussionist in a trio together with a pianist and a vocalist. We designed a “hypercymbal” for the percussionist, consisting of a large suspended cymbal and a Max/MSP patcher (more about this below) for sound transformation, controlled by a Wacom tablet. The patcher is shown in Figure 3. The cymbal sound is picked up with a contact microphone; no direct amplified sound is sent to the loudspeakers. The treatment is based on Adrian Freed’s resonating filter bank, with parameters for pitch, amplitude and decay rate (filter Q) of each filter. The filter bank emulates a resonator model, and the cymbal serves as a rather noisy source. Different models may be created by sound analysis or by intuitive drawing. The performer can interpolate between four different resonating models by moving a Wacom mouse on the tablet, and affect the fundamental pitch with a continuous pedal. To add liveliness to the model, the pitch (or amplitude or sharpness) of the filters can be affected by the performer flicking the mouse wheel: pmpd, a virtual string model by Cyrille Henry, makes the filter parameters oscillate around their set values, creating variation at an ornamental level.
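
The interpolation between four resonating models can be pictured with a small sketch. The Python fragment below is not the Max/MSP patcher used in the piece; it only assumes, for illustration, that each model is a list of filters with a frequency, an amplitude and a decay rate, that the four models sit at the corners of the tablet surface, and that the pedal acts as a transposition factor. The names Partial, corner_weights and interpolate_models are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Partial:
    freq: float   # centre frequency of one resonating filter, in Hz
    amp: float    # linear amplitude
    decay: float  # decay rate, in 1/s


def corner_weights(x: float, y: float) -> list[float]:
    """Bilinear weights for the corners of the unit square.

    Order: bottom-left, bottom-right, top-left, top-right.
    The position is clamped to the range [0, 1] on both axes.
    """
    x = min(max(x, 0.0), 1.0)
    y = min(max(y, 0.0), 1.0)
    return [(1 - x) * (1 - y), x * (1 - y), (1 - x) * y, x * y]


def interpolate_models(models, x, y, transpose=1.0):
    """Blend four models (equal-length lists of Partial) at position (x, y).

    `transpose` is an assumed stand-in for the pedal: it scales every frequency.
    """
    if len(models) != 4 or len({len(m) for m in models}) != 1:
        raise ValueError("expected four models with the same number of partials")
    weights = corner_weights(x, y)
    blended = []
    for partials in zip(*models):  # the same filter index taken from each model
        freq = sum(w * p.freq for w, p in zip(weights, partials)) * transpose
        amp = sum(w * p.amp for w, p in zip(weights, partials))
        decay = sum(w * p.decay for w, p in zip(weights, partials))
        blended.append(Partial(freq, amp, decay))
    return blended
```

With the pointer in a corner the result is exactly that corner's model; in the centre of the tablet each filter parameter is the plain average of the four models. Linear blending of frequencies is only one possible choice; interpolating on a pitch (logarithmic) scale would be an equally plausible design.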

Figure 3. Graphical user interface for the computer part of the hypercymbal in TreGraphie.

At the start of the development of the hyper-instrument, we gave the performer control over a 360˚ spatialisation of the hypercymbal sound, using the direction of the cordless mouse on the tablet. This was ultimately taken out, as the performer had quite a task in playing the cymbal with different beaters, fingers and bow in complex rhythms, as well as manipulating the sound transformation via the interface. It also proved impossible to receive audio feedback on the spatialisation of the sound from his stage position, rendering control over that parameter meaningless.


An example of the kind of “rich” control over a responsive hybrid instrument that I am talking about would be the audiovisual performance Skalldans (2007–08). This project involves developing an instrument for creating live techno-style rhythmic music and a dreamlike, synthetic live video. A major part of development lies in exploring ways of mapping data between the two domains, and reducing the reliance on knobs-and-sliders interfaces. The sounds are made from FFT-filtered noise, additive synthesis and Atau Tanaka’s Relooper Redux, all using live video as input in various ways. Figure 4 shows the main Max/MSP/Jitter patcher.

Figure 4. User interface patcher for Skalldans.

The visuals are made from a live webcam input of the performer’s face and a 3-D object made from an MRI scan of the composer’s cranium (plus a few simple effects such as downsampling and 4-channel crossfading). There are three interfaces: a USB controller, mainly piloting the sound generation and mix; the webcam, giving information about the performer’s head position in front of the screen; and a Nintendo Wii Remote, which tracks the angle of the performer’s head. These data are mapped to the 3-D skull, and in this way the performer can move his head rhythmically and “skull dance” to the music. The images in Figure 5 give an impression of what it looks like.

Figure 5. Collage of four snapshots from a performance of Skalldans at Happy Ears Festival (Belgium, 2008).

The obvious purpose of the Skalldans setup has been to bring in a more exciting performance element that I feel is often lacking in laptop music. I wanted to emphasise the presence of the performer; make him more than the typical bronze-bust with a frozen gaze into the computer screen, bleakly illuminating his face. At the same time, the piece inevitably creates discussions on dreamscapes, Dionysian dancing, death and many other equally fascinating themes.

A Definition of Interactivity

Based on general principles of human-computer interaction, a particular form of interactivity in music creation is often described as “communication between musician and machine.” What constitutes a communication? Gabriel Valverde once suggested that any act involving music-making or appreciation is in itself “communication”. In which situations does this hold true? Can any activity creating a link between an action and its perception within an observing agent be called a communication? Consider a simple act such as the reading of a music score (is this “making music”? As we will see below, Stravinsky may very well have considered it to be exactly so). The flow of information goes towards the reader; it is unidirectional. Indeed, the flow constitutes a link between the reader and the read. The reader might be changed through the interaction, but nothing will happen to the text itself. A two-way linking is clearly needed. In my DEA thesis, I wanted to emphasise the importance of dialogue-like communication and equivalence between agents, and proposed the following definition: a deep interactivity is characterised by conversational, bi-directional information exchange between agents, transforming the functioning of the entities involved, which have an equivalent degree of import to the system as a whole. A performance situation is interactive if such conversational exchange is a dominant feature of the system. A broad discussion leading to the definition is given in (Lindborg 2003b).

The Computer as Expert Assistant

With this tentative definition in mind, let us look at what could be considered interactive use of the computer in score composition. By this I refer to the discrete-time (non-real-time, outside-time, off-line…) creative situation, characterised by intensive use of music computing tools or development of such, in a studio-like situation, individual or collaborative but, more significantly, without an audience expecting to hear a sound anytime soon. Composition is a process, involving numerous methodologies and producing a wide range of outputs. Typically, the output is a symbolic score on paper, soundfiles, computer code with or without an interface, or a combination of these. Today, the composer is not obliged to learn a programming language and master all its grammatical detail. Software environments such as Max/MSP, Pd, SuperCollider, Cecilia/Csound, PWGL and OpenMusic are supported by a graphical programming interface, greatly increasing their ease of use. For example, in the case of Max/MSP, the patcher (such as the one for the hypercymbal) lets the user manipulate and interconnect objects, and at the same time it is a script generator passing on actions to a compiler instantiating threads that take care of tasks and schedules deeper inside the machine. In various ways, all of these software environments are examples of composition expert systems, through which the user can develop programs to solve specific music composition problems.

The ensuing example will describe a rather typical process (that is, for me) of creating a symbolic score. In theory, it consists of a few chronologically reasonably distinct methods: selection of concept; analysis of a set of external data; generation of musical material (harmonies, rhythms, melodies, divisions, actions, orchestrations); selection among the materials; composition of global-scale structures from material; adaptation of local-scale material for the application, constrained by idiomatic and pragmatic concerns. In practice, it may be less neat. Be that as it may, the computer is here something of an “expert assistant” to the composer. It can be argued that the situation shows a bi-directional exchange of information and initiative. Normally, the composer poses precisely programmed questions to the expert assistant, who/which proposes solutions of a prescribed scope that the composer may accept or reject. Two observations can be made. First, since this process happens in discrete time, the composer can use as much time as it takes, e.g. using computationally intensive algorithms, such as a complete search for all solutions in a state space. This kind of search may take a very long time if the problem is hard or even intractable (as discussed e.g. in Russell and Norvig 1995). Second, since this happens without an audience, failure to arrive at a solution can be used creatively. The output of the computer may offer a new problem. If the composer receives no solution at all (which may of course also be a solution, philosophically speaking), work may be needed to solve a technical obstacle, and this most often leads to a more precise question being asked. If a solution offered is out-of-range, it may force the user to reconsider the scope, resizing the problem as well as the application. Output which may at first have been considered completely useless may eventually be re-contextualised and integrated in the composer’s plan, perhaps at another level.
The computer’s “failed responses” may in this way be embraced and act to stimulate the user. When this happens it is potentially radical, and may occasionally give the composer the impression that his expert assistant is actually providing creative input to a conversation, a collaboration. Can we thereby say that the creativity is distributed between the two agents? The Turing test model does not apply in this situation, since our example concerns the studio situation where there is no audience, no third party to judge the “naturalness” of the agents’ output. A haunting question surfaces: is the impression of agency the same thing as creativity?


Let us look at an example of a work involving extensive use of an expert system. TreeTorika is a composition for chamber orchestra based on an analysis of a recording of Mao Zedong’s Tiananmen speech of October 1949. The analyses concern the sound of the voice only, and not at all the word content of the speech. I first determined Mao’s phrasing, in particular the rhythm between his speaking and the silence following each enunciation; second, I transcribed the vocal line as a melody; third, the phrases and the melody were used to determine tempo and quantisation; fourth, I extracted harmonic material from the vowels. Figure 6 identifies the processes and dataflow involved as well as the supporting software applications, and Figure 7 shows the main OpenMusic patch for beat speed analysis.
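
The first analysis step, finding the rhythm between speaking and silence, can be sketched in a few lines of Python. This is not the toolchain identified in Figure 6; it is a minimal illustration that assumes an amplitude envelope sampled at a fixed frame duration is already available. The threshold and the minimum pause length are hypothetical parameters that would need tuning by ear.

```python
def segment_phrases(envelope, frame_dur, threshold=0.1, min_pause=0.3):
    """Return (start, end) times, in seconds, of the voiced phrases.

    A frame counts as voiced when its amplitude reaches `threshold`.
    Pauses shorter than `min_pause` seconds are absorbed into the
    surrounding phrase, so that brief breaths do not split a phrase.
    """
    raw = []
    start = None
    for i, amplitude in enumerate(envelope):
        t = i * frame_dur
        if amplitude >= threshold and start is None:
            start = t                  # a phrase begins
        elif amplitude < threshold and start is not None:
            raw.append((start, t))     # the phrase ends at this silent frame
            start = None
    if start is not None:              # the recording ends while still voiced
        raw.append((start, len(envelope) * frame_dur))

    merged = []
    for s, e in raw:
        if merged and s - merged[-1][1] < min_pause:
            merged[-1] = (merged[-1][0], e)
        else:
            merged.append((s, e))
    return merged


def phrase_pause_pairs(phrases):
    """Pair each phrase duration with the duration of the silence that follows it."""
    return [(end - start, next_start - end)
            for (start, end), (next_start, _) in zip(phrases, phrases[1:])]
```

The list of (phrase, pause) durations is the kind of raw rhythmic data that could then be passed on to tempo estimation and quantisation.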

Figure 6. Overview of analysis and composition phases.
Figure 7. The main OpenMusic patch for beat speed analysis.

One specific problem that I needed to solve in preparing for TreeTorika concerned the creation of rhythmic background material. I wanted a polyrhythmic texture made up of several layers, with each one containing rhythmic material extracted from Mao’s speech rhythm. I only needed to control global features such as slower-moving synchronisation points (downbeats), and general density directivity (accelerando, stasis or ritardando). The rc library (Sandred 1999) in OpenMusic is specially conceived for working with hierarchical rhythmic layers. The library interfaces to Mikael Laurson’s pmc search engine, and generates solutions to optimal-path problems constrained by user-defined rules. It allows programming of the “compositional questions” discussed above. The composer writes constraint rules expressing his global intentions. The syntax for creating rules entails a bit of logic but is quite intuitive. It can be done entirely using the graphical interface, as illustrated in Figure 8. For example, one may want to specify rules such as “prefer triplets in layer three”, “make phrases gradually shorter”, “forbid rhythmic unisons” and so on. The search engine generates a solution, which the user may accept or reject. The creativity lies in adjusting the rules to create increasingly better output, until an optimal result is found (which may be expressed as having reached the balancing point between what is acceptable and what is doable before the deadline). Tweaking the rules can speed up the process (in particular with the optional heuristic rules, which determine the ordering of candidates). In this dialogue-like communication between composer and computer lies the interactivity of the expert system.
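
To make the idea of rules and heuristics concrete, here is a minimal sketch of a rule-constrained search in Python. It does not use the rc library or the pmc engine and makes no claim about how they are implemented; it is a generic backtracking search over a single rhythmic layer, in which durations are integers (say, numbers of sixteenth notes). The function search and the two example rules are hypothetical names chosen for this illustration.

```python
from typing import Callable, Optional, Sequence, Tuple

Rule = Callable[[Tuple[int, ...]], bool]             # True if the partial rhythm is acceptable
Heuristic = Callable[[Tuple[int, ...], int], float]  # higher score: try this candidate earlier


def search(domain: Sequence[int], total: int, rules: Sequence[Rule],
           heuristic: Optional[Heuristic] = None,
           partial: Tuple[int, ...] = ()) -> Optional[Tuple[int, ...]]:
    """Find a sequence of durations from `domain` that sums to `total`
    and satisfies every rule at every step; return None if there is none."""
    if any(d <= 0 for d in domain):
        raise ValueError("durations must be positive")
    used = sum(partial)
    if used == total:
        return partial
    candidates = [d for d in domain if used + d <= total]
    if heuristic is not None:
        # Heuristic rules never exclude a candidate; they only reorder the attempts.
        candidates.sort(key=lambda d: heuristic(partial, d), reverse=True)
    for d in candidates:
        extended = partial + (d,)
        if all(rule(extended) for rule in rules):
            result = search(domain, total, rules, heuristic, extended)
            if result is not None:
                return result
    return None  # dead end: the caller backtracks


def no_immediate_repeat(partial):
    """Strict rule: forbid two equal durations in a row."""
    return len(partial) < 2 or partial[-1] != partial[-2]


def prefer_short(partial, candidate):
    """Heuristic rule: try shorter durations first."""
    return -candidate
```

A call such as search((1, 2, 3), 6, [no_immediate_repeat], prefer_short) returns the first acceptable rhythm found; changing a rule or a heuristic and running the search again is the programmatic counterpart of the accept-or-reject dialogue described above.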

Figure 8. Graphical programming of heuristic rules to generate rhythms.

More details on the analysis of Mao’s rhetorical voice and the composition project leading to TreeTorika can be found in the OM Composer’s Book #2 (Lindborg 2008).

The Object-process Duality of the Work of Music

Working with symbolic scores forces us to return to a central question: what exactly is a work of music? Is it the same thing as a composition? Within the earlier definition of deep interactivity, there is no mention of “composition”. We may propose that the work is a persistent entity acting as mediator between agents; thus, it cannot itself be an agent. It could be considered an interface, but this is not intuitive. I will try to define it differently. In his pivotal study on aesthetics and Greek tragedy, Nietzsche made a dichotomy between the “Apollonian” and the “Dionysian”. Robert Duisberg describes a work of music in its Apollonian mode as “form without content”: an idea-like object untouchable by time and not yet physical vibrations or sound waves. In contrast, the work in its Dionysian mode is “material without form” (Duisberg 1994). In Greek thought, Apollo and Dionysus were gods, meaning that they were externalisations of representations of human perceptions of the world (such things are useful because they offer ways to communicate ideas from one generation to the next — from the dead to the living — and generally, mediate between people having no direct contact). Within the context of interactive music, there is a similar duality in the way we see the nature of a work of music (a composition). Can we associate the Apollonian mode of communication with an interpretation of the work as object, and the communication in Dionysian mode as an interpretation of the work as process? The former reflects the ethereal qualities of the work, seen to reside “outside time”, having resilient form and being disinterested in the reactions of a listening agent. With the latter, we evoke the temporal aspects of the work, emphasise the dynamic activity producing content and the impact it has on a listening agent. The object-process duality is reminiscent of Igor Stravinsky’s words:

It is necessary to distinguish two moments, or rather two states of music: potential music and actual music. Having been fixed on paper or retained in the memory, music exists already prior to its actual performance, differing in this respect from all the other arts, just as it differs from them … in the categories that determine its perception. The musical entity thus presents the remarkable singularity of embodying two aspects, of existing successively and distinctly in two forms separated from each other by the hiatus of silence. (Stravinsky 1942/1977, 121)

For Stravinsky, the two “moments” were more than stages in the production of a work of music. For him, the score (symbols on a paper), although aphonic, is already music, and at the same time, the promise of becoming sound. While a typical orchestra score may allow only limited scope of interpretational indeterminacy — producing different sounds from a single text — with interactive works, the two modes exist clearly simultaneously. How can the dual nature of the musical work be thought of in phenomenological terms and used in composition?

Process and Object in a Perspective of Écoute réduite

By the word “process”, I do not simply mean a function that generates a quantifiable collection of musical objects (the group of possible instances) or states (the flow of possible states). Work-as-process is a mode of existence of a work and it refers to the time-dependent structure of a work at the time-point when it leaves an impression on a listening agent, which, as a consequence, is given the potential to realise a material instance corresponding to the perceived data. In the receiver (listener) exists the work-as-object. It does not exist within a process; it is a representation produced by an agent from perceived input. The work-as-object exists only as representation, and it is epistemologically disputable to associate it with some material quantity. We may observe that the view of the materiality of the object constitutes a fundamental difference between Peirce and Saussure, as the former did not exclude the possibility of a physical reality of the object. It seems that Saussure (signifiant-signifié), via Jean Molino and Jean-Jacques Nattiez, has been more influential than Peirce in contemporary musicology, for example, in the comprehensive mathematical-semiotic system by Mazzola (1999, 2002). The process is not a producer of objects but of object instantiations; the objects are formed in the mind of the listener and are material insomuch as they are physically different states of a brain or a computer data storage unit. This understanding of object and process seems similar to the distinction between Schaeffer’s entendre and comprendre with regard to their ultimate goal: causal listening (Denis Smalley’s translation) is concerned with a unique object, whereas semantic-listening is concerned with an object’s role in the world, its reason-to-be. However, a thorough comparison is beyond the scope of this article. Figure 9 shows the four modes of reduced listening strategies.

             abstract              concrete
objective    semantic-listening
subjective   (causal) listening

Figure 9. Schaeffer’s square describing modes of listening strategies (écoute réduite), with Denis Smalley’s English translations.

Let us look at a simplistic example. Imagine that a composer wishes to convey the idea of “threeness”. This is the work-as-model. He contemplates different possibilities, like three oranges, three durians and so on. This is the work-as-process. He finally creates a composition out of three apples: an instantiation. Because of the artfulness of the presentation, the listener does not sense exactly the same thing, but maybe “regular succession of non-pitched impacts, increasingly lowpass-filtered”, or “love” or something entirely different. This perception is the work-as-object. If the composer is lucky, the semantic-listening interpretation of “threeness” is stronger than the abstract-subjective or other mode of perception.

The Score

Pragmatically speaking, we may define a composition as a set of descriptions of performance actions, and a composer as an agent that assembles such descriptions in order to convey an idea to a performer agent. Let us focus on the nature of the descriptions. They form a set, an agglomerate object, consisting of images, score symbols, code, verbal instructions and so on. This set is referred to as the score, and signifies the model of a work of music. For the receiver (listener and/or performer) the model is a projection of the score’s modes of existence, both work-as-process and work-as-object. Because the former is a time-dependent function, the projection is in continuous transformation. At the receiving end, the model is perceived as an instantiated object bound to a particular moment in time. It may persist as a mental representation, but has no physical existence outside the mental state of a brain (or of some other device capable of holding a representation). Figure 10 attempts to illustrate schematically the relationship between these concepts.

Figure 10. Two aspects of an interactive music work: object and process, and how the score engenders an instantiation.

Historically, music notation was developed as a mnemonic device to support the communication or learning phase of performing a work. Well into the era of symbolic notation, the choirmaster cum composer would employ a process-oriented method of oral transmission, aided by the score. After having composed a work, he would teach the lines one by one to the singers, who would in turn memorise them. Later, the notation would come to serve as a means for the composer to externalise his musical knowledge and intention, eventually leading to the parallel descriptions of the work embedded in the conductor’s score and the musicians’ parts. The history of how the notion of the score developed in early motet style is discussed in A History of Western Music (Grout 1980, 79–115). Since it is relatively insensitive to the passing of time, the score promises a future re-presentation of the music, specifically, of the work-as-model. However, in the mind of the perceiver, the instantiated object may dissociate from the model, leading to the instantiated object being mistaken for the work of music. This is what happens when a rupture in interpretational praxis occurs; with no person-to-person communication assuring the detail of an intention, something vital is lost. Not everything musical can be conveyed by symbols alone.

Representation in Fixed and Dynamical Media

In the case of fixed-support music, the sonification of the score happens at a time-point later than the time-point of the creative act. This loss of immediacy makes us think of it as a physical entity, a work-as-object. It is “form without content” and contains in itself no creativity. This is particularly clear in the case of Tape Music, but it is also, to a lesser degree, the case with music interpreted from a symbolic score on paper. Simplistically put, the score paradigm based on notation reigned in European art music for almost a millennium, until improvised music was recognised as a learned art. Today, we are faced with the necessity of considering works of interactive music. This may eventually constitute a third paradigm of the work-of-music. Stravinsky’s observation about the different modes in which a work seems to exist was intuitive. A more developed expression of the same idea is given by Xenakis in Formalized Music, postulating that music can exist either “in-time or outside-time” (Xenakis 1971/1992). The former refers to temporal art, where there is instantaneous creation. Analysing and composing interactive music requires a time-dependent description of how the model changes. Can we say that computer code formalising such change is a score, intended for a synthetic performer? Code is necessarily dynamic (primacy on process) as opposed to traditional music notation or a soundfile, which are examples of static representations (primacy on data). However, the moment natural language is used together with a static representation, conveying additional and vital information, the whole becomes a dynamical score. Scores containing natural language can probably only be interpreted by humans. The iconic verbal score is Karlheinz Stockhausen’s Aus den sieben Tagen, and recent works notated in score form with mixed notation (text and music symbols, as well as electronics) would include Rolf Wallin’s Phonotope and PerMagnus Lindborg’s RGBmix.
In an interactive music work, it may be that the score itself is not fixed, but time-dependent. In experiencing such scores of interactive works, it seems that one has to integrate listening with structural analysis, and apprehend the aphonic (non-sounding) part of the musical work.

The Computer as Synthetic Musician: Leçons

The computer can be a synthetic musician, acting directly on the compositional structure. In Leçons pour un apprenti sourd-muet (2003) [“Lessons for a deaf-and-dumb apprentice”], I investigated methods for automatic learning, algorithmic composition and real-time synthesis. The computer performs a real-time audio analysis of an improvising saxophonist, using bonk~, fiddle~ and amplitude following to determine note onsets and offsets. Phrase structures, notes, and pitch-amplitude variations in the course of individual notes are stored in different registers, constituting a database that grows during performance. A hidden Markov model is used for automatic recombination of notes. The more data available, the more likely the algorithm is to generate output that resembles the original. More detail about the conceptual background to the piece, as well as some of the nuts-and-bolts functioning, is given in Lindborg (2003a). Figure 11 shows a screenshot of the Max/MSP patcher.

Figure 11. The main (right) and some of the subsidiary interface windows for Leçons.
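
The amplitude-following part of the analysis stage can be illustrated with a small sketch. It is not the bonk~ or fiddle~ algorithm, nor a transcription of the Max/MSP patcher. It is a generic envelope follower with two thresholds (hysteresis), and the function names, coefficients and threshold values are all assumptions chosen for illustration.

```python
def envelope(samples, attack=0.5, release=0.01):
    """One-pole envelope follower: rises quickly, decays slowly.

    `attack` and `release` are hypothetical smoothing coefficients (0..1).
    """
    env = 0.0
    out = []
    for x in samples:
        level = abs(x)
        # Use the fast coefficient when the signal rises above the envelope,
        # and the slow one when it falls below it.
        coeff = attack if level > env else release
        env += coeff * (level - env)
        out.append(env)
    return out


def detect_notes(samples, on_threshold=0.2, off_threshold=0.1):
    """Return (onset_index, offset_index) pairs from the amplitude envelope.

    A note starts when the envelope reaches `on_threshold` and ends when it
    drops below `off_threshold`. Using two thresholds avoids rapid
    re-triggering when the envelope hovers around a single value.
    """
    notes = []
    onset = None
    for i, e in enumerate(envelope(samples)):
        if onset is None and e >= on_threshold:
            onset = i
        elif onset is not None and e < off_threshold:
            notes.append((onset, i))
            onset = None
    if onset is not None:
        # The signal ended while a note was still sounding.
        notes.append((onset, len(samples)))
    return notes
```

In a sketch of this kind, each detected segment could then be passed on to pitch analysis and stored in the growing database of material. The offset is reported with a delay that depends on the release coefficient, which is the usual trade-off between smoothness and responsiveness in amplitude following.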

Xenakis points out that any sound analysis or composition procedure can be treated statistically and that every measurable and storable parameter of sound can be recombined using Markov chains. However, the first and non-trivial problem is of a perceptual nature: the choice of parameters to analyse. The second problem is one of efficiency: given that the size of the probability matrix grows exponentially with the order of the chain, we may run out of space. A Markov chain model of sufficiently high order can imitate any quantifiable style to a satisfying level, as is shown in recent work by Marc Chemillier and Gérard Assayag (Assayag et al. 1998–present), and François Pachet (The Continuator). Perfect imitation is not necessarily what composers are looking for, and they may consequently choose to work with erasing chains of an order between 2 and 5, for example. With lower order, the inherent randomness of the model provides for variation and, maybe, some surprise, locally. In addition, to make the model more human, we can let the Markov matrix “age” with time by erasing parts of the database; the computer musician sometimes forgets what it has learnt. Leçons implements an approach to the phenomenon of invention through decision-making guided by probabilities registered in a Markov matrix. The computer follows a performance script containing the actions defining a performance. It does not know any notes or phrases beforehand, and listens to the saxophonist throughout the performance, analysing the audio in order to build a database of material, allowing it to improve. The computer registers aspects of the style of a human improviser. The material is recombined using a statistical method, algorithmically generating its phrase responses, which are played in real-time.
The goal of the work has been to construct a situation where the music is created during the performance and where the two improvisers are compatible in terms of producing musically interesting material. Figure 12 illustrates the dataflow in Leçons.

Figure 12. The three main processes in Leçons: analysis, recomposition, and synthesis.
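
The recombination stage discussed in this section can be sketched in a few lines. The following is a minimal illustration, not the implementation used in Leçons: it is a plain Markov chain of configurable order rather than a hidden Markov model, the class and method names are hypothetical, and notes are represented by simple labels. Storing only the contexts actually heard, instead of a dense matrix, is one way around the space problem. As an assumed example, a dense table over 12 pitch classes at order 5 would need 12^6 = 2,985,984 entries.

```python
import random
from collections import Counter, defaultdict


class MarkovMemory:
    """Order-k Markov chain that learns during performance and can forget."""

    def __init__(self, order=2, seed=None):
        self.order = order
        # Sparse transition table: context tuple -> counts of following symbols.
        self.table = defaultdict(Counter)
        self.rng = random.Random(seed)

    def learn(self, phrase):
        """Add one analysed phrase (a list of symbols) to the database."""
        for i in range(len(phrase) - self.order):
            context = tuple(phrase[i:i + self.order])
            self.table[context][phrase[i + self.order]] += 1

    def forget(self, fraction):
        """'Age' the memory by erasing a random fraction of stored contexts."""
        contexts = list(self.table)
        for context in self.rng.sample(contexts, int(len(contexts) * fraction)):
            del self.table[context]

    def generate(self, start, length):
        """Recombine learnt material, starting from `start` (len == order)."""
        out = list(start)
        while len(out) < length:
            options = self.table.get(tuple(out[-self.order:]))
            if not options:
                break  # unknown context: nothing learnt (or it was forgotten)
            symbols = list(options)
            weights = [options[s] for s in symbols]
            out.append(self.rng.choices(symbols, weights=weights)[0])
        return out
```

A possible use would be to call `learn` each time the analysis delivers a phrase, `generate` when the script asks for a response, and `forget` occasionally. A lower `order` gives more varied, less faithful output, while a higher order reproduces longer stretches of the source.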

Degrees of Music Interactivity

I proposed earlier that the appropriate phenomenological perspective on the computer depends on the kind of music interactivity in a situation. As a lemma to the discussion of the duality of a musical work, note that the interactivity is primarily determined by the nature of the model as expressed in a score. While both work-as-object and work-as-process are aspects (the vanishing points of observation) of the music work, the interactivity emerges as the mediator between the two extremes. A full investigation of this model is beyond the scope of the article and would have to be the subject of future work. I will only propose a simple sketch. Following the definition advanced earlier, we should be in a position to estimate the degree, or depth, of interactivity in different works of music, by examining the performance situation and by reflecting upon how the material and processes in it relate to our key concepts and derivatives:

We may then start to think of interactivity as a continuum between two extremes. Figure 13 gives an example of an interpretation of the degree of interactivity, or interaction depth, in five typical music performance situations.

Figure 13. Examples of degrees of interactivity.

Where Lies the Creativity of Interactive Systems?

Our discussion of the first metaphor, the computer as instrument, concerns interactivity characterised by reactiveness and emphasises the richness of the control. Our second metaphor, the computer as expert assistant, concerns dialogue-like interactivity, and offers the occasional sensation of involving actual creative exchange. Our third metaphor, the synthetic musician, points towards a great challenge for interactive music performance, namely to address the issues of agency and artificial intelligence in music. I would like to let Max Mathews set the tone for a few concluding remarks:

I think that computer programs that have been developed are pretty interesting as musical instruments, making sounds and timbres. They are pretty uninteresting — at least to me — as compositional algorithms … maybe [because] we haven't found or written the right algorithms; that’s a more difficult problem than just making timbres. (Mathews 2002)

Figure 14. Journey into the unknown. Illustration from A Bamse Adventure (Andréasson).

If it is true that “the best music composition system we know of is the mind of an artistic human” (Todd and Werner 1999, 315), it will be through imitation of human behaviour that we can make machines more artful, more convincing. However, it should be kept in mind that every successful imitation of human creativity has caused a redefinition of what intelligence itself is. Using the machine on its own “brute-force” terms, allowing extensive search and computationally intensive calculations, can make it a useful tool or assistant, but such models do not necessarily bring insights to cognitive research and our understanding of the functioning of the human brain. It is possible to use the computer as an intelligent assistant for analysis and composition, in order to learn more about what human musical creativity is. This is truly a journey into the unknown, and it may be appropriate to halt the reflections at this point. We wave goodbye to Bamse and his friends, as they fly off (Fig. 14), full of the excitement, modest confidence and hesitation that such a journey inspires.


Bibliography

Andréasson, Rune. Bamse. Cartoon strip. Egmont Publishing, 1966–90.

Assayag, Gérard et al. The OMAX Project Page. 1998 – present.

Cadoz, Claude. “Musique, geste, technologie.” Les Nouveaux Gestes de la Musique. Edited by H. Genevois and R. de Vivo. Marseille: Éditions Parenthèses, 1999, pp. 47–92.

Cardon, Alain. Conscience artificielle & systèmes adaptatifs. Paris: Editions Eyrolles, 2000.

Continuator. Software developed by François Pachet. 2002 – present.

Duisberg, Robert. “On the Role of Affect in Artificial Intelligence and Music.” Perspectives on Musical Aesthetics. Edited by John Rahn. Norton & Company, 1994, pp. 204–33.

Grant, J.G. A History of Western Music. London and Melbourne: J.M. Dent & Sons Ltd., 1980. 3rd edition.

Henry, Cyril. “pmpd: Physical modelling for Pure Data.” Pure Data library. 2005 [?]. Ported to Max/MSP by Ali Momeni and adapted for Intel processors by Mathieu Chamagne, 2007. [Last accessed 25 September 2008.]

Koh, Joyce Beetuan and PerMagnus Lindborg. TreGraphie for soprano, percussion, piano and computers. Oslo: NMIC Publishing, 2006.

Leroi-Gourhan, André. Le geste et la parole. (1, Technique et langage; 2, La mémoire et les rythmes). Editions Albin Michel, 1964.

Lindborg, PerMagnus. “Leçons: An Approach to a System for Machine Learning, Improvisation and Musical Performance.” Post-symposium Proceedings of CMMR 2003 (Montpellier, France, 26–27 May 2003). Also in Lecture Notes in Computer Science (LNCS) vol. 2771 (Berlin: Springer Verlag, 2003).

_____. “Le dialogue musicien-machine: aspects d’interactivité.” Mémoire de DEA (diplôme d'études approfondies: first part of doctorat). 2003.

_____. “About TreeTorika: Rhetorics, CAAC and Mao.” OM Composer’s Book #2. Edited by J. Bresson, C. Agon and G. Assayag. Paris: Éditions Delatour France / IRCAM—Centre Pompidou, 2008. [ISBN 978-2-84426-399-5 / 2-7521-0051-5]

Manoury, Philippe. “Les limites de la notion de « timbre ».” Le timbre, métaphore pour la composition. Edited by Jean-Baptiste Barrière. Paris: Éditions Ircam et Christian Bourgois, 1991.

Mathews, Max. “The Digital Computer as a Musical Instrument.” Science (November 1963). Cited in Joel Chadabe, Electric Sound, p. 110. Mathews had just completed the third generation of the Music programming language.

_____. “Dartmouth Symposium on the Future of Computer Music Software: A Panel Discussion.” Edited by Eric Lyon. Computer Music Journal 26/4 (2002).

Max/MSP/Jitter. Software. Cycling ’74, 1991 – present.

Mazzola, Guerino. “Semiotic Aspects of Music.” Semiotics Handbook Vol. 8. Edited by Roland Posner et al. Berlin: W. de Gruyter, 1999.

Mazzola, Guerino, Stefan Göller and Stefan Müller. The Topos of Music: Geometric Logic of Concepts, Theory, and Performance. Basel: Birkhäuser, 2002.

OpenMusic. Software. Paris: IRCAM, 1998 – present.

Ryan, Joel. Private conversation. Paris, 1999.

Sandred, Örjan. OpenMusic RC library version 1.1. Software library. Paris: IRCAM, 2000.

Stravinsky, Igor. Poetics of Music. Harvard University Press, 1970 [original edition in 1942].

Todd, Peter M. and Gregory M. Werner. “Frankensteinian Methods for Music Composition.” Musical Networks: Parallel Distributed Perception and Performance. Edited by Niall Griffiths and Peter M. Todd. Cambridge MA: MIT Press, 1999, pp. 313–39.

Wishart, Trevor. On Sonic Art. Harwood Academic Publishers, 1983/1996.

Xenakis, Iannis. Formalized Music. Hillsdale, NY: Pendragon, 1992 (revised edition of 1971 publication).
