

A Phenomenological Time-Based Approach to Videomusic Composition

Thesis submitted April 2008 as part of a Master of Music (M.Mus) in Electroacoustic Music, Faculté de Musique, Université de Montréal.

Résumé français

Afin de définir un cadre conceptuel pour l’analyse de la vidéomusique, ce document parcourt l’évolution historique et discursive de la musique visuelle ainsi que les multiples utilisations du son et de l’image dans sa pratique. Il se concentre sur la théorie de Michel Chion, qui a étudié la relation perceptive entre le son et l’image dans le contexte cinématographique, pour développer une réflexion sur la relation entre le mouvement et les gestes combinés dans la vision et l’audition. Finalement, à partir de ces bases théoriques, ce document analyse mon travail personnel de vidéomusique the hands of the dancer en examinant comment le mouvement et la perception du son et de l’image peuvent être exploités pour accomplir des différents effets temporels.


This paper considers the history and discourse of visual music and the multiple ways that sound and image have been engaged through its practice, in order to set up a conceptual framework to analyze videomusic. It focuses on the theory of Michel Chion, who studied the perceptually binding relationship of sound and image in the context of film, to turn the dialog toward the relationship between movement and combined gesture in the ocular and aural senses. Finally, it analyzes my own videomusic work the hands of the dancer with reference to this discussion, examining ways in which movement and the perception of sound and image may be exploited to achieve different temporal affect.

Editor’s Note

Due to its length, the present article has been separated into three parts:


The dream of a unified inter-sensorial compositional framework dates back to Wagner’s operatic innovations of the nineteenth century. Composers have since struggled to create works that cross the boundaries between media. Notably, the term visual music has arisen to describe non-narrative sound and image compositions that are informed by musical structure. This genre has been interpreted in many ways, from Scriabin’s compositions for colour organs (instruments that generate coloured projections) to Norman McLaren’s dancing animations, whose sound arises directly from drawings made on the physical soundtrack of the film. While many have taken the position that only abstract images are suitable to express the plastic nature of music, visual artists such as Stan Brakhage, as well as Woody and Steina Vasulka, have initiated a dialogue between abstraction and concrete form through which the manipulation of image itself reflects the necessary pliant qualities.

The discourse of electroacoustic music focuses primarily on timbral characteristics and their development over time, rather than on more traditional tonal relationships. The transformation and manipulation of sound is given primary focus in such composition, resulting in highly fluid temporal arrangements. Because electroacoustic music engages freely with time without any structure or metric imposed by its material attributes, its language offers a point of departure for all experimental time-based media that engage with transformation and manipulation in a similar fashion. Further, the existence of shared editing techniques and similar data representation paradigms across the digital arts facilitates such an exchange. The careful association of audio and visual edits, manipulation, and form reinforces compositional structure as distinct from content, creating unified gestures between both media. Several artists working from within the domain of electroacoustic composition have explored this principle, including Jean Piché, Ryoji Ikeda, and Phill Niblock. The term videomusic is currently used in the electroacoustic music community to describe such video and sound work, presented as electroacoustic composition.

But what does it mean to make a unified gesture between media? What is the experience of this kind of inter-sensorial work? Composers working in the field of videomusic must ask themselves the relevant questions: Is the whole greater than the sum of its parts? Why is the experience of videomusic different from that of video or even of music alone? To further this line of inquiry and to build new compositional strategies from such a departure it is necessary to examine the way that sound and image inform each other when they are coupled through composition. Doing so will reveal a relationship between movement and perception in which image and sound are intertwined at the levels of both construction and experience.

To consider these questions, I will first examine the history of visual music and some of the multiple ways that sound and image have been engaged through its practice. Next I will explore sophisticated theoretical discourse in the field to see how it might serve as a departure point to explore videomusic. To extend the discussion, I will focus on the work of Michel Chion, who studied the perceptually binding relationship of sound and image, as it appears in the context of film, both narrative and non-narrative. I will attempt to enhance his theory with the work of other thinkers who have concerned themselves with understanding the dialog between movement and the senses. Finally, I will analyze key moments of my own videomusic work the hands of the dancer with reference to this discussion, examining ways in which movement and the perception of sound and image may be exploited to achieve different temporal affect in videomusic.

Chapter 1: The History and Discourse of Visual Music

1.1 Historical Approaches

As far back as Aristotle, western thinkers sought to find a connection between colour and sound in hopes of finding some kind of unified organizational principle towards a balanced æsthetic experience. Aristotle himself stated in his De Sensu that “we may regard all colors as analogous to the sounds that enter into music” (Aristotle 1907, 439b). Two millennia later, Sir Isaac Newton associated the seven colours he identified in the perceivable bands of the rainbow (red, orange, yellow, green, blue, indigo, and violet) with the seven notes of the western C major scale (Collopy 2000, 357). Technological developments helped advance the struggle to pair image and sound in a meaningful way. Louis-Bertrand Castel, a Jesuit priest, invented the first colour organ in the eighteenth century (Collopy 2000, 356). Many other such devices, involving the playing of an equal-tempered keyboard to produce coloured light projections, were invented over the next two hundred years, and as early as 1930 the name lumia was coined by Thomas Wilfred to describe visual art that was composed under the same time-structuring methodologies as music (Collopy 2000, 356). Later, with the addition of sound to the filmic image at the turn of the twentieth century, the works that descended from this intellectual tradition would become known as visual music.

Visual music concerns itself with entirely different subject matter than that of traditional narrative cinema as presented through the commercial motion picture and television industries (Russett 2004, 110). Its internal discourse stems from a fine-art lineage and does not seek to present itself as a representation of reality, though individual works may attempt to create alternate realities, which appear convincing through their own consistency. Historically, visual music has tended toward the abstract in the same way that music is abstract (at least until the invention of sampling practice and Pierre Schaeffer’s musique concrète [1]), concerning itself more with structure and form than with signified references to tangible objects. Though it inherits from the same perceptual mechanisms that make the filmic soundtrack so successful, the audio component of visual music is free from the functional restraints normally imposed on sound accompanying film.

Many composers working in the domain of music have expressed views on the relationship they experience between sound and image. Famously, Messiaen included the names of colours in his score for Couleurs de la cité céleste in 1963, and claimed to experience a relationship between music and colour that was almost synæsthetic (Poast 2000, 217). He stated that he tried “to translate colours into music: for me certain sonorities are linked to complexes of colour, and I use them in full knowledge of this” (Samuel and Aprahamian 1976, 17). Ligeti too believed he could associate colours and shapes with sounds directly, and worked with the painter Rainer Wehinger to devise a visual accompaniment to his Artikulation in 1958 (Poast 2000, 217). Schoenberg, both a painter and a composer, included instructions for colours to be projected on the stage in his score for Die glückliche Hand in 1910–13, and Scriabin scored music for a colour organ in his Prometheus during the same period (Poast 2000, 217). Other composers, such as John Cage and Krzysztof Penderecki, used coloured demarcations on their scores to be interpreted by their performers while playing.

Similarly, several painters have expressed views on the musical counterparts to their images. The abstract expressionists in particular sought to unify the arts in their work. Wassily Kandinsky stated that “the sound of colours is so definite that it would be hard to find anyone who would try to express bright yellow in the bass notes, or dark lake in the treble” (Kandinsky 1977, 25), and both he and Paul Klee created several paintings that took direct inspiration from the pieces of music after which they were named, as did Piet Mondrian, Jackson Pollock, and Mark Rothko, among others (Whitney 1994, 45). Peter Peretti decided to investigate the relationship between such artists’ visual work and the music that inspired them. Building on the results of Walter L. Wehner, who had already done some studies in this area, Peretti tested 100 music majors and 100 non-music majors, equally divided between women and men in each group, to see if they could match six paintings made by Paul Klee with the 21 different pieces of music that inspired their construction. His results were surprising: most test subjects were able to match the music to the relevant paintings with great accuracy.

Peretti attributes the findings of his study to emotional cues embedded within both the music and paintings presented. In his seminal work The Lüscher Color Test, Max Lüscher expresses views that reinforce these results. He notes the physiological effects of different colours on the body, such as blood pressure increase or decrease, stimulation or relaxation of the nervous system, and the slowing down or speeding up of both heart beat and breath, and associates them with different emotional states known to cause the same results. Similar studies relating musical structures to emotion also exist (Hevner 1936), as do studies pairing specific visual stimuli with individual sounds through association (Cowles 1935); a scientific basis for Peretti’s claim can thus be found in the literature of his field. For many artists and composers working between sound and image, capturing an emotional resonance between their audio and visual material is sufficient to elicit interest, even without a better-defined correspondence.

Perhaps the most iconic category of visual music is that in which the sound and image presented are meant to match each other with some degree of strict mapping. Artists have chosen to create such mappings in many ways, usually withholding some aspects of their audio-visual compositions to be forged through their own artistic discretion, while other aspects are left to be determined through such binding relationships. This is both because the numbers of formal visual and aural elements that can be animated in a composition match only under precisely controlled conditions, and because a certain amount of artistic license is often necessary to help such strictly mapped compositions resolve æsthetically. In this category of visual music are compositions in which the visual material is derived in strict relationship to the sound, compositions in which the sound material is derived from visual characteristics, and other work in which the visual and aural material are derived together from underlying structural data.

The practice of creating image directly from sound dates back to Castel’s colour organs in the eighteenth century. Over the next hundred years, D.D. Jameson, Bainbridge Bishop, and A. Wallace Rimington improved on Castel’s initial design, leading to deeper and more meaningful structures of association and control between sound and light attributes (Collopy 2000, 356). The earliest colour organs provided only projected washes of colour; later design developments, however, allowed for more precise rhythmic control as well as more interesting mappings between the piano-like keyboards normally used to trigger such devices and the light they produced. Many pieces have been written for this family of instruments, with scores notated in standard musical notation to be interpreted by a trained pianist. In some cases the projections emitted by the organ serve as a visual accompaniment for a separately created music; in others the same keyboard creates sound as well.

Robert Snyder is a contemporary composer who works with a modern well-tempered colour organ. His keyboard controls five colour outputs to different areas of a screen, temporally modulated by his playing through what he terms a luminous envelope, instead of through a direct mapping between colour and key (Snyder 1985). In his work the scores for light and sound are not separate but instead should be comprehended as a single entity whose main attribute is density. In contrast, Scriabin famously scored music for colour organ and orchestra for a performance at Carnegie Hall (Pocock-Williams 1992, 29). In his realization the light performance served as a kind of accompaniment to the musical performance, though it was scored in musical notation and reflected the major emotional themes of its aural counterpart. A more direct mapping between sound and image used by a colour organ can be seen in the work of musician Leopold Stokowski and visual artist Thomas Wilfred, who created a scored light performance synchronized to Scheherazade by Rimsky-Korsakov, in which the music’s pitch determined colour (Pocock-Williams 1992, 29).

Complex mappings from sound to image are found in the work of Lynn Pocock-Williams, an artist who turned to computers in order to algorithmically express her ideas. Early visual computer art that responded to sound often used only pitch and temporal signifiers, as specified by the early MIDI standard, to control vector or pixel-based animations. In more recent history, such systems often react to spectral analysis of their input as well as to advanced deductions from expert-type predictive algorithms embedded in the software. In her paper “Toward the Automatic Generation of Visual Music,” Pocock-Williams describes the construction of a visual language based on animation lookup, where parts of visual sequences are triggered through deduced variations and repetitions of melodic fragments extracted from a MIDI score.

The most common direct sound visualization work seen today is in the form of music visualization software packaged inside sound playback applications for personal computers. The visualizers inside such applications as Nullsoft’s Winamp or Apple’s iTunes software respond to the music being played through a sophisticated mixture of dynamic and spectral recognition. Although these visualization algorithms are created somewhat outside the scope of modern fine art practice, they share goals and techniques with composers working within the discipline of visual music. For many engaged listeners these visualization programs meet the æsthetic demand for related cross-sensory stimulation.

Several visual music composers have adopted a reverse strategy, attempting to create the musical component of their work using a tight pairing of their visual imagery with sound. Both Norman McLaren and Oskar Fischinger experimented with painting visual elements onto the optical soundtrack of their films. McLaren in particular is famous for doing this, and films he created as early as the late 1930s, including Dots, Loops, and Scherzo, explore this concept (McWilliams 2008). A related form of image-driven sound synthesis with a different implementation found a home in the domain of computer software. Programs such as Coagula and MetaSynth concentrate on spectral information rather than on waveforms to allow users to design sounds through visual depictions of their own creation.
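The spectral approach taken by such software can be illustrated with a minimal sketch. It assumes, for illustration only, that each row of a greyscale image drives one sine oscillator, that each column is a slice of time, and that brightness sets loudness. The function name, the frequency range, and the sample rate are hypothetical choices and do not describe the internals of Coagula or MetaSynth.

```python
import math

def image_to_samples(image, duration=1.0, sample_rate=8000,
                     f_low=200.0, f_high=2000.0):
    """Render a greyscale image (rows of brightness values in 0..1) as audio.

    Assumed mapping: top row = highest frequency, columns = time slices,
    brightness = amplitude of that row's oscillator during that slice.
    """
    n_rows = len(image)
    n_cols = len(image[0])
    # Spread the row frequencies evenly from f_high (top) to f_low (bottom).
    if n_rows == 1:
        freqs = [f_low]
    else:
        step = (f_high - f_low) / (n_rows - 1)
        freqs = [f_high - r * step for r in range(n_rows)]
    n_samples = int(duration * sample_rate)
    samples = []
    for n in range(n_samples):
        t = n / sample_rate
        # Pick the image column that corresponds to the current moment.
        col = min(n * n_cols // n_samples, n_cols - 1)
        value = sum(image[r][col] * math.sin(2 * math.pi * freqs[r] * t)
                    for r in range(n_rows))
        samples.append(value / n_rows)  # keep the mix within [-1, 1]
    return samples
```

Under these assumptions a diagonal line drawn from the top left to the bottom right of the image would be heard as a descending glissando, while a black image yields silence.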

Another approach to tightly mapping images to sounds relies on visual scoring methodologies that musicians can read or interpret. Michael Poast, a composer who is also a painter, took inspiration from the abstract expressionist movement when he decided to start painting notation onto scores for his works. His process was gradual; he began by occasionally integrating coloured notes onto his stave lines for emphasis, adding more and more colour until his markings dominated the notes already there. Eventually he abandoned his notes altogether. In his writing on the subject, he claims that his own formalized system for matching colour, texture, and mark quality allows him a sufficient degree of precision and reproducibility to make specific works identifiable upon performance (Poast 2000). He attributes this compositional methodology to the work of John Cage and the Fluxus Group as well as to the Happenings of the 1960s, and maps out several parameters that elicit reproducible responses amongst musicians sensitive to visual stimuli. Amongst these are associations of pitch and voice with colour and shade, temporal clusters with placement and density, dynamics with scale, and other emotional characteristics with shape and texture.

A common technique often employed by composers attempting to define a direct mapping between sound and image involves creating their aural and visual material in response to the same underlying structures. In this way the composition can truly be structurally coupled between the media employed without one medium taking a reactive role to the other. A variety of technological tools have been engaged in order to do this, ranging from abstract score-based systems to invented mechanical devices and more recent digital innovations. Of particular interest are systems in which deeper structural strategies are manifested over simple associations of note and pitch.

Recently a slew of artists using generative processes to manufacture both sound and image have arisen in the field. Tools such as Cycling ’74’s commercial Max/MSP/Jitter programming environment and the open source community’s rough equivalent Pure Data have opened up algorithmic visual music design to a new generation of composers and software designers. Luke DuBois’ work is a prime example of such tools being used to push the boundaries of co-structured image and sound. DuBois, who is perhaps best known as one of Jitter’s original developers, creates pairs of visual and aural synthesis instruments using Jitter’s OpenGL library and his own sound synthesis extensions to MSP. He strives to match perceptible æsthetic characteristics between each pair of instruments in order to unify their relationship, and maps related control values to each of their synthesis parameters concurrently. DuBois’ programming work, meant for live manipulation in a performance environment, creates a base for his own visual music improvisation. His mappings favour spectral changes over tonal or harmonic structure and are associative, based on what are fundamentally his own æsthetic judgements. Interestingly, though it is DuBois’ data sets that control both his abstract visual and aural imagery, variations in these sets do not create wildly different compositions. Instead, real æsthetic variation is created by DuBois’ choices of software instruments and the transitions between them, though they are synchronized on the level of individual gesture.

1.2 Theoretical Discourse

A very different example of a composer using control data to associate his sound and image work is John Whitney. Over the course of his life, Whitney found several ways of unifying sound and image, searching for meaningful bonds between colour and tone that functioned as a single compositional unit. He named such functional associative relationships complementarities, and described them as existing in an “inter-reactive form of temporal union” (Whitney 1994, 46). Whitney, however, soon grew dissatisfied with associating sound and image elements only through their temporal placement. Although he was quite inspired by Kandinsky’s writing on the fundamental sameness of visual and aural experience, he felt that the avant-garde’s exploration of co-structured sound and image almost never addressed the real issue of musical harmony:

A contemporary might as well have proposed a poetry (as some did) that ignored word meaning so that random blocks of letters might be spaced in strict clock-time patterns between punctuation marks along with stunning scattered “silences”. In truth, the clock only measures time. Harmonic forces arbitrate (shape) time in music just as meaning and grammar shape the temporal architecture of a sentence. (Whitney 1994, 48)

Whitney’s desire to capture and communicate the tension and release of consonance and dissonance drove him to experiment with relational structures rather than with simpler direct mappings. He invented what he termed a differential digital harmony to follow from his idea that “progressions of ratio in visual as well as tonal (chordal) patterns of harmony lie at the heart of our perception of time as æsthetic structure” (Whitney 1994, 48). In his exploration of this idea, Whitney derived a family of algorithms in the sixties, seventies, and eighties that used differential dynamics and other geometric structures to model musical harmonic architectures, through pixel and vector animation, creating both his images and his music from the same harmonic mathematics. In this way, he sought to reflect meaningful musical relationships within his synthesized digital imagery, manifesting the principles of music composition outside the domain of sound.
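The idea of ratio-driven motion can be sketched in a few lines. The sketch assumes, as an illustration rather than a reconstruction of Whitney’s own algorithms, that a set of elements travels around a circle with element k advancing k times as fast as the first. The function names and the clustering tolerance are hypothetical.

```python
import math

def differential_positions(n_points, phase):
    """Return (x, y) positions on a unit circle for n_points elements.

    Assumption for illustration: element k (1-based) has completed
    k * phase full turns, so the elements drift apart and realign
    whenever phase is a simple ratio.
    """
    positions = []
    for k in range(1, n_points + 1):
        angle = 2 * math.pi * k * phase
        positions.append((math.cos(angle), math.sin(angle)))
    return positions

def distinct_clusters(positions, tol=1e-6):
    """Count how many distinct locations the elements currently occupy."""
    clusters = []
    for x, y in positions:
        if not any(math.hypot(x - cx, y - cy) < tol for cx, cy in clusters):
            clusters.append((x, y))
    return len(clusters)
```

With twelve elements, a phase of 1/2 gathers them into two opposed groups and a phase of 1/3 into three, while phases in between scatter them. In this toy model the gathered moments play the role of consonance and the scattered ones of dissonance.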

Visual music works range from loose pairings of music and abstract footage to animations whose sounds are seemingly diegetic, that is, belonging to the narrative within which the objects themselves reside. No formal rules exist in the field to determine temporal or æsthetic relationships; however, many different compositional paradigms have been proposed. The great variety of mapping strategies employed by visual music composers helps define procedural æsthetics unique to each body of work. The sheer multitude of such work produced, each piece striving to create its own unified aural and visual language, demonstrates how varied successful techniques can be. Several composers have struggled to create an underlying theory for visual music, one that explains the deep connection between sound and image assumed to be present by those working in the field. These theories range from paired matching of attributes such as colour and pitch, or size and volume, to more complicated philosophies of shared underlying structure that can be used to generate a multi-sensorial experience.

Fred Collopy discusses several attempts to pair sound and image in his article “Color, Form, and Motion: Dimensions of a Musical Art of Light.” He notes the similar physical properties of sound and light that have long haunted philosophers, but concludes that there is no matching pattern in how humans perceive colour band and tonal band structure (Collopy 2000, 357). Collopy disparages attempts at direct mappings between musical tonality and abstract colour, saying that “there is no reason to assume that structures that create beautiful sounds will in and of themselves produce works which look beautiful” (Collopy 2000, 357). Tom DeWitt makes a similar observation in his article “Visual Music: Searching for an Aesthetic.” Turning to the work of Helmholtz, who established a relationship between the overtones produced by human hearing mechanisms and the geometric progressions found in tonal sound, DeWitt attempts to isolate psychological factors relevant to the perception of moving abstract image that might bear meaning on creating a “visual æsthetic similar to music” (DeWitt 1987, 118). To this end, DeWitt investigates visual structures in his own work that reflect these factors using symmetry and threshold borders along with the persistence of vision.

DeWitt’s focus on the perceptual characteristics of vision addresses few of the compositional mechanisms of music, and instead relies on the judgment of the individual composer to sequence visual material as they see fit. This dialog is resumed in the writing of Brian Evans, who attempts to define a theory of visual harmony through similar mechanisms, as well as one of visual counterpoint, and one of visual tonal centre. Evans suggests a precise mathematical measure for evaluating the amount of temporal coherence in a given digital image based on the colour values of its pixels and their distances from what he terms the neutral domain (Evans 1990, 43), essentially monochromatic shades whose red, green, and blue components have equal value. From this metric Evans believes a dialog between tension and release similar to traditional musical harmony can be constructed. Other contributing factors towards this tension and release for which Evans suggests metrical analyses include the visual weight distribution of an image as determined by the position and density of both abstract masses of colour and the discretely perceivable objects contained therein. Evans calls upon the fundamentals of visual design to discuss this theory, echoing Rudolf Arnheim’s claim that “[o]ne of the basic visual experiences is that of right and wrong” (Arnheim 1966, 2). For Evans, this same rightness and wrongness amount to visual consonance and dissonance:

Any introductory design book talks about how to achieve visual balance and harmony and good visual composition. From this premise composing visual music is a simple process. If rightness is codified and understood, wrongness is easily created by not being right. Movement from visually wrong to visually right is a construction of tension/release. (Evans 2004, 2)
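The measure of distance from the neutral domain mentioned before this quotation can be made concrete with a minimal sketch. Evans’ exact formula is not reproduced in this text, so the sketch assumes that a pixel’s distance is its Euclidean distance from the grey axis of the RGB cube (the line where red, green, and blue are equal), and that an image’s score is the mean of its pixels’ distances. Both function names are hypothetical.

```python
import math

def distance_from_neutral(pixel):
    """Euclidean distance of an (r, g, b) pixel from the grey axis r = g = b.

    The nearest grey to a pixel has all three components equal to the
    pixel's mean, so the distance is measured to that point.
    """
    r, g, b = pixel
    mean = (r + g + b) / 3.0
    return math.sqrt((r - mean) ** 2 + (g - mean) ** 2 + (b - mean) ** 2)

def mean_neutral_distance(pixels):
    """Average distance from the neutral domain over a collection of pixels."""
    pixels = list(pixels)
    return sum(distance_from_neutral(p) for p in pixels) / len(pixels)
```

Under this assumption any grey pixel scores zero and a fully saturated primary scores highest, so a sequence of frames whose mean score rises and then falls would trace a movement away from the neutral domain and back toward it.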

Visual counterpoint is manifested for Evans through discretely definable visual elements moving through consonant and dissonant relational forms. Here, Evans gives a second but related criterion for the construction of visual consonance and dissonance. In movement, it is visual balance, direction, and weight as determined by the motion of the eye that creates the necessary tension and release for temporal dialog. He again appeals to Arnheim who says that “[e]quilibrium is attained when the forces constituting a system compensate one another” (Arnheim 1966, 26); to this Evans adds that “[b]alance in an image should give a sense of being out of transition, with all elements having reached a sense of repose. This balance is accomplished by the interplay of elements comprising the image” (Evans 1992, 14).

Strong parallels exist between Evans’ description of visual consonance in elemental counterpoint and techniques used by filmmakers, including storyboarding and even pose-to-pose animation. This is no accident, and convincingly illustrates Evans’ understanding of temporal coherence and motion. A storyboard is a graphical tool used to help filmmakers organize the narrative of their story into discrete segments and shots. It depicts select images that represent key moments of narrative, in order to trace the action of a scene through completion. Usually in such a depiction a single image will depict a static shot, with more images used when the frame itself or key elements inside the frame are in motion. Pose-to-pose animation is a technique where animators provide images for only the beginning and end frames of a motion or sequence; the intermediate frames are generated through tweening, an approximation of the motion’s in-between points that can be rendered either through computer software or by hand. Storyboarding is a useful tool for filmmakers precisely because it depicts moments of visual integrity or consonance, the transformations between which can be filled in through the viewers’ own imagination. In a sense, these moments are privileged through perception above those that occur mid-motion. An entire story can be communicated through them, which is not necessarily the case with images in motion, because these images do not imply the places where the motion stops. Pose-to-pose animation arose as a technique to improve efficiency and save money in animation studios, where it was found that frames in movement tended to be less recognizable than those that depicted the motion’s beginning and end points. As a result, the in-between frames did not need to be completely redrawn, or at least, not by the more expensive talent of the company. The privileging of these relative moments of visual consonance serves to illustrate Evans’ depiction of visual balance in terms of movement.
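Tweening in its simplest software form can be sketched as linear interpolation between two key poses. The sketch assumes that a pose is a tuple of coordinates and that motion between keys is uniform; studio practice typically adds easing curves, which are omitted here, and the function name is hypothetical.

```python
def tween(start, end, n_frames):
    """Linearly interpolate between two key poses, both endpoints included.

    start and end are equal-length tuples of coordinates; the result is a
    list of n_frames poses running from start to end.
    """
    if n_frames < 2:
        raise ValueError("need at least the two key frames")
    frames = []
    for i in range(n_frames):
        t = i / (n_frames - 1)  # 0.0 at the first key, 1.0 at the last
        frames.append(tuple(a + (b - a) * t for a, b in zip(start, end)))
    return frames
```

Only the first and last frames of the returned list are authored; every frame between them is derived, which is the economy the technique was adopted for.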

Evans extends this discussion of elemental counterpoint to provide a model from within which visual music pieces can be constructed. He believes that visual imagery can progress through related structures akin to key changes in tonal music through the nested resolution of sequential, weighted transformation (Evans 1992, 15). He describes this process as it mimics the temporal details of musical structure:

In music composition particular chord progressions and melodic materials work to define the overall form of the work through devices of repetition, variation and contrast. The tonal system establishes a hierarchic relationship in which the harmonic material is always striving to return to the tonic key and chord. Main melodic material also develops by establishing a hierarchy, creating a sense of resolution when the main theme returns. Structural keys can be advanced in time-based visual design through repetition and variation in a manner similar to harmonic/thematic development in music. Hierarchic relationship of structural keys can be established, functioning temporally as the main motivic material of an abstract animation. (Evans 1992, 14–15)

Underlying this discussion is the implicit assumption that tonal music is organized as it is, not because of psychoacoustic principles, but because, fundamentally, its shape reflects inherent truths about time-based composition.

Evans’ æsthetic theory relies on the perceptual characteristics of moving images evaluated through the dialog of tonal musical structure. His carefully constructed visual analogies, derived from counterpoint and harmony, are meant to communicate the tension and release implicit within tonal musical discourse. While it is certainly true that a dialogue between these two factors is central to all time-based narrative, it is unclear to what extent analogies dependent on specific tonal music constructions carry across media. Certainly, music composition deals with questions regarding the abstract structuring of temporal events; however, the specific constructions used to this effect are likely dependent on the psychoacoustic properties associated with tonality as well as on previous conditioning. Tension and release can be constructed through multiple structural designs, many of which have no analogies with tonal music. Any theoretical framework seeking to explain visual music must also account for the success of pieces that stray from this model.

Evans’ writing is among the most sophisticated in the field, building on the insightful observations of John Whitney, and incorporating principles of visual design theory as well as musical theory and analysis. Three questions, however, haunt his theoretical framework when it is applied to visual music composition. First, are the temporal organizational principles of tonal music discourse fundamental to the way composition is perceived across other media? Second, how should sound be paired with such composed visual structures under this proposed framework? Third, and perhaps most relevant, what is to be gained by the pairing of sound and image under such circumstances? These questions must be answered in order to truly build a framework for visual music composition.

In order to analyze videomusic in particular, a professed sub-genre of visual music, Evans’ theory must be extended and tailored. The term videomusic has arisen in the electroacoustic community to describe composed works for video where the soundtrack and visual component are tightly integrated. These works draw from the rich traditions of both experimental film and video art while inheriting the compositional vocabulary of electroacoustic music. As a result, concrete elements that refer to the physical world as well as abstract elements are commonly used throughout videomusic composition with a significant focus being on the transformation of both audio and visual material through digital processing. The mix of these elements necessitates a dialog between form and abstraction as well as one between sound and image, and opens up discussions descended from more classical narrative film theory.

Equally pressing, the compositional structures of such pieces do not necessarily follow the structural framework of tonal music as referenced by Evans in his discussion. Typically, though not universally, the discourse of electroacoustic music focuses on timbral characteristics and their development over time, rather than on more traditional tonal relationships. In such composition, the transformation and manipulation of sound is given primary focus, as are perceptual elements such as sound segregation and spatialization. This often results in highly fluid temporal arrangements, and no definable metric structuring of time need be present in a piece. Discussions of compositional direction and resolution often occur in cinematic terms or at the level of individual gesture and transformation, rather than in terms of traditional counterpoint or harmony. Evans’ theory is useful specifically because it frames the discussion of visual music in terms of movement, duration, and transformation. To apply these concepts towards the analysis of videomusic, an understanding of them must be reached in terms of their perceptual characteristics for coupled sound and image, as described through gesture rather than motivic development.

