Electroacoustic Sound and Audiovisual Structure in Film
The following article is extracted from the author’s thesis, currently in progress under the supervision of Denis Smalley at City University, London.
In 2005, director Jacob Ballinger invited me to compose the music for his 35mm short film, Rocketman (2007), in which he intended to fully exploit the storytelling potential of the relationship between sound and image. It soon became clear that fulfilling this ambition required dismissing the traditional dividing line between music and sound design, and instead integrating the two elements in a through-composed soundtrack. Having worked as a composer in the field of electroacoustic music for several years, using essentially the same tools as those provided in a standard post-production studio, I agreed to venture into the world of film sound, taking responsibility for the making of the entire soundtrack. In the working process I made several observations regarding audiovisual relations and, in particular, their potential for structural articulation. On this basis the concepts elaborated below emerged. The ideas are briefly demonstrated using classic, well-known feature films as case studies, and finally applied to examples from Rocketman, from which they originally stem.
Music Versus Sound Effects — Sound Designer qua Composer?
Given sound’s ability to create temporal experiences distinct from those of other media, it seems no accident that it has attained an important role in film. Where a film montage made of cuts in time and space causes a chaotic, fractionated temporal experience, sound can bring to the image an all-embracing temporal trajectory. Conversely, where a film shot implies linear time, sound can impose the experience of non-linear time. While this applies to sound in general, in traditional filmmaking such temporal attributes are mostly exploited by means of a film score accompanying the picture. The reasoning behind this, according to Michel Chion, is that where “other sound … elements … are obliged to remain clearly defined in their relation to diegetic space and to a linear and chronological notion of time,” music, on the other hand, “enjoys the status of being a little freer of barriers of time and space” (Chion 1990, 81). Chion states that the spatiotemporal quality of music especially applies to what he himself calls “pit music” 1[1. “Pit music is … music that accompanies the image from a nondiegetic position, outside the space and time of the action.” (Chion 1990, 80)], with reference to the orchestral pit in the opera house (ibid.). In other words, pit music is the traditional musical accompaniment to a film, often performed by an orchestra, residing outside the film’s world. Accordingly he implies that “screen music” 2[2. “Screen music is … music arising from a source located directly or indirectly in the space and time of the action, even if this source is a radio or an off screen musician.” (ibid.)], that is, music arising from a source within the film’s world, may only in some cases attain the status of its aforementioned counterpart. A classic example occurs when music heard over a car radio linearizes an otherwise non-linear montage of images showing a character travelling a long distance.
The “other sound elements”, on the other hand, whose structuring Chion considers subordinate to the spatiotemporal information within the film’s world, could, besides dialogue, be regarded simply as “sound effects”.
In the context of motion picture production, sound effects denote all sound elements that do not fall into the categories of music and dialogue. The recording, processing, editing and mixing of sound effects, including “on screen”, Foley, and background sound effects, is often managed by a sound designer. According to Walter Murch, the first person to be credited with the title, in recognition of his contribution to the film Apocalypse Now (Coppola 1979), the role of the sound designer is to take care of the overall treatment of sound in film (Thom 1998, 122). More generally, sound design is considered an artistic field covering all non-compositional elements of film sound.
While one can arguably make such a distinction between music and sound effects when considering films featuring traditional film music, incorporating electroacoustic music in film challenges this idea. For decades any sound available, be it instrumental or environmental, has been part of the electroacoustic composer’s sound palette, and musical properties such as space (spatial articulation) have developed to acquire an importance equal to that of pitch and rhythm. Furthermore, just like the sound designer, the electroacoustic composer is concerned with sound recording, editing and mixing, each representing an essential and often indistinguishable part of the compositional process. Thus, in principle, by means of the electroacoustic medium, the composer has the opportunity to include all sounds relating to a given film’s world in his compositional work, thereby potentially exploiting their temporal forces in relation to the image. However, the temporal utilization of sound effects in film is not new, having been practised for decades by innovative sound designers. As a brief example, consider the “tiger scene” in Apocalypse Now, in which a high-pitched, sustained sound of insects freezes time and creates an experience of suspense, an effect usually achieved through the use of music, typically by orchestral means. In such examples, if the categorization of sound elements into film music and sound effects is to retain any meaning, we would have to redefine the terms according to sounds’ correspondence with the image rather than simply their means of production.
An early example of an entirely electroacoustic soundtrack that defies the subdivision between film music and sound effects based on the differentiation between sound sources or qualities is found in the 1956 sci-fi movie Forbidden Planet. Here all sounds, except the most basic Foley sounds such as footsteps, share the same “electronic” origin, timbre and identity. Nonetheless, we perceive some sounds as sound effects and others as musical, and often, in the most interesting case, we experience them as having both functions simultaneously. Our interpretation of the sounds at any given instant relies, here, entirely on the level of correspondence between sound and image.
Regarding film in general, we can identify three main components of correspondence between sound and image: identity, time and space. Identity concerns the correspondence between the identity of a sound-object associated with a visual action and the sound we instinctively expect to be produced by that action. More specifically it relates to the object’s material and intrinsic 3[3. “The notion of intrinsic space is based on the perception of internal spatial components inherent in individual sounds and sound events.” (Henriksen 2002)] spatial properties such as texture and size respectively. The expectation as to how a given visual action will sound may be founded on both natural and culturally-related perceptual experiences. Time-correspondence concerns the timing between the visual action and the associated sound. In contrast to the real world, in film the occurrence of a sound-object may be dislocated from the action producing it. Finally, space-correspondence concerns the match between the space implied by auditory and visual cues respectively, in terms of extrinsic 4[4. “The notion of extrinsic space … concerns the sound in space.” (ibid.)] spatial properties such as room characteristics and size and the proximity of a given source.
In the context of this writing, the image refers not only to the actions taking place within the visual field of the spectator, defined by the screen, but to the entire space of the film’s world. For example, when we see a character walking, his feet will often have been cut off by the edge of the camera frame. As spectators, however, we still believe that the person has feet and expect them to make sound when he walks. Another example occurs when an establishing shot showing two characters chatting in a room cuts to a close-up shot of one of the characters, to give a more detailed view of facial expression. Does that mean that the other character has disappeared? Maybe, but that would need to be revealed to us later. At the instant where we cut to the close-up shot we maintain an internal image of the entire space in which the two characters are standing next to each other. While the above concerns the representation of an objective-external reality, the image may also, by means of point-of-view shots, especially if moving, represent an objective-internal reality, bringing the audience inside the body of a character, so to speak.
Identity, time and space each inhabit a continuum between close and remote correspondence. In short, close identity correspondence is the result of coupling sound with the imaged representation of its source, as, for example, when the sound of footsteps, whether real or sound-alike, accompanies the image of footsteps. If we replace the sound of footsteps with string pizzicati, provided these remain synchronized with the image, we experience a remote identity correspondence. Close time correspondence occurs when sound and image are synchronized with each other, as in the example above. If, by contrast, we locate the sound and the visual action at separate points in time, yet retain the association between the two, a remote time-correspondence occurs. When the spaces indicated by image and sound match each other we identify a close space-correspondence, while if there is a mismatch between the two, as there would be if we added a huge amount of reverberation to the sound of a car driving through the desert, we experience a remote space-correspondence. The levels of identity-, time- and space-correspondence may differ from each other at any given instant, and each may change dynamically over time.
For identity, time and space alike, the closer the correspondence, the more that dimension contributes towards making the sound appear as originating from within the image. When, for example, the sound of footsteps 1) plausibly reflects the sonic identity of the shoes and the material being walked on, 2) is synchronized with the movement of the character, and 3) matches the space in which the character is walking, we identify an overall close correspondence between sound and image. Conversely, the more remote the overall correspondence between sound and image, the more the sounds are pushed outside the film’s world. Pit music is a good example of sound which lacks correspondence with the image in terms of identity, time and space.
In the above investigation of the endpoints of the overall correspondence continuum we discovered a link with the two traditionally divided elements of film sound: at one extreme, sound effects relating to overall close correspondence, and, at the other, film music, here considered in the guise of pit music, resembling overall remote correspondence. The concept of correspondence will not, though, differentiate between sound effects and screen music. Imagine, for example, that a character plays the piano, say a Beethoven sonata, until suddenly, for some reason connected to the diegesis of the film, he bangs his head onto the keyboard. During the entire sequence we may identify close correspondence between sound and image in identity, time and space, and yet experience the sound functioning as music and as sound effect respectively. In order to categorize the associated sounds as either music or sound effects we need to ask whether they are organized according to the image or vice versa. When sounds seemingly follow a filmic logic (for example the cluster sound determined by the piano player’s head hitting the keyboard), they function as sound effects, whereas if the image (such as that of the finger movements of the character playing the piano) follows a musical logic the sound will serve as screen music. In the field of overall close correspondence, owing to the implicit “one-to-one relationship” between sound and image, very limited room is left for ambiguity between sound effects and music. Accordingly, attempts to transform gradually the role of sound from one to the other often yield rather deceptive results in which the spectators merely flip between the two interpretations.
Delving into the realm of overall remote correspondence we encounter the symbol correspondence continuum. Although sound may not correspond to image through identity, time and space altogether, it may do so in a symbolic fashion. By means of culturally generated codes, sound can partake in the moods of the images, it can tell us when to be alert, and, through the imitation of musical styles or sound qualities (e.g. analogue distortion), it can define the (relative) temporal setting of the images. Moreover, by means of internal references, sound is able to contextualize images, and hence to create a network of relations between otherwise separated elements of a film. As with the above three dimensions of correspondence, symbol correspondence may vary from close to remote.
In film sound, the field of remote correspondence, ultimately allowing for a symbolic relationship with the image, has largely been the territory of traditional instrumental music. Being accustomed to the film genre and its precursors (theatre, opera), we, the audience, are willing to accept the presence of the accompanying orchestra despite its lack of correspondence with the image. This precondition, uniquely allowing for operation outside the film’s world, has facilitated the promotion of traditional music as the prime temporal mediator in film. In the following I will elaborate the above ideas, leading to a discussion of how other sound elements belonging to the film’s world may, by manoeuvring within the middle ground of the three-dimensional correspondence continua, transcend the traditional division between sound effects and film music, and eventually adopt temporal functions usually achieved through the use of traditional music.
Identity Correspondence Continuum
Regardless of the level of identity correspondence, when a sound and a visual action occur at the same time the spectator will believe that there is a correspondence between the two. According to Chion, who denoted this phenomenon “synchresis” (a combination of the words “synchronism” and “synthesis”), we may use any sound we might desire for footsteps. While the level of identity correspondence may not influence our perception of an audiovisual event as integrated, it will nevertheless affect our interpretation of the event, from hyper-realistic to unreal. Sound designers and Foley artists mostly utilize the property of synchresis to substitute sound sources with others that produce similar sonic identities. This includes endowing actions which in reality make little or no sound with a strong sonic identity, based on natural properties of sound. Being primarily concerned with close identity correspondence, these examples occupy the upper part of the continuum, and serve to make audiovisual events seem plausibly realistic. Lowering the level of identity correspondence, audiovisual events become more ambiguous and reliant on creative, associative activity on the part of the spectator. Accordingly, remote identity correspondence is often utilized to create a dreamy atmosphere, and may evoke the sensation that we hear what a character is hearing through a blurred perception.
A factor in locating a sound in the identity correspondence continuum is the number of Materializing Sound Indices (MSIs) involved. According to Chion, Materializing Sound Indices are the details in sound that supply information about its concrete materiality and production, and which cause us to “feel” the material conditions of the sound’s source. The number of MSIs provided by a sound often depends on the quality of the sound recording; for example, a high-quality sound recording carries a higher number of MSIs than a dull one (Chion 1994, 114). A large number of MSIs will often cause a sound to tip towards one of the extremes of the identity correspondence continuum, whereas a lack of MSIs will place the sound in the midfield of the continuum, potentially allowing the listener to associate the sound with other sources within the image.
Close identity correspondence need not rely directly on real-world experiences. That is to say, one does not need to have experienced a given audiovisual event in real life in order to accept it as true in the movie theatre. For example, although some of us have never experienced a rocket launch in the real world, when such an event is presented to us on television or film we may perceive a close identity correspondence between image and sound. Rather than real-world experiences, here the inference of close identity correspondence relies on previous experiences of watching rocket launches on TV and film. This condition is exactly what allows audiovisual media to establish fake realities in which the spectators may (temporarily allow themselves to) believe. By consistently coupling a sound with a visual event or object which do not cohere in reality, provided that the audience has not encountered this event or object outside the movie theatre (e.g. if nonexistent), it is possible to establish a close identity correspondence between the two. While, in principle, we may pair any sound with a fictitious object, we do remain critical as to what we accept as representing “reality”, and in the validation process our real-world experiences continue to provide the most important tool. Thus, for the correspondence between image and sound to be close, in this way suggesting that the object in question is “real”, it needs to reflect real-world physics.
Material Versus Spatial Identity
With regard to identity correspondence, two sub-components can be identified: material and spatial identity. Both concern the correspondence between the implications made by image and sound respectively; material identity applies to substance and texture, while spatial identity concerns size and position in vertical space. To illustrate the difference between the two, and to show how each may contribute towards making a fictive object become “real”, it is relevant to take a look at the lightsaber in Star Wars, whose sound was designed by Ben Burtt. In terms of material identity the lightsaber’s continuous, humming sound can be considered to correspond closely with its visual counterpart, firstly because the image represents stable, continuous energy 5[5. Within the fiction of Star Wars, the lightsaber’s “blade” is made by a tight loop of highly focused energy.], and secondly because in real life that energy would most likely be dependent on electricity. The humming sound, which we recognize from electronic circuits, reflects exactly those two properties. The concept supporting the materialization of the lightsaber is further enhanced by the occurrence of electrical “zaps” or cross-connection sounds when two lightsabers cross. While visually the crossing is only marked by a flash at the very instant where the two lightsabers meet, the zapping sound is held throughout the entire period of contact, thereby indicating a continuous friction.
Central for the spatial identity correspondence between image and sound is the natural relationship between source size and fundamental pitch. Although the fundamental pitch of an electronic hum will not tell us anything about the size of the source, the apparatus’ ability to resonate at that frequency indeed will. Drawing a parallel with musical instruments, the fundamental pitch of the lightsaber (90 Hz) lies between that of the French horn and the lowest producible pitch of the acoustic guitar. Suggesting a magnitude in the vicinity of these instruments, 90 Hz seems slightly weighty but somewhat appropriate for representing the lightsaber with its four-foot length, including handle and blade.
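The link between source size and fundamental pitch can be given a rough back-of-the-envelope check. The sketch below is my own illustration, not part of the original argument: it models the sounding body as an idealized quarter-wave resonator (an assumption chosen purely for demonstration) and asks how large an object would resonate at the lightsaber’s 90 Hz.

```python
# Hypothetical sketch: how large is an idealized resonator tuned to 90 Hz?
# The quarter-wave model and the speed-of-sound value are assumptions of
# this illustration, not claims made in the text.

C = 343.0  # speed of sound in air, m/s (at roughly 20 degrees C)

def quarter_wave_length_m(f_hz: float) -> float:
    """Length of an idealized quarter-wave resonator tuned to f_hz."""
    return C / (4.0 * f_hz)

length_m = quarter_wave_length_m(90.0)
print(round(length_m, 2), "m /", round(length_m * 3.281, 1), "ft")
```

Under this (admittedly crude) model, a 90 Hz resonator comes out at roughly a metre, in the same ballpark as the four-foot lightsaber, which is consistent with the “slightly weighty but somewhat appropriate” judgement above.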
Besides size, frequency may also imply something about the sounding object’s position in vertical space. Generally there is a tendency to associate a glissando moving upwards in frequency with a literal upward movement. Consequently, an upward glissando will correspond more closely than a downward glissando to, for example, the image of a rocket launch. While this may seem natural, in reality, due to the Doppler effect 6[6. When the sounding rocket moves away from the listener at continuously increasing speed, the sound wave reaching the listener’s ears is continuously stretched and consequently slowed down, forming a downward glissando.] relating to space-correspondence 7[7. Since it concerns sound in space, in the context of this writing the Doppler effect adheres to the concept of space-correspondence.], a slight downward glissando is more likely to occur. Trevor Wishart has suggested that the former association of pitch with height relies on an environmental metaphor: airborne creatures are dependent on a small body weight, with which follows a small sound-producing organ confined to producing high-pitched sound. With this in mind, creatures with deeper voices are spontaneously assumed to be earthbound (Wishart 1996, 191).
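The downward glissando of the receding rocket can be quantified with the standard Doppler relation for a source moving away from the listener. The sketch below is my own illustration; the source frequency and the recession speeds are arbitrary assumed values used only to show the direction of the pitch change.

```python
# Doppler shift for a receding sound source: for a source moving away at
# speed v (m/s), the observed frequency is f_obs = f_src * c / (c + v),
# where c is the speed of sound. The 440 Hz source and the speeds below
# are assumed values for illustration.

C = 343.0  # speed of sound in air, m/s

def doppler_shift(f_source: float, v_receding: float) -> float:
    """Observed frequency for a source receding at v_receding m/s."""
    return f_source * C / (C + v_receding)

# As the rocket accelerates away, the perceived pitch glides downward:
for v in (0.0, 50.0, 100.0, 200.0):
    print(round(doppler_shift(440.0, v), 1))
```

Each step of acceleration lowers the observed frequency, producing exactly the slight downward glissando described in the footnote above.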
Potentials of Identity Correspondence
The rules outlined above, derived from our common perceptual experiences in the real world, are often followed by film makers in order to bring credibility to fictitious elements or even to turn reality into hyperrealism. Examples of the former include the association of sound with many of the fantasy creatures in Jackson’s Lord of the Rings films, modelled to have a rational physiology, in this way almost giving them a potential existence. Here, recordings of animals are transposed down to match the often enormous size of the creatures. However, a paradox relating to spatial identity occurs in the audiovisual design of large-scale airborne creatures, since in such instances it is evidently not possible to conform to the rules of magnitude and height simultaneously. Here the latter, spatial metaphor is generally favoured at the expense of magnitude correspondence, resulting in dinosaur-sized eagles vocalizing the same high pitches as those residing outside the cinema. Presumably for the same reason, the winged beasts in Lord of the Rings, ridden by ringwraiths, only roar with profound voices, matching those of real-world earthbound creatures, when grounded. 8[8. When airborne, the winged beasts tend to be associated with the ringwraiths’ crow-like screams.] A brief example of hyperrealism is the conventional association of sound with a rising or descending helicopter. Working as sound editor on Carpenter’s Escape from New York (1981), David Lewis Yewdall discovered that the upward glissando caused by jet whine warm-ups (prior to blade rotation) could be utilized to enhance the impression of the helicopter’s movement in vertical space. Played forward or backward, the sound supports the associated image of the rising or descending helicopter respectively (Yewdall 2003, 213). The most extreme example of hyperrealism achieved through the use of diegetic sound is found in space movies.
Although we know that sound cannot travel in space, for those of us who have never been there it remains just as abstract as any fantasy world represented on film. As our perceptual frame of reference remains confined to life on earth, the absence of sound may provoke a conspicuous and dramatic effect. Thus, in fiction films, sound in space can be considered a means of protecting perceptual reality, allowing the audience to forget about the real world and enter the film’s world. Conversely, in documentary, the lack of sound enhances conceptual reality, bringing awareness of the real world, which the film aims to describe, into the conscious mind of the audience.
The identity correspondence continuum is often used to separate natural objects or events from supernatural ones. Just as sound may contribute to making unnatural objects or creatures become natural, it may turn the natural into the supernatural. While the former is obtained through close identity correspondence, as demonstrated above, the latter calls for a more remote correspondence. To evoke the sensation that a specific object, event or character is supernatural, its associated sound’s spectral register is often extended downwards, e.g. by transposition, hence making the source size indicated by sound exceed that of the associated visual object. If the context excludes the existence of supernatural phenomena, the cause of such spatial disorder is likely to be attributed to the blurred perception or the memory of a given character. The same goes for examples in which material identities, otherwise corresponding to the image, are distorted.
Film sound may not only represent the exterior surrounding the characters, or their subjective perception of it, but also the physical interior of a given character. In other words, we might identify a close correspondence between a point of view shot and the sound of breathing, a combination situating the audience inside the head of the character. This category of sound, which Chion has designated objective-internal sounds, includes most biosonic phenomena.
Time Correspondence Continuum
The time-correspondence continuum concerns the level of timing between sound and visual action. Close time-correspondence denotes tight synchronicity, while remote time-correspondence reflects an asynchronous relationship between a sound and its associated image. Between the two extremes lies loose or random synchronicity. As in the case of identity correspondence, the higher the level of time-correspondence, the more realistic the result produced. Chion suggests that loose synchronization, occupying the midfield of the time-correspondence continuum, yields a less naturalistic, more readily poetic effect (Chion 1994, 65). Finally, remote time-correspondence will ultimately push the sound outside the film’s world.
A certain level of time correspondence is required for the audience to perceive an interconnection between image and sound. Owing to the property of synchresis, any sound, regardless of whether it is remote from the image in terms of identity and space, may function as a link to visual action and vice versa, thereby establishing an apparent mutual motivation between the two. For instance, where there is close time correspondence between an action and a musical cue, sound often serves as an “alibi” or motivation for visual action, or vice versa. Accordingly, time correspondence provides the key with which the composer or music editor can unnoticeably access the film.
In the context of this writing, visual action in film not only refers to actions within shots but also to cuts between shots. In situations where the causes of the sounds heard are not visualized explicitly, synchronizing sound and film cuts (e.g. angle cuts) with each other may be enough to establish the sensation of close time correspondence. For example, in Lynch’s Eraserhead (1977), where the “industrial” soundtrack offers very limited correspondence with image in terms of time (within shots) as well as identity and space, it is simultaneous cuts in sound and picture which bind sound and image together.
Space Correspondence Continuum
In audio post production the means for operating within the space-correspondence continuum can be considered to include panning, sound level mixing, filters (e.g. equalization and crosstalk cancellers) and artificial reverberation. While the first three relate to the position of a given sound source, in that, for example, sound level and spectral brightness gradually decrease as the sound source moves away from the listener, the last relates to the actual space in which the source is situated. In film, the relations between the spatial makeup of different sounds associated with the image may to some extent contradict reality without compromising the perceived realism. Sound designer Walter Murch made a relevant analogy between spatial sound mixing and photography: just as the photographer, by focussing his camera, decides what the important elements of a given setting are, and hence what the spectator should look at, the sound mixer may, by way of low-pass filtering and gain reduction, spatially soften the sounds associated with specific events, thereby throwing them “out of focus” and guiding the attention of the spectator towards the central action (Murch 1998, 89). 9[9. Murch originally uses the example to illustrate how music, by use of spectral treatment (i.e. equalizing), can be brought into the background.]
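In signal terms, this “defocusing” is the combination of gain reduction and low-pass filtering. The toy sketch below is my own illustration, not a description of Murch’s actual practice; the one-pole filter and the parameter values are assumptions chosen purely for demonstration.

```python
# Illustrative sketch of "defocusing" a sound: attenuate it (gain
# reduction) and darken it (a simple one-pole low-pass filter). The
# filter model and parameter values are assumed for demonstration only.

def defocus(samples, gain=0.4, alpha=0.2):
    """Attenuate and darken a mono signal (list of floats in [-1, 1]).

    alpha in (0, 1]: smaller alpha means stronger high-frequency loss.
    """
    out = []
    y = 0.0
    for x in samples:
        y = y + alpha * (x - y)  # one-pole low-pass: y[n] = y[n-1] + a*(x[n] - y[n-1])
        out.append(gain * y)
    return out

# A unit impulse comes out attenuated and smeared in time: "out of focus".
print(defocus([1.0, 0.0, 0.0], gain=0.5, alpha=0.5))  # -> [0.25, 0.125, 0.0625]
```

Applied selectively to the sounds of peripheral events while the central action is left untouched, this kind of processing reproduces, in miniature, the photographic depth-of-field analogy.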
“Focussing” through spatial sound mixing may also be utilized to compensate for the low spatial resolution of theatre sound projection resulting from the limited number of loudspeakers available. For example, while in real life our binaural hearing system allows us to filter out one voice from a group of voices according to its carrier’s spatial position, say in a pub, when such a scene is recorded and projected over a limited number of loudspeakers we merely hear a pool of voices. In order to compensate for this reduction, all but the one voice determined to be the object of our attention may be reduced in level.
While the above examples, since they apply to the upper part of the space-correspondence continuum (i.e. close space-correspondence), generally articulate what may be called an “objective space”, lowering the level of correspondence may bring about the sensation of what can be designated “subjective” or “portal space”. The former occurs when the manipulation of sound space conveys to the audience the perceptual focus of a character or a group of characters, for instance on a threatening sound whose source is not yet visible. In this case, all sounds connected with the image are reduced in level in order to make room for the approaching sound. Another type of subjective space is achieved by setting the “sound focus” opposite to that of the camera. For example, the sounds associated with the visually in-focus actions of a character in a given situation may be softened compared with the sounds emanating from the surrounding, visually out-of-focus actions, in this way indicating that the character is distracted from the situation.
Portal space, on the other hand, is the result of juxtaposing two spaces essentially remote from each other, one suggested by image and the other by sound. The effect indicates that a voice, a character, an object or an entire setting occupies a space distant from the spatial setting of the film, for instance the memory or imagination of a character. When applied to a voice, a character or an object within the image, spatial distancing prevents them from “physically” entering the image or, in other words, turns them into phantoms. When applied to an entire setting it will prevent the audience from “entering” it, making them reside in the more realistic spatial setting of the film, “watching” the sequence in question from a distance. In most cases, the articulation of portal space relies upon the use of reverberation as it provides the means of creating unambiguous mismatches between the space implied by sound and that of the image unless, of course, the actions take place in a supposedly reverberant space. Portal space is also the means for separating supernatural beings or events from apparently “real” ones. For example, when Obi-Wan Kenobi, after his physical death in Star Wars IV, reappears, however slightly nebulously, in Star Wars V, his voice is coated with reverberation clearly contrasting the exterior location in which he is situated, thereby suggesting that he is a mere ghost, the spirit of which belongs to another realm (i.e. the force).
In terms of spatial positioning, the voice may, more than any other sound element associated with image, contradict the image without evoking any sense of spatial remoteness in the conscious minds of (most of) the audience. This is due to a convention of cinematic presentation allowing, for example, a soft-spoken dialogue to be heard “close up” although the characters speaking are situated at a considerable distance from the “spectators’ point of view”, or a telephone receiver to be heard as if we were holding it next to our own ear in an otherwise objective space. When conforming to this convention, dialogue evidently ceases to have any significance as a spatial articulator.
Symbol Correspondence Continuum
Although sound may not correspond to image in terms of identity, time and space altogether, it may do so in a symbolic way. Symbol correspondence bases itself upon codes either culturally derived or generated within, and thus unique to, a given film. Culture-based symbol correspondence occurs when a sound fits the mood of an event or a character, a feature traditionally assigned to film music. In this respect close symbol correspondence has a certain equivalence to Chion’s concept of “empathic music”, while remote symbol correspondence could be considered to embrace “anempathic music”, that is, music which “exhibits conspicuous indifference to the situation” (Chion 1990, 8). Yet, in certain cases, although indifferent to the imaged situation, music may contract a symbolic relationship with more abstract phenomena associated with the image. Consider, for example, the moment in Lynch and Frost’s Twin Peaks (1990–91) in which the police unveil the face of the dead Laura Palmer. While there is nothing beautiful about this situation (except Laura Palmer looking like a bride), an accompanying melody of beauty culminates exactly at this point. What is at play here is a “horror music code” which Robert Spande has denoted “the sublime”, the effect of which is “a ‘haunting’ evocation of a realm unrepresentable. Namely, death.” (Spande filmsound.org)
Unique to a specific film, internal symbol correspondence may be designed by the composer or sound designer by deliberately associating a specific, otherwise remote sound with a character, an object or an event. If this sound is subsequently played in combination with an image from which the associated Gestalt is absent, we experience a remote symbol correspondence. Because of the established association, however, the sound will convey the knowledge of the Gestalt into the imaged situation, eventually evoking a sensation of its presence. As the sound in question is not associated directly with the images through identity, time and space, but rather comments from outside the film’s world, this knowledge is shared with the audience but kept secret from the characters within the image. It goes without saying that the concept of internal symbol correspondence parallels Wagner’s Leitmotif or Berlioz’s idée fixe. Returning to Twin Peaks, the melodic theme which on its first appearance was associated with the death of Laura Palmer later accompanies, for example, the love scenes between Donna and James, who were close friends of Laura, giving the impression that Laura, although dead, is aware of their actions. In this example the symbol correspondence between the imaged situation and the tender music is close, while the internal symbol correspondence, due to the absence of Laura, is remote.
Sound Effects Poaching on the Territory of Film Music
Since structural articulation by means of symbol correspondence relies primarily on overall remote correspondence in terms of identity, time and space, it has, not surprisingly, become the territory of film music. However, by somewhat loosening the correspondence with the image, other sound materials belonging to the film’s world (i.e. sound effects) may eventually come to carry out equivalent functions.
A case study in this respect is Forbidden Planet (1956), a major Hollywood film of its time, in which Louis and Bebe Barron’s pioneering work on the sound effects track literally grew into the role of a complete musical score. 10[10. As the score did not make use of traditional instruments, the original screen credit was changed from “electronic music” to “electronic tonalities” in order to comply with the American Federation of Musicians.] Consider, for example, the association of the muffled, pulsing sound with the invisible monster of the ID 11[11. “The id is a ‘cauldron full of seething excitations’ (Freud 1933, 73) of raw, unstructured, impulsive energies…” (Mitchell and Black 1996, 20)], that is, the monster manifested from the subconscious of scientist Dr. Morbius. On its first occurrence, the sound is paired with a point-of-view shot, too high up to represent the sight of a human being, approaching the spaceship which has just landed on the planet “Altair”, where Dr. Morbius resides. Despite its lack of apparent correspondence with the image, the abstract, pulsing sound is interpreted as the internal sound of a beating heart, leading us to imagine that the imaged sight is that of an alien creature. In other words, we identify a close identity correspondence between the equally objective-internal sound and image. Later, on the creature’s second return, leading to the murder of Chief Quinn, the pulsing sound is synchronized with the appearance of huge footprints in the dust. Hence, the previous identity correspondence ceases, while a close time-correspondence between sound and image is established. Operating generally within the mid-ground of the overall correspondence continuum allows the pulsing sound to place one foot in diegetic space and the other in non-diegetic space or, in other words, to play as sound effect and music at the same time, projecting a tense mood onto the images.
In the following scene, in which the main character, Skipper, examines a sculpture cast from the monster’s footprint, the pulsing sound appears for a third time, but without suggesting the actual presence of the monster. Due to the inherent ambiguity of the pulsing sound, and its lack of correspondence with the image, it now attains a purely musical quality. Yet, because of its previous association with the monster, the pulsing sound corresponds symbolically with the image of the foot sculpture, thus emphasizing the connection between the sculpture and the monster. Alongside this instance of internal symbol correspondence, a culture-based symbol correspondence, between the foreign, unsettling mood of the sound and the situation, is also present.
As the story develops, the pulsing sound gradually becomes associated symbolically with the image of Dr. Morbius, revealing to the audience a possible, otherwise unspoken, relation between him and the monster. Towards the end, in the process of realizing that the monster of the ID is the product of Dr. Morbius’ subconscious, Skipper confronts Morbius with the underlying clues. As Morbius apparently does not approve of Skipper’s line of thought, Skipper finally cries out: “Your mind refuses to face the conclusion!” At the moment the word “mind” is pronounced, the pulsing sound breaks the silence, providing the audience with the decisive clue as to what the unspoken conclusion is. Underlining the connection between Dr. Morbius and the monster, the sound is in the following moment once again associated with the monster approaching from outside. As the sound’s association shifts from Morbius to the monster, internal symbol correspondence accordingly shifts from remote to close. In the end, however, the pulsing sound comes to symbolize closely the unity of Morbius and the monster manifested from his subconscious, and when the two finally die together the sound ceases with them.
The success of the soundtrack in seamlessly integrating music and sound effects, providing the key for composers/sound designers to make strong contributions to the overall structure of the film, may to a large extent be attributed to the almost exclusive use of purely electronically generated sound. Compared to real-world, recognizable sounds, which we tend to associate with specific objects, abstract electronic sounds can be associated with a variety of fictitious sources. Moreover, in situations where overall correspondence between sound and image is inevitably remote, real-world, recognizable sounds tend to remain associated with the film’s world, yielding rather surrealist results, while abstract electronic sounds are able to exit entirely from the diegetic space of the film. For instance, in the opening of Forbidden Planet a continuous electronic soundscape linearizes a sequence of cuts between shots of the interior and exterior of the spaceship. When the sound is associated with the exterior shots, showing the spaceship flying by, an audiovisual correspondence is established between the rising and falling pitches and the vertical movement of the spaceship (i.e. spatial identity correspondence), hence making the audience perceive the soundscape as sound effects. Conversely, when associated with images of the spaceship’s interior, with which the electronic soundscape lacks overall correspondence, the soundscape is pushed outside the film’s world, attaining a purely musical function.
Sound and Imagination
Biased towards exploiting “abstract” electronic sound for its associative independence, and hence its ability to correspond closely with any number of non-real objects, the science fiction genre provides, more than any other film genre, a foundation for the integration of music and sound design. Consider, as another brief example, the equivocal nature of the muffled soundscape, presumably created by composer Edward Artemiev, permeating Tarkovsky’s Solaris (1972). When associated with the image of the planet “Solaris”, a close correspondence between sound and image in terms of identity and space is articulated, causing us to believe that the sound emanates from the planet. Yet, on most occasions the soundscape does not correspond with the image in terms of identity, time and space, but in terms of cultural and internal symbolism, expressing the planet’s psychological impact on the characters. As in Forbidden Planet, the soundscape’s ability to connect with both the diegetic and non-diegetic space of the film is exploited in the articulation of Solaris’ overall narrative structure.
What is, of course, not exclusive to the science fiction or fantasy genre is the concept of imagination, which easily carries one beyond the boundaries of reality. While the fantasy genre strives to make an imaginary world real, imagination, being a part of our everyday lives, may well be reflected in films without compromising an otherwise established “reality”. Remarkable “non-fantasy” films blurring the line between “reality” and imagination by means of audiovisual design include Lynch’s The Elephant Man (1980), with sound design by the director himself, Ford Coppola’s aforementioned Apocalypse Now (1979), and also his earlier The Conversation (1974), both with sound design by Walter Murch. 12[12. In The Conversation, Murch is credited for “sound montage” and “sound re-recordist”.] In the latter we identify a transition from overall to partially remote correspondence between sound and image, or film music gradually entering the diegetic space of the film. In the first part of the film the music, performed on solo piano, acts like a traditional score. However, as the mind of the main character, surveillance expert Harry Caul, from whose point of view the story is told, becomes increasingly distorted, preventing him from distinguishing imagination from reality, the pure piano sound equally becomes distorted. Towards the end, as Harry forces himself to tear apart a small figure of the Virgin Mary, suspecting a microphone to be hidden inside it, the piano unmistakably sounds as if heard through surveillance equipment similar to that used by Harry himself. 13[13. According to Murch, who also carried out the music editing work for The Conversation, the distortion was generated using a synthesizer.] Consequently, a pathway between close and remote correspondence, or between the diegetic and non-diegetic space of the film, or ultimately between “reality” and imagination, is suggested.
In the end, having torn his entire apartment apart without succeeding in locating the hidden microphone, Harry sits on the floor playing the saxophone over the film music, in this way binding the two worlds together. While I initially suggested that the music gradually enters the film’s world, a related interpretation would be that Harry, in playing along with music otherwise residing in non-diegetic space, ultimately exits the real world.
Before proceeding to examples from Rocketman, it is worth noting, firstly, that in the above three films overlaps between picture and sound departments took place: obviously so in the case of The Elephant Man (the director and sound designer being the same person), while in the latter two films it resulted from Murch also working as picture editor. Secondly, these films, as well as many of the other films discussed, date back about three decades. On several occasions sound designer Randy Thom has implied that this period was a bygone era in which film makers were interested in exploring the potential of sound in film or, more specifically, in including sound as an equal collaborator at the pre-production stage, as opposed to merely regarding it as something to be added in the post-production process, when the structure of the film is already in place. 14[14. See, for example, Thom’s articles “Designing a Movie for Sound” (1999), “Confessions of an Occasional Sound Designer” (1995) and “More Confessions of a Sound Designer” (undated).]
All the way up until the final scene of Rocketman, I strove to retain a level of correspondence between image and sound or, in other words, to situate sound within the space of the film. The approach brought with it a number of constraints as to how to cope with specific structural issues. For example, where linearization of two subsequent shots representing two different locations was required, rather than utilizing traditional music, preconditioned to reside entirely outside the film’s world, I needed to search for sounds which would fit both locations within the space of the film. For instance, on several occasions a noisy, grainy sound, able to bond with both the image of the sea and that of a rocket, was used to linearize two shots in which the respective objects were present. The approach also brought with it a number of new possibilities. For example, it made it possible to create gradual transformations between “reality” and imagination, as demonstrated below.
Temporal Linearization using Sound Effects
The most significant example of temporal linearization by use of sound effects in Rocketman is found in the “dressing room” scene (Video 1). In terms of cinematography, two contradictory things happen at the same time. On one hand, the camera gradually approaches the main character, Sergej, focussing on his calmness and complete ignorance of what is going on around him. On the other hand, an accelerating montage, that is, a sequence of shots of increasingly shorter length, builds up an atmosphere of tension. Subsequently, we arrive at a steady close-up shot of Sergej which, however, soon cuts to what may be interpreted as a psychological shot, in other words, a shot in which the spectators see what the character sees, or imagines. Thus, during the course of the scene, a transformation from an objective to a subjective perspective takes place.
In the process of creating sound for the scene I realized that Foley and sound effects tended to support the accelerating montage, making the scene even more hectic and hence preventing viewers from identifying with the calm Sergej, the reason being that the sounds associated with the vast number of actions within the images inevitably also coincide with the cuts of the montage. In other words, close time-correspondence occurs not only between the sounds and their associated actions within the shots but also between the sounds and the cuts between shots, in this way accentuating the rhythm of the accelerating montage. Because of their inability to identify with Sergej, spectators are inclined to interpret what was otherwise intended to be a psychological shot as an objective or establishing shot. In other words, rather than considering the shot showing an empty window as a product of Sergej’s imagination, viewers may experience being taken to a different time and space.
Therefore, in order to gradually transport the spectator from watching Sergej “from the outside” to taking on his point of view, it became necessary to work against the accelerating montage and instead follow the slow and steady zooming-in on Sergej. The solution was to add an overall steady rhythm to the scene, derived from Sergej’s space suit, which is brought into the image early in the scene, and which in any case comes to represent his shield from the outside world. The rhythmic sound of the space suit’s respirator stabilizes the temporal flow of the images, and minimizes the tension otherwise created by the accelerating montage. The change from objective to subjective perspective is achieved by gradually softening Foley sounds which do not follow the rhythm, while exaggerating, by means of saturation, those which do. In this way, besides supporting the established rhythm, the process reveals Sergej’s obliviousness to his noisy surroundings. Furthermore, the brightness of the reverberation corresponding to the acoustic properties of the setting decreases as Sergej’s ears become blocked while he dresses. Finally, upon arriving at the close-up shot, viewers have “entered” Sergej, and believe that the window shot is a picture from his imagination.
As a whole, the scene exemplifies a gradual transformation from overall close to slightly remote correspondence in terms of identity, space and, to some extent, time: identity as Foley and sound effects are gradually exaggerated, space as brightness decreases, and time as Foley sounds not fitting into the overall rhythm are gradually removed. While Foley and sound effects evolve into a rather musical sequence, and indeed carry out functions usually achieved by use of traditional music, they can be categorized neither as pit music nor as screen music: not as pit music, because the sounds are easily associated with elements within the images, and not as screen music, because the sounds are organized according to a filmic logic rather than the other way around.
Just as in the dressing room scene, in terms of space the entire first part of Rocketman concentrates solely on transitions between the objective and the subjective or, in other words, between the objective space of the film and the main character Sergej’s perception of it. The sense of realness evoked by the exclusive use of objective and subjective space, however, becomes challenged in the scene in which Sergej fixes the satellite (Video 2), when suddenly a huge amount of reverberation is added to sounds whose sources, according to the image, inhabit an open space. The resulting mismatch between the enclosed space implied by the sound and the open space suggested by the image evokes the sensation of portal space. Adding to the foreignness of the scene, a slightly bell-like and musical quality emerges as a result of the reverberation applied to tiny “pling” sounds associated with the image of neon lights continuously switching on. Ultimately, although corresponding somewhat closely with the image in terms of identity and time, the associated sound seems to frame the image rather than emerge from it, hence temporarily transcending the diegesis of the film. The ambiguous space correspondence between sound and image challenges viewers’ ability to distinguish between fantasy and reality, almost putting them in the place of the main character Sergej, who is completely absorbed in fixing the satellite, neglecting the fact that he has just broken off contact with mission control and is in great danger.
Towards the end of the film, after having repaired the satellite, Sergej finally awakes from his reverie, realizing that his spaceship has left without him, that he is running out of air, and that he is going to die (Video 3). Experimenting with sound effects and Foley sounds for the scene, for example that of Sergej’s breathing slowing down, indicating that he is running out of air, I realized that such sounds yielded a rather conceptual and deceptive result: a sort of “emotional vacuum” in which the spectator would merely witness the physical death of Sergej. For the first time in the film I chose to use a sound lacking overall correspondence with the image in terms of identity, time and space. Yet the sound, being the most melodic and beautiful of the entire film, corresponds with the image in two symbolic ways. On one hand, a cultural symbol correspondence is established between the haunting sound and the unrepresentable fate of Sergej (cf. Spande). On the other hand, since the sound is extracted from, and thus refers to, the sound earlier associated with Sergej indulging himself in fixing the satellite (i.e. neon lights switching on), an internal symbol correspondence occurs. Hence, the sound functions as a hint to the audience as to the very reason why Sergej is trapped in space, in this way emphasizing the tragedy.
Another example of internal symbol correspondence, exclusively based on the use of sound effects, is the recurring muffled sound of the ocean, associated with a variety of images. In the opening of the film, where Sergej is standing in the kitchen watching the sea through the window (Video 4), the sound is associated with the image of the sea. Over it we hear Sergej’s quiet, internal breathing, suggesting a subjective point of view, and the two sounds combine to represent Sergej’s peace of mind. In the following long section of the film Sergej is exposed to stressful situations. He is taken to the space station, and transported into space where he is to fix the satellite. When a software error causes the mission to fail, and mission control starts to blame Sergej, he decides to cut communication with them and enter the satellite to fix it in his own way. As he enters (Video 5), the sea soundscape from the opening of the film returns. Although it is now associated with the image of the satellite’s interior, the reference symbolizes that Sergej is regaining his peace of mind. Towards the end of the film, having completed the mission, Sergej finds himself in total peace, symbolized by the final return of the sea soundscape (Video 6). This time the sound leads Sergej’s thoughts back to the kitchen where we first heard it.
Although the sea soundscape is associated with different images, in all three occurrences a level of overall correspondence is retained, hence situating the sound within the space of the film. In the first instance (Sergej watching the sea through the window), a close identity correspondence between the noisy sound and the image of the sea, as well as a close time correspondence between the gradual amplitude and spectral changes of the sound and the image of the waves, are evident. However, the dull quality of the sound, suggesting it is heard from a distance, combined with the rather close-up shot of the sea, results in a slightly remote space correspondence. On the second return (entering the satellite), the dull quality of the sea soundscape comes to correspond with the dark, rather undefined space of the images (i.e. close space correspondence), while identity correspondence, as we discover no sounding objects within the images, becomes remote. Time correspondence, however, remains close, as amplitude and spectral changes of the sound follow the movement of objects within the image. A similar overall audiovisual correspondence is evident on the sea soundscape’s third return (the completion of the satellite repair). However, this time the spectator may experience a close identity correspondence between the image of distant neon lights switching on and the sea soundscape, as the latter resembles the sound earlier associated with similar images.
Manoeuvring within the midfield of overall correspondence in terms of identity, time and space opens up ambiguities in audiovisual relations, thereby permitting internal symbol correspondence to occur. As in the examples above, this potential may be exploited in the articulation of the overall structure of a film.
Chion, Michel. Audio-Vision: Sound on Screen. New York: Columbia University Press, 1994.
Henriksen, Frank Ekeberg. “Space in Electroacoustic Music: Composition, Performance and Perception of Musical Space”. Unpublished PhD thesis. City University, 2002.
Mitchell, Stephen A. and Margaret J. Black. Freud and Beyond: A History of Modern Psychoanalytic Thought. Basic Books, 1996.
Murch, Walter. “Touch of Silence”. 1998. Soundscape, The School of Sound Lectures 1998–2001. Edited by Larry Sider, Diane Freeman and Jerry Sider. London: Wallflower Press, 2003.
Spande, Robert. “The Three Regimes: A Theory of Film Music”. Available online at http://www.uselessindustries.com/robobo/filmmusic.html Last accessed 8 August 2010.
Thom, Randy. “Designing a Movie for Sound”. 1998. Soundscape, The School of Sound Lectures 1998–2001. Edited by Larry Sider, Diane Freeman and Jerry Sider. London: Wallflower Press, 2003.
Wishart, Trevor. On Sonic Art. Edited by Simon Emmerson. Harwood Academic Publishers, 1996.
Yewdall, David Lewis. Practical Art of Motion Picture Sound. 2nd ed. Burlington MA: Focal Press / Elsevier, 2003.
Ballinger, Jacob, dir. Rocketman. Music and sound design by Martin Stig Andersen. With Christoph Theußl. 2007. More information can be found online on IMDB http://www.imdb.com/title/tt1210537
Carpenter, John, dir. Escape From New York. With Kurt Russell. 1981.
Coppola, Francis Ford, dir. Apocalypse Now. With Martin Sheen, Marlon Brando and Robert Duvall. 1979.
_____. The Conversation. With Gene Hackman. 1974.
Jackson, Peter, dir. The Lord of the Rings: The Two Towers. With Elijah Wood, Sean Astin et al. 2002.
Lynch, David, dir. Eraserhead. With Jack Nance and Charlotte Stewart. 1977.
_____. The Elephant Man. With John Hurt and Anthony Hopkins. 1980.
Lynch, David and Mark Frost, dirs./writers. Twin Peaks. With Kyle MacLachlan et al. 1990–91. [TV Series]
Kershner, Irvin, dir. Star Wars: Episode V — The Empire Strikes Back. With Mark Hamill, Harrison Ford, Carrie Fisher and Billy Dee Williams. 1980.
Tarkovsky, Andrei, dir. Solaris. With Natalya Bondarchuk and Donatas Banionis. 1972.
Wilcox, Fred M., dir. Forbidden Planet. With Leslie Nielsen, Walter Pidgeon, and Anne Francis. 1956.