Sound Reproduction Using Multi-Loudspeaker Systems

by Philippe-Aubert Gauthier

GAUS (Groupe d’Acoustique de l’Université de Sherbrooke), 2500 boul. de l’Université, Sherbrooke, Québec, Canada J1K 2R1

Parts of this article are also discussed in “An Introduction to the Foundations, the Technologies and the Potential Applications of the Acoustic Field Synthesis for Audio Spatialization on Loudspeaker Arrays” [PDF], by Philippe-Aubert Gauthier et al., an article on Wave Field Synthesis, a sound reproduction technique that reconstructs a virtual sound field over a large zone of the reproduction space. That article also concerns research and creation with loudspeaker networks of up to one hundred sources.

This article gives a short introduction to sound diffusion on multi-loudspeaker systems. The discussion concentrates mainly on classifying the technologies that reproduce the spatial character of natural hearing, and on situating sound diffusion on multi-loudspeaker systems within that classification. With such a technological map of spatial sound reproduction, it becomes easier to situate oneself in the present and with respect to future developments. The presentation was drafted with this concern for general clarification in mind. Although the review rests essentially on the technical dimension of the problem, it does not go into the details of the technologies; it is specifically built to answer the needs or questions that may arise during creation or artistic theorization. In the same spirit, current and future spatial description and, finally, the potential roles of artists are also addressed.

In his book "Le Son des Musiques", Delalande demonstrates a shift from music valued as a system of notes towards music valued essentially, or at least more and more often, for the aesthetic appreciation of sound as raw material. In this context, one adopts a specific definition of the aesthetics of sound: the auditory qualities of sounds, which can be described with adjectives such as sweet, rough, warm, granular, smooth and a plethora, even an infinity, of other linguistic possibilities. In the same study, an important parallel is drawn between this shift of interest and the evolution of sound recording and reproduction technologies. As the fidelity of sound reproduction on fixed media improved, new dimensions and aesthetic considerations entered music, which is now most frequently recorded. To take the single example evoked here, the progressive reduction of background noise in recordings gradually brought in the acoustical qualities of the reverberation of the recording space. Previously, when recorded music was produced, neither this reverberation nor its subtleties could be heard, so it was not a matter of aesthetic consideration or use. Given the high temporal and spectral fidelity of professional recording, this is obviously no longer the case.

By extrapolating from this observation, it seems valid to place current "spatial music" in a technological context of sound reproduction. Because the spatial character of natural hearing can now be reproduced, more or less faithfully depending on the technique, with commercial technologies (1) or technologies still under research (2), it is likely that artists feel, consciously or not, individually or collectively, the need to appropriate this new parameter made available by spatial sound reproduction technologies. Music has obviously always been a spatial experience, but this character was lost with the arrival of audio recording, which was initially monophonic. Inherent to music played by instrumentalists on acoustic instruments and now gradually recovered by technological means, this reproducible spatial character becomes a subject of aesthetic consideration when one deliberately (3) reproduces or uses it by means of current technologies.

If one accepts that using a tree branch as a tool already counts as a technology, spatial music, or its equivalent in audio art, can be situated differently. My personal opinion, which does not claim to exclude other possible explanations, stems from a point of view closer to research on virtual reality displays (auditory, visual and even tactile).

So, just as for hi-fi sound reproduction, one may suppose that the faithful reproduction of the spatial character of the auditory experience is part of a long-standing tendency of human society towards reproduction, or representation. Consider, as a first example, the development of the rules of graphical perspective, originally worked out with the technologies of the time, such as simple rulers and inks. Another obvious illustration of this will to represent is, of course, any figurative drawing or painting, ancient or recent. In a more current and technological vein, artificial intelligence, which tries to reproduce simple elements of natural intelligence, could also be part of this secret dream of representation and reproduction. The same goes for artificial evolution and genetic algorithms which, in music and musicology, are used to reproduce creativity or to study the evolution of music. Thus, by means of representation and reproduction, artistic works are built and certain phenomena are communicated, explained and understood. Although this dimension may be investigated more often by audio or conceptual artists, it can also be significant for composers.

Before continuing, one must briefly underline the practically equivalent meanings of two expressions on which this text, and certainly many other discussions and debates, rely. Because it is practically unthinkable to dissociate sound diffusion on "multi-loudspeaker" systems from an interest in spatialization, the expression "audio spatialization" is preferred for the rest of this text.

THE SITUATION OF MULTI-LOUDSPEAKER SYSTEMS IN THE TECHNOLOGICAL PARADIGM OF AUDIO SPATIALIZATION

While the previous section briefly touched on the intentions that may explain the existence of creative diffusion on multi-loudspeaker systems, this section concentrates on the technological methods suited to these hidden or explicit intentions. Obviously, the discussion is limited to general intentions, methods and techniques, because the conceptual and artistic approach to such technological possibilities is left to the artist's own intentions, as defined by his or her philosophy or style.

Among the research done on the subject, a first dichotomy can be introduced between the main possible approaches. The first class of methods is the simulation of perception; the second, following a school of thought concerned with reproducing a sound field over a more or less extended zone, is the simulation of sound fields. This first distinction between two general approaches is represented in Figure 1 and can easily be illustrated by some practical examples.

Fig. 1 – Classification of functional approaches and schools of thought in the artificial reproduction of the spatial character of hearing.

For example, what is usually identified as binaural technology belongs to the simulation of perception. Such technologies suppose, with reason, that reproducing the acoustic pressure at the listener's eardrums alone can be enough to reproduce an auditory spatial impression. The validity of this hypothesis rests on a well-known and verified fact: sounds reaching the ears undergo a frequency-dependent filtering that varies with their direction of arrival relative to the external ear (taken here to include the torso, the head and the pinna). This filtering, essentially attributable to the external ear, carries the binaural cues that allow the auditory localization of acoustic sources, because the binaural signals are slightly colored according to the direction of arrival and the distance of the source. This slight coloration allows the listener to localize acoustic sources unconsciously. Binaural technologies (4) for spatial sound reproduction thus belong to the simulation of perception because this solution includes a part of the chain of auditory perception, in this case the external ear. Cross-talk cancellation also belongs to the binaural techniques. Sometimes identified as transaural stereo, it is required when one wishes to cancel the cross-talk between the left loudspeaker and the right ear of a listener, and vice versa. Once this is done, standard loudspeakers with cross-talk cancellation can act as remote headphones for binaural reproduction.

Conventional sound recording techniques and stereophony, for their part, rely on the observation, or on more or less empirical knowledge, of perceptual phenomena to establish guidelines for manual mixing (as in phantom imaging by means of intensity stereophony or time stereophony). Once again, a part of the chain of the spatial perception of sound is taken into account, this time by means of perceptual experiments and observations of the results of a given set of “typical audio” operations (such as gains or time delays). This is also why conventional stereophony belongs to the broader simulation-of-perception approach.
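
The "typical audio" operations mentioned above, gains and time delays, can be sketched for a two-loudspeaker setup. The following minimal Python sketch assumes a constant-power (sine/cosine) pan law for intensity stereophony and a simple inter-channel delay for time stereophony. The function names, the pan range of -1 to +1 and the 1 ms maximum delay are illustrative assumptions, not values taken from this article.

```python
import math

def intensity_pan(pan):
    """Constant-power gains for a pan position in [-1, 1] (-1 = left, +1 = right)."""
    if not -1.0 <= pan <= 1.0:
        raise ValueError("pan must lie in [-1, 1]")
    angle = (pan + 1.0) * math.pi / 4.0       # map [-1, 1] onto [0, pi/2]
    return math.cos(angle), math.sin(angle)   # (left gain, right gain)

def time_pan(pan, max_delay_s=0.001):
    """Inter-channel delays in seconds; the channel opposite the image is delayed.

    max_delay_s is an assumed illustrative value.
    """
    if not -1.0 <= pan <= 1.0:
        raise ValueError("pan must lie in [-1, 1]")
    delay = abs(pan) * max_delay_s
    return (delay, 0.0) if pan > 0 else (0.0, delay)  # (left delay, right delay)

def pan_signal(mono, pan, sample_rate=48000, mode="intensity"):
    """Return (left, right) sample lists for a mono signal."""
    if mode == "intensity":
        g_left, g_right = intensity_pan(pan)
        return [g_left * x for x in mono], [g_right * x for x in mono]
    if mode == "time":
        d_left, d_right = time_pan(pan)
        n_left = round(d_left * sample_rate)    # delays rounded to whole samples
        n_right = round(d_right * sample_rate)
        n_max = max(n_left, n_right)
        left = [0.0] * n_left + list(mono) + [0.0] * (n_max - n_left)
        right = [0.0] * n_right + list(mono) + [0.0] * (n_max - n_right)
        return left, right
    raise ValueError("mode must be 'intensity' or 'time'")
```

With `pan = 0` both gains are equal and the squared gains sum to one, so the phantom image sits midway between the loudspeakers at constant total power. In the time-based mode the image is pulled towards the loudspeaker that plays first.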

Few systems are based on sound field simulation, because the technology and the physical concepts involved are usually more complex and require a fundamental background in acoustics and signal processing. In practice, both the number of reproduction sources and the amount of signal processing involved in sound field simulation are greater than those required by the simulation of perception. As mentioned briefly earlier, the hypothesis behind the simulation of sound fields is the following: by reproducing a sound field in the reproduction space (or in a more or less extended part of that space) with a faithful spatial distribution of the sound pressure, a complete hearing system (including the natural filtering by the external ears) is subjected to a physical stimulus that corresponds to the virtual stimulus, that is, the one that one tries to reproduce. Clearly, such a task is more physical in nature and cannot be achieved through a theoretical or intuitive understanding of spatial sound perception alone. Given the number of sources and the necessary signal processing, the lesser popularity of sound field simulation is understandable.

“Wave Field Synthesis” (WFS), as its name suggests, is a method intended for sound field simulation, because the system tries to reproduce (objectively or, less ambiguously, physically) a sound field prescribed for a zone surrounded by loudspeakers. Basic Ambisonic reproduction is also a form of sound field simulation, since a sound field is partly recreated at the center of a surrounding Ambisonic loudspeaker array. The creation of a synthetic directivity pattern around a reproduction source, which can include several loudspeakers, is another example of a technology that belongs to the simulation of sound fields. It is important to note that these two kinds of problem are fundamentally complementary. The first type (sound field reproduction inside a network of sources, as in WFS) is an interior problem and the second type (synthetic directivity reproduction around a complex acoustical source) is one that could be described as exterior. This new division, which may seem purely descriptive, has important repercussions on the physics and the technologies involved in these two complementary objectives.
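
To give a feel for the signal processing involved in the interior problem, the sketch below computes per-loudspeaker delays and gains for a virtual point source placed behind a straight line of loudspeakers. It is a deliberately simplified delay-and-attenuate model, assuming free-field propagation, a speed of sound of 343 m/s and a 1/sqrt(r) amplitude decay. A real WFS driving function also includes a frequency-dependent pre-filter and tapering windows, which are omitted here. All names and numerical values are illustrative assumptions.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature

def linear_array(n_speakers, spacing_m):
    """Loudspeaker positions (x, y) on the line y = 0, centered on x = 0."""
    offset = (n_speakers - 1) * spacing_m / 2.0
    return [(i * spacing_m - offset, 0.0) for i in range(n_speakers)]

def driving_parameters(source_xy, speakers):
    """Return one (delay_s, gain) pair per loudspeaker for a virtual point source.

    Simplified model: each loudspeaker replays the source signal delayed by the
    extra travel time from the virtual source and attenuated with distance.
    """
    sx, sy = source_xy
    distances = [math.hypot(x - sx, y - sy) for x, y in speakers]
    r_min = min(distances)
    if r_min <= 0.0:
        raise ValueError("the virtual source must not coincide with a loudspeaker")
    params = []
    for r in distances:
        delay = (r - r_min) / SPEED_OF_SOUND  # the nearest loudspeaker fires first
        gain = math.sqrt(r_min / r)           # normalized: nearest loudspeaker has gain 1
        params.append((delay, gain))
    return params

# Five loudspeakers spaced 20 cm apart, virtual source 1 m behind the array center.
speakers = linear_array(5, 0.2)
params = driving_parameters((0.0, -1.0), speakers)
```

Loudspeakers farther from the virtual source play later and more softly, so the individual wavefronts add up to an approximation of the curved wavefront that the virtual source would have produced. This also shows why the number of sources and the amount of processing grow quickly: every virtual source needs its own delay and gain for every loudspeaker.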

In this hierarchy, diffusion on multi-loudspeaker systems does not fit into a single specific compartment. The name “multi-loudspeaker”, although conveniently and effectively used in many situations, is too generic for that. It can refer to diffusion on “Surround 5.1”, to the simulation of perception, or even to the use of synthetic directivity sources arranged on a stage for a loudspeaker concert. Going further, the simultaneous presentation of digital audio material, differing from channel to channel, over a series of headphones could also be called diffusion over a multi-loudspeaker system. So, although it is difficult to assign presentation on several loudspeakers to exact parts of the organization of Figure 1, the expression resonates with many musical practices, and that is where it is fully useful.

With the exception of certain combinations and possible hybridizations, the classification of approaches presented in Figure 1 covers the set of possible, even conceivable, technologies in audio spatialization. Such a taxonomy is, however, not widely used; depending on the point of view, other qualifiers can be used to build classifications that focus instead on the results obtained (two-dimensional or three-dimensional systems, homogeneous or heterogeneous distribution of reproducible directions, etc.).

CURRENT AND FUTURE SPATIAL DESCRIPTION

The advent and standardization of new structured multimedia data formats such as MPEG-4 will also be one of the key factors in the development and flourishing of spatial sound reproduction. This type of format, which can of course contain audio tracks without any data compression, can also include signal processing instructions and descriptions of virtual scenes. For this reason, MPEG-4 formats have caught the attention of those concerned with interaction with virtual environments. With such an approach one speaks of spatial encoding rather than channel encoding. With the traditional methods of stereophony (channel encoding), mixing operations on a variable number of channels were needed to reach the desired scene manually. By contrast, the composer who works with spatial encoding will be much closer to the virtual scene that he or she tries to describe, and the contact between the work and its creator should be more immediate. Finally, it is the reproduction system that decodes the virtual scene in order to present it to the audience.

It is now interesting to return to Figure 1, which initially classified the possible methods for reproducing the spatial character of the auditory experience. Regarding the future of such technologies, one may wonder about the compatibility, the complementarity or even the competition between the simulation of sound fields and the simulation of perception. At present, opinions vary: some swear only by the simulation of perception, while others believe only in sound field simulation. Roughly speaking, when a sound system addresses a single listener, the simulation of perception is surely the most attractive option, because it remains simple to implement with a minimum of equipment and specialized knowledge. For a system addressing a whole audience, sound field simulation would obviously be more effective. The major advantage of the simulation of perception is essentially the simplicity of its implementation. With a whole audience, however, such a simulation (here referring to binaural technologies) would inevitably have to include a tracking device to take into account the movements of each listener, so a level of complexity equivalent to that of sound field simulation is quickly reached. This is, for example, an argument in favour of sound field reproduction for concert applications.

POTENTIAL ROLES AND IMPLICATIONS OF THE ARTISTS

The previous sections have already implicitly suggested what the implications and potential roles of the artist could be within the broad field of spatial sound reproduction. Roughly, and in my personal opinion, there seem to be several possibilities and tendencies. They do not all necessarily go in the same direction, but they certainly complement each other.

Appropriating the new techniques of spatial sound reproduction while they are still at the research or development stage seems to be one of the main ways to avoid artists receiving the results of technological developments only once they are finished. Stated briefly, one would encourage artists to formulate theoretical or artistic commentary on the technologies currently being developed, so that collaborations and mutual influences can take place between composers and researchers. It is desirable that such collaborations between artists and researchers become a reality, so that technological productions meet the real needs of the users, such as composers, who are the ones mainly concerned by such technical developments. In contrast to this first possibility, some artists will have to appropriate technologies in an independent and autonomous way, by hacking them or by criticizing them. This second alternative is also important, so that artistic practice remains autonomous and is not towed along by commercial technologies. Generally, this second avenue presupposes a certain understanding of the technologies on the artist's part.

Now, what should the artist expect, and how should he or she prepare for change when not necessarily involved in the development of technological novelties? The first thing to do is certainly not to worry too much about the longevity of works made for current configurations such as "Surround 5.1". If the composer makes sure to keep a record of the configuration of reproduction sources used for a given work, the compatibility of future technologies with present ones will ensure the longevity of the works in question.

This compatibility holds because the notion of spatial encoding should gradually replace channel encoding. As noted previously, spatial encoding is based on the description of a virtual scene rather than on mixing for a set of transmission or recording channels. A virtual scene includes information about the positions of the virtual sources, the types of the virtual sources, and which digital audio file feeds each virtual source. Many techniques, nearly all except conventional stereophony, are directly affected by spatial encoding and can therefore include a description of a virtual scene. To achieve compatibility with systems such as 5.1, one has to build the corresponding loudspeaker layout virtually. This is what is sometimes identified as "virtual panning". In the worst case, it remains easy to rebuild a real multi-loudspeaker system to accommodate an older format.
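
As a rough illustration of spatial encoding, the sketch below describes a virtual scene as a list of sources (position, type, audio file) and rebuilds a 5.1 work as such a scene by placing one virtual source at each loudspeaker position. The class and field names, the file names and the loudspeaker azimuths (0°, ±30° and ±110°, a commonly used 5.1 layout) are assumptions made for illustration; this is not MPEG-4 syntax. The LFE channel is left out of the sketch because it carries no direction.

```python
from dataclasses import dataclass, asdict
import json

@dataclass
class VirtualSource:
    name: str
    azimuth_deg: float   # 0 = front, positive = towards the left (assumed convention)
    distance_m: float
    source_type: str     # e.g. "point" or "plane_wave" (hypothetical labels)
    audio_file: str      # digital audio file feeding this virtual source

# Assumed azimuths for the five main channels of a 5.1 layout.
SURROUND_5_1_AZIMUTHS = {"L": 30.0, "R": -30.0, "C": 0.0, "Ls": 110.0, "Rs": -110.0}

def scene_from_5_1(channel_files, distance_m=2.0):
    """Turn a channel-encoded 5.1 work into a scene of virtual loudspeakers."""
    scene = []
    for channel, azimuth in SURROUND_5_1_AZIMUTHS.items():
        scene.append(VirtualSource(
            name=f"virtual_{channel}",
            azimuth_deg=azimuth,
            distance_m=distance_m,
            source_type="point",
            audio_file=channel_files[channel],
        ))
    return scene

def scene_to_json(scene):
    """Serialize the scene so that any reproduction system can decode it later."""
    return json.dumps([asdict(source) for source in scene], indent=2)

# Hypothetical file names, one per channel of the original 5.1 work.
files = {channel: f"work_{channel}.wav" for channel in SURROUND_5_1_AZIMUTHS}
scene = scene_from_5_1(files)
```

The scene stores where each sound should appear rather than which physical loudspeaker plays it, so the same description can be handed to whatever reproduction system is available.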

CONCLUSION

By way of a short conclusion, let us simply note that the future of spatial sound reproduction seems promising and offers many new possibilities, which will certainly have considerable repercussions on the ways of working with space as a musical parameter. The vitality of much current research on the reproduction of the spatial character of hearing also opens the door to collaborations and exchanges between technical and artistic circles.

Notes

(1) Examples of commercial technologies would be all the "Surround" ones, in various configurations such as 5.1 or any other, hacked or not. Applications of binaural techniques are also available on the market.
(2) Techniques at the research stage include, for example, cross-talk cancellation within the framework of binaural techniques, the synthesis of sound fields such as "Wave Field Synthesis", the creation of sources with synthetic directivity, and higher-order Ambisonics.
(3) The notion of artistic will is important here since, initially, it is physical constraints that make the spatial character of played music inevitable, because the musicians are distributed in space and because acoustic instruments have a directional sound radiation pattern.
(4) These can include binaural recordings with a manikin, binaural synthesis, or binaural filtering of direct sound recordings by the measured impulse responses of external ears.

References

The following documents offer some interesting reading on the subject and support some parts of the present article.

F. Delalande, Le Son des Musiques : Entre Technologie et Esthétique, Buchet/Chastel, 2001.
E.R. Miranda, "At the Crossroads of Evolutionary Computation and Music: Self-Programming Synthesizers, Swarm Orchestra and the Origins of Melody", Evolutionary Computation, vol. 12, no 2, p. 137-158, 2004.
J. Blauert, Spatial Hearing, The psychophysics of human sound localization, The MIT Press, 1999.
P.R. Cook, Real Sound Synthesis for Interactive Applications, A K Peters, 2002.
J. Plogsties, O. Baum, B. Grill, Conveying spatial sound using MPEG-4, Proceedings of the AES 24th International Conference, 2003, p. 58-65.
E.D. Scheirer, Structured audio and effects processing in the MPEG-4 multimedia standard, Multimedia Systems, 1999, vol. 7, p. 11-22.
M. Jessel, Acoustique Théorique : Propagation et Holophonie, Masson et cie, Paris, 1973.
F. Rumsey, Spatial Audio, Focal Press, 2001.
M.F. Davis, History of spatial coding, Journal of the AES, 2003, vol. 51, no 6, p. 554-569.

eContact!
