FIXING HEADPHONES

A look at how unnatural headphones are and what is being done to fix them.



My Reading Room

THE PROBLEM WITH HEADPHONES 

For most people, headphones are the thing we reach for most often after our phones. We listen through headphones every day, yet most of us are oblivious to the fact that they are a deeply flawed way to listen to music, or to anything else for that matter.

In the real world, sound usually travels to our ears from a distance, and in doing so it interacts with the environment. Some of its energy might be absorbed by the surroundings, and some might be reflected. The sound takes on the character of the room before it reaches our ears, which is why your voice sounds echoey in a small room and why any venue where sound is important, such as a cinema or concert hall, requires a certain amount of acoustical treatment. Then there’s the matter of cross-feed, which is sound from the right speaker reaching our left ear and sound from the left speaker reaching our right. This helps us localize sound.

Headphones, by virtue of being so close to our ears, aren’t affected in the same way. Though they are still influenced by other factors, such as the headphone’s ear tips or pads and the shape of our ears, these are outweighed by the drivers’ proximity to our eardrums. Headphones are literally pumping sound directly into our ears and into our heads. This means there’s no interaction with a room and no cross-feed, which ultimately results in a lack of soundstage - audiophile speak for the acoustic image presented to the listener.

This also explains why some dyed-in-the-wool audiophiles insist on listening only to loudspeakers.
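To make the idea of cross-feed concrete, here is a minimal Python sketch of how it could be simulated for headphones: a delayed, quieter copy of each channel is mixed into the opposite one. The function name and the delay and gain values are illustrative assumptions, not figures from any product, and real cross-feed circuits typically also filter the signal that crosses over.

```python
def crossfeed(left, right, delay_samples=13, gain=0.3):
    """Mix a delayed, attenuated copy of each channel into the other.

    delay_samples and gain are illustrative guesses: 13 samples at
    44.1 kHz is roughly 0.3 ms, about the extra time sound from a
    loudspeaker takes to reach the far ear.
    """
    if len(left) != len(right):
        raise ValueError("left and right must have the same length")

    out_left, out_right = [], []
    for n in range(len(left)):
        # Sample of the opposite channel from delay_samples ago
        # (silence before the signal starts).
        from_right = right[n - delay_samples] if n >= delay_samples else 0.0
        from_left = left[n - delay_samples] if n >= delay_samples else 0.0
        out_left.append(left[n] + gain * from_right)
        out_right.append(right[n] + gain * from_left)
    return out_left, out_right
```

Fed a click that exists only in the left channel, the function returns the same click in the left output plus a later, quieter copy in the right output, which is a crude version of what happens when both ears hear the same loudspeaker.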

For many, the lack of soundstage is the primary problem of headphone listening. The goal of any good loudspeaker or headphone is realism, and the lack of a believable soundstage is an obstacle for even the best and most technologically advanced headphones. Compared to loudspeakers, headphones sound considerably more congested, and some even have a “stuck in your head” sound that makes it seem as if the sound is emanating from the middle of your skull. Obviously, this is completely unnatural and not the way any artist intended the listener to enjoy their work.

Fortunately, some of the greatest minds in the audio industry have been hard at work over the past few decades to come up with solutions. Here’s how audio companies are attempting to solve this problem.


CURRENT POPULAR HEADPHONE TARGET RESPONSES 

DIFFUSE FIELD CALIBRATION 

Based on the premise that the headphone should produce the same acoustic response at the eardrum as a loudspeaker in a diffuse sound field.

FREE-FIELD CALIBRATION 

Based on the premise that the headphone should produce the same acoustic response at the eardrum as a loudspeaker in a free field (e.g., an anechoic chamber).


HOW SHOULD HEADPHONES SOUND? 

The short answer to fixing headphones is a lot of research and development. Driver technologies have improved tremendously over the past decade thanks to new materials and new manufacturing processes. We now have drivers that are extremely rigid and well controlled, and that can reproduce sound across a wide frequency range with very little distortion. But all of this means nothing if we cannot get the headphone to sound right. And what is right?

Up until about five years ago, headphone manufacturers were designing headphones using a combination of diffuse-field and free-field target responses. The diffuse-field target response is based on the premise that a headphone should sound like loudspeakers in a reverberant room, whereas the free-field target response is based on the premise that a headphone should sound like loudspeakers in an anechoic chamber. Target responses are crucial in designing loudspeakers and headphones because they give engineers a reference for how their products should sound. But, as we all know, the average listening environment is neither a perfectly anechoic room nor a fully reverberant one. It is typically somewhere in between. And so, headphone engineers soon realized that a different target response was necessary.

One of the most prominent alternatives is the Harman target response. The product of research led by Dr. Sean Olive, an acoustic research fellow at Harman International, it is the result of a scientific approach to defining the ideal target response for headphones. Through extensive testing, Dr. Olive found that there was indeed a particular target response, or let’s say sound signature, that listeners preferred, and that it differed quite substantially from the diffuse-field and free-field target responses that headphone manufacturers had traditionally been using. The most obvious difference is that, in the Harman target response, bass levels below about 200 Hz are noticeably elevated compared to the diffuse-field and free-field targets. This closely mimics the in-room response of loudspeakers in a good listening room. In other words, Dr. Olive’s studies showed that the majority of listeners preferred headphones that sound like speakers in a room.



HOW DO WE GET THEM TO SOUND RIGHT? 

So, if headphones should sound like good loudspeakers in a room, how do we get them to perform like that? There are two ways: by design alone, or by a combination of design and digital signal processing. The first method is generally preferred by traditional audiophiles and purists who want as little digital intervention as possible. Headphone manufacturers who subscribe to this school of thought generally concentrate their efforts on finding the ideal target curve and then making drivers that match it as closely as possible. There are a couple of upsides to this approach. Such headphones generally sound cleaner and purer, and more even in tone. The downside is that they still don’t really sound like loudspeakers in a room, so most headphones designed and manufactured this way still lack a realistic soundstage.

As it turns out, our bodies also interact with sound, and that changes the way it reaches our ears and consequently the way we hear. Our head, the density of our skull, the shape of our ears, and our ear canals all transform sound and affect how we hear it. This phenomenon is referred to by industry insiders and enthusiasts as the head-related transfer function, or HRTF for short. For headphone listening, it is the transfer function that headphones must compensate for in order for an individual to hear the sound as it was intended.
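For the curious, here is a rough Python sketch of how an HRTF can be applied in practice. In the time domain an HRTF becomes a pair of head-related impulse responses (HRIRs), one per ear, and a mono sound is placed in space by convolving it with each of them. The function names and the toy impulse responses in the usage note are assumptions made up for illustration; measured HRIRs are hundreds of samples long and different for every listener.

```python
def convolve(signal, impulse_response):
    """Direct-form convolution of two sample lists."""
    out = [0.0] * max(len(signal) + len(impulse_response) - 1, 0)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def render_binaural(mono, hrir_left, hrir_right):
    """Filter one mono signal with a left-ear and a right-ear HRIR.

    Returns (left, right) sample lists meant for headphone playback.
    """
    return convolve(mono, hrir_left), convolve(mono, hrir_right)
```

With a toy pair such as hrir_left = [1.0, 0.0, 0.0] and hrir_right = [0.0, 0.0, 0.6], the right ear receives the sound two samples later and quieter than the left, which is the kind of cue the brain reads as a source sitting off to the left.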

Everyone has a different HRTF and it can vary wildly. These variations explain why individuals can arrive at differing conclusions about the same headphone. They also explain why headphones by themselves cannot recreate a convincing soundstage and sound like speakers in a room, at least not for everyone. To be able to accurately recreate the sensation of listening to speakers in a room through headphones requires a radically different approach.


A NEW APPROACH 

The use of digital signal processing to create virtual surround sound is not new. It has been around for decades, and Dolby Headphone is one example of such technology. Although it uses the principle of HRTF to generate positional audio cues from two-channel audio mixes, it didn’t sound good to most people because, as I mentioned earlier, everyone has a different HRTF and a single fixed digital signal processing profile isn’t going to be good enough. However, new methods and technologies are being developed to maximize the potential of digital signal processing.

Smyth Research was probably one of the first to put out a system that could realistically render the sound of listening to speakers in a room through headphones by using custom HRTFs. Released around eight years ago, it was called the Realiser A8 and it featured what Smyth Research calls Smyth Virtual Surround (SVS) technology. Its standout feature was that it could measure and create a custom HRTF profile of its user and then use digital signal processing to create the ideal target response, so that headphones would sound like speakers in a room.

Included with the Realiser A8 were measurement microphones able to capture the unique profile of each user’s head and ears. The only problem lay in capturing the HRTF of the user properly. Doing so required a speaker setup and an ideal listening room, which is terribly ironic if you think about it. However, with all the ingredients in place, audio journalists and reviewers claimed that listening with headphones through the Realiser A8 was almost as good as listening to a set of good speakers in a room. In his report, long-time audio journalist Steve Guttenberg said that “The Realiser A8’s spatial localization is 100% convincing.”

Recently, Creative came up with a more practical solution. Dubbed Super X-Fi, it made its debut at the Consumer Electronics Show 2018 earlier this year. Born from 20 years of research in room acoustics, human anthropometry, headphones, and music, it uses the same principle as Smyth Research’s Realiser A8, but with a twist. Realizing that it would be almost impossible for users to create their own personalized HRTFs with measurement microphones, a loudspeaker setup, and a listening room, Creative did the next best thing, which is to approximate users’ HRTFs using AI. 

Creative simplified the profile creation process by developing an app that takes key measurements of a listener’s head and ears. The app then cross-references these measurements against the large database of head and ear measurements and profiles that Creative built up while developing Super X-Fi. While the company admits that this won’t be as accurate and precise as measuring in a proper listening room, it is far more convenient. Once the profile is created by the app, it can be loaded into the SXFI amp, a small portable USB amplifier and DAC, and users can enjoy surround sound wherever they go.
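Creative has not published how its matching works, so the following Python sketch is only a guess at the general idea of cross-referencing measurements against a database: pick the stored profile whose measurements are closest to the listener’s. The function name, the choice of measurements, and the profile names are all hypothetical, and a simple nearest-neighbor lookup stands in here for whatever AI Creative actually uses.

```python
import math


def nearest_profile(measurements, database):
    """Return the name of the stored profile closest to `measurements`.

    `measurements` is a tuple of numbers (hypothetically head width and
    ear height in millimetres). `database` maps a profile name to a
    tuple of the same length. Plain Euclidean distance is an assumption.
    """
    if not database:
        raise ValueError("database is empty")

    best_name, best_distance = None, math.inf
    for name, reference in database.items():
        distance = math.dist(measurements, reference)
        if distance < best_distance:
            best_name, best_distance = name, distance
    return best_name
```

A listener measured at (152, 63) and compared against hypothetical stored profiles at (150, 62) and (165, 70) would be given the first profile, since it is the closer of the two. A real system would need many more measurements and a far larger database.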



THE ROAD AHEAD 

Now that Creative and Smyth Research have demonstrated that believable surround sound can be achieved through headphones, the next step is to adapt the technology for mainstream consumption. The problem with both companies’ approaches is that a personalized HRTF is required to get the best performance. Getting a personalized HRTF requires careful measurement, a loudspeaker setup, and a good listening room. These resources are out of reach for most listeners, and they are a major stumbling block to the widespread adoption of these technologies. Creative’s workaround of using apps and machine learning is innovative and ambitious, but the technology is still in its infancy. Initial feedback from early adopters, gathered through social media and forums, suggests that it is still hit-and-miss. Perhaps a larger database of heads and ears is necessary, but for now, the best possible surround sound experience in a headphone requires a personalized HRTF profile.


Art Direction and digital imaging by Ashruddin Sani

PICTURES 123RF, CREATIVE, PSB, SMYTH RESEARCH