How noise cancelling and face tracking enable more productive voice and video calls

Dave Hitchins

Senior Content Marketing Specialist

With our busy modern lifestyles and hybrid working arrangements, we often have to attend meetings and video calls in less than ideal circumstances. Whether we are in a busy shared office, at home with builders working, or in an airport departure lounge, background noise, bad lighting, and shaky cameras can make effective collaboration difficult.

In this blog we look at how technology can help to reduce the impact of unwanted noise, and of meeting participants who are on the move, to enable more efficient and enjoyable voice and video meetings.

We also look at how these features can be incorporated into an organization’s existing communication systems and infrastructure without the need to invest in completely new technology.

Reducing noise on voice and video calls

Let’s start with the most common and obvious complaint for anyone involved in a virtual meeting – unwanted background noise.

Just one person in a noisy environment can disrupt the entire flow of the meeting, making it an unpleasant and unproductive experience. Often the guilty person will be wearing headphones but using the mic on their laptop, so they are oblivious to the mayhem they are causing.

In addition to background noise, audio distortion caused by electrical interference, patchy mobile signal, or even someone’s metal lanyard clip bumping their microphone can result in jarring sounds for all meeting participants.

This is where technology can help – through a combination of approaches known collectively as denoising, which includes elements of noise cancellation and noise reduction.

While noise cancellation actively subtracts unwanted noise from an incoming audio signal, noise reduction encompasses various techniques that filter out and suppress noise. Noise reduction techniques are the most relevant tools for improving the quality of audio calls and online meetings.
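One of the simplest noise reduction techniques is a noise gate, which turns down any stretch of audio that is too quiet to be speech. The sketch below is only an illustration of that idea: the function names, frame size, and threshold are assumptions chosen for the example, and production call platforms use far more sophisticated processing.

```python
import math


def rms(frame):
    """Root-mean-square level of one frame of audio samples."""
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def noise_gate(samples, frame_size=4, threshold=0.1, attenuation=0.0):
    """Scale frames whose RMS level is below `threshold` by `attenuation`.

    Frames at or above the threshold pass through unchanged.
    All parameter values here are illustrative assumptions.
    """
    out = []
    for start in range(0, len(samples), frame_size):
        frame = samples[start:start + frame_size]
        gain = 1.0 if rms(frame) >= threshold else attenuation
        out.extend(s * gain for s in frame)
    return out


# Hypothetical samples: one frame of faint hiss followed by one frame of speech.
quiet_hiss = [0.02, -0.01, 0.03, -0.02]
speech = [0.5, -0.4, 0.6, -0.5]

gated = noise_gate(quiet_hiss + speech)
print(gated)  # the hiss frame is silenced, the speech frame is untouched
```

A gate like this only helps between words. It cannot remove noise that overlaps with speech, which is one reason more advanced approaches are needed.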

What is noise cancellation?

We are all familiar with noise cancelling headphones. You might be wearing a set as you read this, or maybe have a colleague who has to be prodded to get their attention when they are wearing theirs.

These headphones use something called Active Noise Control (ANC). This involves a system of microphones and speakers that generates sound waves with the same amplitude as the incoming ambient sound but the opposite phase. The two waves effectively cancel each other out, resulting in near-silence so that the wearer can work in peace or fully appreciate the music they are listening to.
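The cancellation principle can be shown with a few lines of arithmetic: adding a wave to an inverted copy of itself leaves nothing behind. The following is a minimal sketch of that idea only. The sample rate, the 200 Hz hum, and the function names are assumptions for illustration, and real ANC hardware has to estimate the noise in real time, so its cancellation is never this perfect.

```python
import math

SAMPLE_RATE = 8000  # samples per second (illustrative value)


def tone(freq_hz, n_samples, amplitude=1.0):
    """Generate a sine wave as a list of samples."""
    return [
        amplitude * math.sin(2 * math.pi * freq_hz * n / SAMPLE_RATE)
        for n in range(n_samples)
    ]


def anti_noise(noise):
    """Invert every sample: same amplitude, opposite phase."""
    return [-s for s in noise]


def mix(a, b):
    """Sum two signals sample by sample, as the air does at the ear."""
    return [x + y for x, y in zip(a, b)]


ambient = tone(200, 80)  # a hypothetical 200 Hz background hum
residual = mix(ambient, anti_noise(ambient))
peak_residual = max(abs(s) for s in residual)
print(peak_residual)  # what is left of the hum after cancellation
```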

However, ANC on its own is not an effective way of removing unwanted noise from voice and video calls. It is designed to quieten the environment around a single listener, whereas on a call the noise arrives already mixed into each participant's microphone signal: some have colleagues talking in the background, and others have different types of ambient noise. Other technology is therefore required.

What is AI noise cancelling?

AI noise cancellation is a new technology that uses machine learning algorithms to selectively filter out unwanted background noise while allowing desired speech or audio to come through clearly.

This is achieved by training deep learning models to be able to differentiate background noise from clean human speech so that the human voice can be enhanced, and the unwanted noises suppressed.

This is harder than you might think, as the human voice occupies a similar frequency range to many other natural sounds, and there is huge diversity in the human voice in terms of pitch, tone, and rate of speech. For machine learning to become effective at this process, it requires access to a huge quantity of audio data.

It is worth it though. Along with applications in the music, podcasting, and streaming industries, the technology is also an important facet of speech recognition systems, resulting in more accurate results for digital voice assistants, transcription services, and dictation software.

And of course, it is a significant benefit in business communication to remove background noise and audio distortion in teleconferencing platforms, Voice over IP (VoIP) services, and mobile communication networks.

So that is annoying background noise covered. What about video calls and the problem of keeping participants in frame?

What is face tracking?

Face tracking is a feature of video calls that automatically detects the human face on camera and keeps it in frame if either the person or the camera is moving.

Face tracking is the technological remedy for people on video calls who just can’t sit still. Maybe they are serial fidgeters, or they think better on their feet and are always walking around their home or office during a meeting, giving everyone else on the call a dose of motion sickness.

It is also used effectively in the education sector for distance learning where a teacher or lecturer is in front of a class of people with remote students watching via video link.

Facial recognition vs face detection vs face tracking

These three terms are sometimes used interchangeably, which can be confusing for people looking to introduce technology to improve video meetings.

For example, facial recognition and face detection are both used to refer to technology that detects the presence of a human face in a photograph or video feed.

Facial recognition can also refer to the identification of individuals based on their unique features, as used in retail store security solutions to prevent shoplifting, and in automated airport security systems.

In the context of video communication, a solution may use face detection to identify a human face and then face tracking to keep it in the center of the frame as the person moves about.
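The detect-then-track flow described above can be sketched as a small cropping routine. This is an illustration under stated assumptions, not how any particular product implements it: the `Box` and `FaceFramer` names are hypothetical, the face box is assumed to come from a separate face detector, and the `follow_rate` smoothing value is arbitrary.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """A rectangle in pixel coordinates (hypothetical helper type)."""
    x: float  # left edge
    y: float  # top edge
    w: float  # width
    h: float  # height

    @property
    def center(self):
        return (self.x + self.w / 2, self.y + self.h / 2)


def clamp(value, low, high):
    return max(low, min(value, high))


class FaceFramer:
    """Keeps a crop window centered on a detected face."""

    def __init__(self, frame_w, frame_h, crop_w, crop_h, follow_rate=0.5):
        self.frame_w, self.frame_h = frame_w, frame_h
        self.crop_w, self.crop_h = crop_w, crop_h
        self.follow_rate = follow_rate  # fraction of the gap closed per frame
        self.cx, self.cy = frame_w / 2, frame_h / 2  # start at frame center

    def update(self, face):
        """Return the crop window for this frame.

        `face` is the detector's bounding box, or None if no face was found,
        in which case the previous position is held.
        """
        if face is not None:
            fx, fy = face.center
            # Move only part of the way each frame to avoid a jittery picture.
            self.cx += self.follow_rate * (fx - self.cx)
            self.cy += self.follow_rate * (fy - self.cy)
        # Keep the crop window inside the camera frame.
        left = clamp(self.cx - self.crop_w / 2, 0, self.frame_w - self.crop_w)
        top = clamp(self.cy - self.crop_h / 2, 0, self.frame_h - self.crop_h)
        return Box(left, top, self.crop_w, self.crop_h)


framer = FaceFramer(frame_w=1280, frame_h=720, crop_w=640, crop_h=360)
face = Box(x=900, y=300, w=100, h=100)  # hypothetical detector output

first = framer.update(face)   # crop starts moving toward the face
second = framer.update(face)  # and gets closer on the next frame
held = framer.update(None)    # no detection: hold the last position
print(first, second, held)
```

Moving the window gradually rather than jumping straight to the face is a design choice: it trades a little lag for a steadier picture, which matters when the goal is to spare other participants from motion sickness.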

Enabling more productive voice and video communication

WebRTC (Web Real-Time Communication) is an industry standard that enables voice and video communication over the internet via webcams and microphones. In other words, it enables us to have high quality meetings from anywhere in the world using only our PC or mobile phone browser, without having to install any additional plugins or apps.

Without getting too technical, WebRTC is effectively a media engine exposed through a standardized JavaScript API. This means that it can be easily integrated into existing systems, which is how any business can benefit from the noise cancellation and face tracking features that we have discussed in this blog.

For example, with Infobip’s Voice solution, a number of use cases can be achieved using WebRTC and Call Link:

  • High quality voice and video conferencing for up to 15 participants, which includes automatic background noise cancellation and face tracking.
  • One-to-one calls between web browsers, mobile app users, and regular phone users which enables instant connections between customers and support agents.

Want to improve the quality of your calls and meetings?

Speak to an expert about noise cancelling and face tracking.

Dec 11th, 2023