Customer Success Story: Zebra Live meets European Publishing Congress

When language is no longer a barrier: the power of composite AI and how live events are becoming more accessible

Chris Guse is a true all-rounder. Together with Su Holder, his fellow managing director at BosePark Productions GmbH, he hosts a podcast about podcasts, and he is a creative visionary. Chris is a partner in various companies, including the podcast production company BosePark Productions, the AI consultancy Berlin Hills GmbH and the AI translation company Zebra Live GmbH. So it’s no wonder that he is a true early adopter when it comes to technology.

Audio AI – the underestimated potential alongside large language models?

Artificial intelligence (AI) is already commonplace in the text sector, and many people use it on a daily basis. AI is no longer a novelty in the audio sector, either: every smartphone can transcribe and accept voice commands. But what does it look like in practice? According to Chris Guse, many processes in audio content production are still very manual and AI support is not widespread. Yet, the potential is huge – for almost all of us. If you want to find out more, you should listen to this podcast episode by Chris and Su.

The European Publishing Congress 2024 in Vienna, organized by Medienfachverlag Oberauer, brings together publishers and media companies from all over Europe to discuss innovation and change. This year, Chris Guse was also invited as a keynote speaker on the topic of innovation and podcasting, but that was not his only involvement. Together with Zebra Live GmbH and Managing Director Nino Mello Wagner, he also tested live transcription and interpreting based on AI for the first time.

Events without a language barrier

Every year, thousands of conferences take place that require translation. English is often chosen as a compromise, and only rarely can organizers afford simultaneous interpreters and the necessary technology. Together with aiconix Live, Chris has developed a solution to make events possible without language barriers – thanks to Composite AI!

Composite AI

Composite AI refers to the combined application of different AI techniques to improve the efficiency of learning and solve a wider range of problems more effectively. By integrating different methods and disciplines of AI, it becomes more powerful and versatile overall, just as our brain can combine different tasks to be more efficient. The following examples demonstrate this:

STREAMING PLATFORMS: Streaming services use Composite AI to offer personalized content. By combining machine learning, language understanding via natural language processing (NLP) and knowledge graphs, the service can accurately predict which series a user might like. This improves the user experience and increases customer loyalty.

EMERGENCY CALLS & CALL CENTERS: Emergency call centers use Composite AI to process incoming calls more efficiently. By combining speech recognition and decision support systems, emergency calls can be categorized and routed faster. In the future, language barriers can also be overcome live during the emergency call, resulting in targeted assistance and fewer communication hurdles.

Composite AI as a key technology

To make the international conference in Vienna more accessible, it needed to be translated into five languages, including Czech and Dutch. Zebra Live GmbH solved this using Composite AI.

The audio signal from the stage was transcribed live using aiconix – including for accessible live subtitles – and translated into five different languages. These translations were then converted back into audio using text-to-speech and output as separate channels, allowing the audience to hear the presentations in their own language, spoken by a pleasant computer voice, via a headset. Normally, such a conference requires special headsets, interpreters, booths and a radio system; today, everyone has a cell phone with headphones. Both setups were in use at the conference in Vienna, so visitors could listen live to either the AI translation or the human interpretation.
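The fan-out described above – one live transcript feeding several language channels, each with its own text-to-speech voice – can be sketched in a few lines. This is a minimal illustration of the structure, not the aiconix API; all names and the dummy translator are assumptions.

```python
# Sketch of the live pipeline: one transcription stream fans out into
# per-language channels, each of which would feed its own TTS voice.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Segment:
    text: str          # translated text for one output channel
    language: str      # language code of that channel

def fan_out(transcript: str,
            languages: list[str],
            translate: Callable[[str, str], str]) -> list[Segment]:
    """Translate one transcript segment into every target language."""
    return [Segment(translate(transcript, lang), lang) for lang in languages]

# Dummy translator standing in for the real machine-translation step.
def dummy_translate(text: str, lang: str) -> str:
    return f"[{lang}] {text}"

channels = fan_out("Welcome to the congress.",
                   ["de", "fr", "nl", "cs", "it"],
                   dummy_translate)
for seg in channels:
    print(seg.language, seg.text)
```

Because each language is an independent channel, adding a sixth language is just one more entry in the list – which is exactly why "more languages are simply possible" with this approach.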

So, is AI a more cost-effective alternative to simultaneous interpreting?

Currently, this process, which chains several AI solutions, takes up to 20 seconds, and computer-generated voices are still developing rapidly. In this most demanding discipline of translation, the human brain is still faster, and real voices remain more pleasant in their modulation and emphasis.

However, simultaneous interpreting is very expensive, and the work is also very strenuous, especially for all-day conferences, which require at least two interpreters per language. Yet the first test at the European Publishing Congress showed that the benefits go beyond accessibility and new audio tracks – more languages simply become possible – and extend to content creation, which in combination is a game changer.

Secondary use: AI-based podcasts

As a podcast enthusiast, Chris Guse knows that a content-rich conference and recording is pure content gold. So why waste it? The transcript from aiconix Live was used to feed not only the generative audio AI, but also a language model that generated summaries of the presentations. These summaries were created using a specially developed engine called Audiomatika, which was developed at BosePark Productions for this purpose, so that each presentation slot received an automatically generated summary as an audio file. The entire conference is available as a special podcast episode (Spotify), and the text summaries are also suitable for minutes or event follow-up.
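The secondary-use step above – grouping the live transcript by presentation slot and producing one summary per slot – can be sketched as follows. This is a hedged illustration only: the `summarize` stub stands in for whatever language model the Audiomatika engine actually calls, and all names are assumptions.

```python
# Illustrative sketch: group transcript segments by presentation slot,
# then produce one summary per slot (as Audiomatika does with an LLM).
from collections import defaultdict

def summarize(text: str) -> str:
    # Placeholder: a real engine would call a language model here.
    # This naive stand-in just keeps the first sentence.
    return text.split(".")[0] + "."

def summaries_per_slot(segments: list[tuple[str, str]]) -> dict[str, str]:
    """segments: (slot_id, transcript_text) pairs in running order."""
    slots = defaultdict(list)
    for slot, text in segments:
        slots[slot].append(text)
    return {slot: summarize(" ".join(texts)) for slot, texts in slots.items()}

result = summaries_per_slot([
    ("keynote", "AI changes publishing. Many examples followed."),
    ("panel", "Five speakers discussed accessibility. It was lively."),
])
print(result["keynote"])  # AI changes publishing.
```

The per-slot text summaries are what make the same transcript reusable for podcast episodes, minutes, or event follow-up.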

"The aim must always be to improve our lives and not just to organize technical gimmicks. If you can reduce a 10-hour day to half an hour, the use of artificial intelligence will clearly help us. And there is also enormous potential in terms of accessibility.

Chris Guse

Co-Founder & CEO of Zebra Live

What about errors?

Despite high standards and extensive training, an audio transcription is of course not infallible. Technical terms and dialects have become manageable in recent years, for example through our Dictionaries feature, with which technical terms can be passed to the AI as custom vocabulary. For presentations where there is no room for error, the live editor lets a human editor review the transcripts again. Thanks to so-called partials – provisional results that are output immediately, while a word is still being transcribed – terms can be corrected within seconds before they reach the audience. With aiconix Live, this is mainly used for speeches and live streams by public officials. It was not needed at the European Publishing Congress, where the main focus was on efficiency.
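The partials mechanism can be illustrated with a minimal sketch: provisional words are emitted immediately but remain editable until the segment is finalized. This is a simplified model under assumed names, not the aiconix live editor API.

```python
# Minimal model of "partials": provisional words can still be corrected
# by an editor before the segment is finalized for output.
class PartialTranscript:
    def __init__(self):
        self.final = []      # words already committed to output
        self.partial = []    # provisional words, still editable

    def on_partial(self, words):
        self.partial = words  # latest provisional hypothesis

    def correct(self, wrong, right):
        # an editor fixes a term while it is still provisional
        self.partial = [right if w == wrong else w for w in self.partial]

    def finalize(self):
        self.final.extend(self.partial)
        self.partial = []
        return " ".join(self.final)

t = PartialTranscript()
t.on_partial(["European", "Publishing", "Congres"])
t.correct("Congres", "Congress")
print(t.finalize())  # European Publishing Congress
```

The key design point is the two-stage buffer: only the `final` list is ever shown to the audience, so corrections made in the `partial` stage are invisible to viewers.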

The first test was a success – more congresses and event streams to follow

It’s not just about saving money, but above all about enabling smaller events to expand their reach and become more accessible. Chris Guse and Nino Mello Wagner see this clearly: “The aim must always be to improve our lives and not just to organize technical gimmicks. If you can reduce a 10-hour day to half an hour, the use of artificial intelligence will clearly help us. And there is also enormous potential in terms of accessibility.”

After all, aiconix not only enables transcription, but also provides written subtitles as output for smartphones or on-site displays. This is particularly interesting for organizers of events, congresses and town hall meetings, as the European Accessibility Act will also play a role there in the coming year. In this application in particular, AI will be a real relief in the future – especially when it comes to languages where the necessary staff with language skills are difficult to find.

If you are interested in this topic, please get in touch with us or contact Chris and Zebra Live.
