Customer Success Story: Zebra Live meets European Publishing Congress

When language is no longer a barrier: the power of composite AI and how live events are becoming more accessible

Chris Guse and BosePark are true all-rounders. Together with Su Holder (the two are managing directors of BosePark Productions GmbH), Chris hosts a podcast about podcasts and is a creative visionary. He is a partner in various companies, including the podcast production company BosePark Productions, the AI consultancy Berlin Hills GmbH and the AI translation company Zebra Live GmbH. So it’s no wonder that he is a true early adopter when it comes to technology.

Audio AI — the underestimated potential alongside large language models?

Artificial intelligence (AI) is already commonplace in the text sector, and many people use it on a daily basis. AI is no longer a novelty in the audio sector, either: every smartphone can transcribe speech and accept voice commands. But what does it look like in practice? According to Chris Guse, many processes in audio content production are still very manual, and AI support is not widespread. Yet the potential is huge — for almost all of us. If you want to find out more, you should listen to this podcast episode by Chris and Su.

The European Publishing Congress 2024 in Vienna, organized by Medienfachverlag Oberauer, brings together publishers and media companies from all over Europe to discuss innovation and change. This year, Chris Guse was invited as a keynote speaker on innovation and podcasting, but that was not his only involvement. Together with Zebra Live GmbH and Managing Director Nino Mello Wagner, he also tested AI-based live transcription and interpreting for the first time.

Events without a language barrier

Every year, thousands of conferences take place that require translations. English is often chosen as a compromise, and only rarely can organizers afford simultaneous interpreters and the necessary technology. Together with aiconix Live, Chris has developed a solution to make events possible without language barriers — thanks to Composite AI!

Composite AI

Composite AI refers to the combined application of different AI techniques to improve learning efficiency and solve a wider range of problems more effectively. By integrating different AI methods and disciplines, the system becomes more powerful and versatile overall, just as our brain combines different abilities to work more efficiently. The following examples demonstrate this:

STREAMING PLATFORMS: Streaming services use Composite AI to offer personalized content. By combining machine learning, language understanding thanks to natural language processing (NLP) and knowledge graphs, the service can accurately predict which series a user might like. This improves the user experience and increases customer loyalty.

EMERGENCY CALLS & CALL CENTERS: Emergency call centers use Composite AI to process incoming calls more efficiently. By combining speech recognition and decision-support systems, emergency calls can be categorized and routed faster. In the future, language barriers can also be overcome live during an emergency call, resulting in more targeted assistance and fewer communication hurdles.

Composite AI as a key technology

To make the international conference in Vienna more accessible, it needed to be translated into five languages, including Czech and Dutch. Zebra Live GmbH solved this using Composite AI.

The audio signal from the stage was transcribed live using aiconix, including for accessible live subtitles, and also translated into five different languages. These translations were then converted back into audio using text-to-speech and output as separate channels. This allowed the audience to hear the presentations in their own language, spoken by a pleasant computer voice, via a headset. Normally, such a conference requires special headsets, interpreters, booths and a radio system; today, everyone has a cell phone with headphones. Both setups were in use at the conference in Vienna, allowing visitors to listen live to both the AI translation and the human translation.
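The chain described above (speech-to-text, machine translation, text-to-speech, one audio channel per target language) can be sketched roughly as follows. All function names here are illustrative stand-ins, not the actual aiconix or Zebra Live API:

```python
# Minimal sketch of the live pipeline described in the article.
# transcribe_chunk / translate / synthesize_speech are hypothetical stand-ins
# for the real STT, MT and TTS services chained together by Composite AI.

TARGET_LANGUAGES = ["de", "en", "cs", "nl", "fr"]

def transcribe_chunk(audio_chunk: bytes) -> str:
    """Stand-in for the speech-to-text step (aiconix in the article)."""
    return audio_chunk.decode("utf-8")  # pretend the audio is already text

def translate(text: str, target_lang: str) -> str:
    """Stand-in for the machine-translation step."""
    return f"[{target_lang}] {text}"

def synthesize_speech(text: str, lang: str) -> bytes:
    """Stand-in for the text-to-speech step."""
    return text.encode("utf-8")

def process_stage_audio(audio_chunk: bytes) -> dict[str, bytes]:
    """One pass of the chain: STT -> MT per language -> TTS per language."""
    transcript = transcribe_chunk(audio_chunk)  # also feeds the live subtitles
    channels = {}
    for lang in TARGET_LANGUAGES:
        translated = translate(transcript, lang)
        channels[lang] = synthesize_speech(translated, lang)  # one channel each
    return channels

channels = process_stage_audio(b"Welcome to the European Publishing Congress")
print(sorted(channels))  # one output channel per target language
```

In the real deployment, each of these steps is a separate streaming service, which is why the end-to-end latency adds up, as the article notes below.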

So, is AI a more cost-effective alternative to simultaneous interpreting?

Currently, this process, which chains several AI solutions, takes up to 20 seconds, and computer-generated voices are still undergoing rapid development. The human brain is faster at this supreme discipline of translation, and real voices remain more pleasant in their modulation and emphasis.

However, simultaneous interpreting is very expensive, and the work is also very strenuous; all-day conferences require at least two interpreters per language. And the first test at the European Publishing Congress showed that it is not only accessibility and new audio tracks that benefit (more languages are simply possible), but also content creation, which in combination is a game changer.

Secondary use: AI-based podcasts

As a podcast enthusiast, Chris Guse knows that a content-rich conference recording is pure content gold. So why waste it? The transcript from aiconix Live fed not only the generative audio AI, but also a language model that generated summaries of the presentations. These summaries were created with Audiomatika, an engine developed at BosePark Productions for exactly this purpose, so that each presentation slot received an automatically generated summary as an audio file. The entire conference is available as a special podcast episode (Spotify), and the text summaries are also suitable for minutes or event follow-up.
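As a rough illustration of this secondary-use idea: split the conference transcript per presentation slot and summarize each slot. The `summarize` function below is a toy stand-in for the language model behind BosePark's Audiomatika engine, whose actual implementation is not public:

```python
# Toy sketch of the summarization step: one transcript per presentation slot,
# one generated summary per slot. A real system would call a language model
# here; this stand-in just keeps the first sentences.

def summarize(transcript: str, max_sentences: int = 2) -> str:
    """Stand-in 'summary': truncate the transcript to its first sentences."""
    sentences = [s.strip() for s in transcript.split(".") if s.strip()]
    return ". ".join(sentences[:max_sentences]) + "."

# Hypothetical slot transcripts, keyed by presentation title.
slots = {
    "Keynote: Innovation & Podcasting":
        "AI lowers production costs. Podcasts scale. Audio is underrated.",
}

# Each slot gets its own summary; the real pipeline would then send each
# summary through text-to-speech to produce the podcast audio file.
episode_segments = {title: summarize(text) for title, text in slots.items()}
print(episode_segments)
```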

“The aim must always be to improve our lives and not just to organize technical gimmicks. If you can reduce a 10-hour day to half an hour, the use of artificial intelligence will clearly help us. And there is also enormous potential in terms of accessibility.”

Chris Guse

Co-Founder & CEO of Zebra Live

What about errors?

Despite the high standards and extensive training, an audio transcription is of course not infallible. Technical terms and dialects have become manageable in recent years, for example through the use of our Dictionaries, with which technical terms can be passed to the AI as custom vocabulary. For presentations where there is no room for error, the live editor lets a human editor look over the transcripts again. Thanks to so-called partials, i.e. the immediate output of partial transcriptions, terms can be corrected within seconds before they are published. This is mainly used with aiconix Live for speeches and live streams by public officials. That was not necessary at the European Publishing Congress, where the main focus was on efficiency.
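How such a custom dictionary might work in principle: known mis-recognitions of technical terms are mapped to their correct spelling before the transcript is output. The dictionary format, the example entries and the function below are assumptions for illustration, not the actual aiconix feature:

```python
# Illustrative sketch of a transcription dictionary: a mapping from likely
# mis-recognitions to the correct technical term, applied to each partial
# result before it is shown. The entries here are made up for the example.
import re

CUSTOM_DICTIONARY = {
    "i conics": "aiconix",
    "zebra life": "Zebra Live",
}

def apply_dictionary(partial: str) -> str:
    """Replace known mis-recognitions with the correct technical term."""
    for wrong, right in CUSTOM_DICTIONARY.items():
        partial = re.sub(re.escape(wrong), right, partial, flags=re.IGNORECASE)
    return partial

print(apply_dictionary("Live transcription with i conics at the congress"))
# -> Live transcription with aiconix at the congress
```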

The first test was a success — more congresses and event streams to follow

It’s not just about saving money, but above all about enabling smaller events to expand their reach and become more accessible. Chris Guse and Nino Mello Wagner see this clearly, as their statement above makes plain.

After all, aiconix not only enables transcription, but also provides written subtitles as output for smartphones or on-site displays. This is particularly interesting for organizers of events, congresses and town hall meetings, as the European Accessibility Act will also play a role there in the coming year. In this application in particular, AI will be a real relief in the future — especially for languages where staff with the necessary language skills are difficult to find.

If you are interested in this topic, please get in touch with us or contact Chris and Zebra Live.
