Our Automatic Speech Recognition Engine

Artificial Intelligence Technology

Over the last decade, Voiceinteraction has been developing AUDIMUS, its proprietary AI-based recognition engine. Constantly evolving, it supports highly customizable speech processing tasks in real time, as well as post-recording workflows, in over 30 languages.

AUDIMUS has its genesis in classical methodologies. The adoption of Deep Neural Networks, combined with an innovative daily deployment pipeline, revolutionized the production of language models. The significant increase in available data and the maximization of processing capacity allow us to offer high-quality results in various usage scenarios.

The growth in data volume and in the quality of transcribed materials allows the technology to deliver optimized results across multiple business areas when applied in practice.

Automatic Speech Recognition

Voiceinteraction has developed its AI-based engine AUDIMUS for customizable speech processing tasks in real-time and offline workflows, in over 30 languages.

Natural Language Processing

Text- and acoustic-based media enrichment with background classification, language identification, independent speaker analysis and metadata extraction.

On-premises Translation

Real-time translation framework that works without an internet connection, translating captions into multiple languages during a live or pre-recorded event.

Dialogue Systems

AI-driven conversational agents for autonomous customer care, combining natural language understanding, text-to-speech and big-data analytics.

Machine Learning and NLP

Our R&D department focuses on a variety of issues in the field of Natural Language Processing.

A segment classification algorithm identifies speech and noise sections, so it is possible not only to convert audio to text in the supported languages, but also to infer a variety of information:

– Detecting music and other specific acoustic events
– Identifying which language is being spoken
– Speaker analysis and segmentation for all active speakers
– Metadata gathering to assign dominant topics and detect keywords
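
As a rough illustration of the segmentation idea described above, the sketch below merges per-frame class labels into contiguous labeled segments. It is not AUDIMUS code: the label names, the frame duration and the function name `merge_frames` are hypothetical, and a real system would obtain the frame labels from a trained classifier rather than a hand-written list.

```python
from itertools import groupby

def merge_frames(labels, frame_sec=0.5):
    """Collapse consecutive identical frame labels into (label, start, end) segments."""
    segments = []
    index = 0  # position of the first frame of the current run
    for label, run in groupby(labels):
        length = len(list(run))
        segments.append((label, index * frame_sec, (index + length) * frame_sec))
        index += length
    return segments

# Hypothetical frame labels, one per 0.5 s of audio
frames = ["noise", "speech", "speech", "speech", "music", "music"]
for label, start, end in merge_frames(frames):
    print(f"{label}: {start:.1f}s to {end:.1f}s")
```

Only the segments labeled as speech would then be passed on to transcription, while the others can feed acoustic event detection.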

Natural Language Understanding (NLU)

Since gathering customer insights has become imperative to the expansion and innovation strategy of any business, our proprietary technology assists in doing just that. Aside from automated transcription, a semantic analysis of the content, assessing emotional polarity, is also performed during each recognition task. This opens up access to typically inaccessible metadata: the information is filed, categorized and compiled as part of a comprehensive big data analysis process. Businesses can measure clarity, emotional response, customer satisfaction rates and even the agents’ adherence to the company’s communication protocols.
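
To give a sense of what assessing emotional polarity on a transcript can look like, here is a minimal lexicon-based sketch. It is only an assumption-laden toy: the word lists and the `polarity` function are hypothetical, and this simple counting approach stands in for whatever semantic model the engine actually uses.

```python
import re

# Hypothetical toy lexicon; a production system would rely on a trained model.
POSITIVE = {"great", "thanks", "helpful", "resolved", "happy"}
NEGATIVE = {"problem", "angry", "cancel", "terrible", "waiting"}

def polarity(text):
    """Return a score in [-1, 1]: (positive - negative) / number of lexicon matches."""
    words = re.findall(r"[a-z']+", text.lower())
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total

print(polarity("Thanks, the agent was really helpful"))  # 1.0
print(polarity("I have been waiting and I am angry"))    # -1.0
```

Scores computed per utterance could then be aggregated per call or per agent to track measures such as customer satisfaction over time.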

