VoiceInteraction | AI-Driven Automated Transcription Software

Why Audimus.Server

An automated transcription platform through an intuitive web dashboard

Automated Transcription

Modernize your document production workflow with speech recognition software. VoiceInteraction’s acoustic and language models are updated daily using deep neural networks, so they perform well in both common and unexpected scenarios. The software is available in over 30 languages and offers automatic translation, even without an internet connection.

Seamless Setup

Setup is simple: place audio or video files in a folder and choose the target language, and the AI-driven platform does the rest, creating an output folder with the resulting document files. You can then translate the text, edit it in-app or export it to another document editor, and search and manage the resulting databases.
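
The folder-in, folder-out flow described above can be pictured with a short Python sketch. It is only an illustration under stated assumptions: the extension list, the `plan_jobs` helper and the `.docx` output naming are hypothetical and not part of Audimus.Server, and the transcription step itself is left out.

```python
from pathlib import Path

# Hypothetical set of media extensions; the real platform's list may differ.
MEDIA_EXTENSIONS = {".wav", ".mp3", ".mp4", ".mov"}


def plan_jobs(input_dir, output_dir, language):
    """Pair each media file in input_dir with a document path in output_dir.

    This only plans the work; it does not transcribe anything.
    """
    input_dir, output_dir = Path(input_dir), Path(output_dir)
    jobs = []
    for media in sorted(input_dir.iterdir()):
        # Skip subfolders and files that are not recognised media.
        if media.is_file() and media.suffix.lower() in MEDIA_EXTENSIONS:
            jobs.append({
                "source": media,
                "language": language,
                "target": output_dir / (media.stem + ".docx"),
            })
    return jobs
```

Each planned job keeps the source file, the chosen target language and the intended output path together, which mirrors the one-document-per-file result described above.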

Flexible Deployment

The Audimus.Server platform can be accessed through a cloud-based web dashboard or deployed on-premises. It is readily available on any computer or mobile device with an internet connection, or in the customer's data center without relying on one. Support for virtualization and parallel processing makes this system a safe bet in scenarios with large volumes of data.

Learn more about Audimus.Server

Architecture Overview

How it works – an outline of Audimus.Server functional architecture

Capture and Media Inputs

A variety of supported formats without compromising the pre-existing setup

Supported Inputs

Our clients can expect a large array of input options: Audimus.Server was designed to adapt to multiple scenarios without altering ongoing transcription workflows. Supported inputs include any non-proprietary audio or video format.

See it in Action

Automated Transcription

A reliable processing capability with convenient features

Proprietary Technology

Audimus.Server is built on its own automatic speech recognition engine, which receives constant updates. The result is an AI-driven platform supported by machine learning algorithms and deep neural networks. The underlying systems evolve continuously, creating a self-sustaining cycle that guarantees a quality response in any environment.

High Accuracy

The AI-driven language and acoustic models are trained with deep neural networks on thousands of hours of gathered data. These models are robust enough to deliver high accuracy in any scenario, whether breaking news or technical vocabulary.

Daily Vocabulary Updates

The combination of generic acoustic and language models allows a daily selection of the most relevant words, which form the base vocabulary used during the speech-to-text process. These models are refined with terms suggested by clients and a constant flow of transcriptions.

Transcription Editing

Integrated editing tools with synchronized media player and navigation

Integrated In-App Editor

To automate subtitling workflows, an editing dashboard is built directly into the application. Users can correct or translate the generated text, adjust its formatting, and export subtitles in the native formats of the most relevant current media editing programs.

Screenplays’ Timecoding

Audimus.Server can parse several screenplay formats and compute accurate timecodes by aligning the automatically extracted dialogue with the audio track. Caption files are then produced according to user-defined formatting options. All of this processing finishes in less than half of the file’s playback time.

Captions Translation

Produced captions can be translated live into over 30 languages without relying on an internet connection. On-premises deployment keeps latency minimal during the production and delivery of closed captions while maintaining high precision throughout.

Delivery Integrations

An array of integrations for high accessibility

MAM Metadata enhancement

Transcriptions are complemented by processed metadata that can be ingested into MAM (media asset management) systems, enabling full indexing of any media archive. Retrieve information through simple text searches and overcome the limitations of manually created, keyword-based content descriptions.
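
To illustrate why timecoded transcripts make an archive searchable by plain text, here is a minimal Python sketch. The segment layout (start time in seconds plus text) and the `search_segments` helper are assumptions made for this example; they do not describe the metadata format that Audimus.Server or any particular MAM system actually uses.

```python
def search_segments(segments, query):
    """Return the (start_seconds, text) segments that contain every query word.

    Matching is a case-insensitive substring test, which is enough for a sketch.
    """
    words = query.lower().split()
    hits = []
    for start, text in segments:
        lowered = text.lower()
        if all(word in lowered for word in words):
            hits.append((start, text))
    return hits


# Hypothetical transcript segments for one archived clip.
segments = [
    (0.0, "Good evening and welcome to the news"),
    (12.5, "The election results were announced today"),
    (47.0, "Weather: rain is expected over the weekend"),
]
matches = search_segments(segments, "election results")
```

Because each hit carries its start time, a search result can point directly to the moment in the media where the words were spoken.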

Delivery Outputs

Suitable variety of export formats

Multiple output formats

Audimus.Server offers a large number of formats to export and translate into. New documents can be saved as a pre-formatted .DOCX Word document, an .XML representation file, or any plain-text or closed captioning format suited to your workflow, such as .TXT or .SRT. In total, Audimus.Server supports over 30 export formats.
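
As an example of one of the caption formats named above, the following Python sketch renders timed text segments as .SRT cues (a running index, a `HH:MM:SS,mmm --> HH:MM:SS,mmm` time range, the text, and a blank line). The `format_timestamp` and `to_srt` helpers and the sample segments are hypothetical; this is not how Audimus.Server produces its exports.

```python
def format_timestamp(seconds):
    """Format a time in seconds as the SRT timestamp HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(segments):
    """Render (start, end, text) tuples as SRT cues numbered from 1."""
    cues = []
    for index, (start, end, text) in enumerate(segments, start=1):
        time_range = f"{format_timestamp(start)} --> {format_timestamp(end)}"
        cues.append(f"{index}\n{time_range}\n{text}\n")
    # Cues are separated by a blank line.
    return "\n".join(cues)


# Hypothetical transcription output: two short segments.
srt_text = to_srt([(0.0, 1.5, "Hello"), (1.5, 3.0, "World")])
```

Writing `srt_text` to a file with an `.srt` extension gives a subtitle file that common media players can load alongside the video.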

Integration into existing workflows

With a variety of accepted inputs, highly accurate transcription results and an array of export formats, the Audimus.Server platform integrates seamlessly into existing workflows, removing the barrier created by the complexity of manually transcribing files.

Call us today at +1 646 504 7906 or email us at info@voiceinteraction.tv

Stay in touch

Request a demo and experience what our Speech Processing platforms can offer you.
