Transcript Editor
Since launching the Transcript Editor last quarter, we have continued to develop its functionality and user interface, especially for handling larger volumes of transcripts with ease.
The biggest change is a standalone version of the Transcript Editor, accessible via JWT authentication. This allows you to embed the editor directly into your own UI or software of choice, enabling seamless integration into custom workflows with greater flexibility and security for specific project needs.
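As a rough illustration of JWT-based access, the sketch below mints an HS256 token using only the Python standard library. The claim names (`transcript_id`, `exp`) and the signing secret are assumptions for illustration; consult the DeepVA documentation for the actual token schema and how to obtain signing credentials.

```python
import base64
import hashlib
import hmac
import json
import time


def b64url(data: bytes) -> str:
    # Base64url encoding without padding, as the JWT spec (RFC 7519) requires
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def make_editor_token(secret: str, transcript_id: str, ttl: int = 3600) -> str:
    # Claim names here are hypothetical; check the DeepVA docs for the real schema
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {"transcript_id": transcript_id, "exp": int(time.time()) + ttl}
    signing_input = (
        b64url(json.dumps(header).encode())
        + "."
        + b64url(json.dumps(payload).encode())
    )
    # HS256 signature: HMAC-SHA256 over "<header>.<payload>"
    sig = hmac.new(secret.encode(), signing_input.encode(), hashlib.sha256).digest()
    return signing_input + "." + b64url(sig)
```

The resulting token would then be passed to the standalone editor when embedding it, so that the host application controls which transcript a session may open and for how long.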
We’ve added the DOCX (.docx) format to the Transcript Editor, and the export function now also supports the SRT and WebVTT captioning formats. In addition, a new argument in the Speech Recognition module allows users to turn off paragraph formatting.
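A minimal sketch of how a client might assemble requests for these two additions, assuming illustrative field names (`format`, `paragraph_formatting`) and format identifiers; the real parameter names live in the DeepVA API reference:

```python
def build_export_request(transcript_id: str, fmt: str) -> dict:
    # Hypothetical export payload; "docx", "srt" and "vtt" reflect the
    # formats mentioned in the release notes.
    supported = {"docx", "srt", "vtt"}
    if fmt not in supported:
        raise ValueError(f"unsupported export format: {fmt}")
    return {"transcript": transcript_id, "format": fmt}


def build_recognition_job(source_id: str, paragraphs: bool = True) -> dict:
    # The argument name for disabling paragraph formatting is assumed here.
    return {
        "source": source_id,
        "modules": [
            {"name": "speech_recognition", "paragraph_formatting": paragraphs}
        ],
    }
```

Passing `paragraphs=False` would yield a flat, unparagraphed transcript, which can be convenient when the output is post-processed by other tooling.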
Finally, a new Tag field has been added to the Transcript resource, allowing users to label transcripts with custom IDs or tags. This feature is particularly useful for users who manage large volumes of transcripts, improving organization and ease of reference.
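For example, a client could set the new Tag field when updating a transcript. The sketch below only builds the HTTP request object without sending it; the endpoint path and the `tag` field name are assumptions based on these notes, not the documented API:

```python
import json
import urllib.request


def tag_request(base_url: str, transcript_id: str, tag: str) -> urllib.request.Request:
    # Hypothetical PATCH request setting the transcript's Tag field.
    return urllib.request.Request(
        url=f"{base_url}/transcripts/{transcript_id}",
        data=json.dumps({"tag": tag}).encode(),
        method="PATCH",
        headers={"Content-Type": "application/json"},
    )
```

With a custom ID such as an internal episode or case number in the tag, large transcript collections become much easier to filter and cross-reference.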
Access Keys Integration and Management
To support the standalone Transcript Editor, we have introduced Access Keys, which allow various front-end components to be integrated into user workflows. This lets users streamline their processes by embedding key functionality directly into their applications. A new Access Keys management page in Preferences provides an easy-to-use interface for securely managing these keys.
Webhook system for events
Our new API-wide webhook system allows users to subscribe to specific events, enabling real-time notification and automation. This feature greatly enhances workflow automation and integration with other services.
We’ve added three new webhook events for exports (deepva.export.on_started, deepva.export.on_completed, and deepva.export.artifact.on_created). These additions provide more granular control and monitoring of export processes within automated workflows.
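The steps above can be sketched as a small receiver that dispatches on the event name. The payload shape (`{"event": ..., "data": ...}`) and the data fields are assumptions for illustration; only the three event names come from the release notes:

```python
def handle_event(payload: dict) -> str:
    # Dispatch table for the three new export webhook events. The payload
    # structure here is hypothetical -- verify it against the DeepVA docs.
    handlers = {
        "deepva.export.on_started": lambda d: f"export {d['id']} started",
        "deepva.export.on_completed": lambda d: f"export {d['id']} completed",
        "deepva.export.artifact.on_created": lambda d: f"artifact ready: {d['url']}",
    }
    handler = handlers.get(payload.get("event"))
    if handler is None:
        # Unknown events should be acknowledged, not rejected, so that new
        # event types don't break existing receivers.
        return "ignored"
    return handler(payload.get("data", {}))
```

A real receiver would run this behind an HTTP endpoint, return 2xx quickly, and hand the event off to a queue for any slow processing.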
Model Updates for Face Recognition and Speaker Identification
The release of the Celebrities v31 model for face recognition and the v2 model for speaker identification brings significant improvements in recognition capabilities.
Bug fixes and improvements
Various bug fixes have been implemented, covering transcript view errors, empty-source acceptance, editor navigation, and the user interface. We’ve also fixed an issue where creating a transcript from a job with multiple mining modules failed.
We now display descriptive network errors instead of generic unknown errors when service timeouts occur.
We have also added a context menu action to the source list that starts a job from the source selected in the right pane. This small but useful feature streamlines the workflow by reducing the number of steps required to start a job from the Source List.
Deep Live Hub
A lot is also happening with the Deep Live Hub, formerly aiconix Live: the Live Hub has made a major step towards a more flexible architecture. Not only have scaling and performance improved significantly; configurable delays, multiple simultaneous outputs, and, in the near future, new AI features will be available through AMT (aiconix Metadata Transport).
Features:
- Added more options for file processing
- Added a new page for processed files
- Added an API for file processing templates (to enable re-using workflows)
- Added file extensions to downloaded files
- Added pagination for results to allow faster initial loading
- Improved thumbnails for live streams and workflows
- Added more filters and sorting for results
Bugfixes:
- Improved HLS stability
- Reduced memory footprint
- Fixed translations in AMT (formerly ATT)
- Editor bug fixes
- Localization improvements
All DeepVA changelog updates are available here: https://docs.deepva.com/changelog/