Variable-Data-Rate Speech Encoder
This encoder could supplant older encoders that operate at diverse fixed rates.
A variable-data-rate (VDR) speech encoder has been designed to be interoperable with, and eventually to supplant, the many different voice encoders now used in military communication systems. Because these older systems were designed to utilize specific radio links with fixed and limited channel capacities, these systems utilize many different voice compression algorithms operating at various fixed rates. The incompatibility of these systems is an obstacle to interoperability. Emerging net-centric communication systems promise to provide connectivity to all military users, but compatible encoding will be necessary for interoperability, and encryption will be necessary for secure communications.
The VDR voice encoder is designed to provide both interoperability and security in net-centric voice communications. The VDR speech encoder can operate at any or all of the various data rates of older military speech encoders. Notably, it can operate over a range of data rates up to 26 kb/s and is backward-compatible with the Mixed Excitation Linear Prediction (MELP) voice encoder, which is a Federal-standard encoder that operates at a data rate of 2.4 kb/s. The VDR speech encoder is interoperable at any and all rates simultaneously. The rate setting can be changed dynamically (that is, during operation) without disrupting operation, even when used with encryption. Hence, without compromising security, the VDR speech encoder can be dynamically adjusted to make efficient use of network bandwidth under changing network traffic conditions.
The heart of the VDR voice encoder is a multirate voice processor in which a single voice algorithm generates multiple data streams at rates from 2.4 kb/s to an average rate of about 23 kb/s for input speech at frequencies from 0 to 4 kHz. The algorithm provides for seven different operating modes (see table). Inclusion of a few more kb/s of data from the 4-to-8-kHz audio frequency band makes it possible to encode wide-band speech comparable in quality to that of standard frequency-modulation (FM) broadcasting.
The VDR bit stream has an embedded structure in which higher-rate voice data frames contain successively lower-rate voice data frames as subsets. Deletion of a certain portion of the superset (higher-rate frames typically representing higher audio frequencies) makes it possible to reduce the data rate, even in the presence of encryption. Because of this embedded data structure, all of the VDR data rates are interoperable, and the rate can be switched as often as 44 times per second, even when speech is present. Because the speech waveforms of all the VDR rates are synchronous, switching of data rates does not introduce undesirable sounds such as clicks or warbles.
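The embedded structure can be illustrated with a minimal Python sketch. It is not the VDR frame format: the layer sizes, the function names, and the toy SHA-256 keystream are all assumptions made for illustration, and the brief does not say which cipher the encoder uses. The sketch only shows why dropping the tail of a frame still works under encryption when the cipher acts position by position, as a stream cipher does.

```python
import hashlib

# Hypothetical layer sizes in bytes, base (lowest-rate) layer first.
# These numbers are invented; the brief does not give the frame layout.
LAYER_SIZES = [7, 10, 20, 28]

def build_frame(layers):
    """Concatenate layers, base layer first, into one embedded frame."""
    return b"".join(layers)

def truncate_frame(frame, keep_layers):
    """Reduce the rate by keeping only the first `keep_layers` layers."""
    keep = sum(LAYER_SIZES[:keep_layers])
    return frame[:keep]

def keystream(key, frame_index, length):
    """Toy keystream (SHA-256 in counter mode). Illustration only, not a real cipher."""
    out = b""
    counter = 0
    while len(out) < length:
        block = key + frame_index.to_bytes(4, "big") + counter.to_bytes(4, "big")
        out += hashlib.sha256(block).digest()
        counter += 1
    return out[:length]

def encrypt(frame, key, frame_index):
    """XOR the frame with the keystream; applying it twice restores the input."""
    stream = keystream(key, frame_index, len(frame))
    return bytes(a ^ b for a, b in zip(frame, stream))

decrypt = encrypt  # XOR is its own inverse

# Sender encodes and encrypts a full-rate frame.
layers = [bytes([i + 1]) * size for i, size in enumerate(LAYER_SIZES)]
full_frame = build_frame(layers)
ciphertext = encrypt(full_frame, b"demo-key", frame_index=0)

# A network node drops the two highest layers without holding the key.
reduced = truncate_frame(ciphertext, keep_layers=2)

# The receiver still recovers the two lowest layers intact.
recovered = decrypt(reduced, b"demo-key", frame_index=0)
assert recovered == build_frame(layers[:2])
```

Because each ciphertext byte depends only on the plaintext byte at the same position, a prefix of the ciphertext decrypts to the same prefix of the plaintext. A block cipher in a chained mode would not generally allow this kind of truncation.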
It must be emphasized that the multirate voice processor in the VDR voice encoder is a single processor running a single algorithm, in contradistinction to both (1) a collection of separate processors operating at different rates and (2) a processor running a multitude of speech-compression algorithms. Prior voice encoders that use multiple compression algorithms do not perform well when algorithms are switched while speech is present. Speech waveforms sometimes become cropped upon switching because different voice algorithms can have different internal delays. Such cropping degrades speech quality and is annoying to listeners.
The VDR speech encoder exploits the variable nature of the speech waveform, utilizing higher or lower data rates as needed (e.g., higher rates for vowels, lower rates for consonants). Unlike some prior speech processors, the speech processor in the VDR speech encoder does not eliminate gaps in speech for the sake of efficiency. Elimination of speech gaps that contain ambient sounds could be harmful in military communications because speech gaps often contain sounds that help listeners gauge battlefield conditions at transmitter sites. In the VDR speech encoder, speech gaps are encoded at appropriately low data rates that still provide audible information.
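The idea of assigning a rate to every frame, including gaps, can be sketched in Python as follows. The brief does not describe the encoder's actual decision rule, so the energy thresholds, the 180-sample frame (22.5 ms at an assumed 8-kHz sampling rate), and every rate in the table other than 2.4 kb/s are hypothetical. Frame energy is used here only as a crude stand-in for whatever measure the real encoder applies.

```python
import math

SAMPLE_RATE_HZ = 8000   # assumed; consistent with a 0-to-4-kHz speech band
FRAME_LEN = 180         # assumed samples per frame (22.5 ms at 8 kHz)

# Hypothetical rate table in kb/s. Only 2.4 kb/s appears in the brief.
RATES_KBPS = [2.4, 8.0, 16.0]

def frame_energy_db(frame):
    """Mean-square energy of a frame in dB, for samples scaled to [-1, 1]."""
    mean_sq = sum(s * s for s in frame) / len(frame)
    return 10.0 * math.log10(mean_sq + 1e-12)  # small offset avoids log(0)

def pick_rate(frame, low_db=-50.0, high_db=-20.0):
    """Choose a rate for one frame; thresholds are illustrative assumptions."""
    energy = frame_energy_db(frame)
    if energy < low_db:
        return RATES_KBPS[0]   # gap: still encoded, at the lowest rate
    if energy < high_db:
        return RATES_KBPS[1]   # weak sound
    return RATES_KBPS[2]       # strong sound

def tone(amplitude, freq_hz=200.0):
    """Generate one frame of a sine tone as a test signal."""
    return [amplitude * math.sin(2 * math.pi * freq_hz * n / SAMPLE_RATE_HZ)
            for n in range(FRAME_LEN)]

frames = [[0.0] * FRAME_LEN, tone(0.01), tone(0.5)]
rates = [pick_rate(f) for f in frames]
print(rates)
```

Note that `pick_rate` never returns zero: a silent or near-silent frame is sent at the lowest rate rather than dropped, which mirrors the point above about preserving ambient sound in speech gaps.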
This work was done by Thomas M. Moran, David A. Heide, and Yvette T. Lee of the Naval Research Laboratory and George S. Kang of ITT Industries.
This Brief includes a Technical Support Package (TSP). "Variable-Data-Rate Speech Encoder" (reference NRL-0019) is currently available for download from the TSP library.