by Susanto RAHARDJA
Much of the audio entertainment technology we take for granted, such as fast music downloads and DVD surround sound, depends on technology that enables the reduction or compression of audio data to a fraction of its original size while losing little of its fidelity. Advanced Audio Zip (AAZ), the invention of two researchers from the Institute of Infocomm Research, Rongshan Yu and Susanto Rahardja, offers a unified and efficient solution combining scalability and interchangeability while replacing the current separate solutions for "lossy" and "lossless" audio compression. This could well be the next-generation audio codec - software that compresses data according to a particular coding scheme and then decompresses it during playback.
Lossy compression techniques eliminate audio information in order to achieve high compression ratios. They preserve the essence of the sound but do not restore the precise bits; thus the higher the compression, the lower the sound quality. The magic of achieving up to 20 times compression is based on the "masking" property of the human auditory system - at the same frequency, a stronger aural component always masks a weaker one, rendering the weaker practically inaudible. As a result, not all aural components in a given audio clip have equal importance to auditory perception, so weaker ones can be dropped without significant distortion of the perceived sound.
Current state-of-the-art technology can deliver near compact-disc-quality stereo audio at 10-20 times compression, with a quality degradation imperceptible to the normal user. Highly popular and successful examples - all of them lossy - include MP3 (MPEG-1 Audio Layer-3), Microsoft's Windows Media Audio (WMA) technology, and Dolby's AC3 technology.
Lossless techniques restore, on decompression, every bit of the original audio data. Nothing is lost or distorted. The progress of network and storage technologies and the rapidly decreasing price per megabyte of storage encourage greater delivery of audio content at lossless quality. This technology is good for archiving purposes, for studio applications when post-processing is required, and for high-fidelity sound. However, it achieves only limited compression ratios of between 1.5 and 3.0. Clearly a universal audio solution would merge the benefits of these two isolated, incompatible compression formats.
AAZ's embedded lossless audio-compression technology answers the need - it allows users to vary the quality of digital sound by compressing the sound signal by 1.5-40 times, depending on the core. The core is Advanced Audio Coding (AAC), the MPEG-4 audio codec, with which AAZ is backward-compatible. The core, set at 32kbps, gives a compression of about 40 times; at 64kbps, the compression range would be 1.5-20 times. AAZ provides backward compatibility by embedding an MPEG AAC-compliant bit-stream in its lossless bit-stream; this core can be extracted from the lossless bit-stream and decoded with a standard AAC decoder. In addition, its fine-granular-scalability (FGS) feature enables the bit-rate of the AAZ bit-stream to be scaled easily from lossless to high-compression lossy in very small steps (<0.4kbps). This feature is extremely useful for applications with varying quality-of-service requirements, for example, an audio-streaming application with time-varying channel conditions.
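The compression figures quoted for the two core bit-rates can be checked with simple arithmetic. The sketch below is illustrative only and assumes the uncompressed reference is CD-quality stereo PCM (44.1 kHz, 16 bits per sample, two channels), which the article does not state explicitly; the function names are made up for this example.

```python
# Assumption (not stated in the article): the uncompressed source is
# CD-quality stereo PCM at 44.1 kHz, 16 bits per sample, 2 channels.
SAMPLE_RATE_HZ = 44_100
BITS_PER_SAMPLE = 16
CHANNELS = 2


def pcm_bitrate_kbps(sample_rate_hz=SAMPLE_RATE_HZ,
                     bits_per_sample=BITS_PER_SAMPLE,
                     channels=CHANNELS):
    """Bit-rate of uncompressed PCM audio in kilobits per second."""
    return sample_rate_hz * bits_per_sample * channels / 1000.0


def compression_ratio(coded_kbps, source_kbps=None):
    """Ratio of the uncompressed bit-rate to the coded bit-rate."""
    if source_kbps is None:
        source_kbps = pcm_bitrate_kbps()
    return source_kbps / coded_kbps


if __name__ == "__main__":
    print(f"PCM source: {pcm_bitrate_kbps():.1f} kbps")
    for core_kbps in (32, 64):
        ratio = compression_ratio(core_kbps)
        print(f"core at {core_kbps} kbps -> about {ratio:.1f} times compression")
```

Under that assumption the source runs at 1411.2 kbps, so a 32kbps core corresponds to roughly 44 times compression and a 64kbps core to roughly 22 times, in line with the "about 40" and "20" quoted above.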
Researchers have explored FGS coding extensively for almost two decades, but many technical challenges still exist in its application to audio compression, including poor coding efficiency and algorithmic complexity. A long-standing problem in signal processing is that scalable coding has tended to incur a significant loss in compression performance.
A novel embedded coder, namely bit-plane Golomb code (BPGC), solves this problem in AAZ. BPGC achieves low-complexity, high-efficiency scalable coding of audio signals by introducing a probability-assignment rule for bit-plane coding according to the statistical properties of the audio signals. With the help of BPGC, AAZ provides excellent compression-ratio performance, comparable to most state-of-the-art non-scalable lossless audio-compression algorithms, and abundant functionalities that would only be possible with its full support of scalability and backward compatibility.
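To illustrate why bit-plane coding lends itself to scalability, here is a minimal sketch of generic bit-plane decomposition. It is not BPGC itself: it omits the entropy coder and the probability-assignment rule entirely, and the function names and sample values are hypothetical. It only shows that a stream ordered from the most significant bit-plane to the least can be cut short and still decoded, with exact reconstruction when every plane arrives.

```python
def encode_bit_planes(samples, num_planes):
    """Split integer samples into sign flags and magnitude bit-planes.

    Planes are ordered from most significant to least significant, so a
    prefix of the list is a coarser description of the same signal.
    """
    signs = [s < 0 for s in samples]
    magnitudes = [abs(s) for s in samples]
    planes = []
    for p in range(num_planes - 1, -1, -1):
        planes.append([(m >> p) & 1 for m in magnitudes])
    return signs, planes


def decode_bit_planes(signs, planes, num_planes):
    """Rebuild samples from however many leading bit-planes were received."""
    magnitudes = [0] * len(signs)
    for i, plane in enumerate(planes):
        p = num_planes - 1 - i  # bit position this plane represents
        for k, bit in enumerate(plane):
            magnitudes[k] |= bit << p
    return [-m if neg else m for m, neg in zip(magnitudes, signs)]


if __name__ == "__main__":
    samples = [5, -3, 12, 0, -9]  # toy values, magnitudes fit in 4 bits
    signs, planes = encode_bit_planes(samples, num_planes=4)
    # All planes received: lossless reconstruction.
    print(decode_bit_planes(signs, planes, 4))
    # Only the two most significant planes received: coarser approximation.
    print(decode_bit_planes(signs, planes[:2], 4))
```

Truncating the list of planes plays the role of lowering the bit-rate: each extra plane refines the previous approximation, which is the property that fine-granular scalability relies on.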
In response to the official Call for Proposals on MPEG-4 Lossless Audio Coding, AAZ technology was submitted and evaluated in independent tests specified by MPEG. Under scalable lossless coding evaluation, the AAZ system had the best performance in terms of the mean lossless compression ratio for all word lengths and all sampling rates for all sequences in the test set. Word length is the number of bits used to represent each sample; a higher value gives finer resolution. The tests covered 16-bit, 20-bit and 24-bit resolutions, and AAZ's mean lossless compression ratio was the best at all three word lengths. For comparison, CD audio uses 16-bit.
In addition, AAZ provides critical functionalities and capabilities, such as fine-grain bit-rate scalability from lossy to lossless, a high degree of random access and backward compatibility with MPEG-4 AAC, while achieving remarkably low complexity. With that, AAZ was adopted as the reference model for scalable lossless coding in MPEG.
For more information contact Susanto Rahardja at rsusanto@i2r.a-star.edu.sg