In Reply to: Re: Depends... posted by Frank.. on November 03, 2004 at 04:25:06:
> DTS9624 is currently the only consumer codec available in HT equipment that is able to deliver a lossy representation of 24 bit 96kHz source material. <

The whole point of a lossy system is to deliver a large amount of the original source data in a smaller packet, with as little noticeable damage to that data as possible. DTS 96/24 dedicates a considerable amount of its available bitpool to delivering data that is completely inaudible in the first place; ergo, apart from generating new licensing revenue for DTS, it is a totally nonsensical system.
>The better result depends on the tradeoff between 'less lossy but lower resolution' and 'slightly more lossy but higher resolution'. Hires source material with higher dynamic range and frequency extension is probably better resolved through DTS9624.<
Your argument would be completely valid if DTS 96/24 delivered the audible content of the music with a sampling rate of 96kHz, to use your words "slightly more lossy but higher resolution", but it does not. The core data (all the audible part of the spectrum) has to remain at 48kHz to be backward compatible; in the decoder it is simply upsampled to 96kHz before being combined with the extension data. A 1kHz tone would be sampled at 48kHz in a standard DTS system and would also be sampled at 48kHz in a DTS 96/24 system.
Incidentally, the same limitation applies to the word length: the entire audible part of the data has to remain backward-compatible, so it is no better than in a standard DTS system.
Follow Ups:
You make some assumptions about DTS9624 encoding/decoding technology which may not be valid.

"DTS 96/24 dedicates a considerable amount of its available bitpool to delivering data that is completely inaudible in the first place"
This is probably not entirely true. The extended data section is encoded with differential data to reconstruct all the samples.
Encoder process:

1st: Psychoacoustic coding drops 'inaudible' data from the 24/96 datastream.

2nd: Filter the 24/96 datastream sharply at 22-24kHz to remove the upper frequency band. The resulting data is still 24/96.

3rd: Drop the sample rate of the filtered datastream. Since it is band-limited, this is a matter of dropping every other sample.

4th: Encode the difference between the resulting 24/48 datastream, upsampled back to 24/96*, and the original datastream into the extension data section. The difference datastream requires far fewer bits to encode.
(* The same upsampling algorithm as in the decoder is used, guaranteeing perfect predictability.)

5th: Encode the 24/48 data into the core data stream.

I have no doubt that it actually is a bit more complicated than this simplification of the encoding process (a rough sketch of these steps follows below), but this is what I understood from the explanation I got at the DTS booth at an exhibition.
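Purely as an illustration of the outline above (and not DTS's actual algorithm), a minimal sketch of the core+extension split might look like the following, assuming a crude low-pass filter and linear interpolation, and ignoring the psychoacoustic and bit-packing stages entirely:

import numpy as np

def encode_core_extension(x96):
    # Toy sketch of core+extension encoding; not the real DTS encoder.
    # x96: samples at a 96kHz rate (1-D array). Returns (core48, residual96).

    # Band-limit below ~24kHz with a crude low-pass filter
    # (a real encoder would use a proper anti-aliasing filter).
    filtered = np.convolve(x96, [0.25, 0.5, 0.25], mode="same")

    # Drop every other sample to get the 48kHz core signal.
    core48 = filtered[::2]

    # Upsample the core back to 96kHz with the *same* interpolator the
    # decoder will use (linear interpolation here, for illustration).
    # In the real system the core is lossily encoded and decoded first;
    # that step is omitted in this sketch.
    interpolated96 = np.interp(
        np.arange(len(x96)) / 2.0,   # 96kHz sample positions
        np.arange(len(core48)),      # 48kHz sample positions
        core48,
    )

    # The residual is what goes into the extension data section.
    residual96 = x96 - interpolated96
    return core48, residual96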
"A 1kHz tone would be sampled at 48kHz in a standard DTS system and would also sampled at 48kHz in a DTS 96/24 system."
This assumption is totally incorrect. The 'band splitting' is what makes DTS9624 backward compatible with standard DTS decoders.
It happens after the audio signal is sampled and is part of the encoding process. It has nothing to do with the actual sampling of the audio data itself.

Frank
> This is probably not entirely true. <

It seems that you still haven't read the AES pre-print that describes the operation of DTS 96/24. As far as I'm aware it is the most detailed document available, and it is what my post is based upon. I thought you would have more sense than to rely on any information you got from a "booth at an exhibition", staffed in all likelihood by marketing drones.
Let me quote some of the document. I've highlighted the most important paragraph in bold for the non-technical on the board; there are no "guesses" or "assumptions" involved.
"The increasing interest in audio coding technology capable of delivering multi-channel audio at sampling frequencies higher than 48kHz and word lengths longer than 16 bits has motivated our development of the DTS core+extension coding technology (4). The challenge with improving established coding systems, such as DTS, relates to the design of the new core+extension bit stream. The existing 48kHz decoders need to recognize and decode the 48kHz core coded data within a new bit stream. To ensure this, the data representing audio components introduced by higher sampling frequency and/or larger word length is transmitted as an extension to the core data. Older generation decoders are both unaware of, and unaffected by, the presence of extension data in the bit stream and in this way continue to operate normally at 48 kHz.
"The generalized concept of core+extension coding is illustrated in Figure 1. To encode 96 kHz LPCM the input audio stream is fed to a 96 to-48kHz down sampler and the resulting 48kHz signal is encoded using standard core encoder as in Figure 1 A). In the core+extension coding scheme that we first introduced in (4):
• The core data is fed to a local core decoder whose output is up sampled in a 48-to-96kHz interpolator resulting in an interpolated core LPCM audio, denoted as signal “2” in Figure 1 A). Both operations are performed in the “Reconstruct Core Audio Components” block.
• In the “Generate Residuals” block the interpolated core audio is subtracted from the delayed version of input 96kHz LPCM audio (signal “1”) to generate the LPCM residual (signal “3”). The “Preprocess Input Audio” block performs the delay operation.
• The extension encoder (“Generate Extension Data” block) processes the residual LPCM signal and outputs the extension data. This data, along with the core data, is assembled in a packer to produce a core+extension bit stream.
" To decode a core+extension bit-stream, Figure 1 B), the unpacker first separates the stream into the core and extension data. The core decoder decodes the core data and produces the core LPCM audio that is next up-sampled to 96kHz using the 48-to-96kHz interpolator. The core decoding and interpolation are both performed in the “Reconstruct Core Audio Components” block. The interpolated core LPCM audio is the same as the signal “2” generated in the encoder. The extension data is decoded using an extension decoder (“Reconstruct Residual Components” block) and its output is added to the interpolated core audio to produce a composite 96kHz 24-bit LPCM audio .
"When a 48khz-only (legacy) decoder is fed the core+extension bit stream, Figure 1 C), the extension data fields are ignored and only the core data is decoded. This results in 48kHz core LPCM audio output."
If there is a more technical or up-to-date white paper that I'm not aware of, then a reference would be welcome.
By the way, in your previous message you stated that "standard" DTS was a 16-bit system, when in fact it is a 20-bit system.
Well, your nice white paper essentially says the same thing I outlined. It clearly indicates that the extension data is used to reconstruct all the samples in the decoder. The extension data doesn't contain high-band frequency content only. (This would be inefficient, as there is little HF content in common material.)

DTS data streams are 16, 20 or 24 bit compatible. I referred to it as standard 16 bit as this is the lowest word-length option for a DTS bitstream.
"It seems that you still haven't read the AES pre-print that describes the operation of DTS 96/24. As far as I'm aware it is the most detailed document available and what my post is based upon. I thought you would have more sense than to rely on any information you got from a "booth at an exhibition", staffed in all likelihood by marketing drones."
The problem seems to be that it is you who does not understand the technology outlined.
PS. the fellow wasn't a 'marketing drone' at all but a genuine technician/engineer who knew his stuff.
> Well, your nice white paper essentially says the same thing I outlined. It clearly indicates that the extension data is used to reconstruct all the samples in the decoder. The extension data doesn't contain high-band frequency content only. <

I disagree, and we're obviously interpreting the technical data in different ways, with the confusion arising over the word "reconstruct".
"The core decoder decodes the core data and produces the core LPCM audio that is next up-sampled to 96kHz using the 48-to-96kHz interpolator."
This is very clear and open to very little misinterpretation. The core 48kHz data is upsampled before being combined with the extension data.
At no point does the document state that the extension data is used to increase the inherent resolution of the core data, or how such a thing would even be possible. Consider the practicalities involved: to enhance the resolution of the core component to 96kHz, a far greater amount of data would be required than could be delivered by the extension component, unless of course DTS considers the need to exceed Nyquist unnecessary and is saving the bitpool that way, in which case we're back to square one, a core component that is no better than 48kHz.
> > >
"The core decoder decodes the core data and produces the core LPCM audio that is next up-sampled to 96kHz using the 48-to-96kHz interpolator."This is very clear and open to very little misinterpretation. The core 48kHz data is upsampled before being combined with the extension data.
< < <

What you probably fail to understand is that the decoder operation is also used during the encoding process.
This operates as a kind of 'error feed forward'.
In essence, the output from the decoder that is part of the encoder is fed back and compared with the original signal.
The measured error is packed into the extension data stream and is basically the error 'feed-forward' signal.

> > >
The extension data is decoded using an extension decoder (“Reconstruct Residual Components” block) and its output is added to the interpolated core audio to produce a composite 96kHz 24-bit LPCM audio.
< < <

As far as I understand this, it's saying that the decoded extension data is added to the interpolated core audio, which is already at 96kHz due to the interpolation.
In general I agree with your interpretation, but I think I can see where Frank is coming from.

The core data is constructed from a downsampled 48kHz stream. An anti-aliasing filter would have been used as part of the downsampling process, so some "damage" has already been done to the stream.

Frank is wrong (or at the very least misleading) when he suggests the core data decoding reconstructs "all" the samples. As we know, DTS is a lossy encoder, so the reconstructed samples merely approximate the 48kHz stream.

The extension stream encodes differences between the original 96/24 stream and the *decoded* core data upsampled to 96/24. So if there are any differences between the original samples and the reconstructed samples due to the lossy algorithm, there is a chance for these differences to be captured in the extension stream.

At least that's what I think Frank really means.

However, in general I think your point is valid: it's likely that more damage is done by the downsampling to 48kHz and the reduction in bitrate for the core stream than can be recovered through the extension stream.
It clearly states how, during the encoding process, a difference data stream is created by comparing the original 24/96 data stream with the upsampled version of the downsampled core data.

> > >
"The generalized concept of core+extension coding is illustrated in Figure 1. To encode 96 kHz LPCM the input audio stream is fed to a 96 to-48kHz down sampler and the resulting 48kHz signal is encoded using standard core encoder as in Figure 1 A). In the core+extension coding scheme that we first introduced in (4):• The core data is fed to a local core decoder whose output is up sampled in a 48-to-96kHz interpolator resulting in an interpolated core LPCM audio, denoted as signal “2” in Figure 1 A). Both operations are performed in the “Reconstruct Core Audio Components” block.
• In the “Generate Residuals” block the interpolated core audio is subtracted from the delayed version of input 96kHz LPCM audio (signal “1”) to generate the LPCM residual (signal “3”). The “Preprocess Input Audio” block performs the delay operation.
• The extension encoder (“Generate Extension Data” block) processes the residual LPCM signal and outputs the extension data. This data, along with the core data, is assembled in a packer to produce a core+extension bit stream.
< < <

The above describes the encoder operation.
It's a bit like the following:

Original data:
10 12 14 16 14 12 10 8

After filtering and downsampling you get the core data:
11 15 13 9

The upsampling algo gives:
11 12 15 14 13 11 9 8

Subtracting that from the original gives the extension data:
-1 0 -1 +2 +1 +1 +1 0

Pack core data + extension data into the DTS stream.
Decoding is fairly easy.
A standard DTS decoder just decodes the core data.

A DTS9624 decoder decodes the core data, interpolates it with the same algo as used in the encoder, and adds the extension data. (A short worked sketch of the same toy numbers follows below.)
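Putting the toy numbers above into a few lines of code (the interpolated values are taken straight from the example rather than computed, since the exact filter and interpolator aren't specified; this is just an illustration, not the DTS algorithm):

original = [10, 12, 14, 16, 14, 12, 10, 8]

# Encoder: filter + drop every other sample (here: average each pair).
core = [(original[i] + original[i + 1]) // 2 for i in range(0, len(original), 2)]
assert core == [11, 15, 13, 9]

# Encoder and decoder use the same interpolator; the example gives:
interpolated = [11, 12, 15, 14, 13, 11, 9, 8]

# Extension data = original minus the interpolated core.
extension = [o - i for o, i in zip(original, interpolated)]
assert extension == [-1, 0, -1, 2, 1, 1, 1, 0]

# DTS9624 decoder: interpolated core + extension restores the original.
assert [i + e for i, e in zip(interpolated, extension)] == original

# A standard DTS decoder would simply output 'core' at the 48kHz rate.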
It's an elegant solution.
The 48kHz core data is a decent downsampled audio signal.
The 96kHz reconstruction is augmented by the additional extension data, which requires a lower bitrate; difference data generally requires far fewer bits to encode.
... you probably did not realise this, but my post was agreeing with your view and in fact was trying to clarify it.

I think you got so used to contradicting everything I say that you don't even read what I post anymore; you automatically assume I'm criticizing you. Of course, you are "usually" wrong and "deserve" to be corrected, but that's another topic for another day :-)
I wasn't sure you understood what I meant:

> >
Frank is wrong (or at the very least misleading) when he suggests the core data decoding reconstructs "all" the samples. As we know, DTS is a lossy encoder, so the reconstructed samples merely approximate the 48kHz stream.
< <

I just tried to clarify that I didn't suggest what you wrote in the quote above.
*** It clearly indicates that the extension data is used to reconstruct all the samples in the decoder. ***

No.
The extension data will never reconstruct *all* the samples, or even *any* of the samples. It's a lossy process; you never get back the original 96/24 stream, only a close approximation.
I never implied a lossless process.

My rough outline of the encoding process included the psychoacoustic lossy encoding of the 24/96 data as a first step.
It's obvious to all that information lost after that stage can never be reconstructed.
... that statement would still be wrong, when taken literally.

Don't worry, Frank, I know what you meant (and I said so, remember?). I'm just teasing you as usual :-) No need to get into your usual "tortured justifications" :-)