Audio Asylum Thread Printer
In Reply to: Does DVD-A have significant hardware and firmware advantages? posted by Jim Pearce on April 28, 2003 at 10:46:30:
32-bit floating point PCM DSPs with good algorithms can give very good results. After the math is applied, the signal must be rounded back to 24 bits, and dither is used to linearize the result.
The cost of this is 2 of the 24 bits available.
The signal degradation is insignificant if everything is done right. There is no such thing as a native DSD DSP.
After a processing step, DSD always ends up as PCM. To get back to a 1-bit stream, dithering is needed, which requires at least a 3-bit PCM sample, otherwise the signal is compromised too much.
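The round-back-to-24-bits-with-dither step described above can be sketched in a few lines. This is a generic TPDF requantizer for illustration only, not any particular vendor's implementation; the function name and signal are made up:

```python
import numpy as np

rng = np.random.default_rng(0)

def dither_to_24bit(x):
    """Requantize a float signal in [-1, 1) to 24-bit integers,
    using TPDF dither to linearize the rounding error."""
    scale = 2 ** 23                      # 24-bit signed full scale
    # TPDF dither: two uniform sources summed, +/-1 LSB peak
    d = rng.uniform(-0.5, 0.5, x.shape) + rng.uniform(-0.5, 0.5, x.shape)
    q = np.round(x * scale + d)
    return np.clip(q, -scale, scale - 1).astype(np.int32)

# A 1 kHz tone at -6 dBFS, processed as 32-bit float, taken back to 24 bits.
x = (0.5 * np.sin(2 * np.pi * 1000 * np.arange(4800) / 48000)).astype(np.float32)
y = dither_to_24bit(x)
```

The dither adds at most about 1 LSB of error on top of the half-LSB rounding, which is the "2 bits of cost" trade the post refers to: a slightly higher but decorrelated noise floor instead of truncation distortion.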
With too many processing steps in DSD, signal degradation becomes noticeable very quickly. This includes ALL processing steps in the complete chain. The ill effects of DSD editing during mastering and the steps applied in the player, for bass management filtering or even level adjustments in the digital domain, are cumulative. Another problem is signal headroom. If two or more signals are added, the DSD signal can saturate. They strongly advise keeping the signal level below -6 dB of the full input range to prevent trouble.
PCM processing doesn't have this problem; a 24-bit signal handled by a 32-bit processor has plenty of processing headroom. There are DSD 'DSP' solutions implemented for simple crossfades.
The output is switched between the native DSD stream and the processed stream during the crossfade, and when that's finished, the signal is switched to the other native DSD stream. This switching produces offset clicks at a -50 dB signal level.
According to the Philips engineer who presented this solution, these clicks are insignificant.

Frank
Follow Ups:
Another problem is signal headroom. If two or more signals are added, the DSD signal can saturate. They strongly advise keeping the signal level below -6 dB of the full input range to prevent trouble.
PCM processing doesn't have this problem; a 24-bit signal handled by a 32-bit processor has plenty of processing headroom.
I fail to see how this is a problem for DSD but not PCM. The fact that a 32 bit DSP has 32 bits doesn't make any difference if the 24 bit source is fed into the 24 most significant bits of the DSP.
But anyway, I think an important point about bass management is getting lost in this discussion. Even if you use a 128 bit DSP running at 384 KS/s, you're still not going to get transparent bass management. The filter characteristics aren't dependent on the sample size or rate. Adding additional filters into the playback chain is always going to degrade the sound somewhat no matter how they're implemented.
Add two 24-bit PCM words and you could end up with a 25-bit word if the result overflows 24 bits. With a 32-bit DSP that's not a problem. You need a lot of processing steps to run out of bits.
Feeding the 24 bits into the most significant bits of the 32-bit processor would be stupid. The end result has to be dithered down to 24 bits only once. Here you lose some of the bits and the S/N ratio degrades a little. But with 22 bits left (2 are used by the dithering) it's still very transparent and very linear.
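A toy illustration of that overflow point: summing two near-full-scale 24-bit samples needs a 25th bit, which a 32-bit accumulator has to spare, while a fixed 24-bit word would wrap around. The names here are just for illustration:

```python
FS = 2 ** 23                  # 24-bit signed full-scale magnitude

a = FS - 1                    # two samples near positive full scale
b = FS - 1

mixed = a + b                 # in a 32-bit accumulator: fine, just uses a 25th bit

def wrap24(x):
    """What a bare 24-bit accumulator would do instead: wrap around."""
    return ((x + FS) % (2 * FS)) - FS
```

`wrap24(mixed)` comes out as a small negative number, i.e. a loud positive peak turns into a full-scale discontinuity, which is exactly why the extra accumulator bits matter for mixing.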
The actual algorithms implemented in the DSP are very important too.
Bad filter design in a DSP results in sonic degradation. With DSD it gets tricky. There really is little headroom for arithmetic.
In practice you end up with requantization or dithering down to 1 bit on many occasions after a processing step. And each time the signal deteriorates. The moment all the bits in the DSD stream are used up expressing a high signal level, few or no bits are left to express the finer details of the signal.
"Even if you use a 128 bit DSP running at 384 KS/s, you're still not going to get transparent bass management"
In theory no, because there is still finite resolution.
But in practice it will be very transparent if correct filter arithmetic is applied.
128 bits is a LOT of resolution and dynamic range.
A 32-bit dynamic range is already deadly.

Frank
Add two 24-bit PCM words and you could end up with a 25-bit word if the result overflows 24 bits. With a 32-bit DSP that's not a problem. You need a lot of processing steps to run out of bits.
Feeding the 24 bits into the most significant bits of the 32-bit processor would be stupid.
Frank, if you use the extra 8 bits as MSBs as you seem to suggest, you haven't eliminated any issues stemming from a lack of headroom on the input. You've only temporarily avoided the issues until you have to fit the result back into 24 bits, at which point you will either need to attenuate or compress the result. Further, without any additional LSBs, any processing steps that involve roundoff error will add to the noise floor. That defeats the whole purpose of using 32 bits. If the input & end result are 24 bit, then the advantage of doing intermediate processing at 32 bits is specifically to make sure that any noise and distortion introduced during processing is well below the 24 bit noise floor.
I believe that if you ask around, you'll find that it's industry standard practice to leave 2-3 bits of headroom in the final result during PCM mastering and 6 dB of headroom in the result of DSD mastering. I believe you'll also find that when higher bit depths are used during intermediate processing, the extra bits are usually used on the LSB end.
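For reference, the bit and dB figures quoted here line up because each PCM bit is worth about 6 dB of dynamic range:

```python
import math

# 20*log10(2) dB per bit of word length
db_per_bit = 20 * math.log10(2)          # about 6.02 dB

# "2-3 bits of headroom" in PCM mastering is roughly 12-18 dB,
# and the 6 dB recommended for DSD corresponds to about one bit.
pcm_headroom_db = (2 * db_per_bit, 3 * db_per_bit)
```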
With DSD it gets tricky. There really is little headroom for arithmetic. In practice you end up with requantization or dithering down to 1 bit on many occasions after a processing step. And each time the signal deteriorates.
The moment all the bits in the DSD stream are used up expressing a high signal level, few or no bits are left to express the finer details of the signal.
I don't want to argue this too much since I really have no idea what Sony is doing in their DSD DSP. I would only point out that Sony recommends leaving 6 dB of headroom in the master specifically to avoid this sort of problem in the playback system.
I said:
"Even if you use a 128 bit DSP running at 384 KS/s, you're still not going to get transparent bass management"
To which you replied:
In theory no, because there is still finite resolution.
But in practice it will be very transparent if correct filter arithmetic is applied.
128 bit is a LOT of resolution and dynamic range.
A 32 bit dynamic range is already deadly.
Sure, 32 bits is more than enough given that the end result has to fit back into 24. But my point is that the desired filter characteristics don't depend on how precise the arithmetic is. No matter how many bits you use to represent a sample, the filter designer faces the same tradeoffs between the magnitude response, phase response, and other factors. Implementing the filter in 32 bit arithmetic instead of 24 bit arithmetic only means that the noise you introduce through roundoff error is much lower. It doesn't change the filter's frequency response, and thus it doesn't eliminate any audible artifacts of that response.
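A small numeric check of that point: the magnitude response of a low-pass biquad (RBJ cookbook formulas; the 80 Hz corner, 48 kHz rate and Butterworth Q are arbitrary choices for illustration) barely moves when its coefficients are rounded to single precision, because the response is set by the coefficients, not by the arithmetic word length:

```python
import numpy as np

def lowpass_biquad(f0, fs, q=0.7071):
    """RBJ-cookbook low-pass biquad; returns normalized (b, a)."""
    w0 = 2 * np.pi * f0 / fs
    alpha = np.sin(w0) / (2 * q)
    cw = np.cos(w0)
    b = np.array([(1 - cw) / 2, 1 - cw, (1 - cw) / 2])
    a = np.array([1 + alpha, -2 * cw, 1 - alpha])
    return b / a[0], a / a[0]

def magnitude(b, a, f, fs):
    """|H(f)| of the biquad, evaluated directly on the unit circle."""
    z = np.exp(-2j * np.pi * f / fs)
    num = b[0] + b[1] * z + b[2] * z ** 2
    den = a[0] + a[1] * z + a[2] * z ** 2
    return abs(num / den)

b, a = lowpass_biquad(80.0, 48000.0)
h64 = magnitude(b, a, 80.0, 48000.0)                    # double precision
b32, a32 = b.astype(np.float32), a.astype(np.float32)   # coarser coefficients
h32 = magnitude(b32, a32, 80.0, 48000.0)
```

Both evaluations sit at the designed -3 dB point; what extra precision buys is a lower roundoff noise floor during filtering, not a different (or more transparent) response shape.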
Dave
"Frank, if you use the extra 8 bits as MSBs as you seem to suggest, you haven't eliminated any issues stemming from a lack of headroom on the input."

I assume that the recorded material uses the dynamic range to its fullest potential, peaking, of course, just under clipping.
Utilizing more defensive methods during recording to prevent problems in the rest of your production chain is fine. But they should consider upgrading to better processing equipment, like 32-bit FP processing.

"You've only temporarily avoided the issues until you have to fit the result back into 24 bits, at which point you will either need to attenuate or compress the result. Further, without any additional LSBs, any processing steps that involve roundoff error will add to the noise floor."

That's actually the case, but it's the best you can do to fit the resulting data back into 24 bits with minimal signal degradation.
"That defeats the whole purpose of using 32 bits."
Not at all; the 32-bit processing provides the headroom to get the maximum result from the recording.
It also obsoletes the 'old school' thinking of the defensive recording practices that were necessary to circumvent the 'processing' bottleneck (be it analog or digital).
"If the input & end result are 24 bit, then the advantage of doing intermediate processing at 32 bits is specifically to make sure that any noise and distortion introduced during processing is well below the 24 bit noise floor."

Exactly.
"I believe that if you ask around, you'll find that it's industry standard practice to leave 2-3 bits of headroom in the final result during PCM mastering and 6 dB of headroom in the result of DSD mastering."
"I believe you'll also find that when higher bit depths are used during intermediate processing, the extra bits are usually used on the LSB end."
I don't think that's true for processing.
For sample rate conversion, attenuation and filtering processes, it's best to use the extra bits as LSBs.
But for mixing (summing) and gain applications, the extra bits are best used to maximize the headroom.
In practice the extra bits should be split equally between both ends of the PCM word.
In the last stage the file could be adjusted to use the maximum bit width by a fairly 'simple' bit shift before dithering it down to 24 bits.
This will always ensure maximum performance from the DAC used in the players.
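Frank's split-guard-bits idea could be sketched roughly like this. The 4-bit footroom, the signal level, and the gain step are all made-up illustrative choices, not a documented workflow:

```python
import numpy as np

rng = np.random.default_rng(1)
FS24 = 2 ** 23                   # 24-bit signed full-scale magnitude
GUARD = 4                        # LSB guard bits inside the wider word

# A quiet 24-bit signal: plenty of unused headroom bits on the MSB side.
x24 = (0.05 * FS24 * np.sin(np.linspace(0, 50, 4800))).astype(np.int64)

# Working format: shift up, leaving footroom bits to catch roundoff.
work = x24 << GUARD

# A processing step whose remainder lands on the guard bits.
work = (work * 13) // 10         # roughly +2.3 dB of gain

# Final stage: bit-shift the peak up to just under full scale, then
# TPDF-dither the guard bits away to land back on a 24-bit word.
peak, shift = int(np.max(np.abs(work))), 0
while (peak << (shift + 1)) < (FS24 - 1) << GUARD:
    shift += 1
work = work << shift
d = rng.uniform(-0.5, 0.5, work.shape) + rng.uniform(-0.5, 0.5, work.shape)
out24 = np.round(work / 2 ** GUARD + d).astype(np.int32)
```

The final shift is the "fairly simple bit shift" of the post: it maximizes how much of the 24-bit output word the signal actually exercises before the one-and-only dither pass.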
Frank
My Denon AVR-4800 uses an analog filter on an "analog direct" input stream, processes the bass only in the digital domain and then mixes in analog. The only effect I hear when it is turned on or off is more or less bass. Similarly, the analog "bass management" performed by the Paradigm X-30 active crossover which comes with my Servo 15 subwoofer comes at the cost of a subtle brightness which is just barely detectable.
Jim,I didn't mean to suggest that all filters are highly audible, only that implementing them in the digital domain doesn't eliminate the basic tradeoffs associated with finding the most transparent filter response.
Also, I've found that for me, the only kind of bass management I can live with is to run all of my speakers full range and use a sub to fill in the bottom end of their response. This seems to work best with subs that are designed to be run this way, as they tend to have a gentler LPF which blends better with the rolloff of the mains. For a long time, I bought into the theory that you need to HPF the mains because sound quality would somehow suffer if they were driven into LF excursions below their cutoff. But now I find that any kind of traditional bass management involving an LPF on the sub and a HPF on the mains (typically with fairly steep matched rolloffs) compromises low frequency imaging somewhat and makes the sub stand out more rather than blending in transparently. Obviously, this is less of an issue the lower your mains go, and yours go lower than mine, so your mileage may vary.
Dave
I'm not sure you should only focus on the 1-bit issue; there are a lot of interesting ideas in the sample rate conversion. Several posts (by Graemme, Stephen) were made recently on the HRH which I really found useful. I had not realized that the sampling rate of DSD provides the best conversion path from and to any sampling rate of the 44.1 family, and probably the best conversion from 44.1 or 88.2 to 48 and 96k, using only integer multiples. If it wasn't for technical limits (upsampling to 14.1 MHz is not easy, obviously), I think this methodology used in DSD could apply to PCM. Anyway, I'm sure you already know that stuff :)
Best
That's common knowledge. Oversampling 44.1 at 256 times was done 10-15 years ago in 1-bit DACs with noise shaping.
The stupid part of DSD is that it cannot handle 48k and its multiples.
And these are becoming the standard sample rates for the future.
Frank, I'm not discussing the 1-bit aspect, nor the noise shaping (I'm not qualified), just the "translation" path. Correct me if I'm wrong:
Couldn't the DSD conversion tables work both ways? (in theory at least) So if this is correct it should be able to handle 48k and 96k "naturally", only through integer multiples. If I understood the information posted on the HRH:
48k x 294 = 14,112,000 / 5 = 2,822,400
2,822,400 / 64 = 44.1k
Whether this can be implemented in practice, I have no idea. I just wanted to say that this is also a conversion path between the two PCM families, and probably better than going through interpolation.
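The integer chain quoted here does check out; a few lines make the arithmetic explicit (the `ratio` helper is just an illustrative name for the standard reduce-by-GCD step behind any rational resampler):

```python
from math import gcd

DSD64 = 64 * 44100               # 2,822,400 Hz, the DSD sample rate

# The chain quoted above, in integers only.
assert 48000 * 294 == 14_112_000
assert 14_112_000 // 5 == DSD64
assert DSD64 // 64 == 44100

def ratio(f_in, f_out):
    """Smallest (upsample L, downsample M) pair linking two rates."""
    g = gcd(f_in, f_out)
    return f_out // g, f_in // g
```

`ratio(44100, DSD64)` gives `(64, 1)`, the trivial power-of-two path for the 44.1 family, while `ratio(48000, DSD64)` gives `(294, 5)`, the path in the post; note that going directly between the families, `ratio(44100, 48000)` is `(160, 147)`, which is why a clean intermediate rate is attractive.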
If you maintain a 24 or 32 bit depth, and if you have to store any information at those crazy rates during the intermediate steps of your manipulations, however, you probably need a server and a rack of 120Gb hard discs..
With the advent of peer-to-peer and grid computing, this kind of processing is just around the corner.
Just musing...
Best
Eric
I think it is safe to say that Pioneer found that conversion of DSD to 88.2/32 PCM and leaving the signal in PCM was more transparent and more convenient than Sony's 1 bit method. This is clearly a lossy conversion, as is any conversion of DSD to PCM other than 1 bit. But, if you do a 1 bit conversion you run into the headroom issue. Conversions from PCM to DSD, on the other hand, should be lossless provided that they are feasible.
as you know I don't trust Pioneer all that much with their implementation of standards :)
But it makes sense, because DSD is apparently compatible with all variants of the PCM 44.1 family.

Best
Eric,

It's only integer multiples of the sampling frequency; it's decidedly not integer arithmetic for the actual data reduction.
As a side effect of the DSD --> PCM translation, you are left with the sonic signature of DSD (diminishing SNR as a function of frequency) on the resultant PCM output, since PCM has a linear noise response inside the passband.
No one has yet demonstrated that the process of Super Bit Mapping Direct gives the same result as a capture at the native rate.
Keep in mind, that to do an SBM Direct transfer, you have to do a 5x oversample -- it cannot be done natively. In other words, insufficient data exists in and of itself within the base sampling rate to reproduce a substantially lower fs.
Further, if you want to go to 192kHz you have to use 5x, followed by 2x oversampling.
In the end, you have a 147:24 reduction for 96K PCM (from 5x oversampling) and a 147:24 reduction for 192K PCM (from 10x oversampling).
Here I have assumed that sampling depth follows current practice, and results in a 24-bit PCM word.
Just my opinion!
I was not discussing the transfer from DSD to PCM, but only the methodology, which makes sense (to me). There is no reason you could not follow the same methodology, even retaining a 24-bit or 32-bit word. The conversion paths should work even within and between the two main PCM families (forget the Laserdisc for now :)

Just a quick note on 192k: it would probably make more sense to oversample the final 96k by 2x than go through 28 MHz.
Best
And yet, the SBM Direct process does indeed do a 5x, followed by a 2x oversample, to generate enough data points for a 192K datastream.

There is (IMO) no point in 2x oversampling a 96K PCM datastream to get to the 192K datastream.
As for a final value... As I understand it, it's merely a summation of 1s (+2 V) and 0s (-2 V) divided by the number of samples.
nsamples == 147
1 samples == 100
0 samples == 47
Aggregate voltage == 200 - 94, or 106.
Divide the aggregate voltage by the number of samples (106 / 147), which is approximately 0.72; this gets encoded as a signed 22-bit value, then dithered out to 24 bits.
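John's worked example can be run directly. The 147:1 ratio and the ±2 V mapping come from the post; the function name and everything else are a toy illustration, not Sony's actual SBM Direct code:

```python
def decimate_block(bits):
    """Average a block of 1-bit samples mapped to +/-2 V, as in the
    summation described above."""
    volts = sum(2 if b else -2 for b in bits)
    return volts / len(bits)

block = [1] * 100 + [0] * 47          # 147 samples: 100 ones, 47 zeros
v = decimate_block(block)             # (200 - 94) / 147, about 0.72 V
```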
I know how the basics of the process work; I just don't think it's a sound idea above 48kHz fs. I can't prove it either ;-)
Regards,
John

Your knowledge of DSD is far greater than mine.
The increased I/O processing added for that extra x2 step is enormous. There must be a good reason why there is an oversampling of x5 and x2 instead of x2 on the final 96k result, probably to prevent interpolation as much as possible.
The more I think about it, the more I find it intriguing. 192k is the only rate where things do not translate smoothly with that method.

As for the rest, I hope someone can challenge your post better than I can. I'm out of my league at this point. : )
(Damn, where's Frank?)
Best
You don't get more information from DSD by oversampling it five times.
The oversampling is done to make the applied math easier.
Nope, I'm talking about having sufficient data to make the reduction meaningful.
"No one has yet demonstrated that the process of Super Bit Mapping Direct gives the same result as a capture at the native rate."

I have listened for differences between the 'redbook' layer of the DSOTM hybrid and the 20th anniversary edition.
The latter has better detail and a wider 'soundstage'. The CD layer of the hybrid has less depth and the instruments are sculpted a bit 'smaller'. It sounds a bit recessed.
I think the CD layer of the new DSOTM is the version from the "Shine On" box set. It would be interesting to compare the files using EAC.

Best
That would make the files different anyway. The differences between the 20th anniversary edition and the redbook layer are very small.
I suspect the same source was used.
I will run some more comparisons at louder volume levels.
I used EAC's "compare WAV" feature between the CD layer of some SACDs and the equivalent "CD-only" edition. For example, I compared the redbook layer of the Ray Brown / Monty Alexander etc. from Telarc with the CD (which also has a good bonus disc of Ray Brown). These two were published by Telarc and were released almost simultaneously. What you find is that all the tracks are identical... except for 638 "missing samples" in each track of the CD edition. These missing samples are located at the very beginning of the track (usually in a silent place); you cannot hear anything in that section when you play it, and I cannot figure out what they are.

At one point I thought of posting this on the HRH to ask what it could be, but I didn't feel like being accused of trolling.
Anyway, I'm afraid record companies won't bother remastering the CD layer, so if you buy a hybrid SACD re-edition, you only get the latest remastered CD version of that title if there was one.
Best
The 20th anniversary set and the new release are new masterings, and therefore can no longer be considered identical.
We'll never be able to hear much more than 20 bits.