sample rate

by **Old School** » Tue Feb 06, 2024 10:13 pm

Hi All,
I could have brought this up in another thread, but didn't want to hijack the OP's question. Y-My-R said

"unless you're working for video/film, I see no reason at all to work in 48 kHz, and as stated in various posts before, the seemingly "higher" sample rate, at least in theory makes a "for CD" end-result worse, because the re-sampling and "odd math" that happens in the process (i.e. the pulse=on/pulse=off stuff I mentioned in that other post fall in-between all the time, and create an additional noise floor, that wouldn't be there, if you'd start in 44.1 kHz to begin with)."

First let me say that I agree that there is no audible difference between 44.1 and 48k. But... there seems to be a big difference if you are going to do any kind of deep processing like pitch correction or pitch shifting. I record a lot of vocal groups. Most all of them require pitch correction, and I noticed quite early in my career (first with Antares Auto-Tune and later with Melodyne) that this kind of processing introduced a lot more noise at 44.1k than on projects recorded at the higher sample rate. Perhaps this could have just been in my head, but this was later confirmed to me when I started getting requests from groups that sang with prerecorded music tracks to raise or lower the pitch of the tracks. If a group sent me an MP3, the results were terrible compared to starting with a WAV file of the same track. I could save myself a ton of hard drive space if I went back to 44.1k, but I think my vocals would suffer. I would like to know the experience of others in this area.

Have a blessed day,
Mike

by **Y-my-R** » Wed Feb 07, 2024 2:42 am

It's entirely possible that the algorithms used in Melodyne or Auto-Tune have their "math" be better suited for one sample rate over another. I do use Melodyne Assistant, occasionally, but haven't noticed added noise.

How do you move the audio from the DAW (or recorder) to Melodyne? Were those signals ever converted between 44.1 and 48 kHz in the process? If so, that conversion "could" be causing the added noise.

When I use Melodyne, it's usually via direct ARA integration between Studio One and Melodyne. Again, I haven't noticed any noise being added, but I have also never compared this to how it is when using 48 kHz, because I don't usually use that sample rate.

If there indeed IS an audible difference and Melodyne/Autotune etc. DO sound better when working with 48 kHz, then that IS of course a very good reason to use 48 kHz. I just never heard about that, and never noticed any added noise when working in 44.1 kHz.

Incidentally, I am at the exact point in the project I'm working on right now, where the next step is to fine-tune the vocals with Melodyne. I recorded all that in 44.1 kHz/24-bit though... I'll pay very close attention while doing the pitch edits, if there's any noise added in the background... and if I do notice that, will do another test recording at 48 kHz and see if that's better (it's my own vocals I need to tune a little, so "that singer" is always available, hahaha).

But just to further comment on why I said what I said... let's say you have 48 "samples" for easier comparison (instead of 48,000), that are represented by the following "dots":

................................................
but you need to convert this to 44(-ish) samples, while the time-line (aka the length of the line above) needs to stay exactly the same. I can't do this in this text editor, but for comparison's sake, here are 44 dots in a "bold" font, in the hope that this will come out longer (sorry... can't do the .1 dot in this example):
............................................

Hopefully it's visible that these won't align if the 44 dots need to fill out the same length/distance. Here's the same thing again, one below the other, with the 44 dots on the bottom in bold, so they hopefully use up more space:

................................................
............................................

[Hmm... I just submitted and took a look... unfortunately, the 44 dots in bold don't make it much longer... just try to imagine how the spaces between the dots in the lower line, would have to get larger to make it just as long as the upper line, and think about how those dots will NOT align above/below each other].

If you try to "press" the 48 dots, into the 44 dot shape, but at the over-all same length (i.e. "time" in the real world) a lot of the dots will overlap or fall in-between dots, etc.

...and that's what's happening when you resample 48 kHz to 44.1 kHz - but every time there's a partial overlap, every such overlapping "new" dot position, adds a little bit of noise, that is usually called "quantization noise."

Or in other words, if you overlay that nice square wave that makes up the word clock and sample frequency at 48 kHz with a signal at 44.1 kHz, it doesn't match up at all. And to put them in the right spots, you need to "hack" the square-wave apart, because you can't just shift the timing of each of those samples around and drop a few.

All that "in-between" placement or "chopping up" of the previously nice and even word clock chain of pulses, now gets some noise added, because the math is not even between 48000 and 44100, and that creates noise.
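As a rough illustration of the dot analogy above, here is a minimal Python sketch that counts how many samples at the new rate land exactly on a sample of the old rate within one second. The function name `aligned_samples` is made up for this example. It only checks where the sample instants coincide; it does not model how a real sample-rate converter interpolates and filters, and it does not measure any noise.

```python
from fractions import Fraction


def aligned_samples(src_rate: int, dst_rate: int) -> int:
    """Count destination samples in one second that coincide with a source sample."""
    count = 0
    for j in range(dst_rate):
        # Position of destination sample j, expressed in source-sample units.
        pos = Fraction(j * src_rate, dst_rate)
        if pos.denominator == 1:  # whole number: lands exactly on a source sample
            count += 1
    return count


# The rate ratio reduces to 147/160, so the two grids only meet
# once every 147 output samples (160 input samples).
print(Fraction(44100, 48000))         # 147/160
print(aligned_samples(48, 44))        # the 48-dot vs 44-dot toy example: 4
print(aligned_samples(48000, 44100))  # the real rates: 300
```

In the toy example only 4 of the 44 dots line up, and at the real rates only 300 of the 44,100 output samples per second sit exactly on an input sample. All the others have to be computed from neighbouring input samples, and how well that is done depends on the converter.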

I mean... I hope Melodyne or other "tuning" apps like that don't resample internally. That would explain the added noise. But as mentioned before, I could totally imagine that the Melodyne algorithms work better, "math-wise", with one sample rate over another. But I guess only the Melodyne guys would know (... not sure if they'd admit that, though, if so).

But again, if it is as you describe (less noise added at 48 kHz when using "tuning" apps like this), then that is absolutely a valid reason to work in 48 kHz. Especially if you stack a lot of tracks, where the noise would get stacked as well, and potentially become a lot more audible, than when doing a single sample rate conversion from the stereo-master, when the mix is done.
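To put a rough number on the stacking point, here is a small sketch under a loud assumption: every track carries the same amount of added noise, and the noise on each track is uncorrelated with the others. Under that assumption the noise powers add, so N tracks raise the noise floor by 10·log10(N) dB. The function name is hypothetical, and real processing artifacts may not behave like uncorrelated noise.

```python
import math


def stacked_noise_gain_db(num_tracks: int) -> float:
    """Rise in noise floor (dB) when summing equal-level, uncorrelated noise."""
    # Assumption: uncorrelated noise adds in power, not in amplitude.
    return 10.0 * math.log10(num_tracks)


for n in (1, 4, 16):
    print(n, "tracks:", round(stacked_noise_gain_db(n), 1), "dB")
```

Under these assumptions, four stacked vocal tracks would lift the added noise by about 6 dB compared with a single track, and sixteen tracks by about 12 dB.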

Anyway... I don't want to come across like an annoying know-it-all, and totally stand corrected if there's reasons like that, to record in 48 kHz. But the resampling from 48 kHz to 44.1 kHz for "for CD" productions, would otherwise diminish the signal more, than the theoretical quality gain at the "higher" sample rate of 48 kHz.
It's only about a semitone and a half of difference, if you play a 44.1 kHz sample at 48 kHz (without conversion) or vice versa. So, that difference is not going to capture the sort of "extra harmonic overtones" or make the waveform representation smoother to a point where this "could" matter, such as when using much higher sample rates such as 88.2, 96 or even 192 kHz.
From that point of view, the difference between 44.1 and 48 kHz is meaningless.
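A quick way to check the pitch offset you get when 44.1 kHz audio is played back at 48 kHz without conversion: the speed-up equals the ratio of the two rates, and a frequency ratio converts to semitones as 12·log2(ratio). The helper name below is made up for this sketch.

```python
import math


def playback_shift_semitones(recorded_rate: float, playback_rate: float) -> float:
    """Pitch shift in semitones when audio is played at a different sample rate."""
    return 12.0 * math.log2(playback_rate / recorded_rate)


shift = playback_shift_semitones(44100, 48000)
print(round(shift, 2))  # about 1.47 semitones up
```

Going the other way (48 kHz material played at 44.1 kHz) gives the same amount with the opposite sign, so the pitch drops by about 1.47 semitones.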

...if there are more artifacts at 44.1 when doing certain types of edits, that is, of course, totally valid, though. I haven't noticed myself, yet, but will try to pay attention when I do the upcoming vocal edits/tuning.

by **Y-my-R** » Thu Feb 08, 2024 1:26 am

...I ended up looking at this post again, and totally missed, that you had said that it was better when working off of a WAV file than an MP3 file.

Well - yes. The way MP3 encoding works, is by removing audio "data" that is "masked" anyway and not really audible. But when editing the audio, that missing data can become a problem and you can get all kinds of artifacts (also when trying to EQ specific frequencies, etc. Certain things you might have been able to "work out towards the front" with an EQ on the WAV file, would just no longer be present enough to be able to boost it in the MP3).

You might be familiar with the FLAC audio format. This is supposedly "lossless compression" - but even with that, I get artifacts when editing, that are not present when working from a WAV file. I do sometimes have to pitch-correct FLAC-based files (but usually in whole semitone steps, and not just fine-tuning), and DID notice odd clicks in the audio, that are NOT present, if I do the same thing with the original WAV file (in the rare cases that I have that available).

Before that happened to me, I always thought that FLAC does NOT work on the basis of removing "masked" audio data that is inaudible, and that it's basically like a WAV file, but smaller. But from my personal experience, that is only the case when you leave the file alone, and don't do any edits that seriously alter the audio material - so, just as you described.

I have never noticed anything like that when editing with Melodyne in 44.1 kHz off of a WAV file, though. I worked on pitch corrections for a few hours, yesterday (WAV files at 44.1/24-bit) and didn't notice any added noise. I mean... as usual with Melodyne, it's best to FIRST clean up the track, remove breathing sounds (depending on the material... sometimes you want to leave that in, of course - in this case, I removed all of that) or lip smacks and stuff like that, and THEN import into Melodyne. Otherwise, you have all these atonal blobs in Melodyne, that are extra work to clean up, that would have been easier if removing that stuff before moving the audio to Melodyne. (I cleaned the vocal tracks first, yesterday, bounced the track, then opened the bounced track in Melodyne via the ARA integration).

I thought you were talking about some minimal, very quiet noise in the background, and almost "broke my ears" trying to hear what wasn't there. But if you were talking about pitch-shifting MP3s in Melodyne, this will create OBVIOUS and really ugly noise, that often sounds "washed out" or like some oscillation going on in the high frequencies. Similar to how it sounds when exporting an MP3 at a very low rate, like 64 kbps (or if you use a REALLY old MP3 encoder at a "modest" rate, such as 128 kbps. The early MP3 codecs from the late 90s were nowhere close in quality to the newer ones and made regular playback sound washy with modulations in the high frequencies... like, every hi-hat hit got a different pitch and noise over it. That improved a lot over the years with newer codecs, but is still present to some degree).

If I do anything in MP3, I always go with the maximum available rate at 320 kbps... but even with those, you'll get artifacts if doing stuff like pitch-editing. I just sometimes send out pre-mixes as MP3s at 320 kbps.

But even with those, you CAN hear the difference to uncompressed WAV files - even during regular playback without editing (it's not obvious, though, just very subtle).
When I mentioned that 44.1/48 to 88.2/96 kHz shootout we did at a past company, that EVERYONE failed, there was actually a third option in there, which were MP3s at the maximum rate (but this was around 2006 or so... the codecs were still worse, then).
Well over half the people in the room, were able to pick out the MP3 in the blind test... just not the 44.1/48 vs 88.2/96 change (and many of those who failed the MP3 test, weren't audio people but like, the front desk lady or repair personnel with no specific audio background, etc. - but also SOME audio guys, haha. I think I failed a couple of the MP3s, but picked out most of them.)

So, did you actually compare 44.1 WAV files with 48 kHz WAV files?
Or just MP3s that were based on 44.1 kHz, compared to 48 kHz WAVs?

I thought you were talking about 44.1 vs 48 kHz WAVs - you can't compare WAVs and MP3s, because of the frequency masking that becomes apparent when doing edits like that.

A similar thing applies to MiniDiscs, which were popular for a while. They were removing stuff based on frequency masking as well, and because of that, fit the music on those smaller discs. I skipped that entirely, because of the data reduction. The smaller size (compared to regular CDs) wasn't convenient enough to justify the purchase, if I can't even use that for archiving, since I wouldn't be able to edit the audio from that anymore (after copying back to the computer), without risking that there will be artifacts.
But I guess if you didn't do any edits to the audio coming off of MiniDiscs, you wouldn't notice that there was a problem.

Anyway, are we talking about 44.1 WAV vs 48 kHz WAV pitch editing? Or are we comparing "uncompressed audio" (aka WAV) vs highly compressed audio on the basis of frequency masking/removal (aka MP3s)?

by **Old School** » Sat Feb 10, 2024 12:52 am

Hi,
Yes, I think there is a difference in generated artifacts when I'm using Melodyne for pitch correction, but let me qualify that statement. Not much difference when pitch correcting most lead vocals (most of them are never off by more than 30 cents). But I record a lot of quartets and they don't always get their parts right, so sometimes I might have to move a word or two quite a bit to get everyone on the right part. And then there is the phrasing: I have to make the harmony parts track the lead vocal as closely as possible, and sometimes the lead vocal puts a move in the middle of a word that the other parts didn't track, so I have to "make" them slur that word, and there are lots of notes that I have to stretch or shorten in time. I know this sounds time consuming and it is, but it is still faster than doing a bunch of takes and trying to get a singer to "unlearn" something that he learned wrong. In the singers' defense, these guys learn these cover songs by listening to the original tracks, and sometimes the changes are very quick and masked by the music. At any rate, they happily pay me to "fix it in the mix", and so my Melodyne processing is severe and sometimes these background vocals have a grit or graininess to them that is more apparent at 44.1k than 48k. At least it seems that way to me. I guess we're always going to get noise from somewhere, but in the genre of music I deal in the most, the vocals are always the focus of the mix and I believe any noise there will affect me more than the conversion math. At any rate, I don't really hear a background noise floor in my final mixes, so even if I am hallucinating, I'm still good.

Have a blessed day,
Mike
