I'm shooting some 24p 4K footage on a Sony A7s2, and I am recording audio separately on a Zoom H4. When I go to sync them up in post, I am finding that the audio is ever so slightly slower than the video. When synced, a 20-second clip stays bang on, but when the clip is 2, 3, 4 mins, the audio slowly drifts farther and farther off time from the video.
Does anyone know what setting I have wrong on the Zoom / what I have to adjust to ensure I'm recording video and audio at exactly the same speed?
Is your project set to 24 fps? My guess is that the camera is actually recording at 23.98 fps, so dropping it into a 24 fps timeline speeds up the footage ever so slightly. Try creating a new 23.98 fps project and see whether it syncs up better with your audio.
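If that's the cause, the drift is easy to predict: "23.98" footage is really 24000/1001 fps, so a 24 fps timeline plays it back 0.1% fast. A quick back-of-the-envelope sketch (plain arithmetic, nothing camera-specific):

```python
# Sketch: drift from interpreting 24000/1001 fps ("23.98") footage as 24 fps.
real_fps = 24000 / 1001             # what the camera most likely records
timeline_fps = 24.0
speedup = timeline_fps / real_fps   # exactly 1.001

for minutes in (1, 2, 3, 4):
    drift_ms = minutes * 60 * (speedup - 1) * 1000
    print(f"{minutes} min: video runs ~{drift_ms:.0f} ms ahead of the audio")
```

At roughly 60 ms of drift per minute, the error is invisible on a 20-second clip but obvious after a few minutes, which matches the symptom described.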
Sample Rate is the most common culprit of audio drift and sync issues. Sample Rate is how many "samples per second" the sound is recorded at, like frames per second in video. The problem is, samples don't get handled like video frames: all the audio has to be at the same Sample Rate in the edit to keep everything in sync.
44.1 kHz means the audio is sampled 44,100 times per second; 48 kHz is 48,000 samples per second.
Drop those into a Timeline, and the samples (in all NLEs) are all used and timed out according to the Timeline. So a second's worth of 44.1 kHz samples gets stretched out and re-timed (so to speak) as if it were native 48 kHz.
It would be nice if NLEs could start to treat audio sample rates just like video frame rates and reconform them "properly" to keep sync in a Timeline of a different sample rate.
FCPX.guru wrote: Drop those into a Timeline, and the samples (in all NLEs) are all used and timed out according to the Timeline. So a second's worth of 44.1 kHz samples gets stretched out and re-timed (so to speak) as if it were native 48 kHz.
Can you give an example of when this happens? I've never seen any NLE do this. Mixing 44.1 and 48 kHz audio in the same timeline works fine for me. Ripped CD tracks have always worked as expected in 48 kHz timelines, for instance. If a 44.1 kHz track were retimed to play back at 48 kHz, that would speed it up by almost 9%, which means sync would drift by about one second every 12 seconds. A two-and-a-half-minute song would play back in just about 2:18. Plus the pitch would be way up. I've never seen this behavior.
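The numbers above are easy to check. Here's the arithmetic for what would happen if an NLE really played 44.1 kHz samples back at the 48 kHz clock without resampling:

```python
# Sketch: playing 44.1 kHz samples at a 48 kHz clock, with no resampling.
ratio = 48000 / 44100              # playback speed-up factor, ~1.0884

speedup_pct = (ratio - 1) * 100    # ~8.8% faster
song = 150.0                       # a 2:30 track, in seconds
played = song / ratio              # how long it would actually take to play
secs_per_1s_drift = 1 / (1 - 1 / ratio)

print(f"speed-up: {speedup_pct:.1f}%")                          # 8.8
print(f"2:30 song plays in {played:.1f} s (about 2:18)")        # 137.8
print(f"one second of drift every {secs_per_1s_drift:.1f} s")   # 12.3
```

Since nobody sees 2:30 songs finishing in 2:18 or chipmunk-pitched CD rips, this kind of retiming is clearly not what NLEs do.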
FCPX.guru wrote: It would be nice if NLEs could start to treat audio sample rates just like video frame rates and reconform them "properly" to keep sync in a Timeline of a different sample rate.
This is exactly how FCPX works, in my experience.
The way sample rate does affect sync is that different devices, using their own clocks, don't necessarily agree perfectly. That is, 1 second on one device might be considered 1.002 sec on a second device; or 48,000 samples/sec on device A is really 47,904 samples/sec by device B's clock. When recordings from both devices are played in a single timeline, both tracks are played back at 48 kHz, so they drift out of sync. So it is the difference in the devices' actual clock speeds, more than the sample rate settings, that causes the issue.
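To put numbers on that, here's a quick sketch using the illustrative figures above (a 0.2% clock error is far larger than typical hardware, which is usually within tens of parts per million, but it matches the example):

```python
# Sketch: two recorders with slightly different real clock speeds,
# both files labeled "48 kHz". The 0.2% error is illustrative only.
nominal = 48000
actual_b = 47904        # device B really captures this many samples per second
take = 3 * 60           # a 3-minute take, in seconds

# When both files are played back at exactly 48 kHz, device B's take
# finishes early and drifts against device A:
drift = take * (1 - actual_b / nominal)
print(f"drift after {take} s: {drift:.2f} s")   # 0.36 s, a visible lip-sync error
```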
FCPX.guru wrote: Take two shots for a multicam, two different sample rates, they will drift out of sync. I see it weekly.
I have heard this many times but I don't see it myself. I just imported a 27 min 39 sec 44.1 kHz mp3 file from a Tascam DR07 and a 44 min 23 sec 48 kHz wav file from a Tascam DR10L (both recordings of the same event) and they synced perfectly using an FCPX sync clip with no drift. I then created a project and dropped in the 48 kHz file, which set the project audio characteristics to 48 kHz, then dropped in the 44.1 kHz file, manually synced the beginning, and they stayed in sync to the end.
However -- the indicated program length of each file differed from that shown in FCPX, assuming we trust the program length as shown by Finder, MediaInfo or Invisor. The mp3 was 27 min 39 sec 533 ms as reported by Invisor or Finder, vs 27:34:01 and 45 subframes in FCPX. The wav file was 44 min 23 sec 95 ms as reported by Invisor and Finder, vs 44:20:10 and 0 subframes in FCPX. I'm not sure what that means. Subframe display is enabled in preferences.
We tend to assume the "timeline" time or program length shown by QuickTime Player is true clock time, but in some cases this is not true. The timecode shown in FCPX is more accurately called a "time-like scale": just because a frame is labeled 00:00:01:00 does not mean exactly 1 sec has passed, only something akin to 1 sec, and the timecode is really an indexing system for finding material. Apple explained this back in the FCP7 documentation.
The bottom line is you cannot trust anything except what you actually measure, so the only way to be certain is to play the file and time it with a stopwatch.
It is true the audio data is sampled at two different rates, but playback should always be true to real time; IOW 30 min of 44.1 kHz material plays in the same time as 30 min of 48 kHz material. When those two sample rates are combined on a timeline they must be conformed somehow. The software could stretch or shrink one of them by 8.8% to make the samples line up, but FCPX apparently does not do that. If it did, then at normal playback speed one of the audio tracks would be pitch-shifted and the program length would be 8.8% different.
I don't really know what FCPX does to conform 44.1 and 48 kHz audio sample rates, but it's not stretching or shrinking the track length by 8.8%.
One possible explanation is that the smallest FCPX time granularity is actually 1/80th of a frame at the subframe level; you can't actually see individual audio samples. At 29.97 fps, each 1/80th of a frame is 0.417 milliseconds. In each of those subframes there are 20.02 samples at 48 kHz and 18.393 samples at 44.1 kHz.
So in either case there will be a periodic "leap subframe" situation where occasionally a subframe contains more samples than the others. From this standpoint it's not really necessary that each audio sample line up vertically with adjacent tracks, because you can't access that level of granularity anyway.
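The subframe arithmetic above is easy to verify:

```python
# Samples per 1/80th-frame subframe at 29.97 fps, for both sample rates.
fps = 30000 / 1001           # 29.97 fps
subframe_s = 1 / fps / 80    # ~0.417 ms per subframe

for rate in (48000, 44100):
    print(f"{rate} Hz: {rate * subframe_s:.3f} samples per subframe")
    # 48000 Hz -> 20.020, 44100 Hz -> 18.393
```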
I think in many cases audio drift is caused by drifting timebases in each device. The root source of the timebase is an inexpensive quartz oscillator, just like in a cheap quartz wristwatch. Oscillators don't have perfect frequency stability, which is why two quartz watches will gradually drift apart. Actual surveys show some quartz watches drift by 2 sec per day (much worse than the spec), so two watches with opposite drift could move apart at 4 sec per day. For two audio recorders this equates to roughly 170 milliseconds per hour, easily enough to cause a lip-sync issue on longer programs.
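Using those watch-survey figures, the per-hour drift works out like this:

```python
# Worst-case relative drift of two recorders whose clocks err
# 2 s/day each, in opposite directions (the watch-survey figure).
relative_drift_s_per_day = 2 + 2
ms_per_hour = relative_drift_s_per_day / 24 * 1000
print(f"~{ms_per_hour:.0f} ms of drift per hour")   # ~167 ms/hour
```

Lip-sync errors become noticeable somewhere around 50-100 ms, so at this rate a program under an hour can already show a visible offset.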
It is possible to make precision temperature-controlled crystal oscillators that have low drift, or to use GPS-corrected timebases, but I doubt most cameras and audio recorders use those. That said, the two Tascams I mentioned above are inexpensive, yet they didn't drift at all over 30 min. Maybe most units are fairly stable and only older or ailing ones develop drift.