Diving into seeking issue with MP3 files
The following story is kindly provided by Eduardo, an Angular developer at Valor Software.
Playing stops before the end of the track
While working on a 10 million-plus user meditation app, I ran into a very weird issue. While the Quality Assurance team was trying to play a track, it would stop playing with around 13 seconds remaining and report as completed. After some testing, it seems the issue was only happening when there was seeking involved. Playing the whole track always ended at 0s remaining, pausing and playing had no impact either.
Seeking within MP3
Looking into the file format, I found it is an MP3. There are many ways to seek within an audio file. One way is to build an index of timestamps and byte offsets, like checkpoints in the track. Option two is what MP3 does: nothing. MP3 has bitrate, so you seek it by multiplying bitrate with time. Want to seek 6.3 seconds in a 100 kbps file? — Jump to the 630kb position (small b here means bits, not bytes). Should be somewhat accurate, but a 10s+ difference seems a bit too much. That is until you analyze the files:
$ mpck ./track_with_issue.mp3
SUMMARY: track_with_issue.mp3
version MPEG v1.0
layer 3
average bitrate 279491 bps (VBR)
samplerate 44100 Hz
frames 31503
time 13:42.935
unidentified 0 b (0%)
errors none
result Ok$ mpck ./other_track.mp3
SUMMARY: ./other_track.mp3
version MPEG v1.0
layer 3
bitrate 160000 bps
samplerate 44100 Hz
frames 131786
time 57:22.573
unidentified 0 b (0%)
errors none
result Ok
Notice the first track has an Average Bitrate. Welcome to the marvelous world of VBR, also known as Variable Bit Rate. MP3 encoded in VBR means you don’t know where exactly to seek to because the bitrate isn’t constant, so you “guess” based on the average bit rate. Imagine you have 10 seconds of 100 kbps and 10 seconds of 200 kbps bitrate in a single file. You have 100kbps from 0 to 1000kb and 200kbps from 1001 kb to 3000 kb. The average bitrate is 150 kbps, but if you seek to 10 seconds using the previous formula, you’d seek to 1500 kb, which is already in the 200 kbps part, jumping to about 12.5 seconds of the track. Also, note that if you inverted the parts, you’d be at 7.5s.
Disclaimer.
Please, keep in mind that calculations have been simplified to make them easier to understand and are not 100% accurate.
Solution time!
So what could be done? — Well, the first obvious solution would be to just re-encode it in MP3 with CBR (constant bit rate). The most elegant solution, though, is to drop MP3 completely and start using better audio formats like Vorbis/Opus (.ogg), which have proper support for seeking.
Thanks for attention! Hope to be back with the next discovery soon :)