ID3 Tags (#136)

The MP3 file format didn't provide any means for including metadata about the song. ID3 tags were invented to solve this problem.

You can tell if an MP3 file includes ID3 tags by examining the last 128 bytes of the file. If they begin with the characters TAG, you have found an ID3 tag. The format of the tag is as follows:

TAG song artist album year comment genre

The spaces above are just for us humans. The actual tag is made of fixed-width fields with no spacing between them. Song, artist, album, and comment are 30 bytes each. The year is four bytes, and the genre gets just one byte, which is an index into a list of predefined genres I'll include at the end of this quiz.
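To make the layout concrete, here is a small sketch that works out where each field starts. The names ID3V1_FIELDS and id3v1_offsets are made up for this illustration; only the widths come from the description above, plus the three bytes of the TAG marker.

```ruby
# Field widths of an ID3v1 tag, in the order they appear (sizes in bytes).
# The :tag entry is the literal "TAG" marker. Names here are illustrative.
ID3V1_FIELDS = [
  [:tag,     3],
  [:song,    30],
  [:artist,  30],
  [:album,   30],
  [:year,    4],
  [:comment, 30],
  [:genre,   1]
]

# Compute [name, offset, width] for each field within the tag.
def id3v1_offsets(fields = ID3V1_FIELDS)
  offset = 0
  fields.map do |name, width|
    entry = [name, offset, width]
    offset += width
    entry
  end
end

id3v1_offsets.each do |name, offset, width|
  puts format("%-8s offset %3d, width %2d", name, offset, width)
end

# The widths add up to the 128 bytes found at the end of the file.
puts "total: #{ID3V1_FIELDS.sum { |_, width| width }} bytes"
```

The widths sum to 128, which is why the last 128 bytes of the file are the place to look.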

A minor change was later made to ID3 tags to allow them to include track numbers, creating ID3v1.1. In that format, if the 29th byte of a comment is null and the 30th is not, the 30th byte is an integer representing the track number.
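The ID3v1.1 rule can be sketched directly on the raw 30 bytes of the comment field. The helper names split_comment and strip_padding are hypothetical, and stripping trailing nulls and spaces as padding is an assumption of this sketch.

```ruby
# Remove trailing null bytes and spaces used as padding (an assumption of
# this sketch about how fields are padded).
def strip_padding(field)
  field.sub(/[\x00 ]+\z/, "")
end

# Split a raw 30-byte comment field into [comment, track].
# track is nil when the field does not follow the ID3v1.1 convention.
def split_comment(raw)
  raw = raw.b # work on raw bytes, ignoring any text encoding
  if raw.bytesize == 30 && raw.getbyte(28) == 0 && raw.getbyte(29) != 0
    # 29th byte is null and 30th is not: the 30th byte is the track number.
    [strip_padding(raw.byteslice(0, 28)), raw.getbyte(29)]
  else
    [strip_padding(raw), nil]
  end
end

with_track    = "Live recording".ljust(28, "\0") + "\0" + 7.chr
without_track = "No track here".ljust(30, "\0")

p split_comment(with_track)
p split_comment(without_track)
```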

Later changes evolved into ID3v2, which is a scary beast we won't worry about.

This week's Ruby Quiz is to write an ID3 tag parser. Using a library is cheating. Roll up your sleeves and parse it yourself. It's not hard at all.

If you don't have MP3 files to test your solution on, you can find some free files at:

mfiles

Here's the official genre list with some extensions added by Winamp:

Blues
Classic Rock
Country
Dance
Disco
Funk
Grunge
Hip-Hop
Jazz
Metal
New Age
Oldies
Other
Pop
R&B
Rap
Reggae
Rock
Techno
Industrial
Alternative
Ska
Death Metal
Pranks
Soundtrack
Euro-Techno
Ambient
Trip-Hop
Vocal
Jazz+Funk
Fusion
Trance
Classical
Instrumental
Acid
House
Game
Sound Clip
Gospel
Noise
AlternRock
Bass
Soul
Punk
Space
Meditative
Instrumental Pop
Instrumental Rock
Ethnic
Gothic
Darkwave
Techno-Industrial
Electronic
Pop-Folk
Eurodance
Dream
Southern Rock
Comedy
Cult
Gangsta
Top 40
Christian Rap
Pop/Funk
Jungle
Native American
Cabaret
New Wave
Psychadelic
Rave
Showtunes
Trailer
Lo-Fi
Tribal
Acid Punk
Acid Jazz
Polka
Retro
Musical
Rock & Roll
Hard Rock
Folk
Folk-Rock
National Folk
Swing
Fast Fusion
Bebob
Latin
Revival
Celtic
Bluegrass
Avantgarde
Gothic Rock
Progressive Rock
Psychedelic Rock
Symphonic Rock
Slow Rock
Big Band
Chorus
Easy Listening
Acoustic
Humour
Speech
Chanson
Opera
Chamber Music
Sonata
Symphony
Booty Bass
Primus
Porn Groove
Satire
Slow Jam
Club
Tango
Samba
Folklore
Ballad
Power Ballad
Rhythmic Soul
Freestyle
Duet
Punk Rock
Drum Solo
A capella
Euro-House
Dance Hall


Quiz Summary

This quiz was another idea I got out of the Erlang book. The author uses a similar example to show how smooth processing binary data in Erlang can be. I'm happy to say that I found the submitted Ruby solutions to be equally smooth, if not more so.

The secret to binary parsing in Ruby is generally the String#unpack method, and the majority of the solutions capitalized on this technique. Technically, ID3 tags are mainly in plain text, with some null characters thrown in. Still, I think it's a good idea to get into the unpack() mindset anytime you start slicing up binary data.

I want to take a look at Eugene Kalenkovich's code below. It's a pretty typical usage of unpack() to parse some data. It also includes a nicety when reading the file that I'm ashamed to admit I didn't think of. Let's start with that:

```ruby
def fileTail (file, offset)
  f=File.new(file)
  f.seek(-offset,IO::SEEK_END)
  f.read
end

# ...
```

In my own code, I read the whole file into memory and indexed out the last 128 bytes. That's almost always the wrong approach and Eugene shows the correct strategy above. This code just opens the file, seek()s to offset bytes before the end, and read()s the needed data. That scales much better when the data sizes are significant.
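For comparison, here is a variation on the same seek-and-read idea. It is only a sketch, not Eugene's code: the name read_tail is hypothetical, and opening in binary mode, closing the file via the block form, and returning nil for files that are too short are all my own assumptions about what a caller might want.

```ruby
require "tempfile"

# Hypothetical helper: return the last `count` bytes of the file at `path`,
# or nil when the file is shorter than that.
def read_tail(path, count)
  File.open(path, "rb") do |f|      # block form closes the file for us
    return nil if f.size < count    # seeking before the start would raise
    f.seek(-count, IO::SEEK_END)
    f.read(count)
  end
end

# Demonstration on a throwaway file: 1000 filler bytes followed by a marker.
Tempfile.create("tail-demo") do |tmp|
  tmp.binmode
  tmp.write("x" * 1000 + "TAG-at-the-end")
  tmp.flush
  p read_tail(tmp.path, 14)    # the final 14 bytes
  p read_tail(tmp.path, 5000)  # file is too short for this request
end
```

Only the requested bytes are ever read, no matter how large the file is.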

As a quick aside, file_tail() would probably be a more Rubyish method name.

The code now builds a data structure class to hold the tag details. It starts like this:

```ruby
# ...

class ID3Tag
  GENRES=["Blues","Classic Rock","Country",…,"Dance Hall"]
  attr_reader :title, :artist, :album, :year, :comment, :genre, :track

  # ...
```

You can see that this class is mainly just a data structure that defines readers for all of the elements in a tag. I've trimmed the GENRES listing here, but the code included the full set.

I will say that some found more clever means to load the GENRES Array. Several people did fancy heredoc manipulations, but the most clever pulled the list out of the quiz document using open-uri and hpricot. That was especially wise this time since I made so many mistakes in the quiz description.
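I won't reproduce anyone's exact heredoc trick here, but the general idea might look something like the sketch below. The constant name GENRE_LINES is made up, and the list is cut down to three entries for illustration.

```ruby
# A shortened list for illustration; a real solution would paste in every
# genre from the quiz, one per line.
GENRE_LINES = <<~LIST.lines.map(&:chomp)
  Blues
  Classic Rock
  Country
LIST

p GENRE_LINES
p GENRE_LINES.index("Country")
```

Splitting on lines rather than whitespace matters because names like "Classic Rock" contain spaces.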

We're now ready for the actual parsing code:

```ruby
# ...

  def initialize fname
    tag,@title,@artist,@album,@year,@comment,@genre=
      fileTail(fname,128).unpack "A3A30A30A30A4A30C"
    raise "No ID3 Info" if tag!='TAG'
    s_com,flag,track=@comment.unpack "A28CC"
    if flag==0 and track!=0
      @comment=s_com
      @track=track
    end
    @genre=GENRES[@genre]
    @genre="Unknown" if !@genre
  end
end

# ...
```

As you can see, the majority of the work is done on the first line with a single call to unpack(). The template fed to unpack() is the key to the whole puzzle. An "A" in the unpack() template instructs it to extract a String, removing any trailing spaces or null characters. By default the String is just one character long, but you can provide a number after the "A" to increase that count. The only other character used in the template is a "C", which is used to extract one byte as an unsigned Integer. The unpack() call returns an Array, which Eugene just mass-assigns to the relevant variables.
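If you want to see the template in action without an MP3 handy, you can build a fake tag with Array#pack and take it apart again. The field values below are invented for the demonstration, and using "a" (null padding) on the pack side is my assumption about how a typical tag is padded.

```ruby
# Build a synthetic 128-byte ID3v1 tag. "a30" pads each field with nulls.
raw_tag = ["TAG", "Some Song", "Some Artist", "Some Album", "2007",
           "Just a comment", 17].pack("a3a30a30a30a4a30C")

# The same template the solution uses: "A" strips trailing nulls and spaces,
# "C" reads one byte as an unsigned integer.
fields = raw_tag.unpack("A3A30A30A30A4A30C")

p raw_tag.bytesize
p fields

# The difference between "a" and "A" when unpacking padded data:
p "ab\0\0".unpack("a4")  # keeps the padding
p "ab\0\0".unpack("A4")  # strips it
```

The final element comes back as the Integer 17, ready to be used as an index into the genre list.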

The rest is simple. The code checks the first chunk for the identifying "TAG" String and raises an error if it's not there. Then another call to unpack(), with a template much like the first, pulls the track field out of the comment. The if statement makes sure that assignment only happens when a track number is present. The final two lines are just a longhand form of:

```ruby
@genre = GENRES[@genre] || "Unknown"
```
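The shorthand works because indexing past the end of an Array returns nil, which || treats as false. Here is a tiny sketch with a shortened, made-up GENRE_SAMPLE list and a hypothetical genre_name helper:

```ruby
# Shortened list for illustration only.
GENRE_SAMPLE = ["Blues", "Classic Rock", "Country"]

# Look up a genre by index, falling back when the index is out of range.
def genre_name(index, genres = GENRE_SAMPLE)
  genres[index] || "Unknown"
end

p genre_name(1)    # an index inside the list
p genre_name(255)  # past the end, so Array#[] returns nil
```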

With all of the fields stored away in the proper variables, reader calls can be used to extract as needed. Eugene's actual application code just punted on that point though:

```ruby
# ...

p ID3Tag.new(ARGV[0])
```

My thanks to all who have helped me with my Erlang comparisons these last two weeks. I promise, we're on to new topics now.

In fact, tomorrow we will tackle an interesting subproblem from this year's ICFP contest...