ID3v2

In 1998, a new specification called ID3v2 was created by multiple contributors. Although it bears the name ID3, it has little to no relation to ID3v1.

ID3v2 tags are of variable size and usually occur at the start of the file, which aids streaming. A tag consists of a number of frames, each of which contains a piece of metadata. For example, the TIT2 frame contains the title, and the WOAR frame contains the URL of the artist's website. Frames can be up to 16 MB in length, while the total tag size is limited to 256 MB. The internationalization problem was addressed by allowing strings to be encoded not only in ISO-8859-1 but also in UTF-16.
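The 256 MB limit follows from how the tag size is stored. As a minimal sketch, assuming the 10-byte header layout given in the ID3v2 specification (the marker "ID3", two version bytes, one flags byte, and a 4-byte "synchsafe" size that uses only 7 bits per byte), a header could be read as follows. The function names are illustrative and not part of any library.

```python
def decode_synchsafe(data: bytes) -> int:
    """Decode a 4-byte synchsafe integer (7 usable bits per byte)."""
    if len(data) != 4 or any(b & 0x80 for b in data):
        raise ValueError("not a 4-byte synchsafe integer")
    value = 0
    for b in data:
        value = (value << 7) | b
    return value


def parse_id3v2_header(header: bytes) -> dict:
    """Parse the first 10 bytes of a file as an ID3v2 tag header."""
    if len(header) < 10 or header[:3] != b"ID3":
        raise ValueError("no ID3v2 header found")
    major, revision, flags = header[3], header[4], header[5]
    return {
        "version": f"2.{major}.{revision}",
        "flags": flags,
        # Size of the tag body, not counting the 10 header bytes.
        "tag_size": decode_synchsafe(header[6:10]),
    }
```

Four bytes of 7 bits each give 28 bits, so the largest size that can be represented is 2^28 - 1 bytes, just under 256 MB.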

Text frames begin with an encoding byte that identifies how the text that follows is encoded:

  • $00: ISO-8859-1 (Latin-1, a superset of ASCII).
  • $01: UTF-16 encoded Unicode with a BOM (specified as UCS-2 in ID3v2.2 and ID3v2.3).
  • $02: UTF-16BE encoded Unicode without a BOM, in ID3v2.4.
  • $03: UTF-8 encoded Unicode, in ID3v2.4.
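The encoding byte maps directly onto a text codec. The following is a rough sketch of decoding a text frame payload in Python, assuming the payload is the frame body with the frame header already stripped. The names `ENCODINGS` and `decode_text_frame` are made up for this illustration.

```python
# Encoding byte -> Python codec name.
ENCODINGS = {
    0x00: "latin-1",    # ISO-8859-1
    0x01: "utf-16",     # UTF-16 with BOM (the codec reads and drops the BOM)
    0x02: "utf-16-be",  # UTF-16BE without BOM (ID3v2.4)
    0x03: "utf-8",      # UTF-8 (ID3v2.4)
}


def decode_text_frame(payload: bytes) -> str:
    """Decode a text frame body whose first byte is the encoding byte."""
    if not payload:
        raise ValueError("empty text frame")
    codec = ENCODINGS.get(payload[0])
    if codec is None:
        raise ValueError(f"unknown encoding byte ${payload[0]:02X}")
    # Drop any trailing null terminator left after decoding.
    return payload[1:].decode(codec).rstrip("\x00")
```

For example, `decode_text_frame(b"\x00Caf\xe9")` yields `"Café"`, because byte `0xE9` is "é" in ISO-8859-1.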

However, mojibake is still common when a local encoding is used instead of UTF-16. In particular, some Japanese editors are known to write Shift JIS, which usually has disastrous effects. Such tags will not work with any standard-compliant software regardless of local settings, since Shift JIS is not supported by the standard. They will not work outside Japan, where Shift JIS has very little support. They will not even work on all Japanese computers with a deliberately non-compliant reader, since the result depends on the software and its settings.
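The effect can be reproduced in a few lines. This sketch assumes a hypothetical non-compliant editor that stores Shift JIS bytes while the frame is marked as ISO-8859-1 ($00), and a compliant reader that trusts the marker. The helper name is illustrative only.

```python
def misread_as_latin1(text: str, actual_codec: str = "shift_jis") -> str:
    """Return what a compliant reader shows when bytes written in
    `actual_codec` are stored in a frame marked as ISO-8859-1."""
    raw = text.encode(actual_codec)
    # Latin-1 maps every byte to some character, so decoding never
    # fails; it silently produces the wrong characters instead.
    return raw.decode("latin-1")


title = "さくら"
garbled = misread_as_latin1(title)
# The original is recoverable only by a reader that guesses Shift JIS.
recovered = garbled.encode("latin-1").decode("shift_jis")
```

Because no error is raised at any point, neither the writer nor the reader has an obvious signal that the tag is wrong, which is part of why the problem persists.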

In the latest ID3v2 specification there are 84 types of frame, and applications can also define their own types. There are standard frames for containing cover art, BPM, copyright and license, lyrics, and arbitrary text and URL data, as well as other things.

There are three versions of ID3v2:

  • ID3v2.2 was the first public version of ID3v2. It used three-character frame identifiers rather than four (TT2 for the title instead of TIT2). Most of the common v2.3 and v2.4 frames have direct analogues in v2.2. This version is now considered obsolete.
  • ID3v2.3 expanded the frame identifier to four characters, and added a number of frames. A frame could contain multiple values, separated with a / character. This is the most widely used version of ID3v2 tags.
  • ID3v2.4 is the latest published version, dated November 1, 2000. Notably, it allows textual data to be encoded in UTF-8, which has several advantages over UTF-16 and was already common practice in earlier tags even though the earlier versions of the standard did not permit it. It uses a null byte to separate multiple values, so the character "/" can appear in text data again. Another new feature allows a tag to be placed at the end of the file, before other tags such as ID3v1.
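The version differences listed above affect how a reader interprets a frame. As an illustrative sketch (the function names are hypothetical, and leaving v2.2 text unsplit is an assumption made here for simplicity), the identifier length and the multi-value separator could be chosen by major version like this:

```python
from typing import List


def frame_id_length(major_version: int) -> int:
    """ID3v2.2 uses three-character frame identifiers; later versions use four."""
    return 3 if major_version == 2 else 4


def split_values(text: str, major_version: int) -> List[str]:
    """Split an already decoded text frame into its individual values."""
    if major_version >= 4:
        return text.split("\x00")  # v2.4: null-separated values
    if major_version == 3:
        return text.split("/")     # v2.3: slash-separated values
    return [text]                  # assumption: treat v2.2 text as one value
```

The v2.3 rule shows why the change in v2.4 matters: an artist name such as "AC/DC" is split into two values under v2.3 but stays intact under v2.4.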

No version of Windows Explorer or Windows Media Player, up to and including Windows 8 and Windows Media Player 12, can handle ID3v2.4 tags. Windows understands ID3v2 only up to and including version 2.3.
