Atoms, Boxes, Parents, Children & hex
Media_Dev, 2021. 1. 7. 18:04
http://atomicparsley.sourceforge.net/mpeg-4files.html
An MPEG-4 file is made of a number of discrete units called atoms (well, they were called atoms in the first version of the specification, now they are prosaically called 'boxes'). An atom has a basic format: 4 bytes holding the atom's length, followed by 4 bytes holding its name.
Anything beyond those basic 8 bytes is either optional & defined by the hierarchy it is found in (moov.udta.meta.XXXX atoms have a format defined by QuickTime), or defined by the atom itself. The ftyp atom is ALWAYS first, and has a certain type of format - it tells what type of file it is & the basic versioning of the atom structures.
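The basic 8-byte header described above can be read with a few lines of Python. This is a minimal sketch rather than AtomicParsley's own code; the function name `read_atom_header` and the sample bytes are hypothetical, made up for illustration.

```python
import struct

def read_atom_header(buf, offset=0):
    """Return (length, name) from the 8-byte atom header at `offset`."""
    # 4-byte big-endian length followed by a 4-byte name
    length, name = struct.unpack_from(">I4s", buf, offset)
    # latin-1 keeps bytes such as 0xA9 (the (c) sign in iTunes atom names) intact
    return length, name.decode("latin-1")

# Hypothetical sample: a 20-byte ftyp atom (8-byte header + 12 bytes of data)
sample = struct.pack(">I4s", 20, b"ftyp") + b"M4A " + bytes(4) + b"mp42"
print(read_atom_header(sample))
```

The length counts the header itself, which is why the sample's 12 bytes of data give a length of 20.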
In the above example the moov atom has a length of 0x00001D38 or 7480 bytes. Immediately following the moov name however is a new atom. This is the mvhd atom, and its length is 0x0000006C or 108 bytes. Because mvhd starts inside moov's 7480 bytes and its 108 bytes fit within them, the mvhd atom is a child atom of the moov atom. The MPEG-4 specification says that an atom can either be a parent atom (as moov is a parent to mvhd) or carry some sort of information (as ftyp & mvhd show above), but not both.
The length of the atom is determined by the length of itself PLUS any and all atoms in the level immediately below it - not all the way down to the end of the hierarchy. For example, the moov atom sums the length of the mvhd atom and other atoms on the same level (not shown), but not children of mvhd - mvhd sums those lengths. The atoms in the level below sum the lengths in the atoms below them until you get to the end of a hierarchy. At that point the sum of that atom is:
4 bytes for the atom length
4 bytes for the atom name
??? bytes that are optional for any data it might hold
The minimum length of an atom then would be 8 bytes.
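The way lengths nest can be sketched by walking a small synthetic file. This is an illustration under stated assumptions: the `PARENTS` set is a hand-picked list (the header alone does not say whether an atom holds children or data), the helper names `atom` and `walk` are hypothetical, and the special length values 0 and 1 are not handled.

```python
import struct

# Assumption: a hand-picked set of pure parent atoms; the header itself does
# not say whether an atom holds children or data.
PARENTS = {"moov", "trak", "mdia", "minf", "stbl", "udta"}

def atom(name, payload=b""):
    """Build an atom: 4-byte length, 4-byte name, then the payload."""
    return struct.pack(">I4s", 8 + len(payload), name) + payload

def walk(buf, start=0, end=None, depth=0):
    """Yield (depth, name, offset, length) for each atom in buf[start:end]."""
    end = len(buf) if end is None else end
    pos = start
    while pos + 8 <= end:
        length, raw = struct.unpack_from(">I4s", buf, pos)
        if length < 8 or pos + length > end:
            break  # 64-bit and run-to-end-of-file lengths are not handled here
        name = raw.decode("latin-1")
        yield depth, name, pos, length
        if name in PARENTS:
            # Children start right after the parent's 8-byte header
            yield from walk(buf, pos + 8, pos + length, depth + 1)
        pos += length

# Synthetic file: ftyp, then moov holding a 108-byte mvhd and an empty udta
mvhd = atom(b"mvhd", bytes(100))
moov = atom(b"moov", mvhd + atom(b"udta"))
data = atom(b"ftyp", b"M4A " + bytes(4)) + moov
for depth, name, offset, length in walk(data):
    print("  " * depth + f"{name} @ {offset} of size: {length}")
```

Here moov's length is its own 8 bytes plus the lengths of mvhd and udta, the atoms immediately below it.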
The 'Atom Is A Parent Or Holds Data' rule is made to be broken. Often the atom under moov.trak.mdia.minf.stbl.stsd is a parent and also contains data. Apple's DRM implementation breaks this rule further. The other standard atom that breaks this rule, for historical reasons, is moov.udta.meta. Still, the MPEG-4 container is relatively easy to understand & highly flexible.
The most important part of an MPEG-4 file is the mdat atom - it's where the actual raw information for the file is stored. This top level atom takes up the bulk of an MPEG-4 file. However, the moov atom comprises a number of different atoms and hierarchies, and provides for basic functionality - like specifying the dimensions of a video file, or the duration of a song.
uuid atoms are user-defined atoms, and are similar to normal atoms, but their name is 8 bytes (4 bytes holding uuid, followed by 4 bytes holding the name of the uuid atom). Sony PSP mp4 files notably use uuid atoms. AtomicParsley supports setting & reading its own uuid atoms to carry supplemental metadata.
stco & mdat
When atoms are added, modified or removed, the tree changes, and the lengths of the affected atoms need to be re-determined. If the mdat atom moves relative to the beginning of the file, further adjustments need to be made. The free atom is meant to minimize this exact behavior.
The mdat data is made up of 'chunks' - these chunks are referenced in moov to provide for seeking within the file, and to tell the player where the beginning of the media data is. This information is stored on the moov.trak.mdia.minf.stbl.stco Sample Table Chunk Offset atom. This atom has a particular structure:
Each entry in the stco atom (and there can be multiple stco atoms) needs to be readjusted.
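The readjustment can be sketched as follows. The layout assumed here is the one given in the ISO base media file format (8-byte header, 1 byte version, 3 bytes flags, a 4-byte entry count, then one 4-byte offset per chunk). The function name `shift_stco` and the sample offsets are hypothetical.

```python
import struct

def shift_stco(stco, delta):
    """Return a copy of an stco atom with every chunk offset moved by `delta`."""
    length, name = struct.unpack_from(">I4s", stco, 0)
    if name != b"stco":
        raise ValueError("not an stco atom")
    # After the 8-byte header: 1 byte version, 3 bytes flags, 4-byte entry count
    (count,) = struct.unpack_from(">I", stco, 12)
    offsets = struct.unpack_from(f">{count}I", stco, 16)
    shifted = struct.pack(f">{count}I", *(o + delta for o in offsets))
    return stco[:16] + shifted + stco[16 + 4 * count:]

# Hypothetical stco with three chunk offsets
old = struct.pack(">I4sII3I", 28, b"stco", 0, 3, 7520, 9000, 12000)
new = shift_stco(old, 64)  # e.g. moov grew by 64 bytes and sits before mdat
print(struct.unpack_from(">3I", new, 16))
```

The same shift would be applied to the stco atom of every track, since each offset is measured from the beginning of the file.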
Known iTunes Metadata Atoms
Metadata to be used with iTunes comes in the moov.udta.meta.ilst hierarchy. The atoms directly under the ilst atom have specific names, but they do not carry the data directly. The children of these named atoms (the data atom) carry the actual information. The 4 letter code of the parent is listed below, while the atom flags after the data atom are listed in the Class column. It is the class of the data atom that broadly determines whether text or numbers or binary data is contained.
4char code    Name                      Class/Flag    Type          Appearance
©alb          Album                     1             text          iTunes 4.0
©art          Artist                    1             text          iTunes 4.0
aART          Album Artist              1             text          ??
©cmt          Comment                   1             text          iTunes 4.0
©day          Year                      1             text          iTunes 4.0
©nam          Title                     1             text          iTunes 4.0
©gen | gnre   Genre                     1 | 0 [1]     text | uint8  iTunes 4.0
trkn          Track number              0             uint8         iTunes 4.0
disk          Disk number               0             uint8         iTunes 4.0
©wrt          Composer                  1             text          iTunes 4.0
©too          Encoder                   1             text          iTunes 4.0
tmpo          BPM                       21            uint8         iTunes 4.0
cprt          Copyright                 1             text          ? iTunes 4.0
cpil          Compilation               21            uint8         iTunes 4.0
covr          Artwork                   13 | 14 [2]   jpeg | png    iTunes 4.0
rtng          Rating/Advisory           21            uint8         iTunes 4.0
©grp          Grouping                  1             text          iTunes 4.2
stik          ?? (stik)                 21            uint8         ??
pcst          Podcast                   21            uint8         iTunes 4.9
catg          Category                  1             text          iTunes 4.9
keyw          Keyword                   1             text          iTunes 4.9
purl          Podcast URL               21 | 0 [4]    uint8         iTunes 4.9
egid          Episode Global Unique ID  21 | 0 [4]    uint8         iTunes 4.9
desc          Description               1             text          iTunes 5.0
©lyr          Lyrics                    1 [3]         text          iTunes 5.0
tvnn          TV Network Name           1             text          iTunes 6.0
tvsh          TV Show Name              1             text          iTunes 6.0
tven          TV Episode Number         1             text          iTunes 6.0
tvsn          TV Season                 21            uint8         iTunes 6.0
tves          TV Episode                21            uint8         iTunes 6.0
purd          Purchase Date             1             text          iTunes 6.0.2
pgap          Gapless Playback          21            uint8         iTunes 7.0
[1] Genre comes on 2 atoms - standard genres are on gnre; custom genres are on ©gen; only 1 is permitted at a time.
[2] Coverart is the only atom that permits more than 1 data child atom. If there is a limit, it's > 16.
[3] Lyrics is the only text atom that doesn't fall under a 255-byte limit.
[4] Apple changed from the original 21 to the current 0 around the release of iTunes 6.0.3.
(There are also iTMS atoms of akID, sfID, geID, plID, atID, cnID & apID; some metadata like Soundcheck information is carried on ---- atoms.)
Text metadata has a limit of 255 bytes. It comes in UTF-8 (no BOM), and isn't null terminated.
Unsigned integer metadata is 8 bits wide (a limit of 255 for tracknum for example). Most have a format (cpil is 4 NULL bytes, then the value) specific to that atom. Only numerical data can be carried for most of these (except purl & egid). Vinyl taggers of "A1": complain to Apple.
Here is a sample of metadata - compilation (true) & tracknumber (2 of 5):
And for those thinking "Heavens to Murgatroid, how did cpil's 21 become 15 in the pic above... gosh, golly" - hex: decimal 21 is 0x15.
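The layout described above (a named atom whose data child carries a class flag, 4 NULL bytes, then the value) can be sketched in Python. This is an illustrative sketch, not AtomicParsley's implementation; the helper names `atom` and `ilst_item` and the title text are hypothetical.

```python
import struct

def atom(name, payload=b""):
    """Build an atom: 4-byte length, 4-byte name, then the payload."""
    return struct.pack(">I4s", 8 + len(payload), name) + payload

def ilst_item(name, data_class, value):
    """Wrap `value` in a data atom: 1 byte version, 3-byte class, 4 NULL bytes."""
    data = atom(b"data", struct.pack(">II", data_class, 0) + value)
    return atom(name, data)

cpil = ilst_item(b"cpil", 21, b"\x01")  # compilation: true
title = ilst_item(b"\xa9nam", 1, "Example".encode("utf-8"))  # hypothetical title

# Requires Python 3.8+ for the separator argument to hex()
print(cpil.hex(" "))
```

In the printed bytes the class 21 shows up as `00 00 00 15` right after the `data` name, which is the 21-becomes-15 effect mentioned above.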
There is also another form of tagging that iTunes uses internally for a few otherwise inaccessible tags. Called the reverse DNS style (or something along that line), this form is pictured below:
Atom ---- @ 39852 of size: 72, ends @ 39924
Atom mean @ 39860 of size: 28, ends @ 39888
Atom name @ 39888 of size: 16, ends @ 39904
Atom data @ 39904 of size: 20, ends @ 39924
where the mean atom carries the reverse DNS domain (com.apple.iTunes) & the name atom carries the descriptor for the contents of the data atom.
Known names/descriptors:
tool
iTunNORM
iTunSMPB
iTunes_CDDB_IDs
iTunes_CDDB_TrackNumber
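The sizes in the listing above can be reproduced with a short sketch. It assumes that the mean and name atoms each carry 4 bytes of version/flags before their text, and that the data atom uses the same class-plus-4-NULL-bytes layout as other iTunes metadata; under those assumptions the domain com.apple.iTunes and a 4-character descriptor such as tool give exactly 28 and 16 bytes. The function names and the 4-byte placeholder value are hypothetical.

```python
import struct

def atom(name, payload=b""):
    """Build an atom: 4-byte length, 4-byte name, then the payload."""
    return struct.pack(">I4s", 8 + len(payload), name) + payload

def reverse_dns_atom(domain, descriptor, value, data_class=1):
    # Assumption: mean and name carry 4 bytes of version/flags before the text
    mean = atom(b"mean", bytes(4) + domain.encode("utf-8"))
    name = atom(b"name", bytes(4) + descriptor.encode("utf-8"))
    # Assumption: data holds a class flag and 4 NULL bytes before the value
    data = atom(b"data", struct.pack(">II", data_class, 0) + value)
    return atom(b"----", mean + name + data)

tag = reverse_dns_atom("com.apple.iTunes", "tool", b"abcd")  # placeholder value
print(len(tag))
```

The total of 72 bytes is the 8-byte ---- header plus the 28, 16 and 20 bytes of its three children.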
Tagging implementations
The only style of metadata defined in the ISO Base Media File Format is what amounts to a single atom cprt - and the format described is in the 3gp asset style. In fact, the ISO copyright notice is identical to the 3gp copyright asset. This copyright notice is the only common tag available to all mpeg-4 files and derivatives.
The major brands that iTunes writes are listed at http://www.mp4ra.org/filetype.html, but iTunes-style metadata isn't defined in any publicly available document - its format is determined by the types of files that iTunes & the iTunes Music Store produce & provide. Since the goal of AtomicParsley is to set metadata that is maximally compatible with iTunes, the iTunes-style format of metadata is fully supported.
The 3GPP assets are a family of metadata tags that the 3gp specification allows. These atoms differ in a number of ways from the more common iTunes style. There is no data atom; information is carried directly on the atom. Most 3gp assets have a language setting - so dozens of like-named atoms are permitted that differ only in the language used (around 480 languages).
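As an illustration of the language setting, the sketch below packs a three-letter code into 15 bits. It assumes the packed ISO 639-2/T representation used elsewhere in the ISO base media file format (each letter stored as its ASCII value minus 0x60, 5 bits per letter); the text above does not spell out the encoding, so treat this as an assumption. The function names are hypothetical.

```python
def pack_language(code):
    """Pack a 3-letter lowercase ISO 639-2 code into 15 bits (5 bits per letter)."""
    if len(code) != 3 or not all("a" <= c <= "z" for c in code):
        raise ValueError("expected three lowercase ASCII letters")
    a, b, c = (ord(ch) - 0x60 for ch in code)
    return (a << 10) | (b << 5) | c

def unpack_language(value):
    """Reverse pack_language: recover the three letters from the packed value."""
    return "".join(chr(((value >> shift) & 0x1F) + 0x60) for shift in (10, 5, 0))

print(hex(pack_language("eng")))
```

Under this assumption two atoms with the same name but different packed language values (for example eng and und) would be treated as distinct tags.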
A new style of metadata emerged with the foobar2000 0.9.x series. For whatever reason, this style typically duplicates the iTunes-style metadata. There is a double artwork tag, the artist is listed twice - it is heavy with redundancy. It is also non-compliant. A generic tool isn't allowed to create its own atoms; a mechanism for supplemental functionality already exists in the uuid atom form. foobar2000 doesn't use this mechanism. Nero has also adopted this tagging style with their freeware tagging tools. It seems to also write some tags in the reverse DNS form in the com.apple.iTunes domain.
The newest style of tagging was recently added at the MPEG4 Registration Authority. Currently, there is no known tool that can read or set this style of metadata.