The Tag Mask importer scans a number of root folders for any files that match the given include masks and extracts metadata (tags) from the folder and file names using Tag Masks.
In a lot of cases, we use directory and file naming schemes to organize our files in a way that makes sense to us. The Tag Mask importer has a simple yet powerful mechanism for conveying these naming schemes to the media library.
A very common example is seen when we store music. We typically put the music files into a directory structure that has a meaning attached to each level and is easy for us to use. In some cases, we may decide to create a “music” folder and say that it will contain a folder for each artist. Inside each artist folder, we may create a folder for each album and put the music files there. Furthermore, we may decide to name each music file with the track number, followed by a dash and then the name of the track.
To put this more succinctly, we could say that we have 3 directory levels.
Level 1 is called the “root” and corresponds to the “music” folder.
Level 2 is called the “artist” level.
Level 3 is called the “album” level.
And each file in level 3 is named “track”-“name”.“extension”.
A Tag Mask is nothing more than a language to convey the information we described above. A good Tag Mask for this case would be:
<artist>\<album>\<track>-<name>.<extension>
Because Level 1 is the “music” folder, we don’t have to specify it as part of the Tag Mask. But, we name each subsequent level and parts of the file name using the Tag Mask language.
In this language, we simply give each part a name in between < and >; this is called a tag. Text that is not within < and > is considered a separator.
So, given that the importer found the file
C:\music\Bob Marley\Legend\1-Is This Love.mp3
and that the root folder is “c:\music”, we can easily see that, using the Tag Mask above, the importer would extract the following pieces of information from this file name:
artist = Bob Marley
album = Legend
track = 1
name = Is This Love
extension = mp3
That is the basic premise behind the Tag Mask importer: to glean the information you have stored in your folder and file names.
This information can then be easily placed in the media library and used for browsing your files in different ways.
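To make the extraction step concrete, here is a minimal Python sketch of how a Tag Mask could be applied to a path. It is an illustration only, not the importer's actual code: the function names, the choice to translate the mask into a regular expression, and the rule that each tag takes the shortest possible match without crossing a “\” are all assumptions.

```python
import re

TAG = re.compile(r"<([^<>]*)>")   # one <tag>; an empty name means "ignore this part"
SEGMENT = r"[^\\]+?"              # assumed rule: shortest run of text with no backslash

def mask_to_regex(mask):
    """Translate a Tag Mask into a regex: tags capture, separators match literally."""
    parts, pos = [], 0
    for m in TAG.finditer(mask):
        parts.append(re.escape(mask[pos:m.start()]))      # separator text before the tag
        name = m.group(1)
        parts.append(f"(?P<{name}>{SEGMENT})" if name else SEGMENT)
        pos = m.end()
    parts.append(re.escape(mask[pos:]))                   # trailing separator, if any
    return re.compile("".join(parts), re.IGNORECASE)

def extract_tags(root, path, mask):
    """Strip the root folder from the path, then apply the mask to the remainder."""
    prefix = root.rstrip("\\") + "\\"
    if not path.lower().startswith(prefix.lower()):
        return None                                       # file is not under this root
    m = mask_to_regex(mask).fullmatch(path[len(prefix):])
    return m.groupdict() if m else None

tags = extract_tags(r"c:\music",
                    r"C:\music\Bob Marley\Legend\1-Is This Love.mp3",
                    r"<artist>\<album>\<track>-<name>.<extension>")
print(tags)
```

With the shortest-match assumption, a track name that itself contains a period would be cut at the first period; the real importer may resolve that ambiguity differently.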
Tag Masks can also divide parts of folder names, not just file names. For example, I like to keep my digital pictures in a two-level directory structure. The first level is the name of the camera used. The second level contains the date the pictures were taken in YYYY-MM-DD format, followed by a text description of the event, if any. The pictures themselves are stored below these folders and have obscure file names that don’t provide any additional information.
So, in this case, we could use the following Tag Mask:
<camera>\<date> <event>\<name>.<>
Note that we used a space between <date> and <event>; this is what separates the two parts of the folder names at that level. We have also used an empty tag after the period. This accomplishes two things: 1) it separates the file extension from the file name and 2) it ignores the file extension, since we don’t care about it.
We could have gone one step further and separated each part of the date as follows:
<camera>\<year>-<month>-<day> <event>\<name>.<>
This would allow us much more fine-grained control from within the library; we could browse the collection by camera, year, month, day and event.
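As a rough illustration of how separators inside a folder name split it into several tags, the sketch below writes out by hand a regular expression equivalent to the fine-grained picture mask. The camera name, folder name and file name in the sample path are made up, and the shortest-match behavior of each tag is an assumption rather than documented importer behavior.

```python
import re

# Hand-written equivalent of the mask  <camera>\<year>-<month>-<day> <event>\<name>.<>
# Each tag is assumed to take the shortest text that does not contain a backslash.
PHOTO = re.compile(
    r"(?P<camera>[^\\]+?)\\"
    r"(?P<year>[^\\]+?)-(?P<month>[^\\]+?)-(?P<day>[^\\]+?) (?P<event>[^\\]+?)\\"
    r"(?P<name>[^\\]+?)\.[^\\]+?"          # the extension is matched but not captured
)

def photo_tags(relative_path):
    """Return the tags for a path relative to the root, or None if it does not fit."""
    m = PHOTO.fullmatch(relative_path)
    return m.groupdict() if m else None

# Hypothetical sample path
print(photo_tags(r"Canon A80\2003-07-04 Beach Trip\IMG_0042.jpg"))
```

In this sketch a folder that has a date but no event description does not match at all, because the space separator is required; how the importer itself treats that case is not covered here.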
You can also use Tag Masks without a folder name part, just to catalog file names. This will be the case if the Tag Mask does not include any path delimiters (“\”).
In all of the examples above, we used the file name as the <name> tag. This is not mandatory. You can just as easily use a part of the folder as the name for the imported item.
For example, if we have our movies stored as ISOs, with each movie in a folder named after the movie, we could use the following Tag Mask:
<name>\<>
This one will take the folder name as the item’s <name> tag and will ignore the file name completely.
Tag Masks also attempt to handle variable folder levels gracefully rather than simply ignoring them. The rule is that any levels beyond those specified in the Tag Mask will be appended to the last tag in the path portion of the Tag Mask.
That is a mouthful, so let’s do an example given the following Tag Mask:
<artist>\<album>\<name>.<>
This Tag Mask has 2 folder levels (<artist> and <album>) and the file name level. But, if the importer encounters:
Bob Marley\Greatest Hits\Disc 1\Is This Love.mp3
We have a potential problem because this path has 3 folder levels and the file name. As the rule states above, the additional level is appended to the last tag in the path portion of the Tag Mask, which is the <album> tag. So, in this case, the result is:
artist = Bob Marley
album = Greatest Hits\Disc 1
name = Is This Love
As you can see, the additional “Disc 1” level was appended to the <album> tag. This behavior may or may not be desirable to you, and you can change it by modifying the Tag Mask accordingly. If you wanted to capture the disc information separately, you could add a <disc> tag to the Tag Mask.
Of course, if the path has fewer levels than the Tag Mask, any tags for the non-existent levels will be ignored.
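The level-handling rule can be sketched in a few lines of Python. This is a simplified model, not the importer's implementation: it assumes every folder level of the mask is a single tag, that the file-name part is always <name>.<>, and that when the path is too short it is the trailing folder tags that go unused. The function name is hypothetical.

```python
def apply_levels(mask, relative_path):
    """Map folder levels to tags, folding surplus folders into the last folder tag.

    Simplifications (assumptions): each folder level of the mask is one <tag>,
    and the file-name part is <name>.<> (name kept, extension ignored).
    """
    *level_tags, _file_mask = [part.strip("<>") for part in mask.split("\\")]
    *folders, file_name = relative_path.split("\\")

    tags = dict(zip(level_tags, folders))       # tags for missing levels are skipped
    if level_tags and len(folders) > len(level_tags):
        last = len(level_tags) - 1
        tags[level_tags[-1]] = "\\".join(folders[last:])   # append the extra levels
    tags["name"] = file_name.rsplit(".", 1)[0]
    return tags

MASK = r"<artist>\<album>\<name>.<>"
print(apply_levels(MASK, r"Bob Marley\Greatest Hits\Disc 1\Is This Love.mp3"))
print(apply_levels(MASK, r"Bob Marley\Is This Love.mp3"))
```

The first call reproduces the “Greatest Hits\Disc 1” result above; the second shows a path with only one folder level, where the album tag is simply left out.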
The Tag Mask importer supports the use of multiple Tag Masks. Since there can be multiple Tag Masks specified for the same set of files, there has to be a way for the importer to choose which one to use for each file encountered.
To accomplish this, each tag in a Tag Mask can have what is called a selector. A selector is a regular expression that is added to the tag name after an equals sign. When working with multiple Tag Masks, the importer checks each file name encountered against each Tag Mask in order until a match is found. If no match is found, the last Tag Mask is used.
Let’s continue to use this Tag Mask for our next example:
<artist>\<album>\<name>.<>
This works just fine for our collection, except for one folder. We have a folder called “singles” that holds single tracks, which don’t belong to an album. Each file name in the “singles” folder consists of the name of the artist, a dash and the name of the song.
We could add the following Tag Mask to process this folder in a different way, keeping the other Tag Mask as the last one.
<album=singles>\<artist>-<name>.<>
<artist>\<album>\<name>.<>
In this case, we added the selector “singles” to the album tag. This means that this Tag Mask will only be applied when the first folder level is equal to “singles”. Any files that match that criterion will be processed using that Tag Mask; all others will be processed using the second Tag Mask.
More specifically, any files that match the following regular expression will use the first Tag Mask.
singles\*-*.*
What the importer does is convert each Tag Mask into a regular expression using any selectors and separators present in the Tag Mask. If a tag has no selector, it is simply replaced with the wildcard asterisk *. It then compares each path with the regular expression and, if they match, it uses that Tag Mask.
One potential drawback of our Tag Mask is that all of our singles will be tagged as being in the “singles” album. If that is not desirable, you can omit the tag name when using a selector. In that case, the selector will be used in the matching process, but the resulting tag will be ignored. So, we could change our Tag Mask to:
<=singles>\<artist>-<name>.<>
This would mean that all of our single tracks have only the <artist> and <name> tags associated with them.
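The selection process described above can be sketched as follows. This is only an approximation under stated assumptions: Python's standard fnmatch module stands in for the importer's own pattern matcher (it shares the *, ? and [ ] wildcards but is not the same engine), and the function names are hypothetical.

```python
import re
from fnmatch import fnmatchcase

# Matches <name>, <name=selector>, <=selector> and <>; group 1 is the selector, if any.
TAG = re.compile(r"<[^<>=]*(?:=([^<>]*))?>")

def mask_to_pattern(mask):
    """Replace each tag with its selector, or with * when it has none."""
    return TAG.sub(lambda m: m.group(1) if m.group(1) is not None else "*", mask)

def choose_mask(masks, relative_path):
    """Return the first mask whose pattern matches; fall back to the last mask."""
    for mask in masks:
        # Lower-casing both sides makes the comparison case insensitive.
        if fnmatchcase(relative_path.lower(), mask_to_pattern(mask).lower()):
            return mask
    return masks[-1]

MASKS = [r"<=singles>\<artist>-<name>.<>", r"<artist>\<album>\<name>.<>"]

print(mask_to_pattern(MASKS[0]))      # singles\*-*.*
print(choose_mask(MASKS, r"Singles\Bob Marley-Is This Love.mp3"))
print(choose_mask(MASKS, r"Bob Marley\Legend\Is This Love.mp3"))
```

Note that the pattern is the same whether the first tag is written as <album=singles> or <=singles>; the tag name only affects what is stored afterwards, not which Tag Mask is chosen.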
Selectors can be used inside any tag, including tags that are file name parts.
As we mentioned, a selector is a regular expression. This means that it can be literal text such as “singles” and it can also include any of the special characters of a regular expression. All selectors are case insensitive.
You can use sets of characters enclosed in [ and ]. Each set can include any number of literal characters and also ranges separated by a -. Each set matches only a single character if it is in the set. If the first character in a set is ! then the set will match a character if it is not part of the set. You can also use the * wildcard which matches any number of characters and the ? wildcard which matches only a single character.
For example:
[0-9][0-9]* will match any string that has the first two characters in the range 0 to 9.
?[0-9][0-9]* will match any string that has anything for a first character and the second and third characters are in the range 0 to 9.
[abc]*[abc] will match any string whose first and last characters are a, b or c (or A, B or C).
If you need to use any of the special characters *, ? or [ as part of a literal, you can put a forward slash / before it.
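For readers who want to experiment with these rules, the sketch below translates a selector into a Python regular expression. It is an assumption-laden model of the syntax described above (sets, the ! negation, the * and ? wildcards, the / escape and case insensitivity), not the importer's own matcher; the function names are hypothetical, and escapes inside a set are not handled.

```python
import re

def _set_to_regex(selector, start):
    """Translate the [...] set opening at index start; None if it is not a valid set."""
    end = selector.find("]", start + 1)
    body = selector[start + 1:end] if end != -1 else ""
    negate = body.startswith("!")
    if negate:
        body = body[1:]
    if not body:
        return None
    for ch in "\\^[":                          # characters special inside a regex set
        body = body.replace(ch, "\\" + ch)
    return "[" + ("^" if negate else "") + body + "]", end + 1

def selector_to_regex(selector):
    """Build a case-insensitive regex from a selector."""
    out, i = [], 0
    while i < len(selector):
        ch = selector[i]
        if ch == "/" and i + 1 < len(selector):   # /x means a literal x
            out.append(re.escape(selector[i + 1]))
            i += 2
            continue
        if ch == "[":
            parsed = _set_to_regex(selector, i)
            if parsed:
                fragment, i = parsed
                out.append(fragment)
                continue
        out.append({"*": ".*", "?": "."}.get(ch, re.escape(ch)))
        i += 1
    return re.compile("".join(out), re.IGNORECASE | re.DOTALL)

def matches(selector, text):
    """True when the whole text matches the selector."""
    return selector_to_regex(selector).fullmatch(text) is not None

print(matches("[0-9][0-9]*", "12 Bar Blues"))   # True
print(matches("[abc]*[abc]", "Alpha"))          # True
print(matches("/*", "*"))                       # True
```

The sketch requires the whole string to match, which is why the examples above end in * when only the leading characters matter.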
Copyright 2003 Meedio, LLC.