VFAT Long File Names

VFAT Long File Names (LFN) are stored on a FAT file system using a trick: one or more additional entries are added to the directory before the normal file entry. The additional entries are marked with the Volume Label, System, Hidden, and Read Only attributes (yielding 0x0F), a combination that is not expected in the MS-DOS environment and is therefore ignored by MS-DOS programs and third-party utilities. Notably, a directory containing only volume labels is considered empty and may be deleted; this situation arises when files created with long names are deleted from plain DOS. The method is very similar to DELWATCH, which has used the volume attribute since DR DOS 6.0 (1991) to hide pending delete files for possible future undeletion.
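As a small illustration of how the four attributes combine to 0x0F, the following sketch spells out the standard FAT attribute bits. The macro names and the helper is_lfn_attr are illustrative choices, not names taken from any particular driver.

```c
#include <stdint.h>

/* Standard FAT directory-entry attribute bits. */
#define ATTR_READ_ONLY 0x01
#define ATTR_HIDDEN    0x02
#define ATTR_SYSTEM    0x04
#define ATTR_VOLUME_ID 0x08

/* The combination that marks a VFAT long file name entry (0x0F). */
#define ATTR_LONG_NAME \
    (ATTR_READ_ONLY | ATTR_HIDDEN | ATTR_SYSTEM | ATTR_VOLUME_ID)

/* Hypothetical helper: nonzero if the attribute byte marks an LFN entry. */
int is_lfn_attr(uint8_t attr)
{
    return attr == ATTR_LONG_NAME;
}
```

A plain volume label (0x08 alone) or an ordinary archive file (0x20) does not satisfy this test.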

Because older versions of DOS could mistake LFN entries in the root directory for the volume label, VFAT was designed to create a blank volume label in the root directory before adding any LFN entries (if a volume label did not already exist).

Each phony entry can contain up to 13 UTF-16 characters (26 bytes) by reusing fields that hold the file size or time stamps in a normal entry. The starting cluster field is not reused; for compatibility with disk utilities it is set to 0 (see 8.3 filename for additional explanations). Up to 20 of these 13-character entries may be chained, supporting a maximum length of 255 UTF-16 characters.

After the last UTF-16 character, a 0x0000 is added. The remaining unused characters are filled with 0xFFFF.
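The termination and padding rule can be sketched as follows. The function name lfn_fill_chunk and its parameters are hypothetical. The sketch also assumes that no terminator is written when the name exactly fills the final entry, a case the text above does not spell out.

```c
#include <stddef.h>
#include <stdint.h>

#define LFN_CHARS_PER_ENTRY 13

/* Fill the 13-unit chunk for LFN entry number seq (1-based) of a name that is
 * len UTF-16 code units long. Returns how many name units were copied. */
size_t lfn_fill_chunk(const uint16_t *name, size_t len, unsigned seq,
                      uint16_t out[LFN_CHARS_PER_ENTRY])
{
    size_t start = (size_t)(seq - 1) * LFN_CHARS_PER_ENTRY;
    size_t copied = 0;

    for (size_t i = 0; i < LFN_CHARS_PER_ENTRY; i++) {
        size_t pos = start + i;
        if (pos < len) {
            out[i] = name[pos];   /* ordinary name character */
            copied++;
        } else if (pos == len) {
            out[i] = 0x0000;      /* terminator right after the last character */
        } else {
            out[i] = 0xFFFF;      /* padding for the unused remainder */
        }
    }
    return copied;
}
```

For a 32-character name, chunk 3 would hold six characters, one 0x0000, and six 0xFFFF values.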

LFN entries use the following format:

Byte Offset    Length (bytes)    Description
0x00           1                 Sequence Number (bit 6: last logical, first physical LFN entry; bit 5: 0; bits 4-0: number 0x01..0x14 (0x1F); deleted entry: 0xE5)
0x01           10                Name characters (five UTF-16 characters)
0x0B           1                 Attributes (always 0x0F)
0x0C           1                 Type (always 0x00 for VFAT LFN, other values reserved for future use; for special usage of bits 4 and 3 in SFNs see below)
0x0D           1                 Checksum of DOS file name
0x0E           12                Name characters (six UTF-16 characters)
0x1A           2                 First cluster (always 0x0000)
0x1C           4                 Name characters (two UTF-16 characters)
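To make the layout concrete, here is a minimal sketch that gathers the 13 name characters from the three character fields of a raw 32-byte entry. It assumes little-endian storage, as used throughout FAT on-disk structures; the names rd16 and lfn_extract_chars are hypothetical.

```c
#include <stdint.h>

#define LFN_ATTR 0x0F

/* Read a little-endian 16-bit value from an unaligned byte buffer. */
static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Returns 1 and fills out[13] if raw carries the LFN attribute, else 0. */
int lfn_extract_chars(const uint8_t raw[32], uint16_t out[13])
{
    int n = 0;

    if (raw[0x0B] != LFN_ATTR)
        return 0;

    for (int i = 0; i < 5; i++)   /* offset 0x01: five characters */
        out[n++] = rd16(raw + 0x01 + 2 * i);
    for (int i = 0; i < 6; i++)   /* offset 0x0E: six characters */
        out[n++] = rd16(raw + 0x0E + 2 * i);
    for (int i = 0; i < 2; i++)   /* offset 0x1C: two characters */
        out[n++] = rd16(raw + 0x1C + 2 * i);

    return 1;
}
```

The caller would still need to stop at the 0x0000 terminator and ignore 0xFFFF padding when assembling the full name.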

If multiple LFN entries are required to represent a file name, the last LFN entry (holding the last part of the file name) comes first. Its sequence number also has bit 6 (0x40) set, marking it as the last logical LFN entry even though it is the first entry seen when reading the directory. The last LFN entry has the largest sequence number, which decreases in each following entry; the first LFN entry has sequence number 1. A value of 0xE5 indicates that the entry is deleted.
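Decoding the sequence byte according to these rules might look like the sketch below. The struct lfn_seq_info and the function lfn_decode_seq are hypothetical names introduced only for illustration.

```c
#include <stdint.h>

#define LFN_SEQ_LAST    0x40  /* bit 6: last logical, first physical entry */
#define LFN_SEQ_MASK    0x1F  /* bits 4-0: entry number */
#define LFN_SEQ_DELETED 0xE5  /* marker for a deleted entry */

typedef struct {
    int deleted;     /* nonzero if the entry is marked deleted */
    int is_last;     /* nonzero if bit 6 is set */
    unsigned index;  /* entry number, 1 for the first part of the name */
} lfn_seq_info;

lfn_seq_info lfn_decode_seq(uint8_t seq)
{
    lfn_seq_info info = {0, 0, 0};

    if (seq == LFN_SEQ_DELETED) {
        info.deleted = 1;
        return info;
    }
    info.is_last = (seq & LFN_SEQ_LAST) != 0;
    info.index = seq & LFN_SEQ_MASK;
    return info;
}
```

The deleted check comes first because 0xE5 also has bit 6 set and would otherwise be misread as a last entry with number 5.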

On FAT12 and FAT16 volumes, testing for the values at 0x1A to be zero and at 0x1C to be non-zero can be used to distinguish between VFAT LFNs and pending delete files under DELWATCH.
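The test described above could be sketched as follows. The helper name is hypothetical, and the sketch assumes the value at 0x1C is the whole four-byte field (the file size in a normal entry) and that the entry has already been found to carry the volume attribute.

```c
#include <stdint.h>

/* Hypothetical helper: raw points at a 32-byte directory entry that carries
 * the volume attribute. Returns nonzero if it looks like a VFAT LFN entry
 * rather than a DELWATCH pending delete file. */
int is_vfat_lfn_not_delwatch(const uint8_t raw[32])
{
    /* First cluster field at 0x1A must be zero. */
    int cluster_zero = (raw[0x1A] | raw[0x1B]) == 0;

    /* Four bytes at 0x1C must not all be zero (assumed field width). */
    int tail_nonzero =
        (raw[0x1C] | raw[0x1D] | raw[0x1E] | raw[0x1F]) != 0;

    return cluster_zero && tail_nonzero;
}
```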

For example, the file name "File with very long filename.ext" would be stored as follows:

Sequence number    Entry data
0x43               "me.ext"
0x02               "y long filena"
0x01               "File with ver"
???                Normal 8.3 entry
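The sequence numbers in this example follow from the name length alone. The sketch below computes the number of entries and the sequence bytes in on-disk order; the function names are hypothetical, and the limit of 20 entries is the one stated earlier.

```c
#include <stddef.h>
#include <stdint.h>

#define LFN_CHARS_PER_ENTRY 13
#define LFN_MAX_ENTRIES     20
#define LFN_SEQ_LAST        0x40

/* Number of LFN entries needed for a name of len UTF-16 code units. */
unsigned lfn_entry_count(size_t len)
{
    return (unsigned)((len + LFN_CHARS_PER_ENTRY - 1) / LFN_CHARS_PER_ENTRY);
}

/* Fill seq[] with the sequence bytes in on-disk order and return the count.
 * Returns 0 if the name would need more than 20 entries. */
unsigned lfn_sequence_bytes(size_t len, uint8_t seq[LFN_MAX_ENTRIES])
{
    unsigned count = lfn_entry_count(len);

    if (count > LFN_MAX_ENTRIES)
        return 0;

    for (unsigned i = 0; i < count; i++) {
        unsigned index = count - i;  /* highest number is stored first */
        seq[i] = (uint8_t)(index | (i == 0 ? LFN_SEQ_LAST : 0));
    }
    return count;
}
```

For the 32-character example name this gives three entries with sequence bytes 0x43, 0x02, 0x01, matching the table.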

A checksum also allows verification of whether a long file name matches the 8.3 name; such a mismatch could occur if a file was deleted and re-created using DOS in the same directory position. The checksum is calculated using the algorithm below. (Note that pFCBName is a pointer to the name as it appears in a regular directory entry, i.e. the first eight characters are the filename, and the last three are the extension. The dot is implicit. Any unused space in the filename is padded with space characters (ASCII 0x20). For example, "Readme.txt" would be "README␠␠TXT".)

    unsigned char lfn_checksum(const unsigned char *pFCBName)
    {
        int i;
        unsigned char sum = 0;

        /* Rotate sum right by one bit, then add the next name byte. */
        for (i = 0; i < 11; i++)
            sum = ((sum & 1) << 7) + (sum >> 1) + *pFCBName++;

        return sum;
    }
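As a sketch of how the checksum might be used for the verification described above, the following compares the checksum byte at offset 0x0D of an LFN entry with the value computed from the 8.3 entry that follows it. The helper lfn_matches_sfn is hypothetical; the checksum routine is repeated so that the example compiles on its own.

```c
#include <stdint.h>

/* Checksum over the 11-byte padded 8.3 name: rotate right, then add. */
unsigned char lfn_checksum(const unsigned char *pFCBName)
{
    unsigned char sum = 0;

    for (int i = 0; i < 11; i++)
        sum = (unsigned char)(((sum & 1) << 7) + (sum >> 1) + *pFCBName++);

    return sum;
}

/* Hypothetical helper: nonzero if the 32-byte LFN entry lfn carries the
 * checksum of the name stored in the 32-byte 8.3 entry sfn. */
int lfn_matches_sfn(const uint8_t lfn[32], const uint8_t sfn[32])
{
    return lfn[0x0D] == lfn_checksum(sfn);
}
```

Working through the algorithm by hand for the padded name "README␠␠TXT" yields 0x73, so every LFN entry belonging to that file would carry 0x73 at offset 0x0D.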

If a filename contains only lowercase letters, or combines a lowercase basename with an uppercase extension or vice versa, has no special characters, and fits within the 8.3 limits, Windows NT and later versions of Windows such as XP do not create a VFAT entry. Instead, two bits in byte 0x0C of the directory entry indicate that the filename should be considered entirely or partially lowercase. Specifically, bit 4 means lowercase extension and bit 3 means lowercase basename, which allows for combinations such as "example.TXT" or "HELLO.txt" but not "Mixed.txt". Few other operating systems support this extension. It creates a backwards-compatibility problem with older Windows versions (Windows 95 / 98 / 98 SE / ME), which see all-uppercase filenames if the extension has been used; the name of a file can therefore change when it is transported between operating systems, such as on a USB flash drive. Current 2.6.x versions of Linux recognize this extension when reading (source: kernel 2.6.18 /fs/fat/dir.c and fs/vfat/namei.c); the mount option shortname determines whether this feature is used when writing.
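A reader that honours the two case bits could rebuild the display name roughly as sketched below. The function sfn_display_name and the macro names are hypothetical, and the sketch makes the simplifying assumption that the name and extension end at the first space.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>

#define CASE_LOWER_BASE 0x08  /* bit 3 of byte 0x0C: lowercase basename */
#define CASE_LOWER_EXT  0x10  /* bit 4 of byte 0x0C: lowercase extension */

/* Build "name.ext" from a 32-byte 8.3 entry, applying the case bits.
 * out needs room for 13 bytes: 8 + '.' + 3 + terminating NUL. */
void sfn_display_name(const uint8_t raw[32], char out[13])
{
    uint8_t flags = raw[0x0C];
    size_t n = 0;

    for (int i = 0; i < 8 && raw[i] != ' '; i++)
        out[n++] = (flags & CASE_LOWER_BASE) ? (char)tolower(raw[i])
                                             : (char)raw[i];

    if (raw[8] != ' ') {
        out[n++] = '.';  /* the dot is implicit on disk */
        for (int i = 8; i < 11 && raw[i] != ' '; i++)
            out[n++] = (flags & CASE_LOWER_EXT) ? (char)tolower(raw[i])
                                                : (char)raw[i];
    }
    out[n] = '\0';
}
```

With bit 3 set, the stored name "EXAMPLE␠TXT" is shown as "example.TXT"; with bit 4 set, "HELLO␠␠␠TXT" is shown as "HELLO.txt". A reader that ignores byte 0x0C shows both in uppercase.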
