Oh, I forgot to mention that each skin map vertex uses 8 bytes and each skin map face uses 6 bytes. Because textures aren't animated along with the model, the skin map only gets processed once rather than for every frame of animation. Each coordinate (3 axes for a model vertex, 2 for a skin vertex) is a 32-bit float, so 4 bytes each, which gives 2 × 4 = 8 bytes per skin vertex. Each vertex is assigned an index from 0 to 65,535 (a 16-bit unsigned value), and each face is built from 3 of those 16-bit indices, which is where the 6 comes from (3 × 2 bytes).
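To double-check the arithmetic, here is a rough Python sketch using the standard `struct` module. The names (`SKIN_VERTEX`, `MODEL_VERTEX`, `FACE`, `skin_map_bytes`) are made up for illustration, and the little-endian, unpadded layout is an assumption, since only the sizes are given above.

```python
import struct

# Assumed layouts: little-endian, no padding. Only the sizes come from the post.
SKIN_VERTEX = struct.Struct("<2f")   # 2 axes, each a 32-bit float -> 8 bytes
MODEL_VERTEX = struct.Struct("<3f")  # 3 axes, each a 32-bit float -> 12 bytes
FACE = struct.Struct("<3H")          # 3 unsigned 16-bit vertex indices -> 6 bytes


def skin_map_bytes(vertex_count, face_count):
    """Total bytes for a skin map with the given vertex and face counts."""
    return vertex_count * SKIN_VERTEX.size + face_count * FACE.size


if __name__ == "__main__":
    print(SKIN_VERTEX.size, MODEL_VERTEX.size, FACE.size)
    print(skin_map_bytes(100, 50))
```

So a skin map with 100 vertices and 50 faces would come to 100 × 8 + 50 × 6 = 1,100 bytes. The 16-bit index also means a single face can't reference more than 65,536 distinct vertices.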