MySQL VARCHAR Lengths and UTF-8: Clarifying the Confusion
When creating a VARCHAR(32) field in a UTF-8 table in MySQL, the question arises: does it represent a storage capacity of 32 bytes or 32 characters? The understanding of this distinction can vary depending on the MySQL version.
Understanding VARCHAR Lengths
In MySQL 5.0.3 and later versions, VARCHAR lengths are interpreted in character units, not bytes. This means that a VARCHAR(32) field can accommodate up to 32 characters, regardless of the storage space required by each character.
In earlier versions of MySQL (4.1 and below), VARCHAR lengths were interpreted in bytes. Therefore, a VARCHAR(32) field in UTF-8 would allow for up to 32 bytes of data, which could potentially be stored as less than 32 characters, depending on the character encoding.
UTF-8 Character Encoding
UTF-8 is a character encoding that can represent a wide range of international characters, including multi-byte characters. This means that a single UTF-8 encoded character can take up multiple bytes of storage space.
Maximum VARCHAR Length with UTF-8
Due to the possibility of multi-byte UTF-8 characters, the effective maximum length of a VARCHAR column in MySQL 5.0.3 and later is subject to the maximum row size (65,535 bytes) and the character set used. For the UTF-8 character set, a VARCHAR can have a maximum length of approximately 21,844 characters, as each character can require up to three bytes.
The above is the detailed content of How Many Characters Can a VARCHAR(32) Field Hold in a UTF-8 MySQL Table?. For more information, please follow other related articles on the PHP Chinese website!