How do VARCHAR lengths work in MySQL with UTF-8: Bytes or Characters?

Barbara Streisand
2024-11-22 02:21:13

MySQL VARCHAR Lengths and UTF-8: Bytes versus Characters

When creating a VARCHAR field in a MySQL table, it's crucial to understand how the specified length is interpreted. In MySQL versions prior to 4.1, VARCHAR lengths were defined in bytes. However, from MySQL 4.1 onwards, lengths are counted in characters.

A VARCHAR(32) column in a UTF-8 table holds up to 32 characters, not 32 bytes. UTF-8 is a variable-length encoding, so a single character can occupy more than one byte: up to 3 bytes in MySQL's utf8 (utf8mb3) character set, and up to 4 bytes in utf8mb4.
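To make the difference between characters and bytes concrete, here is a minimal Python sketch that compares the two counts for a few sample strings. It runs outside MySQL and only illustrates UTF-8 encoding itself; the helper name utf8_sizes and the sample strings are illustrative assumptions, not part of MySQL.

```python
# Illustrative only: compare character count with UTF-8 byte count.
# utf8_sizes is a hypothetical helper, not a MySQL or library function.

def utf8_sizes(text):
    """Return (number of characters, number of UTF-8 bytes) for text."""
    return len(text), len(text.encode("utf-8"))

samples = [
    "hello",               # ASCII: 1 byte per character
    "h\u00e9llo",          # contains e-acute: 2 bytes for that character
    "\u65e5\u672c\u8a9e",  # three CJK characters: 3 bytes each
    "\U0001F600",          # an emoji: 4 bytes
]

for s in samples:
    chars, nbytes = utf8_sizes(s)
    print(f"{chars} characters, {nbytes} bytes")
```

All four strings would fit in a VARCHAR(32) column by character count, although their byte sizes differ. Note that 4-byte characters such as the emoji require MySQL's utf8mb4 character set; the older utf8 (utf8mb3) character set stores at most 3 bytes per character.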

The official MySQL documentation for version 5 states:

"MySQL interprets length specifications in character column definitions in character units. This applies to CHAR, VARCHAR, and the TEXT types."

However, the maximum length you can declare for a VARCHAR column depends on the character set. In MySQL 5.0.3 and later, the effective maximum length is subject to the maximum row size (65,535 bytes, shared among all columns in the row) and the character set used.

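For example, since characters in MySQL's utf8 (utf8mb3) character set can require up to 3 bytes each, a VARCHAR column using utf8 can be declared with a maximum of 21,844 characters. This is because 21,844 multiplied by 3 (bytes per character) is 65,532. The remaining 3 bytes of the 65,535-byte row limit are not free for other columns: the column's length prefix takes 2 bytes, and a nullable column needs a further byte for its NULL flag. With utf8mb4 (up to 4 bytes per character), the same reasoning gives a maximum of 16,383 characters.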
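The arithmetic behind these limits can be sketched in a few lines of Python. This is a simplified model, not MySQL's actual implementation: it assumes the VARCHAR column is the only column in the table, that it uses a 2-byte length prefix, and that a nullable column costs one extra byte for its NULL flag. The function name max_varchar_chars is hypothetical.

```python
# Simplified model of the VARCHAR length limit; the overhead values are assumptions.
ROW_SIZE_LIMIT = 65_535  # maximum row size in bytes

def max_varchar_chars(bytes_per_char, nullable=True):
    """Estimate the largest declarable VARCHAR length, in characters,
    for a table whose only column is this VARCHAR."""
    length_prefix = 2                 # assumed bytes used to store the value's length
    null_flag = 1 if nullable else 0  # assumed byte for the NULL flag
    usable_bytes = ROW_SIZE_LIMIT - length_prefix - null_flag
    return usable_bytes // bytes_per_char

print(max_varchar_chars(3))  # utf8 (utf8mb3): 21844
print(max_varchar_chars(4))  # utf8mb4: 16383
print(max_varchar_chars(1))  # single-byte character set such as latin1: 65532
```

In a real table, every additional column reduces the bytes available, so the declarable maximum will usually be lower than these figures.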
