Why Do "Harligt" and "Härligt" Return the Same Results in MySQL? A Look at Collation and Character Normalization

DDD · 2024-10-26 22:48:30

MySQL's Treatment of Special Characters: A Paradox Explained

In MySQL, executing queries involving special characters like 'Å', 'Ä', and 'Ö' often raises questions regarding result consistency. For instance, queries with 'Harligt' and 'Härligt' yield identical results, leaving users perplexed.

This behavior comes from the collation in use, typically "utf8_general_ci" (the default collation for the utf8 character set) or "utf8_unicode_ci." Both are case-insensitive and accent-insensitive: when comparing strings they treat accented letters, including the Scandinavian ones, as equal to their unaccented base letters (e.g., "Ä = A"). The stored data is not modified; only comparison and sorting are affected. This makes searches more forgiving but is inconvenient when "a" and "ä" must be treated as different letters.
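The effect can be checked directly on string literals, without any table. The sketch below assumes a connection that uses the utf8mb4 character set, so it names the utf8mb4_* counterparts of the collations mentioned above; the column aliases are illustrative only.

```sql
-- Assumption: the connection character set is utf8mb4.
SET NAMES utf8mb4;

-- COLLATE binds to the right-hand literal and decides how "=" compares.
SELECT
  'Harligt' = 'Härligt' COLLATE utf8mb4_general_ci AS general_ci_equal,  -- 1
  'Harligt' = 'Härligt' COLLATE utf8mb4_unicode_ci AS unicode_ci_equal,  -- 1
  'Harligt' = 'Härligt' COLLATE utf8mb4_bin        AS bin_equal;         -- 0
```

The two accent-insensitive collations report the strings as equal, while the binary collation compares code points and reports them as different.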

To resolve this issue, consider the following options:

  • Use a Different Collation: A binary collation such as "utf8_bin" compares strings by code point, so "a" and "ä" are distinct. The trade-off is that comparisons also become case-sensitive ("Härligt" no longer matches "härligt") and sorting follows code-point order rather than alphabetical order.
  • Specify Collation in Queries: To override the collation for a single query, append "COLLATE utf8_bin" to the comparison. The collation must belong to the character set of the string it is applied to, so on a utf8mb4 connection use "utf8mb4_bin" instead. Example:
SELECT * FROM topics WHERE name = 'Harligt' COLLATE utf8_bin;
  • Create a Custom Collation: If neither of the above options meets your needs, you can define a custom collation tailored to your requirements.
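As a rough sketch of the first option, the column's collation can be changed so that every comparison against it is binary. The table and column names come from the query above; the column type VARCHAR(255) NOT NULL and the utf8mb4 character set are assumptions and should be replaced with the column's real definition.

```sql
-- Assumption: topics.name is a VARCHAR(255) NOT NULL column.
-- MODIFY restates the whole column definition, so repeat every existing
-- attribute (length, NULL-ability, default) or it will be lost.
ALTER TABLE topics
  MODIFY name VARCHAR(255)
    CHARACTER SET utf8mb4 COLLATE utf8mb4_bin NOT NULL;

-- Confirm the change: the Collation column should now show utf8mb4_bin.
SHOW FULL COLUMNS FROM topics;

-- With a binary column collation, this no longer matches 'Härligt'.
SELECT * FROM topics WHERE name = 'Harligt';
```

Changing the column collation keeps comparisons consistent with any index on the column, whereas a per-query COLLATE override that differs from the column's collation may prevent that index from being used.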

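Note that with the utf8 collations discussed above, case-insensitive comparisons (including LIKE) also ignore accents: "utf8_general_ci" and "utf8_unicode_ci" fold both case and accents, while "utf8_bin" folds neither. MySQL 8.0 adds accent-sensitive, case-insensitive collations such as "utf8mb4_0900_as_ci" that separate the two. Related discussions: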

  • [Looking for case insensitive MySQL collation where “a” != “ä”](https://dba.stackexchange.com/questions/231116/looking-for-case-insensitive-mysql-collation-where-a-a)
  • [MYSQL case sensitive search for utf8_bin field](https://stackoverflow.com/questions/9704962/mysql-case-sensitive-search-for-utf8-bin-field)
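On MySQL 8.0 or later, the case-insensitive but accent-sensitive behavior asked about in the first link is available through the built-in collation utf8mb4_0900_as_ci. The following is a minimal sketch, assuming a utf8mb4 connection; the column aliases are illustrative.

```sql
-- Requires MySQL 8.0+. Assumption: the connection character set is utf8mb4.
SET NAMES utf8mb4;

SELECT
  'härligt' = 'Härligt' COLLATE utf8mb4_0900_as_ci AS case_differs,    -- 1
  'Harligt' = 'Härligt' COLLATE utf8mb4_0900_as_ci AS accent_differs,  -- 0
  'Härligt' LIKE 'här%' COLLATE utf8mb4_0900_as_ci AS like_same_accent, -- 1
  'Härligt' LIKE 'har%' COLLATE utf8mb4_0900_as_ci AS like_no_accent;   -- 0
```

Case differences are ignored while "a" and "ä" stay distinct, for both "=" and LIKE.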
