Home  >  Article  >  Database  >  How to Fix Double-Encoded UTF8 Characters in a MySQL Table?

How to Fix Double-Encoded UTF8 Characters in a MySQL Table?

DDD
DDDOriginal
2024-10-31 00:15:03891browse

How to Fix Double-Encoded UTF8 Characters in a MySQL Table?

Fixing Double-Encoded UTF8 Characters in an UTF-8 Table

A previous import operation using LOAD DATA INFILE incorrectly assumed that the input CSV file was Latin1 encoded. This led to multibyte characters being split into two single-byte characters and subsequently double-encoded in UTF-8, creating anomalies such as 'ñ' instead of 'ñ'.

To rectify these misencoded strings, MySQL provides a solution using the CONVERT() function:

CONVERT(CAST(CONVERT(field USING latin1) AS BINARY) USING utf8)

This function takes the double-encoded field and sequentially converts it from Latin1 (assuming the original file encoding) to binary representation and finally to UTF-8, effectively undoing the double encoding.

To apply this correction, an UPDATE statement can be executed:

UPDATE tablename SET
    field = CONVERT(CAST(CONVERT(field USING latin1) AS BINARY) USING utf8);

This statement will replace the existing field values with the corrected ones, restoring the intended UTF-8 representation of the multibyte characters.

The above is the detailed content of How to Fix Double-Encoded UTF8 Characters in a MySQL Table?. For more information, please follow other related articles on the PHP Chinese website!

Statement:
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn