MySQL GBK→UTF-8 encoding conversion
Preface:
This is my first time writing a tutorial. It is not really a tutorial; I just want to summarize my notes from the conversion. If there are mistakes along the way, or the method is less than ideal, please reply so we can work it out together.
I also hope our forum will be more than a place to chat, and that everyone can help liven up its learning atmosphere. After all, we all come from a place that was supposed to give us knowledge, and however much you took away from it, you still need knowledge.
Okay, let’s get down to business.
I. Preparation:
Environment: MySQL 4.1.x or later.
Convertz: a text encoding conversion tool. It was introduced on molyx and is the one I use; there are many other tools of this kind.
II. Theory:
MySQL has supported UTF-8 as an internal storage character set since version 4.1. I only found this out a few days ago. While upgrading the forum, the server's database environment was 4.0.26, and at the time I did not know that this version has no UTF-8 character set, so I went through some detours. In short, if a UTF-8 dump is involved, MySQL must first be upgraded to 4.1 or later.
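If you want to check a server before starting, the 4.1 requirement can be tested from its version string. This is only a small sketch: the function name supports_utf8 is made up for illustration, and it assumes the string begins with "major.minor", as in the output of SELECT VERSION().

```python
import re


def supports_utf8(version: str) -> bool:
    """Return True if a MySQL version string is 4.1 or later."""
    # Only the leading "major.minor" part matters; suffixes such as
    # "-log" or "a-community" are ignored.
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    return (major, minor) >= (4, 1)


print(supports_utf8("4.0.26"))      # False
print(supports_utf8("4.1.22-log"))  # True
```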
The general flow of the conversion is: backup (better safe than sorry) → repair the database → export with mysqldump → convert the encoding with Convertz → edit the converted file → import with the mysql client to restore.
III. Practice:
1. Backup. Not much needs saying here. Any conventional backup method will do, as long as you are able to restore from it yourself.
2. Repair. Run mysqlcheck -r -u user -p database_name. If every table reports OK, you are done. If not, run it again. If some tables are still not OK after that, I don't know what else to do.
3. Export. MySQL stores data as latin1 by default, so you need to find out beforehand which encoding your database really uses. For example, lncz.net wrote its text as gbk, but the tables stored it as latin1. In that case, specify latin1 as the encoding when exporting, so that the exported file contains the original gbk bytes and displays correctly when opened as ANSI text.
Export command: mysqldump --default-character-set=latin1 -u user -p database_name table_name > path
A large database should be exported in pieces, otherwise the following steps become very troublesome. I exported each table separately. My reasoning at the time was simple: the database contained damaged tables, and I wanted to know which table failed during recovery so that I could fix it on its own.
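To see why step 3 exports as latin1, the byte handling can be imitated in a few lines of Python. This is only an illustration, not what MySQL runs internally: Python's "latin-1" codec (ISO-8859-1) stands in for MySQL's latin1, which is close to cp1252, and the sample text is made up.

```python
text = "论坛"                    # sample text, chosen for illustration
gbk_bytes = text.encode("gbk")   # what the old application sent to MySQL

# MySQL took these bytes for latin1, so every byte became one "character".
stored = gbk_bytes.decode("latin-1")

# Exporting as latin1 turns each character back into the same single byte,
# so the dump file holds the original gbk bytes.
dump_latin1 = stored.encode("latin-1")

# Exporting as utf8 re-encodes every pseudo-character instead,
# which produces garbled text rather than the UTF-8 form of the original.
dump_utf8 = stored.encode("utf-8")

print(dump_latin1 == gbk_bytes)            # True
print(dump_latin1.decode("gbk") == text)   # True
print(dump_utf8 == text.encode("utf-8"))   # False
```

The same double conversion is what goes wrong when UTF-8 text is loaded over a connection that the server still treats as latin1.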
4. Conversion. Convert the exported files from GBK to UTF-8 with Convertz. The software is very simple to use, so there is no need to say more.
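If Convertz is not at hand, the same conversion can be scripted. The following is a minimal sketch rather than a replacement for the tool: the function name convert_gbk_to_utf8 and the file names are placeholders, and it assumes the whole dump is valid GBK.

```python
def convert_gbk_to_utf8(src: str, dst: str) -> None:
    """Re-encode a GBK text file as UTF-8 without a BOM."""
    # newline="" leaves line endings untouched on both sides.
    # Decoding is strict, so damaged bytes raise UnicodeDecodeError
    # instead of being replaced silently.
    with open(src, "r", encoding="gbk", newline="") as fin, \
         open(dst, "w", encoding="utf-8", newline="") as fout:
        for line in fin:          # line by line, so large dumps fit in memory
            fout.write(line)
```

It would be called once per exported file, for example convert_gbk_to_utf8("posts.sql", "posts.utf8.sql"), where both names are hypothetical.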
5. Modification. When I tried to import the converted files directly, I failed N times and got garbled characters every time. After thinking it over, I realized that a direct import still makes the database store the data with its default latin1, while the file is now encoded as utf-8, so a second conversion takes place and the text is damaged. I'm not sure exactly how MySQL handles this; if anyone knows, please tell me. To avoid it, add the statement "set names utf8;" at the top of the converted file (note that the name is utf8, not utf-8), and change every "CHARSET=latin1;" in the file to "CHARSET=utf8;" so that the tables are stored as utf8.
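The two edits in step 5 can also be applied by a script. This is a rough sketch with a made-up function name, patch_dump. It goes one step beyond the text above: some mysqldump versions write their own SET NAMES latin1 line into the dump header, so the sketch rewrites that as well, which is my assumption and not something I tested during the conversion. It is a blind text replacement, so row data that happens to contain the same strings would be changed too.

```python
def patch_dump(sql: str) -> str:
    """Prepare a converted dump so that it imports as utf8."""
    # Table storage charset; this also covers "DEFAULT CHARSET=latin1".
    patched = sql.replace("CHARSET=latin1", "CHARSET=utf8")
    # Assumption: a header line from mysqldump may reset the connection
    # charset to latin1, so rewrite it too.
    patched = patched.replace("SET NAMES latin1", "SET NAMES utf8")
    # Tell the server that the statements that follow are utf8.
    return "SET NAMES utf8;\n" + patched


sample = "CREATE TABLE t (a int) ENGINE=MyISAM DEFAULT CHARSET=latin1;\n"
print(patch_dump(sample))
```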
6. Recovery. In principle the recovery is very simple: the mysql client reads the file and does all the work. One thing to note is that if your database is large, you need to raise the global variable max_allowed_packet, which defaults to 1M. Set it in my.ini (my.cnf on Unix-like systems) according to the size of your tables.
Import command: mysql -u user -p database_name < path
If the import goes smoothly, your database encoding has been converted to utf-8.
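To get a rough idea of how large max_allowed_packet has to be, you can measure the longest line in the dump file. This is only an estimate under an assumption: that the dump has one statement per line, which is how the dumps I have seen are laid out. The function name longest_line_bytes is hypothetical.

```python
def longest_line_bytes(path: str) -> int:
    """Return the size in bytes of the longest line in a dump file."""
    longest = 0
    # Binary mode, because the limit is counted in bytes, not characters.
    with open(path, "rb") as dump:
        for line in dump:
            longest = max(longest, len(line))
    return longest


ONE_MEGABYTE = 1024 * 1024  # the 1M default mentioned in step 6
```

If longest_line_bytes("posts.utf8.sql") (a placeholder file name) comes out above ONE_MEGABYTE, the import will probably fail until max_allowed_packet is raised above that value.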
I'm still a beginner, so please correct any mistakes and excuse my clumsiness. The above is for reference only.