What are the differences between the encoding formats in MySQL

(*-*)浩

May 08, 2019 am 10:39 AM


The main encodings used in MySQL differ as follows: ASCII stores each character's number in the coded character set directly as a numeric value; Latin1 is a single-byte extension of ASCII; UTF-8 is a variable-length character encoding for Unicode.


This article introduces some of the character encodings available in MySQL. It does not cover every character set.


1. Introduction to character set

A character is the general term for all kinds of letters and symbols, including the scripts of various countries, punctuation marks, graphic symbols, digits, and so on.

A character set is a collection of characters. There are many character sets, and each contains a different number of characters. Common ones include ASCII, GB2312, BIG5, GB18030, and Unicode. For a computer to process text in these character sets accurately, the characters must be encoded so that the computer can recognize and store them.

Character encoding maps each character in a character set to a specific representation (such as a bit pattern) so that text can be stored in a computer and transmitted over a communication network. A common example is encoding the Latin alphabet in ASCII, which numbers letters, digits, and other symbols and represents each number as a 7-bit binary value.

Collation (character order) refers to the comparison rules between characters in the same character set. Only once a collation is fixed can we define which characters in a character set are equivalent and how characters are ordered. A character set can have multiple collations. MySQL collation names start with the name of the character set they belong to, have a language or country name (or general) in the middle, and end with ci, cs, or bin. A collation ending in ci is case-insensitive, one ending in cs is case-sensitive, and one ending in bin compares characters by their binary code values.
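The naming rule above can be sketched in Python. This is only an illustration: `parse_collation` and `equal_under` are hypothetical helpers, not part of MySQL or any MySQL client library, and they assume the simple name form described here (character set, optional middle part, then ci, cs, or bin). The comparison function is a rough analogy for how the suffix affects equality, not MySQL's actual comparison logic.

```python
# Hypothetical helpers for illustration; not a MySQL API.
SUFFIX_MEANING = {
    "ci": "case-insensitive",
    "cs": "case-sensitive",
    "bin": "binary",
}


def parse_collation(name):
    """Split a simple collation name into (charset, middle, comparison kind)."""
    parts = name.split("_")
    suffix = parts[-1]
    if len(parts) < 2 or suffix not in SUFFIX_MEANING:
        raise ValueError(f"unrecognized collation name: {name!r}")
    charset = parts[0]
    # The middle part (language, country, or "general") may be absent,
    # as in "utf8_bin".
    middle = "_".join(parts[1:-1]) or None
    return charset, middle, SUFFIX_MEANING[suffix]


def equal_under(a, b, suffix):
    """Rough analogy: are two strings equal under a ci / cs / bin collation?"""
    if suffix == "ci":
        return a.casefold() == b.casefold()
    # cs and bin both distinguish case; they differ in ordering rules,
    # which this sketch does not model.
    return a == b


print(parse_collation("latin1_swedish_ci"))
print(equal_under("abc", "ABC", "ci"), equal_under("abc", "ABC", "bin"))
```

Under these assumptions, `abc` and `ABC` compare equal with a ci suffix and unequal with a bin suffix.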

2. ASCII encoding

ASCII is both a coded character set and a character encoding: it stores each character's number in the coded character set directly as a numeric value in the computer.
For example, in ASCII the character A is number 65 in the table, so A is encoded as 0100 0001, which is the binary form of decimal 65.
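The example of A can be checked with a few lines of Python. The helper name `ascii_bits` is made up for this sketch. It uses only the built-in `ord` and `format` functions.

```python
def ascii_bits(ch):
    """Return (code number, 8-bit binary string) for a single ASCII character."""
    code = ord(ch)
    if code > 127:
        raise ValueError(f"{ch!r} is not an ASCII character")
    return code, format(code, "08b")


print(ascii_bits("A"))  # (65, '01000001')
```

The stored byte is the code number itself, so `"A".encode("ascii")` yields a single byte with value 65.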

3. Latin1 character set

The Latin1 character set extends the ASCII character set. It still uses one byte per character, but it also uses the high bit, which extends the range of characters the set can represent.
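A small Python sketch shows the effect of the high bit. It relies on Python's `latin-1` codec (ISO-8859-1) as a stand-in; MySQL's latin1 may differ from it in a few code positions, so treat this as an approximation. The helper names `latin1_byte` and `uses_high_bit` are made up for the example.

```python
def latin1_byte(ch):
    """Return the single byte value of a character in Python's latin-1 codec."""
    encoded = ch.encode("latin-1")  # raises UnicodeEncodeError outside Latin-1
    return encoded[0]


def uses_high_bit(ch):
    """True if the character needs the eighth (high) bit, i.e. is not ASCII."""
    return latin1_byte(ch) >= 0x80


for ch in ("A", "\u00e9"):  # "\u00e9" is e with an acute accent
    print(repr(ch), format(latin1_byte(ch), "08b"), uses_high_bit(ch))
```

ASCII characters keep the same byte value they have in ASCII, while accented letters such as U+00E9 land in the upper half of the byte range.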

4. UTF-8 encoding

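UTF-8 (8-bit Unicode Transformation Format) is a variable-length character encoding for Unicode (Unicode is also known as the universal code). It was created by Ken Thompson and Rob Pike in 1992 and is now standardized as RFC 3629. The original design encoded a character in 1 to 6 bytes; RFC 3629 restricts UTF-8 to 1 to 4 bytes, which covers every Unicode code point up to U+10FFFF.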
UTF-8 is a variable-length, byte-oriented encoding. If a character is encoded in a single byte, that byte's highest bit is 0. If it is encoded in multiple bytes, the number of consecutive 1 bits at the start of the first byte gives the total number of bytes, and every remaining byte starts with 10. The original design allows up to 6 bytes, although RFC 3629 treats only the 1- to 4-byte forms as valid. As shown in the table:
Bytes   Bit pattern
1       0xxxxxxx
2       110xxxxx 10xxxxxx
3       1110xxxx 10xxxxxx 10xxxxxx
4       11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
5       111110xx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx
6       1111110x 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx
Therefore, the 6-byte form offers up to 31 bits for the character code, namely the bits marked x in the table above; the 4-byte form, the longest allowed by RFC 3629, offers 21. Apart from the control bits (the leading 10 of each continuation byte, and so on), the x bits correspond one-to-one to the bits of the Unicode code point, in the same order.
When converting a Unicode code point to UTF-8, first drop the leading zero bits, then choose the shortest UTF-8 form that can hold the remaining bits. As a result, characters in the basic ASCII character set (Unicode is compatible with ASCII) need only one byte in UTF-8 (7 significant bits).
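The conversion rule can be sketched as a small hand-written encoder in Python. The function name `utf8_encode_code_point` is made up for this example, and the sketch covers only the 1- to 4-byte forms that RFC 3629 allows, so it does not produce the 5- and 6-byte patterns. Python's built-in `str.encode("utf-8")` is used to cross-check the result.

```python
def utf8_encode_code_point(cp):
    """Encode one Unicode code point using the shortest valid UTF-8 form."""
    if cp < 0 or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError(f"not an encodable code point: {cp:#x}")
    if cp < 0x80:        # up to 7 bits  -> 0xxxxxxx
        return bytes([cp])
    if cp < 0x800:       # up to 11 bits -> 110xxxxx 10xxxxxx
        return bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
    if cp < 0x10000:     # up to 16 bits -> 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([
            0xE0 | (cp >> 12),
            0x80 | ((cp >> 6) & 0x3F),
            0x80 | (cp & 0x3F),
        ])
    # up to 21 bits -> 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([
        0xF0 | (cp >> 18),
        0x80 | ((cp >> 12) & 0x3F),
        0x80 | ((cp >> 6) & 0x3F),
        0x80 | (cp & 0x3F),
    ])


for ch in ("A", "\u00e9", "\u4e2d", "\U0001F600"):
    encoded = utf8_encode_code_point(ord(ch))
    bits = " ".join(format(b, "08b") for b in encoded)
    print(f"U+{ord(ch):04X}", len(encoded), "byte(s):", bits)
    assert encoded == ch.encode("utf-8")  # cross-check against the built-in codec
```

Each branch picks the shortest pattern whose x bits can hold the code point, which is why ASCII characters always come out as a single byte.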
