Home >Database >Mysql Tutorial >Understanding of MySQL data types
Understanding of MySql data types:
Due to the unique characteristics and implementation details of Mysql, The impact on performance is obvious, because it is critical to design the Mysql database well. For database design, we have to mention the type selection of table fields. Since Mysql supports many data types, how to choose the correct data type is crucial to obtain high performance. No matter what type of data we want to store, we need to consider it according to some database design principles.
Thinking about choosing data types
Smaller is usually better (In general, the smallest data type that correctly stores the data should be used whenever possible.)
why?
(1) Because smaller data types are generally faster because they occupy less disk, memory, and CPU cache, and require fewer CPU cycles to process.
(2) Be sure not to underestimate the range of values that need to be stored. The smaller value is relative to the maximum value range of the data type.
(3) If you can't decide which data type is the best, choose the smallest type that you think will not exceed the range.
Simple is good (Operations on simple data types usually require shorter CPU cycles.)
Why? Here are a few examples to illustrate why.
(1) Integer operations are cheaper than string operations because character comparisons of string sets and collation rules (sorting rules) are more complex than integer comparisons.
(2) The built-in types of Mysql (date, time, datatime) should be used to store dates and times.
(3) The IP address should be stored in integer type (int).
Try to avoid NULL (null value)
Why?
(1) Many tables contain columns that can be NULL, even if the program does not need to save NULL, because the default attribute of the column is that it can be NULL. It's usually best to specify NOT NULL for a column unless you really need to store NULL.
(2) If the query contains columns that can be NULL, it is difficult for Mysql to optimize because NULL columns make indexes, index statistics, and value comparisons more complex. Columns that can be NULL will use more storage space and require special handling in Mysql. When NULLable columns are indexed, each index record requires an extra byte, which may even cause a fixed-size index to become a variable-size index in MyISAM.
(3) Usually changing the NULL-capable column to NOTNULL will bring little performance improvement. If you plan to build an index on the column, you should try to avoid designing it as a NULL-capable column. (There is an exception, that is, in InnoDB, a separate bit is used to store NULL values, so it has good space efficiency for sparse data.)
Summary
When choosing a data type for a column, the first step is to determine the appropriate large type (number, string, time, etc.). This is usually very simple, then the next step is to choose the specific type.
Many Mysql data types can store the same type of data, but the storage length and range are different, the allowed precision is different, or the physical space required (disk and memory space) is different. Different subtypes of data of the same large type sometimes have special behaviors and properties. For example: DATATIME and TIMESAMP columns can store the same type of data (time and date) and are accurate to seconds. However, TIMESTAMP only uses half of the storage space of DATATIME and has special automatic update capabilities based on time zone changes. In addition, the time range allowed by TIMESTAMP is much smaller, and sometimes its special abilities can become obstacles, which we developers need to consider.
Integer type
There are two types of numbers: whole number and real number.
If you store integers, you can use these integer types: TINNYINT (8), SMALLINT (16), MEDIUMINT (24), INT (32), BIGINT (64).
The integer type has an optional UNSIGNED attribute, which means that negative values are not allowed. This can roughly double the upper limit of positive numbers.
For example: TINYINT UNSIGNED can store the range 0~255, while the storage range of TINYINT is -127~128.
Signed and unsigned types use the same storage space and have Same function.
So you can choose the appropriate type according to the actual situation.
Your choice determines how Mysql saves data in memory and disk.
Integer generally chooses 64-bit BIGINT integer, even in 32-bit environment. (But some aggregate functions are exceptions, they are calculated using DECIMAL or DOUBLE)
Mysql can specify the width for integer types.
For example: INT (11), this is meaningless for most applications: it does not limit the legal range of values, but only stipulates that some interactive tools of Mysql (such as the Mysql command line client) are used The number of characters displayed. For storage and calculation purposes, INT(1) and INT(20) are the same.
Some third-party storage engines (such as Infobright) sometimes have customized storage formats and compression schemes, which do not necessarily use the common Mysql built-in engine.
Real number type
A real number is a number with a decimal part.
They not only store the decimal part in the future, but can also use DECIMAL to store integers larger than BIGINT. Mysql supports both precise and imprecise types. The DECIMAL type is used to store exact decimals.
Precision operations are supported in Mysql5.0 or later versions, but exceptions will occur when using floating point operations in Mysql4.1 and earlier versions (mainly caused by loss of precision). Both FLOAT and DECIMAL types can be used. Specify the progress.
For DECIMAL columns, you can specify the maximum number of digits allowed before and after the decimal point, which will affect the space consumption of the column. There are many ways to specify the precision required for FLOAT (floating point) columns, which will cause Mysql to quietly select a different data type, or to round off the value when storing, but these precisions are often non-standard, so it is generally recommended to only The specified data type does not specify precision.
Due to the additional space and computational overhead required, you should try to use DECIMAL only when performing precise calculations on decimals. For example, when storing financial data, if the amount of data is relatively large, you can consider using BIGINT instead of DECIMAL, and multiply the currency unit to be stored by the corresponding multiple according to the number of decimal places. The FLOAT and DOUBLE types support approximate calculations using standard floating point arithmetic.
String type
Mysql supports multiple string types, and there are many variants of each type. Among them, VARCHAR and CHAR are the two main string types.
Note: The way the Mysql storage engine stores CHAR or VARCHAR values may be different in memory and on disk, so the values read by the Mysql server from the storage engine may need to be converted to another storage Format.
The VARCHAR type is used to store variable-length strings and is the most common string data type.
VARCHAR is more space-efficient than fixed-length types because it only uses the necessary space (shorter strings use less space).
VARCHAR requires 1 or 2 extra bytes to record the length of the string.
VARCHAR saves storage space, so it is helpful for performance.
The following are some scenarios where VARCHAR is suitable for use:
(1) The maximum length of the string column is much larger than the average length.
(2) Columns are rarely updated, so fragmentation is not an issue.
(3) uses a complex character set like UTF-8, and each character uses a different number of bytes for storage.
CHAR type is fixed length. (Mysql always allocates enough space according to the defined string length)
CHAR is suitable for storing very short strings, or all values are close to the same length.
Similar types to VARCHAR and CHAR are BINARY and VARBINARY, which store binary strings.
Note: The space overhead of storing "hello" using VARCAHR(5) and VARCHAR(200) is the same, so what are the advantages of using a shorter column? (It turns out to be a big advantage)
Longer columns consume more memory, because Mysql usually allocates fixed-size memory blocks to hold internal values. This is especially bad when using in-memory temporary tables for sorting or operations. It's equally bad when sorting using disk temporary tables.
Note: Ultimately, the best strategy is to allocate only the space you really need.
BLOB and TEXT types
BLOB and TEXT are both string data types designed for storing large data, and are stored in binary and character modes respectively.
In fact, they belong to two different groups of data type families: string types include TINYTEXT, SMALLTEXT, TEXT, MEDIUMTEXT, LONGTEXT;
binary types include TINYBLOB, SMALLBLOB, BLOB, MEDIUMBLOB, LONGBLOB;
ENUM type
You can use an enumeration (ENUM) instead of the string type. Many times it is recommended to use enumeration columns instead of commonly used string types.
(1) The enumeration column can store some unique strings into a predefined collection.
(2) Mysql is very compact when storing enumerations and will be compressed into one or two bytes depending on the number of list values.
(3) Mysql will internally save the position of each value in the list as an integer, and save the "lookup table" of the "number-string" mapping relationship in the .frm file of the table.
Note: One surprising thing is that the enumeration fields are sorted by internally stored integers rather than by defined strings.
Note: The worst thing about enumerations is that the string list is fixed. Adding or deleting strings must use ALTER TABLE, so for a series of characters that may change in the future Strings, using enumerations is not a good idea unless you accept that you can only add elements to the end of the list.
Note: There is some overhead for enumeration columns because MySQL saves each enumeration value as an integer and must do a lookup to convert it to a string.
Date and Time Type
Mysql has many types that can save date and time values, such as YEAR and DATE.
The minimum time granularity that Mysql can store is seconds (MariaDB supports microsecond-level event types). But MySQL can also perform ad hoc operations with microsecond level granularity.
Most of the time there are no alternatives to the type, so there is no question of what is the best choice.
The only question next is what needs to be done when saving the date and time.
DATETIME
(1) This type can save a wide range of values, from 1001 to 9999, with a precision of seconds.
(2) DATETIME encapsulates time and date into integers in the format YYYYMMDDHHMMSS, regardless of time zone.
(3)DATETIME uses 8 bytes of storage space.
TIMESTAMP
(1) The TIMESTAMP type stores the number of seconds since midnight on January 1, 1970, which is the same as the UNIX timestamp.
(2) TIMESTAMP only uses 4 bytes of storage space, so its range is much smaller than DATETIME.
(3) The value displayed by TIMESTAMP depends on the time zone.
Comparison between DATETIME and TIMESTAMP:
(1) By default, if the value of the first TIMESTAMP column is not specified when inserting, Mysql sets the value of this column to the current time. (This is a feature that DATETIME does not have)
(2) When inserting a row of records, Mysql will also update the value of the first TIMESTAMP column by default.
(3) The TIMESTAMP column defaults to NOT NULL, which is different from other data types.
Summary
(1) In addition to special behavior, TIMESTAMP should generally be used whenever possible because it is more space efficient than DATETIME.
(2) Generally speaking, it is not recommended to save UNIX timestamps as integer values. This will not bring any benefits. Saving timestamps in integer format is usually inconvenient to process.
(3) If you need to store date and time values with a smaller granularity than seconds, you can use the BIGINT type to store microsecond-level timestamps, or use DOUBLE to store the decimal part after seconds. You can also use MariaDB instead of Mysql.
Bit data type
BIT defines a field containing a single bit, BIT(2) stores 2 bits, and the maximum length is 64 bits.
Note: It is generally recommended to use the BIT type with caution. For most applications, it is best to avoid using this type.
Selecting the identifier
It is very important to choose the appropriate data type for the identifier (identity column).
Generally speaking, it is more likely to use the identity column to compare with other values, or to find other columns through the identity column.
When choosing the type of identity column, you not only need to consider the storage type, but also how Mysql performs calculations and comparisons on this type.
Once you select a type, make sure to use the same type in all related tables.
On the premise that the range requirements of the value can be met and room for future growth is reserved, the smallest data type should be selected.
Note: Integers are usually the best choice for identity columns because they are fast and can use AUTO_INCREMENT. Note: ENUM and SET are the worst choices; avoid using strings as identity columns if possible, as they are space-consuming and generally slower than numeric types.
Summary of the full text
For database design, you must think twice before making a decision. Choosing the most suitable data column type and determining the size of the data column are very important. A crucial step. In fact, there is no need to panic. Regardless of the data table design for any type of requirements, you only need to remember one principle, a very important, very important, very important principle: use the smallest data type that can correctly store data as much as possible.
The above is all about the understanding of MySQL data types.
Related referencesPHP Chinese website
The above is the detailed content of Understanding of MySQL data types. For more information, please follow other related articles on the PHP Chinese website!