Table creation protocol


1. [Mandatory] Fields that express the concept of yes or no must be named in the manner of is_xxx, and the data type is unsigned tinyint (1 means yes, 0 means no). This rule also applies to odps construction surface.

Note: If any field is a non-negative number, it must be unsigned.


2. [Mandatory] Table names and field names must use lowercase letters or numbers; it is forbidden to start with numbers, and it is forbidden to only numbers between two underscores. Modification of database field names is very costly, because pre-release is not possible, so field names need to be carefully considered.

Positive example: getter _ admin , task _ config , level 3_ name

Counter example: GetterAdmin , taskConfig , level _3_ name


#3. [Mandatory] Do not use plural nouns in table names.

Note: The table name should only represent the entity content in the table, and should not represent the number of entities. The corresponding DO class name is also in the singular form, which is consistent with expression habits.


#4. [Mandatory] Disable reserved words, such as desc, range, match, delayed, etc. Please refer to MySQL official reserved words.


5. [Mandatory] The unique index name is uk _field name; the common index name is idx _field name.

Explanation: uk _ is the unique key; idx _ is the abbreviation of index.


#6. [Mandatory] The decimal type is decimal, and float and double are prohibited.

Note: When float and double are stored, there is a problem of precision loss. It is very likely that incorrect results will be obtained when comparing values. If the stored data range exceeds the decimal range, it is recommended to split the data into integers and decimals and store them separately.


#7. [Mandatory] If the lengths of the stored strings are almost equal, use the char fixed-length string type.


8. [Mandatory] varchar is a variable-length string. No storage space is allocated in advance. The length should not exceed 5000. If the storage length

is greater than this value, Define the field type as text, create a separate table, and use the primary key to correspond to avoid affecting the indexing efficiency of other fields.

9. [Mandatory] The table must have three fields: id, gmt_create, gmt_modified.

Note:

id must be the primary key, type is unsigned bigint, auto-increment for single table, step size is 1. The types of gmt _ create and

gmt _ modified are all date _ time types.

#10. [Recommendation] It is best to name the table with "business name_function of the table".

Positive example: tiger _ task / tiger _ reader / mpp _ config


11. [Recommended] The library name and application name should be as consistent as possible .


#12. [Recommendation] If you modify the meaning of a field or add a status to a field, you need to update the field comments in a timely manner.


#13. [Recommendation] Allow appropriate redundancy in fields to improve performance, but data synchronization must be considered. Redundant fields should follow:

1) Fields that are not frequently modified.

2) It is not a varchar super long field, let alone a text field.

Positive example: The product category name is used frequently, the field length is short, and the name is basically unchanged. The category name can be stored redundantly in the associated table to avoid Related queries.


14. [Recommendation] Database and table sharding is only recommended when the number of rows in a single table exceeds 5 million or the capacity of a single table exceeds 2 GB.

Note: If the data volume is not expected to reach this level in three years, please do not divide the database into tables when creating the table.


15. [Reference] Appropriate character storage length not only saves database table space and index storage, but more importantly, improves retrieval speed

.

Positive example: The age of a person is unsigned tinyint (representing the range 0-255, and a person’s life span will not exceed 255 years); Turtle must be smallint, but if The age of the sun must be an int; if the ages of all stars are added up, a bigint must be used.