Home  >  Article  >  Database  >  Big Data Learning Part 2 MYSQL Advanced

Big Data Learning Part 2 MYSQL Advanced

coldplay.xixi
coldplay.xixiforward
2021-02-01 09:33:501525browse

Big Data Learning Part 2 MYSQL Advanced

Article Directory

  • 5 MySQL Architecture
    • 5.1 MyISAM
    • 5.2 Innodb
      • 5.2.1 How to choose between system table space and independent table space
      • 5.2.2 Features of Innodb storage engine
      • 5.2.3 Innodb status check
      • 5.2.4 Applicable scenarios
      • 5.2.4 (Extended) What is a lock
        • 5.2.3.1 What is a lock ?
        • 5.2.3.2 Type of lock:
        • 5.2.3.3 Compatibility relationship between write lock and read lock (compatibility for one row)
        • 5.2.3.4 Lock granularity :
        • 5.2.3.5 Blocking and deadlock
    • ##5.3 CSV storage engine
    • ##5.3. 1 Characteristics of CSV storage engine
      • 5.3.2 Applicable scenarios of CSV storage engine
      • ##5.4 Archive storage engine
    • ##5.4 .1 File system storage features
    • 5.4.2 Archive storage engine features
      • 5.4.3 Archive storage engine usage scenarios
      • ##5.5 Memory storage engine
    • 5.5.1 File system storage characteristics
    • 5.5.2 Functional characteristics of Memory
      • 5.5.3 Easily confused concepts in Memory
      • 5.5.4 Memory usage scenarios
      • 5.6 Federated storage engine
    • 5.6.1 Federated features
    • 5.6.2 How to use Federated
      • 5.6.3 Applicable scenarios of Federated
      • 6 How to choose the correct storage engine
  • Continued from the previous article
5 MySQL architecture

Next we select some of the more commonly used storage A brief explanation of the engine. The storage engine used by MySQL will have a direct impact on the performance of the database. I also hope that you can carefully understand some of the characteristics of the storage engine before using the storage engine.

5.1 MyISAMBig Data Learning Part 2 MYSQL Advanced

MyISAM was the default storage engine before MySQL5.5. For this reason, there are still many servers using the MyISAM storage engine. At the same time, MyISAM is currently the storage engine used by many system tables and temporary tables. The temporary table mentioned here is not the table we create through

create table, but refers to the amount of data used during operations such as sorting and grouping. After exceeding a certain size, there are temporary tables created by the query optimizer. The MyISAM storage engine is composed of MYD and MYI. MYD is the extension of the data file and MYI is the extension of the index file. This storage engine stores tables in data files and index files with these two extensions. middle.

Features:
Concurrency and lock level

MyISAM uses table-level locks, not row-level locks, which means that the data in the table is When modifying, the entire table needs to be locked, and when reading the table, shared locks are also added to all tables. From here we can see that the table reading and writing operations using MyISAM as the engine are Mutually exclusive. It can be seen that MyISAM is not very good at concurrent reading and writing operations. If it is only for read-only operations, the performance is not bad in terms of concurrency because the shared lock does not block the shared lock.

Table damage repair
    MyISAM supports checking and repairing MyISAM tables that are damaged due to any unexpected shutdown, but the repair mentioned here is not data recovery, because MyISAM is not transactional storage engine, so it cannot perform related logs required for transaction recovery, so be aware that the recovery of MyISAM tables may cause data loss.
  • We can check the table through
    check table tablename
  • and restore the table through
  • repair table tablename
    .
    Index types supported by MyISAM tables MyISAM supports full-text indexing, and was the only official storage engine that natively supported full-text indexing before mysql5.7. MyISAM table supports data compression
  • If MyISAM represents a large read-only table, that is, after the table is created and the data is imported, no modifications will be made to the table, then we will Tables can be compressed to reduce disk I/O. We can use the
  • myisampack
    command to compress the table. Compression compresses the table independently, so when reading a row of data, it is not necessary to decompress the entire table.

  • Restrictions:
  • Version

    If you store a large table, you need to modify MAX_Rows and AVG_ROW_LENGTH

      Version> The default support for mysql5.0 is 256TB
    • Applicable scenarios:
    • Non-transactional applications

    Read-only applications (Reports and the like)

      Spatial Application
    • 5.2 Innodb
    • Innodb is the default storage engine for MySQL 5.5 and later versions. Innodb is the storage engine for transaction storage, which means it supports transaction processing.
      Innodb has its own concept of table space, and the data is stored in the table space, which is determined by the innodb_file_per_table parameter. If this parameter is ON, A system file with the extension ibd will be created for each Innodb table. If this parameter is OFF, the data will be stored in the system's shared table space, that is ibdataX, X represents a number, starting from 1 by default.
      The command to view this parameter is:
      show variables like 'innodb_file_per_table';
      The command to modify this parameter is:
      set global innodb_file_per_table=off;

      5.2.1 How to choose between system table space and independent table space

      Comparison:

      System table space Independent table space
      Cannot simply shrink the file size You can passoptimize tableCommand to shrink system files
      will cause IO bottleneck Can refresh data to multiple files at the same time

      Recommendation:

      • Use independent table space for Innodb

      Transfer the table that originally existed in the system table space to Methods in independent tablespaces.
      Steps:

  1. Use mysqldump to export all database table data
  2. Stop the MySQL service, modify parameters, and delete Innodb related files
  3. Restart the MySQL service and rebuild the Innodb system table space
  4. Reimport the data

##5.2.2 Features of the Innodb storage engine

    Innodb is a transactional storage engine
  • Fully supports the ACID characteristics of transactions (the atomicity, consistency, etc. introduced before)
  • Redo Log and Undo Log
  • Redo Log implementation The durability of the transaction is composed of two parts, one of which is the work log persistent buffer in memory, whose size is determined by innodb_log_buffer_size, and the other is the reconstructed log file, which is the ib_logflie we see in the file system related documents. Undo Log realizes the atomicity of transactions and performs rollback operations when transactions fail. Redo Log is read and written sequentially, and Undo Log is read and written randomly. If possible, the data can be stored in a solid-state drive to improve performance.
  • Innodb supports row-level locks
  • Row-level locks and table-level locks are different. The characteristic of row-level locks is that they can support concurrency to the greatest extent. Row-level locks are implemented by the storage engine layer.

5.2.3 Innodb status check

You can use the following command to check the Innodb status:


show engine innodb status

5.2.4 Applicable scenarios

Innodb is suitable for most OLTP applications, because after mysql5.7 version, Innodb already supports full-text index and spatial functions.

5.2.4 (Extended) What is a lock

5.2.3.1 What is a lock?

    The main function of locks is to manage concurrent access to shared resources
  • Locks are used to achieve transaction isolation
##5.2 .3.2 Types of locks:

Shared lock (also called read lock)
  • Exclusive lock (also called write lock)
5.2.3.3 Compatibility relationship between write lock and read lock (compatibility for one row)

Write lockRead lock

In actual situations, the results may be different from the results in the above table, mainly because the lock mechanism in Innodb is a very complicated thing, and there are many locks that affect the final result.

5.2.3.4 Lock granularity:

  • Table-level lock
  • Row-level lock

5.2.3.5 Blocking and deadlock

Blocking: Blocking is due to the compatibility relationship between different locks. At some point, the lock in one transaction needs to wait for the lock of another transaction to be released. The resources it occupies form a blockage.
Deadlock: Deadlock refers to an exception that occurs when two or more transactions occupy the resources that the other party is waiting for during the execution of the transaction. As can be seen from the definition, multiple blocked transactions occupy the resources that the blocked transaction is waiting for, and deadlock is when multiple blocked transactions occupy the resources that each other is waiting for.

5.3 CSV storage engine

The CSV storage engine can process csv files as mysql table files. The storage format of this storage engine is an ordinary csv file. The data storage method of the csv storage engine is very unique. If we store the table in MyISAM or Innodb, we cannot directly view the data files because these two files are stored in binary format. The CSV storage engine is different. CSV data is stored in files in the form of text. That is, we can view it through the command to view the file, such as more, or use the vi command to view and edit the table in the CSV storage engine. As long as the format and requirements of the CSV file are met, we don't have to worry about corrupting the data.
When we create the CSV storage engine table in mysql, we should be able to see files in 3 file systems. These three file names are all based on the table name, but have csv, csm, and frm as suffixes respectively. The csv file is the data file in the CSV storage engine. The csm file stores the metadata of the table and the table status and data volume. The frm file stores table structure information.

5.3.1 Features of CSV storage engine

  • The biggest feature is that data is stored in CSV format
    Each column in CSV is , to separate, and the text content is enclosed in double quotes, as shown in the following figure:
    Big Data Learning Part 2 MYSQL Advanced
  • All columns must not be NULL
    When creating a table, all columns must be non-empty and cannot be stored as NULL values ​​
  • Does not support indexes
    Not suitable for large tables and online processing
  • Data files can be edited directly
    Save text file contents

5.3.2 Applicable scenarios of CSV storage engine

CSV storage engine is suitable for data exchange Intermediate table
Big Data Learning Part 2 MYSQL Advanced
Big Data Learning Part 2 MYSQL Advanced

##5.4 Archive storage engine

5.4.1 File system storage characteristics

Archive storage engine caches all writes and uses zlib to compress inserted rows. Therefore, Archive storage engine saves disk I/O compared to tables of MyISAM storage engine. For data of the same order of magnitude, Archive The storage engine saves storage space compared to MyISAM and Innodb. A several-terabyte Innodb table stored in the Archive storage engine may only require a few hundred megabytes of storage space.

The table data of the Archive storage engine is a file with the suffix ARZ. Like other engines, there is also a system file with the suffix frm used to store the structural information of the table.

5.4.2 Features of Archive storage engine

    Only supports
  • insert and select operations
  • Only indexes are allowed on auto-incrementing ID columns
##5.4.3 Archive storage engine usage scenarios

Scenario 1: Log and data collection class Data

Because Archive does not support modification and deletion, our ORDB will definitely modify the data, but it is still useful for some warehouse-type applications or some special tables, such as log tables or Data collection tables are more suitable for using the Archive storage engine because they need to collect a large amount of data. Because the Archive storage engine has the smallest storage space among all engines, it should be noted that even in data collection or logging applications, the Archive storage engine cannot update these data, so when recording logs Or if you modify the data in a data collection application, you may not be able to use the Archive storage engine.


5.5 Memory storage engine

5.5.1 File system storage characteristics

Memory storage engine is also called HEAP storage engine, so the data is stored in memory. This means that the data table is disposable. Once the MySQL service is restarted, all Memory storage engine data will disappear. However, the table structure will be retained, because creating a table under the Memory storage engine will only generate a frm system file, which is used to save the table structure. This is why data will be lost when restarting the MySQL server, but the table structure will not.
From its file storage characteristics, we can know that the I/O efficiency of the Memory storage engine will be much higher than that of MyISAM, because only the index of MyISAM is stored in the memory, and the data is cached by the operating system, while the Memory storage engine All data and indexes of the engine are stored in memory. Let's take a look at the functional features of the Memory storage engine.

5.5.2 Functional features of Memory

Functional features:

  • Supports HASH index (default) and BTree Index
    If it is a HASH index, it will be very fast when doing equivalent queries. If it is doing a range query, the HASH index cannot be used, so we need to pay attention when creating the table. If the table requires a large number of equivalent queries, use the HASH index, and use the BTree index for range queries. Different index types can have a big impact on performance.
  • All fields are of fixed length varchar(10) = char(10)
    This requires us to meet the minimum field length requirements when defining the table structure, otherwise a lot of memory is wasted.
  • Does not support large fields such as BLOG and TEXT
  • Memory storage engine uses table-level locks
  • The maximum size is determined by the max_heap_table_size parameter
    The default value of this parameter is only 16 megabytes , if we want to store a large amount of data in the Memory storage engine table, we need to modify this parameter, and this parameter modification will not take effect on the existing Memory storage engine table. If we need to take effect on the existing table, we need to modify it. Existing tables are rebuilt.

5.5.3 Confusing concepts in Memory

Memory storage engine table:
It can be used on all systems, it is not the same A temporary table.
Temporary table:
There are two types of temporary tables. One is the temporary table used by the system when the query optimizer optimizes the query, which is the internal temporary table. The system uses temporary tables when the limit is exceeded (using BLOB or TEXT large field), use the MyISAM temporary table, and use the Memory table if the limit is not exceeded.
The other is a temporary table created through the command create temporary table. The created table can use any storage engine.
No matter what kind of temporary table it is, it is only visible internally.

5.5.4 Memory usage scenarios

  • is used for search or mapping tables, such as postal code and region correspondence tables
  • is used for Save the intermediate table generated during data analysis
  • The result table used to cache periodic aggregation data

Memory data is easy to lose, so the data is required to be reproducible.

5.6 Federated Storage Engine

5.6.1 Features of Federated

  • Provides access to remote MySQL servers Table method
    Since the Federated storage engine only establishes a local connection to the remote server, it can be said that all the tables we want to access are still placed on the remote server, and no data is stored locally. Each time a Federated storage engine table is accessed, the query is sent to the remote server to run and the relevant data is obtained from the remote MySQL server.
  • No data is stored locally, all data is placed on the remote server
  • The table structure and the connection information of the remote server need to be saved locally
    Therefore, there will also be a frm file in the system, use Information about storing remote information and how to connect to remote tables.

5.6.2 How to use Federated

The Federated storage engine can realize the function of SQL Server connecting to the server, but due to its own performance is not very good, usually The same purpose can be achieved through replication, etc., so in the current MySQL version, the Federated storage engine is disabled by default. If you need to use the Federated storage engine, you need to add federated=1 to /usr/local/mysql/my.cnf, and then restart the MySQL server. We can use show engine to confirm whether the current MySQL server supports the Federated storage engine.
And use the following connection string in the create table statement,
mysql://user_name[:password]@host_name[:port_num]/db_name/tbl_name
Remote server binding connection:
grant select,update,insert,delete on remote.remote_fet to fred_link@'127.0.0.1' identified by '123456'
You can decide the query Information about the remote server and some information about the related database tables.

5.6.3 Applicable scenarios for Federated

  • Occasional statistical analysis and manual query
    Due to the slow performance of Federated, it is only suitable for occasional Statistical analysis and manual query.

6 How to choose the correct storage engine

Reference conditions:

  • Transaction
  • Backup
  • Crash recovery
  • Unique features of storage engines
    Try to avoid mixing storage engines.

More related free learning recommendations: mysql tutorial(Video)


Write lock
Read lock
Incompatible Incompatible
Incompatible Compatible

The above is the detailed content of Big Data Learning Part 2 MYSQL Advanced. For more information, please follow other related articles on the PHP Chinese website!

Statement:
This article is reproduced at:csdn.net. If there is any infringement, please contact admin@php.cn delete