Home  >  Article  >  Database  >  Detailed example of how MyBatis implements Mysql database sub-database and table sub-explanation

Detailed example of how MyBatis implements Mysql database sub-database and table sub-explanation

黄舟
黄舟Original
2017-08-23 13:52:012559browse

This article mainly introduces the operation and summary of MyBatis's implementation of Mysql database sub-database and table sub-tables. Friends in need can refer to the following

Foreword

As a database, as a table in the database, as the number of users increases and time goes by, one day, the amount of data will be so large that it is difficult to handle. At this time, the data in just one table exceeds tens of millions. Whether it is querying or modifying, its operation will be very time-consuming. At this time, database segmentation operation is required.

The simplest steps for MyBatis to implement sub-tables

Since the title of the article is written like this, it is more practical to go directly to the dry information. Let’s Let’s first take a look at how to implement the simplest sub-table.

1. We simulated the data volume of the user table to exceed tens of millions (although it is actually unlikely)

2. The original name of the user table was user_tab, and we divided it into user_tab_0 and user_tab_1 (actually it may not be such a random name), so that the original tens of millions of data can be separated into two tables with an amount of data of two million.

3. How to operate these two tables? We use userId, which is the user's unique identifier, to distinguish.

4, userId%2 == 0 user operation table user_tab_0, similarly userId%2 == 1 user operation table user_tab_1

5. So how to implement the sql statement in MyBatis? The following is an example of a SQL statement to query a user


<select id="getUser" parameterType="java.util.Map" resultType="UserDO"> 
    SELECT userId, name 
    FROM user_tab_#{tabIndex} 
    WHERE userId = #{userId} 
</select>

We passed in two parameters tabIndex and userId, tabIndex is the indicator value of the table to be operated on (0 or 1) , so if you need to query the user whose userId is 5, the final sql statement will be:


SELECT userId, name 
FROM user_tab_1 
WHERE userId = 5

I will not show more of the other redundant DAO services and implementations here. Yes, I believe you will do it if you are smart.

The above is the simplest implementation. It does not require extra frameworks or any plug-ins to meet the requirements of sub-tables.

The above is basically all the implementation content. Now we will start to talk about the details of separation in detail. Those who are watching the excitement can basically leave.

I will talk about it from the following perspectives. I put it in the simplest vernacular possible.

Separation methods

There are two main ways of segmentation, horizontal segmentation and vertical segmentation.

1. Horizontal segmentation

To put it simply, split a table into several identical tables, and then the table names are different. Just like the simplest example above.

This kind of segmentation is suitable for situations where the amount of data in a table is too large and the operation time is slowed down, such as some saved record tables.

2. Vertical Segmentation

Divide different business modules into different databases. These business modules are directly preferably 0-coupled (simply put, they have no relationship).

This is mainly suitable for situations where the amount of data is generally large, and the business scenarios are scattered and there is no logical relationship between them.

Separation strategy

There are many specific strategies. You can also design your own. The common strategies are as follows: It’s just a list without going into detail.

1. "%" modulo is implemented in the above example and is also the simplest one.

2, MD5 hash

3, shift

4, date and time (divided into tables according to different dates, such as one table per month, the operation will be performed this month This table will be replaced next month)

5. Enumeration range (users 1-10000 operate the first table, users 10001-20000 operate the second table)

The problem of separation

Let’s talk about the final point and the problems it causes.

The database is definitely not divided as you say. (People are more emotional, how can they just break up?)

Seriously, I have listed the following problems that separation will only cause.

1. The problem of uniqueness of the primary key when adding; after separating multiple tables, the original self-increasing primary key will not be unique, so there is no way to self-increase, causing problems, and there are solutions. , such as maintaining a separate primary key table specifically to store the current primary key, or using other middleware, etc.

2. Although the efficiency issue when adding new data is not a big problem, adding new data will definitely increase the amount of calculation. This problem can be ignored.

3. The paging problem caused by the query will be very difficult after it is separated into multiple tables. This also takes into account that different separations require different solutions. In short, problems will arise.

4. In the same way, related queries, originally it was very simple to associate one table with another table or another table with one table, but now it is difficult after separation.

5. Transaction issues. Multiple tables need to use distributed transactions to complete the original operations with transactions. Because the original transaction only locked one table, now it may have to lock multiple tables.

6. Scalability issues. Some sharding strategies do not have good data scalability. If more data comes later, does it mean that you can create new tables to expand?

Principles of separation

The following summarizes several principles of separation, which are mainly based on references on the Internet without any actual basis (I am not A DBA with an annual salary of one million cannot come across such large data to actually test it), so if you have any questions, please point them out.

1. If you can’t separate, don’t separate.

2. If you can separate less, don’t separate too much.

3. Too redundant and irrelevant

4 , Avoid using distributed transactions, mainly because it is too difficult and I don’t know how to do it

5. If there are less than 10 million records in a single table, it will not be divided

6. If you don’t divide it now, it will be too late

7. Expand, couple, and carefully consider

Ways to achieve separation

Finally, let’s talk about the way of separation, which is now popular The best DAO framework is MyBatis, but there are many other frameworks. The separation is mainly implemented in the following ways.

1. Native implementation, just like the above example, does not require anything else. Use the native framework to control the implementation yourself.

The advantages are: easy to control and take the initiative.

The disadvantages are: there is a lot of code, you need to know it clearly, it is inconvenient to modify, and it does not support complex segmentation. For example, you need to do some paging queries after segmentation, as well as the primary key issues mentioned above.

2. Plug-in implementation, use some plug-ins developed by the framework itself to implement these plug-ins, and then use the plug-ins to access the database to directly achieve separation.

The advantages are: small amount of code, simple implementation, and good scalability.

Disadvantages are: difficult to control, limited separation methods, and difficult to solve problems. No particularly mature plug-ins found.

3. Middleware implementation. Use some database access middleware to perform some operations before accessing the database to make corresponding changes in SQL to achieve separation.

The advantages are: small coupling, good scalability, and can solve the problem of distributed transactions.

Definitely: the implementation is more complicated, requires learning the middleware, and the cost is high. Maintenance is also a big issue, in case it fails. .

In short, each method has its own merits, but considering the cost, the first method is almost 0 cost, you can get started, and it is easier to control, just like the example given above, and the data I am currently processing is still We haven’t reached the point where we have to separate everywhere, so I choose the first option. Also recommended. If you find a plug-in or middleware that is easier to use, you can recommend it in the comments.

Summary

In the actual project, I had to separate the user’s account records because there were too many, and because the account records were more Most of them are just new additions without modification or deletion, and there are only a few queries, so we used the simplest way to separate them and chose the simplest strategy. I hope the above summary of principles, strategies, methods and issues can be helpful and reference for you. If you have any questions, please leave me a message and I will reply to you in time. I would also like to thank everyone for your support of the Script House website!

The above is the detailed content of Detailed example of how MyBatis implements Mysql database sub-database and table sub-explanation. For more information, please follow other related articles on the PHP Chinese website!

Statement:
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn