Home >Database >Mysql Tutorial >Let's talk about database zipper tables
This article brings you relevant knowledge about mysql, which mainly introduces related issues about database zipper tables. Zipper tables are a data model, mainly for data warehouse design. The table is defined by the way it stores data. As the name suggests, the so-called zipper is to record history. Let’s take a look at it together. I hope it will be helpful to everyone.
Recommended learning: mysql video tutorial
In the process of designing the data model of the data warehouse, we often encounter such requirements:
1. The amount of data is relatively large;
2. Some fields in the table will Be updated, such as the user's address, product description information, order status, etc.;
3. Need to view historical snapshot information at a certain point in time or time period, for example, to view the historical snapshot information of an order at a certain time in the history The status at a point in time, for example, check how many times a user has updated in a certain period of time in the past, etc.;
4. The proportion and frequency of changes are not very large, for example, there are 10 million in total There are about 100,000 new members and changes every day;
5. If a full copy of this table is retained every day, then a lot of unchanged information will be saved in the full volume each time, which is very important for storage. A huge waste;
There are several options for this kind of table:
Comparison of the above solutions
Option 1
Needless to say more about this solution, it is very easy to implement Simple, drop the previous day's data every day and extract the latest one again.
The advantages are obvious, it saves space, and is also very convenient for common uses. There is no need to add a time partition when selecting a table.
The shortcomings are also obvious. There is no historical data. The only way to check the old accounts first is through other methods, such as drawing from the flow sheet.
Option 2
A full serving of slices every day is a relatively safe plan, and the historical data is also there.
The disadvantage is that the storage space is too large. If a full copy of this table is kept every day, a lot of unchanged information will be saved in each full copy, which is a huge waste of storage. , I still feel very deeply about this...
Of course we can also make some trade-offs, such as only retaining the data of the past month? However, the demand is shameless, and the life cycle of data is not something we can completely control.
Zipper table
The zipper table basically takes into account our needs in use.
First of all, it makes a trade-off in terms of space. Although it does not occupy as small a space as Plan 1, its daily increment may be only one thousandth or even one ten thousandth of Plan 2.
In fact, it can meet the needs of Solution 2, which can not only obtain the latest data, but also add filtering conditions and obtain historical data.
So it is still necessary for us to use zipper tables.
The zipper table is a data model, which is mainly defined for the way tables store data in data warehouse design. As the name suggests, the so-called zipper is to record history. Record information about all changes in a thing from its beginning to its current state. Zipper tables can avoid the massive storage problem caused by storing all records for each day, and are also a common way to deal with slowly changing data (SCD2).
Baidu Encyclopedia’s explanation: The zipper table is a table that maintains historical status and the latest status data. Depending on the zipper granularity, the zipper table is actually equivalent to a snapshot, but it has been optimized and part of it has been removed. Unchanged records, the customer records at the time of zipping can be easily restored through the zipper table.
1. Collect the full data of the day to the ND (NowDay) table;
2. Can retrieve yesterday’s full data from the history table and store it in the OD ( OldDay (last day) table;
3. Compare the two tables with all fields. (ND-OD) is the new and changed data on the day, that is, the increment of the day, represented by W_I;
4. Compare the full fields of the two tables. (OD-ND) is the data that needs to be closed when the status ends. END_DATE needs to be modified, represented by W_U;
5. Change the contents of the W_I table All are inserted into the history table, these are new records, start_date is the current day, and end_date is the max value, which can be set to '9999-12-31';
6. Perform W_U part of the history table In the update operation, start_date remains unchanged, while end_date is changed to the current day, which is a link operation. The historical table (OD) is compared with the W_U table, except for START_DATE and END_DATE. The W_U table shall prevail. The intersection of the two will change END_DATE to the current day. , indicating that the record is invalid.
As a simple example, for example, there is an order table:
There are 3 records on June 20th:
Order creation date | Order number | Order status |
---|---|---|
2012 -06-20 | 001 | Create order |
2012-06-20 | 002 | Create Order |
2012-06-20 | 003 | Payment completed |
to 6 On March 21st, there are 5 records in the table:
Order creation date | Order number | Order status |
---|---|---|
2012-06-20 | 001 | Create order |
002 | Create order | |
003 | Payment completed | |
004 | Create order | |
005 | Create order |
Order number | Order status | |
---|---|---|
001 | Create order | |
002 | Create order | |
003 | Payment completed | |
004 | Create order | |
005 | Create order | |
006 | Create order |
1. Only one copy of the full amount is retained, so the data is the same as the record on June 22. If you need to check the status of order 001 on June 21, it cannot be satisfied;
2. One copy is retained every day. In full, the table in the data warehouse has a total of 14 records, but many records are saved repeatedly without task changes, such as order 002,004. The large amount of data will cause a lot of storage waste;
If If the table is designed to be saved as a historical zipper table in the data warehouse, there will be a table like the following:
Order number | Order status | dw_bigin_date | dw_end_date | |
---|---|---|---|---|
001 | Create order | 2012-06-20 | 2012-06-20 | ##2012-06-20 |
Payment completed | 2012-06-21 | 9999-12-31 | 2012-06- 20 | |
Create order | 2012-06-20 | 9999-12-31 | ## 2012-06-20 | |
Payment completed | 2012-06-20 | 2012-06-21 | 2012-06-20 | |
shipped | 2012-06-22 | 9999-12-31 | 2012-06-21 | |
Create order | 2012-06-21 | 9999-12- 31 | ##2012-06-21 | 005 |
2012-06-21 | 2012-06-21 | 2012-06-21 | 005 | |
2012-06-22 | 9999-12-31 | 2012-06-22 | 006 | |
2012-06 -22 | 9999-12-31 | Note: |
Order Creation Date
dw_bigin_date | dw_end_date | ##2012-06-20 | ||
---|---|---|---|---|
2012-06-21 | 9999-12-31 | 2012-06- 20 | 002 | |
2012-06-20 | 9999-12-31 | ## 2012-06-20 | 003 | Payment completed |
2012-06-21 | 2012-06-21 | 004 | Create order | |
9999-12-31 | 2012-06-21 | 005 | Create order | |
2012-06-21 | is completely consistent with the record in the source table on June 21: |
Order number
2012-06-20 | 001 | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
2012-06-20 | 002 | Create order | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
2012-06-20 | 003 | Payment completed | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
2012-06-21 | 004 | Create order | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
2012-06-21 | 005 | Create order | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Name of person | Start Date | End Date | Status |
---|---|---|---|
19000101 | 19070901 | Hat home | |
19070901 | 19130901 | A Elementary School | |
19130901 | 19160901 | B Junior High School | |
19160901 | 19190901 | C High School | |
19190901 | 19230901 | D University | |
19230901 | 19601231 | E Company | |
19601231 | 29991231 | H retires at home |
Zip table implementation method
CREATE VOLATILE TABLE VT_xxxx_NEW AS xxxx WITH NO DATA ON COMMIT PRESERVE ROWS;CREATE VOLATILE SET TABLE VT_xxxx_CHG,NO LOG AS xxxx WITH NO DATA ON COMMIT PRESERVE ROWS;
2. Get the full amount of data for the day
INSERT INTO VT_xxxx_NEW(xx) SELECT (xx,cur_date, max_date) FROM xxxx_sorce;
3. Extract new or changed data from xxxx_NEW temporary table to xxxx_CHG temporary table;
INSERT INTO VT_xxxx_CHG(xx)SELECT xx FROM VT_xxxx_NEWWHERE (xx) NOT IN (select xx from xxxx_HIS where end_date='max_date');
4. Update the end_date of the invalid record in the history table to max Value
UPDATE A1 FROM xxxx_HIS A1, VT_xxxx_CHG A2SET End_Date='current_date'WHERE A1.xx=A2.xx AND A1.End_Date='max_date';
5. Insert new or changed data into the target table
INSERT INTO xxxx_HIS SELECT * FROM VT_xxxx_CHG;Take product data as an example
There is a product table t_product, The table structure is as follows:
Type | Description | |
---|---|---|
Product number | goods_status | |
Product status (pending review, for sale, on sale, deleted) | createtime | |
Product creation date | modifytime | |
Product modification date |
goods_status | ##createtime | ##modifytime | 001 |
---|---|---|---|
2019-12-20 | 002 | For Sale | |
2019-12-20 | 003 | for sale | |
2019-12-20 | ##004 | Deleted | |
2019-12-20 | The status of the product will change over time. We need to record the history of all changes in the product. The information is saved. | Plan 1: Snapshot each day's data to the data warehouse |
createtime | modifytime | 001 | To be reviewed|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
002 | For sale | 2019-12-19 | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
##003 | for sale | 2019-12-20 | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
004 | Deleted | 2019-12-15 | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
December 21 (10 pieces of data)
|
The above is the detailed content of Let's talk about database zipper tables. For more information, please follow other related articles on the PHP Chinese website!