Home  >  Article  >  Database  >  Methods of deduplication and connection query in MySQL database

Methods of deduplication and connection query in MySQL database

WBOY
WBOYforward
2023-05-28 22:07:101080browse
Directory
  • 1. Deduplication

  • 2. Connection query

    • Use where for multi-table connection query

    • Inner join-equivalent join

    • ##Inner join-non-equivalent join

    • Inner join-self join

    • Outer join-left and right outer join

    • Three table join

1. Deduplication

Refer to this article for sample table content

Some MySQL data tables may have duplicate records, and in some cases we allow duplicate data exists, but sometimes we also need to delete these duplicate data.

For example: Remove duplicates and display position information:

mysql> select distinct job from emp;
+-----------+
| job       |
+-----------+
| CLERK     |
| SALESMAN  |
| MANAGER   |
| ANALYST   |
| PRESIDENT |
+-----------+
5 rows in set (0.02 sec)

Another example: Joint deduplication, find unique information about departments and positions:

mysql> select distinct job,deptno from emp;
+-----------+--------+
| job       | deptno |
+-----------+--------+
| CLERK     |     20 |
| SALESMAN  |     30 |
| MANAGER   |     20 |
| MANAGER   |     30 |
| MANAGER   |     10 |
| ANALYST   |     20 |
| PRESIDENT |     10 |
| CLERK     |     30 |
| CLERK     |     10 |
+-----------+--------+
9 rows in set (0.00 sec)

Another example: Now We want to count the number of jobs, combined with the count function:

mysql> select count(distinct job) from emp;
+---------------------+
| count(distinct job) |
+---------------------+
|                   5 |
+---------------------+
1 row in set (0.00 sec)

2. Connection query

We have learned how to read data in a table, which is relatively simple, but In real applications, it is often necessary to read data from multiple data tables.

JOIN is roughly divided into the following three categories according to its functions:

INNER JOIN (inner join, or equivalent join): obtains records of field matching relationships in two tables.

LEFT JOIN (left join): Get all the records in the left table, even if there are no corresponding matching records in the right table.

RIGHT JOIN (right join): Contrary to LEFT JOIN, it is used to obtain all records in the right table, even if there are no corresponding matching records in the left table.

The operation of multi-table join is to match each piece of data in one table with the data rows in another table. This involves efficiency control issues

Using where to perform multi-table connection queries

Now let's demonstrate an example: take out the name and department name of each employee:

mysql> select ename,dname
    -> from emp,dept
    -> where emp.deptno = dept.deptno;
+--------+------------+
| ename  | dname      |
+--------+------------+
| SMITH  | RESEARCH   |
| ALLEN  | SALES      |
| WARD   | SALES      |
| JONES  | RESEARCH   |
| MARTIN | SALES      |
| BLAKE  | SALES      |
| CLARK  | ACCOUNTING |
| SCOTT  | RESEARCH   |
| KING   | ACCOUNTING |
| TURNER | SALES      |
| ADAMS  | RESEARCH   |
| JAMES  | SALES      |
| FORD   | RESEARCH   |
| MILLER | ACCOUNTING |
+--------+------------+
14 rows in set (0.00 sec)

The above SQL statement is actually very inefficient. We try to optimize it (alias the table): (sql92 syntax)

mysql> select e.ename,d.dname
    -> from emp e,dept d
    -> where e.deptno = d.deptno;
+--------+------------+
| ename  | dname      |
+--------+------------+
| SMITH  | RESEARCH   |
| ALLEN  | SALES      |
| WARD   | SALES      |
| JONES  | RESEARCH   |
| MARTIN | SALES      |
| BLAKE  | SALES      |
| CLARK  | ACCOUNTING |
| SCOTT  | RESEARCH   |
| KING   | ACCOUNTING |
| TURNER | SALES      |
| ADAMS  | RESEARCH   |
| JAMES  | SALES      |
| FORD   | RESEARCH   |
| MILLER | ACCOUNTING |
+--------+------------+
14 rows in set (0.00 sec)

Note: The more connections the table has, the lower the efficiency. Please try to reduce the number of connections in the table. Number of connections!

Inner join - equivalent join

Still the above example, take out the name of each employee and department name: (sql99 syntax)

Inner join, we use inner

mysql> select e.ename,d.dname
    -> from emp e
    -> inner join
    -> dept d
    -> on
    -> e.deptno = d.deptno;
+--------+------------+
| ename  | dname      |
+--------+------------+
| SMITH  | RESEARCH   |
| ALLEN  | SALES      |
| WARD   | SALES      |
| JONES  | RESEARCH   |
| MARTIN | SALES      |
| BLAKE  | SALES      |
| CLARK  | ACCOUNTING |
| SCOTT  | RESEARCH   |
| KING   | ACCOUNTING |
| TURNER | SALES      |
| ADAMS  | RESEARCH   |
| JAMES  | SALES      |
| FORD   | RESEARCH   |
| MILLER | ACCOUNTING |
+--------+------------+
14 rows in set (0.00 sec)

The advantage of sql99 is: the connection of the table is independent and does not occupy the where position. Make the overall sql statement clearer

Inner join-non-equivalent join

Case: Find out the salary grade of each employee and require the employee name, salary, and salary grade to be displayed

mysql> select
    -> e.ename,e.sal,s.grade
    -> from
    -> emp e
    -> inner join
    -> salgrade s
    -> on
    -> e.sal between s.losal and s.hisal;
+--------+---------+-------+
| ename  | sal     | grade |
+--------+---------+-------+
| SMITH  |  800.00 |     1 |
| ALLEN  | 1600.00 |     3 |
| WARD   | 1250.00 |     2 |
| JONES  | 2975.00 |     4 |
| MARTIN | 1250.00 |     2 |
| BLAKE  | 2850.00 |     4 |
| CLARK  | 2450.00 |     4 |
| SCOTT  | 3000.00 |     4 |
| KING   | 5000.00 |     5 |
| TURNER | 1500.00 |     3 |
| ADAMS  | 1100.00 |     1 |
| JAMES  |  950.00 |     1 |
| FORD   | 3000.00 |     4 |
| MILLER | 1300.00 |     2 |
+--------+---------+-------+
14 rows in set (0.01 sec)

Inner connection - self-connection

Case: Query the employee's superior leader and request to display the employee name and the corresponding leader name

We can find that the relationship between employees and leaders is in a table. At this time Need to use self-join (technique: one table is regarded as two tables)

mysql> select
    -> a.ename as '员工名',b.ename as '领导名'
    -> from emp a
    -> join emp b
    -> on
    -> a.mgr = b.empno;
+-----------+-----------+
| 员工名    | 领导名      |
+-----------+-----------+
| SMITH     | FORD      |
| ALLEN     | BLAKE     |
| WARD      | BLAKE     |
| JONES     | KING      |
| MARTIN    | BLAKE     |
| BLAKE     | KING      |
| CLARK     | KING      |
| SCOTT     | JONES     |
| TURNER    | BLAKE     |
| ADAMS     | SCOTT     |
| JAMES     | BLAKE     |
| FORD      | JONES     |
| MILLER    | CLARK     |
+-----------+-----------+
13 rows in set (0.00 sec)

Outer join-left and right outer join

The difference between outer join and inner join is that the outer join does not match successfully The records of a certain table will also be taken out

Case: Find the employee's department information. Require the department to find out even if there are no employees

mysql> select
    -> e.ename,d.dname
    -> from emp e
    -> right join dept d
    -> on
    -> e.deptno = d.deptno;
+--------+------------+
| ename  | dname      |
+--------+------------+
| SMITH  | RESEARCH   |
| ALLEN  | SALES      |
| WARD   | SALES      |
| JONES  | RESEARCH   |
| MARTIN | SALES      |
| BLAKE  | SALES      |
| CLARK  | ACCOUNTING |
| SCOTT  | RESEARCH   |
| KING   | ACCOUNTING |
| TURNER | SALES      |
| ADAMS  | RESEARCH   |
| JAMES  | SALES      |
| FORD   | RESEARCH   |
| MILLER | ACCOUNTING |
| NULL   | OPERATIONS |
+--------+------------+
15 rows in set (0.00 sec)

Similarly, if it is a left outer join, all the data in the left table will be queried, and the left join keyword can be used

For outer joins The number of query results must be >= the number of query results of inner join

Three table connection

A more complicated situation is group table connection

Let’s Look at a case:

Find out the department name and salary grade of each employee. It is required to display the employee name, department name, salary, and salary grade

mysql> select
    -> e.ename,e.sal,d.dname,s.grade
    -> from emp e
    -> join dept d
    -> on e.deptno = d.deptno
    -> join salgrade s
    -> on e.sal between s.losal and s.hisal;
+--------+---------+------------+-------+
| ename  | sal     | dname      | grade |
+--------+---------+------------+-------+
| SMITH  |  800.00 | RESEARCH   |     1 |
| ALLEN  | 1600.00 | SALES      |     3 |
| WARD   | 1250.00 | SALES      |     2 |
| JONES  | 2975.00 | RESEARCH   |     4 |
| MARTIN | 1250.00 | SALES      |     2 |
| BLAKE  | 2850.00 | SALES      |     4 |
| CLARK  | 2450.00 | ACCOUNTING |     4 |
| SCOTT  | 3000.00 | RESEARCH   |     4 |
| KING   | 5000.00 | ACCOUNTING |     5 |
| TURNER | 1500.00 | SALES      |     3 |
| ADAMS  | 1100.00 | RESEARCH   |     1 |
| JAMES  |  950.00 | SALES      |     1 |
| FORD   | 3000.00 | RESEARCH   |     4 |
| MILLER | 1300.00 | ACCOUNTING |     2 |
+--------+---------+------------+-------+
14 rows in set (0.00 sec)

Let’s look at a more complex situation:

Find out the department name, salary grade, and leader name of each employee. Request to display employee name, department name, leader name, salary, salary grade

mysql> select
    -> e.ename,e.sal,d.dname,s.grade,l.ename
    -> from emp e
    -> join dept d
    -> on e.deptno = d.deptno
    -> join salgrade s
    -> on e.sal between s.losal and s.hisal
    -> left join
    -> emp l
    -> on e.mgr = l.empno;
+--------+---------+------------+-------+-------+
| ename  | sal     | dname      | grade | ename |
+--------+---------+------------+-------+-------+
| SMITH  |  800.00 | RESEARCH   |     1 | FORD  |
| ALLEN  | 1600.00 | SALES      |     3 | BLAKE |
| WARD   | 1250.00 | SALES      |     2 | BLAKE |
| JONES  | 2975.00 | RESEARCH   |     4 | KING  |
| MARTIN | 1250.00 | SALES      |     2 | BLAKE |
| BLAKE  | 2850.00 | SALES      |     4 | KING  |
| CLARK  | 2450.00 | ACCOUNTING |     4 | KING  |
| SCOTT  | 3000.00 | RESEARCH   |     4 | JONES |
| KING   | 5000.00 | ACCOUNTING |     5 | NULL  |
| TURNER | 1500.00 | SALES      |     3 | BLAKE |
| ADAMS  | 1100.00 | RESEARCH   |     1 | SCOTT |
| JAMES  |  950.00 | SALES      |     1 | BLAKE |
| FORD   | 3000.00 | RESEARCH   |     4 | JONES |
| MILLER | 1300.00 | ACCOUNTING |     2 | CLARK |
+--------+---------+------------+-------+-------+
14 rows in set (0.00 sec)

The above is the detailed content of Methods of deduplication and connection query in MySQL database. For more information, please follow other related articles on the PHP Chinese website!

Statement:
This article is reproduced at:yisu.com. If there is any infringement, please contact admin@php.cn delete