Home  >  Article  >  Database  >  oralce distinct 去除重复记录的方法

oralce distinct 去除重复记录的方法

WBOY
WBOYOriginal
2016-06-07 17:46:05958browse

distinct的只显示一次重复出更的值。

不管这个值出现多少次只显示一次。

distinct 字段名1,字段名2 from 表格 order by 字段名1

最好和order by 结合使用。可以提高效率

SQL>
SQL> CREATE TABLE employees (
  2    au_id    CHAR(3)     NOT NULL,
  3    au_fname VARCHAR(15) NOT NULL,
  4    au_lname VARCHAR(15) NOT NULL,
  5    phone    VARCHAR(12) NULL    ,
  6    address  VARCHAR(20) NULL    ,
  7    city     VARCHAR(15) NULL    ,
  8    state    CHAR(2)     NULL    ,
  9    zip      CHAR(5)     NULL
 10  );

Table created.

SQL>
SQL> INSERT INTO employees VALUES('A01','S','B','111-111-1111','75 St','Boston','NY','11111');

1 row created.

SQL> INSERT INTO employees VALUES('A02','W','H','222-222-2222','2922 Rd','Boston','CO','22222');

1 row created.

SQL> INSERT INTO employees VALUES('A03','H','H','333-333-3333','3800 Ave, #14F','San Francisco','CA','33333');

1 row created.

SQL> INSERT INTO employees VALUES('A04','K','H','444-444-4444','3800 Ave, #14F','San Francisco','CA','44444');

1 row created.

SQL> INSERT INTO employees VALUES('A05','C','K','555-555-5555','114 St','New York','NY','55555');

1 row created.

SQL> INSERT INTO employees VALUES('A06',' ','K','666-666-666','390 Mall','Palo Alto','CA','66666');

1 row created.

SQL> INSERT INTO employees VALUES('A07','P','O','777-777-7777','1442 St','Sarasota','FL','77777');

1 row created.

SQL>
SQL>
SQL> SELECT DISTINCT state
  2    FROM employees;


看过滤重复的sql语句

SQL> select * from Employee
  2  /

ID   FIRST_NAME LAST_NAME  START_DAT END_DATE      SALARY CITY       DESCRIPTION
---- ---------- ---------- --------- --------- ---------- ---------- ---------------
01   Jason      Martin     25-JUL-96 25-JUL-06    1234.56 Toronto    Programmer
02   Alison     Mathews    21-MAR-76 21-FEB-86    6661.78 Vancouver  Tester
03   James      Smith      12-DEC-78 15-MAR-90    6544.78 Vancouver  Tester
04   Celia      Rice       24-OCT-82 21-APR-99    2344.78 Vancouver  Manager
05   Robert     Black      15-JAN-84 08-AUG-98    2334.78 Vancouver  Tester
06   Linda      Green      30-JUL-87 04-JAN-96    4322.78 New York   Tester
07   David      Larry      31-DEC-90 12-FEB-98    7897.78 New York   Manager
08   James      Cat        17-SEP-96 15-APR-02    1232.78 Vancouver  Tester

8 rows selected.

SQL>
SQL>
SQL> SELECT Description FROM employee
  2  /

DESCRIPTION
---------------
Programmer
Tester
Tester
Manager
Tester
Tester
Manager
Tester

8 rows selected.

SQL> SELECT DISTINCT Description FROM employee
  2  /

DESCRIPTION
---------------
Programmer
Manager
Tester

如果要处理多列表的话,字段以“,"分开就要以了,如下

SQL> SELECT DISTINCT city, state
  2    FROM employees


关于增加了distinct后查询的效率测试

只有增加DISTINCT关键字,Oracle必然需要对后面的所有字段进行排序。以前也经常发现由于开发人员对SQL不是很理解,在SELECT列表的20多个字段前面添加了DISTINCT,造成查询基本上不可能执行完成,甚至产生ORA-7445错误。所以一直向开发人员强调DISTINCT给性能带来的影响。

  没想到开发人员在测试一条大的SQL的时候,告诉我如果加上了DISTINCT,则查询大概需要4分钟左右可以执行完,如果不加DISTINCT,则查询执行了10多分钟,仍然得不到结果。

  首先想到的是可能DISTINCT是在子查询中,由于加上了DISTINCT,将第一步结果集缩小了,导致查询性能提高,结果一看SQL,发现DISTINCT居然是在查询的最外层。

  由于原始SQL太长,而且牵扯的表太多,很难说清楚,这里模拟了一个例子,这个例子由于数据量和SQL的复杂程度限制,无法看出二者执行时间上的明显差别。这里从两种情况的逻辑读对比来说明问题。

  首先建立模拟环境:

 

  SQL> CREATE TABLE T1 AS SELECT * FROM DBA_OBJECTS
  2 WHERE OWNER = 'SYS'
  3 AND OBJECT_TYPE NOT LIKE '%BODY'
  4 AND OBJECT_TYPE NOT LIKE 'JAVA%';
  Table created.
  SQL> CREATE TABLE T2 AS SELECT * FROM DBA_SEGMENTS WHERE OWNER = 'SYS';
  Table created.
  SQL> CREATE TABLE T3 AS SELECT * FROM DBA_INDEXES WHERE OWNER = 'SYS';
  Table created.
  SQL> ALTER TABLE T1 ADD CONSTRAINT PK_T1 PRIMARY KEY (OBJECT_NAME);
  Table altered.
  SQL> CREATE INDEX IND_T2_SEGNAME ON T2(SEGMENT_NAME);
  Index created.
  SQL> CREATE INDEX IND_T3_TABNAME ON T3(TABLE_NAME);
  Index created.
  SQL> EXEC DBMS_STATS.GATHER_TABLE_STATS(USER, 'T1', METHOD_OPT => 'FOR ALL INDEXED COLUMNS SIZE 100', CASCADE => TRUE)
  PL/SQL procedure successfully completed.
  SQL> EXEC DBMS_STATS.GATHER_TABLE_STATS(USER, 'T2', METHOD_OPT => 'FOR ALL INDEXED COLUMNS SIZE 100', CASCADE => TRUE)
  PL/SQL procedure successfully completed.
  SQL> EXEC DBMS_STATS.GATHER_TABLE_STATS(USER, 'T3', METHOD_OPT => 'FOR ALL INDEXED COLUMNS SIZE 100', CASCADE => TRUE)
  PL/SQL procedure successfully completed.
Statement:
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn