
How to deduplicate data in Oracle

Jan 04, 2023, 02:42 PM
oracle

Deduplication methods: 1. Use the DISTINCT keyword, with the syntax "SELECT DISTINCT column_name FROM table_name;"; 2. Use the window function row_number() over(); 3. Use a "group by" clause, with the syntax "select column_name from table_name group by column_name;"; 4. Use the rowid pseudocolumn.


The operating environment of this tutorial: Windows 7 system, Oracle 11g version, Dell G3 computer.

Business Scenario

The required data has to be queried by joining three tables, and the result set of the joined query contains duplicate rows.

Original SQL statement

SELECT 
  D.ORDER_NUM AS "申请单号" ,
  D.CREATE_TIME ,
  D.EMP_NAME AS "申请人",
  (SELECT extractvalue(t1.row_data,'/root/row/FI13_wasteName')
  FROM dat_table_row t1
  WHERE d.document_id = t1.document_id
  AND t1.table_id     = 'dynamicRowsIdPTFLXX'
  ) AS "废料名称",
  (SELECT extractvalue(t1.row_data,'/root/row/FI13_units')
  FROM dat_table_row t1
  WHERE d.document_id = t1.document_id
  AND t1.table_id     = 'dynamicRowsIdPTFLXX'
  ) AS "单位",
  (SELECT extractvalue(t1.row_data,'/root/row/FI13_estimate')
  FROM dat_table_row t1
  WHERE d.document_id = t1.document_id
  AND t1.table_id     = 'dynamicRowsIdPTFLXX'
  ) AS "预估数量",
  (SELECT extractvalue(t1.row_data,'/root/row/FI13_stockRemoval')
  FROM dat_table_row t1
  WHERE d.document_id = t1.document_id
  AND t1.table_id     = 'dynamicRowsIdPTFLXX'
  ) AS "累计出库数量",
  (SELECT extractvalue(t1.row_data,'/root/row/FI13_receivingTime')
  FROM dat_table_row t1
  WHERE d.document_id = t1.document_id
  AND t1.table_id     = 'dynamicRowsIdCGYTX'
  ) AS "收购方收货时间",
  (SELECT extractvalue(t2.row_data,'/root/row/FI13_collectionTime')
  FROM dat_table_row t2
  WHERE d.document_id = t2.document_id
  AND t2.table_id     = 'dynamicRowsIdPTSJSKSJ'
  ) AS "实际收款时间"
FROM dat_document d,
  dat_table_row dtr
WHERE d.form_name       ='FI14'
AND d.document_id       =dtr.document_id
AND (D.DOCUMENT_STATUS != 'deleted'
OR D.DOCUMENT_STATUS   IS NULL )
  --AND TO_CHAR(d.create_time,'yyyy-MM-dd') BETWEEN '2020-01-01' AND '2021-03-26'
AND d.order_num = 'FI1420210708002' --FI1420210708002
ORDER BY d.CREATE_TIME DESC;

Method 1: distinct deduplication

SELECT DISTINCT filters duplicate rows out of the result set, so that each combination of values in the columns listed in the SELECT clause is returned only once.

The syntax of the DISTINCT statement is as follows:

SELECT DISTINCT
  column_1,
  column_2,
  ...
FROM
  table_name;
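
To make the behaviour on several columns concrete, here is a minimal sketch. The table emp_demo and its columns are hypothetical names invented for this illustration; they are not part of the business query in this article.

```sql
-- Hypothetical demo table; all names are assumptions for illustration only.
CREATE TABLE emp_demo (
  dept_name VARCHAR2(30),
  job_title VARCHAR2(30)
);

INSERT INTO emp_demo VALUES ('Sales', 'Clerk');
INSERT INTO emp_demo VALUES ('Sales', 'Clerk');    -- exact duplicate of the row above
INSERT INTO emp_demo VALUES ('Sales', 'Manager');

-- DISTINCT compares the whole select list, so each (dept_name, job_title)
-- pair is returned once: ('Sales', 'Clerk') and ('Sales', 'Manager').
SELECT DISTINCT dept_name, job_title
FROM emp_demo;
```

Because the comparison covers every selected column, adding a column whose value differs from row to row (an id or a timestamp, for example) makes otherwise identical rows distinct again.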

Example:

SELECT DISTINCT
  D.ORDER_NUM AS "申请单号" ,
  D.CREATE_TIME ,
  D.EMP_NAME AS "申请人",
  (SELECT extractvalue(t1.row_data,'/root/row/FI13_wasteName')
  FROM dat_table_row t1
  WHERE d.document_id = t1.document_id
  AND t1.table_id     = 'dynamicRowsIdPTFLXX'
  ) AS "废料名称",
  (SELECT extractvalue(t1.row_data,'/root/row/FI13_units')
  FROM dat_table_row t1
  WHERE d.document_id = t1.document_id
  AND t1.table_id     = 'dynamicRowsIdPTFLXX'
  ) AS "单位",
  (SELECT extractvalue(t1.row_data,'/root/row/FI13_estimate')
  FROM dat_table_row t1
  WHERE d.document_id = t1.document_id
  AND t1.table_id     = 'dynamicRowsIdPTFLXX'
  ) AS "预估数量",
  (SELECT extractvalue(t1.row_data,'/root/row/FI13_stockRemoval')
  FROM dat_table_row t1
  WHERE d.document_id = t1.document_id
  AND t1.table_id     = 'dynamicRowsIdPTFLXX'
  ) AS "累计出库数量",
  (SELECT extractvalue(t1.row_data,'/root/row/FI13_receivingTime')
  FROM dat_table_row t1
  WHERE d.document_id = t1.document_id
  AND t1.table_id     = 'dynamicRowsIdCGYTX'
  ) AS "收购方收货时间",
  (SELECT extractvalue(t2.row_data,'/root/row/FI13_collectionTime')
  FROM dat_table_row t2
  WHERE d.document_id = t2.document_id
  AND t2.table_id     = 'dynamicRowsIdPTSJSKSJ'
  ) AS "实际收款时间"
FROM dat_document d,
  dat_table_row dtr
WHERE d.form_name       ='FI14'
AND d.document_id       =dtr.document_id
AND (D.DOCUMENT_STATUS != 'deleted'
OR D.DOCUMENT_STATUS   IS NULL )
  --AND TO_CHAR(d.create_time,'yyyy-MM-dd') BETWEEN '2020-01-01' AND '2021-03-26'
AND d.order_num = 'FI1420210708002' --FI1420210708002
ORDER BY d.CREATE_TIME DESC;

Note: when DISTINCT is combined with ORDER BY, every ORDER BY column must also appear in the SELECT DISTINCT list. Oracle removes the duplicates first and sorts the result afterwards, so a sort column that is not part of the DISTINCT list is no longer available at that point and the statement fails with ORA-01791: not a SELECTed expression.
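
The ORDER BY restriction can be sketched with two small queries. The table orders_demo and its columns are hypothetical and only serve as an illustration; the exact behaviour should be confirmed on your own Oracle version.

```sql
-- Assumes a hypothetical table orders_demo(order_num, emp_name, create_time).

-- Rejected by Oracle: create_time is not in the DISTINCT select list,
-- so there is nothing left to sort by after the duplicates are removed.
SELECT DISTINCT order_num, emp_name
FROM orders_demo
ORDER BY create_time DESC;

-- Accepted: the sort column is part of the DISTINCT select list.
SELECT DISTINCT order_num, emp_name, create_time
FROM orders_demo
ORDER BY create_time DESC;
```

Note that adding create_time to the select list also changes what counts as a duplicate: two rows that differ only in create_time are now both returned.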

Method 2: row_number() over()

Syntax

select * from
(select A.*, row_number() over(partition by A.name1 order by A.name12 desc) rn from A)
where rn = 1
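
As a smaller illustration of the same pattern, the sketch below assumes a table test(id, name, age) like the one used in Method 4; choosing the lowest id as the surviving row is an assumption made for this example.

```sql
-- Assumes a table test(id, name, age).
-- Rows are numbered within each (name, age) group, lowest id first,
-- and only the first row of every group is kept.
SELECT id, name, age
FROM (
  SELECT t.id, t.name, t.age,
         ROW_NUMBER() OVER (PARTITION BY t.name, t.age ORDER BY t.id) AS rn
  FROM test t
)
WHERE rn = 1;
```

The PARTITION BY columns define what counts as a duplicate, and the ORDER BY inside OVER() decides which row of each group survives. Unlike DISTINCT, this keeps columns such as id that differ between the duplicates.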

Example

select * from (
select 
  d.order_num as "申请单号" ,
  d.create_time ,
  d.emp_name as "申请人",
  (select extractvalue(t1.row_data,'/root/row/FI13_wasteName')
  from dat_table_row t1
  where d.document_id = t1.document_id
  and t1.table_id     = 'dynamicRowsIdPTFLXX'
  ) as "废料名称",
  (select extractvalue(t1.row_data,'/root/row/FI13_units')
  from dat_table_row t1
  where d.document_id = t1.document_id
  and t1.table_id     = 'dynamicRowsIdPTFLXX'
  ) as "单位",
  (select extractvalue(t1.row_data,'/root/row/FI13_estimate')
  from dat_table_row t1
  where d.document_id = t1.document_id
  and t1.table_id     = 'dynamicRowsIdPTFLXX'
  ) as "预估数量",
  (select extractvalue(t1.row_data,'/root/row/FI13_stockRemoval')
  from dat_table_row t1
  where d.document_id = t1.document_id
  and t1.table_id     = 'dynamicRowsIdPTFLXX'
  ) as "累计出库数量",
  (select extractvalue(t1.row_data,'/root/row/FI13_receivingTime')
  from dat_table_row t1
  where d.document_id = t1.document_id
  and t1.table_id     = 'dynamicRowsIdCGYTX'
  ) as "收购方收货时间",
  (select extractvalue(t2.row_data,'/root/row/FI13_collectionTime')
  from dat_table_row t2
  where d.document_id = t2.document_id
  and t2.table_id     = 'dynamicRowsIdPTSJSKSJ'
  ) as "实际收款时间",
  row_number() over(partition by d.order_num  order by d.create_time desc) rn 
from dat_document d,
  dat_table_row dtr
where d.form_name       ='FI14'
and d.document_id       =dtr.document_id
and (d.document_status != 'deleted'
or d.document_status   is null )
  --AND TO_CHAR(d.create_time,'yyyy-MM-dd') BETWEEN '2020-01-01' AND '2021-03-26'
and d.order_num = 'FI1420210708002' --FI1420210708002
) where rn = 1;

Method 3: group by

select column_name from table_name
group by column_name;
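
A short sketch of GROUP BY used for deduplication, again assuming a table test(id, name, age) like the one in Method 4. The second query is an optional addition that shows which values are duplicated; it is not part of the original method.

```sql
-- One row per distinct (name, age) pair.
SELECT name, age
FROM test
GROUP BY name, age;

-- Same grouping, but listing only the pairs that occur more than once,
-- together with the number of copies.
SELECT name, age, COUNT(*) AS copies
FROM test
GROUP BY name, age
HAVING COUNT(*) > 1;
```

Every column in the select list must either appear in the GROUP BY clause or be wrapped in an aggregate function such as MIN or COUNT.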

Method 4: Use the rowid pseudocolumn

select id,name,age from test t1
where t1.rowid in (select min(rowid) from test t2 where t1.name=t2.name and t1.age=t2.age);
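
The same idea is often used to delete the duplicates instead of only filtering them in a query. The statement below is a sketch under the assumption that rows of test with the same name and age count as duplicates and that the row with the smallest rowid in each group should be kept.

```sql
-- Keeps the row with the smallest rowid in every (name, age) group
-- and deletes all other rows of that group.
DELETE FROM test t1
WHERE t1.rowid NOT IN (
  SELECT MIN(t2.rowid)
  FROM test t2
  GROUP BY t2.name, t2.age
);
```

It is safer to run the equivalent SELECT first, check how many rows would be affected, and only then issue the DELETE and COMMIT. GROUP BY places NULL values in a group of their own, whereas the equality comparison t1.name = t2.name in the query above never matches NULL, so rows with a NULL name or age are treated differently by the two statements.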
