Linux下删除大数据文件中部分字段重复的行-Mysql Tutorial-php.cn

Home

Database

Mysql Tutorial

Linux下删除大数据文件中部分字段重复的行

WBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWB

Jun 07, 2016 pm 05:08 PM

最近写的一个数据采集程序生成了一个含有1千多万行数据的文件，数据由4个字段组成，按照要求需要删除第二个字段重复的行，找来找

最近写的一个数据采集程序生成了一个含有1千多万行数据的文件，，数据由4个字段组成，按照要求需要删除第二个字段重复的行，找来找去linux下也没找到合适的工具，sed/gawk等流处理工具只能针对一行一行处理，并无法找到字段重复的行。看来只好自己python一个程序了，突然想起来利用mysql，于是进行乾坤大挪移：

1. 利用mysqlimport --local dbname data.txt导入数据到表中，表名要与文件名一致
2. 执行下列sql语句(要求唯一的字段为uniqfield）

use dbname;

alter table tablename add rowid int auto_increment not null;

create table t select min(rowid) as rowid from tablename group by uniqfield;

create table t2 select tablename .* from tablename,t where tablename.rowid= t.rowid;

drop table tablename;

rename table t2 to tablename;

linux

Statement

The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn

How does MySQL index cardinality affect query performance?Apr 14, 2025 am 12:18 AM

MySQL index cardinality has a significant impact on query performance: 1. High cardinality index can more effectively narrow the data range and improve query efficiency; 2. Low cardinality index may lead to full table scanning and reduce query performance; 3. In joint index, high cardinality sequences should be placed in front to optimize query.

MySQL: Resources and Tutorials for New UsersApr 14, 2025 am 12:16 AM

The MySQL learning path includes basic knowledge, core concepts, usage examples, and optimization techniques. 1) Understand basic concepts such as tables, rows, columns, and SQL queries. 2) Learn the definition, working principles and advantages of MySQL. 3) Master basic CRUD operations and advanced usage, such as indexes and stored procedures. 4) Familiar with common error debugging and performance optimization suggestions, such as rational use of indexes and optimization queries. Through these steps, you will have a full grasp of the use and optimization of MySQL.

Real-World MySQL: Examples and Use CasesApr 14, 2025 am 12:15 AM

MySQL's real-world applications include basic database design and complex query optimization. 1) Basic usage: used to store and manage user data, such as inserting, querying, updating and deleting user information. 2) Advanced usage: Handle complex business logic, such as order and inventory management of e-commerce platforms. 3) Performance optimization: Improve performance by rationally using indexes, partition tables and query caches.

SQL Commands in MySQL: Practical ExamplesApr 14, 2025 am 12:09 AM

SQL commands in MySQL can be divided into categories such as DDL, DML, DQL, DCL, etc., and are used to create, modify, delete databases and tables, insert, update, delete data, and perform complex query operations. 1. Basic usage includes CREATETABLE creation table, INSERTINTO insert data, and SELECT query data. 2. Advanced usage involves JOIN for table joins, subqueries and GROUPBY for data aggregation. 3. Common errors such as syntax errors, data type mismatch and permission problems can be debugged through syntax checking, data type conversion and permission management. 4. Performance optimization suggestions include using indexes, avoiding full table scanning, optimizing JOIN operations and using transactions to ensure data consistency.

How does InnoDB handle ACID compliance?Apr 14, 2025 am 12:03 AM

InnoDB achieves atomicity through undolog, consistency and isolation through locking mechanism and MVCC, and persistence through redolog. 1) Atomicity: Use undolog to record the original data to ensure that the transaction can be rolled back. 2) Consistency: Ensure the data consistency through row-level locking and MVCC. 3) Isolation: Supports multiple isolation levels, and REPEATABLEREAD is used by default. 4) Persistence: Use redolog to record modifications to ensure that data is saved for a long time.

MySQL's Place: Databases and ProgrammingApr 13, 2025 am 12:18 AM

MySQL's position in databases and programming is very important. It is an open source relational database management system that is widely used in various application scenarios. 1) MySQL provides efficient data storage, organization and retrieval functions, supporting Web, mobile and enterprise-level systems. 2) It uses a client-server architecture, supports multiple storage engines and index optimization. 3) Basic usages include creating tables and inserting data, and advanced usages involve multi-table JOINs and complex queries. 4) Frequently asked questions such as SQL syntax errors and performance issues can be debugged through the EXPLAIN command and slow query log. 5) Performance optimization methods include rational use of indexes, optimized query and use of caches. Best practices include using transactions and PreparedStatemen

MySQL: From Small Businesses to Large EnterprisesApr 13, 2025 am 12:17 AM

MySQL is suitable for small and large enterprises. 1) Small businesses can use MySQL for basic data management, such as storing customer information. 2) Large enterprises can use MySQL to process massive data and complex business logic to optimize query performance and transaction processing.

What are phantom reads and how does InnoDB prevent them (Next-Key Locking)?Apr 13, 2025 am 12:16 AM

InnoDB effectively prevents phantom reading through Next-KeyLocking mechanism. 1) Next-KeyLocking combines row lock and gap lock to lock records and their gaps to prevent new records from being inserted. 2) In practical applications, by optimizing query and adjusting isolation levels, lock competition can be reduced and concurrency performance can be improved.

See all articles

Hot AI Tools

Undresser.AI Undress

AI-powered app for creating realistic nude photos

AI Clothes Remover

Online AI tool for removing clothes from photos.

Undress AI Tool

Undress images for free

Clothoff.io

AI clothes remover

AI Hentai Generator

Generate AI Hentai for free.

Hot Article

R.E.P.O. Energy Crystals Explained and What They Do (Yellow Crystal)

3 weeks agoBy尊渡假赌尊渡假赌尊渡假赌

R.E.P.O. Best Graphic Settings

3 weeks agoBy尊渡假赌尊渡假赌尊渡假赌

Assassin's Creed Shadows: Seashell Riddle Solution

2 weeks agoByDDD

R.E.P.O. How to Fix Audio if You Can't Hear Anyone

3 weeks agoBy尊渡假赌尊渡假赌尊渡假赌

WWE 2K25: How To Unlock Everything In MyRise

1 months agoBy尊渡假赌尊渡假赌尊渡假赌

Hot Tools

SublimeText3 Linux new version

SublimeText3 Linux latest version

SAP NetWeaver Server Adapter for Eclipse

Integrate Eclipse with SAP NetWeaver application server.

SublimeText3 Chinese version

Chinese version, very easy to use

Dreamweaver Mac version

Visual web development tools

DVWA

Damn Vulnerable Web App (DVWA) is a PHP/MySQL web application that is very vulnerable. Its main goals are to be an aid for security professionals to test their skills and tools in a legal environment, to help web developers better understand the process of securing web applications, and to help teachers/students teach/learn in a classroom environment Web application security. The goal of DVWA is to practice some of the most common web vulnerabilities through a simple and straightforward interface, with varying degrees of difficulty. Please note that this software

Hot Topics

Where is the login entrance for gmail email?

7499

CakePHP Tutorial

1377

What is the format of the account name of steam

win11 activation key permanent

nyt connections hints and answers