Deleting Large Amounts of Data in SQL Server


I. Foreword - It's Not Easy to Say I Love You

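  To upgrade our database to SQL Server 2008 R2, I took an existing PC for testing and restored the databases from production (three databases that add up to an outrageous 100 GB+). The machine had a pitiful 4 GB of memory and had to act as both the DB server and the web server, so you can imagine how miserable its fate was: as soon as MS SQL Server started, memory usage shot up to 99%. The only option was more memory, so I swapped in two 8 GB modules for 16 GB in total. The result was the same: the memory was eaten up instantly (while CPU utilization hovered around 0%). Being a PC, the machine has only two memory slots, and the largest single module on the market is currently 16 GB (priced at 1K+). Even with those, the memory would probably still not be enough (PCs just can't take it), so there seemed to be no other way out -- delete data!!!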

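  Deleting data is easier said than done. Isn't it just a DELETE? Well, if I really did it that way, I would probably get to "know what Shanghai looks like at 4 a.m." (KB, sorry, but I'm a programmer at XXX, so I definitely beat you at this), and the database would probably blow up too (out of disk space, because the generated log file would be far too large).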

II. Mustering the Troops - Searching for It a Thousand Times

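  To better explain the difficulties and problems I ran into, some tests and explanations are needed first; they are also a way of exploring how to solve the problem. After all, the root of the problem is how to manipulate data better and faster, which comes down to finding an optimized combination of operations such as DELETE, UPDATE, INSERT, TRUNCATE, and DROP. The goal is to find the fastest and best approach. To make testing easier, I prepared a test table named Employee: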

-- Create table Employee
CREATE TABLE [dbo].[Employee] (
    [EmployeeNo] INT PRIMARY KEY,
    [EmployeeName] [nvarchar](50) NULL,
    [CreateUser] [nvarchar](50) NULL,
    [CreateDatetime] [datetime] NULL
);

1. Data Insertion Showdown

1.1. Loop insert, execution time: 38026 ms

-- Loop insert: every INSERT commits as its own transaction
SET STATISTICS TIME ON;
DECLARE @Index INT = 1;
DECLARE @Timer DATETIME = GETDATE();

WHILE @Index <= 100000
BEGIN
    -- VARCHAR(6) avoids the trailing spaces that CHAR(6) would pad onto the name
    INSERT [dbo].[Employee](EmployeeNo, EmployeeName, CreateUser, CreateDatetime) VALUES(@Index, 'Employee_' + CAST(@Index AS VARCHAR(6)), 'system', GETDATE());
    SET @Index = @Index + 1;
END

SELECT DATEDIFF(MS, @Timer, GETDATE()) AS [Execution time (ms)];
SET STATISTICS TIME OFF;

1.2. Loop insert inside a transaction, execution time: 6640 ms

-- Loop insert wrapped in a single transaction
BEGIN TRAN;
SET STATISTICS TIME ON;
DECLARE @Index INT = 1;
DECLARE @Timer DATETIME = GETDATE();

WHILE @Index <= 100000
BEGIN
    -- VARCHAR(6) avoids the trailing spaces that CHAR(6) would pad onto the name
    INSERT [dbo].[Employee](EmployeeNo, EmployeeName, CreateUser, CreateDatetime) VALUES(@Index, 'Employee_' + CAST(@Index AS VARCHAR(6)), 'system', GETDATE());
    SET @Index = @Index + 1;
END

SELECT DATEDIFF(MS, @Timer, GETDATE()) AS [Execution time (ms)];
SET STATISTICS TIME OFF;

COMMIT;

1.3. Batch insert, execution time: 220 ms

-- Batch insert: one INSERT ... SELECT that generates 100000 rows from a cross join
SET STATISTICS TIME ON;
DECLARE @Timer DATETIME = GETDATE();

INSERT [dbo].[Employee](EmployeeNo, EmployeeName, CreateUser, CreateDatetime)
SELECT TOP(100000) EmployeeNo = ROW_NUMBER() OVER (ORDER BY C1.[OBJECT_ID]), 'Employee_', 'system', GETDATE()
FROM SYS.COLUMNS AS C1 CROSS JOIN SYS.COLUMNS AS C2
ORDER BY C1.[OBJECT_ID];

SELECT DATEDIFF(MS, @Timer, GETDATE()) AS [Execution time (ms)];
SET STATISTICS TIME OFF;

1.4. CTE insert, execution time: also 220 ms

-- CTE insert: the same row generator, wrapped in a common table expression
SET STATISTICS TIME ON;
DECLARE @Timer DATETIME = GETDATE();

;WITH CTE(EmployeeNo, EmployeeName, CreateUser, CreateDatetime) AS(
    SELECT TOP(100000) EmployeeNo = ROW_NUMBER() OVER (ORDER BY C1.[OBJECT_ID]), 'Employee_', 'system', GETDATE()
    FROM SYS.COLUMNS AS C1 CROSS JOIN SYS.COLUMNS AS C2
    ORDER BY C1.[OBJECT_ID]
)
INSERT [dbo].[Employee] SELECT EmployeeNo, EmployeeName, CreateUser, CreateDatetime FROM CTE;

SELECT DATEDIFF(MS, @Timer, GETDATE()) AS [Execution time (ms)];
SET STATISTICS TIME OFF;

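Summary:

  • By execution time, the CTE insert and the batch insert are on par and the fastest, the transactional loop comes next, and the plain loop insert is the slowest;
  • The plain loop is the slowest because every INSERT runs as its own autocommit transaction and has to flush its log records at commit, 100,000 times in total. The transactional loop still logs every row but commits only once, which greatly reduces the number of log flushes. The batch insert is a single statement in a single transaction. A CTE is not based on the CLR; it is only a named query expression, so the CTE version is effectively the same statement as the batch insert, which is why the two timings match.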
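
If a row-by-row loop cannot be avoided, a middle ground between 1.1 and 1.2 is to commit every so many rows, which keeps the number of commits low without holding one huge transaction open. The following is only a sketch under assumptions: the @BatchSize variable and its value of 10000 are illustrative and were not part of the tests above, and dbo.Employee is assumed to be empty.

```sql
-- Sketch: loop insert that commits every @BatchSize rows.
-- @BatchSize is a hypothetical tuning value, not a measured optimum.
DECLARE @Index INT = 1;
DECLARE @BatchSize INT = 10000;

BEGIN TRAN;
WHILE @Index <= 100000
BEGIN
    INSERT dbo.Employee (EmployeeNo, EmployeeName, CreateUser, CreateDatetime)
    VALUES (@Index, 'Employee_' + CAST(@Index AS VARCHAR(6)), 'system', GETDATE());

    -- Close the current chunk and open the next one.
    IF @Index % @BatchSize = 0
    BEGIN
        COMMIT;
        BEGIN TRAN;
    END

    SET @Index = @Index + 1;
END
COMMIT;  -- commits the last chunk (possibly empty)
```

With these numbers the loop commits roughly ten times instead of 100,000 times, so its timing should land much closer to 1.2 than to 1.1, although that was not measured here.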

 

2. Data Deletion Showdown

2.1. Plain DELETE, execution time: 1240 ms

-- Plain DELETE: removes all rows in one fully logged statement
SET STATISTICS TIME ON;
DECLARE @Timer DATETIME = GETDATE();

DELETE FROM [dbo].[Employee];

SELECT DATEDIFF(MS, @Timer, GETDATE()) AS [Execution time (ms)];
SET STATISTICS TIME OFF;

2.2. Batched delete, execution time: 106 ms

SET STATISTICS TIME ON;
DECLARE @Timer DATETIME = GETDATE();
DECLARE @Rows INT = 1;

-- Delete in batches of 100000 rows. DELETE TOP (n) replaces SET ROWCOUNT,
-- which is deprecated for DELETE, INSERT and UPDATE statements.
WHILE @Rows > 0
BEGIN
    BEGIN TRAN;
    DELETE TOP (100000) FROM [dbo].[Employee];
    -- Capture the count before COMMIT, because COMMIT resets @@ROWCOUNT to 0.
    SET @Rows = @@ROWCOUNT;
    COMMIT;
END

SELECT DATEDIFF(MS, @Timer, GETDATE()) AS [Execution time (ms)];
SET STATISTICS TIME OFF;

2.3. TRUNCATE, execution time: 0 ms

-- TRUNCATE: deallocates the table's data pages instead of deleting row by row
SET STATISTICS TIME ON;
DECLARE @Timer DATETIME = GETDATE();

TRUNCATE TABLE [dbo].[Employee];

SELECT DATEDIFF(MS, @Timer, GETDATE()) AS [Execution time (ms)];
SET STATISTICS TIME OFF;

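 Summary:

  • TRUNCATE is extremely fast and clears 100,000 rows without any effort; the batched delete comes next, and the plain DELETE is the slowest;
  • TRUNCATE is fast because it is a DDL statement that deallocates the table's data pages and logs only those deallocations, so it generates very little log. A plain DELETE logs every deleted row and also takes row locks.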
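
"Very little log" does not mean "no log": in SQL Server a TRUNCATE TABLE inside an explicit transaction can still be rolled back. The sketch below illustrates this on the test table; the column aliases are only illustrative.

```sql
-- Sketch: TRUNCATE TABLE takes part in transactions like any other statement.
BEGIN TRAN;

    TRUNCATE TABLE dbo.Employee;

    -- Inside the transaction the table is already empty.
    SELECT COUNT(*) AS RowsInsideTran FROM dbo.Employee;

ROLLBACK;

-- After the rollback the original rows are back.
SELECT COUNT(*) AS RowsAfterRollback FROM dbo.Employee;
```

Keep in mind the restrictions that come with this speed: TRUNCATE TABLE accepts no WHERE clause, cannot be used on a table that is referenced by a FOREIGN KEY constraint, and resets an IDENTITY column to its seed.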

 

III. Sharpening the Knife - Half Hidden Behind the Pipa

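  From Section II above we know that the fastest ways to insert and delete are the batch insert and TRUNCATE respectively, so to delete large amounts of data we will combine the two. The core idea is to first store the data that must be kept in a new table, then TRUNCATE the original table, and finally batch-insert the kept data back. The implementation can of course be varied freely.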

1. Save the data to keep into a new table -> TRUNCATE the original table -> restore the saved data into the original table

  The script looks roughly like this:

SELECT * INTO #keep FROM Original WHERE CreateDate > '2011-12-31';
TRUNCATE TABLE Original;
INSERT Original SELECT * FROM #keep;

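  The first statement stores all the data to be kept in table #keep (#keep does not need to be created by hand; SELECT INTO creates it), and #keep copies the structure of the original table Original, that is, its column definitions (indexes and constraints are not copied). PS: if you only want to create the table structure without copying any data, the script is: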

SELECT * INTO #keep FROM Original WHERE 1 = 2;

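  The second statement clears all the data in the table, and the log it generates is practically negligible; the third statement restores the kept data.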
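
Between the TRUNCATE and the final INSERT, the rows to keep exist only in #keep, so a failure at that point would be painful. One way to guard against it is to run the three steps in one transaction. This is only a sketch under assumptions: Original and CreateDate are the names from the script above, the variables @Kept and @Restored are illustrative, and Original is assumed to have no IDENTITY column (otherwise INSERT ... SELECT * would need an explicit column list and IDENTITY_INSERT).

```sql
-- Sketch: keep -> truncate -> restore as one all-or-nothing unit.
SET XACT_ABORT ON;   -- any run-time error rolls back the whole transaction

DECLARE @Kept INT, @Restored INT;

BEGIN TRAN;

    -- 1. Save the rows to keep.
    SELECT * INTO #keep FROM Original WHERE CreateDate > '2011-12-31';
    SET @Kept = @@ROWCOUNT;

    -- 2. Empty the original table.
    TRUNCATE TABLE Original;

    -- 3. Put the saved rows back.
    INSERT Original SELECT * FROM #keep;
    SET @Restored = @@ROWCOUNT;

-- Commit only if every saved row made it back; otherwise undo everything.
IF @Restored = @Kept
    COMMIT;
ELSE
    ROLLBACK;

-- A rollback also undoes the creation of #keep, hence the existence check.
IF OBJECT_ID('tempdb..#keep') IS NOT NULL
    DROP TABLE #keep;
```

The trade-off is that the table stays locked for the whole operation and the INSERT is fully logged inside the transaction, so this variant favors safety over raw speed.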

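A few notes:

  • You don't have to use SELECT INTO; you can create #keep yourself with a script (or by copying an existing table). The drawback is that there is no way to obtain the table's creation script through SQL alone (I mean a script that matches the original table exactly: columns, properties, indexes, constraints, and so on), and when many tables are involved it will probably drive you crazy;
  • Since the first option is not ideal, how about creating an identical new database? You could reuse the existing scripts, and the resulting database would be essentially the same. I advise against it, though: first, you would have to work across databases; second, you would need enough disk space.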

 

2. Create a new table structure -> batch-insert the data to keep -> DROP the original table -> rename the new table to the original name

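  CREATE TABLE Original_keep (xxx) xxx -- use the method mentioned above (the creation script of the existing table), although an exact match cannot be guaranteed; a permanent table is used here instead of #keep, because a temporary table lives in tempdb and cannot be renamed into the permanent table Original

  INSERT Original_keep SELECT * FROM Original where clause

  DROP TABLE Original

  EXEC SP_RENAME 'Original_keep', 'Original'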

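  This approach is slightly faster than the first one because it skips the data restore (the final step of copying the data back), but it is a bit more cumbersome, because you need to create a table with exactly the same structure as the original, including columns, properties, constraints, indexes, and so on.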
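
Applied to the Employee test table from Section II, whose structure is known, the steps of this second approach might look like the sketch below. The table name Employee_New and the cutoff date are hypothetical and chosen only for illustration.

```sql
-- Sketch: create new table -> copy rows to keep -> drop original -> rename.
SET XACT_ABORT ON;   -- any run-time error rolls back the whole swap

BEGIN TRAN;

    -- Same structure as dbo.Employee (Employee_New is a hypothetical name).
    CREATE TABLE dbo.Employee_New (
        EmployeeNo     INT PRIMARY KEY,
        EmployeeName   NVARCHAR(50) NULL,
        CreateUser     NVARCHAR(50) NULL,
        CreateDatetime DATETIME NULL
    );

    -- Copy only the rows to keep (the cutoff date is illustrative).
    INSERT dbo.Employee_New (EmployeeNo, EmployeeName, CreateUser, CreateDatetime)
    SELECT EmployeeNo, EmployeeName, CreateUser, CreateDatetime
    FROM dbo.Employee
    WHERE CreateDatetime > '20111231';

    DROP TABLE dbo.Employee;

    -- The new name is given without the schema prefix.
    EXEC sp_rename N'dbo.Employee_New', N'Employee';

COMMIT;
```

Dropping the original table also drops its indexes, triggers, constraints, and permissions, so anything beyond the columns and primary key shown here has to be recreated on the new table.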

IV. Shrinking the Data - Like an Autumn Wind Sweeping Away Fallen Leaves

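   After the data is deleted, you will find that the space occupied by the database has not changed. This is where the powerful shrink feature comes in. The script is below; its running time varies with the size of your database, from tens of minutes at most to an instant at least.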

DBCC SHRINKDATABASE(DB_NAME) -- DB_NAME is a placeholder for the name of your database
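
A slightly fuller sequence around the shrink might look like the sketch below. It assumes you are connected to the database that was just cleaned up; the target of 10 percent free space is an illustrative value, and dbo.Employee stands in for whichever tables remain.

```sql
-- Sketch: check the space, shrink, then repair index fragmentation.

-- Shows the database size and the unallocated space before shrinking.
EXEC sp_spaceused;

-- 0 means the current database; 10 asks to leave about 10% free space in the files.
DBCC SHRINKDATABASE (0, 10);

-- Shrinking moves pages around and tends to fragment indexes, so rebuild afterwards.
ALTER INDEX ALL ON dbo.Employee REBUILD;
```

Rebuilding the indexes makes the data file grow again somewhat, so treat a shrink as a one-off cleanup after a large purge rather than as routine maintenance.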