
Delete Data from Large SQL Server Tables with Minimal Logging


By: Jim Evans | Last Updated: 2019-02-14 | Related Tips: More > Database Administration

Problem

As SQL Server DBAs or developers, we are periodically tasked with purging data from a very large table. However, typical delete methods can cause issues with large transaction logs and contention, especially when purging a production system. My recent challenge was to purge a log table that had over 650 million records and retain only the latest 1 million rows.

Solution

My goal is to build a strategic technique to purge a very large table while retaining only a fraction of its data. On a typical SQL Server, 1 million records is not a massive amount, and in my recent production challenge it was only 1/650 of the current table size. To do a fast data purge we will use a little-used feature of SQL Server, Partition Switching. Every table in SQL Server has at least one partition, and the ALTER TABLE SWITCH command can quickly move all of the data from one table to another as a metadata-only operation. This leaves the original table fully intact, but empty. Then we can move the small subset of data we want to keep from the new table back to the original. I chose Partition Switching over renaming the table because I believe it is a less intrusive method: the object and index definitions are not altered or destroyed during the operation. On many database systems it is hard to predict how renaming objects will affect the application or cached plans.
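
At a high level the approach looks like the sketch below. The table names and the "keep" criteria here are placeholders for illustration only; the actual demo objects are built step by step in the sections that follow.

--High-level sketch of the purge pattern (placeholder names, not the demo objects)
--1. Switch all rows out of the big table into an identically structured, empty table (metadata-only, near instant).
ALTER TABLE dbo.BigLogTable SWITCH TO dbo.BigLogTable_Switch;
 
--2. Copy back only the rows you want to keep.
INSERT dbo.BigLogTable (LogDate, LogText)
SELECT LogDate, LogText
FROM dbo.BigLogTable_Switch
WHERE LogDate >= DATEADD(DAY, -30, GETDATE()); --example "keep" criteria
 
--3. Drop the switch table, discarding the old rows with minimal logging.
DROP TABLE dbo.BigLogTable_Switch;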

Note: it is best to do this during off-peak hours! According to Microsoft, "Executing a SWITCH prevents new transactions from starting and might significantly affect the workload throughput and temporarily delay access to the underlying table."
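
If you are concerned about the SWITCH waiting on, and in turn blocking, other activity, one option (not part of the original steps) is to bound the wait with a lock timeout. A minimal sketch, using the demo table names created below:

--Sketch: fail fast instead of waiting indefinitely for the schema modification lock
SET LOCK_TIMEOUT 5000;  --wait at most 5 seconds (value is in milliseconds)
ALTER TABLE [dbo].[Test1] SWITCH TO [dbo].[Test1_SWITCH];
SET LOCK_TIMEOUT -1;    --restore the default (wait indefinitely)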

Note: In this scenario the source table is not referenced by any foreign keys. If other tables reference the source table through a foreign key, you will have a multi-table scenario that must be handled so as not to break referential integrity. This article is meant for a simple scenario, such as a large log table without foreign key constraints.
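
As a quick sanity check, you can query the system catalog to confirm that nothing references the table before attempting the switch. A sketch using the demo table created below:

--Sketch: list any foreign keys that reference the source table; the result should be empty
SELECT fk.name AS foreign_key_name,
       OBJECT_NAME(fk.parent_object_id) AS referencing_table
FROM sys.foreign_keys AS fk
WHERE fk.referenced_object_id = OBJECT_ID(N'dbo.Test1');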

SQL Server Fast Delete from Large Table Demo

In this demo we will create a starting table and populate it with sample data, then create an identical 'switch' table including indexes. Next, we will move the records to the new table using the ALTER TABLE SWITCH command and show that the records move (very quickly) and the original table is left empty. After that we will move a small set of data back to the original table. Last, we drop the unneeded switch table.

Review the requirements and restrictions for SWITCH at Microsoft: https://docs.microsoft.com/en-us/sql/t-sql/statements/alter-table-transact-sql?view=sql-server-2017
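
One of those requirements is that the source and target must be on the same filegroup. It also helps to remember that every non-partitioned table is itself a single partition, which is what lets this technique work without any partitioning setup. A sketch for inspecting the partitions and row counts of the demo tables created below:

--Sketch: row counts per partition for the source and switch tables
SELECT OBJECT_NAME(p.object_id) AS table_name,
       p.partition_number,
       p.rows
FROM sys.partitions AS p
WHERE p.object_id IN (OBJECT_ID(N'dbo.Test1'), OBJECT_ID(N'dbo.Test1_SWITCH'))
  AND p.index_id IN (0, 1); --heap or clustered index only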

Step 1: Create a Test Table

For this step we will create a test table: dbo.Test1.

--1. Create a Test Table
CREATE TABLE [dbo].[Test1]  (
   Col1_id INT IDENTITY(1,1),
   Col2_D DATETIME DEFAULT GETDATE(),
   Col3_C NCHAR(20) DEFAULT 'My test desc!',
   Col4_T NVARCHAR(MAX) DEFAULT REPLICATE('0123456789', 100)
);
GO

Step 2: Load Test Data

For this limited example we will only load 5000 rows and create 2 indexes.  This also works for tables with millions of records.

--2. Load Test Data

SET NOCOUNT ON;
INSERT [dbo].[Test1]  (Col2_D, Col3_C, Col4_T)
VALUES (DEFAULT, 'My test desc!', DEFAULT);
GO 5000 
 
--Check it.
SELECT TOP 10 * FROM [dbo].[Test1];
SELECT COUNT(*) As 'Test1' FROM [dbo].[Test1];
GO
 
--Create Indexes
CREATE CLUSTERED INDEX [ClusteredIndex_Test1_Col2_D] ON [dbo].[Test1]
([Col2_D] ASC) 
GO
 
CREATE UNIQUE NONCLUSTERED INDEX [NonClusteredIndex_Test1_Col1_ID] ON [dbo].[Test1]
([Col1_id] ASC) 
GO

The check queries should return the top 10 rows and confirm 5000 rows loaded.


Step 3: Create a Duplicate Table

Create a duplicate with new names for the table and indexes. You might script the original table from SSMS and do a find and replace on the table names. Be sure to script all table attributes and indexes. You do not need to include Foreign Keys in the switch table!

--3. Now let's duplicate the original schema and change the table and index names
 
CREATE TABLE [dbo].[Test1_SWITCH] (
   Col1_ID INT IDENTITY(1,1),
   Col2_D DATETIME DEFAULT GETDATE(),
   Col3_C NCHAR(20) DEFAULT 'My test desc!',
   Col4_T NVARCHAR(MAX) DEFAULT REPLICATE('0123456789', 100) --match the original table definition
);
 
CREATE CLUSTERED INDEX [ClusteredIndex_Test1_SWITCH_Col2_D] ON [dbo].[Test1_SWITCH]
([Col2_D] ASC) 
GO
 
CREATE UNIQUE NONCLUSTERED INDEX [NonClusteredIndex_Test1_SWITCH_Col1_ID] ON [dbo].[Test1_SWITCH]
([Col1_id] ASC) 
GO

--Check it
SELECT COUNT(*) AS 'Test1' FROM [dbo].[Test1];
SELECT COUNT(*) AS 'Test1_SWITCH' FROM [dbo].[Test1_SWITCH];

The counts show 5000 rows in the original table and 0 rows in the new table.

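Because SWITCH requires the source and target to have identical structures, a quick column-level comparison can catch copy-and-paste mistakes before the switch. A sketch; any rows returned indicate a mismatch:

--Sketch: columns that differ between the two tables (an empty result means the columns match)
SELECT c.name, c.system_type_id, c.max_length, c.precision, c.scale, c.is_nullable, c.is_identity
FROM sys.columns AS c
WHERE c.object_id = OBJECT_ID(N'dbo.Test1')
EXCEPT
SELECT c.name, c.system_type_id, c.max_length, c.precision, c.scale, c.is_nullable, c.is_identity
FROM sys.columns AS c
WHERE c.object_id = OBJECT_ID(N'dbo.Test1_SWITCH');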

Step 4: The SWITCH!

Run the Alter Table SWITCH command and check the counts.

--4. Switch the data to a new table
ALTER TABLE [dbo].[Test1] SWITCH TO [dbo].[Test1_SWITCH];
 
SELECT COUNT(*) AS 'Test1' FROM [dbo].[Test1];
SELECT COUNT(*) AS 'Test1_SWITCH' FROM [dbo].[Test1_SWITCH];

Voilà! The records quickly moved to the new table and the original table is now empty!


Step 5: Move Back!

Now we can move the small set of records back to the original table. In this case the table has an identity column, so we must use IDENTITY_INSERT to preserve the original values. This may not apply in your real-world scenario.

--5. Now we can load the small subset of data back to the table.
SET NOCOUNT ON;

SET IDENTITY_INSERT [dbo].[Test1] ON;
INSERT dbo.Test1 (Col1_ID, Col2_D, Col3_C, Col4_T)
SELECT Col1_ID, Col2_D, Col3_C, Col4_T FROM [dbo].[Test1_SWITCH] WHERE Col1_ID <= 1000;
SET IDENTITY_INSERT [dbo].[Test1] OFF;
 
SELECT COUNT(*) AS 'Test1' FROM [dbo].[Test1];
SELECT COUNT(*) AS 'Test1_SWITCH' FROM [dbo].[Test1_SWITCH];

The results show 1000 rows back in the original table.

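One additional check, which is my own suggestion rather than part of the original steps: because the rows were re-inserted with IDENTITY_INSERT, confirm the identity seed on the original table is still ahead of the retained values so future inserts do not collide.

--Sketch: report the current identity value without changing it
DBCC CHECKIDENT ('dbo.Test1', NORESEED);
 
--If needed, reseed to the highest retained value (hypothetical value shown)
--DBCC CHECKIDENT ('dbo.Test1', RESEED, 1000);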

Step 6: Eliminate the Original Data!

Let's drop the switch table, which is no longer needed. Note that the DROP writes far less to the transaction log than deleting the same rows in bulk!

--6. Finally drop the newly created switch table. 
DROP TABLE [dbo].[Test1_SWITCH];
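
If you want to measure that difference yourself, one option is to capture transaction log usage before and after the purge. A sketch using a standard DMV (available in SQL Server 2012 and later), run in the context of the database being purged:

--Sketch: current transaction log usage for the current database
SELECT total_log_size_in_bytes,
       used_log_space_in_bytes,
       used_log_space_in_percent
FROM sys.dm_db_log_space_usage;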

Last: Clean Up the Demo

For this demo we are done with the original table, so go ahead and delete it.

--X. Clean Up the Demo
DROP TABLE [dbo].[Test1];

Wrap up of how to delete millions of records in SQL Server

This concludes the demo. Obviously, your production situation will be significantly larger than this 5000-row demo. If you find yourself in a situation where you need to purge a very large table, consider this technique to optimize the run time and minimize transaction log growth. Remember to target off-peak hours to avoid contention, and be sure to test in a lab prior to running in production. Most important, be sure you have a recoverable backup before running any data purge in production!


About the author

Jim Evans is an IT Manager with Stericycle who has managed DBA, BI Developer, and Data Management teams over the past 10 years.

