How to minimize page splits in SQL Server to improve database performance?
Page Splits:
A page is 8 KB of data, which can be index related, data related, large object binary (LOB) data, and so on.
When you insert rows into a table, they go onto a page, into 'slots'. Your row has a row length, and you can fit only so many rows on the 8 KB page. What happens when that row's length increases because you entered a bigger product name in your varchar column, for instance? SQL Server needs to move the other rows along to make room for your modification. If the combined new length of all the rows on the page no longer fits on that page, SQL Server grabs a new page and moves the rows to the right or left of your modification onto it; that is called a 'page split'.
Page splits also arise when a record must be inserted, in key order, between existing records on a full page. Suppose a new record (Martin) is being inserted between Adam and Rony. Since there is no room on this page, some records need to shift around, and the page split occurs when the records at the end of the page (Irene's record, in this example) move to a second page.
This creates page fragmentation, which is very bad for performance, and it is also reported as a page split.
Page splits are considered very bad for performance, and there are a number of techniques to reduce, or even eliminate, the risk of page splits.
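For instance, the row-growth case described above can be reproduced with a small sketch. This is illustrative only and is not part of the tracking example that follows; the table dbo.SplitDemo and the row sizes are assumptions chosen so that ten rows almost fill one 8 KB page.
-- Illustration: fill one page almost completely, then grow a row so the page can no longer hold all of its rows
CREATE TABLE dbo.SplitDemo
( ID INT NOT NULL PRIMARY KEY,
  Payload VARCHAR(8000) NOT NULL);
GO
-- Ten rows of roughly 800 bytes each nearly fill a single 8 KB page
INSERT INTO dbo.SplitDemo (ID, Payload)
SELECT n, REPLICATE('a', 780)
FROM (VALUES (1),(2),(3),(4),(5),(6),(7),(8),(9),(10)) AS v(n);
GO
-- Growing one row beyond the remaining free space forces SQL Server to split the page
UPDATE dbo.SplitDemo SET Payload = REPLICATE('a', 2000) WHERE ID = 5;
GO
-- leaf_allocation_count counts leaf-level page allocations, which correspond to page splits
SELECT index_id, leaf_allocation_count
FROM sys.dm_db_index_operational_stats(DB_ID(), OBJECT_ID('dbo.SplitDemo'), NULL, NULL);
GO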
Example code for tracking page splits:
We can find the bad page splits using the event sqlserver.transaction_log. This event monitors all activity in the transaction log, so we need to use it with caution. We can filter the 'operation' field looking for the value 11, which means LOP_DELETE_SPLIT. This is the deletion of rows that happens when SQL Server moves rows from one page to another during a page split, i.e. a bad page split.
Extended Events for SQL Server provides a generic tracing and troubleshooting framework that allows deeper and more granular control of tracing than was possible with earlier methods such as DBCC, SQL Trace, Profiler, etc. Those earlier methods still exist, and Extended Events is not a replacement for them.
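As a quick orientation (this query is an aside, not part of the tracking session itself), the catalog of available Extended Events objects can be browsed through sys.dm_xe_objects, for example to confirm that the transaction_log event and the histogram target used below exist:
-- List the Extended Events objects used in this article, with their owning package
SELECT p.name AS package_name, o.name AS object_name, o.object_type, o.description
FROM sys.dm_xe_objects AS o
JOIN sys.dm_xe_packages AS p
    ON o.package_guid = p.guid
WHERE o.name IN ('transaction_log', 'histogram');
GO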
To do this, we need to create the event session with T-SQL. The code to create the session is as follows:
IF EXISTS (SELECT 1
           FROM sys.server_event_sessions
           WHERE name = 'PageSplits_Tracker')
    DROP EVENT SESSION [PageSplits_Tracker] ON SERVER;
CREATE EVENT SESSION PageSplits_Tracker
ON SERVER
ADD EVENT sqlserver.transaction_log(
    WHERE operation = 11 -- LOP_DELETE_SPLIT
)
--Description for transaction_log event is: “Occurs when a record is added to the SQL Server transaction log.
--This is a very high volume event that will affect the performance of the server. Therefore, you should use
--appropriate filtering to reduce the number of events, and only use this event for targeted troubleshooting
--during a short time period.”
-- LOP_DELETE_SPLIT : A page split has occurred. Rows have moved physically.
ADD TARGET package0.histogram(
    SET filtering_event_name = 'sqlserver.transaction_log',
        source_type = 0, source = 'database_id');
GO
--package0.histogram : You can use the histogram target to troubleshoot performance issues.
--filtering_event_name : Any event present in the Extended Events session.
--source_type : The type of object that the bucket is based on.
--0 for an event
--1 for an action
--source : The event column or action name that is used as the data source.
-- Start the Event Session
ALTER EVENT SESSION PageSplits_Tracker
ON SERVER
STATE=START;
GO
-- Create the database
CREATE DATABASE Performance_Tracker
GO
USE [Performance_Tracker]
GO
-- Create a bad splitting clustered index table
CREATE TABLE PageSplits
( ROWID UNIQUEIDENTIFIER NOT NULL DEFAULT NEWID() PRIMARY KEY,
Data INT NOT NULL DEFAULT (RAND()*1000),
Change_Date DATETIME2 NOT NULL DEFAULT CURRENT_TIMESTAMP);
GO
-- This index should mid-split based on the DEFAULT column value
CREATE INDEX IX_PageSplitsPk_Data ON PageSplits (Data);
GO
-- This index should end-split based on the DEFAULT column value
CREATE INDEX IX_PageSplitsPk_ChangeDate ON PageSplits (Change_Date);
GO
-- Create a table with an increasing clustered index
CREATE TABLE PageSplits_Index
( ROWID INT IDENTITY NOT NULL PRIMARY KEY,
Data INT NOT NULL DEFAULT (RAND()*1000),
Change_Date DATETIME2 NOT NULL DEFAULT DATEADD(mi, RAND()*-1000, CURRENT_TIMESTAMP))
GO
-- This index should mid-split based on the DEFAULT column value
CREATE INDEX IX_PageSplits_Index_ChangeDate ON PageSplits_Index (Change_Date);
GO
-- Insert the default values repeatedly into the tables (run this workload in a separate session; cancel the query to stop it)
WHILE 1=1
BEGIN
INSERT INTO PageSplits DEFAULT VALUES;
INSERT INTO PageSplits_Index DEFAULT VALUES;
WAITFOR DELAY '00:00:00.005';
END
GO
--If we start up this workload and allow it to run for a couple of minutes, we can then query the histogram target
--for our session to find the database where the mid-page splits are occurring.
-- Query the target data to identify the worst splitting database_id
WITH cte AS
(
    SELECT
        n.value('(value)[1]', 'int') AS database_id,
        DB_NAME(n.value('(value)[1]', 'int')) AS database_name,
        n.value('(@count)[1]', 'bigint') AS split_count
    FROM
        (SELECT CAST(target_data AS XML) AS target_data
         FROM sys.dm_xe_sessions AS s
         JOIN sys.dm_xe_session_targets AS t
             ON s.address = t.event_session_address
         WHERE s.name = 'PageSplits_Tracker'
           AND t.target_name = 'histogram') AS tab
        CROSS APPLY target_data.nodes('HistogramTarget/Slot') AS q(n)
)
SELECT * FROM cte;
database_id | database_name | split_count |
16 | Performance_Tracker | 123 |
--With the database_id of the worst splitting database, we can then change our event session configuration
--to look only at this database, and change our histogram target configuration to bucket on alloc_unit_id
--so that we can track down the worst splitting indexes in that database.
-- Drop the Event Session so we can recreate it
-- to focus on the highest splitting database
DROP EVENT SESSION [PageSplits_Tracker]
ON SERVER
-- Create the Event Session to track LOP_DELETE_SPLIT transaction_log operations in the server
CREATE EVENT SESSION [PageSplits_Tracker]
ON SERVER
ADD EVENT sqlserver.transaction_log(
    WHERE operation = 11 -- LOP_DELETE_SPLIT
      AND database_id = 16 -- CHANGE THIS BASED ON TOP SPLITTING DATABASE!
)
ADD TARGET package0.histogram(
    SET filtering_event_name = 'sqlserver.transaction_log',
        source_type = 0, -- Event Column
        source = 'alloc_unit_id');
GO
-- Start the Event Session Again
ALTER EVENT SESSION [PageSplits_Tracker]
ON SERVER
STATE=START;
GO
--With the new event session definition, we can now rerun our problematic workload for a period of more than 10 minutes
--and look at the worst splitting indexes based on the alloc_unit_ids that are in the histogram target:
WHILE 1=1
BEGIN
INSERT INTO PageSplits DEFAULT VALUES;
INSERT INTO PageSplits_Index DEFAULT VALUES;
WAITFOR DELAY '00:00:00.005';
END
GO
-- Query Target Data to get the top splitting objects in the database:
SELECT
    o.name AS table_name,
    i.name AS index_name,
    tab.split_count,
    indexstats.index_type_desc AS IndexType,
    indexstats.avg_fragmentation_in_percent,
    i.fill_factor
FROM (SELECT
          n.value('(value)[1]', 'bigint') AS alloc_unit_id,
          n.value('(@count)[1]', 'bigint') AS split_count
      FROM
          (SELECT CAST(target_data AS XML) AS target_data
           FROM sys.dm_xe_sessions AS s
           JOIN sys.dm_xe_session_targets AS t
               ON s.address = t.event_session_address
           WHERE s.name = 'PageSplits_Tracker'
             AND t.target_name = 'histogram') AS tab
          CROSS APPLY target_data.nodes('HistogramTarget/Slot') AS q(n)
     ) AS tab
JOIN sys.allocation_units AS au
    ON tab.alloc_unit_id = au.allocation_unit_id
JOIN sys.partitions AS p
    ON au.container_id = p.partition_id
JOIN sys.indexes AS i
    ON p.object_id = i.object_id
    AND p.index_id = i.index_id
JOIN sys.objects AS o
    ON p.object_id = o.object_id
JOIN sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, NULL) AS indexstats
    ON i.object_id = indexstats.object_id
    AND i.index_id = indexstats.index_id
WHERE o.is_ms_shipped = 0
ORDER BY indexstats.avg_fragmentation_in_percent DESC;
table_name | index_name | split_count | IndexType | avg_fragmentation_in_percent | fill_factor |
PageSplits_Index | IX_PageSplits_Index_ChangeDate | 286 | NONCLUSTERED INDEX | 99.57894737 | 0 |
PageSplits | PK__PageSpli__97BD02EBEA21A6BC | 566 | CLUSTERED INDEX | 99.37238494 | 0 |
PageSplits | IX_PageSplitsPk_Data | 341 | NONCLUSTERED INDEX | 98.98989899 | 0 |
PageSplits | IX_PageSplitsPk_ChangeDate | 3 | NONCLUSTERED INDEX | 1.747572816 | 0 |
--With this information we can now go back and change our FillFactor specifications and retest/monitor the impact
-- to determine whether we’ve had the appropriate reduction in mid-page splits to accommodate the time between
-- our index rebuild operations:
-- Change FillFactor based on split occurrences to minimize page splits
Using fill factor, we can minimize page splits:
Fill factor: when an index is created with a fill factor percentage, a percentage of each index page is left free after the index is created, rebuilt, or reorganized. This free space is used to absorb new rows as page splits occur, reducing the chance that a page split in a data page causes a page split in the index structure as well. But even with a fill factor that leaves 10% to 20% of each page free, index pages eventually fill up and are split the same way that a data page is split.
A page is the basic unit of data storage in SQL Server. Its size is 8 KB (8,192 bytes). Data is stored in the leaf-level pages of an index. The fill factor decides the percentage of space to be filled with data in a leaf-level page; the remaining space is left free for future growth of data in the page. Fill factor is a number from 1 to 100. Its default value is 0, which is the same as 100. So a fill factor of 70 means 70% of the space is filled with data and the remaining 30% is left vacant for future use. The higher the fill factor, the more data is stored in the page. The fill factor setting is applied when we create or rebuild an index.
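Fill factor can also be specified when an index is first created, not only on a rebuild. A hedged sketch (the table and index names here are illustrative, not objects from this walkthrough):
-- Leave 30% free space in the leaf-level pages when the index is first built
CREATE INDEX IX_Example_Data ON dbo.ExampleTable (Data)
WITH (FILLFACTOR = 70);
GO
The rebuilds below apply new fill factor values to the existing indexes from the example: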
ALTER INDEX PK__PageSpli__97BD02EBEA21A6BC ON PageSplits REBUILD WITH (FILLFACTOR=70)
ALTER INDEX IX_PageSplitsPk_Data ON PageSplits REBUILD WITH (FILLFACTOR=70)
ALTER INDEX IX_PageSplits_Index_ChangeDate ON PageSplits_Index REBUILD WITH (FILLFACTOR=80)
GO
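Before re-running the workload, we can confirm that the new settings are in place by checking the fill_factor column of sys.indexes (a small verification query, not part of the original monitoring code):
-- Check the fill factor now recorded for the rebuilt indexes
SELECT o.name AS table_name, i.name AS index_name, i.fill_factor
FROM sys.indexes AS i
JOIN sys.objects AS o
    ON i.object_id = o.object_id
WHERE o.name IN ('PageSplits', 'PageSplits_Index');
GO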
-- Stop the Event Session to clear the target
ALTER EVENT SESSION [PageSplits_Tracker]
ON SERVER
STATE=STOP;
GO
-- Start the Event Session Again
ALTER EVENT SESSION [PageSplits_Tracker]
ON SERVER
STATE=START;
GO
--Do the workload once again
WHILE 1=1
BEGIN
INSERT INTO PageSplits DEFAULT VALUES;
INSERT INTO PageSplits_Index DEFAULT VALUES;
WAITFOR DELAY '00:00:00.005';
END
GO
--With the reset performed, we can again start up our workload generation and
--begin monitoring the effect of the FillFactor specifications on the indexes with our code.
--After another 2-minute period, the following splits were noted.
--Once again Query Target Data to get the top splitting objects in the database:
--At present, no page splits are found in the indexes IX_PageSplitsPk_ChangeDate, PK__PageSpli__97BD02EBEA21A6BC, and IX_PageSplitsPk_Data.
table_name | index_name | split_count | IndexType |
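Once monitoring is finished, the Extended Events session (and, for this walkthrough, the test database) can be removed. A minimal clean-up sketch:
-- Stop and drop the tracking session when it is no longer needed
ALTER EVENT SESSION [PageSplits_Tracker] ON SERVER STATE = STOP;
DROP EVENT SESSION [PageSplits_Tracker] ON SERVER;
GO
-- Optionally remove the demo database created earlier
USE master;
GO
DROP DATABASE Performance_Tracker;
GO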