“Keep on rollin’ my old buddy; you’re movin’ way too slow.” Jack Straw may have hailed from Wichita, but I can assure you he never had to deal with duplicate indexes slowing down his production environment.
We’re all aware of the perils of insufficient indexes in our SQL Server environment. One of the first truisms we are taught as DBAs is to avoid the dreaded index or table scans in our query plans. That’s a topic covered a billion times before; not sure there’s anything new I can bring to that conversation.
And while there are plenty of scripts bouncing around the blogs on how to discover duplicate indexes; many are oriented towards SS2005 or 2008; or worse yet, use a cursor to find these duplicates. Since I currently work in the stone age with SQL Server 2000, I wanted a solution that would accomplish it’s goal without using dmv’s or cursors.
The script below is a mish-mosh of the work of others (specially Merrill Aldrich) with some changes I’ve included to make it easier to use in my environment. The script first, then a couple comments on how I use it…
SCRIPT TO FIND DUPLICATE INDEXES
CREATE VIEW vw_index_list AS
SELECT tbl.[name] AS TableName,
idx.[name] AS IndexName,
INDEX_COL( tbl.[name], idx.indid, 1 ) AS col1,
INDEX_COL( tbl.[name], idx.indid, 2 ) AS col2,
INDEX_COL( tbl.[name], idx.indid, 3 ) AS col3,
INDEX_COL( tbl.[name], idx.indid, 4 ) AS col4,
INDEX_COL( tbl.[name], idx.indid, 5 ) AS col5,
INDEX_COL( tbl.[name], idx.indid, 6 ) AS col6,
INDEX_COL( tbl.[name], idx.indid, 7 ) AS col7,
INDEX_COL( tbl.[name], idx.indid, 8 ) AS col8,
INDEX_COL( tbl.[name], idx.indid, 9 ) AS col9,
INDEX_COL( tbl.[name], idx.indid, 10 ) AS col10,
INDEX_COL( tbl.[name], idx.indid, 11 ) AS col11,
INDEX_COL( tbl.[name], idx.indid, 12 ) AS col12,
INDEX_COL( tbl.[name], idx.indid, 13 ) AS col13,
INDEX_COL( tbl.[name], idx.indid, 14 ) AS col14,
INDEX_COL( tbl.[name], idx.indid, 15 ) AS col15,
INDEX_COL( tbl.[name], idx.indid, 16 ) AS col16,
FROM SYSINDEXES idx
INNER JOIN SYSOBJECTS tbl ON idx.[id] = tbl.[id]
WHERE indid > 0
AND INDEXPROPERTY( tbl.[id], idx.[name], ‘IsStatistics’) = 0
DECLARE @db SYSNAME, @srv SYSNAME
SET @db = DB_NAME()
SET @srv = @@SERVERNAME
l2.indexname AS overlappingIndex,
l1.col1, l1.col2, l1.col3,
l1.col4, l1.col5, l1.col6,
l1.col7, l1.col8, l1.col9,
l1.col10, l1.col11, l1.col12,
l1.col13, l1.col14, l1.col15,
‘SELECT INDEXPROPERTY(‘ + CONVERT(varchar(20),l1.tableID) + ‘, ”’ + l1.indexname + ”’, ”IsClustered”)’
INNER JOIN vw_index_list_TG l2 ON l1.tablename = l2.tablename
AND l1.indexname <> l2.indexname
AND l1.col1 = l2.col1
AND (l1.col2 IS NULL OR l2.col2 IS NULL OR l1.col2 = l2.col2)
AND (l1.col3 IS NULL OR l2.col3 IS NULL OR l1.col3 = l2.col3)
AND (l1.col4 IS NULL OR l2.col4 IS NULL OR l1.col4 = l2.col4)
AND (l1.col5 IS NULL OR l2.col5 IS NULL OR l1.col5 = l2.col5)
AND (l1.col6 IS NULL OR l2.col6 IS NULL OR l1.col6 = l2.col6)
AND (l1.col7 IS NULL OR l2.col7 IS NULL OR l1.col7 = l2.col7)
AND (l1.col8 IS NULL OR l2.col8 IS NULL OR l1.col8 = l2.col8)
AND (l1.col9 IS NULL OR l2.col9 IS NULL OR l1.col9 = l2.col9)
AND (l1.col10 IS NULL OR l2.col10 IS NULL OR l1.col10 = l2.col10)
AND (l1.col11 IS NULL OR l2.col11 IS NULL OR l1.col11 = l2.col11)
AND (l1.col12 IS NULL OR l2.col12 IS NULL OR l1.col12 = l2.col12)
AND (l1.col13 IS NULL OR l2.col13 IS NULL OR l1.col13 = l2.col13)
AND (l1.col14 IS NULL OR l2.col14 IS NULL OR l1.col14 = l2.col14)
AND (l1.col15 IS NULL OR l2.col15 IS NULL OR l1.col15 = l2.col15)
AND (l1.col16 IS NULL OR l2.col16 IS NULL OR l1.col16 = l2.col16)
ORDER BY l1.tablename,
DROP VIEW vw_index_list
Upon running this, you have a couple options on what to do with the output. I personally prefer to copy and paste the QA results window into Excel for further analysis. It’s for this reason specifically that the script identifies the server and database name in the resultset. This may be your only option if you operate in a disjointed environment where servers cannot reach each other through common linking.
On the other hand, you may very easily convert the last SELECT statement above into a SELECT INTO type statement and pour the data in a table. This DevGuru article is where I turn for the specific syntax when I have a SELECT INTO brainf@rt.
A few very important points to consider when analyzing the results…
- For each pair, the index with more keys generally will encapsulate the functionality of the smaller one. A gerneral rule would be removing the “shorter” of the two.
- Be very hesitant to remove any index that is clustered. This is due to the fact that CI’s actually determine the physical order of the data on the disk. For this reason alone, removing them would most certainly slow things down. Use “INDEXPROPERTY(tableid,indexname,’IsClustered’)” to easily determine if you’re dealing with a clustered index.
- Be weary of any index query hints that stored procedures may use against these duplicates. The query is bound to break if a non-existent index is used.
- Until you’ve had a bit of practice looking over the results and deciding which indexes can actually be removed; it might be a very good idea to script out the index definition prior to removal. Better paranoid than sorry I always say. Later on down the line when you get a frantic request from your customer, saying their scripts or processes have slowed down or stopped working completely; you’ll be happy you did.
- Although I don’t see why it wouldn’t work on SQL Server 2008, I personally have only tested it on 2000 and 2005. Lest ye be warned.
This being my first *real* blog post, I’d love some feedback or comments.
Thanks and have a Grateful day…Troy