Archive for the ‘Index Tuning’ Category

Missing Clustered Indexes w/ Row Counts

April 1st, 2010

This post is a follow-up to Thomas LaRock’s excellent article here, so first go read that. I’ll wait.

You back? Good.

As most of you know tables missing clustered indexes (CI) are fairly bad performance-wise, but why? Even disregarding the fact that SQL Azure actually requires a CI for each and every table in the cloud database, and hence you may be “tripped up” on this in the future. More importantly, SQL Server sequentially reads the data in a CI one extent at a time. Brad McGehee states, “This makes it very easy for the disk subsystem to read the data quickly from disk, especially if there is a lot of data to be retrieved.” Heaps cause the server to do so much extra work due to unordered nature of NCI’s or having no indexes at all, that performance is bound to suffer.

My goal here is to extend the functionality of Thomas’ little script to do two additional things which I find extremely helpful. One, have the script work all at once for every database on a particular server. Certainly this is not an issue for those of you in environments with one database per server, but I’d bet that’s more the exception than the rule for most of us.

Secondly, I find it very useful to have a row count included to assist me with deciding on exactly how to proceed with the data provided. While I don’t disagree with the need to potentially trash any tables you find with 0 rows, that cleanup project is for another day. I’m concerned today with missing CI’s on tables that truly need them. Unfortunately, row count is not included in either of the joined tables in Thomas’ script. I solve that problem by creating a temp table filled from sp_spaceused and join it to the original resultset.

A couple caveats for you before checking out this script. Firstly, being that I work in an environment that’s 95% SQL Server 2000, this script was written specifically for 8.0. Now, I did indeed test it on 2005 and it does work, but I’d bet a lot of coin there are much easier ways of doing this using DMV’s and other new features. Secondly, the script has a hard time with alternate schemas besides ‘dbo’. While I’m fully aware the script can probably be changed to take current schema into account (perhaps from the ‘information_schema.tables’ system view), I don’t have the need to in my environment so honestly, I just never took the time to change the script to handle alternate schemas. Third, don’t hassle me for using cursors. I don’t use them in production, but am not averse to using them carefully in maintenance scripts that get run “occasionally.” I know all about the performance implications but not really an issue here for a script that takes less than a second to run…and on an ad-hoc basis at that.

Thanks again to the SQLRockstar for inspiring me to clean this up and post it out there for all.

Have a Grateful day…Troy


USE master
DECLARE cur1 CURSOR FOR
  SELECT [name] FROM sysdatabases
  WHERE [name] NOT IN ('tempdb','master','msdb','model')
        
DECLARE @db VARCHAR(75)
DECLARE @sql1 NVARCHAR(2000)
OPEN cur1
FETCH NEXT FROM cur1 INTO @db
WHILE @@FETCH_STATUS <> -1
  BEGIN
    SET @sql1 = N'SET NOCOUNT ON

      -- CREATE TEMP TABLE TO HOLD ALL TABLE NAMES
      CREATE TABLE #so (dbName VARCHAR(75),
        tableName VARCHAR(75),
        tableRows INT)
    
      -- FILL FROM sysobjects
      INSERT INTO #so
      SELECT ''' + @db + ''' [db], name [table], 0
      FROM ' + @db + '.dbo.sysobjects
        INNER JOIN ' + @db + '.information_schema.tables ist ON ist.table_name = ' + @db + '.dbo.sysobjects.name
      WHERE xtype = ''u''
        AND id NOT IN
          (SELECT id FROM ' + @db + '.dbo.sysindexes WHERE indid = 1)
        AND ist.table_schema = ''dbo''
        AND ' + @db + '.dbo.sysobjects.name NOT IN (''cdata'',''cglobal'')

      -- CREATE CURSOR TO ITERATE TABLE NAMES AND RUN sp_spaceused AGAINST THEM
      DECLARE cur2 cursor for
        SELECT tableName FROM #so
      OPEN cur2
      DECLARE @tn VARCHAR(75)
      DECLARE @sql2 NVARCHAR(1000)
      FETCH NEXT FROM cur2 INTO @tn
      WHILE @@FETCH_STATUS <> -1
        BEGIN
          -- CREATE TEMP TABLE TO HOLD ROW AMOUNTS (AND OTHER STUFF WE DON'T CARE ABOUT NOW)
          CREATE TABLE #su
            (tableName varchar(100),
                NOR varchar(100),
                RS varchar(50),
                DS varchar(50),
                IxS varchar(50),
                US varchar(50))
          
          -- FILL THE NEW TEMP TABLE
          SET @sql2 = N''INSERT INTO #su EXEC ' + @db + '.dbo.sp_spaceused '''''' + @tn + ''''''''
          EXEC sp_EXECUTESQL @sql2
            
          -- JOIN THE 2 TEMP TABLES TOGETHER TO UPDATE THE FIRST WITH THE ROW NUMBER COUNT
          UPDATE #so
          SET tableRows = su.NOR
          FROM #su su
            INNER JOIN #so so ON so.tableName = su.tableName
          WHERE ISNULL(su.NOR,0) <> 0
          
          -- CLEAN UP
          DROP TABLE #su

          -- ITERATE        
          FETCH NEXT FROM cur2 INTO @tn
        END
  
      -- CLEAN UP
      CLOSE cur2        
      DEALLOCATE cur2

      -- OUR OUTPUT!!
      SELECT * FROM #so
      WHERE tableRows > 0

      -- CLEAN UP
      DROP TABLE #so'
    
    -- EXECUTE THE ENTIRE BLOCK OF DYNAMIC SQL
    EXEC SP_EXECUTESQL @sql1  
    
    -- ITERATE
    FETCH NEXT FROM cur1 INTO @db
  END

-- CLEAN UP
CLOSE cur1
DEALLOCATE cur1

Tags: , ,
Posted in Index Tuning | Comments (2)

Finding Duplicate Indexes in SQL Server 2000

November 19th, 2009

“Keep on rollin’ my old buddy; you’re movin’ way too slow.”  Jack Straw may have hailed from Wichita, but I can assure you he never had to deal with duplicate indexes slowing down his production environment.

We’re all aware of the perils of insufficient indexes in our SQL Server environment. One of the first truisms we are taught as DBAs is to avoid the dreaded index or table scans in our query plans. That’s a topic covered a billion times before; not sure there’s anything new I can bring to that conversation.

And while there are plenty of scripts bouncing around the blogs on how to discover duplicate indexes; many are oriented towards SS2005 or 2008; or worse yet, use a cursor to find these duplicates.  Since I currently work in the stone age with SQL Server 2000, I wanted a solution that would accomplish it’s goal without using dmv’s or cursors.

The script below is a mish-mosh of the work of others (specially Merrill Aldrich) with some changes I’ve included to make it easier to use in my environment.  The script first, then a couple comments on how I use it…

/*
SCRIPT TO FIND DUPLICATE INDEXES
*/

CREATE VIEW vw_index_list AS
SELECT tbl.[name] AS TableName,
idx.[name] AS IndexName,
INDEX_COL( tbl.[name], idx.indid, 1 ) AS col1,
INDEX_COL( tbl.[name], idx.indid, 2 ) AS col2,
INDEX_COL( tbl.[name], idx.indid, 3 ) AS col3,
INDEX_COL( tbl.[name], idx.indid, 4 ) AS col4,
INDEX_COL( tbl.[name], idx.indid, 5 ) AS col5,
INDEX_COL( tbl.[name], idx.indid, 6 ) AS col6,
INDEX_COL( tbl.[name], idx.indid, 7 ) AS col7,
INDEX_COL( tbl.[name], idx.indid, 8 ) AS col8,
INDEX_COL( tbl.[name], idx.indid, 9 ) AS col9,
INDEX_COL( tbl.[name], idx.indid, 10 ) AS col10,
INDEX_COL( tbl.[name], idx.indid, 11 ) AS col11,
INDEX_COL( tbl.[name], idx.indid, 12 ) AS col12,
INDEX_COL( tbl.[name], idx.indid, 13 ) AS col13,
INDEX_COL( tbl.[name], idx.indid, 14 ) AS col14,
INDEX_COL( tbl.[name], idx.indid, 15 ) AS col15,
INDEX_COL( tbl.[name], idx.indid, 16 ) AS col16,
dpages,
used,
rowcnt,
idx.[id] tableID
FROM SYSINDEXES idx
INNER JOIN SYSOBJECTS tbl ON idx.[id] = tbl.[id]
WHERE indid > 0
AND INDEXPROPERTY( tbl.[id], idx.[name], ‘IsStatistics’) = 0

GO

DECLARE @db SYSNAME, @srv SYSNAME
SET @db = DB_NAME()
SET @srv = @@SERVERNAME

SELECT @db,
@srv,
l1.tableID,
l1.tablename,
l1.indexname,
l2.indexname AS overlappingIndex,
l1.col1, l1.col2, l1.col3,
l1.col4, l1.col5, l1.col6,
l1.col7, l1.col8, l1.col9,
l1.col10, l1.col11, l1.col12,
l1.col13, l1.col14, l1.col15,
l1.col16,
l1.dpages,
l1.used,
l1.rowcnt,
‘SELECT INDEXPROPERTY(‘ + CONVERT(varchar(20),l1.tableID) + ‘, ”’ + l1.indexname + ”’, ”IsClustered”)’
FROM vw_index_list
INNER JOIN vw_index_list_TG l2 ON l1.tablename = l2.tablename
AND l1.indexname <> l2.indexname
AND l1.col1 = l2.col1
AND (l1.col2 IS NULL OR l2.col2 IS NULL OR l1.col2 = l2.col2)
AND (l1.col3 IS NULL OR l2.col3 IS NULL OR l1.col3 = l2.col3)
AND (l1.col4 IS NULL OR l2.col4 IS NULL OR l1.col4 = l2.col4)
AND (l1.col5 IS NULL OR l2.col5 IS NULL OR l1.col5 = l2.col5)
AND (l1.col6 IS NULL OR l2.col6 IS NULL OR l1.col6 = l2.col6)
AND (l1.col7 IS NULL OR l2.col7 IS NULL OR l1.col7 = l2.col7)
AND (l1.col8 IS NULL OR l2.col8 IS NULL OR l1.col8 = l2.col8)
AND (l1.col9 IS NULL OR l2.col9 IS NULL OR l1.col9 = l2.col9)
AND (l1.col10 IS NULL OR l2.col10 IS NULL OR l1.col10 = l2.col10)
AND (l1.col11 IS NULL OR l2.col11 IS NULL OR l1.col11 = l2.col11)
AND (l1.col12 IS NULL OR l2.col12 IS NULL OR l1.col12 = l2.col12)
AND (l1.col13 IS NULL OR l2.col13 IS NULL OR l1.col13 = l2.col13)
AND (l1.col14 IS NULL OR l2.col14 IS NULL OR l1.col14 = l2.col14)
AND (l1.col15 IS NULL OR l2.col15 IS NULL OR l1.col15 = l2.col15)
AND (l1.col16 IS NULL OR l2.col16 IS NULL OR l1.col16 = l2.col16)
ORDER BY l1.tablename,
l1.indexname

DROP VIEW vw_index_list

Upon running this, you have a couple options on what to do with the output.  I personally prefer to copy and paste the QA results window into Excel for further analysis.  It’s for this reason specifically that the script identifies the server and database name in the resultset.  This may be your only option if you operate in a disjointed environment where servers cannot reach each other through common linking.

On the other hand, you may very easily convert the last SELECT statement above into a SELECT INTO type statement and pour the data in a table.  This DevGuru article is where I turn for the specific syntax when I have a SELECT INTO brainf@rt.

A few very important points to consider when analyzing the results…

  • For each pair, the index with more keys generally will encapsulate the functionality of the smaller one.  A gerneral rule would be removing the “shorter” of the two.
  • Be very hesitant to remove any index that is clustered.  This is due to the fact that CI’s actually determine the physical order of the data on the disk.  For this reason alone, removing them would most certainly slow things down.  Use “INDEXPROPERTY(tableid,indexname,’IsClustered’)” to easily determine if you’re dealing with a clustered index.
  • Be weary of any index query hints that stored procedures may use against these duplicates.  The query is bound to break if a non-existent index is used.
  • Until you’ve had a bit of practice looking over the results and deciding which indexes can actually be removed; it might be a very good idea to script out the index definition prior to removal.  Better paranoid than sorry I always say. Later on down the line when you get a frantic request from your customer, saying their scripts or processes have slowed down or stopped working completely; you’ll be happy you did.
  • Although I don’t see why it wouldn’t work on SQL Server 2008, I personally have only tested it on 2000 and 2005.  Lest ye be warned.

This being my first *real* blog post, I’d love some feedback or comments.

Thanks and have a Grateful day…Troy

Tags: , ,
Posted in Index Tuning | Comments (1)

Get Adobe Flash playerPlugin by wpburn.com wordpress themes