Archive for the ‘Index Tuning’ Category

Ola Hallengren’s Maintenance Solution Command Log Analysis

October 23rd, 2015

Yeah yeah yeah, it's been a while since I blogged last. I spent a glorious two years in New York City, but am very happily back in Jacksonville now. That whole story is a blog post in itself, so I won't get into it here. Suffice it to say, it's nice to be back home…full of renewed energy and raring to go. This is me…
WIDE AWAKE!!!
To say that a large number of shops use Ola Hallengren's maintenance solution is probably the understatement of the decade. It has proven itself over the years to be as thorough, easy to use, and flexible as any free, custom, or paid backup/index maintenance solution out there…and there are MANY out there, so that's saying a lot. The balance of this post assumes you're familiar with its workings, capabilities, and output. While there's a lot of verbiage out there in blogland about how to set up and use the routines, there isn't much to be found on how one might analyze the output. So, over the years I've put together a few scripts I'd love to share.

Within Ola's scripts, there exists the ability to log all activity to a table called "CommandLog". I highly recommend turning this option on for every part of the solution you employ. That way, the same table will contain rows for each backup taken (full, t-log, or diff), each index maintained (rebuilt or reorganized), each statistic updated, as well as each time you use the scripts to perform DBCC integrity checks. This table of metadata is extremely valuable for analyzing exactly what took place the night before, the last month, etc. For the most part, turning on the logging is simply a matter of including a "@LogToTable = 'Y'" parameter in the procedure call. Easy peasy.
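For reference, a typical IndexOptimize call with logging enabled looks something like this. The thresholds and actions shown are illustrative, not a recommendation, so check the parameter values against your own environment:

```sql
EXECUTE dbo.IndexOptimize
    @Databases = 'USER_DATABASES',
    @FragmentationLow = NULL,
    @FragmentationMedium = 'INDEX_REORGANIZE,INDEX_REBUILD_ONLINE,INDEX_REBUILD_OFFLINE',
    @FragmentationHigh = 'INDEX_REBUILD_ONLINE,INDEX_REBUILD_OFFLINE',
    @FragmentationLevel1 = 5,
    @FragmentationLevel2 = 30,
    @LogToTable = 'Y'   -- every operation lands in dbo.CommandLog
```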

One thing I really like to know is how many index maintenance operations were performed. It may be the case that normal business processes (combined with less-than-stellar schema) fragment your indexes on a regular basis and you don't have any control over it. This can happen, for instance, when you do a lot of writes against a clustering key that places new rows in one of the middle extents rather than at the end, causing page splits and natural fragmentation. As DBAs, it's our job to limit fragmentation as much as possible. Maybe we alter the cluster key. Maybe we alter the fill factor. Maybe we change the INSERT process to place new rows at the end. Whatever methodology you employ, you'd like to see some sort of reduction in the number of rebuilds and then, eventually, reorgs as well. This is what I use to track them…


SELECT COUNT(*) REORGS, CONVERT (DATE, c.starttime) [DATE]
FROM commandlog c
WHERE c.command like '%reorganize%'
  AND NOT EXISTS (SELECT 1 FROM IndexExceptions WHERE IndexName = c.IndexName)
GROUP BY CONVERT (DATE, c.starttime)
ORDER BY CONVERT (DATE, c.starttime) DESC

SELECT COUNT(*) REBUILDS, CONVERT (DATE, c.starttime) [DATE]
FROM commandlog c
WHERE c.command like '%rebuild%'
  AND NOT EXISTS (SELECT 1 FROM IndexExceptions WHERE IndexName = c.IndexName)
GROUP BY CONVERT (DATE, c.starttime)
ORDER BY CONVERT (DATE, c.starttime) DESC

The "IndexExceptions" table you see referenced in the 4th line of each SELECT is a small table I keep to exclude from my counts any indexes that I know for 100% sure I'm going to leave be, and don't want in my report. Its simple schema is…

CREATE TABLE [dbo].[IndexExceptions](
  [IndexExceptionID] [int] IDENTITY(1,1) NOT NULL,
  [IndexName] [varchar](200) NOT NULL,
  [Reason] [varchar](max) NULL
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
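Populating it is a one-liner; the index name and reason below are purely hypothetical:

```sql
INSERT INTO dbo.IndexExceptions (IndexName, Reason)
VALUES ('IX_SalesArchive_Legacy',   -- hypothetical index I never maintain
        'Read-mostly archive table; fragmentation here is harmless')
```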

When I want to analyze exactly how long my index maintenance operations are taking, I use the following…

SELECT commandtype Activity,
  CAST(CONVERT(CHAR(11), starttime, 113) AS DATETIME) Work_Date,
  COUNT(*) How_Many,
  AVG(DATEDIFF(ms,starttime,endtime)) Avg_ms,
  AVG(DATEDIFF(s,starttime,endtime)) Avg_Secs,
  AVG(DATEDIFF(mi,starttime,endtime)) Avg_Mins
FROM commandlog
WHERE commandtype NOT IN ('xp_create_subdir','xp_delete_file','RESTORE_VERIFYONLY','BACKUP_DATABASE')
GROUP BY commandtype
  ,CAST(CONVERT(CHAR(11), starttime, 113) AS DATETIME)
ORDER BY CommandType, CAST(CONVERT(CHAR(11), starttime, 113) AS DATETIME) DESC

…use the WHERE clause to change what you see and what you don’t.

This snippet will tell you how many times individual indexes were rebuilt/reorganized in a specific period. Adjust both parts of the WHERE clause as necessary…


SELECT DatabaseName, ObjectName, IndexName, COUNT(*) CommandCount
FROM dbo.CommandLog
WHERE commandtype = 'ALTER_INDEX'
  AND DATEDIFF(d,starttime,GETDATE()) < 14
GROUP BY DatabaseName, ObjectName, IndexName
ORDER BY CommandCount DESC
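One tweak worth considering: wrapping the column in DATEDIFF makes the predicate non-sargable, so SQL Server can't use an index on StartTime even if one exists. Moving the date math to the other side of the comparison covers effectively the same window (give or take the time-of-day boundary):

```sql
SELECT DatabaseName, ObjectName, IndexName, COUNT(*) CommandCount
FROM dbo.CommandLog
WHERE CommandType = 'ALTER_INDEX'
  AND StartTime >= DATEADD(DAY, -14, GETDATE())   -- sargable version of the 14-day filter
GROUP BY DatabaseName, ObjectName, IndexName
ORDER BY CommandCount DESC
```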

And this SELECT tells us important information about a single index, including its current fill factor (in case you want to lower it and, in theory, reduce splits and fragmentation), the current fragmentation percentage, and the size of the index. Again, change the WHERE clause as necessary…


SELECT starttime,
  objectname [table],
  indexname,
  i.fill_factor,
  CAST(CAST(extendedinfo as XML).query('data(/ExtendedInfo/Fragmentation)') as VARCHAR(10)) as fragmentation,
  CAST(CAST(extendedinfo as XML).query('data(/ExtendedInfo/PageCount)') as VARCHAR(10)) as [pagecount],
  extendedinfo,
  command
FROM commandlog c
JOIN sys.indexes i ON i.name = c.indexname
  AND i.[object_id] = OBJECT_ID(QUOTENAME(c.schemaname) + '.' + QUOTENAME(c.objectname)) -- index names aren't unique server-wide; match the table too
WHERE indexname = 'IndexNameHere'
 AND CommandType like '%alter%'
ORDER by databasename, objectname, c.starttime DESC
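If the query()-then-CAST gymnastics feel clumsy, the value() XML method (available on 2005+, same as the query() method used above) pulls the scalars out directly. A sketch against the same CommandLog columns:

```sql
SELECT StartTime,
       ObjectName AS [table],
       IndexName,
       CAST(ExtendedInfo AS XML).value('(/ExtendedInfo/Fragmentation)[1]', 'VARCHAR(10)') AS fragmentation,
       CAST(ExtendedInfo AS XML).value('(/ExtendedInfo/PageCount)[1]', 'INT') AS [pagecount]
FROM dbo.CommandLog
WHERE IndexName = 'IndexNameHere'
  AND CommandType = 'ALTER_INDEX'
ORDER BY StartTime DESC
```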

These are just some of the things you can do with the CommandLog table and the plethora of useful information it contains. Of course, feel free to use these scripts, adjust them, and let me know how it works for you.

Thanks and have a grateful day…Troy

Posted in Backups, Index Tuning, SSMS, T-SQL | Comments (0)

Missing Clustered Indexes w/ Row Counts

April 1st, 2010

This post is a follow-up to Thomas LaRock’s excellent article here, so first go read that. I’ll wait.

You back? Good.

As most of you know, tables missing clustered indexes (CIs) are fairly bad performance-wise, but why? Set aside the fact that SQL Azure actually requires a CI on each and every table in the cloud database, which may "trip you up" in the future. More importantly, SQL Server reads the data in a CI sequentially, one extent at a time. Brad McGehee states, "This makes it very easy for the disk subsystem to read the data quickly from disk, especially if there is a lot of data to be retrieved." Heaps cause the server to do so much extra work, due to the unordered nature of nonclustered indexes (or having no indexes at all), that performance is bound to suffer.

My goal here is to extend the functionality of Thomas’ little script to do two additional things which I find extremely helpful. One, have the script work all at once for every database on a particular server. Certainly this is not an issue for those of you in environments with one database per server, but I’d bet that’s more the exception than the rule for most of us.

Second, I find it very useful to have a row count included to help me decide exactly how to proceed with the data provided. While I don't disagree with the need to potentially trash any tables you find with 0 rows, that cleanup project is for another day. I'm concerned today with missing CIs on tables that truly need them. Unfortunately, row count is not included in either of the joined tables in Thomas' script. I solve that problem by creating a temp table filled from sp_spaceused and joining it to the original resultset.

A couple of caveats before checking out this script. First, since I work in an environment that's 95% SQL Server 2000, this script was written specifically for 8.0. I did test it on 2005 and it does work, but I'd bet a lot of coin there are much easier ways of doing this using DMVs and other newer features. Second, the script has a hard time with schemas other than 'dbo'. While I'm fully aware it could probably be changed to take the current schema into account (perhaps from the 'information_schema.tables' system view), I don't have the need in my environment, so honestly, I just never took the time. Third, don't hassle me for using cursors. I don't use them in production, but am not averse to using them carefully in maintenance scripts that get run "occasionally." I know all about the performance implications, but they're not really an issue for a script that takes less than a second to run…and on an ad-hoc basis at that.
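For the record, here's roughly what that DMV version looks like on 2005 and up. This is a sketch, run per database (on 2000, stick with the cursor script below):

```sql
-- Heaps have index_id = 0 in sys.indexes; row counts live in sys.partitions
SELECT OBJECT_NAME(i.[object_id]) AS tableName,
       SUM(p.[rows])              AS tableRows
FROM sys.indexes i
JOIN sys.partitions p
  ON p.[object_id] = i.[object_id] AND p.index_id = i.index_id
WHERE i.index_id = 0                                    -- no clustered index
  AND OBJECTPROPERTY(i.[object_id], 'IsUserTable') = 1  -- skip system objects
GROUP BY i.[object_id]
HAVING SUM(p.[rows]) > 0                                -- mirror the tableRows > 0 filter
ORDER BY tableRows DESC
```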

Thanks again to the SQLRockstar for inspiring me to clean this up and post it out there for all.

Have a Grateful day…Troy


USE master
DECLARE cur1 CURSOR FOR
  SELECT [name] FROM sysdatabases
  WHERE [name] NOT IN ('tempdb','master','msdb','model')
        
DECLARE @db VARCHAR(75)
DECLARE @sql1 NVARCHAR(3000)
OPEN cur1
FETCH NEXT FROM cur1 INTO @db
WHILE @@FETCH_STATUS <> -1
  BEGIN
    SET @sql1 = N'SET NOCOUNT ON

      -- CREATE TEMP TABLE TO HOLD ALL TABLE NAMES
      CREATE TABLE #so (dbName VARCHAR(75),
        tableName VARCHAR(75),
        tableRows INT)
    
      -- FILL FROM sysobjects
      INSERT INTO #so
      SELECT ''' + @db + ''' [db], name [table], 0
      FROM ' + @db + '.dbo.sysobjects
        INNER JOIN ' + @db + '.information_schema.tables ist ON ist.table_name = ' + @db + '.dbo.sysobjects.name
      WHERE xtype = ''u''
        AND id NOT IN
          (SELECT id FROM ' + @db + '.dbo.sysindexes WHERE indid = 1)
        AND ist.table_schema = ''dbo''
        AND ' + @db + '.dbo.sysobjects.name NOT IN (''cdata'',''cglobal'')

      -- CREATE CURSOR TO ITERATE TABLE NAMES AND RUN sp_spaceused AGAINST THEM
      DECLARE cur2 cursor for
        SELECT tableName FROM #so
      OPEN cur2
      DECLARE @tn VARCHAR(75)
      DECLARE @sql2 NVARCHAR(1000)
      FETCH NEXT FROM cur2 INTO @tn
      WHILE @@FETCH_STATUS <> -1
        BEGIN
          -- CREATE TEMP TABLE TO HOLD ROW AMOUNTS (AND OTHER STUFF WE DONT CARE ABOUT NOW)
          CREATE TABLE #su
            (tableName varchar(100),
                NOR varchar(100),
                RS varchar(50),
                DS varchar(50),
                IxS varchar(50),
                US varchar(50))
          
          -- FILL THE NEW TEMP TABLE
          SET @sql2 = N''INSERT INTO #su EXEC ' + @db + '.dbo.sp_spaceused '''''' + @tn + ''''''''
          EXEC sp_EXECUTESQL @sql2
            
          -- JOIN THE 2 TEMP TABLES TOGETHER TO UPDATE THE FIRST WITH THE ROW NUMBER COUNT
          UPDATE #so
          SET tableRows = su.NOR
          FROM #su su
            INNER JOIN #so so ON so.tableName = su.tableName
          WHERE ISNULL(su.NOR,0) <> 0
          
          -- CLEAN UP
          DROP TABLE #su

          -- ITERATE        
          FETCH NEXT FROM cur2 INTO @tn
        END
  
      -- CLEAN UP
      CLOSE cur2        
      DEALLOCATE cur2

      -- OUR OUTPUT!!
      SELECT * FROM #so
      WHERE tableRows > 0
                  ORDER BY tableRows DESC

      -- CLEAN UP
      DROP TABLE #so'
    
    -- EXECUTE THE ENTIRE BLOCK OF DYNAMIC SQL
    EXEC SP_EXECUTESQL @sql1  
    
    -- ITERATE
    FETCH NEXT FROM cur1 INTO @db
  END

-- CLEAN UP
CLOSE cur1
DEALLOCATE cur1

Posted in Index Tuning | Comments (1)

Finding Duplicate Indexes in SQL Server 2000

November 19th, 2009

“Keep on rollin’ my old buddy; you’re movin’ way too slow.”  Jack Straw may have hailed from Wichita, but I can assure you he never had to deal with duplicate indexes slowing down his production environment.

We’re all aware of the perils of insufficient indexes in our SQL Server environment. One of the first truisms we are taught as DBAs is to avoid the dreaded index or table scans in our query plans. That’s a topic covered a billion times before; not sure there’s anything new I can bring to that conversation.

And while there are plenty of scripts bouncing around the blogs on how to discover duplicate indexes, many are oriented towards SQL Server 2005 or 2008, or worse yet, use a cursor to find the duplicates.  Since I currently work in the stone age with SQL Server 2000, I wanted a solution that would accomplish its goal without using DMVs or cursors.

The script below is a mish-mosh of the work of others (especially Merrill Aldrich) with some changes I've included to make it easier to use in my environment.  The script first, then a couple of comments on how I use it…

/*
SCRIPT TO FIND DUPLICATE INDEXES
*/

CREATE VIEW vw_index_list AS
SELECT tbl.[name] AS TableName,
idx.[name] AS IndexName,
INDEX_COL( tbl.[name], idx.indid, 1 ) AS col1,
INDEX_COL( tbl.[name], idx.indid, 2 ) AS col2,
INDEX_COL( tbl.[name], idx.indid, 3 ) AS col3,
INDEX_COL( tbl.[name], idx.indid, 4 ) AS col4,
INDEX_COL( tbl.[name], idx.indid, 5 ) AS col5,
INDEX_COL( tbl.[name], idx.indid, 6 ) AS col6,
INDEX_COL( tbl.[name], idx.indid, 7 ) AS col7,
INDEX_COL( tbl.[name], idx.indid, 8 ) AS col8,
INDEX_COL( tbl.[name], idx.indid, 9 ) AS col9,
INDEX_COL( tbl.[name], idx.indid, 10 ) AS col10,
INDEX_COL( tbl.[name], idx.indid, 11 ) AS col11,
INDEX_COL( tbl.[name], idx.indid, 12 ) AS col12,
INDEX_COL( tbl.[name], idx.indid, 13 ) AS col13,
INDEX_COL( tbl.[name], idx.indid, 14 ) AS col14,
INDEX_COL( tbl.[name], idx.indid, 15 ) AS col15,
INDEX_COL( tbl.[name], idx.indid, 16 ) AS col16,
dpages,
used,
rowcnt,
idx.[id] tableID
FROM SYSINDEXES idx
INNER JOIN SYSOBJECTS tbl ON idx.[id] = tbl.[id]
WHERE indid > 0
AND INDEXPROPERTY( tbl.[id], idx.[name], 'IsStatistics') = 0

GO

DECLARE @db SYSNAME, @srv SYSNAME
SET @db = DB_NAME()
SET @srv = @@SERVERNAME

SELECT @db,
@srv,
l1.tableID,
l1.tablename,
l1.indexname,
l2.indexname AS overlappingIndex,
l1.col1, l1.col2, l1.col3,
l1.col4, l1.col5, l1.col6,
l1.col7, l1.col8, l1.col9,
l1.col10, l1.col11, l1.col12,
l1.col13, l1.col14, l1.col15,
l1.col16,
l1.dpages,
l1.used,
l1.rowcnt,
'SELECT INDEXPROPERTY(' + CONVERT(varchar(20),l1.tableID) + ', ''' + l1.indexname + ''', ''IsClustered'')'
FROM vw_index_list l1
INNER JOIN vw_index_list l2 ON l1.tablename = l2.tablename
AND l1.indexname <> l2.indexname
AND l1.col1 = l2.col1
AND (l1.col2 IS NULL OR l2.col2 IS NULL OR l1.col2 = l2.col2)
AND (l1.col3 IS NULL OR l2.col3 IS NULL OR l1.col3 = l2.col3)
AND (l1.col4 IS NULL OR l2.col4 IS NULL OR l1.col4 = l2.col4)
AND (l1.col5 IS NULL OR l2.col5 IS NULL OR l1.col5 = l2.col5)
AND (l1.col6 IS NULL OR l2.col6 IS NULL OR l1.col6 = l2.col6)
AND (l1.col7 IS NULL OR l2.col7 IS NULL OR l1.col7 = l2.col7)
AND (l1.col8 IS NULL OR l2.col8 IS NULL OR l1.col8 = l2.col8)
AND (l1.col9 IS NULL OR l2.col9 IS NULL OR l1.col9 = l2.col9)
AND (l1.col10 IS NULL OR l2.col10 IS NULL OR l1.col10 = l2.col10)
AND (l1.col11 IS NULL OR l2.col11 IS NULL OR l1.col11 = l2.col11)
AND (l1.col12 IS NULL OR l2.col12 IS NULL OR l1.col12 = l2.col12)
AND (l1.col13 IS NULL OR l2.col13 IS NULL OR l1.col13 = l2.col13)
AND (l1.col14 IS NULL OR l2.col14 IS NULL OR l1.col14 = l2.col14)
AND (l1.col15 IS NULL OR l2.col15 IS NULL OR l1.col15 = l2.col15)
AND (l1.col16 IS NULL OR l2.col16 IS NULL OR l1.col16 = l2.col16)
ORDER BY l1.tablename,
l1.indexname

DROP VIEW vw_index_list

Upon running this, you have a couple options on what to do with the output.  I personally prefer to copy and paste the QA results window into Excel for further analysis.  It’s for this reason specifically that the script identifies the server and database name in the resultset.  This may be your only option if you operate in a disjointed environment where servers cannot reach each other through common linking.

On the other hand, you may very easily convert the last SELECT statement above into a SELECT INTO statement and pour the data into a table.  This DevGuru article is where I turn for the specific syntax when I have a SELECT INTO brainf@rt.

A few very important points to consider when analyzing the results…

  • For each pair, the index with more keys will generally encapsulate the functionality of the smaller one.  A general rule would be removing the “shorter” of the two.
  • Be very hesitant to remove any index that is clustered, because the CI actually determines the physical order of the data on disk.  For this reason alone, removing it would most certainly slow things down.  Use “INDEXPROPERTY(tableid,indexname,'IsClustered')” to easily determine whether you’re dealing with a clustered index.
  • Be wary of any index query hints that stored procedures may use against these duplicates.  A query is bound to break if it hints a non-existent index.
  • Until you’ve had a bit of practice looking over the results and deciding which indexes can actually be removed; it might be a very good idea to script out the index definition prior to removal.  Better paranoid than sorry I always say. Later on down the line when you get a frantic request from your customer, saying their scripts or processes have slowed down or stopped working completely; you’ll be happy you did.
  • Although I don’t see why it wouldn’t work on SQL Server 2008, I personally have only tested it on 2000 and 2005.  Ye be warned.
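In that spirit, here’s a defensive wrapper I might use before dropping anything (table and index names are hypothetical; note the old table.index DROP INDEX syntax that 2000 requires):

```sql
IF INDEXPROPERTY(OBJECT_ID('dbo.Orders'), 'IX_Orders_Dup', 'IsClustered') = 0
    DROP INDEX dbo.Orders.IX_Orders_Dup  -- safe: a nonclustered duplicate
ELSE
    PRINT 'Clustered index -- leave it alone!'
```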

This being my first *real* blog post, I’d love some feedback or comments.

Thanks and have a Grateful day…Troy

Posted in Index Tuning | Comments (1)
