Posts Tagged ‘indexing’

Ola Hallengren’s Maintenance Solution Command Log Analysis

October 23rd, 2015

Yeah yeah yeah, it’s been a while since I blogged last. Spent a glorious two years in New York City, but am very happily back in Jacksonville now. That whole story is a blog post in itself, so I won’t get into it here. Suffice it to say, it’s nice to be back home…full of renewed energy and raring to go. This is me…
To say that a large number of shops use Ola Hallengren’s maintenance solution is probably the understatement of the decade. It has proven itself over the years to be as thorough, easy-to-use, and flexible as any free, custom, or paid backup/index maintenance solution out there…and there are MANY out there, so that’s saying a lot. The balance of this post assumes you’re familiar with its workings, capabilities, & output. While there’s a lot of verbiage out there in blogland about how to use and set up the routines, there isn’t much to be found on how one might want to analyze the output. So I’ve put together a few scripts over the years that I’d love to share.

Within Ola’s scripts, there exists the ability to log all activity to a table called “CommandLog”. I highly recommend turning this option on for every part of the solution you employ. This means the same table will contain rows for each backup (full, t-log, or diff) taken, each index maintained (rebuilt or reorganized), and each statistic updated, as well as for each time you use the scripts to perform DBCC integrity checks. This table of metadata is extremely valuable in analyzing exactly what took place the night before, over the last month, etc. For the most part, turning on the logging is simply a matter of including a “@LogToTable = 'Y'” parameter in the procedure call. Easy peasy.
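For instance, a full-backup job step might call the solution like this. (A sketch only; the directory, retention window, and database list here are assumptions you’d tailor to your own shop.)

```sql
-- Hypothetical nightly full-backup step; @LogToTable = 'Y' writes one row
-- per backup command into dbo.CommandLog
EXECUTE dbo.DatabaseBackup
  @Databases = 'USER_DATABASES',
  @Directory = 'N:\Backup',   -- placeholder backup share
  @BackupType = 'FULL',
  @CleanupTime = 72,          -- hours to keep old backup files
  @LogToTable = 'Y'
```

The same parameter works on DatabaseIntegrityCheck and IndexOptimize, so every piece of the solution can feed the one CommandLog table.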

One thing I really like to know is how many index maintenance operations were performed. It may be the case that normal business processes (combined with less-than-stellar schema) fragment your indexes on a regular basis and you don’t have any control over it. This might be so, for instance, when you do a lot of writes against a clustering key that places new rows in one of the middle extents rather than at the end, causing page splits and natural fragmentation. As DBAs, it’s our job to limit fragmentation as much as possible. Maybe we alter the cluster key. Maybe we alter the fill factor. Maybe we change the INSERT process to place new rows at the end. Whatever methodology you employ, you’d like to see some sort of reduction in the number of rebuilds and then, eventually, reorgs as well. This is what I use to track them…

SELECT CONVERT(DATE, c.StartTime) Reorg_Date, COUNT(*) Reorg_Count
FROM CommandLog c
WHERE c.Command LIKE '%REORGANIZE%'
  AND NOT EXISTS (SELECT 1 FROM IndexExceptions WHERE IndexName = c.IndexName)
GROUP BY CONVERT(DATE, c.StartTime)

SELECT CONVERT(DATE, c.StartTime) Rebuild_Date, COUNT(*) Rebuild_Count
FROM CommandLog c
WHERE c.Command LIKE '%REBUILD%'
  AND NOT EXISTS (SELECT 1 FROM IndexExceptions WHERE IndexName = c.IndexName)
GROUP BY CONVERT(DATE, c.StartTime)

The “IndexExceptions” table referenced in the NOT EXISTS predicate of each SELECT is a small table I keep to exclude from my counts any indexes that I know for 100% sure I’m going to leave be; I don’t want them in my report. Its simple schema is…

CREATE TABLE [dbo].[IndexExceptions](
  [IndexExceptionID] [int] IDENTITY(1,1) NOT NULL,
  [IndexName] [varchar](200) NOT NULL,
  [Reason] [varchar](max) NULL
)
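Populating it is just one INSERT per index you’ve decided to leave alone. A quick illustration (the index name and reason here are made up):

```sql
-- Hypothetical exception: an index we consciously let stay fragmented
INSERT INTO dbo.IndexExceptions (IndexName, Reason)
VALUES ('IX_Orders_LegacyImport',
        'Churns nightly by design; excluded from daily rebuild counts')
```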

When I want to analyze exactly how long my index maintenance operations are taking, I use the following…

SELECT commandtype Activity,
  CAST(CONVERT(CHAR(11), starttime, 113) AS DATETIME) Work_Date,
  COUNT(*) How_Many,
  AVG(DATEDIFF(ms,starttime,endtime)) Avg_ms,
  AVG(DATEDIFF(s,starttime,endtime)) Avg_Secs,
  AVG(DATEDIFF(mi,starttime,endtime)) Avg_Mins
FROM commandlog
WHERE commandtype NOT IN ('xp_create_subdir','xp_delete_file','RESTORE_VERIFYONLY','BACKUP_DATABASE')
GROUP BY commandtype
  ,CAST(CONVERT(CHAR(11), starttime, 113) AS DATETIME)
ORDER BY CommandType, CAST(CONVERT(CHAR(11), starttime, 113) AS DATETIME) DESC

…use the WHERE clause to change what you see and what you don’t.

This snippet will tell you how many times individual indexes were rebuilt/reorg’d in a specific period. Adjust both parts of the WHERE clause as necessary…

SELECT DatabaseName, ObjectName, IndexName, COUNT(*) CommandCount
FROM dbo.CommandLog
WHERE commandtype = 'ALTER_INDEX'
  AND DATEDIFF(d,starttime,GETDATE()) < 14
GROUP BY DatabaseName, ObjectName, IndexName
ORDER BY CommandCount DESC

And this SELECT tells us important information about a single index, including its current fill factor (in case you want to lower it and, in theory, reduce splits and fragmentation), the current fragmentation percentage, and the size of the index. Again, change the WHERE clause as necessary…

SELECT c.starttime,
  c.objectname [table],
  CAST(CAST(c.extendedinfo AS XML).query('data(/ExtendedInfo/Fragmentation)') AS VARCHAR(10)) AS fragmentation,
  CAST(CAST(c.extendedinfo AS XML).query('data(/ExtendedInfo/PageCount)') AS VARCHAR(10)) AS [pagecount],
  i.fill_factor
FROM commandlog c
JOIN sys.indexes i ON = c.indexname
WHERE c.indexname = 'IndexNameHere'
  AND c.commandtype LIKE '%alter%'
ORDER BY c.databasename, c.objectname, c.starttime DESC
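And if the output convinces you to lower a fill factor, the change itself is small. A sketch (the object names are placeholders, and 90 is just an example value; test before settling on a number):

```sql
-- Rebuild the index leaving ~10% free space per page to absorb new rows
ALTER INDEX IX_IndexNameHere ON dbo.TableNameHere
REBUILD WITH (FILLFACTOR = 90);
```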

These are just some of the things you can do with the CommandLog table and the plethora of useful information it contains. Of course, feel free to use these scripts, adjust them, and let me know how it works for you.

Thanks and have a grateful day…Troy

Posted in Backups, Index Tuning, SSMS, T-SQL | Comments (0)

Missing Clustered Indexes w/ Row Counts

April 1st, 2010

This post is a follow-up to Thomas LaRock’s excellent article here, so first go read that. I’ll wait.

You back? Good.

As most of you know, tables missing clustered indexes (CI) are fairly bad performance-wise, but why? Set aside the fact that SQL Azure actually requires a CI for each and every table in the cloud database, and hence you may be “tripped up” on this in the future. More importantly, SQL Server sequentially reads the data in a CI one extent at a time. Brad McGehee states, “This makes it very easy for the disk subsystem to read the data quickly from disk, especially if there is a lot of data to be retrieved.” Heaps cause the server to do so much extra work, since the data is unordered and reachable only through NCIs (or no indexes at all), that performance is bound to suffer.

My goal here is to extend the functionality of Thomas’ little script to do two additional things which I find extremely helpful. One, have the script work all at once for every database on a particular server. Certainly this is not an issue for those of you in environments with one database per server, but I’d bet that’s more the exception than the rule for most of us.

Secondly, I find it very useful to have a row count included to assist me in deciding exactly how to proceed with the data provided. While I don’t disagree with the need to potentially trash any tables you find with 0 rows, that cleanup project is for another day. I’m concerned today with missing CIs on tables that truly need them. Unfortunately, row count is not included in either of the joined tables in Thomas’ script. I solve that problem by creating a temp table filled from sp_spaceused and joining it to the original resultset.

A couple caveats for you before checking out this script. Firstly, since I work in an environment that’s 95% SQL Server 2000, this script was written specifically for 8.0. Now, I did indeed test it on 2005 and it does work, but I’d bet a lot of coin there are much easier ways of doing this using DMVs and other new features. Secondly, the script has a hard time with alternate schemas besides ‘dbo’. While I’m fully aware it can probably be changed to take the current schema into account (perhaps from the ‘information_schema.tables’ system view), I don’t have the need in my environment, so honestly, I just never took the time. Third, don’t hassle me for using cursors. I don’t use them in production, but am not averse to using them carefully in maintenance scripts that get run “occasionally.” I know all about the performance implications, but that’s not really an issue here for a script that takes less than a second to run…and on an ad-hoc basis at that.

Thanks again to the SQLRockstar for inspiring me to clean this up and post it out there for all.

Have a Grateful day…Troy

USE master

DECLARE @db VARCHAR(75)
DECLARE @sql1 NVARCHAR(4000)
DECLARE cur1 CURSOR FOR
  SELECT [name] FROM sysdatabases
  WHERE [name] NOT IN ('tempdb','master','msdb','model')
OPEN cur1
FETCH NEXT FROM cur1 INTO @db
WHILE @@FETCH_STATUS <> -1
BEGIN
    SET @sql1 = N'SET NOCOUNT ON

      CREATE TABLE #so (dbName VARCHAR(75),
        tableName VARCHAR(75),
        tableRows INT)
      -- FILL FROM sysobjects
      INSERT INTO #so
      SELECT ''' + @db + ''' [db], [table], 0
      FROM ' + @db + '.dbo.sysobjects so
        INNER JOIN ' + @db + '.information_schema.tables ist ON ist.table_name =
      WHERE so.xtype = ''u''
        AND NOT IN
          (SELECT id FROM ' + @db + '.dbo.sysindexes WHERE indid = 1)
        AND ist.table_schema = ''dbo''
        AND ''' + @db + ''' NOT IN (''cdata'',''cglobal'')

      DECLARE cur2 CURSOR FOR
        SELECT tableName FROM #so
      OPEN cur2
      DECLARE @tn VARCHAR(75)
      DECLARE @sql2 NVARCHAR(1000)
      FETCH NEXT FROM cur2 INTO @tn
      WHILE @@FETCH_STATUS <> -1
      BEGIN
          CREATE TABLE #su
            (tableName VARCHAR(100),
                NOR VARCHAR(100),
                RS VARCHAR(50),
                DS VARCHAR(50),
                IxS VARCHAR(50),
                US VARCHAR(50))
          SET @sql2 = N''INSERT INTO #su EXEC ' + @db + '.dbo.sp_spaceused '''''' + @tn + ''''''''
          EXEC sp_executesql @sql2
          UPDATE so
          SET tableRows = su.NOR
          FROM #su su
            INNER JOIN #so so ON so.tableName = su.tableName
          WHERE ISNULL(su.NOR,0) <> 0
          -- CLEAN UP
          DROP TABLE #su

          -- ITERATE
          FETCH NEXT FROM cur2 INTO @tn
      END
      -- CLEAN UP
      CLOSE cur2
      DEALLOCATE cur2

      -- OUR OUTPUT!!
      SELECT * FROM #so
      WHERE tableRows > 0
      ORDER BY tableRows DESC

      -- CLEAN UP
      DROP TABLE #so'
    EXEC sp_executesql @sql1
    -- ITERATE
    FETCH NEXT FROM cur1 INTO @db
END
CLOSE cur1
DEALLOCATE cur1
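As an aside for those of you on 2005 or later: as hinted above, the DMVs and catalog views make the heap hunt far simpler. A hedged sketch, run per database, might look like this:

```sql
-- Heaps (index_id = 0) with their row counts, largest first
SELECT AS tableName,
       SUM(p.rows) AS tableRows
FROM sys.tables t
  INNER JOIN sys.partitions p ON p.object_id = t.object_id
WHERE p.index_id = 0            -- 0 = heap; 1 = clustered index
GROUP BY
HAVING SUM(p.rows) > 0
ORDER BY tableRows DESC
```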

Posted in Index Tuning | Comments (1)

FTI Catalog Relocation

March 1st, 2010

Had a relatively simple maintenance to perform this past weekend; the kind so simple, it’s the fodder of DBA dreams everywhere. I was assigned the task of moving the data and log files for four separate databases to another drive on the same server. The server designation itself for these four databases wasn’t to change; simply the file path to the mdf and ldf. To make it even easier, none of these four were involved with replication; actually quite a rarity in our environment. Our network operations team provided me with a new SAN share on this box, which afforded me the opportunity to ease some of the drive space issues I ran into occasionally.

At first glance, a simple detach-move-reattach was all that appeared necessary for this straightforward operation. But a deeper look prior to the maintenance window’s arrival indicated all four of these databases were full text enabled and populating six separate catalogs each. Ok, no need for panic, this doesn’t turn it from a half hour operation into a six-hour one, but dropping the 24 catalogs then recreating and repopulating them did indeed add a substantial amount of time to the project. It got me wondering just how efficient my approach really was.

Detaching and reattaching a database does, by default, disable full-text indexing on that database. After the files have been properly moved, the task at hand is to move the full-text catalogs to a folder alongside the new mdf/ldf location. For example, the “Users” table is FTI’d and the original catalog is stored on the M drive, same as the data and t-log files. After the move, the mdf and ldf are now on the N drive, but right-clicking the “Users” table shows the catalogs still mapped to M. Not only this, but they are grayed out in the full-text editing dialog box, disallowing a simple remapping of the catalog location. My solution was to carefully document the fields/names/primary keys/etc. used for each table’s FTIs, then remove all catalogs and re-add them individually, mapped to N as needed. Not a mind-numbingly horrible operation as some DBA stuff goes, but complicated enough that one truly has to concentrate and not miss ANY details whatsoever.
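On SQL Server 2000, that drop-and-recreate dance boils down to the sp_fulltext_* procedures, repeated per catalog and table. A rough sketch only; the catalog, table, column, and key names below are placeholders:

```sql
-- Drop the old catalog (rooted on M:), recreate it rooted on N:
EXEC sp_fulltext_catalog 'UsersCatalog', 'drop'
EXEC sp_fulltext_catalog 'UsersCatalog', 'create', 'N:\FTData'

-- Re-create the table's full-text index in the new catalog, then repopulate
EXEC sp_fulltext_table  'dbo.Users', 'create', 'UsersCatalog', 'PK_Users'
EXEC sp_fulltext_column 'dbo.Users', 'UserName', 'add'
EXEC sp_fulltext_table  'dbo.Users', 'activate'
EXEC sp_fulltext_catalog 'UsersCatalog', 'start_full'
```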

While I accomplished the whole thing with hours still to spare in the maintenance window, I told myself there just has to be an easier way to accomplish the FTI portion. I suppose I could script out the whole process from dropping of the original catalogs to the recreation at the new location in T-SQL. Now while that would indeed save me mountains of time during the window, just how long would it take to create those very scripts? Have I really saved myself work? I don’t think so with that option.

It occurred to me that this catalog location must of course be stored in one of the system tables somewhere. What if I simply ran an update against that row then repopulated the index? Or maybe there is some non-documented-super-duper-secret hidden system stored procedure I could fire off? Could it be that simple?

Turns out that it isn’t that simple after all.

Sudarshan offers a SQL Server 2005 solution towards the bottom of this forum post which seems to indicate the magic “Keep Full Text Indexes” checkbox included with the detach dialog takes care of the issue very easily. There are a couple gotchas according to Sudarshan, but his solution appears quick and elegant. Let it be noted I haven’t actually tested it, because I currently work in a predominantly SQL Server 2000 environment. This looks viable for ’05 and most likely ’08, but proceed cautiously since I don’t know this author personally.

Microsoft offers up two support pages that address this very issue; here and here. As your typical MSDN pages do, these offer tremendous detail into the process; more than most humans need. They recommend not just a simple “point the catalogs somewhere else” solution, but registry changes as well. Now, while I’ve been around a bit and registry changes don’t scare me as much as they used to, I still try to avoid them whenever possible. Too many horror stories have surfaced over the years of even the slightest of registry changes wreaking total havoc across multiple systems, so I’m ALWAYS going to be hesitant to make those changes manually. If that’s what goes on behind the scenes when I complete a wizard or close a dialog box, well then fine, that I trust. If I have a choice that’s slightly more laborious yet means I don’t have to visit regedit, then I’m all for it.

While my “manual” solution appears to possibly be the quickest methodology for a small number of databases, obviously one’s going to have major issues when it comes to scaling and trying to accomplish this over 20, 30, or more databases simultaneously. Like so many other SQL Server topics, this comes down to an “It depends” answer once again. Take everything into account. How many databases? How long is your maintenance window? Your ability to produce similar scripts to run at a later time? Your tolerance for making direct registry changes? Weigh all these factors and I’m sure the solution you end up with will be much more appropriate for your situation.

I’m very interested in your feedback on this topic. Have I over-complicated a relatively simple issue? Or have I stumbled upon one of those timeless aspects of SQL Server that simply needs dealing with on a manual basis? How have you accomplished this in the past? I look forward to your responses.

Have a Grateful day…Troy

Posted in Full Text Indexing | Comments (0)

Finding Duplicate Indexes in SQL Server 2000

November 19th, 2009

“Keep on rollin’ my old buddy; you’re movin’ way too slow.”  Jack Straw may have hailed from Wichita, but I can assure you he never had to deal with duplicate indexes slowing down his production environment.

We’re all aware of the perils of insufficient indexes in our SQL Server environment. One of the first truisms we are taught as DBAs is to avoid the dreaded index or table scans in our query plans. That’s a topic covered a billion times before; not sure there’s anything new I can bring to that conversation.

And while there are plenty of scripts bouncing around the blogs on how to discover duplicate indexes, many are oriented towards SS2005 or 2008; or worse yet, use a cursor to find these duplicates.  Since I currently work in the stone age with SQL Server 2000, I wanted a solution that would accomplish its goal without using DMVs or cursors.

The script below is a mish-mosh of the work of others (especially Merrill Aldrich) with some changes I’ve included to make it easier to use in my environment.  The script first, then a couple comments on how I use it…


CREATE VIEW vw_index_list AS
SELECT tbl.[name] AS tablename,
idx.[name] AS indexname,
INDEX_COL( tbl.[name], idx.indid, 1 ) AS col1,
INDEX_COL( tbl.[name], idx.indid, 2 ) AS col2,
INDEX_COL( tbl.[name], idx.indid, 3 ) AS col3,
INDEX_COL( tbl.[name], idx.indid, 4 ) AS col4,
INDEX_COL( tbl.[name], idx.indid, 5 ) AS col5,
INDEX_COL( tbl.[name], idx.indid, 6 ) AS col6,
INDEX_COL( tbl.[name], idx.indid, 7 ) AS col7,
INDEX_COL( tbl.[name], idx.indid, 8 ) AS col8,
INDEX_COL( tbl.[name], idx.indid, 9 ) AS col9,
INDEX_COL( tbl.[name], idx.indid, 10 ) AS col10,
INDEX_COL( tbl.[name], idx.indid, 11 ) AS col11,
INDEX_COL( tbl.[name], idx.indid, 12 ) AS col12,
INDEX_COL( tbl.[name], idx.indid, 13 ) AS col13,
INDEX_COL( tbl.[name], idx.indid, 14 ) AS col14,
INDEX_COL( tbl.[name], idx.indid, 15 ) AS col15,
INDEX_COL( tbl.[name], idx.indid, 16 ) AS col16,
idx.[id] tableID
FROM SYSINDEXES idx
INNER JOIN SYSOBJECTS tbl ON idx.[id] = tbl.[id]
WHERE idx.indid > 0
AND INDEXPROPERTY( tbl.[id], idx.[name], 'IsStatistics') = 0
GO

DECLARE @db VARCHAR(100)
SET @db = DB_NAME()

SELECT @@SERVERNAME AS serverName,
@db AS dbName,
l1.tablename,
l1.indexname,
l2.indexname AS overlappingIndex,
l1.col1, l1.col2, l1.col3,
l1.col4, l1.col5, l1.col6,
l1.col7, l1.col8, l1.col9,
l1.col10, l1.col11, l1.col12,
l1.col13, l1.col14, l1.col15, l1.col16,
'SELECT INDEXPROPERTY(' + CONVERT(varchar(20),l1.tableID) + ', ''' + l1.indexname + ''', ''IsClustered'')' AS isClusteredCheck
FROM vw_index_list l1
INNER JOIN vw_index_list l2 ON l1.tablename = l2.tablename
AND l1.indexname <> l2.indexname
AND l1.col1 = l2.col1
AND (l1.col2 IS NULL OR l2.col2 IS NULL OR l1.col2 = l2.col2)
AND (l1.col3 IS NULL OR l2.col3 IS NULL OR l1.col3 = l2.col3)
AND (l1.col4 IS NULL OR l2.col4 IS NULL OR l1.col4 = l2.col4)
AND (l1.col5 IS NULL OR l2.col5 IS NULL OR l1.col5 = l2.col5)
AND (l1.col6 IS NULL OR l2.col6 IS NULL OR l1.col6 = l2.col6)
AND (l1.col7 IS NULL OR l2.col7 IS NULL OR l1.col7 = l2.col7)
AND (l1.col8 IS NULL OR l2.col8 IS NULL OR l1.col8 = l2.col8)
AND (l1.col9 IS NULL OR l2.col9 IS NULL OR l1.col9 = l2.col9)
AND (l1.col10 IS NULL OR l2.col10 IS NULL OR l1.col10 = l2.col10)
AND (l1.col11 IS NULL OR l2.col11 IS NULL OR l1.col11 = l2.col11)
AND (l1.col12 IS NULL OR l2.col12 IS NULL OR l1.col12 = l2.col12)
AND (l1.col13 IS NULL OR l2.col13 IS NULL OR l1.col13 = l2.col13)
AND (l1.col14 IS NULL OR l2.col14 IS NULL OR l1.col14 = l2.col14)
AND (l1.col15 IS NULL OR l2.col15 IS NULL OR l1.col15 = l2.col15)
AND (l1.col16 IS NULL OR l2.col16 IS NULL OR l1.col16 = l2.col16)
ORDER BY l1.tablename,
l1.indexname
GO

DROP VIEW vw_index_list

Upon running this, you have a couple options on what to do with the output.  I personally prefer to copy and paste the QA results window into Excel for further analysis.  It’s for this reason specifically that the script identifies the server and database name in the resultset.  This may be your only option if you operate in a disjointed environment where servers cannot reach each other through common linking.

On the other hand, you may very easily convert the last SELECT statement above into a SELECT INTO type statement and pour the data into a table.  This DevGuru article is where I turn for the specific syntax when I have a SELECT INTO brainf@rt.
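The conversion is just a matter of slipping an INTO clause between the select list and the FROM. A minimal illustration, abbreviated to the first key column, with a made-up target table name:

```sql
-- Materialize the duplicate candidates instead of returning them to QA
SELECT l1.tablename,
       l1.indexname,
       l2.indexname AS overlappingIndex
INTO dbo.DuplicateIndexCandidates   -- hypothetical audit table, created on the fly
FROM vw_index_list l1
  INNER JOIN vw_index_list l2 ON l1.tablename = l2.tablename
   AND l1.indexname <> l2.indexname
   AND l1.col1 = l2.col1
```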

A few very important points to consider when analyzing the results…

  • For each pair, the index with more keys generally will encapsulate the functionality of the smaller one.  A general rule would be to remove the “shorter” of the two.
  • Be very hesitant to remove any index that is clustered.  This is due to the fact that CI’s actually determine the physical order of the data on the disk.  For this reason alone, removing them would most certainly slow things down.  Use “INDEXPROPERTY(tableid,indexname,’IsClustered’)” to easily determine if you’re dealing with a clustered index.
  • Be wary of any index query hints that stored procedures may use against these duplicates.  Such a query is bound to break if it hints a non-existent index.
  • Until you’ve had a bit of practice looking over the results and deciding which indexes can actually be removed; it might be a very good idea to script out the index definition prior to removal.  Better paranoid than sorry I always say. Later on down the line when you get a frantic request from your customer, saying their scripts or processes have slowed down or stopped working completely; you’ll be happy you did.
  • Although I don’t see why it wouldn’t work on SQL Server 2008, I personally have only tested it on 2000 and 2005.  Lest ye be warned.

This being my first *real* blog post, I’d love some feedback or comments.

Thanks and have a Grateful day…Troy

Posted in Index Tuning | Comments (1)
