Posts Tagged ‘indexing’

Missing Clustered Indexes w/ Row Counts

April 1st, 2010

This post is a follow-up to Thomas LaRock’s excellent article here, so first go read that. I’ll wait.

You back? Good.

As most of you know tables missing clustered indexes (CI) are fairly bad performance-wise, but why? Even disregarding the fact that SQL Azure actually requires a CI for each and every table in the cloud database, and hence you may be “tripped up” on this in the future. More importantly, SQL Server sequentially reads the data in a CI one extent at a time. Brad McGehee states, “This makes it very easy for the disk subsystem to read the data quickly from disk, especially if there is a lot of data to be retrieved.” Heaps cause the server to do so much extra work due to unordered nature of NCI’s or having no indexes at all, that performance is bound to suffer.

My goal here is to extend the functionality of Thomas’ little script to do two additional things which I find extremely helpful. One, have the script work all at once for every database on a particular server. Certainly this is not an issue for those of you in environments with one database per server, but I’d bet that’s more the exception than the rule for most of us.

Secondly, I find it very useful to have a row count included to assist me with deciding on exactly how to proceed with the data provided. While I don’t disagree with the need to potentially trash any tables you find with 0 rows, that cleanup project is for another day. I’m concerned today with missing CI’s on tables that truly need them. Unfortunately, row count is not included in either of the joined tables in Thomas’ script. I solve that problem by creating a temp table filled from sp_spaceused and join it to the original resultset.

A couple caveats for you before checking out this script. Firstly, being that I work in an environment that’s 95% SQL Server 2000, this script was written specifically for 8.0. Now, I did indeed test it on 2005 and it does work, but I’d bet a lot of coin there are much easier ways of doing this using DMV’s and other new features. Secondly, the script has a hard time with alternate schemas besides ‘dbo’. While I’m fully aware the script can probably be changed to take current schema into account (perhaps from the ‘information_schema.tables’ system view), I don’t have the need to in my environment so honestly, I just never took the time to change the script to handle alternate schemas. Third, don’t hassle me for using cursors. I don’t use them in production, but am not averse to using them carefully in maintenance scripts that get run “occasionally.” I know all about the performance implications but not really an issue here for a script that takes less than a second to run…and on an ad-hoc basis at that.

Thanks again to the SQLRockstar for inspiring me to clean this up and post it out there for all.

Have a Grateful day…Troy


USE master
DECLARE cur1 CURSOR FOR
  SELECT [name] FROM sysdatabases
  WHERE [name] NOT IN ('tempdb','master','msdb','model')
        
DECLARE @db VARCHAR(75)
DECLARE @sql1 NVARCHAR(2000)
OPEN cur1
FETCH NEXT FROM cur1 INTO @db
WHILE @@FETCH_STATUS <> -1
  BEGIN
    SET @sql1 = N'SET NOCOUNT ON

      -- CREATE TEMP TABLE TO HOLD ALL TABLE NAMES
      CREATE TABLE #so (dbName VARCHAR(75),
        tableName VARCHAR(75),
        tableRows INT)
    
      -- FILL FROM sysobjects
      INSERT INTO #so
      SELECT ''' + @db + ''' [db], name [table], 0
      FROM ' + @db + '.dbo.sysobjects
        INNER JOIN ' + @db + '.information_schema.tables ist ON ist.table_name = ' + @db + '.dbo.sysobjects.name
      WHERE xtype = ''u''
        AND id NOT IN
          (SELECT id FROM ' + @db + '.dbo.sysindexes WHERE indid = 1)
        AND ist.table_schema = ''dbo''
        AND ' + @db + '.dbo.sysobjects.name NOT IN (''cdata'',''cglobal'')

      -- CREATE CURSOR TO ITERATE TABLE NAMES AND RUN sp_spaceused AGAINST THEM
      DECLARE cur2 cursor for
        SELECT tableName FROM #so
      OPEN cur2
      DECLARE @tn VARCHAR(75)
      DECLARE @sql2 NVARCHAR(1000)
      FETCH NEXT FROM cur2 INTO @tn
      WHILE @@FETCH_STATUS <> -1
        BEGIN
          -- CREATE TEMP TABLE TO HOLD ROW AMOUNTS (AND OTHER STUFF WE DON'T CARE ABOUT NOW)
          CREATE TABLE #su
            (tableName varchar(100),
                NOR varchar(100),
                RS varchar(50),
                DS varchar(50),
                IxS varchar(50),
                US varchar(50))
          
          -- FILL THE NEW TEMP TABLE
          SET @sql2 = N''INSERT INTO #su EXEC ' + @db + '.dbo.sp_spaceused '''''' + @tn + ''''''''
          EXEC sp_EXECUTESQL @sql2
            
          -- JOIN THE 2 TEMP TABLES TOGETHER TO UPDATE THE FIRST WITH THE ROW NUMBER COUNT
          UPDATE #so
          SET tableRows = su.NOR
          FROM #su su
            INNER JOIN #so so ON so.tableName = su.tableName
          WHERE ISNULL(su.NOR,0) <> 0
          
          -- CLEAN UP
          DROP TABLE #su

          -- ITERATE        
          FETCH NEXT FROM cur2 INTO @tn
        END
  
      -- CLEAN UP
      CLOSE cur2        
      DEALLOCATE cur2

      -- OUR OUTPUT!!
      SELECT * FROM #so
      WHERE tableRows > 0

      -- CLEAN UP
      DROP TABLE #so'
    
    -- EXECUTE THE ENTIRE BLOCK OF DYNAMIC SQL
    EXEC SP_EXECUTESQL @sql1  
    
    -- ITERATE
    FETCH NEXT FROM cur1 INTO @db
  END

-- CLEAN UP
CLOSE cur1
DEALLOCATE cur1

Tags: , ,
Posted in Index Tuning | Comments (1)

FTI Catalog Relocation

March 1st, 2010
relocation

Had a relatively simple maintenance to perform this past weekend; the kind so simple, it’s the fodder of DBA dreams everywhere. I was assigned the task to move the data and log files for four separate databases to another drive on the same server. The server designation itself for these four database wasn’t to change; simply the file path to the mdf and ldf. To make it even easier, none of these four were involved with replication; actually quite a rarity in our environment. Our network operations team provided me with a new SAN share on this box, which afforded me the opportunity to ease some of the drive space issues I ran into occasionally.

At first glance, a simple detach-move-reattach was all that appeared necessary for this straightforward operation. But a deeper look prior to the maintenance window’s arrival indicated all four of these databases were full text enabled and populating six separate catalogs each. Ok, no need for panic, this doesn’t turn it from a half hour operation into a six-hour one, but dropping the 24 catalogs then recreating and repopulating them did indeed add a substantial amount of time to the project. It got me wondering just how efficient my approach really was.

Detaching and reattaching a database does by default disable full-text indexing on that database. After the files have been properly moved, the task at hand is to move the full-text catalogs to folder alongside the new mdf/ldf location. For example, the “Users” table is FTI’d and the original catalog in stored on the M drive, same as the data and t-log files. After the move, the mdf and ldf are now on the N drive but right-clicking the “Users” table shows the catalogs still mapped to M. Not only this, but they are grayed out in the full-text editing dialog box, disallowing a simple remapping of the catalog location. My solution was to carefully document the fields/names/primary keys/etc. used for each table’s FTIs, then remove all catalogs and re-add them individually with them mapped to N as needed. Not a mind-numbingly horrible operation as some DBA stuff goes, but complicated enough in that one truly has to concentrate and not miss ANY details whatsoever.

While I accomplished the whole thing with hours still to spare in the maintenance window, I told myself there just has to be an easier way to accomplish the FTI portion. I suppose I could script out the whole process from dropping of the original catalogs to the recreation at the new location in T-SQL. Now while that would indeed save me mountains of time during the window, just how long would it take to create those very scripts? Have I really saved myself work? I don’t think so with that option.

It occurred to me that this catalog location must of course be stored in one of the system tables somewhere. What if I simply ran an update against that row then repopulated the index? Or maybe there is some non-documented-super-duper-secret hidden system stored procedure I could fire off? Could it be that simple?

Turns out that it isn’t that simple after all.

Sudarshan offers a SQL Server 2005 solution towards the bottom of this forum post which seems to indicate the magic “Keep Full Text Indexes” checkbox included with the detach dialog takes care of the issue very easily. There are a couple gotchas according to Sudarshan, but his solution appears quick and elegant. Lest it be noted I haven’t actually tested it because I currently work in a predominantly SQL Server 2000 environment. This looks viable for ’05 and most likely ’08, but proceed cautiously due to me not knowing this author personally.

Microsoft offers up two support pages that address this very issue; here and here. As your typical MSDN pages do, these offer tremendous detail into the process; more than most humans need. They recommend not just a simple “point the catalogs somewhere else” solution, but registry changes as well. Now, well I’ve been around a bit and registry changes don’t scare me as much as they used to, I still try and avoid them whenever possible. Too many horror stories have surfaced over the years of even the slightest of registry changes wreaking total havoc across multiple systems, that I’m ALWAYS going to be hesitant to manually make those changes. If that’s what goes on behind the scene when I complete a wizard or close a dialog box, well then fine, that I trust. If I have a choice that’s slightly more laborious yet means I don’t have to visit regedit, then I’m all for it.

While my “manual” solution appears to possibly be the quickest methodology for a small number of databases, obviously one’s going to have major issues when it comes to scaling and trying to accomplish this over 20, 30 or more databases simultaneously. Like so so many other SQL Server topics, this comes down to an “It depends” answer once again. Take everything into account. How many databases? How long is your maintenance window? Your ability for producing similar scripts to run at a later time? Your tolerance for making direct registry changes? Take all these factors into account and I’m sure the solution you end up with with, will much more appropriate for your situation.

I’m very interested in your feedback on this topic. Have I over-complicated a relatively simple issue? Or have I stumbled upon one of those timeless aspects of SQL Server that simply needs dealing with on a manual basis? How have you accomplished this in the past? I look forward to your responses.

Have a Grateful day…Troy

Tags: , , , ,
Posted in Full Text Indexing | Comments (0)

Finding Duplicate Indexes in SQL Server 2000

November 19th, 2009

“Keep on rollin’ my old buddy; you’re movin’ way too slow.”  Jack Straw may have hailed from Wichita, but I can assure you he never had to deal with duplicate indexes slowing down his production environment.

We’re all aware of the perils of insufficient indexes in our SQL Server environment. One of the first truisms we are taught as DBAs is to avoid the dreaded index or table scans in our query plans. That’s a topic covered a billion times before; not sure there’s anything new I can bring to that conversation.

And while there are plenty of scripts bouncing around the blogs on how to discover duplicate indexes; many are oriented towards SS2005 or 2008; or worse yet, use a cursor to find these duplicates.  Since I currently work in the stone age with SQL Server 2000, I wanted a solution that would accomplish it’s goal without using dmv’s or cursors.

The script below is a mish-mosh of the work of others (specially Merrill Aldrich) with some changes I’ve included to make it easier to use in my environment.  The script first, then a couple comments on how I use it…

/*
SCRIPT TO FIND DUPLICATE INDEXES
*/

CREATE VIEW vw_index_list AS
SELECT tbl.[name] AS TableName,
idx.[name] AS IndexName,
INDEX_COL( tbl.[name], idx.indid, 1 ) AS col1,
INDEX_COL( tbl.[name], idx.indid, 2 ) AS col2,
INDEX_COL( tbl.[name], idx.indid, 3 ) AS col3,
INDEX_COL( tbl.[name], idx.indid, 4 ) AS col4,
INDEX_COL( tbl.[name], idx.indid, 5 ) AS col5,
INDEX_COL( tbl.[name], idx.indid, 6 ) AS col6,
INDEX_COL( tbl.[name], idx.indid, 7 ) AS col7,
INDEX_COL( tbl.[name], idx.indid, 8 ) AS col8,
INDEX_COL( tbl.[name], idx.indid, 9 ) AS col9,
INDEX_COL( tbl.[name], idx.indid, 10 ) AS col10,
INDEX_COL( tbl.[name], idx.indid, 11 ) AS col11,
INDEX_COL( tbl.[name], idx.indid, 12 ) AS col12,
INDEX_COL( tbl.[name], idx.indid, 13 ) AS col13,
INDEX_COL( tbl.[name], idx.indid, 14 ) AS col14,
INDEX_COL( tbl.[name], idx.indid, 15 ) AS col15,
INDEX_COL( tbl.[name], idx.indid, 16 ) AS col16,
dpages,
used,
rowcnt,
idx.[id] tableID
FROM SYSINDEXES idx
INNER JOIN SYSOBJECTS tbl ON idx.[id] = tbl.[id]
WHERE indid > 0
AND INDEXPROPERTY( tbl.[id], idx.[name], ‘IsStatistics’) = 0

GO

DECLARE @db SYSNAME, @srv SYSNAME
SET @db = DB_NAME()
SET @srv = @@SERVERNAME

SELECT @db,
@srv,
l1.tableID,
l1.tablename,
l1.indexname,
l2.indexname AS overlappingIndex,
l1.col1, l1.col2, l1.col3,
l1.col4, l1.col5, l1.col6,
l1.col7, l1.col8, l1.col9,
l1.col10, l1.col11, l1.col12,
l1.col13, l1.col14, l1.col15,
l1.col16,
l1.dpages,
l1.used,
l1.rowcnt,
‘SELECT INDEXPROPERTY(‘ + CONVERT(varchar(20),l1.tableID) + ‘, ”’ + l1.indexname + ”’, ”IsClustered”)’
FROM vw_index_list
INNER JOIN vw_index_list_TG l2 ON l1.tablename = l2.tablename
AND l1.indexname <> l2.indexname
AND l1.col1 = l2.col1
AND (l1.col2 IS NULL OR l2.col2 IS NULL OR l1.col2 = l2.col2)
AND (l1.col3 IS NULL OR l2.col3 IS NULL OR l1.col3 = l2.col3)
AND (l1.col4 IS NULL OR l2.col4 IS NULL OR l1.col4 = l2.col4)
AND (l1.col5 IS NULL OR l2.col5 IS NULL OR l1.col5 = l2.col5)
AND (l1.col6 IS NULL OR l2.col6 IS NULL OR l1.col6 = l2.col6)
AND (l1.col7 IS NULL OR l2.col7 IS NULL OR l1.col7 = l2.col7)
AND (l1.col8 IS NULL OR l2.col8 IS NULL OR l1.col8 = l2.col8)
AND (l1.col9 IS NULL OR l2.col9 IS NULL OR l1.col9 = l2.col9)
AND (l1.col10 IS NULL OR l2.col10 IS NULL OR l1.col10 = l2.col10)
AND (l1.col11 IS NULL OR l2.col11 IS NULL OR l1.col11 = l2.col11)
AND (l1.col12 IS NULL OR l2.col12 IS NULL OR l1.col12 = l2.col12)
AND (l1.col13 IS NULL OR l2.col13 IS NULL OR l1.col13 = l2.col13)
AND (l1.col14 IS NULL OR l2.col14 IS NULL OR l1.col14 = l2.col14)
AND (l1.col15 IS NULL OR l2.col15 IS NULL OR l1.col15 = l2.col15)
AND (l1.col16 IS NULL OR l2.col16 IS NULL OR l1.col16 = l2.col16)
ORDER BY l1.tablename,
l1.indexname

DROP VIEW vw_index_list

Upon running this, you have a couple options on what to do with the output.  I personally prefer to copy and paste the QA results window into Excel for further analysis.  It’s for this reason specifically that the script identifies the server and database name in the resultset.  This may be your only option if you operate in a disjointed environment where servers cannot reach each other through common linking.

On the other hand, you may very easily convert the last SELECT statement above into a SELECT INTO type statement and pour the data in a table.  This DevGuru article is where I turn for the specific syntax when I have a SELECT INTO brainf@rt.

A few very important points to consider when analyzing the results…

  • For each pair, the index with more keys generally will encapsulate the functionality of the smaller one.  A gerneral rule would be removing the “shorter” of the two.
  • Be very hesitant to remove any index that is clustered.  This is due to the fact that CI’s actually determine the physical order of the data on the disk.  For this reason alone, removing them would most certainly slow things down.  Use “INDEXPROPERTY(tableid,indexname,’IsClustered’)” to easily determine if you’re dealing with a clustered index.
  • Be weary of any index query hints that stored procedures may use against these duplicates.  The query is bound to break if a non-existent index is used.
  • Until you’ve had a bit of practice looking over the results and deciding which indexes can actually be removed; it might be a very good idea to script out the index definition prior to removal.  Better paranoid than sorry I always say. Later on down the line when you get a frantic request from your customer, saying their scripts or processes have slowed down or stopped working completely; you’ll be happy you did.
  • Although I don’t see why it wouldn’t work on SQL Server 2008, I personally have only tested it on 2000 and 2005.  Lest ye be warned.

This being my first *real* blog post, I’d love some feedback or comments.

Thanks and have a Grateful day…Troy

Tags: , ,
Posted in Index Tuning | Comments (1)

Get Adobe Flash playerPlugin by wpburn.com wordpress themes