Posts Tagged ‘2005’

Which JOIN Should I Use?

February 16th, 2011

A couple of years ago, I purchased Paul Nielsen's wonderful book "SQL Server 2005 Bible," published by Wiley.  It includes a small sidenote-style box with an excellent parable about joins.  Paul's analogy comes to mind almost every time I write a JOIN.  Remembering it alone is enough to refresh my memory on which JOIN does what.  It's easy to remember, makes its point, and is succinct; what more could you ask?

So I thought I would retell Paul's example in the hope that it reaches a reader or two who doesn't have the book.  It should help new and seasoned SQL'ites alike.

Consider a very old-style, ultra-conservative church where men are expected to sit on one side of the aisle (the right, for the sake of this example) and women on the other (the left).  Married couples can indeed attend this church, but they still must sit on their respective sides.  The minister instructs all married individuals whose spouse is on the other side of the aisle to stand up, grab their partner's hand, and file out of the church.  This is our INNER JOIN.  Only matched "pieces" from each side are considered in the output.

Now imagine that instead of married individuals only, the minister instructs all the men to stand up.  Those who are married grab the hand of their spouse across the aisle, and then all the men, plus only the married women, exit the church.  Here is our OUTER JOIN; a RIGHT one in this case.  Our output consists of all members of the right group and only matches from the left.

Conversely, perhaps the minister does the opposite.  He instructs all women to stand, grab their husband's hand, and exit the building.  This is our LEFT OUTER JOIN.  All members of the left group are included, along with only the matches from the right group.

Or instead, the minister instructs everyone to stand up, all the married folks to grab hands, and then everyone leaves the building.  This is our FULL [OUTER] JOIN!  Everything is included in the output, but matched couples are still sent out together.
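
If you'd rather see the parable in T-SQL, here's a minimal sketch.  The temp tables, the coupleID column, and the names are all mine, purely for illustration; a NULL coupleID marks someone single.

USE tempdb
SET NOCOUNT ON

-- Two sides of the aisle; coupleID links spouses, NULL means single
CREATE TABLE #Men (coupleID INT, name VARCHAR(50))
CREATE TABLE #Women (coupleID INT, name VARCHAR(50))

INSERT INTO #Men VALUES (1, 'Adam')
INSERT INTO #Men VALUES (2, 'Ben')
INSERT INTO #Men VALUES (NULL, 'Carl')      -- single
INSERT INTO #Women VALUES (1, 'Anna')
INSERT INTO #Women VALUES (2, 'Beth')
INSERT INTO #Women VALUES (NULL, 'Cathy')   -- single

-- INNER JOIN: only the married couples file out
SELECT w.name wife, m.name husband
FROM #Women w
     INNER JOIN #Men m ON m.coupleID = w.coupleID

-- RIGHT OUTER JOIN: all the men leave; married ones bring their wives
SELECT w.name wife, m.name husband
FROM #Women w
     RIGHT OUTER JOIN #Men m ON m.coupleID = w.coupleID

-- LEFT OUTER JOIN: all the women leave; married ones bring their husbands
SELECT w.name wife, m.name husband
FROM #Women w
     LEFT OUTER JOIN #Men m ON m.coupleID = w.coupleID

-- FULL OUTER JOIN: everyone leaves, couples still paired up
SELECT w.name wife, m.name husband
FROM #Women w
     FULL OUTER JOIN #Men m ON m.coupleID = w.coupleID

DROP TABLE #Men, #Women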

That’s it.  Simple, but extremely useful when any bit of doubt creeps into your mind about which JOIN may be appropriate in a specific situation.  Again, I give credit where it is due; and this time it is to Paul Nielsen for teaching me this wonderful memory aid.

Another post coming in the next couple days to announce a job change for me that I’m very very excited about.  Look for that by this time next week.

Have a grateful day…Troy

Posted in T-SQL | Comments (0)

Which is Faster; the Chicken or the Egg?

December 2nd, 2010

Due to a company policy, all UPDATEs at my office that are eventually headed to prod are required to be based solely off a primary key and must be written out one line at a time.  This applies whether or not the PKs are sequential.  Yesterday I received a script for approval that contained 14,000 individual UPDATE statements based off an identity key and covering 100% of the table.  So imagine, if you will (said in my best Rod Serling voice), 14,000 of these…

UPDATE Customer SET firstName = 'Foo' WHERE id = 1
UPDATE Customer SET firstName = 'Foo' WHERE id = 2
UPDATE Customer SET firstName = 'Foo' WHERE id = 3

It occurred to me that as long as we have sequential IDs to update against, it had to be substantially quicker to do a single UPDATE using a BETWEEN clause. So 14,000 of the above statements turn into just one…

UPDATE Customer SET firstName = 'Foo' WHERE id BETWEEN 1 AND 14000

But just how much faster is the BETWEEN vs. individual UPDATEs? Turns out the answer is "way way way". Unfortunately, "way way way" isn’t extremely scientific. So the geek in me had to see exactly how much faster and wrote some scripts to test it out.

On a local SQL Server 2008 R2 installation (no network interference), I filled two local temp tables with random numbers between 1 and 50. I then updated this value to 51 both ways: first one row at a time based off the incrementing PK, and second all at once using BETWEEN. The difference surprised even me. At every sample size I tested, BETWEEN was over 99% faster, and the larger the sample size, the more significant the time savings. The results…

SAMPLE SIZE   INDIV UPDATES AVG   BETWEEN AVG   RATIO    PCT. FASTER
        500               641.8           4.8   0.75%         99.25%
      5,000             6,837.8          19.6   0.29%         99.71%
     10,000            15,742.4          51.6   0.33%         99.67%
     15,000            28,546.4          63.0   0.22%         99.78%
     20,000            63,462.6          95.8   0.15%         99.85%

Ok, so what conclusions can we make from this data?

  1. Not only is it faster using the BETWEEN predicate, it's significantly faster: over 99% faster at every sample size I used.
  2. The time savings (percentage-wise) tends to increase as the number of UPDATEs increases. Good to know, because this is something to take into account when doing that pesky 1,000,000-row update, but not so much when touching just a couple of rows.
  3. The back-and-forth, back-and-forth, back-and-forth REALLY adds up over a larger sample size. For example, going from 500 to 5,000 UPDATEs increases the sample size 10-fold; the BETWEEN took about 4x as long, yet the individual UPDATEs took over 10x as long!
  4. We bloggers will write about anything SQL related no matter how obvious it may be if it offers even a glimmer of additional insight into our product.
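
If you'd like to try this yourself, here's a rough sketch along the lines of what I ran. It's simplified from my actual test scripts, and the WHILE loop only approximates the cost of thousands of literal one-line statements (it skips the parse/compile hit each separate statement takes), so your numbers will differ; the gap should still be obvious.

SET NOCOUNT ON

-- Test table: incrementing PK plus a random value between 1 and 50
CREATE TABLE #t (id INT IDENTITY(1,1) PRIMARY KEY, val INT)

DECLARE @rows INT, @i INT, @start DATETIME
SET @rows = 5000

SET @i = 1
WHILE @i <= @rows
BEGIN
     INSERT INTO #t (val) VALUES (CAST(RAND() * 50 AS INT) + 1)
     SET @i = @i + 1
END

-- Method 1: one UPDATE per row, driven off the PK
SET @start = GETDATE()
SET @i = 1
WHILE @i <= @rows
BEGIN
     UPDATE #t SET val = 51 WHERE id = @i
     SET @i = @i + 1
END
SELECT DATEDIFF(ms, @start, GETDATE()) indivUpdatesMs

-- Method 2: a single UPDATE with BETWEEN
SET @start = GETDATE()
UPDATE #t SET val = 51 WHERE id BETWEEN 1 AND @rows
SELECT DATEDIFF(ms, @start, GETDATE()) betweenUpdateMs

DROP TABLE #t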

Have a Grateful day…Troy

Posted in Performance Tuning, T-SQL | Comments (3)

Copy/Paste From SSMS to Excel

November 29th, 2010

From the “did you know” category…

Did you know you can highlight an object in SSMS, right-click it, and drag it to Excel? I often need to list procedures in Excel, and the "old" copy/paste routine becomes cumbersome when you have to do it over and over again. But as I said, if you right-click and drag an object across your desktop to Excel, you can then release the button and get the option to move or copy it at the destination. This is very handy, particularly in a dual-monitor setup with Excel open on one side and SSMS on the other.

By the way, before you ask: yes, it works just as well when your destination is any other application in Microsoft's Office suite. I tested it in Word and Outlook just now and it works dandy. Not sure why I'm so surprised these Office applications work well together like that; I guess I shouldn't be.

Have a Grateful day…Troy

Posted in Office, SSMS | Comments (3)

Dropping Multiple Tables at Once Via T-SQL

November 26th, 2010

From time to time we have the need to drop multiple tables within a single stored procedure. Were you aware that it can be done in a single DROP TABLE statement? SELECTing multiple columns, constants, variables, etc. in a single SELECT is one of the first things we learn as SQL newbies. But dropping two or more tables at the same time, while slightly less common, is just as convenient.

So, instead of…

DROP TABLE A;
DROP TABLE B;
DROP TABLE C;

we can…
DROP TABLE A, B, C;

Not only can we do this with normal tables, we can also drop temp tables, both local and global, in the same statement. How convenient is that?

This little block of code will create normal, local temp, & global temp tables, fill them in with sample data, show the data’s there, drop the tables using this “new” syntax, then show that all three tables are indeed gone.

 
USE tempdb
SET NOCOUNT ON

-- Create a local temp table
CREATE TABLE #localTempTable (i INT)

-- Fill the local temp table with sample data
INSERT INTO #localTempTable
SELECT id FROM sysobjects

-- Show that the local temp table is filled
SELECT COUNT(i) localTempTable FROM #localTempTable

GO

-- Create a global temp table
CREATE TABLE ##globalTempTable (i INT)

-- Fill the global temp table with sample data
INSERT INTO ##globalTempTable
SELECT id  FROM sysobjects

-- Show that the global temp table is filled
SELECT COUNT(i) globalTempTable FROM ##globalTempTable

GO

-- Create a real table
CREATE TABLE RealTable (i INT)

-- Fill the real table with sample data
INSERT INTO RealTable
SELECT id  FROM sysobjects

-- Show that the real table is filled
SELECT COUNT(i) realTable FROM RealTable

GO

-- Drop the tables using the "new" syntax
DROP TABLE #localTempTable, ##globalTempTable, RealTable

GO

-- Show that the objects are indeed gone
IF OBJECT_ID(N'tempdb..#localTempTable', N'U') IS NULL
  SELECT '#localTempTable is gone'
ELSE
  SELECT '#localTempTable is still there'
  
IF OBJECT_ID(N'tempdb..##globalTempTable', N'U') IS NULL
  SELECT '##globalTempTable is gone'
ELSE
  SELECT '##globalTempTable is still there'
  
IF OBJECT_ID(N'tempdb..RealTable', N'U') IS NULL
  SELECT 'RealTable is gone'
ELSE
  SELECT 'RealTable is still there'

And while these days it seems like we DELETE less and archive a lot more, it’s still nice to know that we do have a couple options when it comes to performing this most basic, but vital operation in SQL Server.

Have a grateful day…Troy

Posted in T-SQL | Comments (1)

What’s Your Favorite SSMS Font?

November 4th, 2010

A lot of us DBAs spend day after day looking at our SSMS windows in the "Courier New" font. It's the default. It's the one you learned on. It's comfortable because it's what we're used to. But Microsoft provides a big, long list of fonts to choose from should we wish to mix things up.

Do you indeed change it from the default, or are you one of those "leave well enough alone" kinds of DBAs? Choose one below and I'll post the results in a week or two.

Have a grateful day…Troy

What is your favorite SSMS font?

      


Posted in SSMS | Comments (1)

COUNTing with the DISTINCT Clause

October 4th, 2010
One ringy-dingy...

Ok, that's not the count I'm referring to, but one of the many important tasks we DBAs are charged with from time to time is converting INSERTs/UPDATEs/DELETEs (IUDs) into "count scripts" in order to verify the number of records a DML statement will affect before it goes into production. While this may seem like one of the most basic things we do daily, its importance cannot be stressed enough. A proper and thorough series of checks and balances is vital not only in the biggest of organizations, but in small companies as well. Nobody wants to run scripts against prod only to find out later that instead of affecting the 100 rows you thought it would, your dev left out the WHERE clause and you just updated every row in the table. Eeek! Now you have to take time away from other important things you could/should be doing and test out your data recovery strategy. People (especially Vice Presidents) get extremely irritated at situations like this. Jobs WILL be on the line. Even if you were not the dev who wrote the script, you were still the DBA who let it slide past unchecked. This can be very hazardous to your career.

And since The Grateful DBA wants you to have a long and prosperous SQL career, I offer this little tip for one count-script situation you might find yourself in: an IUD based off a SELECT DISTINCT statement. It's a bit trickier than you might suspect, but since my readers are way above average intelligence, I have no doubt it'll be easy-peasy for you.

Here’s an example of a typical customer INSERT you may have to convert to a count script:

INSERT INTO Customer
     (ID,
     firstName,
     lastName,
     nickname,
     address1,
     address2,
     city,
     stateID,
     zip,
     entryDate)
SELECT DISTINCT
     1,
     t.firstName,
     t.lastName,
     t.firstName,
     t.address1,
     t.address2,
     @city,
     s.longStateName,
     ISNULL(t.zip,'00000'),
     getdate()
FROM CustomerTemp t
     INNER JOIN States s ON s.ID = t.ID

We have a little of everything here: constants, joined values, repeats, functions, & local variables. Let’s see what it would take to convert this to an accurate count of exactly how many records we’re inserting. Here are all the rules you need to remember…

CONSTANTS – All constants must be removed from the list. This includes things like numbers and string literals. They naturally return the same value for every row, so they add nothing to the DISTINCT. In the case of our example script, this means we'll need to remove the "1".

REPEATS – Values in the final count list cannot appear more than once, or an error will be thrown (the derived table would end up with duplicate column names). In our example, we must remove one of the two "t.firstName"s.

FUNCTIONS – Function calls such as the ISNULL in our example are allowed and valid, but must be aliased. Even if the alias is exactly the same as the field name itself, it's required, because every column in the derived table needs a name. And functions such as "getdate()", which return one value for the whole statement, must be removed just like the constants.

LOCAL VARIABLES – Similar to the rule concerning constants, local variables also naturally insert the same value for every row, so they must be removed from the script. In our example, the "@city" is a goner.

THE WRAPPER – No, I'm not referring to Jay-Z or Eminem; I mean how the count statement is formed. From my (albeit brief) research, there are apparently several "legal" ways of forming the final statement. The method I've always used, and honestly found easiest to read and grasp, is to wrap the new, slimmer SELECT in its own "SELECT COUNT(*) FROM". The inside SELECT must be surrounded by parentheses and must have its own unique alias. Even if you have no intention of ever using the alias for anything, it's still required.

This is what our example above turns into after we’ve trimmed it down…

SELECT COUNT(*)
FROM
     (SELECT DISTINCT t.firstName,
          t.lastName,
          t.address1,
          t.address2,
          s.longStateName,
          ISNULL(t.zip,'00000') zip
     FROM CustomerTemp t
          INNER JOIN States s ON s.ID = t.ID) A
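
One extra belt-and-suspenders habit that pairs nicely with the count script: wrap the real DML in a transaction, compare @@ROWCOUNT against the count you predicted, and roll back if the two disagree. A minimal sketch of the pattern (SourceTable, TargetTable, and someColumn are placeholders, not objects from the example above):

DECLARE @expected INT, @actual INT

-- The count script's prediction
SELECT @expected = COUNT(*)
FROM (SELECT DISTINCT s.someColumn FROM SourceTable s) A

BEGIN TRANSACTION

-- The real DML
INSERT INTO TargetTable (someColumn)
SELECT DISTINCT s.someColumn FROM SourceTable s

SET @actual = @@ROWCOUNT

IF @actual = @expected
     COMMIT TRANSACTION
ELSE
BEGIN
     ROLLBACK TRANSACTION
     RAISERROR('Expected %d rows, got %d; rolled back.', 16, 1, @expected, @actual)
END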

I hope this helps when you find yourself faced with a similar task to accomplish. As always, I welcome feedback.

Have a Grateful day…Troy

Posted in T-SQL | Comments (1)

Missing Clustered Indexes w/ Row Counts

April 1st, 2010

This post is a follow-up to Thomas LaRock’s excellent article here, so first go read that. I’ll wait.

You back? Good.

As most of you know, tables missing clustered indexes (CIs) are fairly bad performance-wise, but why? For one thing, SQL Azure actually requires a CI on each and every table in a cloud database, so heaps may "trip you up" in the future. More importantly, SQL Server reads the data in a CI sequentially, one extent at a time. Brad McGehee states, "This makes it very easy for the disk subsystem to read the data quickly from disk, especially if there is a lot of data to be retrieved." Heaps make the server do so much extra work, due to the unordered nature of NCIs (or having no indexes at all), that performance is bound to suffer.

My goal here is to extend the functionality of Thomas’ little script to do two additional things which I find extremely helpful. One, have the script work all at once for every database on a particular server. Certainly this is not an issue for those of you in environments with one database per server, but I’d bet that’s more the exception than the rule for most of us.

Secondly, I find it very useful to have a row count included to assist me with deciding on exactly how to proceed with the data provided. While I don’t disagree with the need to potentially trash any tables you find with 0 rows, that cleanup project is for another day. I’m concerned today with missing CI’s on tables that truly need them. Unfortunately, row count is not included in either of the joined tables in Thomas’ script. I solve that problem by creating a temp table filled from sp_spaceused and join it to the original resultset.

A few caveats before checking out this script. First, because I work in an environment that's 95% SQL Server 2000, this script was written specifically for 8.0. I did test it on 2005 and it does work, but I'd bet a lot of coin there are much easier ways of doing this using DMVs and other newer features (I've included a quick sketch of that after the script below). Second, the script has a hard time with schemas other than 'dbo'. While I'm fully aware it could probably be changed to take the current schema into account (perhaps from the 'information_schema.tables' system view), I don't have the need in my environment, so honestly, I just never took the time to handle alternate schemas. Third, don't hassle me for using cursors. I don't use them in production, but I'm not averse to using them carefully in maintenance scripts that get run "occasionally." I know all about the performance implications, but they're not really an issue for a script that takes less than a second to run, and on an ad-hoc basis at that.

Thanks again to the SQLRockstar for inspiring me to clean this up and post it out there for all.

Have a Grateful day…Troy


USE master
DECLARE cur1 CURSOR FOR
  SELECT [name] FROM sysdatabases
  WHERE [name] NOT IN ('tempdb','master','msdb','model')
        
DECLARE @db VARCHAR(75)
DECLARE @sql1 NVARCHAR(3000)
OPEN cur1
FETCH NEXT FROM cur1 INTO @db
WHILE @@FETCH_STATUS <> -1
  BEGIN
    SET @sql1 = N'SET NOCOUNT ON

      -- CREATE TEMP TABLE TO HOLD ALL TABLE NAMES
      CREATE TABLE #so (dbName VARCHAR(75),
        tableName VARCHAR(75),
        tableRows INT)
    
      -- FILL FROM sysobjects
      INSERT INTO #so
      SELECT ''' + @db + ''' [db], name [table], 0
      FROM ' + @db + '.dbo.sysobjects
        INNER JOIN ' + @db + '.information_schema.tables ist ON ist.table_name = ' + @db + '.dbo.sysobjects.name
      WHERE xtype = ''u''
        AND id NOT IN
          (SELECT id FROM ' + @db + '.dbo.sysindexes WHERE indid = 1)
        AND ist.table_schema = ''dbo''
        AND ' + @db + '.dbo.sysobjects.name NOT IN (''cdata'',''cglobal'')

      -- CREATE CURSOR TO ITERATE TABLE NAMES AND RUN sp_spaceused AGAINST THEM
      DECLARE cur2 cursor for
        SELECT tableName FROM #so
      OPEN cur2
      DECLARE @tn VARCHAR(75)
      DECLARE @sql2 NVARCHAR(1000)
      FETCH NEXT FROM cur2 INTO @tn
      WHILE @@FETCH_STATUS <> -1
        BEGIN
          -- CREATE TEMP TABLE TO HOLD ROW AMOUNTS (AND OTHER STUFF WE DONT CARE ABOUT NOW)
          CREATE TABLE #su
            (tableName varchar(100),
                NOR varchar(100),
                RS varchar(50),
                DS varchar(50),
                IxS varchar(50),
                US varchar(50))
          
          -- FILL THE NEW TEMP TABLE
          SET @sql2 = N''INSERT INTO #su EXEC ' + @db + '.dbo.sp_spaceused '''''' + @tn + ''''''''
          EXEC sp_EXECUTESQL @sql2
            
          -- JOIN THE 2 TEMP TABLES TOGETHER TO UPDATE THE FIRST WITH THE ROW NUMBER COUNT
          UPDATE #so
          SET tableRows = su.NOR
          FROM #su su
            INNER JOIN #so so ON so.tableName = su.tableName
          WHERE ISNULL(su.NOR,0) <> 0
          
          -- CLEAN UP
          DROP TABLE #su

          -- ITERATE        
          FETCH NEXT FROM cur2 INTO @tn
        END
  
      -- CLEAN UP
      CLOSE cur2        
      DEALLOCATE cur2

      -- OUR OUTPUT!!
      SELECT * FROM #so
      WHERE tableRows > 0
                  ORDER BY tableRows DESC

      -- CLEAN UP
      DROP TABLE #so'
    
    -- EXECUTE THE ENTIRE BLOCK OF DYNAMIC SQL
    EXEC SP_EXECUTESQL @sql1  
    
    -- ITERATE
    FETCH NEXT FROM cur1 INTO @db
  END

-- CLEAN UP
CLOSE cur1
DEALLOCATE cur1
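
And for anyone on 2005 or later wondering about those "much easier ways" I mentioned in the caveats, here's a minimal DMV-based sketch for a single database. It hasn't seen anywhere near the mileage of the script above, so treat it as a starting point.

-- Heaps (index_id 0 = no clustered index) with row counts, current database only
SELECT DB_NAME() dbName,
       s.name schemaName,
       t.name tableName,
       SUM(p.rows) tableRows
FROM sys.tables t
     INNER JOIN sys.schemas s ON s.schema_id = t.schema_id
     INNER JOIN sys.indexes i ON i.object_id = t.object_id AND i.index_id = 0
     INNER JOIN sys.partitions p ON p.object_id = t.object_id AND p.index_id = 0
GROUP BY s.name, t.name
HAVING SUM(p.rows) > 0
ORDER BY tableRows DESC

To cover every database at once, you could drive it with a cursor like the one above, or with the undocumented sp_MSforeachdb.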

Posted in Index Tuning | Comments (1)

FTI Catalog Relocation

March 1st, 2010

Had a relatively simple maintenance to perform this past weekend; the kind so simple, it's the fodder of DBA dreams everywhere. I was assigned the task of moving the data and log files for four separate databases to another drive on the same server. The server designation itself for these four databases wasn't to change; simply the file path to the mdf and ldf. To make it even easier, none of the four were involved in replication; actually quite a rarity in our environment. Our network operations team provided me with a new SAN share on this box, which afforded me the opportunity to ease some of the drive space issues I occasionally run into.

At first glance, a simple detach-move-reattach was all that appeared necessary for this straightforward operation. But a deeper look prior to the maintenance window’s arrival indicated all four of these databases were full text enabled and populating six separate catalogs each. Ok, no need for panic, this doesn’t turn it from a half hour operation into a six-hour one, but dropping the 24 catalogs then recreating and repopulating them did indeed add a substantial amount of time to the project. It got me wondering just how efficient my approach really was.

Detaching and reattaching a database does, by default, disable full-text indexing on that database. After the files have been properly moved, the task at hand is to move the full-text catalogs to a folder alongside the new mdf/ldf location. For example, the "Users" table is FTI'd and the original catalog is stored on the M drive, same as the data and t-log files. After the move, the mdf and ldf are on the N drive, but right-clicking the "Users" table shows the catalogs still mapped to M. Not only that, they are grayed out in the full-text editing dialog box, disallowing a simple remapping of the catalog location. My solution was to carefully document the fields/names/primary keys/etc. used for each table's FTIs, then remove all the catalogs and re-add them individually, mapped to N as needed. Not a mind-numbingly horrible operation as DBA stuff goes, but complicated enough that one truly has to concentrate and not miss ANY details whatsoever.

While I accomplished the whole thing with hours still to spare in the maintenance window, I told myself there just has to be an easier way to accomplish the FTI portion. I suppose I could script out the whole process from dropping of the original catalogs to the recreation at the new location in T-SQL. Now while that would indeed save me mountains of time during the window, just how long would it take to create those very scripts? Have I really saved myself work? I don’t think so with that option.

It occurred to me that this catalog location must of course be stored in one of the system tables somewhere. What if I simply ran an update against that row then repopulated the index? Or maybe there is some non-documented-super-duper-secret hidden system stored procedure I could fire off? Could it be that simple?

Turns out that it isn’t that simple after all.

Sudarshan offers a SQL Server 2005 solution toward the bottom of this forum post, which seems to indicate the magic "Keep Full Text Indexes" checkbox included in the detach dialog takes care of the issue very easily. There are a couple of gotchas according to Sudarshan, but his solution appears quick and elegant. Let it be noted I haven't actually tested it, because I currently work in a predominantly SQL Server 2000 environment. It looks viable for '05 and most likely '08, but proceed cautiously since I don't know this author personally.
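
For reference, here's roughly what that checkbox boils down to in T-SQL on 2005, per the documented sp_detach_db and CREATE DATABASE ... FOR ATTACH syntax. I haven't run this for the same reason I haven't tested Sudarshan's solution, and the database name and paths below are placeholders, so verify it in a sandbox first.

-- Detach, keeping the full-text index files with the database (SQL Server 2005)
EXEC sp_detach_db @dbname = N'MyDatabase',
                  @skipchecks = 'true',
                  @keepfulltextindexfile = 'true'

-- Move the mdf/ldf (and the catalog folders) to the new drive, then reattach
CREATE DATABASE MyDatabase
ON (FILENAME = N'N:\SQLData\MyDatabase.mdf'),
   (FILENAME = N'N:\SQLData\MyDatabase_log.ldf')
FOR ATTACH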

Microsoft offers up two support pages that address this very issue; here and here. As your typical MSDN pages do, these offer tremendous detail about the process; more than most humans need. They recommend not just a simple "point the catalogs somewhere else" solution, but registry changes as well. Now, while I've been around a bit and registry changes don't scare me as much as they used to, I still try to avoid them whenever possible. So many horror stories have surfaced over the years of even the slightest registry change wreaking total havoc across multiple systems that I'm ALWAYS going to be hesitant to make those changes manually. If that's what goes on behind the scenes when I complete a wizard or close a dialog box, fine, that I trust. If I have a choice that's slightly more laborious yet means I don't have to visit regedit, then I'm all for it.

While my "manual" solution appears to be the quickest methodology for a small number of databases, obviously one's going to have major issues when it comes to scaling this across 20, 30, or more databases simultaneously. Like so many other SQL Server topics, this comes down to an "it depends" answer once again. Take everything into account. How many databases? How long is your maintenance window? Your ability to produce similar scripts to run at a later time? Your tolerance for making direct registry changes? Weigh all these factors and I'm sure the solution you end up with will be much more appropriate for your situation.

I’m very interested in your feedback on this topic. Have I over-complicated a relatively simple issue? Or have I stumbled upon one of those timeless aspects of SQL Server that simply needs dealing with on a manual basis? How have you accomplished this in the past? I look forward to your responses.

Have a Grateful day…Troy

Posted in Full Text Indexing | Comments (0)

Shrinking SQL Server Transaction Logs in Simple Recovery Mode

January 12th, 2010

Most DBAs are aware that when a SQL Server database is set to the simple recovery model, its transaction log is truncated automatically upon a CHECKPOINT; the space inside the file is marked for reuse, but the file itself never shrinks on its own. This is all fair and good, but how many times have you viewed the folder containing your mdf's and ldf's and seen that many of the ldf's are countless times bigger than their minimum size?

In a production environment, there are generally so many active processes happening simultaneously that it's usually best to leave well enough alone. But it's a different story in a development, staging, or R&D environment. Some mornings I navigate to the 'mssql/data' folder on my dev server and see 90% of the ldf's taking up huge amounts of room. In some cases there is active development work going on, and thus the fluctuation in transaction log size is expected. But more often than not, 75% of those 90% haven't been touched for days or longer!

Now, I don't know about you, but drive space is always at a premium in our environment. A client requests a refresh of a development database from production and you barely have enough room on the drive to place the backup file, let alone keep the old one just in case. Yes, you can pick away at the contents of the drive and perhaps find an old file or two you could 86, but I usually go right for the jugular: shrink all the transaction logs and see if that frees up enough space for your needs.

I'll reiterate at the risk of redundancy: this is not something I would do in production. As stated earlier, though, in a development or staging environment there is no harm whatsoever in shrinking the transaction logs. I was using the tried and true method of right-clicking the database > 'All Tasks' > 'Shrink the Database' > 'Files' > choosing the log file from the dropdown > 'Ok' > 'Cancel'. That worked well enough, but was laborious on some of my dev servers, which literally have hundreds of databases on them. So I threw together a simple script that does this automatically for all databases on the server; all it takes is a single 'execute' and a minute or two, and you can regain all that potential space. The script is nothing mind-numbingly complicated, but nevertheless I wanted to share it with you all.

I ran this on one of my dev servers just this morning and freed up almost 10% of the drive's space in a matter of seconds. No digging through the file structure to examine ldf sizes. No manual shrinking. Just a couple of seconds, and all of a sudden I found myself with enough room to accomplish the refresh I needed to do. Absolutely perfect.

Would love any feedback or comments you may have.

In the meantime, be sure to have a grateful day…Troy


SET NOCOUNT ON;

DECLARE cur CURSOR FOR
SELECT [name]
FROM master.dbo.sysdatabases

DECLARE @db VARCHAR(50), @sql NVARCHAR(1000)

OPEN cur
FETCH NEXT FROM cur INTO @db
WHILE @@FETCH_STATUS <> -1
BEGIN
     SET @sql = N'DECLARE @logname VARCHAR(100) '
     SET @sql = @sql + 'USE [' + @db + ']
          SELECT @logname = RTRIM(LTRIM([name]))
          FROM sysfiles WHERE name LIKE ''%log%'' '
     SET @sql = @sql + 'DECLARE @sf NVARCHAR(300) '
     SET @sql = @sql + 'SET @sf = ''DBCC SHRINKFILE
          ('' + @logname + '') WITH NO_INFOMSGS '' '
     SET @sql = @sql + 'PRINT @sf EXEC SP_EXECUTESQL @sf '
     EXEC SP_EXECUTESQL @sql
     FETCH NEXT FROM cur INTO @db
END

CLOSE cur
DEALLOCATE cur
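
By the way, if you only need to hit a single database, the T-SQL underneath all that GUI clicking is just a couple of lines. The database and logical log file names below are placeholders; the SELECT shows you the real logical name to plug in.

USE MyDevDatabase

-- Find the log file's logical name
SELECT name FROM sysfiles WHERE name LIKE '%log%'

-- Shrink it (an optional second argument sets a target size in MB)
DBCC SHRINKFILE (MyDevDatabase_log) WITH NO_INFOMSGS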

Posted in Transaction Log | Comments (1)

LiteSpeed Compression Ratios

December 14th, 2009

My company uses Quest Software's LiteSpeed for SQL Server for both nightly and ad-hoc backups.  The DBAs here before me set everything up using a compression ratio of 1, and we get compression and backup times that are more than adequate for our needs.  I've often wondered exactly what effect raising that ratio in the LiteSpeed settings would have, both on how long the backup takes and on the compression itself.  What trade-off between size and speed would I have to make should I decide to push that setting up the scale?

Let me first make it VERY clear.  I am not an employee of Quest.  I have nothing whatsoever to gain or lose by this examination.  I happen to be an extremely big fan of LiteSpeed and have found it by far to be the best backup solution I've worked with during my short professional career.  I have tremendous respect for everyone from Quest I've had the pleasure to come in contact with; that includes Kevin Kline, Brent Ozar & others.  This is simply a non-scientific, ground-level look at LiteSpeed's compression ratios.  Please by all means visit Quest's website to get all the product details and info you need.   (http://www.quest.com/)  Take nothing I say here as gospel.  We clear, folks?  I don't wanna wake up to a cease-and-desist order from Quest tomorrow, ok?

The results graph first, then a discussion on my conclusions…
[Chart: LiteSpeed compression percentage and backup time relative to a native backup, by compression level]

The blue line and left y-axis represent actual compression for the ratios along the bottom.  The red line and right y-axis likewise represent the time taken vs. a native non-LiteSpeed backup.  So, for example, when I sampled compression level 3, the backup was compressed almost 81% and took just about half the time of a full native backup.

Ok, so what observations can I make from these results?  As far as compression goes, there is a significant jump from level one to two; that 6% could make a tremendous difference when extended across an entire company's worth of backups.  From there, compression does inch up, but never by more than another 1/4 or 1/2 percent at a time.

Looking at the red line (time taken for the backup to complete vs. a native one), we see the backup is even quicker than native all the way through level four.  Even five and six completed in about the same time as a native backup.  From there, though, total time ramps up quickly, especially at the highest compression ratio, which took over five times as long as a native backup.

What compression ratio is best for your situation?  Like Paul Randal is so fond of saying, "It depends."  This falls right in line with the dozens of other decisions we make as DBAs on a daily basis: a trade-off between factors.  Are we severely limited in backup drive space yet have plenty of time to accomplish the backup?  If so, I might choose level 7 or 8.  Is drive space never an issue, yet your evening backup window is?  Perhaps level 1 is the optimal choice for you.

I think 90% of us fall somewhere between those two extremes.  Yes, storage is cheap, but rarely unlimited.  While some of us may have multi-terabyte server farms to back up to, it's still irresponsible to waste space needlessly.  I wouldn't be awfully surprised, though, if many of us fall into the "not-enough-hours-in-a-night" camp where backup time is the more significant factor.  From my testing and actual experience, I've found a compression factor of two or three brings the right balance to a variety of environments.

I would love some feedback and/or some opposing viewpoints from anyone; particularly if someone from Quest happens upon this.

Have a Grateful day…Troy

Posted in Backups | Comments (4)
