Which is Faster; the Chicken or the Egg?

December 2nd, 2010
by The Grateful DBA

Due to a company policy, all UPDATEs at my office eventually headed to prod, are required to be based solely off a primary key and must be written out one line at a time.  This pertains if the PKs are sequential or not.  Yesterday I received a script for approval that contained 14,000 individual UPDATE statements based off an identity key and comprising 100% of the table.  So imagine if you will (said in my best Rod Serling voice) 14,000 of these…

UPDATE Customer SET firstName = ‘Foo’ WHERE id = 1
UPDATE Customer SET firstName = ‘Foo’ WHERE id = 2
UPDATE Customer SET firstName = ‘Foo’ WHERE id = 3

It occurred to me that as long as we have sequential ID’s to update off of, then it had to be substantially quicker to do a single UPDATE using a BETWEEN clause. So, 14,000 of the above statements turn in to just one…

UPDATE Customer SET firstName = ‘Foo’ WHERE id BETWEEN 1 AND 14000

But just how much faster is the BETWEEN vs. individual UPDATEs? Turns out the answer is "way way way". Unfortunately, "way way way" isn’t extremely scientific. So the geek in me had to see exactly how much faster and wrote some scripts to test it out.

On a local SQL Server 2008 R2 installation (no network interference), I filled two local temp tables with random numbers between 1 & 50. I then simply updated this value to ’51′ both ways. First one row at a time based off the incrementing PK, and secondly all at once using BETWEEN. The difference even surprised me. At all levels I tested, it was over 99% faster, and the larger the sample size, the more significant the time savings. The results…

SAMPLE SIZE INDIV UPDATES AVG BETWEEN AVG RATIO PCT. FASTER
500 641.8 4.8 0.75% 99.25%
5000 6837.8 19.6 0.29% 99.71%
10000 15742.4 51.6 0.33% 99.67%
15000 28546.4 63.0 0.22% 99.78%
20000 63462.6 95.8 0.15% 99.85%

Ok, so what conclusions can we make from this data?

  1. Not only is it faster using the BETWEEN function, but it’s significantly faster. Over 99% faster in the sample sizes I used
  2. The time savings (percentage-wise) tends to increase as the number of UPDATEs increase as well. Good to know because this may be something to take into account when doing that pesky 1,000,000 row update; but not so much when just on a couple rows.
  3. The back-and-forth, back-and-forth, back-and-forth REALLY adds up over a larger sample size. For example; going from just 500 to 5000 UPDATEs, we’ve increased the sample size 10-fold, the BETWEEN was 4x as long, yet the individual UPDATEs took over 10x as long!
  4. We bloggers will write about anything SQL related no matter how obvious it may be if it offers even a glimmer of additional insight into our product.

Have a Grateful day…Troy

Bookmark this on Delicious
Bookmark this on Digg
Share on Facebook
Bookmark this on Google Bookmarks
Share on LinkedIn
Bookmark this on Yahoo Bookmark

Tags: , , , , ,
Posted in Performance Tuning, T-SQL | Comments (3)

3 Responses to “Which is Faster; the Chicken or the Egg?”

  1. Crystal Says:

    Very cool! (R)(R) all the way!

  2. Jason Bacani Says:

    Consider Indiv Updates Avg of 641.8 versus Between Avg of 4.8.

    Difference is 641.8 – 4.8 = 638.0

    Then 638.8 / 4.8 = 132.7… Meaning 132.7 times faster…

    That then is a 13170 Percent increase, or 13170 percent faster.

    Does this make sense?

    Or is my math wrong?

  3. Tweets that mention The Grateful DBA » Which is Faster; the Chicken or the Egg? -- Topsy.com Says:

    [...] This post was mentioned on Twitter by Jason E Bacani, Troy Gallant. Troy Gallant said: [New Blog Post] Which is Faster; the Chicken or the Egg? http://troygallant.com/?p=842 [...]

Leave a Reply

Get Adobe Flash playerPlugin by wpburn.com wordpress themes