Search
Sunday, August 01, 2010 ..:: Home ::.. Register  Login
   Calendar  
     
  
   Search  
     
  
   Blogroll  
     
  
   Disclosure  
All blog entries are the opinions of the author and do not necessarily reflect the opinions of their employer. All the code presented is for explanation and demonstration purposes only. Any damages incurred to your site and/or data are not the responsibility of the author. Every effort is taken to ensure the code properly compiles, however sometimes there are some hiccups and you might be required to do your own debugging.
     
  
   TechTidBits (Blog)  
Jun 3

Written by: Peter Henry
Wednesday, June 03, 2009 8:35 PM

A long time ago I learnd about normalizing a db.  You know, no duplicates of data, ANYWHERE!  What's that called again, fifth normal form, right?

If you haven't heard of it, it's basically joins all over the place and at the added expense of potential confusion, you'll never have duplicated data anywhere (or so goes the rules).  You'll also most likely have to do lots of table joins to get any valuable/useful data on your web page, WinForms or reports (hopefully you're NOT using Crystal?  DOH! if you are haha).  The opposite of this is something like a data wharehouse, where the data changes very infrequently (or not at all) and speed of data retrieval is critical and therefore duplicating data is an acceptable tradeoff to data maintainability.

If you're still with me, you probably know about INNER JOINs, OUTER JOINs, left and right outer joins and those kinds of things (Cartesian products? hahah)!  You'll also (hopefully) know about UNION and UNION ALL and the differences between them (slight performance implications, good to have in your toolchest).  Pinal Dave has a great article talking about UNION and UNION ALL.  In there, he talks about the sql DISTINCT clause.

Pinal's article got me thinking about how it just irks me to see DISTINCTs in sql.  Ok, now I'm sure DM, SM and GJ might be trying to climb through their LCDs to get to me at this point (their VERY good DBAs) but let me explain.  From a developers standpoint, when I've seen a DISTINCT clause used, it's cause either someone's FORCED into it becaues of a badly designed schema, OR, and usually more likely, someone's just taking the easy way out!

This is sad, but I have to say, in my past experiences with consultants, business analysists, weak DBAs, jr devs, and ultra aggressive deadlines, I've seen WAY TOO MANY weak queries with the DISTINCT clause.  Why "weak?"  Chances are VERY high they're using the distinct in their select query cause they're whipping it up as they go along, testing it, tweaking it, testing it, refining it, tes....OH CRAP!!!??!?!?!  IT WORKS!!!!!!!!!! DON'T BREATHE, DON'T TOUCH IT!!!!!!!!!!!!!!!  It works, leave it alone, oooooohhhhhh we have duplicates?  Oh crap!  Oh, we can fix it with a DISTINCT!!!!!!  DOH!!!!!  THAT'S what irks me!  It's HIGHLY likely something else will break THAT sql/stored proc, report or function, trigger, or what ever sql it's in, it'll break, and it's break soon!!!!!!  And you'll be screwed in trying to find out WHY!!!!!!!  Every put a list of values into a listbox today only to find out in 6 months the web server's giving you an uncaught exception?  CLASSIC!!!!!!!!!!!!!!!

The chances are very good, if you took 15 more minutes, you could quickly see you're missing a PK in a join someplace, or maybe you have a table joining to itself but your joins are reversed (this actually happened to me today!).  Or maybe you're skipping an inner table that SHOULD be used to join two other tables together but you're picking the easy way out.  Don't!

So head my warning, please don't use a DISTINCT as the cheap/easy way out!  IF, and I mean ONLY IF, you have to use it (and I'll admit, ther are times you will have to use it, but they aren't many), then chances are you won't mind putting a few /*comments as to why you're using it*/ so the next DBA or dev who comes along won't try to pull it out, along with their hair!

THERE!  Now that's off my chest, it's time to grab a coffee and get back to coding!

 

 

Resources

Pinal Dave: SQL Server - Difference Between Union vs Union All - Optimal Performance Comparison

Copyright ©2009 Peter Henry

Tags:

4 comments so far...

Re: Are you a DISTINCT kind of person with your SQL queries?

Here, here! Although I almost think this shouldn't be publicised too much. Seeing a 'distinct' clause in query is a nice clue that the author probably doesn't know what they are doing and its worth looking closer at their query!

By Leon on   Tuesday, July 28, 2009 9:52 AM

Re: Are you a DISTINCT kind of person with your SQL queries?

Thank you Leon for the comment! I laughed when I read your comment about "shouldn't be publicised too much" haha At that risk, it is worthwhile getting the word out not to rely or use the DISTINCT clause by default. Too many people/places are complacent and let the DISTINCT slide by without further inquiries. But with your help, we'll try to eliminate it (where we can that is haha).

Have a good day!

By phenry on   Tuesday, July 28, 2009 9:55 AM

Re: Are you a DISTINCT kind of person with your SQL queries?

Amen, bro, amen; as true as now as it was a year ago. Even the legit use of distinct to create aggregates can be done as a group by, instead: that shows the Intent to Aggregate, gives same query plan as distinct, and provides the option of the powerful Having clause, if needed.

By John on   Tuesday, June 29, 2010 8:56 AM

Re: Are you a DISTINCT kind of person with your SQL queries?

re using GROUP BY instead of DISTINCT?
VERY interesting, I have to look into that one! Very cool, I hadn't thought of that, but ya you're right, that could be used. Nice. Thanks John for the comment and kind words! Have a good day!

By phenry on   Tuesday, June 29, 2010 9:00 AM

Your name:
Title:
Comment:
Add Comment    Cancel  
     
  
Copyright 1999-2010 by PCHenry.com   Terms Of Use  Privacy Statement