A long time ago I learnd about normalizing a db. You know, no duplicates of data, ANYWHERE! What's that called again, fifth normal form, right?
If you haven't heard of it, it's basically joins all over the place and at the added expense of potential confusion, you'll never have duplicated data anywhere (or so goes the rules). You'll also most likely have to do lots of table joins to get any valuable/useful data on your web page, WinForms or reports (hopefully you're NOT using Crystal? DOH! if you are haha). The opposite of this is something like a data wharehouse, where the data changes very infrequently (or not at all) and speed of data retrieval is critical and therefore duplicating data is an acceptable tradeoff to data maintainability.
If you're still with me, you probably know about INNER JOINs, OUTER JOINs, left and right outer joins and those kinds of things (Cartesian products? hahah)! You'll also (hopefully) know about UNION and UNION ALL and the differences between them (slight performance implications, good to have in your toolchest). Pinal Dave has a great article talking about UNION and UNION ALL. In there, he talks about the sql DISTINCT clause.
Pinal's article got me thinking about how it just irks me to see DISTINCTs in sql. Ok, now I'm sure DM, SM and GJ might be trying to climb through their LCDs to get to me at this point (their VERY good DBAs) but let me explain. From a developers standpoint, when I've seen a DISTINCT clause used, it's cause either someone's FORCED into it becaues of a badly designed schema, OR, and usually more likely, someone's just taking the easy way out!
This is sad, but I have to say, in my past experiences with consultants, business analysists, weak DBAs, jr devs, and ultra aggressive deadlines, I've seen WAY TOO MANY weak queries with the DISTINCT clause. Why "weak?" Chances are VERY high they're using the distinct in their select query cause they're whipping it up as they go along, testing it, tweaking it, testing it, refining it, tes....OH CRAP!!!??!?!?! IT WORKS!!!!!!!!!! DON'T BREATHE, DON'T TOUCH IT!!!!!!!!!!!!!!! It works, leave it alone, oooooohhhhhh we have duplicates? Oh crap! Oh, we can fix it with a DISTINCT!!!!!! DOH!!!!! THAT'S what irks me! It's HIGHLY likely something else will break THAT sql/stored proc, report or function, trigger, or what ever sql it's in, it'll break, and it's break soon!!!!!! And you'll be screwed in trying to find out WHY!!!!!!! Every put a list of values into a listbox today only to find out in 6 months the web server's giving you an uncaught exception? CLASSIC!!!!!!!!!!!!!!!
The chances are very good, if you took 15 more minutes, you could quickly see you're missing a PK in a join someplace, or maybe you have a table joining to itself but your joins are reversed (this actually happened to me today!). Or maybe you're skipping an inner table that SHOULD be used to join two other tables together but you're picking the easy way out. Don't!
So head my warning, please don't use a DISTINCT as the cheap/easy way out! IF, and I mean ONLY IF, you have to use it (and I'll admit, ther are times you will have to use it, but they aren't many), then chances are you won't mind putting a few /*comments as to why you're using it*/ so the next DBA or dev who comes along won't try to pull it out, along with their hair!
THERE! Now that's off my chest, it's time to grab a coffee and get back to coding!
Resources
Pinal Dave: SQL Server - Difference Between Union vs Union All - Optimal Performance Comparison