Search
Monday, October 22, 2018 ..:: Home ::.. Register  Login

Peter Henry on Facebook  Peter Henry on LinkedIn  Peter Henry on Twitter

   Calendar  
     
  
   Search  
     
  
   Blogroll  
     
  
   Disclosure  
All blog entries are the opinions of the author and do not necessarily reflect the opinions of their employer. All the code presented is for explanation and demonstration purposes only. Any damages incurred to your site and/or data are not the responsibility of the author. Every effort is taken to ensure the code properly compiles, however sometimes there are some hiccups and you might be required to do your own debugging.
     
  
   TechTidBits (Blog)  

How do you do an outer join with DataTables in .NET 2.0?

Sep 8

Written by:
Wednesday, September 08, 2010 8:07 PM  RssIcon

Outer joins in .NET 2.0How would you do an outer join using .NET 2.0?  If you're thinking EASY, LINQ or just query the db again, you're not really reading the question.  Read on, don't worry, it's not really a trick question.

Outer joins in .NET 2.0Last week, I had to join two DataTables together using an outer join.  Seems easy right?  And if you're thinking, PIECE OF CAKE!!!!!!!!!!!!  Just requery the DB.  Or mabye you're thinking to yourself, just use LINQ.  HHHHHMMMMM I would LOVE to use those things but the problem is I'm tied to .NET 2.0 and unfortunately the data is already in an XML file and I'm not able to requery the db again.  DOH!  If you're shrugging your shoulders......now you know where I was last week!

Well, as my boss Shane would say, just use brute force.  Yup, after googling/binging around, I realized, besides requerying db, which I already mentioned wasn't feasible, I would have to do it myself.  Ok, no big deal right.....take a deep breath and check this out, it's not that bad.

I'm dumbing down the problem to it's utter essance, but the deal is I had to read two DataSets from different XML files (which I've blogged someting about last week) then perform an "a right outer join" on two of the tables.  In other words, I wanted all the rows from Table A (which has an ordering column), join it to Table B but making sure to include all the rows from the second Table B.  Ok, ok,TECHNICALLY this is considered in SQL, a RIGHT outer join, but if you flip the tables around you get a left outer join. 

Teams DataSet

The main issue here is, I have an ordering column in the first table on the left, none on the one on the right, BUT I want to ensure I keep all the orders consistent.  hhhhmmm "consistent" eh?  Ya, consistent....I took that to mean IF I have rows without data in the OrderID field (ie coming from the right outer join), I want them to be ordered LAST and on top of that, I want those ordered in alphabetical order.  Seems complicated.  No, not really, the trick was to realize the OrderID column is an integer (thanks Einstein, REALLY? LOL) and I can use Int32.MaxValue as a way to order all the "outer joined values" to the bottom, THEN order those by alphabetical order.  Here's the magic method.

private void OuterJoinButton_Click( object sender, EventArgs e )
    {
      #region checking to see if required data loaded or not
      if( hockeyDataSet.Tables["DraftOrders"] == null )
      {
        LoadDataByXML_Click( this, e );
      }
      #endregion

      //create a DataTable which is an exact schema copy of the left hand side you want to be part of the outer join
      //you COULD .Clone() the table then loop through each DataRow doing an ImportRow() but then again, why when you have .Copy()?
      DataTable DraftOrdersDataTable = hockeyDataSet.Tables["DraftOrders"].Copy();

      //loop through Teams now and add missing ones, make their DraftID really high to order last
      //use the DataTable.Select() to identify the missing data points/elements you want to find in order to add
      //these are going to be the "outer" portions of your left outer joins
      //if there are rows, well, then they aren't missing and you don't need to add them,
      //but if they are missing, then you DO need to add them
      //I'm adding the Int32.MaxValue to ensure these rows added in the "outer join" are sorted last on purpose
      //you'll have to be careful here if you're doing this with an auto-incrementing counter column and reloading it with the DataTable.Rows.Add() method
      foreach( DataRow row in hockeyDataSet.Tables["Teams"].Rows )
      {
        DataRow[] missingTeam = DraftOrdersDataTable.Select( string.Format( "{0} = '{1}'", DraftOrdersDataTable.Columns[1].ColumnName, row[0].ToString() ) );
        if( missingTeam.Length == 0 )
        {
          //OverflowException if you do it this way
          //DataRow rowToAdd = DraftOrdersDataTable.NewRow();
          //rowToAdd[0] = Int32.MaxValue;
          //rowToAdd[1] = row[0].ToString();

          object[] rowToAdd = new object[] { Int32.MaxValue, row[0].ToString() };
          DraftOrdersDataTable.Rows.Add( rowToAdd );
        }
      }

      //use this if you JUST want to sort by the first column, but then that leaves all of the right join data without any real logical sort order
      //DataRow[] sortedDraftPics = DraftOrdersDataTable.Select( "", DraftOrdersDataTable.Columns[0].ColumnName );
      //use this to sort the right join data to the end, then sort them alaphabetically, a bit more logical/predictable when viewing
      DataRow[] sortedDraftPics = DraftOrdersDataTable.Select( "", string.Format( "{0}, {1}", DraftOrdersDataTable.Columns[0].ColumnName, DraftOrdersDataTable.Columns[1].ColumnName ) );

      //we have all our bits in order, now we just have to create a DataTable to display
      //again, create another DataTable to house our now sorted teams, again using the Clone() to create the exact copy we want,
      //loop through eacy row and import them in, then set the DataSource
      DataTable sortedDraftPicksToDisplay = DraftOrdersDataTable.Clone();
      foreach( DataRow row in sortedDraftPics )
      {
        sortedDraftPicksToDisplay.ImportRow( row );
      }

      OuterJoinDataGridView.DataSource = sortedDraftPicksToDisplay;
    }

The meat of the comments are embedded as to what an why I'm doing things but notably,

  • start with a .Copy() of the left hand DataTable, this ensures you have the correct schema and data you require
  • looping through that DataTable, call .Select() and look for the other column name, if you have NO DataRows, you found a missing entry which requires entry into the copied DataTable (that's the outer join part, easy eh?)
  • create an object[] to add your values then add that to the DataTable, this ensures you don't conflict with the field Auto Incrementor logic
  • and lastly, you'll have to do a bit of magic to sort that new table by the OrderByID column, followed by the TeamName column, this ensures your proper ordering

If you try to uncomment out the code above with the NewRow() (comment out the other logic LOL), you'll get this exception, but only on the second row being added, not the first.  The deal there is, the DataTable schema definition says the OrderByID is auto incrementing, and therefore, on the first NewRow() it's Int32.MaxValue and all is cool, but when you goto add the second row.....BOOM Int32.MaxValue + 1 is....well......too big for an Int32.

Overflow Exception

Conclusions?  Sure!  Yes, this is inefficient by today's standards with Linq, lambdas, and all that other database manipulation stuff.  Worse off this is HORRIBLY inefficient with memory, can you just count how many new DataTables are being created?  This most certainly is NOT done "in place."  And if you're used to profiling/performance analysis, yup, this most certainly looks like it's O(n) (order n), which means it's bad, as the number of data rows increase, your performance will degrade linearly due to the "touching" of every datarow multiple times.  Yup, it's slow and it's inefficient.  But, it's working!!!!  And without requering the db again, and without Linq.  Not pretty, but it works!

Now that you know one way to do an outer join in .NET 2.0 DataTables, it's time to grab a coffee and get coding.

 

Resources
Source code: http://www.pchenry.com:8080/svn/blog/trunk/2010/ReadingFromResources

PCHenry: How do you load a DataSet at runtime?

Tags:
Categories:
Location: Blogs Parent Separator TechTidBits

Your name:
Gravatar Preview
Your email:
(Optional) Email used only to show Gravatar.
Your website:
Title:
Comment:
Add Comment   Cancel 
     
  
Copyright 1999-2012 by PCHenry.com   Terms Of Use  Privacy Statement