
Batch Updates and Deletes with LINQ to SQL

A couple of weeks ago, I read the article LINQ to SQL Extension: Batch Deletion with Lambda Expression by Jeffrey Zhao.  In case you didn’t read it, the article discusses a downside of most O/R Mapping frameworks when it comes to multiple updates or deletes: a separate SQL statement is generated for each row flagged as updated/deleted in the entity set.  I went about implementing something similar to what Jeffrey envisioned, and I’ll explain some of the hurdles I had to overcome to achieve it.

Since I recently converted my site to blogengine.net, I’ll use its database schema for all the examples that follow.  In addition, all code snippets you’ll see were run in LINQPad, which at the moment is by far my favorite development tool (now if they’d only get intellisense in there, I’d play with this program all day long).  If I were to include a review of LINQPad it would be glowing, but I don’t want to consume this post with that.  In case you do not know what LINQPad is, for the purposes of this article it allows us to write LINQ to SQL statements along with any other arbitrary C# code (basically just a disclaimer that you will not see any DataContext creation code in my samples, since it is implicit in LINQPad).

Quick Links to All Articles

Batch Updates and Deletes with LINQ to SQL
LINQ to SQL Batch Updates/Deletes: Fix for ‘Could not translate expression’
I’ve Left Query Analyzer Hell For LINQPad Heaven

So what’s the problem with DeleteAllOnSubmit()?

As the title of this post implies, this is really only an issue when you want to update and/or delete rows from a table with the same ‘rules’ applied to every row in an entity set.  In Jeffrey’s article, he gave the example of deleting all rows whose creation date was older than a month.  This is a perfect example of an O/R Mapping framework issuing multiple delete statements when a single ‘simple’ SQL statement could be written.  So let’s say I had the following code:

var posts =
	from p in be_Posts
	where p.DateCreated
		.Value
		.CompareTo( DateTime.Today.AddMonths( -1 ) ) < 0
	select p;

be_Posts.DeleteAllOnSubmit( posts );
SubmitChanges();

Upon calling SubmitChanges(), the following SQL statements would be issued:

SELECT [t0].[PostID], [t0].[Title], [t0].[Description], 
	[t0].[PostContent], [t0].[DateCreated], [t0].[DateModified], 
	[t0].[Author], [t0].[IsPublished], [t0].[IsCommentEnabled], 
	[t0].[Raters], [t0].[Rating], [t0].[Slug]
FROM [be_Posts] AS [t0]
WHERE ([t0].[DateCreated]) < @p0
-- @p0: Input DateTime 
-- (Size = 0; Prec = 0; Scale = 0) [3/13/2008 12:00:00 AM]

DELETE FROM [be_Posts] WHERE [PostID] = @p0
-- @p0: Input UniqueIdentifier (Size = 0; Prec = 0; Scale = 0) [{GUID}]

DELETE FROM [be_Posts] WHERE [PostID] = @p0
-- @p0: Input UniqueIdentifier (Size = 0; Prec = 0; Scale = 0) [{GUID}] 

-- Individual DELETE statements for each row
-- in the IQueryable<be_Posts> variable.

So as you can see, to simply delete a list of be_Posts rows, LINQ to SQL first queries all the data.  The performance impact depends directly on what your rows contain (i.e. binary data, or PostContent in this case) and how many rows the LINQ query returned.  LINQ to SQL then iterates all the rows returned, flags them as deleted, and generates a DELETE statement for each one.  Each DELETE statement uses a simple WHERE clause on the row’s primary key (PostID in my case).  In addition to this overhead, Jeffrey did not like the need to call SubmitChanges() on the DataContext object to confirm database changes.  Now I wholeheartedly agree about combining the simple DELETE SQL statements into one, but I actually disagree with the desire to eliminate the call to SubmitChanges().  More on that later.

Okay, that’s a problem, how do we fix it?

In a LINQ to SQL utopian world, executing the C# code above would cause the provider to generate only a single DELETE statement, as follows:

DELETE FROM [be_Posts] WHERE [DateCreated] < @p0
-- @p0: Input DateTime 
-- (Size = 0; Prec = 0; Scale = 0) [3/13/2008 12:00:00 AM]

So how can we do this?  One of the shortcomings of Jeffrey’s code (self-admittedly) was that he was essentially generating the SQL statement himself with an ExpressionVisitor, taking the LINQ to SQL provider ‘out of the loop’.  Because of this, he did not support all the functionality that LINQ to SQL provides (i.e. post.Title.Length < 10).  One of the comments on his post suggested ‘wrapping’ an INNER JOIN around the SQL generated by the original IQueryable expression, and that is exactly what I was thinking (at least I keep telling myself that…nice suggestion, Richard).  The great thing about this is that now we only have to do a little SQL statement generation to append to whatever LINQ to SQL gives us.  This keeps all the ‘complicated’ stuff where it belongs: in Microsoft’s libraries.  So now Utopia would end up with a SQL statement like so:

DELETE [be_Posts]
FROM [be_Posts] AS j0 INNER JOIN (

	SELECT [t0].[PostID]
	FROM [be_Posts] AS [t0]
	WHERE ([t0].[DateCreated]) < @p0

) AS j1 ON (j0.[PostID] = j1.[PostID]) 

-- @p0: Input DateTime 
-- (Size = 0; Prec = 0; Scale = 0) [3/13/2008 12:00:00 AM]

To do this, I decided to create an extension method on Table<TEntity> with an IQueryable<TEntity> argument.  I did this to emulate the LINQ to SQL library’s Table<TEntity>.DeleteAllOnSubmit( IEnumerable<TEntity> entities ) method.  In addition to having a similar interface, it also allows you to build up the final IQueryable parameter through multiple query syntax queries.  Consider the following ‘trivial’ example.  I know this could be written as a single lambda expression, but I find it easier to read and maintain as individual queries.  Even if they were all combined into one query syntax query, to me that is more readable than ‘stuffing’ it all into one lambda parameter.

var posts = from p in be_Posts select p;

var oldPosts =
	from p in posts
	where p.DateCreated
		.Value
		.CompareTo( DateTime.Today.AddMonths( -1 ) ) < 0
	select p;

var myPosts =
	from p in oldPosts
	where p.Author == "Terry Aney"
	select p;

var first10 = myPosts.Take( 10 );

// Do something to first10...

Without drilling into the supporting methods quite yet, my extension method is essentially defined as follows:

public static int DeleteBatch<TEntity>(
	this Table<TEntity> table, IQueryable<TEntity> entities )
	where TEntity : class
{
	DbCommand delete = table.GetDeleteBatchCommand<TEntity>( entities );

	var parameters =
		from p in delete.Parameters.Cast<DbParameter>()
		select p.Value;

	return table.Context.ExecuteCommand(
		delete.CommandText,
		parameters.ToArray() );
}
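
Calling it is then a one-liner; for example, deleting the composed oldPosts query from the snippet above:

int rowsDeleted = be_Posts.DeleteBatch( oldPosts );

As for GetDeleteBatchCommand(), the full version is in the downloadable source; here is a minimal sketch of the idea, under the assumption that the table name and primary key column(s) come from the DataContext mapping (the real code also trims the sub query’s SELECT list down to the key columns and handles other details):

// A minimal sketch of GetDeleteBatchCommand; see the downloadable source
// for the real implementation.
public static DbCommand GetDeleteBatchCommand<TEntity>(
	this Table<TEntity> table, IQueryable<TEntity> entities )
	where TEntity : class
{
	// Let the LINQ to SQL provider generate the SELECT for the IQueryable.
	DbCommand command = table.Context.GetCommand( entities );

	// Resolve the table name and primary key column(s) from the mapping.
	var metaTable = table.Context.Mapping.GetTable( typeof( TEntity ) );
	string joinCondition = string.Join( " AND ",
		metaTable.RowType.IdentityMembers
			.Select( m => string.Format( "j0.[{0}] = j1.[{0}]", m.MappedName ) )
			.ToArray() );

	// Wrap the provider's SELECT in a DELETE ... INNER JOIN on the key(s).
	// Note: MetaTable.TableName can include a schema prefix; the real code
	// accounts for that before bracketing the name.
	command.CommandText = string.Format(
		"DELETE [{0}]\r\nFROM [{0}] AS j0 INNER JOIN (\r\n\r\n{1}\r\n\r\n) AS j1 ON ({2})",
		metaTable.TableName, command.CommandText, joinCondition );

	return command;
}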

So we’ve got DELETE whipped, what about UPDATE? (take one)

Admittedly, the DeleteBatch() extension method was pretty easy to implement.  Batch updating is where I encountered all of my dilemmas.  Following the same principle described above, my function signature looked like:

public static int UpdateBatch<TEntity>(
	this Table<TEntity> table,
	IQueryable<TEntity> entities,
	object evaluator ) where TEntity : class
{
}

Now you should notice the evaluator parameter of type object and begin to get suspicious.  The first dilemma to deal with was how powerful I was going to make this extension method.  I had seen Jeffrey’s example of passing a lambda expression as a parameter that defined how to update each row being processed, but admittedly my expression tree knowledge was zero.  Additionally, I naively thought that was an unnecessary feature for me because I figured all my batch updates would simply be setting column values to fixed values that could be determined up front.  So in my first implementation, evaluator was passed in as an anonymous type, and I used reflection on all its public properties to look for matching field names on the TEntity.  So, extending the ‘first10’ query syntax example above, my update might have looked like this:

be_Posts.UpdateBatch( first10, new { Author = "Chris Cavanagh" } );
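
Under the covers, that first version amounted to little more than the following (a rough sketch of the reflection matching, not the exact code):

// Rough sketch of the original reflection approach (since replaced):
// pair each public property on the anonymous object with the TEntity
// property of the same name; every matched value becomes a constant
// in a SET clause.
var assignments =
	from prop in evaluator.GetType().GetProperties()
	let entityProperty = typeof( TEntity ).GetProperty( prop.Name )
	where entityProperty != null
	select new
	{
		Column = entityProperty.Name,
		Value = prop.GetValue( evaluator, null )	// same value for every row
	};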

Now, the immediate downside you’ll notice to this approach, which basically contradicts the point of O/R Mapping frameworks, is that I get no intellisense or type checking on the Author property.  However, I accepted this because:

  1. As I mentioned, I had no expression tree knowledge/experience and this appeared to be the best I could do, and;
  2. I figured most of my ‘batch statements’ were going to be executed in LINQPad anyway, and since it didn’t have intellisense yet, I wasn’t really ‘losing’ anything 😉

On a bit of a side note (although it does have a purpose), at this point I was all proud of myself.  I had created the extension methods and was ready to write my first technical blog post (until this point my website was all personal).  So I wanted to make sure that Google was going to be able to find my post about my ‘great’ extensions once I finished writing about them.  However, upon submitting my sitemap to Google, there were errors in the feed.  It turned out that the ‘modified’ date on my posts (URLs) was null and returning an invalid date.  This is because I hand-populated blogengine.net’s database from my previous home-grown blog site and forgot to populate that field.  So I needed to simply update all my posts, setting ‘modified’ equal to ‘created’.  UpdateBatch, here I come…er, crap, I can’t update columns based on other columns.  Jeffrey’s vision was right, and I needed help.

All right, how hard can these expression trees really be?

I wouldn’t know, I’m still clueless.  Just kidding.  I understand them slightly now, but only because my final implementation is still fairly simple.  To get to that point, I enlisted the help of Chris Cavanagh, a co-worker, since he was worlds beyond me in knowledge of expression trees (actually, the same can probably be said of any technology…but hey, I have him beat in VBScript and Transact SQL!).  So the second iteration ended up looking like the following:

public static int UpdateBatch<TEntity>(
	this Table<TEntity> table,
	IQueryable<TEntity> entities,
	Expression<Func<TEntity, TEntity>> evaluator ) where TEntity : class
{
	DbCommand update = table.GetUpdateBatchCommand<TEntity>(
		entities,
		evaluator );

	var parameters =
		from p in update.Parameters.Cast<DbParameter>()
		select p.Value;

	return table.Context.ExecuteCommand(
		update.CommandText,
		parameters.ToArray() );
}

Given this, to perform the ‘modified’ equals ‘created’ update example from above, you could write the following (the Author assignment is for a subsequent sample):

var posts = from p in be_Posts select p;
be_Posts.UpdateBatch( 
	posts, 
	p => new be_Posts {
		DateModified = p.DateCreated,
		Author = "Chris Cavanagh" 
	} );

The lambda expression parameter contains an object initializer.  From this, we extract the field names and the values to assign to every row.  This expression is never invoked; it is used merely as a convenient syntax to describe the assignments we want and to convert them into the appropriate SQL SET statements.  Of course, we want the underlying LINQ provider to do all the SQL generation for us, and we accomplished that in the following steps (a simplified code sketch follows the footnote below):

  • Using an ExpressionVisitor, visit the single MemberInitExpression (i.e. new be_Posts { … }) and process each MemberAssignment in MemberInitExpression.Bindings.
  • If the binding uses the argument in the assignment (i.e. DateModified = p.DateCreated),
    • We create a MethodCallExpression which represents a ‘simple’ select of only the assignment part (i.e. p.DateCreated) of the expression
    • Given the MethodCallExpression, call IQueryProvider.CreateQuery() to convert it into an IQueryable object.
    • Given the IQueryable object, call DataContext.GetCommand(), therefore letting the LINQ Provider generate SQL.
    • Given the resulting DbCommand.CommandText (i.e. SELECT DateCreated FROM be_Posts), parse out the required text to generate a SET statement (i.e. DateCreated).
    • Given the resulting DbCommand, take any DbParameter objects and merge them into the final update command.
  • If the binding does not use the argument in the assignment, but rather just predefined constant values (i.e. Author = "Chris Cavanagh")
    • Compile and invoke the assignment expression to obtain the constant value*
    • Add the constant value as a parameter to the final update command.
  • Given all the individual SET statements, combine them into one SQL UPDATE statement with appropriate parameter names.

* For compiling the expression and invoking it, the following code was used:

var constant =
	Expression
		.Lambda( assignment.Expression, null )
		.Compile()
		.DynamicInvoke();
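
And to make the first branch (assignments that reference the lambda parameter) concrete, here is a simplified sketch of turning a single MemberAssignment into provider-generated SQL.  The method name and shape here are illustrative; the real implementation lives in GetUpdateBatchCommand() in the downloadable source:

// Simplified sketch (illustrative names): turn one binding, e.g.
// DateModified = p.DateCreated, into provider-generated SQL by asking
// LINQ to SQL to compile a 'select' of just that sub-expression.
static DbCommand GetSetClauseCommand<TEntity>(
	Table<TEntity> table,
	MemberAssignment assignment,
	ParameterExpression p )	// the lambda parameter from the evaluator
	where TEntity : class
{
	// Build: table.Select( p => <assignment.Expression> )
	LambdaExpression selector = Expression.Lambda( assignment.Expression, p );
	Expression selectCall = Expression.Call(
		typeof( Queryable ), "Select",
		new[] { typeof( TEntity ), assignment.Expression.Type },
		Expression.Constant( table ), selector );

	// Let the provider generate the SQL, e.g.:
	//   SELECT [t0].[DateCreated] AS [value] FROM [be_Posts] AS [t0]
	// The text between SELECT and FROM becomes the SET fragment, and any
	// DbParameters get merged into the final update command.
	IQueryable selectQuery =
		( table as IQueryable ).Provider.CreateQuery( selectCall );
	return table.Context.GetCommand( selectQuery );
}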

So now, if we previewed the SQL statement that would be generated, we’d see something like:

UPDATE [be_Posts]
SET [DateModified] = [DateCreated], [Author] = @p0

FROM [be_Posts] AS j0 INNER JOIN (

	SELECT [t0].[PostID]
	FROM [be_Posts] AS [t0]

) AS j1 ON (j0.[PostID] = j1.[PostID])

-- @p0: Input NVarChar (Size = 14; Prec = 0; Scale = 0) [Chris Cavanagh]

So you only mentioned one dilemma, where are the others?

So there were a few issues that threw me for a loop when testing out my extension methods (thankfully, all of them were resolved).

  1. If I used the Take() function in my queries, the generated SQL didn’t seem to process it; I was missing the TOP N part of the SELECT statement.
  2. If I used the orderby keyword, the generated SQL was invalid.  Since it was wrapped in a sub query, it needed a TOP clause (easy enough to add TOP 100 PERCENT once I figured out #1).
  3. During my update command creation, I was adding parameters as @p{fieldName}{0-N} to avoid any @p{0-N} naming conflicts between the original INNER JOIN sub query and all the separate SET queries.  I was 100% positive that the name was correct both in the SQL SET statement I created and in the DbParameter I added to the final update command.  However, upon executing the command, I was getting an error saying that my parameter was not declared.
  4. What to do about SubmitChanges()?  I didn’t resolve this one and don’t even know if it is possible.  Like I said, I disagree with Jeffrey’s distaste for the additional call, the simple reason being that it now behaves differently than the rest of the ‘standard LINQ’ CRUD operations.  If you were to chain an UpdateBatch() or DeleteBatch() with a bunch of other CRUD actions, the standard ones would not apply until SubmitChanges() is called, yet my *Batch() methods apply immediately.  It just feels wrong to me, is about all I can say.  Although, if you expose stored procedures in your DataContext, executing them in the middle of standard CRUD operations behaves the same way (they are immediately applied regardless of any call to SubmitChanges()), so I guess I can ‘stomach’ the behavior of my *Batch() functions.  If anyone knows how to hook into the DataContext ChangeSet, let me know!

#1 and #2 above were both easy.  They were due to me incorrectly parsing the DbCommand.CommandText created from the original entities parameter (used to create the INNER JOIN sub query).  #1 was only slightly tricky because I was trying to be clever and change what the SELECT actually returned.  By default, it would return every field on the table, but as I mentioned, this could be a lot of data depending on the table’s columns.  So I instead wanted to return only the fields required to complete the INNER JOIN: the primary key fields.  When parsing, I mistakenly stripped off too much information, and my command was no longer aware of any TOP clause that may have existed.  You can see this in detail in the code.
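
To give a flavor of the fix (a hedged sketch with illustrative names, not the exact code from the source):

// Rebuild the provider's SELECT so any TOP (N) clause survives but only
// the primary key column remains. 'selectCommand' is the DbCommand
// returned by DataContext.GetCommand( entities ).
static string GetJoinSubQuery( DbCommand selectCommand, string keyColumn )
{
	string text = selectCommand.CommandText;

	// Everything between "SELECT " and the first column reference ("[t0]")
	// is the optional TOP clause, e.g. "TOP (10) ".
	int firstColumn = text.IndexOf( "[t0]" );
	string topClause = text.Substring( "SELECT ".Length,
		firstColumn - "SELECT ".Length );

	// Keep the FROM / WHERE / ORDER BY portion verbatim.  (When an ORDER BY
	// is present without a Take(), TOP 100 PERCENT has to be injected so
	// the sub query remains valid.)
	string fromClause = text.Substring( text.IndexOf( "\r\nFROM " ) );

	return "SELECT " + topClause + "[t0].[" + keyColumn + "]" + fromClause;
}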

So the tricky one that took me a while to figure out was #3.  I just couldn’t see why it wasn’t working.  I mistakenly sent a forum post over to LINQPad trying to ‘blame’ LINQPad for some weird behavior.  Shame on me.  Sorry about that, Joseph (creator of LINQPad), I should never have doubted you!  It only got to that point because when I previewed my CommandText, it looked something like this (notice the @pAuthor0 parameter name in both the SET clause and the trailing parameter comment):

UPDATE [be_Posts]
SET [DateModified] = [DateCreated], [Author] = @pAuthor0

FROM [be_Posts] AS j0 INNER JOIN (

	SELECT [t0].[PostID]
	FROM [be_Posts] AS [t0]

) AS j1 ON (j0.[PostID] = j1.[PostID])

-- @pAuthor0: Input NVarChar (Size = 14; Prec = 0; Scale = 0) [Chris Cavanagh]

The above output came from me ‘copying’ the System.Data.Linq.SqlClient.SqlProvider.LogCommand method.  Looking at this, everything looked good.  I could even take it to SQL Management Studio and run it (obviously converting the -- @pAuthor0 comment into valid DECLARE/SET statements), and everything worked.

Well, the problem ended up being that when I output this log, I was doing it before calling the DataContext’s public int ExecuteCommand(string command, params object[] parameters).  There is a subtle thing going on when you call ExecuteCommand that I missed entirely: the LINQ provider creates a new DbCommand and simply loops over every value in the parameters array, creating a new DbParameter named @p{0-N}.  Since I was only passing it an array of objects, it couldn’t possibly know what I had named the parameters in my original update command.  So to fix this, I just had to massage my query construction to use only the @p{0-N} syntax, which was only slightly more tedious than the previously mentioned ‘easy’ @p{fieldName}{0-N} syntax.
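
To illustrate (a contrived example, not code from the extensions; somePostId is a hypothetical value): the command text and the values array are matched purely by position, so only @p{0-N} style names ever line up:

// ExecuteCommand builds fresh DbParameters named @p0..@pN from the values
// array, one per value, in order. Any other name in the command text
// (like @pAuthor0) will never be bound.
ExecuteCommand(
	"UPDATE be_Posts SET Author = @p0 WHERE PostID = @p1",
	"Chris Cavanagh", somePostId );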

So how powerful is this bad boy?

So there you go.  Both UPDATE and DELETE operations can now each be executed as a single SQL statement, and I believe both methods are as powerful as you’d ever need them to be; but if I’m missing something, let me know.  Let’s look at an exaggerated example of what can be done.

var posts = from p in be_Posts select p;

var oldPosts =
	from p in posts
	where p.DateCreated
		.Value
		.CompareTo( DateTime.Today.AddMonths( -1 ) ) < 0
	select p;

var myPosts =
	from p in oldPosts
	where p.Author == "Terry Aney"
	select p;

var first10 = 
	myPosts.OrderByDescending( o => o.DateModified ).Take( 10 );

var anonymousAuthor = new { Author = "Chris Cavanagh" };
var great = "great!";

be_Posts.UpdateBatch( 
	first10,
	p => new be_Posts {
		DateModified = p.DateCreated.Value.AddDays( 15 ),
		Author = anonymousAuthor.Author,
		Description = null,
		Title = "The " + p.Title + " was " + great,
		Rating = 500 * 500
	} );

Anyone care to guess the SQL it creates? 😉  Here it is in full Transact SQL glory (if there is such a thing):

UPDATE [be_Posts]
SET [DateModified] = 
	DATEADD(
		ms, 
		(CONVERT(BigInt,@p2 * 86400000)) % 86400000, 
		DATEADD(
			day, 
			(CONVERT(BigInt,@p2 * 86400000)) / 86400000, [DateCreated])), 
	[Author] = @p3, [Description] = null, 
	[Title] = ((@p4 + [Title]) + @p5) + @p6, [Rating] = @p7

FROM [be_Posts] AS j0 INNER JOIN (

	SELECT TOP (10) [t0].[PostID]
	FROM [be_Posts] AS [t0]
	WHERE ([t0].[Author] = @p0) AND (([t0].[DateCreated]) < @p1)
	ORDER BY [t0].[DateModified] DESC

) AS j1 ON (j0.[PostID] = j1.[PostID])

-- @p0: Input NVarChar (Size = 10; Prec = 0; Scale = 0) [Terry Aney]
-- @p1: Input DateTime (Size = 0; Prec = 0; Scale = 0) [3/14/2008 12:00:00 AM]
-- @p2: Input Float (Size = 0; Prec = 0; Scale = 0) [15]
-- @p3: Input NVarChar (Size = 14; Prec = 0; Scale = 0) [Chris Cavanagh]
-- @p4: Input NVarChar (Size = 4; Prec = 0; Scale = 0) [The ]
-- @p5: Input NVarChar (Size = 5; Prec = 0; Scale = 0) [ was ]
-- @p6: Input NVarChar (Size = 6; Prec = 0; Scale = 0) [great!]
-- @p7: Input Real (Size = 0; Prec = 0; Scale = 0) [250000]

Where can I get the source for this?

Download the source code and give it a go.  Let me know if you find any issues or have any suggestions.

Disclaimer: I’m not responsible for anything! 😉  I’ve given the code a good run-through and use it in my day-to-day ‘maintenance queries’ against production data, so I’m fairly confident everything is in working order.  However, when looking at the source, you’ll see some TODOs for me to polish up the documentation and/or find better ways of parsing the little bit of SQL text I need to process.  It currently behaves itself for my needs, but rest assured, as soon as I address those…actually, I should say, as soon as I can get Chris to explain in layman’s terms what is going on, I’ll update the documentation and the provided source code!


Categories: C#, Extension Methods, LINQ
  1. April 19, 2008 at 4:19 am

    Hey, wow. Great work with Expression trees and getting LINQ-to-SQL to do your bidding. I’ll be using this idea someday for sure.

  2. April 20, 2008 at 7:49 am

    Impressive! I think a lot of people will find this useful.

  3. June 8, 2008 at 6:39 pm

    This is cool. But I presume that since you are using DbCommand (practically bypassing the DataContext), we are losing the Object Tracking ability, aren’t we? Or does this rely on nobody using the context after SubmitChanges?

    I’ll take a deeper look into the code. Thanks for the invaluable effort!

  4. June 27, 2008 at 5:05 am

    Great Article to explore new features of .NET 3.5

  5. June 29, 2008 at 9:36 pm

    Sidar Ok – you are correct. You are going to lose object tracking using these methods, but that was an acceptable consequence (or benefit, as Zhao mentioned in his article).

    However, the use or exclusion of these extension methods should have no effect on the context. Imagine this as being similar to:

    int System.Data.Linq.DataContext.ExecuteCommand(string command, params object[] parameters)

    except you are getting strongly typed syntax via LINQ to SQL and lambda expressions.

  6. martin
    July 3, 2008 at 8:33 am

    How to use the DeleteBatch?

  7. July 27, 2008 at 7:00 am

    Hi Terry,

    I have also implemented a batching process, but for batching the linq queries, not the updates and deletes. You can take a look at how I did it at my blog:
    http://tonesdotnetblog.wordpress.com/2008/07/23/linq-to-sql-batches-and-multiple-results-without-stored-procedures-by-tony-wright/

    Regards,
    Tony

  8. September 12, 2008 at 3:38 am

    Great article to explore new features of .NET 3.5! Really good item! 🙂

  9. Rico Alexander
    September 15, 2008 at 6:14 pm

    Seems like the main reason you would use something like this vs. a stored procedure is if you were looking to keep change tracking. If you want to delete items, you want them to disappear from the UI automatically.

  10. October 21, 2008 at 1:52 am

    martin – Guess I don’t know how to explain it any better than I did in the post :S

    Tony – Thanks for the link. You did great work as well. I’ll probably implement similar extension methods in my ‘LINQ toolkit’.

    Rico – I attempted to outline the main reasons in the post, but to state it as ‘briefly’ as possible: I wanted to be able to delete/update 100’s/1000’s of rows via LINQ to SQL without requiring 100’s/1000’s of individual delete/update statements. Stored procedures were probably the suggested way to go, but then you lose the ability to generate dynamic filtering criteria along with strongly typed code. So that is why I created these extensions.

    "Seems like the main reason you would use something like this vs. a stored procedure is if you were looking to keep change tracking."

    And you might have misunderstood, so I want to clarify. Just like a stored procedure (because that is essentially what I am ‘creating/executing’ on the fly), you will not have change tracking with my extension methods.

  11. November 1, 2008 at 10:23 am

    Great article to explore new features of .NET 3.5! Really good item! Very interesting! Cheers

  12. December 6, 2008 at 7:35 pm

    I’ve put a Quick Links summary at the top of the post that will stay updated with any new posts semi related to these LINQ to SQL extension methods. Please see "I’ve Left Query Analyzer Hell for LINQPad Heaven" for notes on many of the features suggested in the comments.

  13. Ian Kulmatycki
    January 2, 2009 at 4:46 am

    Hi,

    This snippet has been of great use to me. Thanks a lot. I noticed an issue when doing batch updates against LINQ entities with mapped column names that differ from the actual property name on the entity. I’m not sure if this is something I am doing weird/wrong to get the behavior. If it is, let me know; if not, I have slightly updated the GetDbStatement<TEntity>(…) method:

    // My update (by update I mean possible hack!)
    var colAttrs = binding.Member.GetCustomAttributes(
    	typeof( System.Data.Linq.Mapping.ColumnAttribute ), true );

    string name = null;
    if ( colAttrs == null || colAttrs.Length == 0 )
    {
    	name = binding.Member.Name;
    }
    else
    {
    	name =
    		( from attr in colAttrs.OfType<System.Data.Linq.Mapping.ColumnAttribute>()
    		  select attr.Name ).Single();
    }
    // My update (by update I mean possible hack!)

    var dbCol = ( from c in dbCols
    			  where c.MappedName == name
    			  select c ).FirstOrDefault();

    Hopefully a) this helps someone, or b) someone is able to determine what I am doing wrong.

    Thanks again,
    ian

  14. January 3, 2009 at 12:16 am

    Ian – you should get the latest code. I no longer have a function called GetDbStatement and I did make a mapping fix but it is not exactly like yours. So I would be curious to know if the fix resolves your issue or if you still need your code from above. Let me know.

  15. January 15, 2009 at 10:45 am

    Hi,

    Just a Quick question before the real problem 🙂
    What plugin are you using in the blogengine to post code ?

    So … updating in a batch … NICE. But what if you have some external sources for updating the DataContext? How could I then batch the updates, if that’s possible? I guess it’s faster grouping 1000 updates in each batch, or have I misunderstood something?

    best regards
    Mikael Syska

  16. January 16, 2009 at 2:26 am

    Hi Mikael,

    First, I use the plugin from: http://lvildosola.blogspot.com/2007/02/code-snippet-plugin-for-windows-live.html

    Can you elaborate a bit on ‘if you have some external sources for updating the DataContext’, I’m not sure I’m following your situation?

  17. January 16, 2009 at 2:28 am

    Hi,

    I do a lot of updating with files I download from an http server, so I only compare them and change the DataContext … so the multiple batch update would not work for me, if I have understood it right, since I don’t have a table that contains the updated values as you do in the examples.

    However … last night I came across the "MERGE" keyword in sql and will look into that today.

    So now I just need a simple way to find all the rows in the table that have a changed value and build that SQL statement.

    mvh
    Mikael Syska

  18. January 16, 2009 at 2:29 am

    You are right. My tool isn’t really the same thing. I’m sure there are samples out there that walk a DataContext/Table looking for rows that have been modified. You could probably then generate a SQL update for each row, put them in one huge semicolon-delimited SQL string, and issue it to the DB?

    Not 100% sure if this is what you mean, but if you have questions, let me know and I’ll see if I can help.

  19. January 16, 2009 at 2:30 am

    I’m using the DataContext to track changes … because it’s made out of the box with LINQ to SQL, but I’m working on a way to speed things up when I make lots of changes.

    Using MERGE to update …. and SQLBulkCopy class to do the inserts.

    Doing something like: DB.GetChangeSet().Updates.Where(a => a.GetType() == typeof(Syska.TravianMap.Tables.Village))

    Maybe I will make a blog post about it … because I really like LINQ to SQL and that it keeps track of changes … but the speed of committing the changes is awful.

    mvh
    Mikael Syska

  20. January 16, 2009 at 2:31 am

    If you do post up a blog, make sure to let me (and this post) know. I’d love to read it.

  21. April 11, 2009 at 8:40 pm

    Hi,

    I downloaded your code, and since I’m working with the Entity Framework, I was thinking about porting this to it. But I have far too little knowledge of LINQ & Entity Framework to port this.
    Would it be difficult to port this to the Entity Framework?
    I’m preferring Entity Framework over LINQ to SQL because of its support for many-to-many relationships.

  22. April 12, 2009 at 9:59 pm

    Simon,

    A couple of things…LINQ to SQL supports many-to-many relationships…just not represented in exactly the same fashion as Entity Framework.

    Secondly, L2S works great for what our projects need, and we don’t anticipate moving to EF any time soon. At this time, I don’t really have any plans to make similar helpers for EF since I don’t use it.

    I’m not very well versed in EF, but from what I’ve glanced at, it looks like it has something called ‘Entity SQL’, which isn’t quite the same set of ‘objects’ I was working with when converting IQueryable objects into ‘batch delete’ statements.

    If you have a go at it, I could help you with deciphering my code, so let me know.

  23. May 6, 2009 at 8:04 am

    Hi Terry,

    Any resolution on deferring the deletes/updates until a call to SubmitChanges()? This really helps if you are doing a number of operations and want to commit them all while preserving atomicity.

    Thanks,

  24. May 6, 2009 at 4:44 pm

    @Syed – Unfortunately, I haven’t. Not sure, but it might be possible by deriving from the DataContext class, adding methods there that cache the update/delete statements until SubmitChanges() is called, and executing the SQL then. The source should have all the code you need to look at possibly doing that.

    To be honest, I haven’t really had a ‘need’ for that yet, but when the need arises, maybe I’ll take a look at deriving from DataContext (I haven’t even looked to see if you can hook into SubmitChanges() or not).

    If you do anything, be sure to let us know.

  25. Fred Hanson
    July 9, 2009 at 3:24 am

    Re: SubmitChanges, why not use a Transaction?
    The code won’t be as elegant, but I think it would make atomicity *possible*, right?

  26. July 9, 2009 at 3:27 am

    Not as elegant, but yes, it would 🙂
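
    Something like this sketch would do it: the immediately executed *Batch() statements and the deferred SubmitChanges() work then commit or roll back together, since the DataContext enlists in the ambient transaction:

    // Requires a reference to System.Transactions.
    using ( var scope = new TransactionScope() )
    {
    	be_Posts.DeleteBatch( oldPosts );	// executes immediately
    	// ...standard tracked inserts/updates...
    	SubmitChanges();					// deferred work applies here
    	scope.Complete();					// nothing commits until here
    }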

  27. Remi
    August 27, 2009 at 9:24 pm

    Hello,
    Great code, it really helped me, but I have a problem with UpdateBatch + boolean. When I do:

    db.ReservationsHotels.UpdateBatch( resas, r => new LinqAccess.ReservationsHotel()
    {
    	ArrivalDate = ( hasToModifyArrivalDate ) ? newArrivalDate : r.ArrivalDate,
    	IsDeleted = ( hasToModifyIsDeleted ) ? newIsDeleted : r.IsDeleted
    } );

    the arrival date works but IsDeleted doesn’t:
    Incorrect syntax near the keyword ‘AS’.
    The SQL query generated: UPDATE [dbo].[ReservationsHotels]\r\nSET [ArrivalDate] = \r\n (CASE \r\n WHEN @p2 = 1 THEN @p3\r\n ELSE [ArrivalDate]\r\n END),[IsDeleted] = NULL AS [EMPTY]\r\n\r\nFROM [dbo].[ReservationsHotels] AS j0 INNER JOIN …
    Can you help me?

  28. August 30, 2009 at 12:50 am

    @Remi – what if you put

    IsDeleted = (hasToModifyIsDeleted) ? newIsDeleted : r.IsDeleted.Value

    Does that help? I’m guessing that IsDeleted is a bool? type.

  29. Remi
    September 1, 2009 at 5:45 pm

    Thank you for your answer, Terry. Unfortunately, IsDeleted is not nullable, so I can’t apply r.IsDeleted.Value.
    It seems that the problem occurs in the GetDbSetAssignment function, at the line:

    var selectQuery = ( table as IQueryable ).Provider.CreateQuery( selectExpression );

    selectQuery has the good value with ArrivalDate:
    {SELECT
    (CASE
    WHEN @p0 = 1 THEN @p1
    ELSE [t0].[ArrivalDate]
    END) AS [value]
    FROM [dbo].[ReservationsHotels] AS [t0]
    }

    but not with IsDeleted :
    {SELECT NULL AS [EMPTY]
    FROM [dbo].[ReservationsHotels] AS [t0]
    }

    Very strange.

    Do you have another idea ?
    Thanks

  30. September 2, 2009 at 12:55 am

    I’ve updated my code to throw an exception. You were included on an email I sent to various MS folks. The problem is that the ternary conditional syntax, combined with the fact that it is a boolean field, is not directly translatable to Transact SQL by LINQ to SQL; therefore, it cannot be used ‘as is’ inside my UpdateBatch processing.

    I’ve asked the experts if they have any suggested work arounds. Cross your fingers that we get a response because I couldn’t find anything via Google.

  31. September 26, 2009 at 6:57 pm

    Thanks for compiling a great set of LINQ extensions. I am interested in modifying some of your ideas here to get asynchronous methods working.

  32. November 30, 2009 at 5:43 pm

    re: Edwin – No license issues. Use as you please.

  33. Pratik Gaikwad
    February 13, 2015 at 11:26 pm

    Hi Terry… Nice article and very much wonderful code… Quick question, though… Wouldn’t batch update create a SQL injection issue, since you are creating a string command and then executing it?

    • February 14, 2015 at 1:17 pm

      No, as I stated above in the post, it creates parameters to pass to the SQL statement, so there should be no SQL injection issues. Hope the code works out for you.

      • Pratik Gaikwad
        February 14, 2015 at 10:38 pm

        Do you have code for InsertBatch as well?

  34. February 15, 2015 at 9:02 am

    No. There is no real way to do a batch insert that I know of that is any better than LINQ to SQL’s built-in mechanism.

  35. Tomas P.
    August 6, 2015 at 2:43 am

    Hi. I found your batch delete very useful, but I’ve run into trouble with the batch update. I have a rather large table (48 columns); some columns allow null values and some don’t. I’m reading my table 10,000 rows at a time. I put the result in var posts and then I call UpdateBatch: dbContext.T_AllParams.UpdateBatch(posts, p => new T_AllParam { Copy = 2 }); It always crashes on row 1042 in LinqToSqlExtensions.cs with the error message that Length cannot be negative. endSelect is -1 since it is trying to find the index of "[t" in a select query that for some reason is "SELECT NULL AS [EMPTY]". I have no idea what I’m doing wrong or how I should get the batch update to work.

  36. August 6, 2015 at 8:25 am

    Tomas P – Can you email me your DB schema so I can try to reproduce and debug the problem?

    • Tomas P.
      August 12, 2015 at 6:23 am

      I can mail you scripts that recreates the database. I tried to find your mail but could not see it anywhere.

  37. August 12, 2015 at 7:42 am

    Use the contact form/page, and then I’ll email you back.

  38. September 30, 2015 at 2:01 pm

    Thanks for the update method. I had already wired up my own insert and delete but was looking for a full-featured update. If you are interested, here is an insert that is better than the stock LINQ to SQL insert. It might have some bugs and needs to be refactored to fit your structure. I’ve been using it in production for several years.

    public static void BatchInsert<TEntity>( this Table<TEntity> table, IEnumerable<TEntity> entities )
    	where TEntity : class
    {
    	var conn = table.Context.Connection;
    	var originalState = conn.State;

    	switch ( originalState )
    	{
    		case ConnectionState.Broken:
    		case ConnectionState.Closed:
    		case ConnectionState.Open:
    			try
    			{
    				if ( conn.State == ConnectionState.Broken )
    					conn.Close();

    				if ( conn.State == ConnectionState.Closed )
    					conn.Open();

    				using ( var bulk = new SqlBulkCopy( (SqlConnection)conn ) )
    				{
    					using ( var reader = new LinqBulkCopyReader<TEntity>( entities ) )
    					{
    						//foreach (var column in reader.ColumnMappingList)
    						//	bulk.ColumnMappings.Add(column.Key, column.Value);
    						bulk.DestinationTableName = reader.TableName;
    						bulk.WriteToServer( reader );
    					}
    				}
    			}
    			finally
    			{
    				if ( originalState != ConnectionState.Open )
    					conn.Close();
    			}
    			break;
    		default:
    			throw new Exception( "DB Connection is not in a usable state for bulk copy." );
    	}
    }

    private sealed class LinqBulkCopyReader<TEntity> : IDataReader where TEntity : class
    {
    	// GetAttribute<T>() and ReflectionUtils are helper members from the
    	// commenter's own codebase.
    	public LinqBulkCopyReader( IEnumerable<TEntity> entities )
    	{
    		_tableName = typeof( TEntity ).GetAttribute<TableAttribute>().Name;

    		_ordinalToColumnMap = ReflectionUtils.GetFields( typeof( TEntity ) )
    			.OfType<PropertyDescriptor>()
    			.Select( pd => new { pd, att = pd.GetAttribute<ColumnAttribute>() } )
    			.Where( d => d.att != null )
    			.Select( d => d.pd )
    			.ToList();

    		_nameToOrdinalMap = new Dictionary<string, int>();
    		for ( var index = 0; index < _ordinalToColumnMap.Count; index++ )
    		{
    			var pd = _ordinalToColumnMap[ index ];
    			var name = pd.GetAttribute<ColumnAttribute>().Name ?? pd.Name;
    			_nameToOrdinalMap[ name ] = index;
    		}

    		_enumerator = entities.GetEnumerator();
    	}

    	private readonly IEnumerator<TEntity> _enumerator;
    	private readonly Dictionary<string, int> _nameToOrdinalMap;
    	private readonly List<PropertyDescriptor> _ordinalToColumnMap;
    	private readonly string _tableName;

    	public void Close()
    	{
    		Dispose();
    	}

    	public void Dispose()
    	{
    		_enumerator.SafeDispose();
    	}

    	public string TableName
    	{
    		get { return _tableName; }
    	}

    	public bool Read()
    	{
    		return _enumerator.MoveNext();
    	}

    	public object GetValue( int i )
    	{
    		return _ordinalToColumnMap[ i ].GetValue( _enumerator.Current );
    	}

    	public int FieldCount
    	{
    		get { return _ordinalToColumnMap.Count; }
    	}

    	public int GetOrdinal( string name )
    	{
    		return _nameToOrdinalMap[ name ];
    	}

    	#region unimplemented methods
    	// empty methods derived classes may want to implement
    	public object this[ int i ] { get { throw new NotImplementedException(); } }
    	public int Depth { get { throw new NotImplementedException(); } }
    	public bool IsClosed { get { throw new NotImplementedException(); } }
    	public int RecordsAffected { get { throw new NotImplementedException(); } }
    	public DataTable GetSchemaTable() { throw new NotImplementedException(); }
    	public bool NextResult() { throw new NotImplementedException(); }
    	public object this[ string name ] { get { throw new NotImplementedException(); } }
    	public bool GetBoolean( int i ) { throw new NotImplementedException(); }
    	public byte GetByte( int i ) { throw new NotImplementedException(); }
    	public long GetBytes( int i, long fieldOffset, byte[] buffer, int bufferoffset, int length ) { throw new NotImplementedException(); }
    	public char GetChar( int i ) { throw new NotImplementedException(); }
    	public long GetChars( int i, long fieldoffset, char[] buffer, int bufferoffset, int length ) { throw new NotImplementedException(); }
    	public IDataReader GetData( int i ) { throw new NotImplementedException(); }
    	public string GetDataTypeName( int i ) { throw new NotImplementedException(); }
    	public DateTime GetDateTime( int i ) { throw new NotImplementedException(); }
    	public decimal GetDecimal( int i ) { throw new NotImplementedException(); }
    	public double GetDouble( int i ) { throw new NotImplementedException(); }
    	public Type GetFieldType( int i ) { throw new NotImplementedException(); }
    	public float GetFloat( int i ) { throw new NotImplementedException(); }
    	public Guid GetGuid( int i ) { throw new NotImplementedException(); }
    	public short GetInt16( int i ) { throw new NotImplementedException(); }
    	public int GetInt32( int i ) { throw new NotImplementedException(); }
    	public long GetInt64( int i ) { throw new NotImplementedException(); }
    	public string GetName( int i ) { throw new NotImplementedException(); }
    	public string GetString( int i ) { throw new NotImplementedException(); }
    	public int GetValues( object[] values ) { throw new NotImplementedException(); }
    	public bool IsDBNull( int i ) { throw new NotImplementedException(); }
    	#endregion
    }

  39. March 17, 2016 at 2:24 pm

    Do you approve if I quote a handful of your articles or blog posts, provided I supply credit and sources back to your web site? My site is in the same region of interest as yours, and my site visitors would surely benefit from some of the details you present right here. Please let me know if this is okay with you. Cheers!

    • March 17, 2016 at 8:08 pm

      Sure, no problem.

  40. Kris
    December 8, 2016 at 7:32 am

    RE: If anyone knows how to hook into the DataContext ChangeSet, let me know!

    I have actually done something like this … although I have made a DataContextEx and have actually replaced all of the SubmitChanges stuff myself…

    I have actually implemented bulk operations such as cloning between tables myself, that can be either executed instantly to get results back or when .SubmitChanges is called…

    You can even go:

    Using dbml As New Database.DBML
    	If dbml.PerformTransaction(
    		Sub()
    			Dim Clone = dbml.BulkOperations.AddClone(
    				dbml.Person,
    				dbml.TestPerson,
    				dbml.Person.Where(Function(x) x.FirstName.StartsWith("a")),
    				Function(x) New Database.TestPerson With {
    					.ID = (dbml.Format(Now, "yyyyMMddHHmmssff") & (dbml.NewId).ToString.Replace("-"c, "")).Substring(0, 32),
    					.LastName = "x" & x.LastName,
    					.EnteredBy = "Kris"
    				},
    				Function(x) x)
    			'apply the clone now (rather than waiting for the SubmitChanges):
    			Clone.Apply()
    			'Get Results
    			Dim results = Clone.Results
    			'change one of the items we just cloned 🙂
    			results.First.FirstName = "Test :)"
    			'submit those changes
    			dbml.SubmitChangesException()
    		End Sub, DBMLBase.DataContextEx.TransactionHandleOptions.ShowMessage) Then
    		'yay
    	Else
    		'nogo
    	End If
    End Using
