Friday, June 05, 2009

Stored Procedures versus Ad-hoc SQL

Have you ever worked on a system where someone decreed that all database calls must be Stored Procedures, and ad-hoc SQL is strictly forbidden? I have and it leads to incredible development pain.

 

Let me first say that I have written many awesome 100+ line stored procedures. Stored procedures are definitely a good thing when writing complex queries. TSQL and Stored Procedures are fantastic. However, I think that the optimal decision is to use a mix of both ad-hoc sql and stored procedures rather than just stored procedures.

 

Reasons that are given for the above decree and why they are no longer true:

  • Security:
    • SQL injection: Resolved by using parameterized SQL which eliminate SQL injection possibility.
    • Granular security: If the app pool account is dbowner, then there is no additional security from applying security to each stored procedure (execute only privileges).

 

  • Performance:  (see performance testing section below for test results)
    • Execution plan Caching: In SQL Server 2005 and later, SQL server caches the execution plan based on tokenized versions of queries so performance of parameterized ad-hoc SQL is close to stored procedures.(see John Lam's comment below)

 

  • Maintenance
    • Changing Stored procedures doesn’t require recompiling code: With agile development and continuous integration, code is easy to change and deploy. Advantage of ad-hoc SQL: fewer lines of code overall because you don’t have to declare stored procedure signature. Fewer lines of code means risk of errors goes down, and maintenance cost goes down. Easier to refactor code with inline SQL because able to use "find and replace" in Visual Studio when renaming a column or changing a data type. Also, using a mix of stored procedures and ad-hoc SQL keeps the database cleaner. When using all stored procedures you quickly have 100s to keep track of and inevitably stored procedures get lost and a significant effort must be devoted to maintaining and auditing them all.
    • Faster coding, don't have to switch between sql server and Visual Studio. (Jeff Atwood also makes this point).
    • Better documentation: When TSQL code is in data layer, able to look at it in one place. (Jeff Atwood also makes this point)
    • Debugging faster, easier. Cumbersome to set breakpoints in stored procedures.

 

  • Transactions
    • It used to be possible to do transactions only in stored procedures and not in .NET. But now with the System.Transactions namespace it is easy. This is really great stuff. If you haven't used it, it is so awesome and clean.

 

Comments:

I am not anti-stored procedure. I have used them quite a bit and they are incredibly powerful. TSQL is just gorgeous when it comes to performance. When you have complex queries, they become good candidates for stored procedures because you can leverage temporary tables, common table expressions, etc.

Note ORMs like Entity Framework (the future of data access) uses parameterized sql, so if developers are willing to use those frameworks they are already using parameterized sql. For some reason developers are willing to use those frameworks but not inline SQL, even though they talk to the database in the same way.

 

 

What others are saying:

Jeff Atwood puts in eloquently in his blog post: who needs stored procedures anyway?.

 

And who can argue with the great John Lam? He says:

As a guy who dabbles in low-level bit twiddling stuff from time-to-time, the performance claims are quite interesting to me. The new (as of SQL Server 7.0) cached execution plan optimization in SQL Server looks to me a lot like JIT compilation. If this is, in fact, the case it seems to me that the only overhead that would be associated with dynamic SQL would be:

  1. The amount of bandwidth + time it takes to transmit the dynamic SQL text to the database.
  2. The amount of time it takes to calculate the hash of the dynamic SQL text to look up the cached execution plan.

I can imagine quite a few scenarios where the above overhead would disappear into the noise of the network roundtrip. What upsets me are the folks who spout forth anecdotal arguments that claim stored procedures have "much better" performance than dynamic SQL.

 

Conclusion

I cringe when I see an architectural decision which tries to make it easy for the server which is a $4000 resource over two years at the cost of making it harder for the developer which is a $400,000 resource (over two years).

A question from personal experience: I recently saw a 10 million dollar software project go over-budget and lose a ton of money. Why did the project fail? Using ad-hoc sql may not have saved the project but making similar productivity enhancing decisions like allowing ad-hoc SQL when it makes sense and is not a security or performance compromise, might just have tipped the project from red to black.

 

Performance Testing

I always have to test performance claims for myself. So I wrote a small console application to put the execution caching plan to the test. A method using a stored procedure was called 1000 times. A method using the same sql as paramaterized, ad-hoc SQL was called 1000 times. The times were about equal, actually they where slightly faster for the ad-hoc sql :).

Results (milliseconds):

clip_image001

clip_image002

clip_image003

static void Main(string[] args)

{

//--warm up

SelectContactStoredProc();

SelectContactParamaterizedSql();

//--

Stopwatch sw = new Stopwatch();

sw.Start();

for (int i = 1; i <= 1000; i++)

SelectContactStoredProc();

sw.Stop();

Console.WriteLine("Elapsed StoredProc:" + sw.ElapsedMilliseconds);

//--

sw.Reset();

sw.Start();

for (int i = 1; i <= 1000; i++)

SelectContactParamaterizedSql();

sw.Stop();

Console.WriteLine("Elapsed ParmSQL:" + sw.ElapsedMilliseconds);

Console.ReadLine();

}

public static void SelectContactParamaterizedSql()

{

DataGeneral.ParamBuilder pb = new DataGeneral.ParamBuilder();

pb.AddParam(SqlDbType.Int, "@contactGroupId", 11);

string sql = @"select ContactID,ContactGroupID,ContactTypeRefCode,FirstName,LastName,Prefix,Suffix,Email,Phone,PhoneExt,AltPhone,AltPhoneExt,MobilePhone,FaxPhone

from Contact where contactGroupId = @contactGroupId";

object ret = DataGeneral.GetScalar(sql, pb.Parameters, false);

}

public static void SelectContactStoredProc()

{

DataGeneral.ParamBuilder pb = new DataGeneral.ParamBuilder();

pb.AddParam(SqlDbType.Int, "@contactGroupId", 11);

object ret = DataGeneral.GetScalar("General_Contact_GetByContactGroupID", pb.Parameters, true);

}

1 comment:

Catto said...

Hey Joe,

Great post. Make a person think.

Thx 4 the info,

Catto