
Stored Procedures versus Ad-hoc SQL

Have you ever worked on a system where someone decreed that all database calls must go through stored procedures, and that ad-hoc SQL is strictly forbidden? I have, and it leads to incredible development pain.

 

Let me first say that I have written many awesome 100+ line stored procedures. Stored procedures are definitely a good thing when writing complex queries, and TSQL is fantastic. However, I think the optimal decision is to use a mix of ad-hoc SQL and stored procedures rather than stored procedures alone.

 

Here are the reasons usually given for that decree, and why they no longer hold:

  • Security:
    • SQL injection: Resolved by using parameterized SQL, which eliminates the possibility of SQL injection (see the sketch below).
    • Granular security: If the app pool account is db_owner, then granting execute-only privileges on each stored procedure adds no real security anyway.
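To make the SQL injection point concrete, here is a minimal sketch of parameterized ad-hoc SQL (the connection string, table, and method name are hypothetical). The user input is bound as a value, never concatenated into the command text, so input like "x'; drop table Contact; --" arrives at the server as a literal string, not as SQL:

using System;
using System.Data;
using System.Data.SqlClient;

public static DataTable FindContactsByLastName(string connStr, string userInput)
{
    using (SqlConnection conn = new SqlConnection(connStr))
    using (SqlCommand cmd = new SqlCommand(
        "select ContactID, FirstName, LastName from Contact where LastName = @lastName", conn))
    {
        // userInput is sent as parameter data, so it can never be executed as SQL
        cmd.Parameters.Add("@lastName", SqlDbType.NVarChar, 100).Value = userInput;

        DataTable table = new DataTable();
        new SqlDataAdapter(cmd).Fill(table); // Fill opens and closes the connection itself
        return table;
    }
}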

 

  • Performance: (see the performance testing section below for test results)
    • Execution plan caching: In SQL Server 2005 and later, SQL Server caches execution plans keyed on tokenized versions of the query text, so parameterized ad-hoc SQL performs nearly as well as stored procedures (see John Lam's comment below, and the sketch that follows).
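If you want to verify the plan caching claim yourself, here is a rough sketch under stated assumptions (a Contact table and connection string as in my test code below; reading the plan-cache DMVs also requires VIEW SERVER STATE permission). It runs the identical parameterized text twice and then checks how often the cached plan was used:

using System;
using System.Data.SqlClient;

public static void ShowPlanReuse(string connStr)
{
    const string query =
        "select count(*) from Contact where ContactGroupID = @contactGroupId";

    using (SqlConnection conn = new SqlConnection(connStr))
    {
        conn.Open();

        // Execute the identical parameterized text twice.
        for (int i = 0; i < 2; i++)
        {
            using (SqlCommand cmd = new SqlCommand(query, conn))
            {
                cmd.Parameters.AddWithValue("@contactGroupId", 11);
                cmd.ExecuteScalar();
            }
        }

        // usecounts > 1 means SQL Server reused the cached plan for the
        // parameterized text instead of compiling it again.
        const string dmv = @"
select cp.usecounts
from sys.dm_exec_cached_plans cp
cross apply sys.dm_exec_sql_text(cp.plan_handle) st
where st.text like '%ContactGroupID = @contactGroupId%'
  and st.text not like '%dm_exec_cached_plans%'";

        using (SqlCommand cmd = new SqlCommand(dmv, conn))
        using (SqlDataReader reader = cmd.ExecuteReader())
        {
            while (reader.Read())
                Console.WriteLine("usecounts: " + reader.GetInt32(0));
        }
    }
}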

 

  • Maintenance
    • Changing a stored procedure doesn't require recompiling code: true, but with agile development and continuous integration, application code is also easy to change and deploy, so this advantage has largely evaporated.
    • Fewer lines of code: ad-hoc SQL saves you from declaring a stored procedure signature for every query. Fewer lines of code means a lower risk of errors and a lower maintenance cost.
    • Easier refactoring: with inline SQL you can use "find and replace" in Visual Studio when renaming a column or changing a data type.
    • A cleaner database: a mix of stored procedures and ad-hoc SQL keeps the database manageable. When everything is a stored procedure, you quickly have hundreds to keep track of; inevitably some get lost, and significant effort must be devoted to maintaining and auditing them all.
    • Faster coding: no constant switching between SQL Server and Visual Studio (Jeff Atwood also makes this point).
    • Better documentation: when the TSQL code lives in the data layer, you can read it all in one place (Jeff Atwood also makes this point).
    • Faster, easier debugging: setting breakpoints in stored procedures is cumbersome.

 

  • Transactions
    • It used to be that transactions were only practical inside stored procedures, not in .NET. Now, with the System.Transactions namespace, they are easy. This is really great stuff; if you haven't used TransactionScope, it is wonderfully clean (see the sketch below).
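As a quick illustration, here is a minimal TransactionScope sketch (connection string, table, and method name are hypothetical; it requires a reference to the System.Transactions assembly). Any connection opened inside the scope enlists in the ambient transaction automatically, and if Complete() is never called, disposing the scope rolls everything back:

using System;
using System.Data.SqlClient;
using System.Transactions;

public static void MoveContactToGroup(string connStr, int contactId, int targetGroupId)
{
    using (TransactionScope scope = new TransactionScope())
    {
        using (SqlConnection conn = new SqlConnection(connStr))
        {
            conn.Open(); // enlists in the ambient transaction automatically

            SqlCommand cmd = new SqlCommand(
                "update Contact set ContactGroupID = @groupId where ContactID = @contactId", conn);
            cmd.Parameters.AddWithValue("@groupId", targetGroupId);
            cmd.Parameters.AddWithValue("@contactId", contactId);
            cmd.ExecuteNonQuery();

            // ...any additional commands here run in the same transaction...
        }

        scope.Complete(); // commit; if an exception skips this line, everything rolls back
    }
}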

 

Comments:

I am not anti-stored procedure. I have used them quite a bit and they are incredibly powerful. TSQL is just gorgeous when it comes to performance. When you have complex queries, they become good candidates for stored procedures because you can leverage temporary tables, common table expressions, etc.

Note that ORMs like Entity Framework (the future of data access) use parameterized SQL under the hood, so developers who are willing to use those frameworks are already using parameterized SQL. Oddly, many developers will accept those frameworks but not inline SQL, even though both talk to the database in exactly the same way (see the sketch below).
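For example, a LINQ query like the one in this sketch (the entity and context names are hypothetical) goes to SQL Server as parameterized SQL, the same wire-level mechanism as hand-written parameterized ad-hoc SQL:

using System.Data.Entity; // Entity Framework 6
using System.Linq;

public class Contact
{
    public int ContactID { get; set; }
    public int ContactGroupID { get; set; }
    public string LastName { get; set; }
}

public class ContactsContext : DbContext
{
    public DbSet<Contact> Contacts { get; set; }
}

public static class EfExample
{
    public static void Run()
    {
        int groupId = 11;
        using (ContactsContext db = new ContactsContext())
        {
            // EF translates this to something like:
            //   SELECT ... FROM Contacts WHERE ContactGroupID = @p__linq__0
            // i.e. parameterized SQL, cached and reused just like hand-written
            // parameterized ad-hoc SQL.
            var contacts = db.Contacts
                .Where(c => c.ContactGroupID == groupId)
                .ToList();
        }
    }
}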

 

 

What others are saying:

Jeff Atwood puts it eloquently in his blog post, "Who needs stored procedures, anyway?"

 

And who can argue with the great John Lam? He says:

As a guy who dabbles in low-level bit twiddling stuff from time-to-time, the performance claims are quite interesting to me. The new (as of SQL Server 7.0) cached execution plan optimization in SQL Server looks to me a lot like JIT compilation. If this is, in fact, the case it seems to me that the only overhead that would be associated with dynamic SQL would be:

  1. The amount of bandwidth + time it takes to transmit the dynamic SQL text to the database.
  2. The amount of time it takes to calculate the hash of the dynamic SQL text to look up the cached execution plan.

I can imagine quite a few scenarios where the above overhead would disappear into the noise of the network roundtrip. What upsets me are the folks who spout forth anecdotal arguments that claim stored procedures have "much better" performance than dynamic SQL.

 

Conclusion

I cringe when I see an architectural decision that makes things easy for the server, a $4,000 resource over two years, at the cost of making them harder for the developers, a $400,000 resource over the same two years.

An example from personal experience: I recently watched a $10 million software project go over budget and lose a ton of money. Allowing ad-hoc SQL by itself may not have saved the project, but making similar productivity-enhancing decisions, such as permitting ad-hoc SQL where it makes sense and compromises neither security nor performance, might just have tipped the project from red to black.

 

Performance Testing

I always have to test performance claims for myself, so I wrote a small console application to put execution plan caching to the test. A method using a stored procedure was called 1,000 times; a method using the same SQL as parameterized ad-hoc SQL was called 1,000 times. The times were about equal; in fact, the ad-hoc SQL was slightly faster :).

Results (milliseconds): [timing screenshots; the stored procedure and parameterized ad-hoc SQL runs were nearly identical]

using System;
using System.Data;
using System.Diagnostics;

static void Main(string[] args)
{
    //-- warm up both paths so first-call costs don't skew the timings
    SelectContactStoredProc();
    SelectContactParameterizedSql();

    //-- time 1000 stored procedure calls
    Stopwatch sw = new Stopwatch();
    sw.Start();
    for (int i = 1; i <= 1000; i++)
        SelectContactStoredProc();
    sw.Stop();
    Console.WriteLine("Elapsed StoredProc:" + sw.ElapsedMilliseconds);

    //-- time 1000 parameterized ad-hoc SQL calls
    sw.Reset();
    sw.Start();
    for (int i = 1; i <= 1000; i++)
        SelectContactParameterizedSql();
    sw.Stop();
    Console.WriteLine("Elapsed ParmSQL:" + sw.ElapsedMilliseconds);

    Console.ReadLine();
}

public static void SelectContactParameterizedSql()
{
    // DataGeneral.ParamBuilder is a small data-access helper; the final
    // argument of GetScalar indicates whether the text is a stored procedure.
    DataGeneral.ParamBuilder pb = new DataGeneral.ParamBuilder();
    pb.AddParam(SqlDbType.Int, "@contactGroupId", 11);

    string sql = @"select ContactID, ContactGroupID, ContactTypeRefCode, FirstName, LastName,
    Prefix, Suffix, Email, Phone, PhoneExt, AltPhone, AltPhoneExt, MobilePhone, FaxPhone
from Contact where ContactGroupID = @contactGroupId";

    object ret = DataGeneral.GetScalar(sql, pb.Parameters, false);
}

public static void SelectContactStoredProc()
{
    DataGeneral.ParamBuilder pb = new DataGeneral.ParamBuilder();
    pb.AddParam(SqlDbType.Int, "@contactGroupId", 11);

    object ret = DataGeneral.GetScalar("General_Contact_GetByContactGroupID", pb.Parameters, true);
}

Comments

Catto said…
Hey Joe,

Great post. Make a person think.

Thx 4 the info,

Catto
