Stored procedure / query optimization
-
Hello, I have a question regarding stored procedure desing that provides the optimal performance. Let's say we have a table Products that consists of three columns: Name, Status, RegistrationTime. All columns are indexed and users should be able to lookup data by any of the columns. We have two main options to design stored procedures for data retrieval: 1. Design separate stored procedures for each search criteria: LookupProductsByName, LookupProductsByStatus, LookupProductsByTime. 2. Write a generic stored procedure that will fit any search criteria: CREATE PROCEDURE GetProducts ( @Name varchar(20), @Status int = NULL, @FromTime datetime = NULL, @ToTime datetime = NULL) AS BEGIN SELECT [Name], [Status], [RegistrationTime] FROM [Products] WHERE [Name]=CASE WHEN @Name<>NULL THEN @Name ELSE [Name] END AND [Status]=CASE WHEN @Status<>NULL THEN @Status ELSE [Status] END AND [RegistrationTime]>=CASE WHEN @FromTimestamp<>NULL THEN @FromTimestamp ELSE [RegistrationTime] END AND [RegistrationTime]<=CASE WHEN @ToTimestamp<>NULL THEN @ToTimestamp ELSE [RegistrationTime] END ORDER BY [RegistrationTime] END; The second option is very attractive, because it is obviously easier to maintain such code. However, I am a little concerned about performance of such stored procedure. It is not possible to foresee what index should be used, index can only be selected each during procedure execution, because search criteria can include either Name, Status or RegistrationTime. Will it make this SP inefficient? Or perormance difference in such case is not big (if any) and we should choose the second option because of its significant code reduction? Thanks in advance Вагиф Абилов MCP (Visual C++) Oslo, Norway Hex is for sissies. Real men use binary. And the most hardcore types use only zeros - uppercase zeros and lowercase zeros. Tomasz Sowinski
The 2nd option is not only less efficient than the first, it is less efficient than not using a stored procedure at all. It is also harder to maintain and optimize, because of the complexity. In my experience, for user-searches, stored procedures are a poor fit. This is because user's demand compex, dynamic searches, and stored procedures are unable to efficiently handle that. In addition, such stored procedures become hellishly long and complex.
-
The 2nd option is not only less efficient than the first, it is less efficient than not using a stored procedure at all. It is also harder to maintain and optimize, because of the complexity. In my experience, for user-searches, stored procedures are a poor fit. This is because user's demand compex, dynamic searches, and stored procedures are unable to efficiently handle that. In addition, such stored procedures become hellishly long and complex.
But you wouldn't defent using SQL statements directly, would you? While I see your point regarding performance, I can't understand reasons for not using stored procedure, or at least user-defined functions. Yes, I agree, for Web searches it is hard to write tens of SPs just to cover all possible cases, but at least UDF makes it possible to pack the code inside the database and not expose SQL directly to clients. BTW, thanks for the opinion regarding efficiency. Looks like too much of generalisation will harm performance. Вагиф Абилов MCP (Visual C++) Oslo, Norway Hex is for sissies. Real men use binary. And the most hardcore types use only zeros - uppercase zeros and lowercase zeros. Tomasz Sowinski
-
But you wouldn't defent using SQL statements directly, would you? While I see your point regarding performance, I can't understand reasons for not using stored procedure, or at least user-defined functions. Yes, I agree, for Web searches it is hard to write tens of SPs just to cover all possible cases, but at least UDF makes it possible to pack the code inside the database and not expose SQL directly to clients. BTW, thanks for the opinion regarding efficiency. Looks like too much of generalisation will harm performance. Вагиф Абилов MCP (Visual C++) Oslo, Norway Hex is for sissies. Real men use binary. And the most hardcore types use only zeros - uppercase zeros and lowercase zeros. Tomasz Sowinski
Well, since you asked... I would (and do) use SQL statements directly, specially in a web-based applications. Stored procedures offer some benefits, but they also have some drawbacks, such as portability and maintainability. Personally, I almost never use them, but I accept that many people do. The main benefit of stored procedures is one of security -- you can protect your database against adhoc user queries. In a web application, this benefit does not exist, because the user never has direct access anyway. In other situations, there are alternatives such as writing a network service component (web service or .net remoting) as an intermediate layer. The business value of such a web service is far more than that of stored procedures.
-
Well, since you asked... I would (and do) use SQL statements directly, specially in a web-based applications. Stored procedures offer some benefits, but they also have some drawbacks, such as portability and maintainability. Personally, I almost never use them, but I accept that many people do. The main benefit of stored procedures is one of security -- you can protect your database against adhoc user queries. In a web application, this benefit does not exist, because the user never has direct access anyway. In other situations, there are alternatives such as writing a network service component (web service or .net remoting) as an intermediate layer. The business value of such a web service is far more than that of stored procedures.
I see your point. Although in our team we try to avoid direct use of SQL. We have multiple tiers, and letting Web developers write SQL directly will make the system uncontrollable. BTW, I just received a reference to this outstanding article: Dynamic Search Conditions in T-SQL[^]. Exactly on this subject. Вагиф Абилов MCP (Visual C++) Oslo, Norway Hex is for sissies. Real men use binary. And the most hardcore types use only zeros - uppercase zeros and lowercase zeros. Tomasz Sowinski
-
Well, since you asked... I would (and do) use SQL statements directly, specially in a web-based applications. Stored procedures offer some benefits, but they also have some drawbacks, such as portability and maintainability. Personally, I almost never use them, but I accept that many people do. The main benefit of stored procedures is one of security -- you can protect your database against adhoc user queries. In a web application, this benefit does not exist, because the user never has direct access anyway. In other situations, there are alternatives such as writing a network service component (web service or .net remoting) as an intermediate layer. The business value of such a web service is far more than that of stored procedures.
Steven Campbell wrote: The main benefit of stored procedures is one of security -- you can protect your database against adhoc user queries. In a web application, this benefit does not exist, because the user never has direct access anyway. It is always wise to reduce the attack surface of your application. Even if you think that someone cannot get that far in it is still wise to protect all layers as there will always be some bit of code that was writted by a recent graduate, or someone working late on a Friday night, that is not as well thought out that could let someone in. Stored procedures also have the benefit of being precompiled, and therefor preoptimised. SQL Server 2000 has gone some way to optimising directly called SQL statements by caching the compiled versions so that if the query is run again it already has it in a compiled state and doesn't need to recompile it. But SPs will be precompiled always. I would suggest the argument that direct SQL is better than Stored Procedures due to portability issues is a false one. Having worked on many database systems over the years I have found that even directly called SQL statements need to be changed when moving from one database system to another, so you are still not guaranteed portability calling SQL directly. I would also suggest that the argument that directly called SQL statements are easier to maintain is incorrect. If you create stored procedures with a defined interface and the internal structure of the datamodel changes, all you need to do is change the Stored Procedure and everything that calls it will be no wiser to the changes further down. It has all the benefits of an extra layer of abstraction. Calling SQL directly on tables will mean that if the table changes then all the calls to it have to be updated and that can be a lot of work (especially if the database is used by many applications.)
Do you want to know more? Not getting the response you want from a question asked in an online forum: How to Ask Questions the Smart Way!
-
Steven Campbell wrote: The main benefit of stored procedures is one of security -- you can protect your database against adhoc user queries. In a web application, this benefit does not exist, because the user never has direct access anyway. It is always wise to reduce the attack surface of your application. Even if you think that someone cannot get that far in it is still wise to protect all layers as there will always be some bit of code that was writted by a recent graduate, or someone working late on a Friday night, that is not as well thought out that could let someone in. Stored procedures also have the benefit of being precompiled, and therefor preoptimised. SQL Server 2000 has gone some way to optimising directly called SQL statements by caching the compiled versions so that if the query is run again it already has it in a compiled state and doesn't need to recompile it. But SPs will be precompiled always. I would suggest the argument that direct SQL is better than Stored Procedures due to portability issues is a false one. Having worked on many database systems over the years I have found that even directly called SQL statements need to be changed when moving from one database system to another, so you are still not guaranteed portability calling SQL directly. I would also suggest that the argument that directly called SQL statements are easier to maintain is incorrect. If you create stored procedures with a defined interface and the internal structure of the datamodel changes, all you need to do is change the Stored Procedure and everything that calls it will be no wiser to the changes further down. It has all the benefits of an extra layer of abstraction. Calling SQL directly on tables will mean that if the table changes then all the calls to it have to be updated and that can be a lot of work (especially if the database is used by many applications.)
Do you want to know more? Not getting the response you want from a question asked in an online forum: How to Ask Questions the Smart Way!
I cannot echo Colin's last point strongly enough. Modularity is A GOOD THING. As Colin suggests, using sprocs frees the rest of your application from needing any knowledge about the underlying table structure. This will matter when you have to make a schema change for V2.0 and spend an eternity tracking down broken select/insert/deletes all over your code. If you use sprocs, all you need to do is assure that they still behave the same, and you're done. Web apps are actually quite risky, especially if you use query string data to build your sql statements. There are ways to exploit this to get malicious code run on your sql box. Passing the query string data as parameters to a sproc eleviates much of this problem (My experience here is a little limited, so if I am way off, someone please contradict me :-O ) Bill
-
Steven Campbell wrote: The main benefit of stored procedures is one of security -- you can protect your database against adhoc user queries. In a web application, this benefit does not exist, because the user never has direct access anyway. It is always wise to reduce the attack surface of your application. Even if you think that someone cannot get that far in it is still wise to protect all layers as there will always be some bit of code that was writted by a recent graduate, or someone working late on a Friday night, that is not as well thought out that could let someone in. Stored procedures also have the benefit of being precompiled, and therefor preoptimised. SQL Server 2000 has gone some way to optimising directly called SQL statements by caching the compiled versions so that if the query is run again it already has it in a compiled state and doesn't need to recompile it. But SPs will be precompiled always. I would suggest the argument that direct SQL is better than Stored Procedures due to portability issues is a false one. Having worked on many database systems over the years I have found that even directly called SQL statements need to be changed when moving from one database system to another, so you are still not guaranteed portability calling SQL directly. I would also suggest that the argument that directly called SQL statements are easier to maintain is incorrect. If you create stored procedures with a defined interface and the internal structure of the datamodel changes, all you need to do is change the Stored Procedure and everything that calls it will be no wiser to the changes further down. It has all the benefits of an extra layer of abstraction. Calling SQL directly on tables will mean that if the table changes then all the calls to it have to be updated and that can be a lot of work (especially if the database is used by many applications.)
Do you want to know more? Not getting the response you want from a question asked in an online forum: How to Ask Questions the Smart Way!
Reducing the attack surface of your application is a good thing, but there are many other things I would recommend before getting to stored procedures: * do not use sa to login * encrypt your database connection string * use parameterized queries The efficiency/speed argument is bogus, especially considering murphy's law. In the time you'll take to finish the average 6 month project, processors will have improved enough to make the small "precompile" argument meaningless. The same could be said of many techniques for improving performance. Just let it go, and concentrate on creating a well structured application that is easy to maintain. Performance is a concern, but one more likely to be solved by a well placed index. I'll admit that stored procedures are a useful abstraction for programmers that do not break down their applications into multiple well defined areas of responsibility. However, if you are in the habit of creating multi-tiered, well structured applications: * SQL is in one place, or sometimes there is no SQL at all (O/R Mapping). * Code is easily ported to different database platforms (again, the SQL is in one component that can be swapped out or regenerated) * There is a layer of abstraction between the business code and the database already I am content to let people use stored procedures if they want to. Most applications have bigger design problems than the decision whether to use stored procedures or not.
-
Reducing the attack surface of your application is a good thing, but there are many other things I would recommend before getting to stored procedures: * do not use sa to login * encrypt your database connection string * use parameterized queries The efficiency/speed argument is bogus, especially considering murphy's law. In the time you'll take to finish the average 6 month project, processors will have improved enough to make the small "precompile" argument meaningless. The same could be said of many techniques for improving performance. Just let it go, and concentrate on creating a well structured application that is easy to maintain. Performance is a concern, but one more likely to be solved by a well placed index. I'll admit that stored procedures are a useful abstraction for programmers that do not break down their applications into multiple well defined areas of responsibility. However, if you are in the habit of creating multi-tiered, well structured applications: * SQL is in one place, or sometimes there is no SQL at all (O/R Mapping). * Code is easily ported to different database platforms (again, the SQL is in one component that can be swapped out or regenerated) * There is a layer of abstraction between the business code and the database already I am content to let people use stored procedures if they want to. Most applications have bigger design problems than the decision whether to use stored procedures or not.
Steven Campbell wrote: * do not use sa to login * encrypt your database connection string * use parameterized queries The first and third of these I actually discussed in a blog entry of mine about a week ago so I am well aware of them. Talking about Murphy's Law: Encrypting a connection string can be more trouble than it is worth. A project I saw recently had encrypted strings, but eventually comments started to appear next to them showing the string in decrypted form to make them easier to work with (it defeats the purpose really - Also, it is better to design a security system that people will not be inclined to break in order to do their everyday jobs). The other problem with encrypted connection strings is that you have to store a key to decrypt them somewhere. Better to use a trusted connection then it doesn't matter whether you know what my connection string is or not as you will still need to know the name and password of a trusted account and access it through a trusted domain. Steven Campbell wrote: I'll admit that stored procedures are a useful abstraction for programmers that do not break down their applications into multiple well defined areas of responsibility Even if the application is broken into well defines areas I still think it is better to use stored procedures because they allow extra boundary checking. There are instances where a value on a record must be cross checked against other date to ensure validity. Constraints on tables can only go so far, for more complex checks SPs are better. Again, security comes into play here as more security checks can be performed, even if they should have been performed elsewhere you cannot guarantee that another part of the application has not been compromised already.
Do you want to know more? Not getting the response you want from a question asked in an online forum: How to Ask Questions the Smart Way!
-
Hello, I have a question regarding stored procedure desing that provides the optimal performance. Let's say we have a table Products that consists of three columns: Name, Status, RegistrationTime. All columns are indexed and users should be able to lookup data by any of the columns. We have two main options to design stored procedures for data retrieval: 1. Design separate stored procedures for each search criteria: LookupProductsByName, LookupProductsByStatus, LookupProductsByTime. 2. Write a generic stored procedure that will fit any search criteria: CREATE PROCEDURE GetProducts ( @Name varchar(20), @Status int = NULL, @FromTime datetime = NULL, @ToTime datetime = NULL) AS BEGIN SELECT [Name], [Status], [RegistrationTime] FROM [Products] WHERE [Name]=CASE WHEN @Name<>NULL THEN @Name ELSE [Name] END AND [Status]=CASE WHEN @Status<>NULL THEN @Status ELSE [Status] END AND [RegistrationTime]>=CASE WHEN @FromTimestamp<>NULL THEN @FromTimestamp ELSE [RegistrationTime] END AND [RegistrationTime]<=CASE WHEN @ToTimestamp<>NULL THEN @ToTimestamp ELSE [RegistrationTime] END ORDER BY [RegistrationTime] END; The second option is very attractive, because it is obviously easier to maintain such code. However, I am a little concerned about performance of such stored procedure. It is not possible to foresee what index should be used, index can only be selected each during procedure execution, because search criteria can include either Name, Status or RegistrationTime. Will it make this SP inefficient? Or perormance difference in such case is not big (if any) and we should choose the second option because of its significant code reduction? Thanks in advance Вагиф Абилов MCP (Visual C++) Oslo, Norway Hex is for sissies. Real men use binary. And the most hardcore types use only zeros - uppercase zeros and lowercase zeros. Tomasz Sowinski
I won't go in to the optimization discussion, but another way to do the same query without the case statements in the where clause is as follows:
CREATE PROCEDURE GetProducts ( @Name varchar(20), @Status int = NULL, @FromTime datetime = NULL, @ToTime datetime = NULL) AS BEGIN SELECT [Name], [Status], [RegistrationTime] FROM [Products] WHERE [Name]=IsNull(@Name, [Name]) AND [Status]=IsNull(@Status, [Status]) AND [RegistrationTime]>=IsNull(@FromTimestamp, [RegistrationTime]) AND [RegistrationTime]<=IsNull(@ToTimestamp, [RegistrationTime]) ORDER BY [RegistrationTime] END;
Hope this helps. Jeremy Oldham -
I won't go in to the optimization discussion, but another way to do the same query without the case statements in the where clause is as follows:
CREATE PROCEDURE GetProducts ( @Name varchar(20), @Status int = NULL, @FromTime datetime = NULL, @ToTime datetime = NULL) AS BEGIN SELECT [Name], [Status], [RegistrationTime] FROM [Products] WHERE [Name]=IsNull(@Name, [Name]) AND [Status]=IsNull(@Status, [Status]) AND [RegistrationTime]>=IsNull(@FromTimestamp, [RegistrationTime]) AND [RegistrationTime]<=IsNull(@ToTimestamp, [RegistrationTime]) ORDER BY [RegistrationTime] END;
Hope this helps. Jeremy OldhamThanks. This is of course preferrable syntax. But I ran some tests and figured out that optimization really sucks. So I will go back to specialized stored procedures. Вагиф Абилов MCP (Visual C++) Oslo, Norway Hex is for sissies. Real men use binary. And the most hardcore types use only zeros - uppercase zeros and lowercase zeros. Tomasz Sowinski
-
Steven Campbell wrote: * do not use sa to login * encrypt your database connection string * use parameterized queries The first and third of these I actually discussed in a blog entry of mine about a week ago so I am well aware of them. Talking about Murphy's Law: Encrypting a connection string can be more trouble than it is worth. A project I saw recently had encrypted strings, but eventually comments started to appear next to them showing the string in decrypted form to make them easier to work with (it defeats the purpose really - Also, it is better to design a security system that people will not be inclined to break in order to do their everyday jobs). The other problem with encrypted connection strings is that you have to store a key to decrypt them somewhere. Better to use a trusted connection then it doesn't matter whether you know what my connection string is or not as you will still need to know the name and password of a trusted account and access it through a trusted domain. Steven Campbell wrote: I'll admit that stored procedures are a useful abstraction for programmers that do not break down their applications into multiple well defined areas of responsibility Even if the application is broken into well defines areas I still think it is better to use stored procedures because they allow extra boundary checking. There are instances where a value on a record must be cross checked against other date to ensure validity. Constraints on tables can only go so far, for more complex checks SPs are better. Again, security comes into play here as more security checks can be performed, even if they should have been performed elsewhere you cannot guarantee that another part of the application has not been compromised already.
Do you want to know more? Not getting the response you want from a question asked in an online forum: How to Ask Questions the Smart Way!
Steven Campbell wrote: * do not use sa to login I strongly agree. That's one of the reasons I like sprocs: I can create a user that has only execute permissions on the sprocs, no direct select, insert, update (much less schema change) permissions. Steven Campbell wrote: * encrypt your database connection string One point to make here: no matter how you encrypt/protect your connection string, your application has to have the ability to decrypt it. So it's inherently vulnerable. It has to be, or your application could not read it. Certainly: it adds another layer of protection I am not sure it's worth the effort. Just my thoughts...I certainly cannot quantify any of this. Bill btw: I have enjoyed reading this thread. Thanks!