Skip to content
  • Categories
  • Recent
  • Tags
  • Popular
  • World
  • Users
  • Groups
Skins
  • Light
  • Cerulean
  • Cosmo
  • Flatly
  • Journal
  • Litera
  • Lumen
  • Lux
  • Materia
  • Minty
  • Morph
  • Pulse
  • Sandstone
  • Simplex
  • Sketchy
  • Spacelab
  • United
  • Yeti
  • Zephyr
  • Dark
  • Cyborg
  • Darkly
  • Quartz
  • Slate
  • Solar
  • Superhero
  • Vapor

  • Default (No Skin)
  • No Skin
Collapse
Code Project
  1. Home
  2. The Lounge
  3. SQL vs Code

SQL vs Code

Scheduled Pinned Locked Moved The Lounge
databasevisual-studio
43 Posts 19 Posters 0 Views 1 Watching
  • Oldest to Newest
  • Newest to Oldest
  • Most Votes
Reply
  • Reply as topic
Log in to reply
This topic has been deleted. Only users with topic management privileges can see it.
  • C ChrisC ncmail

    Rather than remain silent and appear intelligent, I'll ask a question... Can't you use C# in SQL Server now? Wouldn't those C# optimizations work within the db itself? I haven't used any .NET in the db, so I'm interested to see what the answer is.

    A Offline
    A Offline
    Andy Brummer
    wrote on last edited by
    #28

    If that had allowed us to scale out it would have been an option. However, we were so resource constrained on the database servers that it probably would not have scaled better. The application was a web log processing application which processed anywhere between 10 and billions of records a day. We calculated rolling aggregates over the past week for various attributes to make predictions, and had another process to apply future data backwards to detect additional fraud. The original design used things like materialized views to generate the aggregates, however that was not able to scale with the hardware constraints that we had. The next design calculated lated loaded aggregates with a set of indexes on the data table. However inserts became too slow. It is possible that CLR aggregation would be able to help, but I was flabbergasted to find that opening up multiple threads and pulling chunks of raw data straight out the the main table while building aggregates on the fly with .net dictionaries was tremendously fast. It allowed us to load up a week of data in under a minute hitting 70% utilization of a gigabit network, and then processing at 3-5,000 transactions a second. For some of the larger clients we had a special build which was able to hit 25,000 to 30,000 with 10Gigs of data in memory. Not bad for a little C# console app. In fact, the biggest bottleneck was normalizing cookie data.

    I can imagine the sinking feeling one would have after ordering my book, only to find a laughably ridiculous theory with demented logic once the book arrives - Mark McCutcheon

    1 Reply Last reply
    0
    • R Rama Krishna Vavilala

      Ennis Ray Lynch, Jr. wrote:

      Whomever the user is they are already accustomed to a 30 minute wait

      It's a batch process and runs automatically in my case. No user will ever accept waiting for 30 minutes :)

      R Offline
      R Offline
      rghubert
      wrote on last edited by
      #29

      We find LINQ and CLR to be extremely interesting innovations that enable flexible options -- in combination with stored procedures/SQL -- to solve more problems fast. SQL is mature, LINQ and CLR are just getting started, so give them a chance, and a try...

      1 Reply Last reply
      0
      • M Mycroft Holmes

        Pete O'Hanlon wrote:

        retrieve items to perform complex processing

        But... why, surely TSQL (or the PL flavour) can do basically anything you can do in code. I have yet to run across any requirement that I have had to go out to c# to meet. And this week we have to implement cubic spline into our procs!

        Never underestimate the power of human stupidity RAH

        P Offline
        P Offline
        Pete OHanlon
        wrote on last edited by
        #30

        We do a lot of spatial work, which we previously had to delegate out to MapInfo to process; now we do it directly in Oracle Spatial and it works a treat.

        "WPF has many lovers. It's a veritable porn star!" - Josh Smith

        As Braveheart once said, "You can take our freedom but you'll never take our Hobnobs!" - Martin Hughes.

        My blog | My articles | MoXAML PowerToys | Onyx

        M 1 Reply Last reply
        0
        • A Andy Brummer

          Rama Krishna Vavilala wrote:

          No my experience with previous projects has been the same code is usually lot slower than SPs. Which makes sense as no data needs to be transfered over wire or via IPC.

          I've made huge performance improvements by changing queries to just stream raw table data to an app instead of doing any aggregation in SQL server. As soon as the memory required for the aggregation on sql server overflowed the available server memory, paging dragged performance to a standstill with hundreds of other queries backed up behind it. Streaming several 5-12Gigs of data from SQL required much less memory and we could use more efficient structures in C# so it was thousands of times faster.

          I can imagine the sinking feeling one would have after ordering my book, only to find a laughably ridiculous theory with demented logic once the book arrives - Mark McCutcheon

          D Offline
          D Offline
          Dark Yak
          wrote on last edited by
          #31

          This question can not be answered without knowing the details of the implementation. These are general remarks to consider : * In a performance perspective : SQL works a lot faster on processing dataset than individual row computation (in general). But as Andy said, there is some caveat : in one of my project, i do all computation by the database (around 10 millions rows), it was working fine, but slowly when we reach 60 millions, the total duration grows exponentially. Because the resources needed to run the procedure starves all the server ressources (meory, paging, ...). I think there is a balance in term of the computed dataset size where sql is better than code. You've just to find it :-) * In a design perspective : In my opinion, moving business logic (computation, calculation) inside the database should be avoid (if possible). I usally used database as data storage and few basic queries (basic agreggation). For complex data manipulation with business logic, it should be done in the BAL (Business Access Layer). Conclusion : - You have to determine what is best for you : architecture design versus performance design - You have to determine what is best approach : whole dataset processing, chunk of data processing, or streaming the process

          A 1 Reply Last reply
          0
          • P Pete OHanlon

            We do a lot of spatial work, which we previously had to delegate out to MapInfo to process; now we do it directly in Oracle Spatial and it works a treat.

            "WPF has many lovers. It's a veritable porn star!" - Josh Smith

            As Braveheart once said, "You can take our freedom but you'll never take our Hobnobs!" - Martin Hughes.

            My blog | My articles | MoXAML PowerToys | Onyx

            M Offline
            M Offline
            Mycroft Holmes
            wrote on last edited by
            #32

            Pete O'Hanlon wrote:

            We do a lot of spatial work

            90% of my work is CRUD financial calcs so SQL deals with that nicely. Tell me does oravle have the equivalent of the HierarchyID that SQL Server uses. We have approx 4tb of data to work on and I have opted for Oracle rather than SQL Server as the databse.

            Never underestimate the power of human stupidity RAH

            1 Reply Last reply
            0
            • R Rama Krishna Vavilala

              I just made a tough decision by opting for a complex SQL (with recursive CTEs) in favor of complex logic in code. I hope it does not bite me later.:~

              F Offline
              F Offline
              Fabio Franco
              wrote on last edited by
              #33

              I'm with you on that. I've always favored pure SQL for performance. In a project I worked on I advised the project leader of the possible performance pitfalls of using LINQtoSQL. Guess what? System got slow as hell and a lot of effort had to be put in in order to improve performance. The argument of productivity fell on this case. I beleive that the other case might be true when no complex operations happens with data from the database, the productivity might be good without affecting performance much.

              1 Reply Last reply
              0
              • D Dark Yak

                This question can not be answered without knowing the details of the implementation. These are general remarks to consider : * In a performance perspective : SQL works a lot faster on processing dataset than individual row computation (in general). But as Andy said, there is some caveat : in one of my project, i do all computation by the database (around 10 millions rows), it was working fine, but slowly when we reach 60 millions, the total duration grows exponentially. Because the resources needed to run the procedure starves all the server ressources (meory, paging, ...). I think there is a balance in term of the computed dataset size where sql is better than code. You've just to find it :-) * In a design perspective : In my opinion, moving business logic (computation, calculation) inside the database should be avoid (if possible). I usally used database as data storage and few basic queries (basic agreggation). For complex data manipulation with business logic, it should be done in the BAL (Business Access Layer). Conclusion : - You have to determine what is best for you : architecture design versus performance design - You have to determine what is best approach : whole dataset processing, chunk of data processing, or streaming the process

                A Offline
                A Offline
                Andy Brummer
                wrote on last edited by
                #34

                Dark Yak wrote:

                In my opinion, moving business logic (computation, calculation) inside the database should be avoid (if possible). I usally used database as data storage and few basic queries (basic agreggation). For complex data manipulation with business logic, it should be done in the BAL (Business Access Layer).

                As far as scaling into large datasets, there are other solutions like greenplum, cubes, hadoop, etc. that might provide better benefit than just writing custom code. So when it comes down to it, there is no clear cut answer to the question. What works in one context can easily be a horrible solution in another. There are so many choices now, it's hard to pick the right one.

                I can imagine the sinking feeling one would have after ordering my book, only to find a laughably ridiculous theory with demented logic once the book arrives - Mark McCutcheon

                1 Reply Last reply
                0
                • R Rama Krishna Vavilala

                  May be or may be not. The code would have been around 400 lines, the SQL is about 50 significant lines . The main thing with SQL is performance (a set operation on the Database server vs an step by step operations on the middle-tier). The performance difference came to be around 10 times (30 mins vs 6 hrs approx).

                  J Offline
                  J Offline
                  JasonPSage
                  wrote on last edited by
                  #35

                  Sounds like the right choice to me bro! If "It bites you later" - it's because something has changed... you have options... refine the SQL or go to the mid Tier - Sounds Like you Hit a home Run if you ask me!

                  Know way too many languages... master of none!

                  1 Reply Last reply
                  0
                  • R Rama Krishna Vavilala

                    Dan Neely wrote:

                    with in under a minute is very useful too.

                    Yes. All I need to do us to enter different date ranges.

                    D Offline
                    D Offline
                    Dan Neely
                    wrote on last edited by
                    #36

                    But your successor won't have that option to do his bogotesting?

                    3x12=36 2x12=24 1x12=12 0x12=18

                    1 Reply Last reply
                    0
                    • R Rama Krishna Vavilala

                      Ennis Ray Lynch, Jr. wrote:

                      . How many records are we talking about here?

                      5 million! But the processing is super complicated (recursion/trees).

                      Ennis Ray Lynch, Jr. wrote:

                      The code could be wrong

                      No my experience with previous projects has been the same code is usually lot slower than SPs. Which makes sense as no data needs to be transfered over wire or via IPC.

                      I Offline
                      I Offline
                      Ishmael Turner
                      wrote on last edited by
                      #37

                      I don't know your domain or the requirements of the query, but you mentioned Trees, Recursion, and CTEs. Maybe you already know about the Nested Set model? There is a good article about it at http://dev.mysql.com/tech-resources/articles/hierarchical-data.html[^]. This can improve the performance of queries of hierarchal data at a cost during insert and update. It can also massively simplify the SQL you use to query. Sorry if this is all well known... maybe you're already using it! :-O

                      1 Reply Last reply
                      0
                      • C ChrisC ncmail

                        Rather than remain silent and appear intelligent, I'll ask a question... Can't you use C# in SQL Server now? Wouldn't those C# optimizations work within the db itself? I haven't used any .NET in the db, so I'm interested to see what the answer is.

                        Y Offline
                        Y Offline
                        YSLGuru
                        wrote on last edited by
                        #38

                        "Can't you use C# in SQL Server now?" Sure you can. There's lots of ways to get around using native SQL (T-QSL for SQL Server or PL/SQL for Oracle) and let the procedural programmer avoid learning/using a set based langauge but they all have the same downside; they are procedural based solutions. So the answer to the second half of your question, "Wouldn't those C# optimizations work with the DB itself?" is 'Probably, but they would still not perform near as well as native SQL.'.

                        S 1 Reply Last reply
                        0
                        • P Pete OHanlon

                          You. No. The poor sap who has to enhance it 2 years down the line. Yes.

                          "WPF has many lovers. It's a veritable porn star!" - Josh Smith

                          As Braveheart once said, "You can take our freedom but you'll never take our Hobnobs!" - Martin Hughes.

                          My blog | My articles | MoXAML PowerToys | Onyx

                          Y Offline
                          Y Offline
                          YSLGuru
                          wrote on last edited by
                          #39

                          Well if that poor sap is as adminant about forcing a round peg into a square hole then yeah he probably will have a hard time. Everyone esle though who uses the application that makes use of this SQL Solution will be very happy that the solution selected was based not on what the programmer wanted to do but what was best in terms of performance. We already have enough examples of bad coding for Relational Database backends; where the creator of said solution either did not want to or could not understand how to use a set based language and instead kept looking for ways to make procedural answers work on set based probelms; or how to make a square peg fit in a round hole.

                          1 Reply Last reply
                          0
                          • R Rama Krishna Vavilala

                            I just made a tough decision by opting for a complex SQL (with recursive CTEs) in favor of complex logic in code. I hope it does not bite me later.:~

                            Y Offline
                            Y Offline
                            YSLGuru
                            wrote on last edited by
                            #40

                            Kudos for doing whats best in terms of performance and the end users instead of whats easier or more favored/fun.

                            1 Reply Last reply
                            0
                            • Y YSLGuru

                              "Can't you use C# in SQL Server now?" Sure you can. There's lots of ways to get around using native SQL (T-QSL for SQL Server or PL/SQL for Oracle) and let the procedural programmer avoid learning/using a set based langauge but they all have the same downside; they are procedural based solutions. So the answer to the second half of your question, "Wouldn't those C# optimizations work with the DB itself?" is 'Probably, but they would still not perform near as well as native SQL.'.

                              S Offline
                              S Offline
                              Sinisa Hajnal
                              wrote on last edited by
                              #41

                              We use CLRs for statistics of warehouse orders and intern transactions. It has in any given day several thousands of rows in items, several hundred to thousands of documents and same number of various locations to track. And we have data collected over ten years. Normally it's few seconds for day or month analysis, but we had to make few yearly and total statistics ('trends') and 'same month over years' comparisons. Nothing beats CLR, but yes, it needs several tweaks on the database to enable it. (We tried CTEs, too slow; same with table variables and temporary tables and cursors)

                              Y 1 Reply Last reply
                              0
                              • S Sinisa Hajnal

                                We use CLRs for statistics of warehouse orders and intern transactions. It has in any given day several thousands of rows in items, several hundred to thousands of documents and same number of various locations to track. And we have data collected over ten years. Normally it's few seconds for day or month analysis, but we had to make few yearly and total statistics ('trends') and 'same month over years' comparisons. Nothing beats CLR, but yes, it needs several tweaks on the database to enable it. (We tried CTEs, too slow; same with table variables and temporary tables and cursors)

                                Y Offline
                                Y Offline
                                YSLGuru
                                wrote on last edited by
                                #42

                                "(We tried CTEs, too slow; same with table variables and temporary tables and cursors)" Well that was your problem. Cursors are nothing but procedural methodology done within T-SQl so its no ownder you got bad performance. As far as CLR's the only scenario where CLR's will perform better then standard T-SQl that uses a proper set based (that means NO cursors) is where text manipulation is involed at some measurable level. For example if you want to find a key word within a very large amount of text then a CLR will do better. If however you are looking for aggregate values on some set of data then CLR's will not out perform properly written T-SQl code. I work with millions of rows on a regular basis and in a few queries, hundreads of millions of rows so I have some applicable experince with dealing with performance issues and there's no way I'd evcer use a cursor for any of the processes we do and I would only consider a CLR if the process was heavy in text manipulation. CLR's have tehir place and when used properly they're great. The problem is they are often used in the wrong scenario just as cursors are used when they should not be. This is because its much easier to use and understand a cursor in T-SQL when your background is in a procedural based language then to work out a pure set based solution (that means NO cursors). I totally understand why progarmmers opt to use cursors soo often because I've done procedural programming but that doesn't change the fact that a pure set base solution, except when heavy text manipulation is involved, will perfrom better %99 of the time if not higher.

                                S 1 Reply Last reply
                                0
                                • Y YSLGuru

                                  "(We tried CTEs, too slow; same with table variables and temporary tables and cursors)" Well that was your problem. Cursors are nothing but procedural methodology done within T-SQl so its no ownder you got bad performance. As far as CLR's the only scenario where CLR's will perform better then standard T-SQl that uses a proper set based (that means NO cursors) is where text manipulation is involed at some measurable level. For example if you want to find a key word within a very large amount of text then a CLR will do better. If however you are looking for aggregate values on some set of data then CLR's will not out perform properly written T-SQl code. I work with millions of rows on a regular basis and in a few queries, hundreads of millions of rows so I have some applicable experince with dealing with performance issues and there's no way I'd evcer use a cursor for any of the processes we do and I would only consider a CLR if the process was heavy in text manipulation. CLR's have tehir place and when used properly they're great. The problem is they are often used in the wrong scenario just as cursors are used when they should not be. This is because its much easier to use and understand a cursor in T-SQL when your background is in a procedural based language then to work out a pure set based solution (that means NO cursors). I totally understand why progarmmers opt to use cursors soo often because I've done procedural programming but that doesn't change the fact that a pure set base solution, except when heavy text manipulation is involved, will perfrom better %99 of the time if not higher.

                                  S Offline
                                  S Offline
                                  Sinisa Hajnal
                                  wrote on last edited by
                                  #43

                                  YSLGuru wrote:

                                  Well that was your problem. Cursors are nothing but procedural methodology done within T-SQl so its no ownder you got bad performance.

                                  Yes, we know, that's why we DO NOT do it unless it needs to be done. And our DBA has over 20 years of XP with relational databases and we take over only when he says he cannot do anything more to optimize query time. THEN we try CLR :) Most complicated thing I did was tariff calculation on goods transport that included dynamic items depending on distance, fuel consumption, client, special privileges etc. Initially done with CTE's it proved slow. Table variables filled independently for each step and combined near the end proved good solution, calculation time going from 20 seconds to .4 seconds

                                  1 Reply Last reply
                                  0
                                  Reply
                                  • Reply as topic
                                  Log in to reply
                                  • Oldest to Newest
                                  • Newest to Oldest
                                  • Most Votes


                                  • Login

                                  • Don't have an account? Register

                                  • Login or register to search.
                                  • First post
                                    Last post
                                  0
                                  • Categories
                                  • Recent
                                  • Tags
                                  • Popular
                                  • World
                                  • Users
                                  • Groups