Skip to content
  • Categories
  • Recent
  • Tags
  • Popular
  • World
  • Users
  • Groups
Skins
  • Light
  • Cerulean
  • Cosmo
  • Flatly
  • Journal
  • Litera
  • Lumen
  • Lux
  • Materia
  • Minty
  • Morph
  • Pulse
  • Sandstone
  • Simplex
  • Sketchy
  • Spacelab
  • United
  • Yeti
  • Zephyr
  • Dark
  • Cyborg
  • Darkly
  • Quartz
  • Slate
  • Solar
  • Superhero
  • Vapor

  • Default (No Skin)
  • No Skin
Collapse
Code Project
  1. Home
  2. The Lounge
  3. Table join performance - the saga continues...

Table join performance - the saga continues...

Scheduled Pinned Locked Moved The Lounge
learningpythondatabasecomsysadmin
30 Posts 24 Posters 0 Views 1 Watching
  • Oldest to Newest
  • Newest to Oldest
  • Most Votes
Reply
  • Reply as topic
Log in to reply
This topic has been deleted. Only users with topic management privileges can see it.
  • M Marc Clifton

    So I avoided getting into an argument with the CTO about the performance of table joins. The nice thing about communication channels like Slack is that you simply don't need to reply. His contention is that table joins are "hugely expensive", particularly for reporting. My contention is of course that a well designed database will not have this issue. I came across this[^] interesting discussion. But the bottom line is, I wonder why he has this attitude, particularly when his reports are dog slow, and the page request to the server times out if you ask for a report that spans more than a day (maybe a week.) Looking at the code, no wonder, because the reports seem to be generated almost entirely in Python, without leveraging table joins on the server itself at all! Granted, there are certainly times when a very specialized report might benefit from denormalization, but for Pete's sake, these are basic transaction detail and transaction total queries. There are so many things I disagree with the CTO about. I must have reached some level of emotional maturity before taking this job because I'm able to not get into an argument, I just say "whatever, dude" and move along. Marc

    Imperative to Functional Programming Succinctly Contributors Wanted for Higher Order Programming Project! Learning to code with python is like learning to swim with those little arm floaties. It gives you undeserved confidence and will eventually drown you. - DangerBunny

    F Offline
    F Offline
    Foothill
    wrote on last edited by
    #2

    To play devil's advocate, learning how the database engine and query optimizer work can be daunting to those who have never coded or haven't had to code in years/decades such as your run of the mill CTO. Like a lot of developers that I have known, they do not realize the huge performance impact of improper table design, like making the primary key a string or Guid.

    if (Object.DividedByZero == true) { Universe.Implode(); } Meus ratio ex fortis machina. Simplicitatis de formae ac munus. -Foothill, 2016

    D R 2 Replies Last reply
    0
    • M Marc Clifton

      So I avoided getting into an argument with the CTO about the performance of table joins. The nice thing about communication channels like Slack is that you simply don't need to reply. His contention is that table joins are "hugely expensive", particularly for reporting. My contention is of course that a well designed database will not have this issue. I came across this[^] interesting discussion. But the bottom line is, I wonder why he has this attitude, particularly when his reports are dog slow, and the page request to the server times out if you ask for a report that spans more than a day (maybe a week.) Looking at the code, no wonder, because the reports seem to be generated almost entirely in Python, without leveraging table joins on the server itself at all! Granted, there are certainly times when a very specialized report might benefit from denormalization, but for Pete's sake, these are basic transaction detail and transaction total queries. There are so many things I disagree with the CTO about. I must have reached some level of emotional maturity before taking this job because I'm able to not get into an argument, I just say "whatever, dude" and move along. Marc

      Imperative to Functional Programming Succinctly Contributors Wanted for Higher Order Programming Project! Learning to code with python is like learning to swim with those little arm floaties. It gives you undeserved confidence and will eventually drown you. - DangerBunny

      J Offline
      J Offline
      Jeremy Falcon
      wrote on last edited by
      #3

      I've concluded that most blanket statements are offered by those without any real knowledge. The true answer is, it depends. Tech changes, things get optimized, and so on and so forth. Even outside of that, anyone who knows anything about SQL knows two things: it depends on the data and the amount of joins in the query and it also depends on the indexes. In my experience a join can very very expensive, but it also can be quick. They're usually the most expensive on a database that's designed like garbage. Also, forgetting all other factors and just focusing on speed alone, sometimes it requires less bandwidth to send along the pipe one demoralized table to a client rather than several normalized ones. Not to mention the fact, dealing with drilling down on the B table for instance would require two queries at the very least avoid sending along two entire tables to the client so it can be processed that way. The truth is simple, if the app runs like crap, and you identified the bottleneck being either the DB or the processing of what comes out of it, then he doesn't know much. The proof is in the pudding my friend. The CPU cycles don't lie.

      Jeremy Falcon

      M 1 Reply Last reply
      0
      • F Foothill

        To play devil's advocate, learning how the database engine and query optimizer work can be daunting to those who have never coded or haven't had to code in years/decades such as your run of the mill CTO. Like a lot of developers that I have known, they do not realize the huge performance impact of improper table design, like making the primary key a string or Guid.

        if (Object.DividedByZero == true) { Universe.Implode(); } Meus ratio ex fortis machina. Simplicitatis de formae ac munus. -Foothill, 2016

        D Offline
        D Offline
        David ONeil
        wrote on last edited by
        #4

        "Ohhhh! Guid == 'Good'! Let's use it!"

        My CodeProject Articles :: Our forgotten astronomic heritage :: My website.
        "Sorry, buddy, but this mission counts on everyone being as silent as possible, and your farts are just too much of a wildcard." - Korra to Meelo, "Kuvira's Gambit"

        1 Reply Last reply
        0
        • J Jeremy Falcon

          I've concluded that most blanket statements are offered by those without any real knowledge. The true answer is, it depends. Tech changes, things get optimized, and so on and so forth. Even outside of that, anyone who knows anything about SQL knows two things: it depends on the data and the amount of joins in the query and it also depends on the indexes. In my experience a join can very very expensive, but it also can be quick. They're usually the most expensive on a database that's designed like garbage. Also, forgetting all other factors and just focusing on speed alone, sometimes it requires less bandwidth to send along the pipe one demoralized table to a client rather than several normalized ones. Not to mention the fact, dealing with drilling down on the B table for instance would require two queries at the very least avoid sending along two entire tables to the client so it can be processed that way. The truth is simple, if the app runs like crap, and you identified the bottleneck being either the DB or the processing of what comes out of it, then he doesn't know much. The proof is in the pudding my friend. The CPU cycles don't lie.

          Jeremy Falcon

          M Offline
          M Offline
          Marc Clifton
          wrote on last edited by
          #5

          Jeremy Falcon wrote:

          to send along the pipe one demoralized table

          Now I'm denormalized. ;)

          Jeremy Falcon wrote:

          The true answer is, it depends.

          Very true. It's just that he and I start with different baselines. I start with "let's assume the DB can optimize this right, given good table design, indices, etc." and if there's an issue, I look at how to optimize the query, and if that means a demoralized table :) then ok, but I had better have a good reason. The CTO starts with a different baseline, on the assumption that joins are always costly. The result is pushing processing onto the script language, of all things. Not much optimization can occur there, though frankly, the code is so krufty, I imagine there actually is a ton of optimization that could happen there, even without getting into the bowels of Django. Marc

          Imperative to Functional Programming Succinctly Contributors Wanted for Higher Order Programming Project! Learning to code with python is like learning to swim with those little arm floaties. It gives you undeserved confidence and will eventually drown you. - DangerBunny

          F J D 3 Replies Last reply
          0
          • M Marc Clifton

            Jeremy Falcon wrote:

            to send along the pipe one demoralized table

            Now I'm denormalized. ;)

            Jeremy Falcon wrote:

            The true answer is, it depends.

            Very true. It's just that he and I start with different baselines. I start with "let's assume the DB can optimize this right, given good table design, indices, etc." and if there's an issue, I look at how to optimize the query, and if that means a demoralized table :) then ok, but I had better have a good reason. The CTO starts with a different baseline, on the assumption that joins are always costly. The result is pushing processing onto the script language, of all things. Not much optimization can occur there, though frankly, the code is so krufty, I imagine there actually is a ton of optimization that could happen there, even without getting into the bowels of Django. Marc

            Imperative to Functional Programming Succinctly Contributors Wanted for Higher Order Programming Project! Learning to code with python is like learning to swim with those little arm floaties. It gives you undeserved confidence and will eventually drown you. - DangerBunny

            F Offline
            F Offline
            Foothill
            wrote on last edited by
            #6

            Marc Clifton wrote:

            on the assumption that joins are always costly

            I've 20 bucks that says your CTO cut his database teeth on MS Access :-\

            if (Object.DividedByZero == true) { Universe.Implode(); } Meus ratio ex fortis machina. Simplicitatis de formae ac munus. -Foothill, 2016

            1 Reply Last reply
            0
            • M Marc Clifton

              So I avoided getting into an argument with the CTO about the performance of table joins. The nice thing about communication channels like Slack is that you simply don't need to reply. His contention is that table joins are "hugely expensive", particularly for reporting. My contention is of course that a well designed database will not have this issue. I came across this[^] interesting discussion. But the bottom line is, I wonder why he has this attitude, particularly when his reports are dog slow, and the page request to the server times out if you ask for a report that spans more than a day (maybe a week.) Looking at the code, no wonder, because the reports seem to be generated almost entirely in Python, without leveraging table joins on the server itself at all! Granted, there are certainly times when a very specialized report might benefit from denormalization, but for Pete's sake, these are basic transaction detail and transaction total queries. There are so many things I disagree with the CTO about. I must have reached some level of emotional maturity before taking this job because I'm able to not get into an argument, I just say "whatever, dude" and move along. Marc

              Imperative to Functional Programming Succinctly Contributors Wanted for Higher Order Programming Project! Learning to code with python is like learning to swim with those little arm floaties. It gives you undeserved confidence and will eventually drown you. - DangerBunny

              M Offline
              M Offline
              Mycroft Holmes
              wrote on last edited by
              #7

              I once had a DBA tell me that views were a bad idea because they had to be rebuilt every day. How did the guy get the job as CTO, I would presume there had to some experience with code, development or architecture. It sounds like this guy the epitomy of a manager, almost no knowledge of the subject matter and hoping to "manage" the department using his underlings knowledge, and you are not on site so the children have the most influence.

              Never underestimate the power of human stupidity RAH

              R V 2 Replies Last reply
              0
              • M Marc Clifton

                Jeremy Falcon wrote:

                to send along the pipe one demoralized table

                Now I'm denormalized. ;)

                Jeremy Falcon wrote:

                The true answer is, it depends.

                Very true. It's just that he and I start with different baselines. I start with "let's assume the DB can optimize this right, given good table design, indices, etc." and if there's an issue, I look at how to optimize the query, and if that means a demoralized table :) then ok, but I had better have a good reason. The CTO starts with a different baseline, on the assumption that joins are always costly. The result is pushing processing onto the script language, of all things. Not much optimization can occur there, though frankly, the code is so krufty, I imagine there actually is a ton of optimization that could happen there, even without getting into the bowels of Django. Marc

                Imperative to Functional Programming Succinctly Contributors Wanted for Higher Order Programming Project! Learning to code with python is like learning to swim with those little arm floaties. It gives you undeserved confidence and will eventually drown you. - DangerBunny

                J Offline
                J Offline
                Jeremy Falcon
                wrote on last edited by
                #8

                Marc Clifton wrote:

                Now I'm denormalized.

                You're welcome. :laugh:

                Marc Clifton wrote:

                The result is pushing processing onto the script language

                I've seen this argument too, and there can be some merit to it. But, this side is almost always played by someone who simply just sucks at database administration. All other things being equal about the only reasonable argument I can buy in regards to do some stuff in script on a client is distributing the workload rather than the server handle it all. But I seriously doubt anyone saying a blanket "joins suck" has any idea about mitigating a workload like that over a distributed process.

                Jeremy Falcon

                F 1 Reply Last reply
                0
                • J Jeremy Falcon

                  Marc Clifton wrote:

                  Now I'm denormalized.

                  You're welcome. :laugh:

                  Marc Clifton wrote:

                  The result is pushing processing onto the script language

                  I've seen this argument too, and there can be some merit to it. But, this side is almost always played by someone who simply just sucks at database administration. All other things being equal about the only reasonable argument I can buy in regards to do some stuff in script on a client is distributing the workload rather than the server handle it all. But I seriously doubt anyone saying a blanket "joins suck" has any idea about mitigating a workload like that over a distributed process.

                  Jeremy Falcon

                  F Offline
                  F Offline
                  Foothill
                  wrote on last edited by
                  #9

                  Jeremy Falcon wrote:

                  distributing the workload rather than the server handle it all

                  I have been wanting to learn how to build CLR SQL function DLLs to perform the kind of work that would be a candidate for distributed processing because the SQL statement would be 20K+ lines long but I just haven't found the time yet. It has the potential liberate DBAs everywhere from developers that don't know/refuse to use the more complicated T SQL bits. CLR in SQL tutorial

                  if (Object.DividedByZero == true) { Universe.Implode(); } Meus ratio ex fortis machina. Simplicitatis de formae ac munus. -Foothill, 2016

                  1 Reply Last reply
                  0
                  • M Marc Clifton

                    Jeremy Falcon wrote:

                    to send along the pipe one demoralized table

                    Now I'm denormalized. ;)

                    Jeremy Falcon wrote:

                    The true answer is, it depends.

                    Very true. It's just that he and I start with different baselines. I start with "let's assume the DB can optimize this right, given good table design, indices, etc." and if there's an issue, I look at how to optimize the query, and if that means a demoralized table :) then ok, but I had better have a good reason. The CTO starts with a different baseline, on the assumption that joins are always costly. The result is pushing processing onto the script language, of all things. Not much optimization can occur there, though frankly, the code is so krufty, I imagine there actually is a ton of optimization that could happen there, even without getting into the bowels of Django. Marc

                    Imperative to Functional Programming Succinctly Contributors Wanted for Higher Order Programming Project! Learning to code with python is like learning to swim with those little arm floaties. It gives you undeserved confidence and will eventually drown you. - DangerBunny

                    D Offline
                    D Offline
                    David Crow
                    wrote on last edited by
                    #10

                    Your argument reminds me of a time I worked for an organization that used a dBase-equivalent as their database engine. It did what it needed to do so there weren't many complaints, except when it came to certain reports. One in particular took nearly an hour to run, so it was only requested as needed. I didn't have much going on one day and I decided to poke around in the code that was in charge of that report. I looked it over and felt confident that I knew what it was doing, something the equivalent of "SELECT * FROM <table>" and then the result set was further processed (e.g., filtering, sorting) by the desktop code. I commented out a few lines of code and changed the query so that the database engine was doing all of the work rather than the desktop code. When I went to test that report, it came back almost instantly, so much so that I just assumed it failed. I did this several times and each time the report was instant and complete. I left it in place so that my supervisor could try it out when he returned. He was obviously skeptical at first, but when he saw it go, it was convincing enough that he told several users out in the business office to try it out too. Much praise was thrown our way. Moral of the story: design the tables correctly, and let the database engine handle the workload.

                    "One man's wage rise is another man's price increase." - Harold Wilson

                    "Fireproof doesn't mean the fire will never come. It means when the fire comes that you will be able to withstand it." - Michael Simmons

                    "You can easily judge the character of a man by how he treats those who can do nothing for him." - James D. Miles

                    1 Reply Last reply
                    0
                    • M Mycroft Holmes

                      I once had a DBA tell me that views were a bad idea because they had to be rebuilt every day. How did the guy get the job as CTO, I would presume there had to some experience with code, development or architecture. It sounds like this guy the epitomy of a manager, almost no knowledge of the subject matter and hoping to "manage" the department using his underlings knowledge, and you are not on site so the children have the most influence.

                      Never underestimate the power of human stupidity RAH

                      R Offline
                      R Offline
                      RossMW
                      wrote on last edited by
                      #11

                      I had a similiar discussion once with our Oracle DBA. I was asked to run queries on a views he had over tables rather then the table directly as the view will not impact on the tables performance...:confused:

                      V Richard DeemingR 2 Replies Last reply
                      0
                      • M Marc Clifton

                        So I avoided getting into an argument with the CTO about the performance of table joins. The nice thing about communication channels like Slack is that you simply don't need to reply. His contention is that table joins are "hugely expensive", particularly for reporting. My contention is of course that a well designed database will not have this issue. I came across this[^] interesting discussion. But the bottom line is, I wonder why he has this attitude, particularly when his reports are dog slow, and the page request to the server times out if you ask for a report that spans more than a day (maybe a week.) Looking at the code, no wonder, because the reports seem to be generated almost entirely in Python, without leveraging table joins on the server itself at all! Granted, there are certainly times when a very specialized report might benefit from denormalization, but for Pete's sake, these are basic transaction detail and transaction total queries. There are so many things I disagree with the CTO about. I must have reached some level of emotional maturity before taking this job because I'm able to not get into an argument, I just say "whatever, dude" and move along. Marc

                        Imperative to Functional Programming Succinctly Contributors Wanted for Higher Order Programming Project! Learning to code with python is like learning to swim with those little arm floaties. It gives you undeserved confidence and will eventually drown you. - DangerBunny

                        J Offline
                        J Offline
                        JohnLBevan
                        wrote on last edited by
                        #12

                        Some people read blogs offering advice and put the conclusions into their "internal rules list" instead of understanding the reasoning behind that advice. Normally those people are more likely to make it into management (and thus become CTOs) because they're able to make decisions faster & don't act "wishy washy"; i.e. because they just jump straight to an existing conclusion without taking time to assess the context and apply any thought. In some cases avoiding joins is better; though only if done in a sensible way (e.g. just moving joins into Python probably won't help). A good example is a BI solution where you're pulling data from a (normalised database) transactional system, and de-normalise it in your ETL phase, moving the grunt work to the out of hours batch instead of running it each time a user runs a report. Simple answer is to try to present a reasoned argument and (better / if time) build out both solutions to prove your point with real data. Good luck!

                        1 Reply Last reply
                        0
                        • M Marc Clifton

                          So I avoided getting into an argument with the CTO about the performance of table joins. The nice thing about communication channels like Slack is that you simply don't need to reply. His contention is that table joins are "hugely expensive", particularly for reporting. My contention is of course that a well designed database will not have this issue. I came across this[^] interesting discussion. But the bottom line is, I wonder why he has this attitude, particularly when his reports are dog slow, and the page request to the server times out if you ask for a report that spans more than a day (maybe a week.) Looking at the code, no wonder, because the reports seem to be generated almost entirely in Python, without leveraging table joins on the server itself at all! Granted, there are certainly times when a very specialized report might benefit from denormalization, but for Pete's sake, these are basic transaction detail and transaction total queries. There are so many things I disagree with the CTO about. I must have reached some level of emotional maturity before taking this job because I'm able to not get into an argument, I just say "whatever, dude" and move along. Marc

                          Imperative to Functional Programming Succinctly Contributors Wanted for Higher Order Programming Project! Learning to code with python is like learning to swim with those little arm floaties. It gives you undeserved confidence and will eventually drown you. - DangerBunny

                          B Offline
                          B Offline
                          Brady Kelly
                          wrote on last edited by
                          #13

                          Love your sig! :thumbsup:

                          Do what thou wilt shall be the whole of the Law. - Liber AL vel Legis 1:40, Aleister Crowley

                          1 Reply Last reply
                          0
                          • M Marc Clifton

                            So I avoided getting into an argument with the CTO about the performance of table joins. The nice thing about communication channels like Slack is that you simply don't need to reply. His contention is that table joins are "hugely expensive", particularly for reporting. My contention is of course that a well designed database will not have this issue. I came across this[^] interesting discussion. But the bottom line is, I wonder why he has this attitude, particularly when his reports are dog slow, and the page request to the server times out if you ask for a report that spans more than a day (maybe a week.) Looking at the code, no wonder, because the reports seem to be generated almost entirely in Python, without leveraging table joins on the server itself at all! Granted, there are certainly times when a very specialized report might benefit from denormalization, but for Pete's sake, these are basic transaction detail and transaction total queries. There are so many things I disagree with the CTO about. I must have reached some level of emotional maturity before taking this job because I'm able to not get into an argument, I just say "whatever, dude" and move along. Marc

                            Imperative to Functional Programming Succinctly Contributors Wanted for Higher Order Programming Project! Learning to code with python is like learning to swim with those little arm floaties. It gives you undeserved confidence and will eventually drown you. - DangerBunny

                            J Offline
                            J Offline
                            Jorgen Andersson
                            wrote on last edited by
                            #14

                            Marc Clifton wrote:

                            table joins are "hugely expensive", particularly for reporting

                            Since I've been working for ten years doing exactly reporting, from databases, I call that bullshit.

                            Marc Clifton wrote:

                            a well designed database will not have this issue.

                            That's a must, but not all that matters. The key to performance at every stage, is to keep the amount of data down. What you always have to keep in mind is the order of logical execution. First the joins are done, then the filtering(WHERE), aggregation (GROUP BY), Filtering on aggregates (HAVING) and so on. Note that the optimizer can decide to change the physical execution order if it believes it performs better. And the most important trick I've learned is to change the order of execution using CTE's. If you filter before you join, you handle less data.

                            Wrong is evil and must be defeated. - Jeff Ello

                            1 Reply Last reply
                            0
                            • M Mycroft Holmes

                              I once had a DBA tell me that views were a bad idea because they had to be rebuilt every day. How did the guy get the job as CTO, I would presume there had to some experience with code, development or architecture. It sounds like this guy the epitomy of a manager, almost no knowledge of the subject matter and hoping to "manage" the department using his underlings knowledge, and you are not on site so the children have the most influence.

                              Never underestimate the power of human stupidity RAH

                              V Offline
                              V Offline
                              Vark111
                              wrote on last edited by
                              #15

                              Mycroft Holmes wrote:

                              I once had a DBA tell me that views were a bad idea because they had to be rebuilt every day.

                              Materialized Views (an Oracle thing) do have to be rebuilt at some defined interval (not necessarily every day - the timing just depends on the nature of your data). That's not to say they are a bad idea because of that. Just have to be aware of it when you design your data arch.

                              1 Reply Last reply
                              0
                              • R RossMW

                                I had a similiar discussion once with our Oracle DBA. I was asked to run queries on a views he had over tables rather then the table directly as the view will not impact on the tables performance...:confused:

                                V Offline
                                V Offline
                                Vark111
                                wrote on last edited by
                                #16

                                If the view was a Materialized View, then your Oracle DBA was correct.

                                1 Reply Last reply
                                0
                                • M Marc Clifton

                                  So I avoided getting into an argument with the CTO about the performance of table joins. The nice thing about communication channels like Slack is that you simply don't need to reply. His contention is that table joins are "hugely expensive", particularly for reporting. My contention is of course that a well designed database will not have this issue. I came across this[^] interesting discussion. But the bottom line is, I wonder why he has this attitude, particularly when his reports are dog slow, and the page request to the server times out if you ask for a report that spans more than a day (maybe a week.) Looking at the code, no wonder, because the reports seem to be generated almost entirely in Python, without leveraging table joins on the server itself at all! Granted, there are certainly times when a very specialized report might benefit from denormalization, but for Pete's sake, these are basic transaction detail and transaction total queries. There are so many things I disagree with the CTO about. I must have reached some level of emotional maturity before taking this job because I'm able to not get into an argument, I just say "whatever, dude" and move along. Marc

                                  Imperative to Functional Programming Succinctly Contributors Wanted for Higher Order Programming Project! Learning to code with python is like learning to swim with those little arm floaties. It gives you undeserved confidence and will eventually drown you. - DangerBunny

                                  C Offline
                                  C Offline
                                  charlieg
                                  wrote on last edited by
                                  #17

                                  I'll take that $20 and raise it. My guess is that the CTO had "a bad experience" early in his career with databases and joins, so much so that his view is prejudiced with obsolete knowledge. Run into it time and again with other technical items.

                                  Charlie Gilley Stuck in a dysfunctional matrix from which I must escape... "Where liberty dwells, there is my country." B. Franklin, 1783 “They who can give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety.” BF, 1759

                                  1 Reply Last reply
                                  0
                                  • R RossMW

                                    I had a similiar discussion once with our Oracle DBA. I was asked to run queries on a views he had over tables rather then the table directly as the view will not impact on the tables performance...:confused:

                                    Richard DeemingR Offline
                                    Richard DeemingR Offline
                                    Richard Deeming
                                    wrote on last edited by
                                    #18

                                    If it was MS SQL Server, it wouldn't surprise me if his views all had the WITH (NOLOCK) hint on every table. But AFAIK, Oracle rather sensibly doesn't have that option.


                                    "These people looked deep within my soul and assigned me a number based on the order in which I joined." - Homer

                                    "These people looked deep within my soul and assigned me a number based on the order in which I joined" - Homer

                                    1 Reply Last reply
                                    0
                                    • M Marc Clifton

                                      So I avoided getting into an argument with the CTO about the performance of table joins. The nice thing about communication channels like Slack is that you simply don't need to reply. His contention is that table joins are "hugely expensive", particularly for reporting. My contention is of course that a well designed database will not have this issue. I came across this[^] interesting discussion. But the bottom line is, I wonder why he has this attitude, particularly when his reports are dog slow, and the page request to the server times out if you ask for a report that spans more than a day (maybe a week.) Looking at the code, no wonder, because the reports seem to be generated almost entirely in Python, without leveraging table joins on the server itself at all! Granted, there are certainly times when a very specialized report might benefit from denormalization, but for Pete's sake, these are basic transaction detail and transaction total queries. There are so many things I disagree with the CTO about. I must have reached some level of emotional maturity before taking this job because I'm able to not get into an argument, I just say "whatever, dude" and move along. Marc

                                      Imperative to Functional Programming Succinctly Contributors Wanted for Higher Order Programming Project! Learning to code with python is like learning to swim with those little arm floaties. It gives you undeserved confidence and will eventually drown you. - DangerBunny

                                      Richard DeemingR Offline
                                      Richard DeemingR Offline
                                      Richard Deeming
                                      wrote on last edited by
                                      #19

                                      Don't argue about it - measure it! Measure the performance of his solution, loading the data from all tables into memory and then manipulating it with imperative code. And then measure the performance of a query that joins the tables in the database. If his version is faster, then there's something seriously wrong with the query, or with the database. :)


                                      "These people looked deep within my soul and assigned me a number based on the order in which I joined." - Homer

                                      "These people looked deep within my soul and assigned me a number based on the order in which I joined" - Homer

                                      M 1 Reply Last reply
                                      0
                                      • Richard DeemingR Richard Deeming

                                        Don't argue about it - measure it! Measure the performance of his solution, loading the data from all tables into memory and then manipulating it with imperative code. And then measure the performance of a query that joins the tables in the database. If his version is faster, then there's something seriously wrong with the query, or with the database. :)


                                        "These people looked deep within my soul and assigned me a number based on the order in which I joined." - Homer

                                        M Offline
                                        M Offline
                                        Middle Manager
                                        wrote on last edited by
                                        #20

                                        This was my thought too. Seems like a no-brainer. However if there are ugly politics at play or people's egos getting bruised one must weight his own's benefit it interfering. It sounds like Marc may be doing this by choosing to let this guy wallow in his own shit.

                                        1 Reply Last reply
                                        0
                                        • M Marc Clifton

                                          So I avoided getting into an argument with the CTO about the performance of table joins. The nice thing about communication channels like Slack is that you simply don't need to reply. His contention is that table joins are "hugely expensive", particularly for reporting. My contention is of course that a well designed database will not have this issue. I came across this[^] interesting discussion. But the bottom line is, I wonder why he has this attitude, particularly when his reports are dog slow, and the page request to the server times out if you ask for a report that spans more than a day (maybe a week.) Looking at the code, no wonder, because the reports seem to be generated almost entirely in Python, without leveraging table joins on the server itself at all! Granted, there are certainly times when a very specialized report might benefit from denormalization, but for Pete's sake, these are basic transaction detail and transaction total queries. There are so many things I disagree with the CTO about. I must have reached some level of emotional maturity before taking this job because I'm able to not get into an argument, I just say "whatever, dude" and move along. Marc

                                          Imperative to Functional Programming Succinctly Contributors Wanted for Higher Order Programming Project! Learning to code with python is like learning to swim with those little arm floaties. It gives you undeserved confidence and will eventually drown you. - DangerBunny

                                          J Offline
                                          J Offline
                                          James Curran
                                          wrote on last edited by
                                          #21

                                          Well, of course joining two tables will take some time. It doesn't matter where you join them -- in the DB or in the Python code --- it's going to take some time. However, it is always better to let the DB do what it is good it --- working with data. I'm always amazed the number of developers who don't recognize that simple divide: - EVERYTHING related to data (querying, filtering, selecting) should be done in the Database. - NOTHING else should be.

                                          Truth, James

                                          1 Reply Last reply
                                          0
                                          Reply
                                          • Reply as topic
                                          Log in to reply
                                          • Oldest to Newest
                                          • Newest to Oldest
                                          • Most Votes


                                          • Login

                                          • Don't have an account? Register

                                          • Login or register to search.
                                          • First post
                                            Last post
                                          0
                                          • Categories
                                          • Recent
                                          • Tags
                                          • Popular
                                          • World
                                          • Users
                                          • Groups