Top 100
-
So I am working with some legacy data trying to clean it up and move it into a data mart. Scenario 5 files of transactions varying from 10k to 10m rows with 100 columns (yeah I know a disgusting number of columns but I'm still working on it) I load the csv file into a staging table using BCP with all columns as varchar. Update the dimension tables Working with the smallest table 13k rows. Using a select statement with joins to the dimension table to get the FK ids I have 10 inner joins so far. The Issue To reduce the response time I added TOP 100 to the select statement and was horrified to get a result in 2:26, when I remove the TOP 10 the execution time DROPS to 16 seconds. I always thought top N wold reduce the response time not increase it by orders f magnitude!
Never underestimate the power of human stupidity RAH
-
So I am working with some legacy data trying to clean it up and move it into a data mart. Scenario 5 files of transactions varying from 10k to 10m rows with 100 columns (yeah I know a disgusting number of columns but I'm still working on it) I load the csv file into a staging table using BCP with all columns as varchar. Update the dimension tables Working with the smallest table 13k rows. Using a select statement with joins to the dimension table to get the FK ids I have 10 inner joins so far. The Issue To reduce the response time I added TOP 100 to the select statement and was horrified to get a result in 2:26, when I remove the TOP 10 the execution time DROPS to 16 seconds. I always thought top N wold reduce the response time not increase it by orders f magnitude!
Never underestimate the power of human stupidity RAH
I guess it makes a sort of the set before picking the TOP 100.
"When did ignorance become a point of view" - Dilbert
-
I guess it makes a sort of the set before picking the TOP 100.
"When did ignorance become a point of view" - Dilbert
If I had an order by clause I would expect that but there is no order by. I wonder if SQL 2008 has some implicit ordering (usually the first column in the select statement) that is impacting the query?
Never underestimate the power of human stupidity RAH
-
If I had an order by clause I would expect that but there is no order by. I wonder if SQL 2008 has some implicit ordering (usually the first column in the select statement) that is impacting the query?
Never underestimate the power of human stupidity RAH
It depends on the query plan. But you're having some joins so I would assume so.
"When did ignorance become a point of view" - Dilbert
-
It depends on the query plan. But you're having some joins so I would assume so.
"When did ignorance become a point of view" - Dilbert
I have around 12 joins and adding more as the day dissapears. I am not bothered about the query plan at this point as I still designing the procedure. I was just astonished that the TOP 10 increases the response time!
Never underestimate the power of human stupidity RAH
-
I have around 12 joins and adding more as the day dissapears. I am not bothered about the query plan at this point as I still designing the procedure. I was just astonished that the TOP 10 increases the response time!
Never underestimate the power of human stupidity RAH
If you are doing table scans in there, this will have an impact as your code has to read all the rows to determine what the top 100 actually are. The QEP will identify where the problems are.
"WPF has many lovers. It's a veritable porn star!" - Josh Smith
As Braveheart once said, "You can take our freedom but you'll never take our Hobnobs!" - Martin Hughes.
-
If you are doing table scans in there, this will have an impact as your code has to read all the rows to determine what the top 100 actually are. The QEP will identify where the problems are.
"WPF has many lovers. It's a veritable porn star!" - Josh Smith
As Braveheart once said, "You can take our freedom but you'll never take our Hobnobs!" - Martin Hughes.
My assumption is that without and order by clause it should not have to scan the table, just get the first 10 records. It was during development so the performance was not really relevant as I had to tune the indexes after stabilising the query. I was just astonished it was slower than the full select. Transforming this data in is going to take about 6 hours by the look of it, thankfully it is monthly not daily!
Never underestimate the power of human stupidity RAH