Skip to content
  • Categories
  • Recent
  • Tags
  • Popular
  • World
  • Users
  • Groups
Skins
  • Light
  • Cerulean
  • Cosmo
  • Flatly
  • Journal
  • Litera
  • Lumen
  • Lux
  • Materia
  • Minty
  • Morph
  • Pulse
  • Sandstone
  • Simplex
  • Sketchy
  • Spacelab
  • United
  • Yeti
  • Zephyr
  • Dark
  • Cyborg
  • Darkly
  • Quartz
  • Slate
  • Solar
  • Superhero
  • Vapor

  • Default (No Skin)
  • No Skin
Collapse
Code Project
  1. Home
  2. Database & SysAdmin
  3. Database
  4. Slow query when using @variable in Where clause [modified]

Slow query when using @variable in Where clause [modified]

Scheduled Pinned Locked Moved Database
databasecsshelpannouncement
34 Posts 4 Posters 0 Views 1 Watching
  • Oldest to Newest
  • Newest to Oldest
  • Most Votes
Reply
  • Reply as topic
Log in to reply
This topic has been deleted. Only users with topic management privileges can see it.
  • M Meysam Mahfouzi

    No. it was not intentional. The actual query is using pagination (using ROWNUMBER()) and rows are actually ordered by the Date column. This is just a simple sample query.

    W Offline
    W Offline
    Wendelius
    wrote on last edited by
    #10

    That's what I suspected. And now we're actually talking about a whole different story. Now if you modify the query and use a date column, for example something like:

    DECLARE @c TINYINT
    SET @c = 0
    SELECT TOP 10 *
    FROM News
    WHERE Culture = @c
    ORDER BY AdditionDate

    you need to add a new index to your table on columns Culture, AdditionDate. If you make that modification to your query, your filter factor will be dramatically different so the index will be efficient. There's a simple rule of thumb (which is correct in most of the cases). If the filter factor regarding an index is less than 5%, using index is efficient. Above that it's not efficient. So in your first example if you had 4 different cultures and let's say that they were evenly distributed, the filter factor was 25%. So in that case using the index (culture) is going to be a very poor solution by the optimizer. But if it does use the index in that scenario (as I suspect it did) you will suffer from poor performance. In such cases actually removing the index makes the performance better (unless the index is used in other scenarios). The reasons why some of your test were performing better was because the data was unevenly distributed. So what I suggest is that you first modify the query to the final version and begin optimization after that (if needed anymore). What I said about UPDATE STATISTICS is still a valid statement.

    The need to optimize rises from a bad design.My articles[^]

    M 1 Reply Last reply
    0
    • M Meysam Mahfouzi

      Hi Ben, I couldn't see the execution plan for the long running query since it didn't finish executing after 25 minutes and I couldn't wait more :)

      B Offline
      B Offline
      Ben Fair
      wrote on last edited by
      #11

      It looks like Mika has helped you very well, and hopefully you're on your way to resolving the problem. I should've been more explicit in my message that I was referring to the Estimated Execution Plan rather than the Actual Execution Plan. For the Estimated execution plan, the query is not actually run, it's just parsed and analyzed. It's handy to see how SQL Server plans on executing the query, which may be different than the execution plan that it actually uses. However, I've found it to be a very good resource in situations like this where something is long-running and the actual execution plan is hard to come by.

      Hope in one hand and poop in the other; see which fills up first. Hope and change were good slogans, now show us more than words.

      1 Reply Last reply
      0
      • W Wendelius

        That's what I suspected. And now we're actually talking about a whole different story. Now if you modify the query and use a date column, for example something like:

        DECLARE @c TINYINT
        SET @c = 0
        SELECT TOP 10 *
        FROM News
        WHERE Culture = @c
        ORDER BY AdditionDate

        you need to add a new index to your table on columns Culture, AdditionDate. If you make that modification to your query, your filter factor will be dramatically different so the index will be efficient. There's a simple rule of thumb (which is correct in most of the cases). If the filter factor regarding an index is less than 5%, using index is efficient. Above that it's not efficient. So in your first example if you had 4 different cultures and let's say that they were evenly distributed, the filter factor was 25%. So in that case using the index (culture) is going to be a very poor solution by the optimizer. But if it does use the index in that scenario (as I suspect it did) you will suffer from poor performance. In such cases actually removing the index makes the performance better (unless the index is used in other scenarios). The reasons why some of your test were performing better was because the data was unevenly distributed. So what I suggest is that you first modify the query to the final version and begin optimization after that (if needed anymore). What I said about UPDATE STATISTICS is still a valid statement.

        The need to optimize rises from a bad design.My articles[^]

        M Offline
        M Offline
        Meysam Mahfouzi
        wrote on last edited by
        #12

        Dear Mika, Here is the most similar query to the actual working one in our database:

        DECLARE @Start INT, @Count INT

        SET @Start = 10
        SET @Count = 5

        DECLARE @c TINYINT
        SET @c = 1;

        WITH paging AS (
        SELECT id, title, description, ROW_NUMBER() OVER (ORDER BY Date DESC) rownum
        FROM News
        WHERE Culture = @c
        )
        SELECT * FROM paging WHERE rownum BETWEEN @Start AND (@Start + @Count - 1)

        The above query runs very fast but when I set @c to 0, it becomes a long-running query. From execution plan it's seen that only Date index is used while Culture index is not. I have uploaded an image of the execution plan of the above query here[^] for you to have a look at. I also ran "UPDATE STATISTICS News" which made no difference. Thanks for any help

        modified on Wednesday, January 28, 2009 2:49 AM

        W 1 Reply Last reply
        0
        • R Rob Philpott

          I had a very similar issue recently myself. Not eaxactly sure why it is but it was suggested that SQL Server can't optimise queries which contain variables - they could change as the query executes. Anyway, the general conclusion and advice I drew was to avoid using variables and join onto other tables where possible. Have you tried inserting @c into a single line temporary table and joining on that? I admit that its rather a grubby solution, but then again it is SQL Server and sometimes you have to hold its hand.

          Regards, Rob Philpott.

          M Offline
          M Offline
          Meysam Mahfouzi
          wrote on last edited by
          #13

          Rob Philpott wrote:

          Have you tried inserting @c into a single line temporary table and joining on that?

          Dear Rob, I used the following query but it didn't make any difference:

          DECLARE @Start INT, @Count INT

          SET @Start = 10
          SET @Count = 5

          DECLARE @c TINYINT
          SET @c = 0;

          WITH paging AS (
          SELECT id, title, [description], ROW_NUMBER() OVER (ORDER BY Date DESC) rownum
          FROM news
          JOIN (SELECT @C AS Cult) AS CC ON news.Culture = CC.Cult
          --WHERE Culture = @c
          )
          SELECT * FROM paging WHERE rownum BETWEEN @Start AND (@Start + @Count - 1)

          modified on Wednesday, January 28, 2009 3:03 AM

          R 1 Reply Last reply
          0
          • M Meysam Mahfouzi

            Dear Mika, Here is the most similar query to the actual working one in our database:

            DECLARE @Start INT, @Count INT

            SET @Start = 10
            SET @Count = 5

            DECLARE @c TINYINT
            SET @c = 1;

            WITH paging AS (
            SELECT id, title, description, ROW_NUMBER() OVER (ORDER BY Date DESC) rownum
            FROM News
            WHERE Culture = @c
            )
            SELECT * FROM paging WHERE rownum BETWEEN @Start AND (@Start + @Count - 1)

            The above query runs very fast but when I set @c to 0, it becomes a long-running query. From execution plan it's seen that only Date index is used while Culture index is not. I have uploaded an image of the execution plan of the above query here[^] for you to have a look at. I also ran "UPDATE STATISTICS News" which made no difference. Thanks for any help

            modified on Wednesday, January 28, 2009 2:49 AM

            W Offline
            W Offline
            Wendelius
            wrote on last edited by
            #14

            How many rows do you have in the table that have Culture 0 and how many for Culture 1? Also is the estimated execution plan same in both cases?

            The need to optimize rises from a bad design.My articles[^]

            M 1 Reply Last reply
            0
            • W Wendelius

              How many rows do you have in the table that have Culture 0 and how many for Culture 1? Also is the estimated execution plan same in both cases?

              The need to optimize rises from a bad design.My articles[^]

              M Offline
              M Offline
              Meysam Mahfouzi
              wrote on last edited by
              #15

              Right now all culture values in database are 1. And yes, the Estimated Execution Plan is exactly the same for both @c = 1 and @c = 0.

              W 1 Reply Last reply
              0
              • M Meysam Mahfouzi

                Right now all culture values in database are 1. And yes, the Estimated Execution Plan is exactly the same for both @c = 1 and @c = 0.

                W Offline
                W Offline
                Wendelius
                wrote on last edited by
                #16

                Okay, first you could add an index with two columns: Culture, Date DESC:

                CREATE INDEX X_Test ON News (Culture, Date DESC);

                After that, see what happens for the execution plan (and the performance). After doing that, you could experiment a change to the query (I suspect that the above modification will not be enough, but if it is, the skip the rest):

                SELECT TOP(@Count)
                id, title, description
                FROM News a
                WHERE Culture = @c
                AND @Start <= (SELECT COUNT(*)
                FROM News b
                WHERE b.Date < a.Date
                AND b.Culture = @c)
                ORDER BY Date DESC

                Again see execution plan and performance. Also is the ID column defined as IDENTITY and are the records added to the table in date order. In other words what I'm after, if the identity is greater on one row is it always true that its a newer post based on date column? If you could post the results for those two operations.

                The need to optimize rises from a bad design.My articles[^]

                M 1 Reply Last reply
                0
                • W Wendelius

                  Okay, first you could add an index with two columns: Culture, Date DESC:

                  CREATE INDEX X_Test ON News (Culture, Date DESC);

                  After that, see what happens for the execution plan (and the performance). After doing that, you could experiment a change to the query (I suspect that the above modification will not be enough, but if it is, the skip the rest):

                  SELECT TOP(@Count)
                  id, title, description
                  FROM News a
                  WHERE Culture = @c
                  AND @Start <= (SELECT COUNT(*)
                  FROM News b
                  WHERE b.Date < a.Date
                  AND b.Culture = @c)
                  ORDER BY Date DESC

                  Again see execution plan and performance. Also is the ID column defined as IDENTITY and are the records added to the table in date order. In other words what I'm after, if the identity is greater on one row is it always true that its a newer post based on date column? If you could post the results for those two operations.

                  The need to optimize rises from a bad design.My articles[^]

                  M Offline
                  M Offline
                  Meysam Mahfouzi
                  wrote on last edited by
                  #17

                  Dear Mika, Adding the following index:

                  CREATE INDEX X_Test ON News (Culture, Date DESC);

                  didn't solve the problem. But after I disabled the index which only was on Date column, both queries ran very fast. (that is, disabling an index apparently solved the problem). The index containing the two columns (Culture, Date) was not being used as long as there was an index on Date column. So now, this query runs very fast for both @c = 0 and @c = 1

                  DECLARE @Start INT, @Count INT

                  SET @Start = 1
                  SET @Count = 5

                  DECLARE @c TINYINT
                  SET @c = 1;

                  WITH paging AS (
                  SELECT id, title, [description], ROW_NUMBER() OVER (ORDER BY Date DESC) rownum
                  FROM News
                  WHERE Culture = @c
                  )
                  SELECT * FROM paging WHERE rownum BETWEEN @Start AND (@Start + @Count - 1)

                  But it turned out to be a temporary cause of happiness! If I start changing the value of @start parameter, things will start to change. When @c is set to 1, the query runs very fast for the following @start values: 1, 10, 100, 1000, 10,000 But when I set @start to 100,000, the query become a long-running one (It took 2:27 mins to execute). After executing the query again (with @start set to 100,000) it executed very fast (probably due to previous execution). Now when I set @start to 200,000 again it goes to sleep... When @c is set to 0, it will execute very fast for any value of @start variable. Regarding your question about Date column: As I told you, I've inserted 1,500,000 rows randomly into News table. Therefor, even though the ID column is an incrementing Identity column, the Date is not necessary greater for greater ID columns. In the real-world running database though, the Date value is greater for greater ID columns for sure. Here is the execution plan of my query with the new two-column index: look[^] and here is the execution plan for your query: look[^] ** p.s. I also noted that the order of columns in Index is important in this case. If I move the Culture to become the second column in index, the index becomes of no use.

                  W 1 Reply Last reply
                  0
                  • M Meysam Mahfouzi

                    Dear Mika, Adding the following index:

                    CREATE INDEX X_Test ON News (Culture, Date DESC);

                    didn't solve the problem. But after I disabled the index which only was on Date column, both queries ran very fast. (that is, disabling an index apparently solved the problem). The index containing the two columns (Culture, Date) was not being used as long as there was an index on Date column. So now, this query runs very fast for both @c = 0 and @c = 1

                    DECLARE @Start INT, @Count INT

                    SET @Start = 1
                    SET @Count = 5

                    DECLARE @c TINYINT
                    SET @c = 1;

                    WITH paging AS (
                    SELECT id, title, [description], ROW_NUMBER() OVER (ORDER BY Date DESC) rownum
                    FROM News
                    WHERE Culture = @c
                    )
                    SELECT * FROM paging WHERE rownum BETWEEN @Start AND (@Start + @Count - 1)

                    But it turned out to be a temporary cause of happiness! If I start changing the value of @start parameter, things will start to change. When @c is set to 1, the query runs very fast for the following @start values: 1, 10, 100, 1000, 10,000 But when I set @start to 100,000, the query become a long-running one (It took 2:27 mins to execute). After executing the query again (with @start set to 100,000) it executed very fast (probably due to previous execution). Now when I set @start to 200,000 again it goes to sleep... When @c is set to 0, it will execute very fast for any value of @start variable. Regarding your question about Date column: As I told you, I've inserted 1,500,000 rows randomly into News table. Therefor, even though the ID column is an incrementing Identity column, the Date is not necessary greater for greater ID columns. In the real-world running database though, the Date value is greater for greater ID columns for sure. Here is the execution plan of my query with the new two-column index: look[^] and here is the execution plan for your query: look[^] ** p.s. I also noted that the order of columns in Index is important in this case. If I move the Culture to become the second column in index, the index becomes of no use.

                    W Offline
                    W Offline
                    Wendelius
                    wrote on last edited by
                    #18

                    Maysam Mahfouzi wrote:

                    after I disabled the index which only was on Date column, both queries ran very fast

                    Yeah, that's on of those annoying features for SQL Server optimizer. It isn't very good in index selection.

                    Maysam Mahfouzi wrote:

                    But when I set @start to 100,000, the query become a long-running one

                    Yes, that was predictable since the further you go, the more rows have to be read before you're "in the right position". And I suspect that this was the way both query versions behaved?

                    Maysam Mahfouzi wrote:

                    In the real-world running database though, the Date value is greater for greater ID columns for sure.

                    Is that something we can rely on? If it is, it would be possible to change the query to be based on id (however in that case you should regenerate the test environment data to correspond actual data). Another option is that can you change the parameters? Instead of defining start point as number of previous rows is ot ok to use some other mechanism (...thinking about the solution...)

                    The need to optimize rises from a bad design.My articles[^]

                    M 1 Reply Last reply
                    0
                    • W Wendelius

                      Maysam Mahfouzi wrote:

                      after I disabled the index which only was on Date column, both queries ran very fast

                      Yeah, that's on of those annoying features for SQL Server optimizer. It isn't very good in index selection.

                      Maysam Mahfouzi wrote:

                      But when I set @start to 100,000, the query become a long-running one

                      Yes, that was predictable since the further you go, the more rows have to be read before you're "in the right position". And I suspect that this was the way both query versions behaved?

                      Maysam Mahfouzi wrote:

                      In the real-world running database though, the Date value is greater for greater ID columns for sure.

                      Is that something we can rely on? If it is, it would be possible to change the query to be based on id (however in that case you should regenerate the test environment data to correspond actual data). Another option is that can you change the parameters? Instead of defining start point as number of previous rows is ot ok to use some other mechanism (...thinking about the solution...)

                      The need to optimize rises from a bad design.My articles[^]

                      M Offline
                      M Offline
                      Meysam Mahfouzi
                      wrote on last edited by
                      #19

                      Mika Wendelius wrote:

                      And I suspect that this was the way both query versions behaved?

                      It only happened for @c = 1

                      Mika Wendelius wrote:

                      Is that something we can rely on?

                      Probably, but I think it would be better to write a query independent of Date values. Frankly, I think my query is not much complicated and everybody now and then my find himself dealing with such queries. It's really sad that SQL Server is not able to handle it. Thanks for trying to help me Mika, please let me know if you come up with any solution.

                      W 1 Reply Last reply
                      0
                      • M Meysam Mahfouzi

                        Mika Wendelius wrote:

                        And I suspect that this was the way both query versions behaved?

                        It only happened for @c = 1

                        Mika Wendelius wrote:

                        Is that something we can rely on?

                        Probably, but I think it would be better to write a query independent of Date values. Frankly, I think my query is not much complicated and everybody now and then my find himself dealing with such queries. It's really sad that SQL Server is not able to handle it. Thanks for trying to help me Mika, please let me know if you come up with any solution.

                        W Offline
                        W Offline
                        Wendelius
                        wrote on last edited by
                        #20

                        What about the parameter question I asked? I was thinking about a solution where you would define a starting point as float. Initial starting point is 0 and you pass how many rows you want per page (for example 10). Next time you call this query, you would specify the starting point as the float found on the last row of the page you got and then again how many rows you want. So what this means in action is that first you create a new column:

                        ALTER TABLE News
                        ADD StartPoint AS ((Culture * 100000000) + CONVERT(float, Date))
                        PERSISTED

                        Then you index it:

                        CREATE NONCLUSTERED INDEX X_StartPoint ON News (StartPoint)

                        Now the query would be like:

                        DECLARE @Start float, @Count INT
                        SET @Start = 0
                        SET @Count = 5
                        DECLARE @c TINYINT
                        SET @c = 1;
                        SELECT TOP(@Count)
                        id, title, description, StartPoint
                        FROM News a
                        WHERE Culture = @c
                        AND StartPoint > @Start
                        AND StartPoint < ((@c + 1) * 100000000)
                        ORDER BY Start

                        The need to optimize rises from a bad design.My articles[^]

                        M 1 Reply Last reply
                        0
                        • M Meysam Mahfouzi

                          Rob Philpott wrote:

                          Have you tried inserting @c into a single line temporary table and joining on that?

                          Dear Rob, I used the following query but it didn't make any difference:

                          DECLARE @Start INT, @Count INT

                          SET @Start = 10
                          SET @Count = 5

                          DECLARE @c TINYINT
                          SET @c = 0;

                          WITH paging AS (
                          SELECT id, title, [description], ROW_NUMBER() OVER (ORDER BY Date DESC) rownum
                          FROM news
                          JOIN (SELECT @C AS Cult) AS CC ON news.Culture = CC.Cult
                          --WHERE Culture = @c
                          )
                          SELECT * FROM paging WHERE rownum BETWEEN @Start AND (@Start + @Count - 1)

                          modified on Wednesday, January 28, 2009 3:03 AM

                          R Offline
                          R Offline
                          Rob Philpott
                          wrote on last edited by
                          #21

                          Right, might be barking up the wrong tree then. How about using a proper temporary table such:

                          declare @c tinyint
                          set @c = 0

                          create table #c (Culture tinyint)

                          insert into #c select @c

                          select id, title, [description] from news join #c on news.Culture = #c.Culture

                          drop table #c

                          Regards, Rob Philpott.

                          1 Reply Last reply
                          0
                          • W Wendelius

                            What about the parameter question I asked? I was thinking about a solution where you would define a starting point as float. Initial starting point is 0 and you pass how many rows you want per page (for example 10). Next time you call this query, you would specify the starting point as the float found on the last row of the page you got and then again how many rows you want. So what this means in action is that first you create a new column:

                            ALTER TABLE News
                            ADD StartPoint AS ((Culture * 100000000) + CONVERT(float, Date))
                            PERSISTED

                            Then you index it:

                            CREATE NONCLUSTERED INDEX X_StartPoint ON News (StartPoint)

                            Now the query would be like:

                            DECLARE @Start float, @Count INT
                            SET @Start = 0
                            SET @Count = 5
                            DECLARE @c TINYINT
                            SET @c = 1;
                            SELECT TOP(@Count)
                            id, title, description, StartPoint
                            FROM News a
                            WHERE Culture = @c
                            AND StartPoint > @Start
                            AND StartPoint < ((@c + 1) * 100000000)
                            ORDER BY Start

                            The need to optimize rises from a bad design.My articles[^]

                            M Offline
                            M Offline
                            Meysam Mahfouzi
                            wrote on last edited by
                            #22

                            Thanks Mika for the solution, But I can not make the column persisted because of the following error: Computed column 'StartPoint' in table 'News' cannot be persisted because the column is non-deterministic.

                            W 1 Reply Last reply
                            0
                            • M Meysam Mahfouzi

                              Thanks Mika for the solution, But I can not make the column persisted because of the following error: Computed column 'StartPoint' in table 'News' cannot be persisted because the column is non-deterministic.

                              W Offline
                              W Offline
                              Wendelius
                              wrote on last edited by
                              #23

                              Sorry, at least one typo. The date column must be in angle brackets. I made succesfully a test like this. Is it similar to your structure:

                              CREATE TABLE News (
                              [Date] datetime,
                              [Id] int,
                              [Culture] tinyint
                              )

                              ALTER TABLE News
                              ADD StartPoint AS ((Culture * 100000000) + CONVERT(float, [Date]))
                              PERSISTED

                              The need to optimize rises from a bad design.My articles[^]

                              M 1 Reply Last reply
                              0
                              • W Wendelius

                                Sorry, at least one typo. The date column must be in angle brackets. I made succesfully a test like this. Is it similar to your structure:

                                CREATE TABLE News (
                                [Date] datetime,
                                [Id] int,
                                [Culture] tinyint
                                )

                                ALTER TABLE News
                                ADD StartPoint AS ((Culture * 100000000) + CONVERT(float, [Date]))
                                PERSISTED

                                The need to optimize rises from a bad design.My articles[^]

                                M Offline
                                M Offline
                                Meysam Mahfouzi
                                wrote on last edited by
                                #24

                                Mika Wendelius wrote:

                                CREATE TABLE News ( [Date] datetime, [Id] int, [Culture] tinyint ) -- ALTER TABLE News ADD StartPoint AS ((Culture * 100000000) + CONVERT(float, [Date])) PERSISTED

                                I can execute this query too, but when I want to alter the current News table, it tells me that the columns is non-deterministic. :confused: Apart from that, I didn't get the concept of multiplying Culture by 100000000. Why didn't you write something just like this?

                                ADD StartPoint AS CONVERT(float, [Date])

                                And yet another question, what't the difference between sorting on a float column rather than date column?

                                W 1 Reply Last reply
                                0
                                • M Meysam Mahfouzi

                                  Mika Wendelius wrote:

                                  CREATE TABLE News ( [Date] datetime, [Id] int, [Culture] tinyint ) -- ALTER TABLE News ADD StartPoint AS ((Culture * 100000000) + CONVERT(float, [Date])) PERSISTED

                                  I can execute this query too, but when I want to alter the current News table, it tells me that the columns is non-deterministic. :confused: Apart from that, I didn't get the concept of multiplying Culture by 100000000. Why didn't you write something just like this?

                                  ADD StartPoint AS CONVERT(float, [Date])

                                  And yet another question, what't the difference between sorting on a float column rather than date column?

                                  W Offline
                                  W Offline
                                  Wendelius
                                  wrote on last edited by
                                  #25

                                  Maysam Mahfouzi wrote:

                                  when I want to alter the current News table, it tells me that the columns is non-deterministic

                                  Could you post the script for the News table.

                                  Maysam Mahfouzi wrote:

                                  I didn't get the concept of multiplying Culture by 100000000

                                  The idea is to create a single column which orders the culture and date columns together. To combine those columns to a single column and to have unique values I converted the date to float so that I can add culture to it. However since Cultures are 1,2,3... I cannot add them to the date since it would change the "date portion" of the float so I decided to multiply the culture with a number big enough so that it won't get mixed with the date. For example if I would add a new record to the new right now with culture 1 the result would be 100039839.592728. After that I can use these numbers as "pointers" to rows to define a specific starting point to the query. This eliminates the need to read previous records. So what I'm after is that if you have selected news page by page, the next query can always start from the exact point where the last record came from.

                                  The need to optimize rises from a bad design.My articles[^]

                                  M 1 Reply Last reply
                                  0
                                  • W Wendelius

                                    Maysam Mahfouzi wrote:

                                    when I want to alter the current News table, it tells me that the columns is non-deterministic

                                    Could you post the script for the News table.

                                    Maysam Mahfouzi wrote:

                                    I didn't get the concept of multiplying Culture by 100000000

                                    The idea is to create a single column which orders the culture and date columns together. To combine those columns to a single column and to have unique values I converted the date to float so that I can add culture to it. However since Cultures are 1,2,3... I cannot add them to the date since it would change the "date portion" of the float so I decided to multiply the culture with a number big enough so that it won't get mixed with the date. For example if I would add a new record to the new right now with culture 1 the result would be 100039839.592728. After that I can use these numbers as "pointers" to rows to define a specific starting point to the query. This eliminates the need to read previous records. So what I'm after is that if you have selected news page by page, the next query can always start from the exact point where the last record came from.

                                    The need to optimize rises from a bad design.My articles[^]

                                    M Offline
                                    M Offline
                                    Meysam Mahfouzi
                                    wrote on last edited by
                                    #26

                                    So I suppose that you are suggesting an alternative way to replace ROW_NUMBER() function, so that we do not need to calculate row number each time from scratch, am I right? If we use:

                                    ROW_NUMBER() OVER(ORDER BY news.[Date] DESC)

                                    each time all rows should be sorted based on Date and then row number will be assigned to them. But with your approach, we do not need to sort Date column at all. If I'm right, can we conclude that ROW_NUMBER() OVER(ORDER BY news.[Date] DESC) is not a good option for tables containing a millions of rows? here is my table script:

                                    CREATE TABLE [News](
                                    [ID] [bigint] IDENTITY(1,1) NOT NULL,
                                    [Url] [varchar](300) NULL,
                                    [Title] [nvarchar](200) NULL,
                                    [Description] [nvarchar](3000) NULL,
                                    [Date] [datetime] NULL,
                                    [Rank] [float] NULL,
                                    [ViewCount] [int] NULL,
                                    [Vote] [smallint] NULL,
                                    [Culture] [tinyint] NULL,
                                    CONSTRAINT [PK_NewsItem_1] PRIMARY KEY CLUSTERED
                                    (
                                    [ID] ASC
                                    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF,
                                    ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
                                    ) ON [PRIMARY]

                                    W 1 Reply Last reply
                                    0
                                    • M Meysam Mahfouzi

                                      So I suppose that you are suggesting an alternative way to replace ROW_NUMBER() function, so that we do not need to calculate row number each time from scratch, am I right? If we use:

                                      ROW_NUMBER() OVER(ORDER BY news.[Date] DESC)

                                      each time all rows should be sorted based on Date and then row number will be assigned to them. But with your approach, we do not need to sort Date column at all. If I'm right, can we conclude that ROW_NUMBER() OVER(ORDER BY news.[Date] DESC) is not a good option for tables containing a millions of rows? here is my table script:

                                      CREATE TABLE [News](
                                      [ID] [bigint] IDENTITY(1,1) NOT NULL,
                                      [Url] [varchar](300) NULL,
                                      [Title] [nvarchar](200) NULL,
                                      [Description] [nvarchar](3000) NULL,
                                      [Date] [datetime] NULL,
                                      [Rank] [float] NULL,
                                      [ViewCount] [int] NULL,
                                      [Vote] [smallint] NULL,
                                      [Culture] [tinyint] NULL,
                                      CONSTRAINT [PK_NewsItem_1] PRIMARY KEY CLUSTERED
                                      (
                                      [ID] ASC
                                      )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF,
                                      ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
                                      ) ON [PRIMARY]

                                      W Offline
                                      W Offline
                                      Wendelius
                                      wrote on last edited by
                                      #27

                                      Yes you interpretaion was correct. That's what I tried to explain (but with lots of useless word between :)) It seems that SQL Server 2005 won't create this column while 2008 will. Too bad. However, I think we could do this with the datetime also

                                      DECLARE @Start datetime, @Count INT
                                      SET @Start = GETDATE()
                                      SET @Count = 5
                                      DECLARE @c TINYINT
                                      SET @c = 1;
                                      SELECT TOP(@Count)
                                      id, title, description, Date
                                      FROM News a
                                      WHERE Culture = @c
                                      AND Date < @Start
                                      ORDER BY Date DESC

                                      The idea is the same. Define a starting point to the current datetime, fetch page and when you want the next page, define the starting point to the last date in the previous page (from the last row). The date column must be unique in this version so two different news must have at least 1 ms difference in date. You can try it by testing different values for the @Start variable.

                                      The need to optimize rises from a bad design.My articles[^]

                                      M 1 Reply Last reply
                                      0
                                      • W Wendelius

                                        Yes you interpretaion was correct. That's what I tried to explain (but with lots of useless word between :)) It seems that SQL Server 2005 won't create this column while 2008 will. Too bad. However, I think we could do this with the datetime also

                                        DECLARE @Start datetime, @Count INT
                                        SET @Start = GETDATE()
                                        SET @Count = 5
                                        DECLARE @c TINYINT
                                        SET @c = 1;
                                        SELECT TOP(@Count)
                                        id, title, description, Date
                                        FROM News a
                                        WHERE Culture = @c
                                        AND Date < @Start
                                        ORDER BY Date DESC

                                        The idea is the same. Define a starting point to the current datetime, fetch page and when you want the next page, define the starting point to the last date in the previous page (from the last row). The date column must be unique in this version so two different news must have at least 1 ms difference in date. You can try it by testing different values for the @Start variable.

                                        The need to optimize rises from a bad design.My articles[^]

                                        M Offline
                                        M Offline
                                        Meysam Mahfouzi
                                        wrote on last edited by
                                        #28

                                        But I guess there is a problem with this approach. That is, when we show pagination buttons at the bottom of page, how can user go to page 7 from page 1 without knowing the last date of page 6?

                                        W 1 Reply Last reply
                                        0
                                        • M Meysam Mahfouzi

                                          But I guess there is a problem with this approach. That is, when we show pagination buttons at the bottom of page, how can user go to page 7 from page 1 without knowing the last date of page 6?

                                          W Offline
                                          W Offline
                                          Wendelius
                                          wrote on last edited by
                                          #29

                                          Okay, if it's possible to skip pages, then this won't work. I started to think some kind of pre-pagination, but I think that won't work either... Hmm, are the news deleted often? Could it be acceptable that if a record is deleted, a time consuming operation is executed? What I'm thinking is that if we would have a ordinal number on the rows (1,2,3,4...) and we know the maximum at every moment then the page 6 (with 10 news on page) would be from (maximum - (5 * 10)) to (maximum - (6 * 10)) + 1. Now if we create an unique index on the ordinal column we get the maximum very fast and the query will also be very fast. The downside is that if a record is deleted, the ordinals must be rearranged starting from the deleted position to the maximum. However this would be one simple operation (although it may take awhile). Do you see problems in that solution?

                                          The need to optimize rises from a bad design.My articles[^]

                                          M 1 Reply Last reply
                                          0
                                          Reply
                                          • Reply as topic
                                          Log in to reply
                                          • Oldest to Newest
                                          • Newest to Oldest
                                          • Most Votes


                                          • Login

                                          • Don't have an account? Register

                                          • Login or register to search.
                                          • First post
                                            Last post
                                          0
                                          • Categories
                                          • Recent
                                          • Tags
                                          • Popular
                                          • World
                                          • Users
                                          • Groups