Skip to content
  • Categories
  • Recent
  • Tags
  • Popular
  • World
  • Users
  • Groups
Skins
  • Light
  • Cerulean
  • Cosmo
  • Flatly
  • Journal
  • Litera
  • Lumen
  • Lux
  • Materia
  • Minty
  • Morph
  • Pulse
  • Sandstone
  • Simplex
  • Sketchy
  • Spacelab
  • United
  • Yeti
  • Zephyr
  • Dark
  • Cyborg
  • Darkly
  • Quartz
  • Slate
  • Solar
  • Superhero
  • Vapor

  • Default (No Skin)
  • No Skin
Collapse
Code Project
  1. Home
  2. Database & SysAdmin
  3. Database
  4. SQL query to read all the records one by one

SQL query to read all the records one by one

Scheduled Pinned Locked Moved Database
databasecsharpquestion
13 Posts 8 Posters 0 Views 1 Watching
  • Oldest to Newest
  • Newest to Oldest
  • Most Votes
Reply
  • Reply as topic
Log in to reply
This topic has been deleted. Only users with topic management privileges can see it.
  • R R Giskard Reventlov

    If you are only processing a subset of records why not cut down the size of the result set by using a where clause to filter out those records you will not need? (Sorry if you already know that but your question isn't clear). Like: select * from [table] where [column] = 'foo' It's best to bring back only those records you'll actually need: perhaps if you were to explain your need a little more clearly: e.g. are you intending to do this in Sql or c#?

    me, me, me "The dinosaurs became extinct because they didn't have a space program. And if we become extinct because we don't have a space program, it'll serve us right!" Larry Niven nils illegitimus carborundum

    Y Offline
    Y Offline
    Yvon Bourgouin
    wrote on last edited by
    #3

    In fact, I had something like SELECT [IJK],[XC],[YC],[ZC] FROM [MSDP]; and it was taking 10 to 15 minutes to get the records. I tried to change it to SELECT TOP 200 [IJK],[XC],[YC],[ZC] FROM [MSDP] ORDER BY [IJK]. It's fast for the first 200 records. To get the following records, I use SELECT TOP 200 [IJK],[XC],[YC],[ZC] FROM [MSDP] WHERE [IJK] NOT IN (SELECT TOP 200 [IJK],[XC],[YC],[ZC] FROM [MSDP] ORDER BY [IJK]) ORDER BY [IJK]; the second 200 changes in a loop to 400,600 and so on. At the beginning, it's quite fast but the performances are degrading quite fast. Any help?

    R S S D I 5 Replies Last reply
    0
    • Y Yvon Bourgouin

      In fact, I had something like SELECT [IJK],[XC],[YC],[ZC] FROM [MSDP]; and it was taking 10 to 15 minutes to get the records. I tried to change it to SELECT TOP 200 [IJK],[XC],[YC],[ZC] FROM [MSDP] ORDER BY [IJK]. It's fast for the first 200 records. To get the following records, I use SELECT TOP 200 [IJK],[XC],[YC],[ZC] FROM [MSDP] WHERE [IJK] NOT IN (SELECT TOP 200 [IJK],[XC],[YC],[ZC] FROM [MSDP] ORDER BY [IJK]) ORDER BY [IJK]; the second 200 changes in a loop to 400,600 and so on. At the beginning, it's quite fast but the performances are degrading quite fast. Any help?

      R Offline
      R Offline
      R Giskard Reventlov
      wrote on last edited by
      #4

      Does it need to be ordered? If it does at least put an index on [IJK] that should help. Consider indexing the other columns. Have you created an execution plan? How does that look?

      me, me, me "The dinosaurs became extinct because they didn't have a space program. And if we become extinct because we don't have a space program, it'll serve us right!" Larry Niven nils illegitimus carborundum

      1 Reply Last reply
      0
      • Y Yvon Bourgouin

        In fact, I had something like SELECT [IJK],[XC],[YC],[ZC] FROM [MSDP]; and it was taking 10 to 15 minutes to get the records. I tried to change it to SELECT TOP 200 [IJK],[XC],[YC],[ZC] FROM [MSDP] ORDER BY [IJK]. It's fast for the first 200 records. To get the following records, I use SELECT TOP 200 [IJK],[XC],[YC],[ZC] FROM [MSDP] WHERE [IJK] NOT IN (SELECT TOP 200 [IJK],[XC],[YC],[ZC] FROM [MSDP] ORDER BY [IJK]) ORDER BY [IJK]; the second 200 changes in a loop to 400,600 and so on. At the beginning, it's quite fast but the performances are degrading quite fast. Any help?

        S Offline
        S Offline
        Scubapro
        wrote on last edited by
        #5

        This way your selecting all records (however in parts though). Where's your filter (WHERE CLAUSE), since you need only some records?

        Y 1 Reply Last reply
        0
        • Y Yvon Bourgouin

          Hi everyone, I'm using a huge database of geostatistical blocks (million of records). I would like to read the records one by one and process only some. I tried the datareader but it's slowing terribly. Is there a read next or something like that in SQL? I'm programming in C#. Thanks in advance, Yvon

          J Offline
          J Offline
          J4amieC
          wrote on last edited by
          #6

          Yvon Bourgouin wrote:

          million of records

          Yvon Bourgouin wrote:

          I would like to read the records one by one

          Can you spot the problem here. Trying to iterate over millions of rows will, of course, be slow. Why would you not limit the returned results to the rows you actually want?

          1 Reply Last reply
          0
          • S Scubapro

            This way your selecting all records (however in parts though). Where's your filter (WHERE CLAUSE), since you need only some records?

            Y Offline
            Y Offline
            Yvon Bourgouin
            wrote on last edited by
            #7

            I didn't put the WHERE clause because it's very complex but it's working.

            J 1 Reply Last reply
            0
            • Y Yvon Bourgouin

              In fact, I had something like SELECT [IJK],[XC],[YC],[ZC] FROM [MSDP]; and it was taking 10 to 15 minutes to get the records. I tried to change it to SELECT TOP 200 [IJK],[XC],[YC],[ZC] FROM [MSDP] ORDER BY [IJK]. It's fast for the first 200 records. To get the following records, I use SELECT TOP 200 [IJK],[XC],[YC],[ZC] FROM [MSDP] WHERE [IJK] NOT IN (SELECT TOP 200 [IJK],[XC],[YC],[ZC] FROM [MSDP] ORDER BY [IJK]) ORDER BY [IJK]; the second 200 changes in a loop to 400,600 and so on. At the beginning, it's quite fast but the performances are degrading quite fast. Any help?

              S Offline
              S Offline
              supercat9
              wrote on last edited by
              #8

              That is going to be really slow, since retrieving e.g. records 4801-5000 will use 5,000 records' worth of processing to retrieve 200 records; retrieving records 49,801-50,000 will use 50,000 records' worth of processing to retrieve 200 records. To get through 1,000,000 records would require processing about 2,500,000,000. To add insult to injury, deleting records from the table may cause some records (not involved in the deletion) to be skipped. Adding records may cause some records to be duplicated, but that's a far smaller problem. If the value in column IJK uniquely identifies records, use "SELECT TOP 200 ... WHERE [IJK] > @lastijk" where parameter @lastijk is the last value you retrieved. I'd guess that IJK is probably unique, since if it isn't your code won't work properly at any speed, but if it is not unique, use "SELECT TOP 200 ... WHERE [IJK] > @lastijk" followed by "SELECT ... WHERE [IJK] = @lastijk" (no TOP 200 qualifier there, and use the last record from the first query for @lastijk); all of the records where IJK equals @lastijk will appear in the second query, so you should ignore any items from the first query where IJK equals @lastijk. Incidentally, to get reasonable performance, you'll need to have an index--preferably clustered--on @lastijk. I'd guess you probably do have one, or else your performance would be really slow for even the first 200 records.

              Y 1 Reply Last reply
              0
              • Y Yvon Bourgouin

                In fact, I had something like SELECT [IJK],[XC],[YC],[ZC] FROM [MSDP]; and it was taking 10 to 15 minutes to get the records. I tried to change it to SELECT TOP 200 [IJK],[XC],[YC],[ZC] FROM [MSDP] ORDER BY [IJK]. It's fast for the first 200 records. To get the following records, I use SELECT TOP 200 [IJK],[XC],[YC],[ZC] FROM [MSDP] WHERE [IJK] NOT IN (SELECT TOP 200 [IJK],[XC],[YC],[ZC] FROM [MSDP] ORDER BY [IJK]) ORDER BY [IJK]; the second 200 changes in a loop to 400,600 and so on. At the beginning, it's quite fast but the performances are degrading quite fast. Any help?

                D Offline
                D Offline
                David Skelly
                wrote on last edited by
                #9

                Yvon Bourgouin wrote:

                At the beginning, it's quite fast but the performances are degrading quite fast

                I'm not surprised. Think about what you are asking the database to do here: SELECT TOP 200... SELECT TOP 200 WHERE NOT IN (SELECT TOP 200...) SELECT TOP 200 WHERE NOT IN (SELECT TOP 400...) SELECT TOP 200 WHERE NOT IN (SELECT TOP 600...) ... repeat lots of times ... SELECT TOP 200 WHERE NOT IN (SELECT TOP 1,000,000...) If I understand you correctly, you are running the same query over and over, asking the database to do more work every time. You say that you are only processing a small number of the rows. So, why fetch them all? Why not put something into the WHERE clause to cut down the number of rows you select? Once you have fetched these rows, how do you decide which ones you will process and which ones you will ignore and what is stopping you from moving that selection logic back to the database? Sorry if that's obvious, I'm sure you have considered it, but I don't quite understand why you don't do that.

                1 Reply Last reply
                0
                • Y Yvon Bourgouin

                  In fact, I had something like SELECT [IJK],[XC],[YC],[ZC] FROM [MSDP]; and it was taking 10 to 15 minutes to get the records. I tried to change it to SELECT TOP 200 [IJK],[XC],[YC],[ZC] FROM [MSDP] ORDER BY [IJK]. It's fast for the first 200 records. To get the following records, I use SELECT TOP 200 [IJK],[XC],[YC],[ZC] FROM [MSDP] WHERE [IJK] NOT IN (SELECT TOP 200 [IJK],[XC],[YC],[ZC] FROM [MSDP] ORDER BY [IJK]) ORDER BY [IJK]; the second 200 changes in a loop to 400,600 and so on. At the beginning, it's quite fast but the performances are degrading quite fast. Any help?

                  I Offline
                  I Offline
                  i j russell
                  wrote on last edited by
                  #10

                  Returning millions of records takes time. If you can't limit the results via a WHERE clause, you might like to try the following approach;

                  CREATE PROCEDURE GetData
                  (
                  @start INT
                  )
                  AS
                  WITH Query
                  AS
                  (
                  SELECT
                  ROW_NUMBER() OVER
                  (
                  ORDER BY [IJK] ASC
                  ) as position,
                  [IJK],
                  [XC],
                  [YC],
                  [ZC]
                  FROM [MSDP]
                  )
                  SELECT [IJK],
                  [XC],
                  [YC],
                  [ZC]
                  FROM Query
                  WHERE position BETWEEN @start AND @start + 200

                  To run the sp, you use

                  EXEC GetData 200000

                  [Replace the 200000 with the start value]. You could also try returning more than 200 records by changing the sp.

                  Y 1 Reply Last reply
                  0
                  • I i j russell

                    Returning millions of records takes time. If you can't limit the results via a WHERE clause, you might like to try the following approach;

                    CREATE PROCEDURE GetData
                    (
                    @start INT
                    )
                    AS
                    WITH Query
                    AS
                    (
                    SELECT
                    ROW_NUMBER() OVER
                    (
                    ORDER BY [IJK] ASC
                    ) as position,
                    [IJK],
                    [XC],
                    [YC],
                    [ZC]
                    FROM [MSDP]
                    )
                    SELECT [IJK],
                    [XC],
                    [YC],
                    [ZC]
                    FROM Query
                    WHERE position BETWEEN @start AND @start + 200

                    To run the sp, you use

                    EXEC GetData 200000

                    [Replace the 200000 with the start value]. You could also try returning more than 200 records by changing the sp.

                    Y Offline
                    Y Offline
                    Yvon Bourgouin
                    wrote on last edited by
                    #11

                    Thanks, that might solve my problem...

                    1 Reply Last reply
                    0
                    • S supercat9

                      That is going to be really slow, since retrieving e.g. records 4801-5000 will use 5,000 records' worth of processing to retrieve 200 records; retrieving records 49,801-50,000 will use 50,000 records' worth of processing to retrieve 200 records. To get through 1,000,000 records would require processing about 2,500,000,000. To add insult to injury, deleting records from the table may cause some records (not involved in the deletion) to be skipped. Adding records may cause some records to be duplicated, but that's a far smaller problem. If the value in column IJK uniquely identifies records, use "SELECT TOP 200 ... WHERE [IJK] > @lastijk" where parameter @lastijk is the last value you retrieved. I'd guess that IJK is probably unique, since if it isn't your code won't work properly at any speed, but if it is not unique, use "SELECT TOP 200 ... WHERE [IJK] > @lastijk" followed by "SELECT ... WHERE [IJK] = @lastijk" (no TOP 200 qualifier there, and use the last record from the first query for @lastijk); all of the records where IJK equals @lastijk will appear in the second query, so you should ignore any items from the first query where IJK equals @lastijk. Incidentally, to get reasonable performance, you'll need to have an index--preferably clustered--on @lastijk. I'd guess you probably do have one, or else your performance would be really slow for even the first 200 records.

                      Y Offline
                      Y Offline
                      Yvon Bourgouin
                      wrote on last edited by
                      #12

                      Thanks, looks good to me, the performance was degrading quickly, this would probably fix it...

                      1 Reply Last reply
                      0
                      • Y Yvon Bourgouin

                        I didn't put the WHERE clause because it's very complex but it's working.

                        J Offline
                        J Offline
                        Johan Hakkesteegt
                        wrote on last edited by
                        #13

                        Your logic must be really really really complex, if it can not be translated to a WHERE clause. However assuming that you are right, you can always at least cut out some part of the records with the WHERE clause. Look at the search logic you have written in your application, find the single largest search parameter, and translate only that to sql.

                        My advice is free, and you may get what you paid for.

                        1 Reply Last reply
                        0
                        Reply
                        • Reply as topic
                        Log in to reply
                        • Oldest to Newest
                        • Newest to Oldest
                        • Most Votes


                        • Login

                        • Don't have an account? Register

                        • Login or register to search.
                        • First post
                          Last post
                        0
                        • Categories
                        • Recent
                        • Tags
                        • Popular
                        • World
                        • Users
                        • Groups