Skip to content
  • Categories
  • Recent
  • Tags
  • Popular
  • World
  • Users
  • Groups
Skins
  • Light
  • Cerulean
  • Cosmo
  • Flatly
  • Journal
  • Litera
  • Lumen
  • Lux
  • Materia
  • Minty
  • Morph
  • Pulse
  • Sandstone
  • Simplex
  • Sketchy
  • Spacelab
  • United
  • Yeti
  • Zephyr
  • Dark
  • Cyborg
  • Darkly
  • Quartz
  • Slate
  • Solar
  • Superhero
  • Vapor

  • Default (No Skin)
  • No Skin
Collapse
Code Project
  1. Home
  2. General Programming
  3. Visual Basic
  4. List(Of Integer) vs. an array (VB.NET 2008)

List(Of Integer) vs. an array (VB.NET 2008)

Scheduled Pinned Locked Moved Visual Basic
questioncsharpcssvisual-studio
11 Posts 4 Posters 0 Views 1 Watching
  • Oldest to Newest
  • Newest to Oldest
  • Most Votes
Reply
  • Reply as topic
Log in to reply
This topic has been deleted. Only users with topic management privileges can see it.
  • S Offline
    S Offline
    Steven St John
    wrote on last edited by
    #1

    I am a hobbyist programmer, and this may be a newbie question that has been asked before, but I was unsuccessful searching through the forums. I am writing a data analysis program that reads disk files and parses the information. Each file is one "session" and each session contains many "trials". Trials are identified by a number of items (trial length, trial stimulus, etc.) and also holds data which are a bunch of integers. So, I am building a trial class that has a number of properties, but one of those properties will be this array of integers (of unknown number). I'm wondering if I should represent that in the class as a simple array which heavily makes use of ReDim statements as the numbers are read and added to the array, or if I should instead represent them as a List(Of Integer) using the Add method to grow the collection. This is a philosophical question, not a coding question. I'm sure I will be able to code it either way. What I don't know is whether one or the other is preferable considering performance or other issues. I understand that using a collection would be a bad idea because of boxing/unboxing issues, but I was given to believe List(Of T) does not suffer this penalty. If List(Of Integer) is preferable, would you go on to say that one should always use generics and that arrays of values are always less preferable? Or perhaps a List(Of Integer) and an array of integers are the same under the hood? Thanks.

    M A 2 Replies Last reply
    0
    • S Steven St John

      I am a hobbyist programmer, and this may be a newbie question that has been asked before, but I was unsuccessful searching through the forums. I am writing a data analysis program that reads disk files and parses the information. Each file is one "session" and each session contains many "trials". Trials are identified by a number of items (trial length, trial stimulus, etc.) and also holds data which are a bunch of integers. So, I am building a trial class that has a number of properties, but one of those properties will be this array of integers (of unknown number). I'm wondering if I should represent that in the class as a simple array which heavily makes use of ReDim statements as the numbers are read and added to the array, or if I should instead represent them as a List(Of Integer) using the Add method to grow the collection. This is a philosophical question, not a coding question. I'm sure I will be able to code it either way. What I don't know is whether one or the other is preferable considering performance or other issues. I understand that using a collection would be a bad idea because of boxing/unboxing issues, but I was given to believe List(Of T) does not suffer this penalty. If List(Of Integer) is preferable, would you go on to say that one should always use generics and that arrays of values are always less preferable? Or perhaps a List(Of Integer) and an array of integers are the same under the hood? Thanks.

      M Offline
      M Offline
      MicroVirus
      wrote on last edited by
      #2

      To be honest you pose some really good questions that really hit the fundamentals of .NET and their language integration. I don't really know enough to fully answer you. One thing I can tell you though: memory reallocation is a very expensive operation, no matter what the implementation, and thus using constant ReDim's as the data is coming in is a bad idea. Probably 90% of the processing time will then be taken by the ReDim statement(s). The List(Of T) on the other hand allocates blocks of memory at a time, so only once the block is filled does it reallocate. I believe, but have no hard proof for this, that the utility classes .NET provides are all quite fast (in .NET relative terms, of course). Quite possibly, using a similar system of allocated larger blocks at a time and only reallocating when a block is filled yourself with ReDim's could be a bit faster. At the very least, you'll probably avoid some function call overhead. You've got me curious now :) I'll look in to it, barring someone else being able to answer these questions before I get time :P You could also always write your own test where you time both implementations. One test where you ReDim the array and add a random integer constantly, and the other where you use a List(Of Integer) and then checking the runtime of each on, say, 10000 or so iterations. EDIT 1: First test is in. Adding 100,000 integers to a list using ReDim's or List with standard capacity: VB ReDim's: 8919.503 ms List(Of Integer): 2.0002 ms So, ReDim-ing as you can see is really a bad idea. I do have to note though that on 10,000 items the time is not noticeable. EDIT 2: Included a ReDim mimic version of List(Of T)'s implementation: doubling capacity every time we overflow. The results are in:" VB ReDim's: 8907.5024 ms List(Of Integer): 2.0001 ms VB ReDim with Capacity: 2.0001 ms The conclusion is: use List(Of Integer). There is no speed difference noticeable (not even in my test) between implementing it yourself. But List(Of T) handles a lot of things for you, has lots of extra functionality, and negates the need for manual array management = double win.

      modified on Tuesday, June 21, 2011 2:01 PM

      T S 2 Replies Last reply
      0
      • M MicroVirus

        To be honest you pose some really good questions that really hit the fundamentals of .NET and their language integration. I don't really know enough to fully answer you. One thing I can tell you though: memory reallocation is a very expensive operation, no matter what the implementation, and thus using constant ReDim's as the data is coming in is a bad idea. Probably 90% of the processing time will then be taken by the ReDim statement(s). The List(Of T) on the other hand allocates blocks of memory at a time, so only once the block is filled does it reallocate. I believe, but have no hard proof for this, that the utility classes .NET provides are all quite fast (in .NET relative terms, of course). Quite possibly, using a similar system of allocated larger blocks at a time and only reallocating when a block is filled yourself with ReDim's could be a bit faster. At the very least, you'll probably avoid some function call overhead. You've got me curious now :) I'll look in to it, barring someone else being able to answer these questions before I get time :P You could also always write your own test where you time both implementations. One test where you ReDim the array and add a random integer constantly, and the other where you use a List(Of Integer) and then checking the runtime of each on, say, 10000 or so iterations. EDIT 1: First test is in. Adding 100,000 integers to a list using ReDim's or List with standard capacity: VB ReDim's: 8919.503 ms List(Of Integer): 2.0002 ms So, ReDim-ing as you can see is really a bad idea. I do have to note though that on 10,000 items the time is not noticeable. EDIT 2: Included a ReDim mimic version of List(Of T)'s implementation: doubling capacity every time we overflow. The results are in:" VB ReDim's: 8907.5024 ms List(Of Integer): 2.0001 ms VB ReDim with Capacity: 2.0001 ms The conclusion is: use List(Of Integer). There is no speed difference noticeable (not even in my test) between implementing it yourself. But List(Of T) handles a lot of things for you, has lots of extra functionality, and negates the need for manual array management = double win.

        modified on Tuesday, June 21, 2011 2:01 PM

        T Offline
        T Offline
        Thomas Krojer
        wrote on last edited by
        #3

        additional to your EDIT 2: did you also test starting with an big array (say 1.000.000) and redim to double size if overflow, and one redim at the end (to reduce to used size)? how fast would this be?

        I cannot remember: What did I before google?

        M 1 Reply Last reply
        0
        • T Thomas Krojer

          additional to your EDIT 2: did you also test starting with an big array (say 1.000.000) and redim to double size if overflow, and one redim at the end (to reduce to used size)? how fast would this be?

          I cannot remember: What did I before google?

          M Offline
          M Offline
          MicroVirus
          wrote on last edited by
          #4

          I did not test that, as I didn't think from the OP that an array as large as 1,000,000 would be relevant. The ReDim would be slow if needed, but then again it won't be needed much. Anyway, it'd be the same performance as starting with New List(Of Integer)(1000000), then using Add to add the items, and finally calling TrimExcess() on the list. You can test all this yourself by writing a very simple test program. I used Now() to get the time before and after the operation and output the difference in milliseconds. Took me no more than 10 minutes to write the entire test program and do the tests.

          T 1 Reply Last reply
          0
          • M MicroVirus

            I did not test that, as I didn't think from the OP that an array as large as 1,000,000 would be relevant. The ReDim would be slow if needed, but then again it won't be needed much. Anyway, it'd be the same performance as starting with New List(Of Integer)(1000000), then using Add to add the items, and finally calling TrimExcess() on the list. You can test all this yourself by writing a very simple test program. I used Now() to get the time before and after the operation and output the difference in milliseconds. Took me no more than 10 minutes to write the entire test program and do the tests.

            T Offline
            T Offline
            Thomas Krojer
            wrote on last edited by
            #5

            I tried to save this 10 minutes :)

            I cannot remember: What did I before google?

            M 1 Reply Last reply
            0
            • T Thomas Krojer

              I tried to save this 10 minutes :)

              I cannot remember: What did I before google?

              M Offline
              M Offline
              MicroVirus
              wrote on last edited by
              #6

              I wasn't on my own computer, so I poured all of those 10 minutes into EDIT 1 & 2 :P

              T 1 Reply Last reply
              0
              • M MicroVirus

                I wasn't on my own computer, so I poured all of those 10 minutes into EDIT 1 & 2 :P

                T Offline
                T Offline
                Thomas Krojer
                wrote on last edited by
                #7

                ok, i used a few minutes ... and proofed you are right. AND i want YOUR computer *g* both ways used 3 milliseconds on my machine :(

                I cannot remember: What did I before google?

                1 Reply Last reply
                0
                • S Steven St John

                  I am a hobbyist programmer, and this may be a newbie question that has been asked before, but I was unsuccessful searching through the forums. I am writing a data analysis program that reads disk files and parses the information. Each file is one "session" and each session contains many "trials". Trials are identified by a number of items (trial length, trial stimulus, etc.) and also holds data which are a bunch of integers. So, I am building a trial class that has a number of properties, but one of those properties will be this array of integers (of unknown number). I'm wondering if I should represent that in the class as a simple array which heavily makes use of ReDim statements as the numbers are read and added to the array, or if I should instead represent them as a List(Of Integer) using the Add method to grow the collection. This is a philosophical question, not a coding question. I'm sure I will be able to code it either way. What I don't know is whether one or the other is preferable considering performance or other issues. I understand that using a collection would be a bad idea because of boxing/unboxing issues, but I was given to believe List(Of T) does not suffer this penalty. If List(Of Integer) is preferable, would you go on to say that one should always use generics and that arrays of values are always less preferable? Or perhaps a List(Of Integer) and an array of integers are the same under the hood? Thanks.

                  A Offline
                  A Offline
                  AspDotNetDev
                  wrote on last edited by
                  #8

                  Read SlimList (I'm the author, but I think it's very relevant to your question). It discusses a bit about how List works. The List class basically uses an array that doubles in capacity each time capacity is reached. If you average out the time spent on each add operation, that comes to a performance of O(1) for each insert operation. Performing an array resize each time you add an item would average to O(N) for each insert operation (where N is the total number of objects inserted). Of course, you could reimplement the capacity-doubling functionality to make it O(1), but then you'd just be making List all over again. Read about big-O notation if you don't know what O(1) and O(N) mean. Go with List. Read the above article if you want a more in-depth understanding of List (as well as an understanding of how it can theoretically be improved upon). Also, one more note: do not use SlimList. It merely demonstrates a theoretical improvement that is in practice so slow that it is not worthwile to with any real production code.

                  Steven St. John wrote:

                  I understand that using a collection would be a bad idea because of boxing/unboxing issues, but I was given to believe List(Of T) does not suffer this penalty.

                  The word "collection" applies to both arrays and Lists. In either case, those collections will only suffer a boxing penalty if you try to store a non-reference type in a reference variable. For example, if you put an integer into a List(Of Object). However, putting an integer into a List(Of Integer) would not cause boxing.

                  Help a brotha out and vote Managing Your JavaScript Library in ASP.NET as the best ASP.NET article of May 2011.

                  S 1 Reply Last reply
                  0
                  • M MicroVirus

                    To be honest you pose some really good questions that really hit the fundamentals of .NET and their language integration. I don't really know enough to fully answer you. One thing I can tell you though: memory reallocation is a very expensive operation, no matter what the implementation, and thus using constant ReDim's as the data is coming in is a bad idea. Probably 90% of the processing time will then be taken by the ReDim statement(s). The List(Of T) on the other hand allocates blocks of memory at a time, so only once the block is filled does it reallocate. I believe, but have no hard proof for this, that the utility classes .NET provides are all quite fast (in .NET relative terms, of course). Quite possibly, using a similar system of allocated larger blocks at a time and only reallocating when a block is filled yourself with ReDim's could be a bit faster. At the very least, you'll probably avoid some function call overhead. You've got me curious now :) I'll look in to it, barring someone else being able to answer these questions before I get time :P You could also always write your own test where you time both implementations. One test where you ReDim the array and add a random integer constantly, and the other where you use a List(Of Integer) and then checking the runtime of each on, say, 10000 or so iterations. EDIT 1: First test is in. Adding 100,000 integers to a list using ReDim's or List with standard capacity: VB ReDim's: 8919.503 ms List(Of Integer): 2.0002 ms So, ReDim-ing as you can see is really a bad idea. I do have to note though that on 10,000 items the time is not noticeable. EDIT 2: Included a ReDim mimic version of List(Of T)'s implementation: doubling capacity every time we overflow. The results are in:" VB ReDim's: 8907.5024 ms List(Of Integer): 2.0001 ms VB ReDim with Capacity: 2.0001 ms The conclusion is: use List(Of Integer). There is no speed difference noticeable (not even in my test) between implementing it yourself. But List(Of T) handles a lot of things for you, has lots of extra functionality, and negates the need for manual array management = double win.

                    modified on Tuesday, June 21, 2011 2:01 PM

                    S Offline
                    S Offline
                    Steven St John
                    wrote on last edited by
                    #9

                    Thank you for running the test and your very helpful answer.

                    1 Reply Last reply
                    0
                    • A AspDotNetDev

                      Read SlimList (I'm the author, but I think it's very relevant to your question). It discusses a bit about how List works. The List class basically uses an array that doubles in capacity each time capacity is reached. If you average out the time spent on each add operation, that comes to a performance of O(1) for each insert operation. Performing an array resize each time you add an item would average to O(N) for each insert operation (where N is the total number of objects inserted). Of course, you could reimplement the capacity-doubling functionality to make it O(1), but then you'd just be making List all over again. Read about big-O notation if you don't know what O(1) and O(N) mean. Go with List. Read the above article if you want a more in-depth understanding of List (as well as an understanding of how it can theoretically be improved upon). Also, one more note: do not use SlimList. It merely demonstrates a theoretical improvement that is in practice so slow that it is not worthwile to with any real production code.

                      Steven St. John wrote:

                      I understand that using a collection would be a bad idea because of boxing/unboxing issues, but I was given to believe List(Of T) does not suffer this penalty.

                      The word "collection" applies to both arrays and Lists. In either case, those collections will only suffer a boxing penalty if you try to store a non-reference type in a reference variable. For example, if you put an integer into a List(Of Object). However, putting an integer into a List(Of Integer) would not cause boxing.

                      Help a brotha out and vote Managing Your JavaScript Library in ASP.NET as the best ASP.NET article of May 2011.

                      S Offline
                      S Offline
                      Steven St John
                      wrote on last edited by
                      #10

                      Thank you - your article was very informative!

                      A 1 Reply Last reply
                      0
                      • S Steven St John

                        Thank you - your article was very informative!

                        A Offline
                        A Offline
                        AspDotNetDev
                        wrote on last edited by
                        #11

                        I'm glad it helped. :)

                        Help a brotha out and vote Managing Your JavaScript Library in ASP.NET as the best ASP.NET article of May 2011.

                        1 Reply Last reply
                        0
                        Reply
                        • Reply as topic
                        Log in to reply
                        • Oldest to Newest
                        • Newest to Oldest
                        • Most Votes


                        • Login

                        • Don't have an account? Register

                        • Login or register to search.
                        • First post
                          Last post
                        0
                        • Categories
                        • Recent
                        • Tags
                        • Popular
                        • World
                        • Users
                        • Groups