Obscene amount of data

    minimice wrote (#1):

    How would you sort an obscene amount of data? Say 500 gigabytes or more? Any brilliant ideas?

      originSH wrote (#2):

      Use a highly optimised sort algorithm suited to the data and to disk-based operation :P unless you have 1 TB of RAM lying around ;) I suppose a database could do it too.

        Brady Kelly wrote (#3):

        Divide the data into files small enough to sort by normal operations, sort each file, and merge the sorted files. You could also import the data into a database engine and have it do the ordering.
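
The split, sort, and merge approach described above is usually called an external merge sort. Here is a minimal sketch in Python, assuming the data is a text file with one record per line and that plain string comparison is the desired order. The function name `external_sort` and the `lines_per_chunk` parameter are made up for illustration; a real job would size the chunks by available memory rather than by line count.

```python
import heapq
import os
import tempfile
from itertools import islice


def external_sort(input_path, output_path, lines_per_chunk=1_000_000):
    """Sort the lines of a text file that may be too large to fit in memory."""
    run_paths = []
    try:
        # Phase 1: read a bounded number of lines, sort them in memory,
        # and write each sorted chunk ("run") to its own temporary file.
        with open(input_path, "r", encoding="utf-8") as src:
            while True:
                chunk = list(islice(src, lines_per_chunk))
                if not chunk:
                    break
                # The last line of the input may lack a trailing newline;
                # add one so lines from different runs do not fuse together.
                if not chunk[-1].endswith("\n"):
                    chunk[-1] += "\n"
                chunk.sort()
                fd, path = tempfile.mkstemp(suffix=".run")
                with os.fdopen(fd, "w", encoding="utf-8") as run:
                    run.writelines(chunk)
                run_paths.append(path)

        # Phase 2: k-way merge of the sorted runs. heapq.merge pulls lines
        # lazily, so only one line per run is held in memory at a time.
        runs = [open(p, "r", encoding="utf-8") for p in run_paths]
        try:
            with open(output_path, "w", encoding="utf-8") as dst:
                dst.writelines(heapq.merge(*runs))
        finally:
            for f in runs:
                f.close()
    finally:
        # Remove the temporary run files whether or not the sort succeeded.
        for p in run_paths:
            os.remove(p)
```

Something like `external_sort("big.txt", "big.sorted.txt")` would then do the whole job. One caveat: the merge phase opens every run at once, so with several hundred runs you may hit the operating system's open-file limit and need to merge in more than one pass.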

          minimice wrote (#4):

          Brady Kelly wrote:

          You could also import the data into a database engine and have it do the ordering.

          Which database engine do you think could handle such huge datasets?

            Brady Kelly wrote (#5):

            In SQL Server 2005 the number of rows per table is constrained only by storage space, and the maximum file sizes are 16 terabytes for a data file and 2 terabytes for a log file.

              Luc Pattyn wrote (#6):

              Hi, if applicable, I would try to (partially) sort the data as it is generated, not afterwards. For instance, for words, keep 26 or 26^n collections, one per leading letter or per n-letter prefix. :)
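
One way to read the "26 or 26^n collections" idea is to bucket each word by its first n letters as it is produced, so that every bucket is small enough to sort in memory later. The sketch below assumes non-empty words made of ASCII letters only and a case-insensitive final order; `BucketWriter` and `sorted_words` are hypothetical names, not part of any library.

```python
import os
import tempfile

SUFFIX = ".bucket"


class BucketWriter:
    """Route each word to a file chosen by its first `prefix_len` letters."""

    def __init__(self, directory, prefix_len=1):
        self.directory = directory
        self.prefix_len = prefix_len
        self._files = {}  # bucket key -> open file handle

    def add(self, word):
        # Assumption: words are non-empty ASCII letters, so the key is
        # always safe to use as a file name.
        key = word[: self.prefix_len].lower()
        f = self._files.get(key)
        if f is None:
            path = os.path.join(self.directory, key + SUFFIX)
            f = open(path, "a", encoding="utf-8")
            self._files[key] = f
        f.write(word + "\n")

    def close(self):
        for f in self._files.values():
            f.close()
        self._files.clear()


def sorted_words(directory):
    """Yield all words in case-insensitive order, one bucket at a time."""
    keys = sorted(
        name[: -len(SUFFIX)]
        for name in os.listdir(directory)
        if name.endswith(SUFFIX)
    )
    for key in keys:
        with open(os.path.join(directory, key + SUFFIX), encoding="utf-8") as f:
            # Only one bucket is held in memory while it is being sorted.
            yield from sorted((line.rstrip("\n") for line in f), key=str.lower)
```

Because the buckets are visited in key order and each is sorted on its own, concatenating them gives a fully sorted stream without a merge step. With `prefix_len=2` there can be up to 676 open files, which may exceed the default limit on some systems, and skewed data (many words sharing a prefix) can still produce a bucket that is too large for memory.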
