Estimating time of completion of a data migration process

    tufkap wrote (#1):

    I've just finished writing a utility to migrate data from optical libraries to a NAS box, and now I'm trying to come up with a formula to estimate the time of completion given the following data:

    1) Total amount of data
    2) No. of drives in the library
    3) Average read speed of the drives
    4) Total no. of files
    5) Fixed overhead for each file
    6) Average write speed of the NAS box (this also takes into account the network write speed)

    The formula that I am using now looks like this:

    (Total data / (No. of drives * Average read speed)) + (Total files * Fixed overhead) + (Total data / Average write speed)

    I don't think this is right in all cases. The utility launches one thread for each drive in the library, so there is some parallelisation of the copy process. But I think the above formula would only work if the copying were done sequentially. Does anyone have a better idea of how to do this, taking into account that the reads and writes happen in parallel?

    Please note that in the program itself, I just use the number of files processed so far and the time taken to process them to guesstimate the time remaining. This formula is for an Excel file where the user can enter the data given above and get an approximate time of completion before actually starting the migration.

    Any help is greatly appreciated.
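For reference, the formula in the question can be written out as a small function. This is only a sketch of the formula as posted, not the utility's actual code: the function name, the parameter names, and the units (MB, MB/s, seconds per file) are assumptions, and the sample figures are made-up placeholders.

```python
def sequential_estimate_seconds(total_mb, drives, read_mb_s,
                                files, overhead_s, write_mb_s):
    """Estimate from the question: read, per-file overhead and write
    times are simply added, as if the three phases ran one after another."""
    read_time = total_mb / (drives * read_mb_s)   # drives read in parallel
    overhead_time = files * overhead_s            # fixed cost per file
    write_time = total_mb / write_mb_s            # NAS write, network included
    return read_time + overhead_time + write_time


# Placeholder inputs: 1,000,000 MB, 4 drives at 5 MB/s each,
# 200,000 files with 0.5 s overhead, NAS writing at 40 MB/s.
estimate = sequential_estimate_seconds(1_000_000, 4, 5, 200_000, 0.5, 40)
print(f"{estimate / 3600:.1f} hours")
```

Because the three terms are summed, the result treats reading and writing as if they never overlap, which is the concern raised in the question.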

    The user formerly known as pkam.

      Tim Craig wrote (#2):

      I think the gating factor is the slowest of the reading, writing, and data transfer rates. It doesn't matter how fast you can read the data if writing is slower. If reading can't keep up with writing, then reading is the limiting process. Of course, this analysis is based on aggregate rates, which may be hard to judge, but it seems you have some average values to work with.
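One way to turn the "slowest stage wins" idea into a number is to compute the time each side would need on its own and take the larger. The sketch below is an interpretation, not something spelled out in the thread: it assumes the per-file overhead is paid by the per-drive reader threads (so it is split across drives), that the write speed already includes the network, and that reads and writes overlap fully. All names and units (MB, MB/s, seconds) are hypothetical.

```python
def bottleneck_estimate_seconds(total_mb, drives, read_mb_s,
                                files, overhead_s, write_mb_s):
    """Estimate assuming reads and writes overlap, so the slower
    side alone determines the total time."""
    # Assumption: each drive thread handles about 1/drives of the data
    # and files, and pays the fixed per-file overhead itself.
    read_side = (total_mb / read_mb_s + files * overhead_s) / drives
    # Assumption: write_mb_s is the aggregate NAS rate, network included.
    write_side = total_mb / write_mb_s
    return max(read_side, write_side)


# Placeholder inputs: reading is the bottleneck here.
print(bottleneck_estimate_seconds(1_000_000, 4, 5, 200_000, 0.5, 40))
# With a much slower NAS, writing becomes the bottleneck instead.
print(bottleneck_estimate_seconds(1_000_000, 4, 5, 200_000, 0.5, 10))
```

In a spreadsheet the same idea is a single MAX over the two side totals. Full overlap is optimistic, so the result is better read as a lower bound than as a precise prediction.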

      If you don't have the data, you're just another asshole with an opinion.

        tufkap wrote (#3):

        Thank you for taking the time to respond to my question, Tim.

        The user formerly known as pkam.

          Alan Balkany wrote (#4):

          There are complexities and interactions you can't anticipate, so a more reliable approach is to make completion-time measurements for different parameter combinations. Looking at the graphs of times for different values of a single parameter will give you insight as to how it really affects completion time. Multiple regression will give you formulas that estimate completion time based on the values of multiple parameters.
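As a rough illustration of the regression idea, the sketch below fits a linear model (time ≈ b0 + b1·x1 + b2·x2 + ...) to a handful of timed runs using ordinary least squares via the normal equations. It uses only the standard library. The function names are hypothetical, and the sample rows are synthetic placeholders generated from a known formula, not real measurements; real timings would be noisy and may not be linear in the parameters.

```python
def fit_linear(rows, times):
    """Ordinary least squares with an intercept, solved through the
    normal equations (X^T X) b = X^T y using Gaussian elimination."""
    X = [[1.0] + [float(v) for v in r] for r in rows]
    n = len(X[0])
    A = [[sum(x[i] * x[j] for x in X) for j in range(n)] for i in range(n)]
    b = [sum(x[i] * t for x, t in zip(X, times)) for i in range(n)]
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("measurements do not determine the coefficients")
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            factor = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= factor * A[col][c]
            b[r] -= factor * b[col]
    # Back substitution on the upper-triangular system.
    coeffs = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(A[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (b[i] - tail) / A[i][i]
    return coeffs


def predict(coeffs, row):
    """Apply the fitted model to one parameter combination."""
    return coeffs[0] + sum(c * x for c, x in zip(coeffs[1:], row))


# Synthetic placeholder runs: (data in GB, files in thousands) -> seconds,
# generated from time = 10 + 50*GB + 0.5*kfiles purely for illustration.
runs = [(1, 2), (2, 1), (3, 5), (4, 3), (5, 8)]
seconds = [61.0, 110.5, 162.5, 211.5, 264.0]
model = fit_linear(runs, seconds)
print([round(c, 3) for c in model])
print(round(predict(model, (6, 4)), 1))
```

The fit needs more runs than coefficients and parameter values that vary independently of each other; otherwise the system is singular and the function raises an error.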

            tufkap wrote (#5):

            Thank you for your response Alan. I just needed a rough estimate. So for now I'm using the method suggested by Tim. Also the utility has currently been tested only on a small test configuration. When we do further testing, I'll try out your method.

            The user formerly known as pkam.
