Advice about choice of algorithm

thebiggestbangtheory

Dear all, I am a newbie here, this is my first post. I have looked over a few algorithms and have searched this board for my question but have not been able to narrow down to a clear answer. I am looking for any pointers/code/advice to help decide how to solve the following problem. I have to write a program to suggest the starting time of a job. There is 1 CPU, and there are n jobs already scheduled on it and we know their run times. I can assume a run time (maybe a median of the known run times or something) for this new job. The constraint being at no time should there be more than m jobs running in parallel. I think this problem is relatively simple. For a hammer and tongs solution, I have implemented a quick-sort kinds binary cleaving program which starts with a random time of the day, say, 13:00, and checks if one more process can be started at that time, it accepts the "assumed" run time of the new process and finds if the number of jobs will become greater than m during the assumed run time of the new process. If not, success! else, I cleave the two sides of the time chosen 00:00-12:59 and 13:01-23:59 (of course I can add the assumed run time to 13:01 to make a better choice of the window/range) to get the midway time position and check there and so on. Can there be a more formal solution to this problem? Like maybe based on the Knapsack algorithm or something.. any advice. Also, what could be a good measure of the "assumed" run time, I'm thinking median. I do realize that this question is a bit vague. Thanks in advance, -J

Fatbuddha 1

Hi I think I would go for a observer program that checks if there is one more job that can be started. If so I would start one. If you want you can do a importance weighting schema based on the assumed run time. I don't know if this answer your question. but this is the direction I would tend to go. Cheers

You have the thought that modern physics just relay on assumptions, that somehow depends on a smile of a cat, which isn’t there.( Albert Einstein)

Lost User

thebiggestbangtheory wrote:

Also, what could be a good measure of the "assumed" run time, I'm thinking median.

I'd go for a progress-indication that's displayed in "% work done", instead of using a calculated guess based on the previous runtimes. You could still calculate the time that's required to finish the job, based on the percentual feedback that you get :)

I are Troll :suss:

Alan Balkany

Why not just use a simple, "greedy" approach like an operating system, and start jobs as they come in, until m jobs are running. Then when one finishes, start another one? A few nuances: 1. If a job can start other jobs before it can complete, you risk a deadlock situation where two jobs are each waiting for the other to complete before they can start other jobs. If you know in advance how many other jobs a job can start, a conservative approach is the Banker's Algorithm: If a job will start k other jobs, don't start it if more than m - (k + 1) jobs are currently running. This will guarantee you have the resources needed to finish the job. 2. Starting short jobs before longer jobs will increase throughput. (But if you give shorter jobs too much priority, some long jobs may never run. This was observed in some early operating systems.)