Tutorial :An algorithm for estimating the runtime for a program


I need to find the total timings for the execution of a program over different inputs. The program reads some data and writes it into another file. The values of the data value and the size of the data are different every time.

I want to find how long it will take in general for all size of data.

Is the algorithm for finding this based on the total timings of the program for a single execution?

For Example, if I know

for single execution   a.program - execution time   1.2sec             - its create file  100 kb file   

Can I find out how long it will take for n executions, on a different data size?


I don't quite understand your question, but I believe that what you're asking is how to figure out the execution time of a program in advance of running it.

This is related to the halting problem. The halting problem is intractable.

Apologies if I am misunderstanding your question.

Edit: To respond to your clarification, there is no general algorithm for extrapolating runtime for larger inputs from runtime for smaller inputs. Analysis of algorithms is very tricky business. There are heuristics that you can use. For example, you could compute runtime on inputs of varying "size" (say, 10, 100, 1000, 10000) and try to fit a curve to the function "size" -> runtime.


There is no perfect algorithm for this, and it will be highly variable - a first run may be very slow, but a second run will be much quicker since the data from disk is cached in memory.

You could measure your program with a variety of inputs and try to build a model for input size/complexity compared to execution time and then use the results of that to estimate execution time.


If I understand you correctly, you want to find out how long it takes a process to execute.

If this is the case then use the Benchmark module.

Using this you can take timings at different places and judge the timings of different parts of your program.


1, Halting problem is intractable, i.e. even you have the data and the program, there is no way to tell (! in the general case !) how long it will take to run it. 2, The 'program' in the halting problem means the complete state, e.g. your program + the data it processes.

So it's two times intractable :)


If you know the asymptotic running time of your algorithm, for example knowing a sorting algorithm is n*log(n), you can run it on small inputs and calculate (though perhaps only a range) what it would be for larger inputs. The problem is that analyzing most algorithms is very hard. Otherwise, you could run it a few times on smaller input and do some type of regression (probably nonlinear) to discover/approximate an equation for the algorithm's performance characteristics, and use that to calculate for larger inputs.


There is an entire branch of Computer Science devoted to this: Algorithmic Complexity. In short, you analyze the code to determine what performance characteristics it has, for instance, most good sorting algorithms have an average runtime of O(n log n) where n is the size of the array to be sorted. This means the length of time to sort 100 items is 100 * ln 100 / ln 2 * x or 664x where x is the amount of time needed to run one instruction.


From the question, I am not sure you are trying to calculate execution time before running this program, or to record the time it takes your script to run. If you are trying to pre-calculate, I agree with the other answers that say it is not possible.

If you want to record your execution time, simply add 2 global date variables to your program, store the current date and time in one immediately at the start of execution, and then store the time in the second variable upon termination. Use a date difference function to give you the seconds (or your desired time unit) elapsed.


It's hard to know for sure, but I don't think this is a version of the halting problem. (or, at least, it can be simplified so that it's not).

I think, you're just asking for estimates about how long a series of reads and writes will take...but with varied amounts of reading and writing.

The easiest way to do it is to make some empirical measurements (the more the better) and use those measurements to make estimates about how long future runs will take. If you discover that reading 10 MB of data takes 10 seconds, then you could estimate that 100 MB might take about 100 seconds. (This, of course, makes the assumption that you're looking at an O(n) algorithm...if not, you'll have to adjust accordingly.)

Of course, this will be prone to error, because of whatever else is going on on the system...but, depending on your needs, it may give you estimates that are good enough.

Of course, if you are able to update your estimates as you are reading/writing, you can improve them.


If you are asking about the practical solution, then using the Benchmark module, or some other process where you record the time. Then plot the execution time against the input size for a number of inputs and interpolate (but beware extrapolation, as this xkcd cartoon demonstrates).

If you want to know about the theory you need to understand "computational complexity" which is a good search term to get you started.

For example, if you run over the data once, then usually twice as much data will take roughly twice as long. The best search algorithms typically take O(NlnN) so twice as much data will take slightly more than twice as long. But even these only give you limits on the length of time, and the constants will depend on things like disk access, memory, other progams running, etc.


You have to know if your program halts. It can't be automatically desired but you can determine if you know its design. Then you have to know at least your program asymptotic complexity. Better if you know real complexity formula. Then you can benchmark for adequate set of inputs. Then you can interpolate your data to achieve constants. Finally just place constant to equation and compute. Easy, isn't it? ;-)


If you can reexecute your program. You can time it using unix "time" command. If not, you need to save System time, and then again at end of program and print it out?

Note:If u also have question or solution just comment us below or mail us on toontricks1994@gmail.com
Next Post »