Tutorial :What's the best way to calculate remaining download time?



Question:

Let's say you want to calculate the remaining download time, and you have all the information needed, that is: File size, dl'ed size, size left, time elapsed, momentary dl speed, etc'. How would you calculate the remaining dl time?

Ofcourse, the straightforward way would be either: size left/momentary dl speed, or: (time elapsed/dl'ed size)*size left. Only that the first would be subject to deviations in the momentary speed, and the latter wouldn't adapt well to altering speeds.

Must be some smarter way to do that, right? Take a look at the pirated software and music you currently download with uTorrent. It's easy to notice that it does more than the simple calculation mentioned before. Actually, I notices that sometimes when the dl speed drops, the time remaining also drops for a couple of moments until it readjusts.


Solution:1

Well, as you said, using the absolutely current download speed isn't a great method, because it tends to fluctuate. However, something like an overall average isn't a great idea either, because there may be large fluctuations there as well.

Consider if I start downloading a file at the same time as 9 others. I'm only getting 10% of my normal speed, but halfway through the file, the other 9 finish. Now I'm downloading at 10x the speed I started at. My original 10% speed shouldn't be a factor in the "how much time is left?" calculation any more.

Personally, I'd probably take an average over the last 30 seconds or so, and use that. That should do calculations based on recent speed, without fluctuating wildly. 30 seconds may not be the right amount, it would take some experimentation to figure out a good amount.

Another option would be to set a sort of "fluctuation threshold", where you don't do any recalculation until the speed changes by more than that threshold. For example (random number, again, would require experimentation), you could set the threshold at 10%. Then, if you're downloading at 100kb/s, you don't recalculate the remaining time until the download speed changes to either below 90kb/s or 110kb/s. If one of those changes happens, the time is recalculated and a new threshold is set.


Solution:2

You could use an averaging algorithm where the old values decay linearly. If S_n is the speed at time n and A_{n-1} is the average at time n-1, then define your average speed as follows.

A_1 = S_1
A_2 = (S_1 + S_2)/2
A_n = S_n/(n-1) + A_{n-1}(1-1/(n-1))

In English, this means that the longer in the past a measurement occurred, the less it matters because its importance has decayed.

Compare this to the normal averaging algorithm: A_n = S_n/n + A_{n-1}(1-1/n)

You could also have it geometrically decay, which would weight the most recent speeds very heavily: A_n = S_n/2 + A_{n-1}/2

If the speeds are 4,3,5,6 then A_4 = 4.5 (simple average)
A_4 = 4.75 (linear decay)
A_4 = 5.125 (geometric decay)

Example in PHP

Beware that $n+1 (not $n) is the number of current data points due to PHP's arrays being zero-indexed. To match the above example set n == $n+1 or n-1 == $n

<?php    $s = [4,3,5,6];    // average  $a = [];  for ($n = 0; $n < count($s); ++$n)  {      if ($n == 0)          $a[$n] = $s[$n];      else      {          // $n+1 = number of data points so far          $weight = 1/($n+1);            $a[$n] = $s[$n] * $weight + $a[$n-1] * (1 - $weight);      }  }    var_dump($a);      // linear decay  $a = [];  for ($n = 0; $n < count($s); ++$n)  {      if ($n == 0)          $a[$n] = $s[$n];        elseif ($n == 1)          $a[$n] = ($s[$n] + $s[$n-1]) / 2;        else      {          // $n = number of data points so far - 1          $weight = 1/($n);            $a[$n] = $s[$n] * $weight + $a[$n-1] * (1 - $weight);      }  }    var_dump($a);      // geometric decay  $a = [];  for ($n = 0; $n < count($s); ++$n)  {      if ($n == 0)          $a[$n] = $s[$n];      else      {          $weight = 1/2;            $a[$n] = $s[$n] * $weight + $a[$n-1] * (1 - $weight);      }  }    var_dump($a);  

Output

array (size=4)    0 => int 4    1 => float 3.5    2 => float 4    3 => float 4.5    array (size=4)    0 => int 4    1 => float 3.5    2 => float 4.25    3 => float 4.8333333333333    array (size=4)    0 => int 4    1 => float 3.5    2 => float 4.25    3 => float 5.125  


Solution:3

The obvious way would be something in between, you need a 'moving average' of the download speed.


Solution:4

I think it's just an averaging algorithm. It averages the rate over a few seconds.


Solution:5

What you could do also is keep track of your average speed and show a calculation of that as well.


Solution:6

EDIT: Here's what I finally suggest, I tried it and it provides quite satisfying results:

I have a zero initialized array for each download speed between 0 - 500 kB/s (could be higher if you expect such speeds) in 1 kB/s steps. I sample the download speed momentarily (every second is a good interval), and increment the coresponding array item by one. Now I know how many seconds I have spent downloading the file at each speed. The sum of all these values is the elapsed time (in seconds). The sum of these values multiplied by the corresponding speed is the size downloaded so far. If I take the ratio between each value in the array and the elapsed time, assuming the speed variation pattern stabalizes, I can form a formula to predict the time each size will take. This size in this case, is the size remaining. That's what I do: I take the sum of each array item value multiplied by the corresponding speed (the index) and divided by the elapsed time. Then I divide the size left by this value, and that's the time left.

Takes a few seconds to stabalize, and then works preety damn well.

Note that this is a "complicated" average, so the method of discarding older values (moving average) might improve it even further.


Solution:7

For anyone interested, I wrote an open-source library in C# called Progression that has a "moving-average" implementation: ETACalculator.cs.

The Progression library defines an easy-to-use structure for reporting several types of progress. It also easily handles nested progress for very smooth progress reporting.


Note:If u also have question or solution just comment us below or mail us on toontricks1994@gmail.com
Previous
Next Post »