Calculute average value of large array
-
is there any algorithm to calculate very large array(integer)'s average value? by ordinary method, there maybe danger of sum overflow.
If the integers are 32-bit use 64-bit (
long long int
) for calculation. You can always usedouble
for the calculation. But this may give inexact results. A general method is using another variable to extend the sum. This requires overflow check for each add operation and implementation of the division operation for the extended value. -
is there any algorithm to calculate very large array(integer)'s average value? by ordinary method, there maybe danger of sum overflow.
-
is there any algorithm to calculate very large array(integer)'s average value? by ordinary method, there maybe danger of sum overflow.
A tongue-in-cheek approach: Averaging...the easy way.
"One man's wage rise is another man's price increase." - Harold Wilson
"Fireproof doesn't mean the fire will never come. It means when the fire comes that you will be able to withstand it." - Michael Simmons
"You can easily judge the character of a man by how he treats those who can do nothing for him." - James D. Miles
-
is there any algorithm to calculate very large array(integer)'s average value? by ordinary method, there maybe danger of sum overflow.
I've taken the link provided above (about cumulative average) as an inspiration to calculate the average incrementally, in a way that no intermediate value gets larger than about three times the maximum absolute value being added. Since I wasn't sure about various effects, such as the current average going negative, I added some test code and asserts to verify it actually works as intended. See the code below:
#define TEST_RANGE
#ifdef TEST_RANGE
#include
void minmax(const double newval, double&minv, double&maxv) {
if (newval < minv)
minv = newval;
else if (newval > maxv)
maxv = newval;
}
#endif
template
basetype average(const container_iterator& start, const container_iterator& end) {
basetype cumulated_average = 0;
basetype cumulated_remainder = 0;
basetype addendum = 0;
#ifdef TEST_RANGE
double real_avg = 0.0;
double val_min = *start;
double val_max = *start;
double avg_min = cumulated_average;
double avg_max = cumulated_average;
double rem_min = cumulated_remainder;
double rem_max = cumulated_remainder;
#endif
long long n_values = 0;
for (auto pvalue = start; pvalue != end; ++pvalue) {
++n_values;
addendum = cumulated_remainder - cumulated_average + *pvalue;
cumulated_average += addendum/n_values;
cumulated_remainder = addendum%n_values;
#ifdef TEST_RANGE
real_avg += *pvalue;
assert((char)(n_values*cumulated_average + cumulated_remainder - real_avg) == 0);
minmax(*pvalue, val_min, val_max);
minmax(cumulated_average, avg_min, avg_max);
minmax(cumulated_remainder, rem_min, rem_max);
#endif
}
#ifdef TEST_RANGE
assert (fabs(n_values*cumulated_average - real_avg) < n_values);
real_avg /= (double)n_values;
#endif
return cumulated_average;
}
void test_average() {
char cvalues[] = { 13,7,-27, 34, -3, 22, 33, -1, 18, 29,
13,7,-27, 34, -3, 22, 33, -1, 18, 29,
13,7,-27, 34, -3, 22, 33, -1, 18, 29,
13,7,-27, 34, -3, 22, 33, -1, 18, 29,
13,7,-27, 34, -3, 22, 33, -1, 18, 29
};
auto cavg = average(cvalues+0, cvalues+50);
}The last line is how it's used. For passing the values, you can pass any pair of iterators that delimit the range, including simple pointers (as I've done here), provided that these iterators can be incremented and dereferenced. In this example