STLAlgorithmExtensions/StatisticsAlgorithms


This could include mean, median, and mode, for sequences of numeric values.
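
For median and mode, which the code further down this page does not cover, a rough sketch might look like the following. The names median and mode, the use of std::vector, and the tie-breaking rule for mode (smallest value wins) are all assumptions made for illustration, not part of any proposed interface.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Median of a non-empty sequence. The vector is taken by value because
// std::nth_element reorders its input.
inline double median(std::vector<double> values)
{
  assert(!values.empty());
  const std::size_t mid = values.size() / 2;
  std::nth_element(values.begin(), values.begin() + mid, values.end());
  const double upper = values[mid];
  if (values.size() % 2 != 0)
    return upper;
  // Even length: after nth_element the lower middle element is the
  // largest element of the first half.
  const double lower = *std::max_element(values.begin(), values.begin() + mid);
  return (lower + upper) / 2;
}

// Mode of a non-empty sequence: the most frequent value.
// Assumption: ties are resolved in favour of the smallest value.
template<class T>
T mode(const std::vector<T>& values)
{
  assert(!values.empty());
  std::map<T, std::size_t> counts;
  for (typename std::vector<T>::const_iterator i = values.begin();
       i != values.end(); ++i)
    ++counts[*i];
  typename std::map<T, std::size_t>::const_iterator best = counts.begin();
  for (typename std::map<T, std::size_t>::const_iterator i = counts.begin();
       i != counts.end(); ++i)
    if (i->second > best->second)
      best = i;
  return best->first;
}
```

Unlike mean, neither of these can be computed in a single pass with constant storage, which is one reason they might want a different interface from the accumulator-style code below.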

Here is some simple code I had lying around for mean, variance, and standard_deviation. Hope this helps... -- People/Jeff Garland

I changed the code to parameterize the number type. I also introduced the zero parameter to duck the problem of coming up with the appropriate zero object for an arbitrary type T. -People/JeremySiek

Jeremy -- I like the additional parameter to genericize the number type, but I think the zero is a bad idea. It is ugly to make the user know that he/she has to pass zero to these functions, and if something else is passed the answer will be incorrect. To me, it would be much more reasonable to assume that anything that is a valid floating point number can be constructed from a plain 0. What do you think? -- Jeff

That is a good point. However, I'd still like to leave open the door for people that aren't using floating point numbers. I think anything that satisfies the mathematics requirements for a Field should work. How about we go with a default that uses 0? --People/JeremySiek

I believe to make this generic to both scalar and complex, you must use norm to compute variance. This means you need to define norm for scalars. (Neal Becker)
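
A minimal sketch of what Neal's suggestion could look like, assuming a helper called squared_norm (a hypothetical name) that is defined for real scalars and forwards to std::norm for std::complex. The namespace stats_sketch and the function norm_variance are also made up for this example, and the variance computed here divides by N (population variance).

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

namespace stats_sketch {

// Squared magnitude for real scalars (hypothetical helper).
inline double squared_norm(double x) { return x * x; }

// Squared magnitude for complex values: std::norm returns |z|^2.
inline double squared_norm(const std::complex<double>& z) { return std::norm(z); }

// Population variance as the mean of squared_norm(x - mean).
// The result is real even when T is complex.
template<class T>
double norm_variance(const std::vector<T>& values)
{
  assert(!values.empty());
  T sum = T();
  for (typename std::vector<T>::const_iterator i = values.begin();
       i != values.end(); ++i)
    sum += *i;
  const T mean = sum / static_cast<double>(values.size());

  double total = 0;
  for (typename std::vector<T>::const_iterator i = values.begin();
       i != values.end(); ++i)
    total += squared_norm(*i - mean);
  return total / values.size();
}

} // namespace stats_sketch
```

With plain squaring instead of the norm, a complex input would give a complex "variance", which is probably not what anyone wants.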

Looks great! -- Jeff

Isn't value_type() a better default? int(), float(), complex<float>() all are proper 0-values; other types (e.g. mathematical vectors) exist that have a useful mean, a useful default ctor but might not have a conversion from 0. -- People/MichielSalters
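
To illustrate Michiel's point, here is a sketch of a mean whose zero defaults to value_type(). The type vec2 and the function name mean_default_zero are hypothetical and exist only for this example; vec2 has a default constructor that yields the zero vector but no conversion from the integer 0, so a "zero = 0" default would not compile for it.

```cpp
#include <cassert>
#include <iterator>

// Hypothetical 2-D vector: default-constructs to zero, but cannot be
// constructed from the integer 0.
struct vec2 {
  double x, y;
  vec2() : x(0), y(0) {}
  vec2(double x_, double y_) : x(x_), y(y_) {}
  vec2& operator+=(const vec2& o) { x += o.x; y += o.y; return *this; }
};

inline vec2 operator/(const vec2& v, unsigned int n)
{
  return vec2(v.x / n, v.y / n);
}

// Mean over a non-empty range; the starting value defaults to value_type().
template<class InputIterator>
typename std::iterator_traits<InputIterator>::value_type
mean_default_zero(InputIterator begin,
                  InputIterator end,
                  typename std::iterator_traits<InputIterator>::value_type zero =
                    typename std::iterator_traits<InputIterator>::value_type())
{
  typedef typename std::iterator_traits<InputIterator>::value_type value_type;
  unsigned int count = 0;
  value_type sum = zero;
  for (; begin != end; ++begin) {
    sum += *begin;
    ++count;
  }
  assert(count > 0);  // the mean of an empty range is undefined
  return sum / count;
}
```

The same function still works for double, since double() is 0.0, so nothing is lost for the ordinary floating point case.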

I've modified the code below to add a few more statistics functions. To compile, most of them require that sqrt() can be called on AccumulatorType. I noticed a few other things:

--People/ScottKirkwood

Some suggestions if someone is going to propose some statistics algorithms for a boost header:

--GeorgeHeintzelman

I agree with George, but he beat me to the edit, and went down the same route with:-

#include <cmath>  // std::sqrt, used by the statistics that take a root

template<typename AccumulatorType>
class order_2_accumulator
{
  public:
    typedef AccumulatorType value_type;

    order_2_accumulator():
      Count(),
      Sum(),
      Sum2()
    {}
    order_2_accumulator(unsigned int count_,
                        const value_type& sum_,
                        const value_type& sum2_):
      Count(count_),
      Sum(sum_),
      Sum2(sum2_)
    {}

    unsigned int count() const     { return Count; }
    value_type sum() const         { return Sum; }
    value_type sum_squares() const { return Sum2; }
    value_type mean() const        { return Sum/Count; }
    value_type variance() const    { return Sum2/Count - (Sum*Sum)/(Count*Count); }

    // Scott Kirkwood: added these; they require std::sqrt() to accept value_type.
    value_type std() const               { return std::sqrt(variance()); }
    value_type std_error_of_mean() const { return std() / std::sqrt(value_type(Count)); }
    value_type root_mean_square() const  { return std::sqrt(Sum2 / Count); }
    value_type coefficient_of_variation() const { return 100 * std() / mean(); }

    // Fold one more sample into the running count, sum, and sum of squares.
    template<typename T>
    order_2_accumulator& bump(const T& value_)
    {
      ++Count;
      Sum  += value_;
      Sum2 += value_*value_;
      return *this;
    }
  private:
    unsigned int Count;
    value_type   Sum;
    value_type   Sum2;
};


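// operator+ picked up by std::accumulate; restricted to order_2_accumulator
// so that it does not match unrelated types.
template<typename AccumulatorType, typename T>
order_2_accumulator<AccumulatorType>
operator+(const order_2_accumulator<AccumulatorType>& init_, const T& value_)
{
  order_2_accumulator<AccumulatorType> accum(init_);
  accum.bump(value_);
  return accum;
}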

Which is then used like this:
#include <iostream>
#include <numeric>
int main()
{
  int seq[] ={1, 2, 3, 4};
  order_2_accumulator<double> sum=std::accumulate(seq, seq+4, order_2_accumulator<double>());
  std::cout << sum.count() << ' ' << sum.sum() << ' ' << sum.sum_squares() << std::endl;
  std::cout << sum.mean() << ' ' << sum.variance() << std::endl;

  double seq2[] = {1., 2., 3., 4.};
  sum=std::accumulate(seq2, seq2+4, sum);
  std::cout << sum.count() << ' ' << sum.sum() << ' ' << sum.sum_squares() << std::endl;
  std::cout << sum.mean() << ' ' << sum.variance() << std::endl;

}

--Ian Mitchell

 //stats.hpp
 #include <cmath>
 #include <iterator>

 template<class InputIterator>
 inline
 typename std::iterator_traits<InputIterator>::value_type
 mean(InputIterator begin,
      InputIterator end,
      typename std::iterator_traits<InputIterator>::value_type zero = 0)     
 {
   unsigned int count = 0;
   typename std::iterator_traits<InputIterator>::value_type sum = zero;
   for(InputIterator i=begin; i != end; ++i) {
     sum += *i;
     count++;
   }
   return sum/count;
 }


 template<class InputIterator>
 inline
 typename std::iterator_traits<InputIterator>::value_type
 variance(InputIterator begin,
          InputIterator end,
          typename std::iterator_traits<InputIterator>::value_type zero = 0)
 {
   // Forward the caller's zero so mean() starts from the same zero object.
   typename std::iterator_traits<InputIterator>::value_type
     mn = mean(begin, end, zero);
   unsigned int count = 0;
   typename std::iterator_traits<InputIterator>::value_type sum = zero;
   // Second pass over the range, so single-pass input iterators will not work here.
   for(InputIterator i=begin; i != end; ++i) {
     sum += (*i - mn) * (*i - mn);
     count++;
   }
   return sum/(count-1);
 }
 
 template<class InputIterator>
 inline
 typename std::iterator_traits<InputIterator>::value_type
 standard_deviation(InputIterator begin,
                    InputIterator end,
                    typename std::iterator_traits<InputIterator>::value_type zero = 0)   
 {
   return std::sqrt(variance(begin, end, zero));
 } 
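
Note that the two pieces of code on this page do not compute the same variance: order_2_accumulator divides by Count (population variance), while variance() in stats.hpp divides by count-1 (sample variance). The sketch below puts the two side by side. The function names and the use of std::vector<double> are assumptions made for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sum of squared deviations from the mean of a non-empty sequence.
inline double sum_squared_deviations(const std::vector<double>& v)
{
  assert(!v.empty());
  double sum = 0;
  for (std::size_t i = 0; i < v.size(); ++i)
    sum += v[i];
  const double mean = sum / v.size();

  double ssd = 0;
  for (std::size_t i = 0; i < v.size(); ++i)
    ssd += (v[i] - mean) * (v[i] - mean);
  return ssd;
}

// Divides by N, as order_2_accumulator::variance() does.
inline double population_variance(const std::vector<double>& v)
{
  return sum_squared_deviations(v) / v.size();
}

// Divides by N-1, as variance() in stats.hpp does; needs at least two values.
inline double sample_variance(const std::vector<double>& v)
{
  assert(v.size() > 1);
  return sum_squared_deviations(v) / (v.size() - 1);
}
```

For the sequence 1, 2, 3, 4 used in the example above, the sum of squared deviations is 5, so the population variance is 5/4 = 1.25 and the sample variance is 5/3. Anyone proposing a header should probably pick one convention, or offer both under distinct names.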

Last edited February 11, 2003 1:02 pm (diff)
Disclaimer: This site not officially maintained by Boost Developers