boost.png (6897 bytes) Home Libraries People FAQ More

PrevUpHomeNext

Tutorial

Generators
Example: the producer/consumer pattern
Stackful generators: Same fringe problem
Coroutines
Multitasking
Waiting for events

While all subroutines have state, this is usually lost when a subroutine returns; on the other hand coroutines keep their state across calls. Function objects, familiar to most C++ programmers, are similar to coroutines in the fact as they may have state that is preserved across calls; but while function objects store their state on class members variables, coroutines store the state in the stack as automatic objects.

[Marlin80] provides a widely accepted definition of coroutines:

The second point is a fundamental difference between a coroutine and a generic function objects. While the latter can also preserve local data in the form of member variables, it does not automatically preserve the point of suspension when it is exited; it must be manually saved as an extra state member variable. Coroutines automatically remember where they left off.

Coroutines can be used in all places where function objects are used; this includes: as parameters to standard algorithms, as generator functions, as callback to asynchronous functions and much more.

In this section, we will first present the generator class template (a simplified form of coroutines). Only later the full coroutine class template is described.

Stylistic Notes

For brevity all code in this and most other sections will assume that the following using declaration is in effect:

using namespace coro = boost::coroutines;

And the following include directive is present:

#include<boost/coroutine/generator.hpp>

Generators

One of the most simple uses for coroutines is as generator functions. A generator is similar to a function that returns a sequence of values, but instead of returning all values at once (for example as an array), the generator returns the values one at time. Every time the generator is called, it returns the next value.

In standard C++ library, generators are for example used with the std::generate algorithm, that takes as third argument a function object that model the Generator concept.

Function objects as generators

A generator can be easily implemented in C++ as a function object. Consider a generator that returns all integer numbers in a range:

class range_generator {
public:
  range_generator(int min, int max) :
    m_current(min),
    m_max(max) {}

  int operator()() {
    return m_current++;
  }

  operator bool() const {
    return m_current < m_max;
  }

private:
  int m_current;
  int m_max;
};

It can be used like this:

range_generator generator(100, 200);

while(generator) 
  std::cout<<generator()<<"\n";

It will print all values in the half-open range [100, 200). The conversion to bool is used to detect when the generator has terminated. In production code probably the safe-bool idiom would be used instead.

Input iterators as generators

A generator can also be implemented as an input iterator. Recall that an input iterator only support dereferencing and incrementing. This is the iterator version of the previous function object.

class range_generator {
public:
  typedef int value_type;

  range_generator(int min, int max) :
    m_current(min),
    m_max(max) {}

  range_generator() :
    m_current(-1),
    m_max(0) {}

  int operator*() {
    return m_current;
  }
  
  range_generator& operator++() {	
    m_current ++;
    if(m_current == m_max)
      m_current = -1;
    return *this;
  }    

  range_generator operator++(int) {
    range_generator t(*this);
    ++*this;
    return t;
  }

  friend
  bool operator==(const range_generator& rhs,
    const range_generator& lhs) {
    return rhs.m_current == lhs.m_current;
  }

  friend
  bool operator!=(const range_generator& rhs,
    const range_generator& lhs) {
    return !(rhs == lhs);
  }

  private:
  int m_current;
  int m_max;
};

It can be used like this:

range_generator generator(100, 200);

while(generator != range_generator()) 
  std::cout<<*generator++<<"\n";

It will print all values in the half-open range [100, 200). Notice that a default constructed iterator is used to represent the past-the-end iterator. We will call this kind of iterator a generator iterator.

The generator class template

Obviously a generator is a stateful object, and can be easily implemented using coroutines.

Before introducing full fledged coroutines, we will introduce the generator class template that wrap a coroutine in an input iterator interface.

We begin declaring its type, the generator is an iterator over values of type int:

typedef coro::generator<int> generator_type;

The typedef is not really required, but makes the following code more readable. This is the generator body:

int range_generator(generator_type::self& self, 
      int min,
      int max) 
{
  while(min < max-1)
    self.yield(min++);
  return min;  
}

It is a plain C++ function that takes as parameter a non const reference to a generator::self and two integers by value. The self object of type generator_type::self identifies the current generator. In fact, as coroutines have state, there can be more than one instance of the same coroutine type. The self name is just a convention used in this documentation. You can give to it whatever name you want, of course.

The min and max parameters are the minimum and maximum bounds of the iteration.

The generator body iterates between all numbers in the ranges [min, max-1) and invokes self::yield() for each number. The yield member function is responsible of returning the parameter to the caller of the generator.

When the while loop terminates, a plain return min statement is executed. This both terminates the generator and returns the final value (i.e. max-1). We will see later how to remove this asimmetry.

Given the generator body, a generator iterator can be constructed:

generator_type generator
  (boost::bind
   (range_generator, 
    _1, 
    100,
    200));

The boost::bind facility is used to bind the min and max arguments of the function to the actual iterations ranges. The function object returned by boost::bind is then used to construct a generator object. The signature of the function or function object passed to the generator constructor must be:

value_type(coro::generator<value_type>::self&)

The generator iterator can be used exactly like the iterator object of the previous example.

while(generator != generator_type()) 
  std::cout<<*generator++<<"\n";

Note that range_generator body is entered for the first time when the generator is constructed (from the main entry point), then at every iteration range_iterator is reentered from yield(). In particular range_iterator is reentered when generator::operator++ is invoked.

You can have more than one generator referring to the same body:

generator_type generator_a
  (boost::bind
   (range_generator, 
    _1, 
    100,
    200));

generator_type generator_b
  (boost::bind
   (range_generator, 
    _1, 
    100,
    200));
Do not confuse a generator body with the generator itself. The generator body is only the code that implement the generator behavior. The generator is composed of the body plus the current state (that is, the current call stack and the set of live local variables). Notice that two generators with the same generator signature and the same body are still two different generators.
while(generator_a != generator_type() && 
enerator_b != generator_type()) 
  std::cout<<"generator_a is: "<<*generator_a++<<", "
    <<"generator_b is: "<<*generator_b++<<"\n";

The self parameter in range_generator is used to identify the different instances of a generator. Also generator::self encodes the type of the generator allowing the compiler to statically type check the argument type of yield in the same way it would statically type check the argument type of a return statement.

In addition to the normal input iterator semantics, a generator iterator is also convertible to bool. The conversion returns true while there are elements in the range:

range_generator generator(100, 200);

while(generator) 
  std::cout<<*generator++<<"\n";

generator has a nested result_type typedef and an value_type operator() member function (generator() is equivalent to *generator++). Thus generator also models the AdaptableGenerator concept:

range_generator generator(100, 200);

while(generator) 
  std::cout<<generator()<<"\n";

Exiting a generator

The previous example had an asimmetry in its body. The last generated value had to be returned with a 'return' statement instead of 'yield'. In simple code this is not a problem, because it is easy to see what the final value will be, but in more complex generators this asimmetry requires a substantial obfuscation of the code.

The generator::self::exit() member function provides a way to exit a generator without returning a value. The previous generator can thus be written like this:

int range_generator(generator_type::self& self, 
	     int min,
	     int max) 
 {
   while(min < max)
     self.yield(min++);
   self.exit();
 }

Notice that now the while loop iterates over the full range. The generator class can handle both styles of exiting a generator.

exit() works by throwing an exception of type exit_exception. Objects of this type can be normally caught, but must be eventually re-thrown: once exit() has been called, the coroutine can no longer yield() nor return.

Some compilers might not be able to recognize exit() as a function that doesn't return, and warn that 'range_generator' returns without a value. For these compilers you may have to add a dummy return value at the end of the function body like this: return int(); If the return type is not default constructible, boost optional might be another solution: return *boost::optional<result_type>();

A generator is automatically exited when the last generator iterator that refers to it goes out of scope. In that case the generator body is resumed and an exit_exception is thrown from yield().

Note that the generator class template use the reference counted body/handle idiom. This is necessary because an input iterator must be Assignable while it is in general not possible to copy the generator state (that is kept in automatic variables in the generator body). This means that if a generator ever gets a copy of its associated generator iterator, a cycle is formed and it could cause memory not to be reclaimed.
Copyright © 2006 Giovanni P. Deretta

PrevUpHomeNext