Google Summer Of Code 2006

BOOST WIKI | RecentChanges | Preferences | Page List | Links List

The following are some possible Boost projects to pursue for the 2006 Summer of Code. If you have other ideas you would like to pursue that are consistent with the Boost mission of building world-class, open source, C++ libraries please feel free to post to the [developer list].

[Register at the Google Student Registration Page] by May 8th at 17:00 Pacific Daylight Time

Notes:

Read the whole page; not all project suggestions appear in the partial index below
These are just suggestions; we're open to your own original ideas. We have had a lot of people applying to do the same thing, and you may stand a better chance if you propose solving a problem you're passionate about.
Please don't flood the mentors and administrators with applications. Decide on one project on which you think you can do a great job, and concentrate on writing one great application, with lots of detail about what you plan to accomplish and by when.

Partial Index

Binding to Dynamic Languages.
Boost.Spirit C++ Parser
Boost.Build enhancements (see below)
Julio V has expressed an interest in developing BoostProcess
Boost.Unicode might be a possible project
Specialized containers with Boost.MultiIndex
Associative containers specialized for low memory foot-print and fast lookup
Generic tree container
Small buffer utility
SQL library with full embedded SQL-syntax
Volatile container
Editable XML Serialization
Enhanced Boost Copy Utility
Boost.Statechart Viewer
Big Int or Decimal Arithmetic
Concurrency Library
Graph Library: Hypergraph Support
Graph Library: Out-of-Core Graphs and Graph Algorithms
Graph Library: User-Friendly Graph Classes
Graph Library: Performance Benchmarking and Tuning
Geometry Library: small STL-like components

Stealing from the GCC SoC? wiki :-)... They point to some [suggestions] from Drupal to students applying for SoC?. In the case of Boost I would emphasize their -- Don't be afraid to go out on a limb. -- sentiment. (Rene)

Boost.Langbinding

Boost.Langbinding is a generalized library proposed for binding dynamic languages to C++. The popularity and elegance of Boost.Python has sprouted imitators for many different languages, including Lua, Ruby, and JavaScript?. The authors of Boost.Python and Luabind have collaborated on the fundamentals of a design that allows different "back-end" plugins to be used with the same C++ binding code to expose the same classes to multiple scripting languages. In principle it should be possible to run these languages together and communicate between them using Langbinding-wrapped C++ code. The basic ideas are all there and proven to be sound, but the project is still in its infancy. A talented student could carry it through to maturity, or at least adolescence, in about three months. You can read some of the discussion that forms the basis for this library idea here: http://news.gmane.org/gmane.comp.lib.boost.langbinding

Likely Mentors: David Abrahams, Daniel Wallin

Boost.Build

Introduction: [Boost.Build] is the cross platform build description and management software used by the [Boost] C++ Libraries. In a robust development tools are as important as the programs developed with their help. In the case of Boost.Build it reduces the development burden placed on the numerous library developement community that participates in Boost. Instead of developing, testing, and fixing many different build system descriptions they can write a single cohesive build description which is used in all platforms. This allows for a distributed development and testing model, of developing on one platform and having others tests on different platforms, essential for Open Source software. Our goals with the projects below, is to continue to extend the cross-platform aspect to other parts of the configuration, build, and testing stream. (Rene Rivera)

Configure

Goal: To construct a configuration file generation tool for [Boost.Build] ala GNUs [Autoconf]. The new tool should have enough functionality to replace the current [Boost.Config] [Autoconf] based configure script.

Mentor: Rene Rivera

Requirements:

Overall:
- At the user level must be a part of the Boost.Build system which is built on top of [Boost.Jam], a Perforce Jam derivative.
- Support the same distributed individual test composition that has made Autoconf popular.
Tests:
- Individual tests can be written in a language appropriate for the what is getting tested. This will at minimum include C++ as that is what Boost is interested in. More important is that the system must be flexible to allow the use of any language, as is the case with Autoconf.
File Generation:
- Like Autoconf, an external template file should be supported.
- Optionally an inline template file could be supported.
- The syntax should be simple and intuitive to developers.

Visual C++ Project Files

Goal: To construct a [Visual C++] project files generation tool for [Boost.Build]. The tool should have enough functionality to generate a complete set of project files and a solution file for building the [Boost] C++ Libraries with [Visual C++] version 8.0.

Mentor: Rene Rivera

Python port

'Goal:' To rewrite Boost.Build V2 in Python. The primary goals of the rewrite are improved peformance and maintainability. It's not expected that all Boost.Build functionality be ported, but building C++ application with two compilers will be required.

Mentor: Vladimir Prus

CompileTimeCheckedFormat

Mentor: AlexanderNasonov

Specialized containers with Boost.MultiIndex

Introduction: [Boost.MultiIndex] allows for the construction of complex data structures involving two or more indexing mechanisms on the same set of elements. Out of the unlimited range of possible data structures specifiable within Boost.MultiIndex, some particular configurations arise recurrently:

A bidirectional map or bimap is a container of elements of type pair<T,Q> where fast lookup is provided both for the T and the Q field, in contrast with a regular STL map which only allows for fast lookup on T.
An MRU (most recently used) list keeps the n last referenced elements: when a new item is inserted and the list has reached its maximum length, the oldest element is erased, whereas if an insertion is tried of a preexistent element, this gets promoted to the first position. MRU lists can be used to implement dynamic caches and the kind of behavior exhibited by programs featuring a "Recent files" menu command, for instance.

Although Boost.MultiIndex provides the mechanisms to build these common structures, the resulting interface can be cumbersome and too general in comparison with specialized containers focusing on such particular structures.

Goal: To write a library of specialized containers like the ones described above, using Boost.MultiIndex as the implementation core. Besides bimap and MRU list, the student can also propose other specialized containers of interest in the community. It is expected that the library meets the standards of quality required by Boost for an eventual inclusion in this project, which implies a strong emphasis on interface design, documentation and unit testing; the mentor will be guiding the student through the complete cycle from specification and requirements gathering to documentation and actual coding. The final result of the project must then contain:

Source code following [Boost programming guidelines].
User documentation. Requirements on the format are loose, though the [QuickBook] format is gaining acceptance within Boost.
Complete set of unit tests powered by [Boost Build System V2].

Requirements

Intermediate-to-high level in C++, with emphasis in generic programming (templates).
Knowledge of the STL framework and design principles. Of course, knowledge of Boost in general and Boost.MultiIndex in particular is a big plus.
Acquaintaince with at least two different C++ programming evironments.
Some fluency in the English language; subsequent reviews of the documentation can help smooth rough edges here, though.
A mathematical inclination and previous exposure to a formal Algorithms course would help very much.
A craving for extreme quality work.

Benefits for the student: The student taking on this project will have the opportunity to learn the complete process of software production inside a highly regarded C++ open source institution, and even see her work included in Boost eventually. The completion of the project involves non-trivial problems in C++ interface design and so-called modern C++ programming, high quality user documentation and unit testing. The student will also learn, perhaps to her surprise, that most of the time will be spent gathering and trying ideas and, in general, thinking, rather than writing actual code.

Mentor: [Joaquín M López Muñoz].

Associative containers specialized for low memory footprint and fast lookup

Introduction: We all use containers like std::map and std::set in our daily work. But for certain operations, using these container incurs a lot of overhead, because the container allocates a node for each value it stores. For small types such as int or T* there is a large overhead both in time and space. The solution is fairly well-known: use an std::vector and wrap it such that it is always sorted, and then implement find() and other lookup functions using binary search.

Goal: To write a library consisting of these specialized containers, benchmarks, unit-tests and their documentation. In addition to map and set classes, it might be good to provide an optimized priority_queue as well. Additional functionality may be to support lookup-functions with parameters that does not need to be converted to the key type as long as they are comparable.

There are several other libraries that have something similar. For example

- Loki's AssocVector?

- Boost.Interprocess' (former Shmem) flat_map, flat_set, flat_multimap, flat_multiset containers

Requirements: Low-to-Intermediate level C++ knowledge.

Mentor: [Thorsten Ottosen]

Generic tree container

Please be aware that as of May 4th we have about 5 Tree proposals -- some of them quite good. We will not be accepting 2, so if you are thinking of this you might want to think twice.

Introduction: Trees a pervasive in Computer Science and there are few programs that do not use a tree datastructure in some form or another. The C++ standard library, however, does not provide a low-level tree-container.

Goal: To write, document and test a generic tree-library. There quite some design work involved when you try to figure out the best and most flexible interface. The following libraries may act as a starting point:

- Adobe's forrest class

- An STL-like C++ tree class by Kasper Peeters

Requirements: Intermediate-to-high level C++ knowledge.

Mentor: Not yet assigned.

Small buffer utility

Introduction: C++ library developers constantly strive for more efficient solutions. One way to optimize certain pieces of code is to avoid the expensive heap-allocations if it is possible. One way to do that is to create a small utility class that uses an internal stack-based buffer until a certain extend. If there is a need for a larger buffer, it automatically uses the heap instead.

Goal: To write, document and test an boost::auto_buffer class (and perhaps also boost::ptr_auto_buffer). This component may act as a starting point:

- Synesis' auto_buffer

Requirements: Low level knowledge of C++.

Mentor: [Thorsten Ottosen]

SQL library with full embedded SQL-syntax

Please be aware that as of May 5th we have about 5 SQL proposals -- some of them quite good. We will not be accepting 2, so if you are thinking of this you might want to think twice about submitting this idea.

Introduction: A good SQL library is one of the most wanted by the C++ community, and modern C++ opens up for new levels of elegance in the design of such a library. Most people know how to use an SQL library in other languages. For example, in PHP I usually write is like this:

  $res = $this->safeQuery( " select * from " . 
                           ARTICLE_TABLE .
                           " where id = $faqData->article " );

Wouldn't it be cool if we could write the same in C++ and get the compiler to type-check the SQL-expression for us? Just imagine that you could write this

  using boost::sql;
  query q = select >> '*' >> from >> article_table >> where >> article_table.id == faqData.article; 
  result = q.execute();

This syntax is of course entirely made-up, but libraries like Spirit, Xpressive and Lambda show that it could be realistic.

The first stage would be to find a way to specify SQL tables directly in C++, perhaps allowing for the following use:

  struct employee_name   : sql::column<std::string> { };
  struct employee_age    : sql::column<int> { };
  struct employee_salary : sql::column<float> { };
  struct employee_birth  : sql::column<boost::date_time> { };

  struct employee_table : sql::table<employee_name,employee_name,employee_salary,employee_birth> { } 
  ...
  sql::connection conn( "localhost", "password" );
  sql::database db( "company", conn );
  emplyee_table t;
  db.create( t );

The newly created table keeps all the necessary type-information around, so to make type-checking of queries possible. Perhaps it should then be possible to do the following:

  employee_table::result_set rs = db >> select >> '*' >> from >> t;
  ...
  BOOST_FOREACH( const emplyee_table::row& row, rs )
      std::string s = row.get<emplyee_name>();
  ...
  std::string      a_string = ...;
  boost::date_time a_date   = ...;
  db >> update >> employee_table >> set >> (employee_name = a_string, employee_birth = a_data) >> where >> employee_age = 42;

The many types we created in the beginning are thus used as tags for type-safe extraction of the values. The example also shows that it should be possible to use some kind of name parameters.

Once this basic functionality has been implementet, there are endless posibilities for enhancements. But that's another story...

Goal: To write, document and test a library as descibed above. These librararies may be used as inspiration:

- Database Template Library

- LiteSQL ? - C++/SQL serialization library (provides already partial support for the proposed syntax)

- OTL

- RTL, a library proposal for boost (go to RTL subfolder)

- Ultimate++ SQL (search for "SQL programming")

- SOCI library

This list is incomplete.

Requirements: Intermediate-to-high level C++ knowlegde with emphasis on C++ templates. Don't be fooled: this task is MAJOR and will require quite some meta-programming

Mentors: Possible: [Thorsten Ottosen]

 (but several mentors are needed)

Volatile Container

Introduction: As modern computers are getting faster, robustness and maintainability of code become relatively more important than efficiency. Object oriented applications can become so complex that a complete overview is not feasible anymore. The programmer can only consider one class at a time, treating the rest of the application as a collection of randomly created and destructed objects.

Obviously, it would be a problem when one object points to another object, and the latter is destroyed. A solution to this problem is to use shared pointers with reference counters-- but another, often more logical approach would be to destroy the object that owns the pointer as well (I need this specifically for an event library where 'event requests' need to be deleted when the object that requested the event is deleted; the event request object contains a 'pointer' to the latter object in the form of the call-back function).

Automatically destroying other objects when a given object is being destroyed, leads possibly to an unpredictable avalanche of destroyed objects. Having a robust infrastructure for dealing with that would be nice; but certain specific problems need to be overcome first. One of those problems is where the application is iterating through all elements of a container, calling member functions of those elements. Calling any 'external' function can cause any number of objects to 'disappear'. Finding the 'next' element in the container is therefore non-trivial.

Goal: To write and document an STL-like container that allows iterating through its elements while elements are 'randomly' deleted. The planned approach will be to not really delete the elements until after the loop. Elements that are being deleted from a container can only be deleted by calling 'erase', if that happens in some kind of 'locked' state, then the element is marked for removal but not really deleted until after the lock is released. Robustness demands that the lock has to be acquired to run over the elements in the first place.

Requirements: Intermediate level knowledge of C++ (will use templates, base classes and virtual functions). The ability to think on an abstract level.

Mentor: Carlo Wood

Editable XML Serialization

Introduction: Boost Serialization can be used to save an restore data in XML format. Currently, there it is generally impractical and or inconvenient to edit these files outside the application. This is surprising and disappointing to many users.

Goal: To create an archive type derived from the current xml_archive which, in addition to serializing the data in XML, generates an XML schema which can be used by XML editors such as XML Spy to edit xml archives while keeping them loadable by the Boost Serialization library.

Requirements: Understanding and/or interest in the boost serialization library, xml and xml schemas, and XML editors.

Mentor: Robert Ramey

Resources for potential inspiration: The s11n (serialization) library

Enhanced Boost Copy Utility

Introduction: [Boost Copy (bcp)] is a utility that can subset the Boost Development Tree based on library dependencies and other factors.

Goal: To enhance bcp with several new features including:

Used-by and Uses Dependencies by Library into XML or HTML
Utilize the [Wave Preprocessor] to do macro preprocessing thus enhancing the exact dependency analysis
Add a platform filter in conjuction with the Wave analysis -- thus if the user only wants a development tree for the date-time library on Linux

An example command might look like:

  bcp --platform Linux --toolset gcc-4 --libraries date_time --Macro=BOOST_DATE_TIME_STANDARD_CONFIG

Requirements: Intermediate level of C++. Some knowledge of Boost.Regex, Boost.Filesytem, and Boost.WAVE would provide a headstart as bcp uses the first 2 and Wave would be used in the new version.

Mentor: Not yet assigned (Possibly [John Maddock])

Viewer utility for FSMs implemented with Boost.Statechart

Introduction: [Boost.Statechart] is a library that allows to implement finite state machines directly in C++ code

Goal: Application that can extract a graphical representation ([UML State Machine diagram]) from C++ code implementing an FSM with Boost.Statechart. The kind of work is somewhat tailorable to the interests of the student:

A minimal but already very useful tool would be a command line utility that uses [GCC_XML] to look for states in .hpp files and produces [Graphviz] output with only the state hierarchy (states, inner states, orthogonal states). The visualization of the additional statechart elements like termination & history pseudo states, transitions, entry & exit actions should be achievable within the 3 month period of Summer of Code
A more advanced application could have a GUI where the user can manually or automatically arrange statecharts
A deluxe variant of the application could even allow the user to modify a statechart graphically (within certain limits) and then modify the underlying C++ code accordingly

Requirements: Affinity for state machines. Intermediate level in a widely used and freely available programming language. Beginner-Intermediate level in C++ & XML

Mentor: Andreas Huber (ahd6974-groups at yahoo dot com)

Various Possible Projects for BigInt or Decimal Arithmetic

Introduction: Finish Boost.Big Int or provide implementataion of n1977 Decimal Types Proposal

Goal: Build additional integer types or math functions for addition to Boost

The following is the current work on Boost.Big Int in the

http://cvs.sourceforge.net/viewcvs.py/boost-sandbox/boost-sandbox/boost/bigint.hpp?rev=1.28&view=auto http://cvs.sourceforge.net/viewcvs.py/boost-sandbox/boost-sandbox/libs/bigint/

This was put together by Ron Garcia @ Univ of Indiana. Also, Richard Peters was apparently was doing some work in this space. Try out this query to look at some of the posts in 2005 about bigint:

http://tinyurl.com/lpsma

Before going to far down this path you might want to get in contact with them and see if there is a way to extend/finish their work.

Also on a mathmatical track, there is a need for someone to develop TR1 math special functions. Specifically have a look at:

http://engineering.meta-comm.com/resources/cs-win32_metacomm/doc/html/boost_tr1/unsupported.html#boost_tr1.special

This might be another possibility if the BigInt direction doesn't work out. John Maddock and Paul Bristow have been the folks most involved

Finally, I'll mention that another possible project in this area would be to build a Boost implementation for the Decimal Arithmetic proposal. This is a proposal that IBM is making before the standard committee. That latest draft is at:

http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2006/n1977.html

Requirements: Understanding of machine mathematics, moderate C++ experience

Mentor: TBD

Concurrency Library

Introduction: With the recent introduction of multi-core processors to the mainstream user, there is a growing need for high-level tools which ease the creation of applications that are able to take advantage of such hardware. Low-level threading libraries which require explicit locking are often error-prone and difficult to work with, yet at this point in time there are few if any portable alternatives in C++. One possible direction for such a library may be to implement portable active objects and functions, though students may propose any design they see fit.

Goal: To write, document, and test a portable concurrency library. Students may look to Boost.Threads for portable threading facilities.

Requirements: Experience with writing portable multi-threaded code in C++.

Mentor: Not yet assigned.

Graph Library: Hypergraph Support

Introduction: The Boost Graph Library contains very general abstractions of graph datas, provides various graph data structures that meet these abstractions, and implements many traditional graph algorithms. However, all of the graph abstractions in the Boost Graph Library are based on simple graphs and multigraphs, where each edge can connect only two vertices. There is a more general form of graph, called the "hypergraph", where the edges in the hypergraph (called "hyperedges") can connect any number of vertices. How can the Boost Graph Library's abstractions and algorithms be generalized to consider hypergraphs?

Goal: To extend the abstractions and algorithms of the Boost Graph Library to support hypergraphs, and implement a hypergraph data structure to test these new algorithms.

Requirements: Solid working knowledge of C++ templates, prior exposure to graphs and graph theory.

Mentor: Jeremy Siek or Doug Gregor.

Graph Library: Out-of-Core Graphs and Graph Algorithms

Introduction: Most algorithms make the naive assumption that their entire input and output fit within the computer's main memory. When this assumption proves false, however, the results can be disastrous: swapping to disk can cause huge slowdowns that make it impossible to process large data. Applications making use of graph theory have been hit particularly hard by this problem, as researchers attempt to perform computations with graphs containing billions of vertices, such as the Web graph itself. One potential solution to this problem is the use of out-of-core graph algorithms (also called "external" graph algorithms), which are designed to operate efficiently when only part of the graph data structure is available in memory at any given time.

Goals: Implement an out-of-core graph data structure and several representative out-of-core graph algorithms (breadth-first search, depth-first search, shortest paths, etc.) in the Boost Graph Library.

Requirements: Solid working knowledge of C++ templates, prior exposure to graphs and graph theory.

Mentor: Jeremy Siek or Doug Gregor.

Graph Library: User-Friendly Graph Classes

Introduction: The main graph class in the Graph Library is adjacency_list, which is a swiss-army knife of graph classes, and can fulfill 99% of a user's need. However, it is quite complicated to use. What we really need is a pair of graph classes, directed_graph and undirected_graph that are easy to use and cover the common use cases. The directed graph should provide fast access to both in and out edges. Both classes should be as safe as possible from invalidation: adding edges and vertices should not invalidate graph iterators or other vertex or edge descriptors. As the adjacency_list class is the most widely used component in the Graph Library, high quality replacements for it will see even more use by the C++ community.

Goals: Implement two graph classes that fulfill the Graph Library interface specifications, a directed_graph and undirected_graph.

Requirements: Solid working knowledge of C++ templates, some prior exposure to graphs and graph theory.

Mentor: Jeremy Siek or Doug Gregor.

Graph Library: Performance Benchmarking and Tuning

Introduction: Recently, the LEDA project (a competitor of the Boost Graph Library, BGL) threw down the gauntlet by publishing some performance comparisons between a couple LEDA algorithms and the BGL algorithms, showing that LEDA was faster. Of course, both libraries have many more algorithms, and it would be quite interesting to see how they compare, and whether the choice of graph data structure makes a difference (the BGL provides many different graph classes). The project is to create a performance testing framework for the BGL and then to identify some problem areas and tune some BGL algorithms and data structures.

Goals: Develop a performance analysis framework that automatically runs and times BGL algorithms with a variety of graph classes. Identify problem areas and tune the data structures and algorithms.

Requirements: Solid working knowledge of C++ templates, some prior exposure to graphs and graph theory.

Mentor: Jeremy Siek or Doug Gregor.

Geometry Library: Small STL-like components for basic 2D operations

Introduction: STL has done wonders to make high-quality implementations of sequence algorithms available and put a stop to the proliferation of home-brewed sorting routines. Through its design, and its specification, it has become a model for extensions and related algorithms. Boost has followed suit with a graph library which no longer models a sequence container. There is a widespread need for basic geometric algorithms (distance and intersection computations, enclosing boxes, spatial queries, map overlay, etc.) in many applications, e.g., GUI/windows, graphics, 3d rendering, robotics, spatial data processing, etc.

There exist many geometric computations library (biased for various purposes), but none is as simple to use as, say including <algorithm> and using the STL algorithms. All require pre-compilation, installation and linkage. For instance, CGAL is an open source project which has many of these features. However, it still is a monolithic and configure-compile-install library which requires linkage. In this project, we envision small, header-only files which can be distributed independently.

Goals: The goal is two-fold: first, produce a set of concepts analogous to the Algorithm-Iterator-Container framework of the STL. In this case, the algorithms are separated from the storage by function objects which provide the geometric primitives. The code is aimed at being generic in that it should work with any representation of geometric objects (struct Point {double x, y}, yourLibrary::point, OpenGL?, CGAL::Point_2<R>, etc.) as long as geometric primitives are supplied. Adaptors to existing library would be very welcome.

Second, we would like small, one-header file implementations of basic 2D constructs (for instance, useful ones that come to mind are: kd-tree, quadtree, range trees, convex hulls, closest-pair, clustering, polygons, Delaunay triangulation, rectangle and line arrangements, line-segment intersection and map overlay). The proposal could consist of a subset of those, or other geometric algorithms of general use. However, anything dealing with 3D is significantly more complex and should not be part of this project.

Requirements: Familiarity with STL (design and implementation, e.g. HP/SGI's implementation or gcc). Familiary with generic programming, and good C++ abilities (templates required). A genuine love and some knowledge of computational geometry. References and detailed mentoring will be provided for the topic.

Mentor: TBD.

[buy lipitor online] [buy lipitor] [[buy lipitor online]]

[buy fioricet online] [buy fioricet] [[buy fioricet online]]

Disclaimer: This site not officially maintained by Boost Developers