[Home]Effective UBLAS

BOOST WIKI | RecentChanges | Preferences | Page List | Links List

Effective uBLAS

For uBLAS Users and Developers

This document is intended both to facilitate discussion and to document the current status of uBLAS. Its positioning in the Boost Wiki should make it easier for new comers to avoid problems and understand the many complex issues uBLAS has to deal with.

If you are interested, you can join the ublas mailing list:

      (new) http://lists.boost.org/mailman/listinfo.cgi/ublas

and have a look at the main page of the boost project.

There is a Sourceforge site where development took place

      (old) http://sourceforge.net/projects/ublas/ 

The old mailing list is still available at

      (old) http://groups.yahoo.com/group/ublas-dev/

but no new members will be accepted. All discussion should run via the new list.

Boost 1.33.x

The most recent version of Boost with an updated version of uBLAS. Release Notes uBLAS

First uBLAS users and developers meeting

uBlas dev-meeting

Further sources of information

Frequently asked Questions using uBLAS mostly performance issues.

Extensions and new usage patterns in uBLAS.

Examples - How to extend uBLAS.

Gunter Winkler's tricks and sparse matrix information (and a lot more)

     http://www.guwi17.de/ublas/index.html

Linear Algebra with uBLAS

Linear Algebra with uBLAS

LU Matrix Inversion - matrix inversion using uBLAS' LU-decomposition functions.

Effective UBLAS/Matrix Inversion.

Making Efficient use of uBLAS

Make sure you define NDEBUG

The only way uBLAS knows you want a release configuration is to check if you have defined NDEBUG. If you don't it assumes you want a debug configuration and adds a lot of very useful runtime check. However these are very slow!

You can either rely on your build system to do this for you. Boost.build is highly recommended. Or you must add this yourself to your compiler options.

Controlling the complexity of nested products

What is the complexity (the number of add and multiply operations) required to compute the following?
 R = prod(A, prod(B,C)); 
Firstly the complexity depends on matrix size. Also since prod is transitive (not commutative) the bracket order affects the complexity.

uBLAS evaluates expressions without matrix of vector temporaries and honours the bracketing structure. However avoiding temporaies for tested product unnecessarly increases the complexity. Conversly by explictly using temporary matrices the complexity of a nested product can be reduced.

uBLAS will give a compile time error using the syntax above, and instead, it provides 3 alternative syntaxes for this purpose:

 temp_type T = prod(B,C); R = prod(A,T);   // Preferable if T is preallocated

 prod(A, temp_type(prod(B,C));

 prod(A, prod<temp_type>(B,C));

The 'temp_type' is important. Given A,B,C are all of the same type. Say matrix<float>, the choice is easy. However if the value_type is mixed (int with float or double) or the matrix type is mixed (sparse with symmetric) the best solution is not so obvious. It is up to you! It depends on numerical properties of A and the result of the prod(B,C).

Properties of symmetric_matrix

They really are symmetric! uBLAS is able to automaticaly enforce their symmetry. This is achieved by using a packed storage format so that access to upper/lower parts symmetrically accesses the same element. However packed storage is significantly slower than normal dense storage.

When assigning from a non another matrix type is only valid when its value is numerically (upper identical to lower) symmetric. This is checked as a precondition in the debug build. This check can be avoided (ignoring the other half of the matrix by using a symmetric_adaptor.

Fixed size vector or matrices

The default uBLAS vector and matrix types are of variable size. Many linear algebra problems involve vectors with fixed size. 2 and 3 elements are common in geometry! Fixed size storage (akin to C arrays) can be implemented efficiently as it does not involve the overheads (heap management) associated with dynamic storage. uBLAS implements fixed sizes by changing the underling storage of a vector/matrix to a "bounded_array" from the default "unbounded_array". A simple 3 element vector can be implemented very easily thus:

 namespace ublas = boost::numeric::ublas;
 class vector3 : public ublas::vector<double, ublas::bounded_array<double,3> >
 {
	typedef ublas::vector<double, ublas::bounded_array<double,3> > Base_vector;
 public:
	// Default construction
	vector3 () : Base_vector(3)
	{}
	// Construction and assignment from a uBLAS vector expression or copy assignment
	template <class R> vector3 (const ublas::vector_expression<R>& r) : Base_vector(r)
	{}
	template <class R> void operator=(const ublas::vector_expression<R>& r)
	{
		Base_vector::operator=(r);
	}
	template <class R> void operator=(const Base_vector& r)
	{
		Base_vector::operator=(r);
	}
 };

It should be noted that this only changes the storage uBLAS uses for the vector3. uBLAS will still use all the same algorithm (which assume a variable size) to manipulate the vector3. In practice this seems to have no negative impact on speed. The above runs just as quickly as a hand crafted vector3 which does not use uBLAS. The only negative impact is that the vector3 always store a "size" member which in this case is redundant.

uBLAS now has a bounded_vector type that effectively wraps the above.

 ublas::bounded_vector<double, 3>

'rank' of an Iterator

Some iterator functions have a 'rank' argument. This is not explained in the class documentation. What is it for you may ask?

It is not the rank (number of independent row or columns) of the matrix!

Joerg's answer: It's something like the distinction over which rank the iterators have to traverse: 0 means it is an outer loop iterator, which is usually not able to determine (easily), if a row/column is empty and 1 means it is in inner loop iterator, which has the task to determine the next/previous element in a row or column.

edit: Thoughts about matrix iterators, see UBLAS matrix iterators.

Debugging in Visual Studio

It is possible to get vectors to display cleanly in VS debugger. Add the following in the autoexp.dat file in Common7/Packages?/Debugger?/ under [Visualizer] where it says DO NOT MODIFY.

 boost::numeric::ublas::bounded_vector<*,*>{
	children
	(
	    #array
	    (
			expr :	($c.data_.data_)[$i],
			size :	$c.data_.size_
		)
	)

    preview
    (
        #(
			"[",
            $c.data_.size_ ,
            "](",

            #array
            (
				expr : 	($c.data_.data_)[$i],
				size : 	$c.data_.size_
			),
			")"
		)
	)
 }

 boost::numeric::ublas::vector<*,*>{
	children
	(
	    #array
	    (
			expr :	($c.data_.data_)[$i],
			size :	$c.data_.size_
		)
	)

    preview
    (
        #(
			"[",
            $c.data_.size_ ,
            "](",

            #array
            (
				expr : 	($c.data_.data_)[$i],
				size : 	$c.data_.size_
			),
			")"
		)
	)
 }

Configuring uBLAS with Macros

Many macro's appear in ublas/config.hpp and elsewhere. Hopefully in the future may of these will disappear! They fall into 4 groups:

uBLAS Current Issues

Bug reports

Bug reports are followed using the Sourceforge tracker at:

      http://sourceforge.net/tracker/?group_id=94014&atid=606391

warning C4224: nonstandard extension used : formal parameter 'lower' was previously defined as a type

Is issued by VC7 when compiling with /Za. No other compiler reports this as a problem. Any information regarding standards conformance would be helpful.

Index differences may overflow

uBLAS uses size_t to for index values. The implementation of some algorithms requires the difference of two indices i.e.
 int diff = a.index() - b.index();
This may overflow.

size_t not consistently used

uBLAS uses size_t(-1) as special value. It is does not guaranteed that this special value cannot be specified by the user in construction or indexing.

sparse_matrix::iterator not sparse for rank==0 (slow index)

For simplicity when manipulating the slow index of a matrix the iterators sparseness is ignored. Only when the rank==0 (fast index) is manipulated is sparseness considered. Implies complexity is O(m) where m is size of slow index.

Conformant sparse proxies

When the sparse proxy assignment algorithm is used for computed assignments it must first insure the LHS is conformant (contains all the elements of) the RHS. This requires the assignment algorithm first insert elements into the LHS to make it conformant. This is unusual as computed assignment is normally very simple and fast.
BOOST WIKI | RecentChanges | Preferences | Page List | Links List
Edit text of this page | View other revisions
Last edited September 5, 2008 10:20 am (diff)
Search:
Disclaimer: This site not officially maintained by Boost Developers