[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

A. Internal Structures

A.1 Expressions are reference counted  
A.2 Internal representation of products and sums  


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

A.1 Expressions are reference counted

In GiNaC, there is an intrusive reference-counting mechanism at work where the counter belongs to the algebraic objects derived from class basic but is maintained by the smart pointer class ptr, of which ex contains an instance. If you understood that, you can safely skip the rest of this passage.

Expressions are extremely light-weight since internally they work like handles to the actual representation. They really hold nothing more than a pointer to some other object. What this means in practice is that whenever you create two ex and set the second equal to the first no copying process is involved. Instead, the copying takes place as soon as you try to change the second. Consider the simple sequence of code:

 
#include <iostream>
#include <ginac/ginac.h>
using namespace std;
using namespace GiNaC;

int main()
{
    symbol x("x"), y("y"), z("z");
    ex e1, e2;

    e1 = sin(x + 2*y) + 3*z + 41;
    e2 = e1;                // e2 points to same object as e1
    cout << e2 << endl;     // prints sin(x+2*y)+3*z+41
    e2 += 1;                // e2 is copied into a new object
    cout << e2 << endl;     // prints sin(x+2*y)+3*z+42
}

The line e2 = e1; creates a second expression pointing to the object held already by e1. The time involved for this operation is therefore constant, no matter how large e1 was. Actual copying, however, must take place in the line e2 += 1; because e1 and e2 are not handles for the same object any more. This concept is called copy-on-write semantics. It increases performance considerably whenever one object occurs multiple times and represents a simple garbage collection scheme because when an ex runs out of scope its destructor checks whether other expressions handle the object it points to too and deletes the object from memory if that turns out not to be the case. A slightly less trivial example of differentiation using the chain-rule should make clear how powerful this can be:

 
{
    symbol x("x"), y("y");

    ex e1 = x + 3*y;
    ex e2 = pow(e1, 3);
    ex e3 = diff(sin(e2), x);   // first derivative of sin(e2) by x
    cout << e1 << endl          // prints x+3*y
         << e2 << endl          // prints (x+3*y)^3
         << e3 << endl;         // prints 3*(x+3*y)^2*cos((x+3*y)^3)
}

Here, e1 will actually be referenced three times while e2 will be referenced two times. When the power of an expression is built, that expression needs not be copied. Likewise, since the derivative of a power of an expression can be easily expressed in terms of that expression, no copying of e1 is involved when e3 is constructed. So, when e3 is constructed it will print as 3*(x+3*y)^2*cos((x+3*y)^3) but the argument of cos() only holds a reference to e2 and the factor in front is just 3*e1^2.

As a user of GiNaC, you cannot see this mechanism of copy-on-write semantics. When you insert an expression into a second expression, the result behaves exactly as if the contents of the first expression were inserted. But it may be useful to remember that this is not what happens. Knowing this will enable you to write much more efficient code. If you still have an uncertain feeling with copy-on-write semantics, we recommend you have a look at the C++-FAQ lite by Marshall Cline. Chapter 16 covers this issue and presents an implementation which is pretty close to the one in GiNaC.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

A.2 Internal representation of products and sums

Although it should be completely transparent for the user of GiNaC a short discussion of this topic helps to understand the sources and also explain performance to a large degree. Consider the unexpanded symbolic expression which could naively be represented by a tree of linear containers for addition and multiplication, one container for exponentiation with base and exponent and some atomic leaves of symbols and numbers in this fashion:

repnaive

However, doing so results in a rather deeply nested tree which will quickly become inefficient to manipulate. We can improve on this by representing the sum as a sequence of terms, each one being a pair of a purely numeric multiplicative coefficient and its rest. In the same spirit we can store the multiplication as a sequence of terms, each having a numeric exponent and a possibly complicated base, the tree becomes much more flat:

reppair

The number 3 above the symbol d shows that mul objects are treated similarly where the coefficients are interpreted as exponents now. Addition of sums of terms or multiplication of products with numerical exponents can be coded to be very efficient with such a pair-wise representation. Internally, this handling is performed by most CAS in this way. It typically speeds up manipulations by an order of magnitude. The overall multiplicative factor 2 and the additive term -3 look somewhat out of place in this representation, however, since they are still carrying a trivial exponent and multiplicative factor 1 respectively. Within GiNaC, this is avoided by adding a field that carries an overall numeric coefficient. This results in the realistic picture of internal representation for :

repreal

This also allows for a better handling of numeric radicals, since sqrt(2) can now be carried along calculations. Now it should be clear, why both classes add and mul are derived from the same abstract class: the data representation is the same, only the semantics differs. In the class hierarchy, methods for polynomial expansion and the like are reimplemented for add and mul, but the data structure is inherited from expairseq.


[ << ] [ >> ]           [Top] [Contents] [Index] [ ? ]

This document was generated by root on July, 11 2005 using texi2html