| [ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
| A.1 Expressions are reference counted | ||
| A.2 Internal representation of products and sums |
| [ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
In GiNaC, there is an intrusive reference-counting mechanism at work
where the counter belongs to the algebraic objects derived from class
basic but is maintained by the smart pointer class ptr, of
which ex contains an instance. If you understood that, you can safely
skip the rest of this passage.
Expressions are extremely light-weight since internally they work like
handles to the actual representation. They really hold nothing more
than a pointer to some other object. What this means in practice is
that whenever you create two ex and set the second equal to the
first no copying process is involved. Instead, the copying takes place
as soon as you try to change the second. Consider the simple sequence
of code:
#include <iostream>
#include <ginac/ginac.h>
using namespace std;
using namespace GiNaC;
int main()
{
symbol x("x"), y("y"), z("z");
ex e1, e2;
e1 = sin(x + 2*y) + 3*z + 41;
e2 = e1; // e2 points to same object as e1
cout << e2 << endl; // prints sin(x+2*y)+3*z+41
e2 += 1; // e2 is copied into a new object
cout << e2 << endl; // prints sin(x+2*y)+3*z+42
}
|
The line e2 = e1; creates a second expression pointing to the
object held already by e1. The time involved for this operation
is therefore constant, no matter how large e1 was. Actual
copying, however, must take place in the line e2 += 1; because
e1 and e2 are not handles for the same object any more.
This concept is called copy-on-write semantics. It increases
performance considerably whenever one object occurs multiple times and
represents a simple garbage collection scheme because when an ex
runs out of scope its destructor checks whether other expressions handle
the object it points to too and deletes the object from memory if that
turns out not to be the case. A slightly less trivial example of
differentiation using the chain-rule should make clear how powerful this
can be:
{
symbol x("x"), y("y");
ex e1 = x + 3*y;
ex e2 = pow(e1, 3);
ex e3 = diff(sin(e2), x); // first derivative of sin(e2) by x
cout << e1 << endl // prints x+3*y
<< e2 << endl // prints (x+3*y)^3
<< e3 << endl; // prints 3*(x+3*y)^2*cos((x+3*y)^3)
}
|
Here, e1 will actually be referenced three times while e2
will be referenced two times. When the power of an expression is built,
that expression needs not be copied. Likewise, since the derivative of
a power of an expression can be easily expressed in terms of that
expression, no copying of e1 is involved when e3 is
constructed. So, when e3 is constructed it will print as
3*(x+3*y)^2*cos((x+3*y)^3) but the argument of cos() only
holds a reference to e2 and the factor in front is just
3*e1^2.
As a user of GiNaC, you cannot see this mechanism of copy-on-write semantics. When you insert an expression into a second expression, the result behaves exactly as if the contents of the first expression were inserted. But it may be useful to remember that this is not what happens. Knowing this will enable you to write much more efficient code. If you still have an uncertain feeling with copy-on-write semantics, we recommend you have a look at the C++-FAQ lite by Marshall Cline. Chapter 16 covers this issue and presents an implementation which is pretty close to the one in GiNaC.
| [ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
Although it should be completely transparent for the user of GiNaC a short discussion of this topic helps to understand the sources and also explain performance to a large degree. Consider the unexpanded symbolic expression which could naively be represented by a tree of linear containers for addition and multiplication, one container for exponentiation with base and exponent and some atomic leaves of symbols and numbers in this fashion:

However, doing so results in a rather deeply nested tree which will quickly become inefficient to manipulate. We can improve on this by representing the sum as a sequence of terms, each one being a pair of a purely numeric multiplicative coefficient and its rest. In the same spirit we can store the multiplication as a sequence of terms, each having a numeric exponent and a possibly complicated base, the tree becomes much more flat:

The number 3 above the symbol d shows that mul
objects are treated similarly where the coefficients are interpreted as
exponents now. Addition of sums of terms or multiplication of
products with numerical exponents can be coded to be very efficient with
such a pair-wise representation. Internally, this handling is performed
by most CAS in this way. It typically speeds up manipulations by an
order of magnitude. The overall multiplicative factor 2 and the
additive term -3 look somewhat out of place in this
representation, however, since they are still carrying a trivial
exponent and multiplicative factor 1 respectively. Within GiNaC,
this is avoided by adding a field that carries an overall numeric
coefficient. This results in the realistic picture of internal
representation for
:

This also allows for a better handling of numeric radicals, since
sqrt(2) can now be carried along calculations. Now it should be
clear, why both classes add and mul are derived from the
same abstract class: the data representation is the same, only the
semantics differs. In the class hierarchy, methods for polynomial
expansion and the like are reimplemented for add and mul,
but the data structure is inherited from expairseq.
| [ << ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |