Wednesday, April 8, 2009

Some benchmarks and coding hints

Dear class,

If you are having trouble with your code, let me provide some help here.

First, if you are wondering what some typical results are, I am getting Rg values to be in the range of about 14-24 (in Angstroms) and the ratio to vary between roughly 0.90 and 0.98.

Second, it is important that you consider the notation |r| to denote a vector norm. That is, it gives the length of the vector r. Thus,

|a-b|^2

gives the squared length of the vector pointing from b to a. This could be expanded as:

(xa - xb)^2 + (ya - yb)^2 + (za - zb)^2

where a = (xa, ya, za) and b = (zb, yb, zb).

Now, consider the radius of gyration. For a set of points, this is simply the root-mean-square-distance of the points from their centroid, where the centroid is defined as the average position of the points.

For the function you are to write called RadiusOfGyration, you need to compute the centroid and, subsequently, the squared distances of each position from it. To compute the centroid, you might write this code:
rm = np.zeros(3, float)
for (x, y, z) in Pos:
rm[0] = rm[0] + x
rm[1] = rm[1] + y
rm[2] = rm[2] + z
rm = rm / len(Pos)
Better yet, we could use the fact that NumPy allows element-by-element operations on arrays:
rm = np.zeros(3, float)
for Point in Pos:
rm = rm + Point
rm = rm / len(Pos)
Even better yet, we could use the sum function and specify that we want to sum over the first axis, the one corresponding to the particle index:
rm = Pos.sum(axis=0) / len(Pos)
Best yet, we can use the mean function:
rm = Pos.mean(axis=0)
Once we have computed the centroid, we might compute the Rg as follows:
RgSq = 0.
for (x, y, z) in Pos:
RgSq = RgSq + (x-rm[0])**2 + (y-rm[1])**2 + (z-rm[2])**2
Rg = np.sqrt(RgSq / len(Pos))
Again, we can use element-by-element operations:
RgSq = 0.
for Point in Pos:
RgSq = RgSq + sum((Point - rm)**2)
Rg = np.sqrt(RgSq / len(Pos))
Even better, we can use NumPy's broadcasting feature:
RgSq = sum((Pos - rm)**2)
Rg = np.sqrt(RgSq / len(Pos))
Here, Pos is an (N,3) array and rm is a (3) array. In taking the difference between the two, NumPy automatically broadcasts rm to be the same shape as Pos by duplicating it over N times.

Putting everything together, the RadiusOfGyration function can be completely written in a single line:
return np.sqrt(sum( (Pos - Pos.mean(axis=0))**2 ) / len(Pos))
Hope that helps,

MSS

No comments: