Monday, 10 November 2008

Basic Relativity

I realise I haven't made any posts here recently; I've been far too busy with my Ph.D. and various other commitments. In lieu of the post I wanted to make on special relativity, defining everything carefully and introducing the tensor formulation, I'm going to present some basic concepts of Special Relativity, almost exactly as I taught them to a first year undergraduate examples class. These people had lectures on relativity already, so the stuff below is really a recap and a worked example.

Lorentz Transforms
Consider a coordinate frame S' moving with velocity v in the positive x-direction of a frame S. The Lorentz transforms between the two frames (x, y, z, t in S and x', y', z', t' in S') are given by:

\begin{align*}
x^\prime &= \gamma(x-vt) = \frac{1}{\sqrt{1-\frac{v^2}{c^2}}}(x - vt)\\
y^\prime &= y\\
z^\prime &= z\\
t^\prime &= \gamma\left(t - \frac{vx}{c^2}\right) = \frac{1}{\sqrt{1-\frac{v^2}{c^2}}}\left(t - \frac{vx}{c^2}\right)
\end{align*}


The reverse transforms, from S' back to S, are given by:

\begin{align*}
x &= \gamma(x^\prime+vt^\prime) = \frac{1}{\sqrt{1-\frac{v^2}{c^2}}}(x^\prime + vt^\prime)\\
y &= y^\prime\\
z &= z^\prime\\
t &= \gamma\left(t^\prime + \frac{vx^\prime}{c^2}\right) = \frac{1}{\sqrt{1-\frac{v^2}{c^2}}}\left(t^\prime + \frac{vx^\prime}{c^2}\right)
\end{align*}


Velocity Transformation
If an object is moving with a velocity u' in frame S' (as defined above), we need to determine a transform for the velocity in frame S. Start out with the fact that the velocity in S is
u = \frac{\Delta x}{\Delta t}
and substitute in the transforms above:

u = \frac{\Delta x}{\Delta t} &= \frac{\gamma(\Delta x^\prime + v\Delta t^\prime)}{\gamma\left(\Delta t^\prime + \frac{v\Delta x^\prime}{c^2}\right)}

Divide through by
\Delta t^\prime
:

u= \frac{\frac{\Delta x^\prime}{\Delta t^\prime} + v}{1+\frac{v\Delta x^\prime}{\Delta t^\prime c^2}}


Since
\frac{\Delta x^\prime}{\Delta t^\prime}=u^\prime
,


u = \frac{u^\prime + v}{1+\frac{u^\prime v}{c^2}}

which is the velocity transform between u and u'!

Time Dilation
Consider a particle with a lifetime
\tau
in its rest frame (i.e. in an inertial frame in which the particle has zero momentum). The lifetime t, in a frame moving with velocity v is given by:

t = \gamma\tau


Since
\gamma
is always greater than or equal to 1, t is always as large as, or larger than,
\tau
. This is the phenomenon of time dilation and the relationship can be used to determine the time dilation in any inertial frame.

Length Contraction
If the length measured in the lab frame is given by
L_0
, the length seen by a particle moving at speed v in the lab frame is given by:

L = \frac{L_0}{\gamma}

The length is therefore always smaller than (or exactly the same as) the length in the lab frame, and this is the phenomenon of length contraction.

Be careful never to use both time dilation and length contraction in the same (part of a) physics problem! If you're working in the lab frame, use time dilation to alter the particle lifetime, and leave the length as measured in the lab frame. If you're in the particle rest frame, use the particle lifetime as given (since they are defined as lifetime in a particle rest frame) and use length contraction to determine the length seen by the particle from the length as measured in the lab.

The classic atmospheric muon problem, below, illustrates how to use these concepts.

The Atmospheric Muon Problem
Muons are produced in cosmic-ray interactions high in the Earth's atmosphere (at, say 8000m). The muon lifetime is
2.2\times 10^{-6}~\mathrm{s}
.

Muons produced in this way are detected at the Earth's surface. For a muon travelling at 0.998 times the speed of light,
(a) Calculate how far the muon would travel using only classical (non-relativistic) physics.

Here, we use the classical
s = ut
relationship. The maximum distance the muon can travel is given by:

s = 0.998c \tau = 0.998\times 3\times 10^{8} \times 2.2\times 10^{-6} = 658.68~\mathrm{m}


In this situation, the muon cannot reach the Earth's surface.

(b) In the rest frame of the muon, use relativistic physics to show that the muon can indeed reach the surface.

In the muon rest frame, the lifetime is still
2.2\times 10^{-6}~\mathrm{s}
, but the length it has to travel is shortened by length contraction:

L_\mu = \frac{1}{\gamma}L_E

The Lorentz factor is given by:

\gamma = \frac{1}{\sqrt{1-\frac{v^2}{c^2}}} = 15.8

giving the length seen by the muon as:

L_\mu = \frac{8000}{15.8} = 506.32~\mathrm{m}


We already worked out that a muon travelling at 0.998c can travel over 600m before it decays using its lifetime as stated, in part a, so now the muons will make it to the surface of the Earth!

(c) In the Earth frame, use relativistic physics to show that the muon can make it to the surface.

Here, the length is as measured, 8000m, but the lifetime of the muon must be dilated:
t = \gamma \tau


We worked out the Lorentz factor above as 15.8, so now we have:

t = 15.8 \times 2.2\times 10^{-6} = 3.48 \times 10^{-5}


The maximum distance such a muon can travel is then:

s = u t = 0.998\times 3\times 10^8 \times 3.48\times 10^{-5} = 10419~\mathrm{m}


The muon can now travel over 10 km in the Earth frame, well over the 8 km it needs to hit the surface, so again we can detect such muons on the Earth.

This example demonstrates several interesting aspects of relativistic physics. Firstly, that it works! We can detect muons produced in this way, which we wouldn't be able to detect if those relativistic effects didn't occur! Secondly, you can analyse a problem using either length contraction or time dilation, but you need to choose your frame carefully. Don't contract or dilate quantities that were measured in the frame you're using; only alter those measured in other frames. Note that lengths are often (but not always) measured in a lab frame, and that particle lifetimes are always given in the rest frame of the particle concerned!

Tuesday, 7 October 2008

Contraction of Four-Vectors: A C++ Exercise

NOTE: The examples as typed have not been compiled or checked, however they are directly based on examples I talked through while teaching C++, so there should not be any particularly glaring errors or omissions. Note that I've avoided talking too much about object-oriented design, advanced concepts or why certain things are done the way they are. This article is intended mainly as an introduction to the abstraction mechanisms available with C++ class-based design, and is not intended to introduce any new concepts to the world of C++. In particular, I do not claim to be the first person to have thought of such concepts (as a number of comments have suggested in the past); I'm merely providing an example for those interested!

Without going into the physics behind it, if we have two objects A and B, each with four components, we can define their contraction as:
A_{\mu}B^{\mu} = A_0 B_0 - (A_1 B_1 + A_2 B_2 + A_3 B_3)


I intend now to use this concept to develop a C++ program from a `first draft' to a finished object-oriented program. This is indented primarily as an introductory-level tutorial for those new to C++ but familiar with many aspects of C or similar languages, so I will not go into a lot of detail about the basics, instead focussing on the C++ specific stuff.

The first approach could look something like this:

#include <iostream>

int main()
{
// Create some variables
double a0, a1, a2, a3;
double b0, b1, b2, b3;

// Prompt and read eight components
std::cout << "Enter the four components of A, separated by spaces:" << std::endl;
std::cin >> a0 >> a1 >> a2 >> a3;
std::cout << "Enter the four components of B, separated by spaces:" << std::endl;
std::cin >> b0 >> b1 >> b2 >> b3;

// Calculate contraction and display result
double result = a0*b0 - a1*b1 - a2*b2 - a3*b3;
std::cout << "A.B = " << result << std::endl;

return 0;
}


This is a fairly simple program, achieving exactly what we set out to do. However, from the perspective of simplicity, there's a lot of overhead in the main program which detracts from the understanding of what is actually going on. We can make things slightly clearer by writing the contraction as a function; then we'll be able to call it repeatedly from our code, instead of defining what a contraction of four-vectors means at every point it is used:

double contract(double a0, double a1, double a2, double a3,
double b0, double b1, double b2, double b3)
{
return a0*b0 - a1*b1 - a2*b2 - a3*b3;
}


Now, our main() looks like this:

int main()
{
// Create some variables
double a0, a1, a2, a3;
double b0, b1, b2, b3;

// Prompt and read eight components
std::cout << "Enter the four components of A, separated by spaces:" << std::endl;
std::cin >> a0 >> a1 >> a2 >> a3;
std::cout << "Enter the four components of B, separated by spaces:" << std::endl;
std::cin >> b0 >> b1 >> b2 >> b3;

// Calculate contraction and display result
double result = contract(a0, a1, a2, a3, b0, b1, b2, b3);
std::cout << "A.B = " << result << std::endl;

return 0;
}

No major improvement there, but we've moved the contraction code out into a function so that we can reuse it. Still, it's pretty ugly to pass eight doubles into the function. Much nicer would be to pass two four-vectors; after all, that is what the contraction works with, at a higher level:

#include <iostream>

class FourVector {
public:
double x0;
double x1;
double x2;
double x3;
};

double contract(FourVector a, FourVector b)
{
return a.x0*b.x0 - a.x1*b.x1 - a.x2*b.x2 - a.x3*b.x3;
}

int main()
{
// Create some variables
Fourvector a, b;

// Prompt and read eight components
std::cout << "Enter the four components of A, separated by spaces:" << std::endl;
std::cin >> a.x0 >> a.x1 >> a.x2 >> a.x3;
std::cout << "Enter the four components of B, separated by spaces:" << std::endl;
std::cin >> b.x0 >> b.x1 >> b.x2 >> b.x3;

// Calculate contraction and display result
double result = contract(a, b);
std::cout << "A.B = " << result << std::endl;

return 0;
}


Things are a little neater now, but still not ideal. For a start, we have public member variables in our class. This is generally considered bad design, so we should make them private and instead provide alternative mechanisms for accessing them. In this case, the class lends itself perfectly towards the use of operator overloading. The idea here is to define what it means to "multiply" two four-vectors together. We can define operator*() as a member-function of the FourVector class:

#include <iostream>

class FourVector {
public:
double x0;
double x1;
double x2;
double x3;
double operator*(const FourVector& b);
};

double FourVector::operator*(const FourVector& b)
{
return x0*b.x0 - x1*b.x1 - x2*b.x2 - x3*b.x3;
}

int main()
{
// Create some variables
Fourvector a, b;

// Prompt and read eight components
std::cout << "Enter the four components of A, separated by spaces:" << std::endl;
std::cin >> a.x0 >> a.x1 >> a.x2 >> a.x3;
std::cout << "Enter the four components of B, separated by spaces:" << std::endl;
std::cin >> b.x0 >> b.x1 >> b.x2 >> b.x3;

// Calculate contraction and display result
double result = a * b;
std::cout << "A.B = " << result << std::endl;

return 0;
}


As you can see, we have now defined enough code that the statement result = a * b; works as expected, producing the contraction of the two. The member function, when called, is bound to object a (in this case), so x0, x1, x2, x3 all refer to a's variables. We have to refer to the variables in b (passed as an argument to the operator) explicitly as b.x0 etc.

We still haven't hidden the member variables. To do this, we need to add a couple more things: a constructor, and a stream operator.

#include <iostream>

class FourVector {
public:
FourVector() : x0(0), x1(0), x2(0), x3(0) {};
FourVector(double i0, double i1, double i2, double i3)
: x0(i0), x1(i1), x2(i2), x3(i3) {};
double operator*(const FourVector& b);
friend std::istream& operator>>(std::istream& in, FourVector& v);
private:
double x0;
double x1;
double x2;
double x3;
};

double FourVector::operator*(const FourVector& b)
{
return x0*b.x0 - x1*b.x1 - x2*b.x2 - x3*b.x3;
}

std::istream& operator>>(std::istream& in, FourVector& v)
{
in >> v.x0 >> v.x1 >> v.x2 >> v.x3;
return in;
}

int main()
{
// Create some variables
Fourvector a, b;

// Prompt and read eight components
std::cout << "Enter the four components of A, separated by spaces:" << std::endl;
std::cin >> a;
std::cout << "Enter the four components of B, separated by spaces:" << std::endl;
std::cin >> b;

// Calculate contraction and display result
double result = a * b;
std::cout << "A.B = " << result << std::endl;

return 0;
}

There are a few things to note here. First, the default constructor:
FourVector() : x0(0), x1(0), x2(0), x3(0) {};


Constructors are named identically to the class they construct, and have no return type. This particular one takes no arguments, has an initialiser list which sets all the member variables to zero, and has an empty function body -- {};

The other constructor allows one to construct a FourVector object by passing four doubles into the constructor:

double E, px, py, pz;
// Some code which sets E and p goes here
FourVector EnergyMomentum(E, px, py, pz);


Finally, operator>> is a little different. Because the order is stream >> FourVector, we can't make the operator>> a member of FourVector (since it would expect the FourVector object to be passed first!). Furthermore, we need access to the private member variables, or we have nothing to read into. This is a bit of a dilemma, until we use the `friend' statement. This tells the compiler to let a function which is defined outside the class have access to internal class members; precisely what we want here!

You'll notice that the mechanism by which we got here is quite convoluted; we've added constructors and defined operators, and the class is far from complete -- for it to be generally useful we'd need an output operator, other mathematical operations and more. The amount of code has increased, as has the complexity of the overall solution. But look at the code inside main()... this is now considerably simpler. If we ignore comments and don't prompt the user, we can now write exactly the code which corresponds to the mathematical concept:

int main()
{
FourVector a, b;
std::cin >> a >> b;
double result = a * b;
std::cout >> result >> std::endl;
return 0;
}


The best part is that the FourVector class is reusable, and sufficiently powerful to allow this abstraction to be carried forward to develop a whole system for tensor calculus, expressed in terms of the mathematical objects involved!

Wednesday, 24 September 2008

Symmetries, Conserved Quantities and Noether's Theorem

Following on from my posts on Lagrangian and Hamiltonian Mechanics [1, 2, 3, 4], I'd like to discuss one of the most amazing topics in physics. The consideration of the symmetries of physical laws and how those symmetries relate to conserved quantities, the fundamentally beautiful mathematics lying beneath, and the extent to which we can develop theories of the world around us from such simple concepts; these things continually inspire me and provoke my interest.

I'll start by showing that the formulation of Lagrangian and Hamiltonian Mechanics, thus far, allows us to determine several conservation laws. Consider, for example, the homogeneity of space. Space is homogeneous if the motion (or time-evolution) of a particle (or system thereof) is independent of absolute position. That is, the potential does not vary with absolute position (it can still vary with the vector distance between two particles, as an interaction potential, for example!)

If we make a transformation
\mathbf{r}\rightarrow\mathbf{r}+\delta\mathbf{r}
, then the Lagrangian will also transform as
L \rightarrow L+\delta L
. For a single particle, we can Taylor expand as follows:

L(\mathbf{r}+\delta\mathbf{r},\mathbf{v}) = L(\mathbf{r},\mathbf{v})+\frac{\partial L}{\partial x}\delta x+\frac{\partial L}{\partial y}\delta y+\frac{\partial L}{\partial z}\delta z

Which we can use to write
\delta L = \frac{\partial L}{\partial \mathbf{r}}\cdot\delta\mathbf{r}

\frac{\partial L}{\partial \mathbf{r}}
is a vector quantity; each component is the derivative of L with respect to the corresponding coordinate of r. For a single particle, then,
\frac{\partial L}{\partial \mathbf{r}}=\nabla L
.

Homogeneity of space requires that
\delta L = 0
. Since
\delta \mathbf{r}
is arbitrary (and therefore not necessarily zero), we have that

\frac{\partial L}{\partial q_i} = 0 ~~~~~~ (\star)


This holds only if L does not depend on absolute position, otherwise there would be a contribution
\delta L
from many of the possible choices of
\delta\mathbf{r}
. Spatial dependence of e.g. V(x) implies spatial variation of L, and momentum would not be conserved.

The Euler-Lagrange Equation applies for each coordinate in the vector r. The sum of these Euler-Lagrange Equations (ELEs) means that
(\star)
requires that:

\begin{multiline*}
\frac{d}{dt}\frac{\partial L}{\partial \dot{q}_i} = 0 \\
\Rightarrow p_i = \frac{\partial L}{\partial \dot{q}_i} ~~\mathrm{remains~constant}
\end{multiline*}


We have, therefore, demonstrated the conservation of momentum as a result of requiring translational invariance. That is, any canonical momenta whose conjugate coordinates do not appear explicitly in the Lagrangian are conserved.

Turning once again to time symmetries, let us re-derive the conservation of energy. If the Lagrangian is homogeneous in time, i.e.
L(q,\dot{q})~~\mathrm{not}~~L(q,\dot{q},t)
, then:

\frac{dL}{dt} = \sum_i \frac{\partial L}{\partial q_i}\dot{q}_i + \sum_i\frac{\partial L}{\partial \dot{q}_i}\ddot{q}_i

As L does not depend explicitly on time, there is no term
\frac{\partial L}{\partial t}
on the RHS. Sunstituting
\frac{\partial L}{\partial q_i}
from the ELE,

\frac{dL}{dt} = \sum_i\dot{q}_i\frac{d}{dt}\frac{\partial L}{\partial\dot{q}_i}+\sum_i\frac{\partial L}{\partial\dot{q}_i}\ddot{q}_i = \sum_i\frac{d}{dt}\left(\dot{q}_i\frac{\partial L}{\partial\dot{q}_i}\right)

\begin{multiline*}
\Rightarrow \frac{d}{dt}\sum_i\left( \dot{q}_i\frac{\partial L}{\partial\dot{q}_i}-L \right) = 0 \\
\Rightarrow H = \sum_i \dot{q}_i\frac{\partial L}{\partial \dot{q}_i} - L~~~\mathrm{remains~constant}


The conservation of energy holds for any motion in a non-time-varying external field V(x)

We turn now to the isotropy of space, and show that angular momentum is conserved due to rotational invariance of the Lagrangian. Consider rotation by an angle
|\delta\theta|
(with a direction given by
\delta\theta
) about a vector. For small rotations,
\mathbf{r}\rightarrow \mathbf{r}+\delta\mathbf{r}
, with
\delta\mathbf{r} = \delta\theta\times r
. Each component of the velocity is also transformed by this rotation,
\delta\mathbf{v}=\delta\theta\times\mathbf{v}
.

For a single body, we now impose the requirement that the Lagrangian be unchanged under such a rotation (i.e. we require space to be isotropic).
\delta L = \sum_i\left( \frac{\partial L}{\partial q_i}\cdot\delta r_i + \frac{\partial L}{\partial\dot{q}_i}\cdot\delta v_i \right) = 0


We can replace
\frac{\partial L}{\partial v_i}
by the vector canonical momentum
p_i
, and
\frac{\partial L}{\partial q_i}
by
\dot{p}_i
, leaving:
\begin{multiline*}
\left(\dot{\mathbf{p}}\cdot\delta\mathbf{r} + \mathbf{p}\cdot\delta\mathbf{v}\right) = 0 \\
\Rightarrow \dot{\mathbf{p}}\cdot\left(\delta\theta\times \mathbf{r}\right) + \mathbf{p}\cdot\left(\delta\theta\times\mathbf{v}\right) = 0
\end{multiline*}


Since
\mathbf{a}\cdot\left(\mathbf{b}\times\mathbf{c}\right) = \mathbf{b}\cdot\left(\mathbf{c}\times\mathbf{a}\right)
,

\begin{multiline*}
\delta\theta\cdot\left(\left[\mathbf{r}\times\dot{\mathbf{p}}\right]+\left[\mathbf{v}\times\mathbf{p}\right]\right)=0\\
\Rightarrow\delta\theta\cdot\frac{d}{dt}\left(\mathbf{r}\times\mathbf{p}\right)=0\end{multiline*}


Since
\delta\theta
is arbitrary, this requires that
\mathbf{r}\times\mathbf{p}
does not change in time, hence angular momentum is a conserved quantity.

Hamilton's Equations
Using the ideas presented above, I'm going to take a moment to derive Hamilton's Equations, which will prove useful later on.

Consider changes in the Lagrangian L, according to
dL = \sum_i\frac{\partial L}{\partial\dot{q}_i}\,d\dot{q}_i + \sum_i\frac{\partial L}{\partial q_i}\,dq_i

This can be written:
dL=\sum_ip_i\,d\dot{q}_i+\sum_i\dot{p}_i\,dq_i

since
\frac{\partial L}{\partial q_i}=\dot{p}_i
and
\frac{\partial L}{\partial\dot{q}_i}=p_i
.
Using,
\sum_i p_i\,d\dot{q}_i = d\left(\sum_i p_i q_i\right) - \sum_i\dot{q}_i\,dp_i
,

d\left(\sum_i p_i\dot{q}_i - L\right) = -\sum_i\dot{p}_i\,dq_i+\sum_i\dot{q}_i\,dp_i

The argument of the differential on the left is the Hamiltonian, H,
H(q,p,t)=\sum_i p_i\dot{q}_i - L
, therefore:
dH = -\sum_i\dot{p}_i\,dq_i + \sum_i\dot{q}_i\,dp_i


From here, we can obtain Hamilton's Equations:
\begin{align*}
\dot{q}_i &= \frac{\partial H}{\partial p_i}\\
\dot{p_i} &= \frac{\partial H}{\partial q_i}
\end{align*}


For m coordinates (and m momenta), Hamilton's Equations form a system of 2m first-order differential equations, compared to the m second-order equations in the Lagrangian treatment.

The total time derivative,

\frac{dH}{dt}=\frac{\partial H}{\partial t}+\sum_i\frac{\partial H}{\partial q_i}\dot{q}_i+\sum_i\frac{\partial H}{\partial p_i}\dot{p}_i

Substituting Hamilton's equations for
\dot{q}_i, \dot{p}_i
, the last two terms cancel, so
\frac{dH}{dt} = \frac{\partial H}{\partial t}

and if H does not depend explicitly on time,
\frac{dH}{dt}=0
and energy is conserved!

Noether's Theorem
The three conserved quantities above were shown to be related to the invariance of the Lagrangian under some symmetry transformation:
  • Translational invariance (homogeneity of space) ==> Conservation of momentum
  • Rotational invariance (isotropy of space) ==> Conservation of angular momentum
  • Time invariance (homogeneity of time) ==> Conservation of energy

Noether's Theorem states that any differentiable symmetry of the Action (integral of the Lagrangian) of a physical system has a corresponding conservation law.

To every differentiable symmetry generated by local actions, there corresponds a conserved current.

`Symmetry' here, refers to the covariance of the form of a physical law with respect to a Lie group of transformations; the conserved quantity is known as a charge and the flow carrying it as a current (c.f. electrodynamics).

Noether's Theorem, which I will discuss in more detail at a later date, is another critical component used to build gauge theories. The key thing to remember right now is that a symmetry (invariance) of the Lagrangian corresponds to a conserved quantity; we can use this result to look for the underlying symmetry behind quantities we know to be conserved (for example, electric charge).

Monday, 22 September 2008

Feedback on `Writing Scientific Documents Using LaTeX'

Since I submitted my article Writing Scientific Documents Using LaTeX to CTAN, I've had several emails offering comments, suggestions and improvements. I plan on taking all of these into account for the next edition, which should be available sometime soon. Meanwhile, I thought I'd just post a little about some of the suggestions here.

The single most common suggestion was that I replace the use of $$ ... $$ to delimit displayed mathematics with the \[ ... \] form. The dollar-variant is apparently deprecated, and the newer square bracket style preferred. The reason for this, as given in http://www.ctan.org/tex-archive/info/l2tabu/, is that the $$ form is a plain TeX command, and should be avoided in LaTeX due to inconsistencies. Check the l2tabu document for more details.

Another frequent comment was that the eqnarray environment is bad, and should always be replaced by the align or similar, from the amsmath package. I originally thought I'd leave a discussion of eqnarray in there for reference, but now it seems better to remove it entirely.

Other comments related to minor corrections and requests for additional explanation in some sections, expanded coverage of BibTeX and more details on table design.

As I said above, I plan to incorporate all of these suggestions into the 5th Edition of the article.

From Lagrangian to Hamiltonian Mechanics

If the Lagrangian L does not depend explicitly on time, and varies with time only through the time-dependence of the coordinate q (and its time derivative), then we can define a constant of motion known as the Hamiltonian, H.

That is, if

\begin{align*}
L = L(q(t),\dot{q}(t)) \\
\mathrm{not}~~L = L(q(t), \dot{q}(t), t)
\end{align*}

then the Hamiltonian H can be defined as
H = \dot{q}\frac{\partial L}{\partial \dot{q}} - L


For the simplest example of a particle of mass m, moving through a potential V(x), with velocity v = dx/dt, we have
L = \frac{m\dot{x}^2}{2} - V(x) = T - V
. Using the above definition of the Hamiltonian,
H = \frac{m\dot{x}^2}{2} + V(x)
which can be identified as the total energy. Since H is a constant of motion, this result corresponds to conservation of energy!

More generally,
\frac{dH}{dt}=\frac{d}{dt}\left(\dot{x}\frac{\partial L}{\partial\dot{x}}\right)-\frac{\partial L}{\partial\dot{x}}\frac{d\dot{x}}{dt}-\frac{\partial L}{\partial x}\frac{dx}{dt}
= v\left( \frac{d}{dt}\frac{\partial L}{\partial \dot{x}} - \frac{\partial L}{\partial x}\right) = 0

since the last term inside the brackets vanishes (it's the Euler-Lagrange Equation!)

Note that if L depends explicitly on time, the above does not hold, and we lose conservation of energy; here, energy can be `used' in thermodynamically irreversible processes. Of course, if we expanded the Lagrangian to take into account a large enough system, we regain conservation of energy for the universe as a whole!

The Hamiltonian H is an integral of the motion, since
\frac{dH}{dt} = 0 \Rightarrow \int_{t_0}^{t_1}\frac{dH}{dt}\,dt = \left[H\right]_{t_0}^{t_1} = 0

It contains only first-order time derivatives of the coordinate q, whereas the Euler-Lagrange equation contains second-order time derivatives,
\frac{d}{dt}\frac{\partial L}{\partial \dot{q}}
.

As an example of determining the Hamiltonian from a Lagrangian, I'd like to look at the case of Special Relativity (SR). There are two reasons for this... first, the SR Lagrangian looks a bit different from the usual T - V form of classical mechanics. Secondly, Special Relativity will feature prominently in a number of future posts as I start to direct the methods of Lagrangian and Hamiltonian mechanics towards describing gauge theories.

The SR Lagrangian in a potential V(x) is given by
L = \frac{-mc^2}{\gamma} - V(x) = -mc^2\sqrt{1-\frac{v^2}{c^2}} - V(x)

where I have used
v = \dot{x} = \frac{dx}{dt}
and have referred to the rest mass as simply m, while many books use the notation m_0.

The Hamiltonian H can be found as
\begin{multiline*}
H = v\frac{\partial L}{\partial v} - L\\
= \frac{mc^2}{\sqrt{1-\frac{v^2}{c^2}}} + V(x)\\
= \gamma mc^2 + V(x)
\end{multiline*}

The last line is readily identified as the total relativistic energy of a particle of rest mass m in a potential V(x), so our interpretation of the Hamiltonian holds!

Canonical Momenta
I'd like to take a moment now to introduce some new nomenclature and notation, and to explain why it is useful here.

First, a reminder that the Euler-Lagrange equation for a coordinate q can be written as
\frac{d}{dt}\frac{\partial L}{\partial \dot{q}} = \frac{\partial L}{\partial q}


We can define a quantity p, known as the canonical momentum conjugate to the coordinate q, as follows
p=\frac{\partial L}{\partial \dot{q}}


and a quantity F, known as the canonical force conjugate to the momentum p, as
F = \frac{\partial L}{\partial q}

These follow the usual definitions such that
T = \frac{p^2}{2m}
and
F = -\nabla V
, since the Lagrangian L = T - V.

Written in this form, the Euler-Lagrange Equation directly represents Newton's 2nd Law:
F = \dot{\mathbf{p}}


For multiple coordinates
q_i
, the Hamiltonian H is given by

H = \sum_i p_i \dot{q}_i - L

where
p_i
is the canonical momentum conjugate to the coordinate
q_i
.

Introducing the concept of canonical momentum conjugate to a coordinate is essential for the future posts I'd like to make on this blog. The reason for this is that when I discuss quantum field theory, the concepts introduced above will play an essential role in the procedure. More on this later!

Meanwhile, although I provided a couple of classical mechanics examples for the use of the Lagrangian, in order to introduce the concept at a level that most people can understand from the world around them, I will refrain from posting similar examples making use of the Hamiltonian, since I wish to progress to more complicated topics.

In order to illuminate the path from here to a discussion of gauge theories, I'd like now to simply list some of the topics I must first cover. As such, the interested reader will notice any such posts! I anticipate that this list will be incomplete, and that I will need to cover certain topics in a lot more detail than others.
  1. Symmetries and Noether's Theorem (some of this)
  2. Special Relativity in four-vector formulation (lots of this)
  3. Relativistic electrodynamics (not much of this)
  4. Relativistic Quantum Mechanics (I've already discussed the Klein-Gordon equation, but I'll go over this and other aspects, again)
  5. Spin & Relativity (Pauli matrices and the gamma matrices)
  6. The Dirac Equation (in some detail!)
  7. U(1) symmetries and the Gauge Principle
  8. Quantum Electrodynamics
  9. SU(2) and Electroweak Unification
  10. Aside on Superconductivity (which may be omitted)
  11. Spontaneous Symmetry Breaking and the Higgs Mechanism


I expect those posts to take some considerable time. I've spent at least two weeks discussing Lagrangian mechanics, and we have a long way to go before I can introduce the Higgs mechanism, which is the ultimate goal of this entire series of posts. The idea is that anyone with a basic grounding in physics and mathematics should be able to learn enough about Lagrangian & Hamiltonian Mechanics, Quantum Mechanics and Special Relativity to be able to appreciate the formulation of the gauge theories of the Standard Model of Particle Physics, and to go a little beyond that and get a look at the Higgs mechanism. Let's hope I make it to the end!

Saturday, 20 September 2008

Lagrangian Mechanics: From the Euler-Lagrange Equation to Simple Harmonic Motion

I already wrote about obtaining Newton's Laws from the Principle of Least Action. Now I'm going to analyse a simple mass-spring system; effectively just a case of substituting in a suitable potential for the spring.



Let us go through all of the steps, once more. We'll choose our coordinate x to be the displacement from the equilibrium length of the spring, as shown on the diagram. The kinetic energy of the moving mass is then just
T=\frac{m\dot{x}^2}{2}
. The potential of a spring stretched (or compressed) x metres from its equilibrium length is given by
V=\frac{kx^2}{2}
, where k is the spring constant (N m^-1).

So, for our mass-spring system, the Lagrangian is

L = \frac{1}{2}\left(m\dot{x}^2 - kx^2\right)}


Applying the Euler-Lagrange equation, we obtain

\begin{multiline*}
-kx - \frac{d}{dt}\left(m\dot{x}\right) = 0 \\
\Rightarrow \ddot{x} - \frac{k}{m}x = 0
\end{multiline*}


Comparing the last line above with the general form of Simple Harmonic Motion (SHM),
\ddot{q} - \omega^2 q = 0
we can see that the equation of motion rendered by applying the Euler-Lagrange equation to the Lagrangian of a mass-spring system provides Simple Harmonic Motion with a frequency

\omega = \sqrt{\frac{k}{m}}


This is as expected for the case of a mass-spring system!

Friday, 19 September 2008

Lagrangian Mechanics: From the Euler-Lagrange Equation to Newton's Laws

Last week I wrote about the Euler-Lagrange Equation, and how it can be obtained from the Principle of Least Action. Today, I'd like to show that this equation is consistent with Newtonian Mechanics.

For reference, the Euler-Lagrange equation for some arbitrary coordinate q is:

\frac{\partial L}{\partial q} - \frac{d}{dt}\frac{\partial L}{\partial \dot{q}} = 0


L, the Lagrangian, is the kinetic energy of a system minus the potential.

Let us now consider the case of a free particle of mass m, moving in a potential V(x). The particle's instantaneous speed is given by
v = \frac{dx}{dt}=\dot{x}
. The kinetic energy is the familiar
T=\frac{1}{2}m\dot{x}^2
.

Applying the Euler-Lagrange Equation, we have:

\begin{align*}
\frac{\partial L}{\partial x} = -\frac{\partial V(x)}{\partial x} \\
\mathrm{and}~~\frac{\partial L}{\partial \dot{x}} = m\dot{x} \\
\Rightarrow \frac{d}{dt}\frac{\partial L}{\partial \dot{x}} = m\ddot{x}
\end{align*}


Putting these together, we have:

\begin{align*}
-\frac{\partial V(x)}{\partial x} - m\ddot{x} = 0 \\
\Rightarrow m\ddot{x} = -\frac{\partial V}{\partial x}
\end{align*}


The right-hand side is the gradient of a potential energy (in 1D). Force can be defined in terms of the gradient of a potential V:

F = -\nabla V

And since
\ddot{x} = \frac{d^2 x}{dt^2}
is acceleration, a, the result of applying the Euler-Lagrange equation to a classical-mechanical Lagrangian is the familiar form of Newton's Second Law:

F = ma


In other words, applying the Euler-Lagrange equation to a suitable Lagrangian provides an equation of motion!

Tuesday, 16 September 2008

Writing Scientific Documents Using LaTeX: Permanent home on CTAN

Following the suggestion of someone who commented on my previous post about my article Writing Scientific Documents Using LaTeX, I submitted the PDF and TeX source to CTAN, the Comprehensive Tex Archive Network. After changing the license to the LPPL, it was accepted into the archive.

As a result, the PDF can now be found permanently at the following location:
http://www.ctan.org/get/info/intro-scientific/scidoc.pdf

The CTAN entry is at http://www.ctan.org/tex-archive/info/intro-scientific/

Sunday, 14 September 2008

Lagrangian Mechanics: From the Principle of Least Action to the Euler-Lagrange Equation

A quantity known as the Action, A is defined as

A = \int L \,dt

where L is known as the Lagrangian. The simplest Lagrangian is given as the kinetic energy T of a system minus the potential energy V of the system:

L(x,t) = T(x,t) - V(x,t)


By extremising the Action (minimising it, in this case) we can obtain the Euler-Lagrange Equation, a key equation for Lagrangian and Hamiltonian mechanics. The concept is also used extensively in quantum mechanics and particle physics, particularly when dealing with gauge theories.

Consider an extremised path x(t) from a point x(t0) to a point x(t1), and some excursion from this path, as shown in the figure below. The excursion is given by some small function a(t), and the velocity is changed accordingly:

\begin{align*}
x(t) \rightarrow x(t) + a(t) \\
v(t) \rightarrow v(t) + \dot{a}(t)
\end{align*}



If we take x(t) to be the extremal path from x(t0) to x(t1), with the end-points fixed, and a(t) to be some small but general excursion from that path which must pass through the end-points, we can assert that:

a(t_0) = a(t_1) = 0


The Lagrangian will be changed as a result of these excursions. To first-order in small a(t), the Lagrangian transforms as:

\begin{align*}
L(x,v) \rightarrow L(x+a, v+\dot{a}) \\
= L(x, v) + a(t)\frac{\partial L}{\partial x} + \dot{a}(t)\frac{\partial L}{\partial v}
\end{align*}


The Action therefore transforms according to
A \rightarrow A + \delta A
where:

\delta A = \int_{t_0}^{t_1}\,dt \left( a(t)\frac{\partial L}{\partial x} + \frac{da}{dt}\frac{\partial L}{\partial v} \right)


The second term in the brackets above can be integrated by parts:

\int_{t_0}^{t_1}\,dt \frac{da}{dt}\frac{\partial L}{\partial v} = \left[ a(t)\frac{\partial L}{\partial v} \right]_{t_0}^{t_1} - \int_{t_0}^{t_1}dt\,a(t)\frac{d}{dt}\frac{\partial L}{\partial v}

Since a(t0) = a(t1) = 0, the integrated part (in square brackets) vanishes, leaving the following form for delta A:

\delta A = \int dt\,a(t)\left(\frac{\partial L}{\partial x} - \frac{d}{dt}\frac{\partial L}{\partial v}\right)


For arbitrary a(t), to minimise the action (by having delta A zero), the part in brackets must be zero. This is the Euler-Lagrange Equation, rewritten for a generalised coordinate q (in place of x) and using the time-derivative of q in place of the velocity v, above:

\frac{\partial L}{\partial q} - \frac{d}{dt}\frac{\partial L}{\partial \dot{q}} = 0


As I mentioned above, the Euler-Lagrange Equation is critical for the understanding of several aspects of quantum mechanics and gauge theories. Since I plan on making posts about some of these topics in the future, I felt I should begin by explaining Lagrangian mechanics, rather than jumping in at the deep end.

Two-Column Layouts in LaTeX

I was recently asked about using LaTeX to create the two-column layout often seen in scientific journals. I used this layout in some lab reports and project reports.

The first thing to do is to include the multicol package in your preamble:
\usepackage{multicol}

After that, to place text within a multi-column layout, start the multicols environment (note the s at the end!) and specify the number of columns you'd like:
\begin{multicols}{2}

Finally, end the multi-column layout with:
\end{multicols}

It's as simple as that! Any sections, subsections, equations etc. between the two will now be automatically formatted into two columns.

There are, however, a few caveats and points to note for formatting considerations.

The first thing to note is that the abstract is often formatted as a single column. To do this, simply place the \begin{multicols}{2} command below your abstract, but above the first \section command.

Secondly, I usually typeset references within the two-column layout. To do this, end the multicols environment after your references section.

Finally, and most importantly, graphics and tables do not always work well with multi-column environments. In order to achieve reasonable results, I used a few workarounds.

If the graphic is large, or has small text, set it normally and it will float outside of the column layout, spanning both columns. For smaller figures, you can create an adjusted figure or table environment which forces LaTeX to set the figure in the column, at the point you typed it in the source text. This is not always what you want, but a reasonable layout can usually be obtained by playing with the exact positioning of the table or graphic commands in the source.

The following, inserted into the preamble, defines two new environments, tablehere and figurehere, which insert tables and figures inline with the column text.
% Figures within a column...
\makeatletter
\newenvironment{tablehere}
{\def\@captype{table}}
{}
\newenvironment{figurehere}
{\def\@captype{figure}}
{}
\makeatother
In addition to this, you'll probably need to scale graphics to the column width. This can be achieved with the \resizebox command inside a figure or our new figurehere environment. For example,
\begin{figurehere}
\centering
\resizebox{\columnwidth}{!}{\includegraphics{gf-graphs.eps}}
\caption{\label{gf-graphs}Graph showing applied RF frequency against magnetic field from the sweep coils, the gradient of the lines reveals information about $g_f$ for each isotope. The left gradient, corresponding to \chem{^{85}{Rb}} is $7.73 \times 10^9 (\pm 1.5 \times 10^8)$. The right gradient, corresponding to \chem{^{87}{Rb}}, is $5.24 \times 10^9 (\pm 6.00 \times 10^7)$ }
\end{figurehere}

\columnwidth is defined by the multicol package to be the width of a text-column!

Saturday, 13 September 2008

Writing Scientific Documents Using LaTeX

Back in 2007, I wrote a short document describing some of the basic LaTeX commands and explaining how to go about creating a document using LaTeX. I did this because I was writing third year lab reports collaboratively with my lab partners (as per the regulations for the third year lab at Warwick) and I wanted us all to be using LaTeX, so the reports would look good and so that editing the sections together would be easier. In order to do this, though, I had to educate my lab partners in the use of LaTeX.

Since then, the document has been read (and requested!) by other physicists, scientists and mathematicians at Warwick and in the wider world. I've updated it several times to add content and to correct minor mistakes. It was not until this week, however, that someone suggested I put it online and make it freely available to anyone who wants it (and can find it).

So, I spent a while tonight polishing it up, adding a few more subsections, and creating what is now the 13-page Fourth Edition. It is available online at http://www.physical-thought.com/scidoc.pdf

Sunday, 17 August 2008

Surf Photography: Rip Curl Boardmasters 2008

I took this on Thursday August 7th, 2008 at Fistral Beach, Newquay Cornwall. The Rip Curl Boardmasters is a surf competition and festival which has been at Fistral for as long as I can remember, and it's a great atmosphere.

This was taken at the 300mm end of my 70-300mm lens with a shutter speed of 1/640 s, aperture f/8.0 and ISO 200. It was subsequently adjusted in Photoshop to improve the contrast a little, then cropped to remove a bit of empty sea.

Pebble v1 Call Graph

Following up on my recent post about a C++ implementation of the Mersenne Twister pseudo-random number generator (PRNG), I decided to find out just how much the performance of the PRNG actually affects a Monte Carlo system such as Pebble. Clearly, the random number generation is a critical part of the procedure, but as I'm about to show, it is definitely not the slowest part of Pebble, by a long way.

The valgrind tool allows a programmer to check their programs for many things, including memory leaks (the original goal of the valgrind project). One of the neat things it does is function call profiling and graphing. The idea is simple, you run a program inside valgrind using the callgrind option, and it tracks every function call made within the program. This lets you figure out how many times a function is being called, and how much of the program execution time is spent within a given function, allowing you to easily profile the code and determine functions or classes which are good candidates for optimisation.

Here is the call graph for Pebble v1, which uses a C++ implementation of the Mersenne Twister which I wrote last summer, rather than the new one I have recently posted about.


(Click for larger version)


The random number generation, performed by Pebble::Math::Random::MersenneTwister::getDouble and getLong, account for a mere 3.33% of the time spent in the event generator function Pebble::EventGenerator::generate. In contrast, finding minima of the probability distribution functions (using Pebble::Math::Utility::Minimiser::getMinimum) accounts for 31.29%, and integrating the PDFs for normalisation purposes (using Pebble::Math::Calculus::RombergIntegrator::integrate) accounts for 50.82% of the time spent generating events.

It is clear from this that, if I want to improve on the speed of generation in "Pebble v2", I need to focus my efforts on efficient minimia-finding and numerical integration functions.

There is also 7.47% of the time taken to generate events spent adding particles to an Event object, using the Pebble::Event::addParticle function. A lot of this time is accounted for by the memory allocation, and as I mentioned in my previous post, eliminating such object-oriented extravagances, while maintaining a sensible object-oriented design for the important parts of the program, should help to boost the speed of any new version.

Friday, 15 August 2008

Mersenne Twister

A key part of the Pebble program was the ability to generate lots of good-quality pseudo-random numbers for Monte Carlo (MC). The Mersenne Twister (MT) algorithm is ideally suited to this, and is indeed the algorithm we used. The original author, Makoto Matsumoto, has published C source for the algorithm, which I converted into a C++ class for the project, using the same interface as several other random number generators (RNGs) provided within Pebble.

So, following my post yesterday about considering building a "Pebble v2" or similar, I've spent the afternoon looking back into Mersenne Twister. The C code is fairly small, and reasonably easy to understand (surprisingly for something as complicated as a PRNG!) and I had no difficulty converting it to C++ the first time around. This time, I've ignored a lot of the complicated interface I built in the first time, since I no longer intend to provide other RNGs -- Mersenne Twister is more than good enough for any MC purposes.

Anyway, I modified the MT code to encapsulate it in a C++ class (code below), then ran some timing tests against the C version. In order to get useful timing data, I had to generate 1 million integers and 1 million real numbers, timing the program run for the entire thing. A real number is generated from a single integer, so this is roughly equivalent to two million integer number generations, plus a little overhead in converting half of them to a number between 0 and 1.

For the purposes of this test, I removed all the program output code from both the original C code and my new C++ version, timing the programs as they generated and threw away two million random numbers. The reason for removing output code was to prevent any skew in the results from the different output systems used by C and C++.

The test itself consisted of running a program to generate 1M integers and 1M reals, using the C or C++ versions of the MT algorithm, respectively. Each test was run ten times, using the bash time command to provide timing details. I discarded the real time taken, which refers to what is known as wall time -- the amount of actual elapsed time according to a clock on the wall -- and focused only on the user and system times, which are the number of seconds the CPU actually spent running the program in user-space and in system-calls, respectively.

The graph below shows the results. Both are incredibly fast, with the slowest taking a mere 0.156 seconds of CPU time. If I can generate two million random numbers in 0.2 seconds, this will certainly not be a bottleneck in any MC system!


(Click image for larger version)

The comparison, however, is slightly discouraging. The system times are incredibly low, 0.002 and 0.001 seconds for C and C++ respectively, averaged over ten runs. This is almost certainly because I removed all of the I/O, so the only time spent in the "system" would be the initial memory allocation and the final deallocation. In fact, I'd expect the C++ variant to take longer here, as the MersenneTwister object must be constructed. Still, these times are so close to zero it's hard to talk about them meaningfully.

The "user" times tell a different story. These are also averaged over ten runs, and show that the C++ version took almost twice as long as the C version. I expected there to be a little overhead from dynamically allocating the array used to store the internal state of the MT generator, but that should be a one-time thing, and the rest of the code is almost identical. I can only assume that the discrepancy arises from the function calls being mapped to member functions in C++ rather than global functions in C.

Many would argue that the factor of 2 slowdown is completely unacceptable, even if the C++ version does hide the details of MT (no defines, no global variables, just a few member variables). I agree that the reduction in speed is quite dramatic here, but compared to other components of a functioning Monte Carlo system, this overhead is negligible, and for any program written predominantly in C++, encapsulating the functionality and hiding the details is a good idea. This is where the hardcore C fans would argue that no system which requires speed or efficiency should be written in C++, but I disagree, backed up by plenty of evidence for C++ development in the physical sciences, and in particle physics in particular!

MersenneTwister.hpp

/*
A C++ program for MT19937, with initialization improved 2002/1/26.
Original C Coded by Takuji Nishimura and Makoto Matsumoto.
Converted to a C++ class by Andrew J. Bennieston

Before using, initialize the state by using init(seed)
or initArray(init_key, key_length).

Copyright (C) 1997 - 2002, Makoto Matsumoto and Takuji Nishimura,
All rights reserved.
C++ Class modifications copyright (C) 2008, Andrew J. Bennieston.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:

1. Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.

2. Redistributions in binary form must reproduce the above copyright
notice, this list of conditions and the following disclaimer in the
documentation and/or other materials provided with the distribution.

3. The names of its contributors may not be used to endorse or promote
products derived from this software without specific prior written
permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.


Any feedback is very welcome.
http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/emt.html
email: m-mat @ math.sci.hiroshima-u.ac.jp (remove space)

Feedback on the C++ modifications to andrew @ physical - thought . com (remove spaces)
*/


class MersenneTwister {
private:
const int N;
const int M;
const unsigned long MATRIX_A;
const unsigned long UPPER_MASK;
const unsigned long LOWER_MASK;
int mti;
unsigned long* mt;
public:
// Constructors
MersenneTwister()
: N(624), M(397), MATRIX_A(0x9908b0dfUL),
UPPER_MASK(0x80000000UL), LOWER_MASK(0x7fffffffUL),
mti(N+1)
{ mt = new unsigned long[N]; };

// Destructor
~MersenneTwister() { delete[] mt; };

// Initialisation
void init(unsigned long s);
void initArray(unsigned long init_key[], int key_length);

// Random number generation
unsigned long genInt32(); // Random integer on [0,0xffffffff]
long genInt31(); // Random integer on [0,0x7fffffff]
double genRealClosed(); // Random real on closed range [0,1]
double genReal(); // Random real on half-open range [0,1) (i.e. not including 1)
double genRealOpen(); // Random real on open range (0,1) (i.e. not including 0 or 1)
double genReal53(); // Random 53-bit real on [0,1)
};


MersenneTwister.cpp

#include "MersenneTwister.hpp"

/*
* Initialise mt[N] with a seed
*/
void MersenneTwister::init(unsigned long s)
{
mt[0] = s & 0xffffffffUL;
for (mti=1; mti<N; mti++) {
mt[mti] =
(1812433253UL * (mt[mti-1] ^ (mt[mti-1] >> 30)) + mti);
/* See Knuth TAOCP Vol2. 3rd Ed. P.106 for multiplier.
* In the previous versions, MSBs of the seed affect
* only MSBs of the array mt[].
* 2002/01/09 modified by Makoto Matsumoto
*/
mt[mti] &= 0xffffffffUL;
/* for >32 bit machines */
}
}

/*
* Initialise by an array with length given by key_length
*/
void MersenneTwister::initArray(unsigned long init_key[], int key_length)
{
int i, j, k;
init(19650218UL);
i=1; j=0;
k = (N>key_length ? N : key_length);
for (; k; k--) {
mt[i] = (mt[i] ^ ((mt[i-1] ^ (mt[i-1] >> 30)) * 1664525UL))
+ init_key[j] + j; /* non linear */
mt[i] &= 0xffffffffUL; /* for WORDSIZE > 32 machines */
i++; j++;
if (i>=N) { mt[0] = mt[N-1]; i=1; }
if (j>=key_length) j=0;
}
for (k=N-1; k; k--) {
mt[i] = (mt[i] ^ ((mt[i-1] ^ (mt[i-1] >> 30)) * 1566083941UL))
- i; /* non linear */
mt[i] &= 0xffffffffUL; /* for WORDSIZE > 32 machines */
i++;
if (i>=N) { mt[0] = mt[N-1]; i=1; }
}

mt[0] = 0x80000000UL; /* MSB is 1; assuring non-zero initial array */
}

/*
* Generate random number on [0,0xffffffff]
*/
unsigned long MersenneTwister::genInt32()
{
unsigned long y;
static unsigned long mag01[2]={0x0UL, MATRIX_A};
/* mag01[x] = x * MATRIX_A for x=0,1 */

if (mti >= N) { /* generate N words at one time */
int kk;

if (mti == N+1) /* if init() has not been called, */
init(5489UL); /* a default initial seed is used */

for (kk=0;kk<N-M;kk++) {
y = (mt[kk]&UPPER_MASK)|(mt[kk+1]&LOWER_MASK);
mt[kk] = mt[kk+M] ^ (y >> 1) ^ mag01[y & 0x1UL];
}
for (;kk<N-1;kk++) {
y = (mt[kk]&UPPER_MASK)|(mt[kk+1]&LOWER_MASK);
mt[kk] = mt[kk+(M-N)] ^ (y >> 1) ^ mag01[y & 0x1UL];
}
y = (mt[N-1]&UPPER_MASK)|(mt[0]&LOWER_MASK);
mt[N-1] = mt[M-1] ^ (y >> 1) ^ mag01[y & 0x1UL];

mti = 0;
}

y = mt[mti++];

/* Tempering */
y ^= (y >> 11);
y ^= (y << 7) & 0x9d2c5680UL;
y ^= (y << 15) & 0xefc60000UL;
y ^= (y >> 18);

return y;
}

/*
* Generate a random number on [0,0x7fffffff]
*/
long MersenneTwister::genInt31()
{
return (long)(genInt32() >> 1);
}

/*
* Generate a random number on [0,1]-real
*/
double MersenneTwister::genRealClosed()
{
return genInt32() * (1.0 / 4294967295.0);
/* divided by 2^32 - 1 */
}

/*
* Generate a random number on [0,1)-real
*/
double MersenneTwister::genReal()
{
return genInt32() * (1.0 / 4294967296.0);
/* divided by 2^32 */
}

/*
* Generate a random number on (0,1)-real
*/
double MersenneTwister::genRealOpen()
{
return (((double)genInt32()) + 0.5) * (1.0 / 4294967296.0);
/* divided by 2^32 */
}

/*
* Generate a random number on [0,1)-real with 53-bit resolution
*/
double MersenneTwister::genReal53()
{
unsigned long a = genInt32() >> 5;
unsigned long b = genInt32() >> 6;
return(a * 67108864.0 + b) * (1.0 / 9007199254740992.0);
}

Saturday, 12 July 2008

Close-Up Photography

The company I ordered my step-up ring adapter from, SRB-Griturn, sell adapters which I'd never even considered before. Amongst them are adapters to reverse-mount a lens to a camera body, supposedly for close-up photography. The idea is that you take your lens off the camera body, and reverse mount it to the body using the lens filter thread into one side of this adapter, and the other side fits into the camera where the lens would normally go.

And they recommend you "try it first", by holding the lens in reverse up against the body and taking a photo of something.

Now, here is a photo taken using my Canon 18-55mm lens mounted normally. This is about the closest focus I can get to an object:



After taking that photo, I removed the lens, turned it round, and put it up against the hole in the camera. There are a number of obstacles to overcome here.

  1. You lose autofocus functionality, meaning you have to manually focus the lens
  2. The lens focussing isn't really sufficient in this configuration, so you end up moving the camera backwards and forwards anyway
  3. While holding the lens against the body with one hand, and leaving the other hand poised over the shutter button, you need a third hand to change the lens focus
  4. With the camera so close to the subject, light is an issue. There's not enough of it. I switched on a spot-light to get enough light in
After all that, I managed to take the photo below. This was with the lens rather close to the phone, and the results speak for themselves:



There's a fairly small area which is in focus. This is to be expected, given this configuration, and could probably be improved if I had enough hands. I suspect that is what these reverse-mounting adapters are for; freeing up the hand that is holding the lens, so you can focus with it instead!

Even so, I'm amazed by how close this gets you, and of the image quality resulting from it (fuzzy focus aside!)

Wednesday, 9 July 2008

Filter Ring Step Adapters

This morning I was running round Nottingham trying to find somewhere that sold filter ring adapters; these are devices which attach to the filter thread of a camera lens and provide another filter thread of a different size, allowing you to use smaller or larger filters than the lens ordinarily permits.

In my case, I wanted to be able to use my 62mm circular polarising filter on my 58mm thread lens, so I needed a 58-62mm step-up ring. This proved impossible to find. Jacobs had some adapters, but most were step-down (which surely would obscure part of the lens and thus be of very little use at all). The few step-up adapters they did have were the wrong sizes, so I moved on to London Camera Exchange, who informed me they did not keep adapters in stock, but could order them in.

Onwards to Jessops, who also do not stock them. It was here, however, that I received the most useful response; that a company called SRB, based in Luton, produced every kind of adapter imaginable, and took online orders!

So, after returning home, I Googled SRB Luton, and found the most useful photographic adapter producer in the world: http://www.srb-griturn.com/. They stock step-up and step-down ring adapters as well as a whole range of other more exotic things; adapters to make a macro lens out of two lenses, adapters for telescopes and microscopes... the list is almost endless.

I placed my order this afternoon, selecting 1st class delivery. Hopefully the adapter should arrive soon, and I can certainly recommend the website if you're looking for an adapter. In fact, if they don't have what you need, you can ask them to quote you a price for making it! It really doesn't get any better.

Saturday, 5 July 2008

Circular Polariser

My circular polariser arrived today. It's a Hoya Pro1 Digital filter, meaning it has multiple coatings to reduce flares and ghosting, a black frame, black rimmed glass to reduce internal reflections, a low profile frame, and other useful stuff. It's also pretty damn awesome!

I can't be bothered to explain the physics of polarisation, be it linear or circular. I also can't be bothered to explain why cameras with autofocus or through-the-lens metering need circular polarisers, but I'm sure you can all use Google to find that out for yourselves.

What I will do is show you the extremes of the differences in polarisation, and their effects on the image.

The first image is the greenhouse in my back garden. Notice that there is a lot of reflection from the sky. Reflected light is (partially) polarised, with a maximum of 100% polarisation between 50 and 60 degrees from the normal to the reflective plane. Because of this, I can use the polariser to select the light polarised from the reflection, maximising the reflection effect captured in the image:


Alternatively, I can rotate the filter 90 degrees to almost eliminate the reflected light, leaving only the light transmitted through the glass from the tree behind:


Notice also that the greens became more vibrant; that's due to a similar effect with the scattered light from the sky, and reflection from the leaves themselves.

Since I mentioned the sky, here's a photo which is mostly unpolarised, showing very little contrast between the sky and nearby clouds. It looks rather washed out and weak:



And here's almost the same image with full polarising effect. The sky is a much deeper blue, the clouds stand out more, and the vibrant greens are back. The whole image looks much more colour-correct, and retains the same vibrance throughout:

Friday, 4 July 2008

Photography: Californian Poppy

A while ago, I ordered a 62mm Hoya Circular Polarising filter for my 70-300 telephoto lens. I placed the order with 7dayshop, who had a reduced price filter, but no stock! I assumed that if I placed the order sufficiently early, they were likely to get some stock and send one out before I left for Cornwall to take surf photos. Over a week later, I've heard nothing about when they're getting more stock, so I've requested they cancel my order, and gone with a seller on the Amazon marketplace, who not only has stock of the filters, but is selling the Hoya Pro1D professional filters, designed for digital SLRs and with better lens coatings, for about the same price I'd have paid for the standard filter at 7dayshop. Bonus!

In other news, I took some photos around the garden and subsequently played with a few of them in Photoshop. Since I haven't posted a photo on this blog in a while, here's a sample:


(As usual, click the image for a larger version, scaled 50% from the original)

Monday, 16 June 2008

Fourier Transform of a Cat

I took the Fourier Transform of a cat...

But it wasn't very useful, so I took the Fourier Discrete Cosine Transform (DCT) of one instead, and that wasn't much use either, but when I DCT'd back I got a ghost cat.

And now, by way of explanation, here's what I really mean...

I loaded a photograph of a tiger into Mathematica, pulled out the grey values of each pixel, then Fourier transformed the data. The Mathematica discrete Fourier transform produces (as you'd expect) complex results from real data, so it was pretty much useless for rasterising back to graphic format.

Fortunately, Mathematica provides a means to obtain real results from a Fourier Transform, by performing Discrete Cosine Transforms (or Discrete Sine Transforms, DST). Doing this allowed me to transform the data corresponding to greyscale pixel values for the tiger image into a Fourier transform of the tiger. The result was mostly black pixels; not hugely interesting. The original greyscale, and the Fourier DCT image are shown below.





In theory, no information is lost when something undergoes a Fourier Transform. This means that transforming again (i.e. performing a reverse transformation) should yield the original data. Since I had to do a DCT in order to get real pixel values, there was a very slight loss of information, and a second DCT revealed a ghost cat:



UPDATE (July 16, 2008):
I've uploaded the Mathematica notebook (with the output stripped to keep it small, you'll have to re-evaluate the cells, but I provided the original tiger.jpg in the ZIP file too!). It's available at http://www.physical-thought.com/files/fourier-cat.zip

UPDATE (April 19, 2009):
The Mathematica notebook linked above is no longer available online. If you really want the file, you'll have to e-mail me and ask. I'll leave it as an exercise for the reader to find my e-mail address.

Sunday, 1 June 2008

Gravitational Potential

Last night I was playing with Mathematica, visualising perturbations to the gravitational potential of one body by the presence of another (e.g. for binary stars, star-planet and planet-moon systems). I did this by writing a gravitational potential function, using

V(r)=\frac{G M}{r}

Actually, I allowed my Mathematica function to take arguments beyond just a distance r... Firstly I split r into x and y, using the relationship
r^2=x^2+y^2
, then allowed the mass M to be parameterised as well as an offset from the origin, since
r^2=(x-a)^2+(y-b)^2
describes a circle of radius r centred on (a,b).

Finally, I used the fact that potentials can be added to generate the true potential of interacting bodies, i.e.
V_\mathrm{total}=V_\mathrm{star}+V_\mathrm{planet}
. With these functions defined, and a few basic quantities such as the mass of the sun (2*10^30 kg) and the distance in metres corresponding to 1 AU (149,598,000,000 metres), I was able to use the Mathematica ContourPlot[] function to plot equipotential lines for various two-body potentials. I was looking for particularly interesting perturbations to the otherwise circularly symmetric (spherically, in 3D) potential of a "point mass". The approximation of a point mass works perfectly as long as we're dealing only with space outside of the true radius of the object, and on the length scales we're looking at here, that is definitely the case.

I've included some of the more interesting equipotential plots below, including the parameters used to plot them. Remember, you can click any of the images to see a bigger version.

Potential around the Sun: MSun at (0,0) on a scale from -10AU to +10AU on both x and y axes


Earth-Sun System: MSun at (0,0) and MEarth at (1AU,0). x-axis from 0.8 AU to 1.2 AU, y-axis from -1 AU to 1 AU


Earth-Moon System: MEarth at (0,0) and MMoon at (384000km,0). Both axes from -2 to +2 times the Earth-Moon separation (384,000 km)


Sun-Jupiter System: MSun at(0,0), MJupiter at (5.15 AU,0). Axes x from 4 AU to 6 AU, y from -2 AU to +2 AU


HD209458
One of the most "famous" exoplanetary systems, HD209458b is a Hot Jupiter orbiting the star HD209458 (note that the 'b' added to the star name gives the planet name, i.e. the planet is the second object (discovered) in the system). HD209458 has a mass of 1.01 MSun, and the planet HD209458b has mass around 0.69 MJupiter. The separation of the two bodies is 0.045 AU.
Axes from 0.02 AU to 0.07 AU on the x, -0.05 AU to +0.05 AU on the y


HD41004B
Another exoplanet system, this time HD41004B is 0.4 MSun and the planet, HD41004Bb is 18.4 MJupiter. The separation is 0.0177 AU and the axes run from -0.03 AU to +0.03 AU on both x and y


Cygnus X
Cygnus X is a 30 solar mass star and an 8.3 solar mass black hole. The separation is 0.2 AU. I plotted this one with the star at -Separation/2, the black hole at Separation/2 in the range -0.25 AU to +0.25 AU on both x and y axes


PSR B1913+16
A binary pulsar (the first one detected, I think). A 1.441 Solar mass neutron star and a 1.387 Solar mass companion star. Separation 0.0130356021 AU. Axes from -0.02 AU to +0.02 AU on both x and y


I also tried plotting some 3D equipotential surfaces, but the results weren't quite so impressive. Here's the equipotential surface for the Cygnus X system in 3D, range -0.5 AU to 0.5 AU in all three dimensions:


After this, I tried to visualise some of the general relativistic potentials (i.e. corrections to the standard Newtonian potential to take into account general relativistic effects) but didn't get any of them to produce nice visualisations, yet!

Sunday, 25 May 2008

Producing Neutrinos

Neutrinos are produced only in the Weak interaction. This is because they possess no electric charge (so do not couple to the electromagnetic interaction) and no colour charge (so they don't couple to the strong interaction). They do possess a small amount of mass, so they should be affected by gravity, but we tend to ignore gravity in particle physics because it is much weaker than the other three interactions. There are three mechanisms for neutrino production, described in the Feynman diagrams below for electron-neutrinos (similar diagrams exist for the muon and tau for the W+ and W- versions, and the Z0 version can create all three neutrino flavours, by producing neutrino/anti-neutrino pairs).

Production of a charged electron and an anti-neutrino (to conserve lepton number):


Production of a charged positron (anti-electron) and a neutrino:



Production of neutral neutrino/anti-neutrino pair:



The W+, W- and Z0 are all boson (fields) of the Weak interaction. The Weak interaction has what is called a (V-A) structure, which describes the parity transformation of Weak currents.

The parity operation
\hat{P}
inverts space, so that (x, y, z) --> (-x, -y, -z). Vectors pick up a minus sign under parity, e.g. momentum p (and velocity, v):

\begin{align*}
\mathbf{p}=m\mathbf{v} \\
\mathbf{v}=\mathbf{\dot{r}}\\
\Rightarrow \hat{P}\mathbf{v} = -\mathbf{\dot{r}} = -\mathbf{v}\\
\Rightarrow \hat{P}\mathbf{p} = -m\mathbf{v} = -\mathbf{p}
\end{align*}

The V in (V-A) represents some component of the current which acts like a vector under parity. As for the A, this represents an axial vector (or pseudovector). An axial vector is one which retains its sign under parity, an example is the spin of a particle:

\begin{align*}
\mathbf{\sigma} = \mathbf{r}\times\mathbf{p} \\
\hat{P}\mathbf{\sigma} = -\mathbf{r}\times-\mathbf{p}\\
= \mathbf{r}\times\mathbf{p}\\
= \mathbf{\sigma}
\end{align*}


It is no coincidence that I chose to describe the vector and axial transformations under parity using momentum and spin as examples. Indeed, there is a quantity called Helicity, which is a projection of the spin of a particle along the momentum vector of the particle:

\hat{h} = \frac{\mathbf{\sigma}\cdot\mathbf{p}}{|\mathbf{p}|}

The helicity operator returns eigenvalues +1 (spin aligned with momentum) and -1 (spin anti-aligned with momentum). The +1 state is termed Right-Handed, and the -1 Left-Handed. Helicity commutes with the Hamiltonian, i.e.

\left[\hat{h},\hat{H}\right] = \hat{h}\hat{H}-\hat{H}\hat{h} = 0

which means that helicity states are not affected by motion of the particle (for an example of states which do not commute with the Hamiltonian, see my discussion on neutrino oscillations, particularly that weak flavour eigenstates are not (necessarily) the eigenstates of the Hamiltonian).

There is, however, a problem with helicity. It is possible to make a Lorentz boost (i.e. a relativistic transformation) into an inertial frame where the spin and momentum go from aligned to anti-aligned, or vice versa. This means that helicity is not Lorentz invariant. Lorentz invariance is rather useful when we're talking about relativistic particles, so we define the concept of chirality. A particle is chiral if you cannot superimpose it with its "mirror image". As an example, your left hand is a parity-transformed version of your right hand. Holding the two facing the same direction, it is impossible to superimpose the two and regain the same shape (you can do it by turning one hand around, which is analogous to the parity transformation).

Chirality and helicity are identical only for massless particles, since a massless particle must travel at the speed of light, and no matter how you Lorentz boost, you're not going to make it look any different (constancy of the speed of light in all inertial frames is one of the fundamental postulates of relativity). Chirality is Lorentz invariant, but it is not an eigenstate of the Hamiltonian (except for massless particles, in which case it is identical to Helicity). The chirality operator is
\gamma^5
.

Now, you may be wondering what all this has to do with neutrino interactions. It turns out that the V-A structure of the weak interaction can be represented in terms of gamma matrices. The electromagnetic interaction has a current which looks something like:

J^\mu_{em} = \bar{e}\gamma^\mu e

The Weak interaction, by contrast, has a current which looks something like:

J^\mu_{W} = \bar{e}\frac{1}{2}\gamma^\mu(1-\gamma^5)e

Here, the quantity
\gamma^\mu
transforms as a vector (V) under parity, and
\gamma^5
transforms as an axial vector (A) under parity. This provides the Weak interaction with the desired Vector - Axial (V-A) structure.

An important consequence of this comes from our definition of the chirality operator as
\gamma^5
. Since this operator exists within the Weak current, only left-handed particle states (or right-handed antiparticle states) couple to the Weak interaction, which means that neutrinos can only be produced as left-handed neutrinos (their spin is in the opposite direction to their momentum) or right-handed anti-neutrinos (their spin is in the same direction as their momentum).

Actually, it's a little more complicated than this; the neutral current (Z0) couples to right-handed charged leptons too, but doesn't couple to right-handed neutrinos. This is because the Z0 is actually a combination of two other neutral bosons and provides coupling to the electromagnetic interaction as well as weak interactions. This forms the basis of electroweak unification, and is (for now) beyond the scope of this blog.

If you'd like to know more about neutrinos, I can highly recommend the following:
Neutrino Physics, K. Zuber (Taylor & Francis, 2004) [Amazon]

For more about the V-A structure, interaction currents and electroweak unification, I refer you to the two excellent books by Aitchison and Hey on gauge theories:
Gauge Theories in Particle Physics (Vol. I), I. J. R. Aitchison and A. J. G. Hey (Taylor & Francis, 2003) [Amazon]
Gauge Theories in Particle Physics (Vol. II), I. J. R. Aitchison and A. J. G. Hey (Taylor & Francis, 2004) [Amazon]