The references for this lecture are here.

Given a system characterised by the Hamiltonian H, and an approximate wave function ψ for the ground state of the system, we can evaluate the following quantity

R_{c} = [∫ ψ* H ψ dr] / [∫ ψ* ψ dr],        (6.1)

which is known as the Rayleigh quotient. If you revise the material given
in Lecture 1 you will immediately realise that
the Rayleigh quotient is just the expectation value of the
Hamiltonian operator, or in other words, the energy. However, because
ψ is only an approximate wave function,
the Rayleigh quotient gives only an approximation to the exact energy.
The variational principle states simply that the Rayleigh quotient
provides a value R_{c} which is *never smaller* than the
exact energy of the ground state, i.e.

R_{c} ≥ E_{exact},        (6.2)

the equality occurring if and only if ψ is the exact wave function.

This inequality is crucially important, because usually we do not know what the exact wave function is; all we have is an approximation to it. But via the variational principle we know where we are, i.e., we know that our approximate wave function gives an energy that approaches the exact result from above, and never from below. Therefore, any improvement we make on the wave function can only reduce the value of the Rayleigh quotient (i.e. our approximate energy) and get us closer to the exact result.
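The inequality (6.2) can be checked numerically. The sketch below is an illustration written for these notes, not part of the original lecture: it discretises a one-dimensional harmonic oscillator Hamiltonian (an assumed model) on a grid, then evaluates the Rayleigh quotient for a deliberately too-wide Gaussian trial function. The quotient lands above the exact ground-state energy, as the variational principle demands.

```python
import numpy as np

# Assumed model (not from the lecture): 1D harmonic oscillator
# H = -(1/2) d^2/dx^2 + (1/2) x^2 in atomic units, discretised on a
# grid with the three-point finite-difference Laplacian.
n, L = 801, 16.0
x = np.linspace(-L / 2, L / 2, n)
h = x[1] - x[0]

main = np.full(n, 1.0 / h**2) + 0.5 * x**2    # kinetic + potential diagonal
off = np.full(n - 1, -0.5 / h**2)             # kinetic off-diagonal
H = np.diag(main) + np.diag(off, 1) + np.diag(off, -1)

E_exact = np.linalg.eigvalsh(H)[0]            # ground state, close to 0.5

# A deliberately imperfect trial function (the exact one is exp(-x^2/2)).
psi = np.exp(-x**2 / 4.0)
R = (psi @ H @ psi) / (psi @ psi)             # Rayleigh quotient, Eq. (6.1)

assert R > E_exact                            # Eq. (6.2): R_c >= E_exact
```

For this trial function the quotient comes out near 0.625, comfortably above the exact value of 0.5; narrowing the Gaussian towards exp(-x²/2) drives the quotient down towards the exact energy, never below it.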

The variational principle is easy to demonstrate for the ground state, and you can find the proof in any Quantum Chemistry, Quantum Mechanics or Physical Chemistry text book. A particularly clear derivation is given by Thijssen, chapter 3. Follow this demonstration yourself via problem 6.4.1.

The variational principle is important not only because it tells us that an approximate wave function always gives an energy higher than the exact one; it also tells us how to improve our approximate wave functions. Imagine that we have a basis set {φ_{n}} in which we wish to expand the wave function ψ, namely:

ψ = Σ_{n} c_{n} φ_{n}.        (6.3)

Then the Rayleigh quotient would be written as

R_{c} = [Σ_{m} Σ_{n} c_{m}* c_{n} H_{mn}] / [Σ_{m} Σ_{n} c_{m}* c_{n} S_{mn}],        (6.4)

where

H_{mn} = ∫ φ_{m}* H φ_{n} dr,   and   S_{mn} = ∫ φ_{m}* φ_{n} dr,        (6.5)

are the Hamiltonian and overlap matrix elements respectively.

We said earlier
that the variational principle tells us that the approximate energy is always
larger than the exact energy. Because this is so, we can change the wave
function in such a way that we reduce R_{c} to the smallest possible
value. Then we will be as close as we can be, given the basis set chosen, to
the exact wave function and energy. Having expressed the wave function as
a linear combination of the functions in the basis set, the only way to
change the wave function is by modifying the expansion coefficients. The
condition that the Rayleigh quotient R_{c} be a minimum with respect
to the values of the expansion coefficients is that the partial derivatives
of the quotient with respect to each coefficient be all equal to zero. In
other words:

Σ_{n} (H_{mn} − E S_{mn}) c_{n} = 0   for all m.        (6.6)

These constitute a set of N linear equations in the unknown coefficients
c_{n}, where N is the size of the basis set.

This type of set of linear equations constitutes an eigenvalue problem.
The equations only have a solution for certain allowed values of the
energy E, the eigenvalues of the system. For each allowed value
E_{i} there is a non-trivial solution, i.e. a set of values of
the coefficients c_{n}^{(i)}, which gives the best
approximation to the wave function of state i within this basis
set. This set of equations can be written more compactly in
matrix notation as Hc = ESc, where c is a vector of length N whose
elements are the coefficients c_{n}.
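In practice the generalised eigenvalue problem Hc = ESc is handed to a library routine. A minimal sketch, using illustrative 2×2 matrices invented for this example (SciPy's `scipy.linalg.eigh` accepts the overlap matrix as its second argument):

```python
import numpy as np
from scipy.linalg import eigh

# Illustrative 2x2 Hamiltonian and overlap matrices (made-up numbers);
# S must be symmetric positive definite, as a genuine overlap matrix is.
H = np.array([[-1.0, -0.2],
              [-0.2, -0.5]])
S = np.array([[ 1.0,  0.1],
              [ 0.1,  1.0]])

E, c = eigh(H, S)        # solves Hc = ESc; eigenvalues in ascending order

# Each column of c is the coefficient vector for one state.
for i in range(len(E)):
    assert np.allclose(H @ c[:, i], E[i] * S @ c[:, i])
```

The lowest eigenvalue E[0] is the variational estimate of the ground-state energy within this basis, and the first column of c holds the corresponding expansion coefficients.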

There are standard algorithms for solving the eigenvalue problem, but we will not go into details here, as this is a mathematical question rather than a physical one. In practice this is done by the computer, and all one needs to worry about is the construction of the Hamiltonian matrix (and, if the basis set is not orthonormal, also the overlap matrix). This is then fed into a mathematical library routine which returns the eigenvalues and the eigenvectors. Examples of this procedure are provided in the practical computer sessions, from the first session onwards. So now is a good time to look at this material again to be sure you have understood the structure of the programs used.

In a periodic crystal, the one-electron wave functions take the form

ψ_{k}(r) = e^{i k·r} u(r),        (6.7)

by virtue of Bloch's theorem. Here e^{i k·r} is a plane wave
of wave vector **k**, and u(**r**) is a periodic function having the
same periodicity as the crystal. In practice, because at
the atomic scale crystals are very large, the components of the wave
vector vary essentially continuously, but since the plane wave is a periodic
function of the wave vector, the wave functions
ψ_{k} and ψ_{k+K} will be the same
whenever **K** is a reciprocal lattice vector. Remember, from
lecture 3, that the symbols **K**, **G** and
**g** are all used for reciprocal lattice vectors by different authors.

A large crystal consists of very many atoms, with even more electrons,
and this is clearly an insoluble problem if tackled directly. But now
Bloch's theorem comes to the rescue, because it tells us that we do not
need to consider the whole crystal; rather we can focus only on a unit cell, and
bring in **k**-space to account for the presence of the remaining, translationally
symmetric, part of the crystal. But we must still solve the
problem for the unit cell at different values of the **k** wave vector, and
in principle we ought to do this an infinite number of times! This does not
sound very good either, but fortunately we do not need to solve the problem for every
single **k** vector of the infinite set in the reciprocal lattice. The
reason is this: nearby **k** vectors have very similar wave functions,
and those wave functions have very similar energies.

So, we typically solve the problem on a finite grid of **k** vectors, and then
interpolate between our solutions to get a good approximation to the infinite
crystal solution. Indeed in some formulations, such as the tight binding method
discussed in section 6.3, the form of the
band structure is analytic; the question then reduces to how many coefficients,
multiplying these analytic functions, are included. Similar remarks apply to the
moment expansion methods discussed by Sutton and Pettifor. In that case one can
relate the low-order moments of the local density of states to
the number and strength of the overlap integrals between near neighbours.
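The grid-plus-interpolation idea can be sketched as follows. The cosine "band" here is a stand-in, invented for this example, for whatever diagonalisation would actually be done at each **k** point; the point is only that nearby **k** vectors give similar energies, so a coarse sampling suffices.

```python
import numpy as np

# Stand-in for solving the eigenvalue problem at a given k (made-up band).
def solve_at_k(k):
    return -2.0 * np.cos(k)

# Solve on a coarse grid of k points across the Brillouin zone...
k_coarse = np.linspace(-np.pi, np.pi, 9)
E_coarse = solve_at_k(k_coarse)

# ...then interpolate to a fine grid instead of re-solving everywhere.
k_fine = np.linspace(-np.pi, np.pi, 201)
E_interp = np.interp(k_fine, k_coarse, E_coarse)

# The interpolated band tracks the "exact" one closely.
max_err = np.max(np.abs(E_interp - solve_at_k(k_fine)))
assert max_err < 0.2
```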

Because the wave functions depend on the wave vector **k**, the energies
of the crystal are also functions of **k**. Each value of **k** will
correspond to an energy value within a band of allowed energies in the
crystal. One frequently comes across plots of the variation of the different
allowed energies as a function of the wave vector. These plots are known as
*band structure* diagrams. In Figure 6.1 we show a schematic
example of a band structure plot, and how it is related to the different
electronic bands in the crystal.

Figure 6.1: Schematic diagram of how the band structure of molecules and solids
(E(**k**)) arises when the starting atomic states are s and p electrons. In
this case the band structure corresponds to a semiconductor, with a small gap
between valence p-derived states and conduction s-derived states.

The band structure of a solid reflects the features of the potential that is felt by the electrons as they move through the crystal. This is illustrated in Figure 6.2, where we represent the band structures of two model one-dimensional solids, which differ only in the depth of the potential wells.

Figure 6.2: Schematic diagram showing the difference between band structures with weak (upper) and strong (lower) potential wells. Note the difference in curvature of the lowest band, and in the splitting between the upper bands.

As can be clearly seen, in the deep periodic potential, the low lying states
are hardly affected at all by the presence of other wells nearby, and as
a consequence there is very little *dispersion* of the corresponding
band, which is essentially flat. These are core levels. However, for the
bound states in the shallow periodic potential, and also for the valence
states of the deep potential, dispersion is substantial, which indicates that
these states actually split into wide bands of allowed energy states. In
between allowed (narrow or wide) bands there are regions of forbidden energy,
or *band gaps*.

You are all familiar with the classification of solids into metals,
semiconductors and insulators. Now you can easily see where this classification
comes from, and what it means in terms of the electronic
properties of the crystal in question. Metals have a valence band which is filled up to
the Fermi level, and the unoccupied states are easily accessible, e.g. on
application of an electric field.
Pure semiconductors have filled bands, but only a small band gap, typically
of order 1 eV, i.e. a few tens of *kT* at room temperature (where *k* is
Boltzmann's constant and *T* is the absolute temperature). Thus it is not so difficult to promote
electrons from the occupied states to the conduction band, and then one can
obtain a conducting crystal, even if the material would not conduct at *T* = 0.
Real devices of course use impurities to manipulate the Fermi level.
Finally, insulators are materials for which the band gap between the highest
occupied band and the conduction band is too large to allow for the easy
promotion of electrons to the conduction states, and thus it is extremely
difficult to make these materials conduct.

These broad classifications do, of course, hide a lot of subtlety. For example,
in the 1D diagrams shown in this section, a diatomic 'metal' from the second
column of the periodic table would have the valence band full; thus the reason
why Mg is a metal is that the E(**k**) curve is different in different directions.
This raises the question of how such metals arise, and how big a cluster has to
be before it shows 'metallic' behaviour. Similar considerations arise in respect of
other properties, e.g. magnetism. The important point at this stage is to realise that
such questions can be *posed*. Their *solution* is often difficult, and
is usually the subject of ongoing research.

In Pettifor, the LCAO method first appears (chapter 3, page 51) in the context of molecules, the same context as we have used here in lecture 1, section 1.2 and more specifically section 1.3. A full treatment of the TB method is given in chapter 7, pages 173-178, but only after all the relevant terms for molecular overlap, or hopping, integrals have been given. Jumping into chapter 7 may therefore require some prior work on chapters 3 and 4. In particular, one needs to understand the notation of σ, π and δ bonds, explained on pages 66-74.

The general feature of the TB method is already clear from the 1D case of a line or
ring of atoms. When overlaps are allowed only between nearest neighbours, the energy
depends on *k* as cos(ka), i.e.

E(k) = α + 2β cos(ka),        (6.8)

where -π/*a* < *k* < π/*a* and *a* is the spacing along the chain.
This says that *k* is confined to the first Brillouin zone.
Note also that α is the self-energy term and
β the overlap energy, exactly like the terms V and W in
lecture 1, section 1.2.2.
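The nearest-neighbour band of Eq. (6.8) is easy to evaluate directly. The sketch below uses illustrative values for α, β and *a* (invented for this example, not from the lecture) and samples the band across the first Brillouin zone; the total band width comes out as 4|β|.

```python
import numpy as np

# Illustrative parameters for the 1D nearest-neighbour chain (made up).
alpha, beta, a = 0.0, -1.0, 1.0

k = np.linspace(-np.pi / a, np.pi / a, 101)   # first Brillouin zone
E = alpha + 2.0 * beta * np.cos(k * a)        # Eq. (6.8)

# With beta < 0 the band minimum sits at the zone centre (k = 0) and the
# maximum at the zone boundary; the band width is 4|beta|.
assert np.isclose(E.max() - E.min(), 4.0 * abs(beta))
```

This makes the physical content of β explicit: the stronger the overlap between neighbours, the wider the band.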

When more overlaps are allowed, e.g. with second nearest neighbours at distance 2*a*,
the term in β becomes a sum of different
βs, each with its own value, and corresponding cosine
terms with argument (2*ka*), etc.

Generating corresponding series for 2D and 3D geometries of course involves the
reciprocal lattice and the vector **k**, but is otherwise analogous. Thus the
simplest 2D or 3D case, for centro-symmetric crystals, has an energy structure

E(**k**) = α + 2 Σ_{R} β_{R} cos(**k**·**R**),        (6.9)

where the neighbouring atoms are found at positions +**R** and -**R** with respect to the
atom under consideration. Thus the band structure depends on the
crystal structure via **R**, and the energy is a function of both the magnitude and
the direction of **k**. It also depends on the type of bonding, which determines the
values of the constants β. Some of these points are
illustrated in the following two lectures, in the context of the LCAO bands of silicon.
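As a sketch of Eq. (6.9), take a simple cubic crystal with lattice constant a: each atom has six nearest neighbours at ±**R** along the Cartesian axes, all sharing a single β, so the band is E(**k**) = α + 2β(cos k_x a + cos k_y a + cos k_z a). The crystal choice and parameter values below are illustrative only.

```python
import numpy as np

# Illustrative parameters (made up) for a simple cubic nearest-neighbour band.
alpha, beta, a = 0.0, -1.0, 1.0

# One +R vector per neighbour pair; the -R partner supplies the factor of 2.
R_vectors = a * np.eye(3)

def band_energy(k):
    """Eq. (6.9) with a single beta for all six nearest neighbours."""
    return alpha + 2.0 * beta * np.sum(np.cos(R_vectors @ k))

gamma = np.zeros(3)                            # zone centre
corner = np.array([np.pi, np.pi, np.pi]) / a   # zone corner

assert np.isclose(band_energy(gamma), alpha + 6.0 * beta)
assert np.isclose(band_energy(corner), alpha - 6.0 * beta)
```

Evaluating the same function along different directions of **k** shows explicitly that the energy depends on direction as well as magnitude, which is the origin of the direction-dependent E(**k**) curves invoked above for Mg.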
