The references for this lecture are here. Note that this lecture needs the Symbol font enabled on your browser.
Given a system characterised by the Hamiltonian H, and an approximate wave function ψ for the ground state of the system, we can evaluate the following quantity
\[ R_c = \frac{\langle \psi | H | \psi \rangle}{\langle \psi | \psi \rangle} \]
which is known as the Rayleigh quotient. If you revise the material given in Lecture 1 you will immediately realise that the Rayleigh quotient is just the expectation value of the Hamiltonian operator, or in other words, the energy. However, because ψ is only an approximate wave function, the Rayleigh quotient gives only an approximation to the exact energy. The variational principle states simply that the Rayleigh quotient provides a value Rc which is never lower than the exact energy of the ground state, E0, i.e.
\[ R_c \ge E_0 \]
the equality occurring if and only if ψ is the exact ground-state wave function.
This inequality is crucially important, because usually we do not know what the exact wave function is; all we have is an approximation to it. But via the variational principle we know where we are, i.e., we know that our approximate wave function gives an energy that approaches the exact result from above, and never from below. Therefore, any improvement we make on the wave function can only reduce the value of the Rayleigh quotient (i.e. our approximate energy) and get us closer to the exact result.
The variational principle is easy to demonstrate for the ground state, and you can find the proof in any Quantum Chemistry, Quantum Mechanics or Physical Chemistry textbook. A particularly clear derivation is given by Thijssen, chapter 3. Follow this demonstration yourself via problem 6.4.1.
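As a pointer for that problem, the standard argument can be sketched in one line. It assumes that the exact eigenstates ψi of H, with energies E0 ≤ E1 ≤ ..., form a complete orthonormal set; the expansion coefficients a_i are notation introduced only for this sketch.

\[
\psi = \sum_i a_i \psi_i
\quad\Longrightarrow\quad
\frac{\langle \psi | H | \psi \rangle}{\langle \psi | \psi \rangle}
= \frac{\sum_i |a_i|^2 E_i}{\sum_i |a_i|^2}
\;\ge\; \frac{\sum_i |a_i|^2 E_0}{\sum_i |a_i|^2} = E_0 .
\]

Equality requires a_i = 0 for every state with E_i > E0, i.e. the trial function must itself be a ground-state wave function.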
The variational principle is important not only because it tells us that an approximate wave function never gives an energy lower than the exact one, but also because it tells us how to improve our approximate wave functions. Imagine that we have a basis set {φn} in which we wish to expand the wave function ψ, namely:
\[ \psi = \sum_{n=1}^{N} C_n \phi_n \]
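Then the Rayleigh quotient would be written as
\[ R_c = \frac{\sum_{n,m} C_n^{*} C_m H_{nm}}{\sum_{n,m} C_n^{*} C_m S_{nm}} \]
where
\[ H_{nm} = \langle \phi_n | H | \phi_m \rangle , \qquad S_{nm} = \langle \phi_n | \phi_m \rangle \]
are the Hamiltonian and overlap matrix elements respectively.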
We said earlier that the variational principle tells us that the approximate energy is never lower than the exact energy. Because this is so, we can change the wave function in such a way that we reduce Rc to the smallest possible value. Then we will be as close as we can be, given the basis set chosen, to the exact wave function and energy. Having expressed the wave function as a linear combination of the functions in the basis set, the only way to change the wave function is by modifying the expansion coefficients. The condition that the Rayleigh quotient Rc be a minimum with respect to the values of the expansion coefficients is that the partial derivatives of the quotient with respect to each coefficient all be equal to zero. In other words:
\[ \frac{\partial R_c}{\partial C_n^{*}} = 0 \quad\Longrightarrow\quad \sum_{m=1}^{N} \left( H_{nm} - E\, S_{nm} \right) C_m = 0 , \qquad n = 1, \dots, N \]
where E is the value of Rc at the stationary point. These constitute a set of N linear equations in the unknown coefficients Cn, where N is the size of the basis set.
A set of linear equations of this type constitutes a (generalised) eigenvalue problem. The equations only have non-trivial solutions for certain allowed values of the energy E, the eigenvalues of the system. For each allowed value Ei there is a non-trivial solution, i.e. a set of values of the coefficients Cn(i), which gives the best approximation to the wave function of state i within this basis set. This set of equations can be written in a more compact form using matrix notation as HC = ESC, where H and S are the N × N Hamiltonian and overlap matrices, and C is a vector of length N, with each element being one of the Cn coefficients.
There are standard algorithms for solving the eigenvalue problem, but we will not go into details here, as this is a mathematical question rather than a physical one. In practice this is done by the computer, and all one needs to worry about is the construction of the Hamiltonian (and if the basis set is not orthonormal, also the overlap) matrix. This is then fed into a mathematical library routine which returns the eigenvalues and also the eigenvectors. Examples of this procedure are being provided by the practical computer sessions, from the first session onwards. So now is a good time to look again at this material to be sure you have understood the structure of the programs used.
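To make the procedure concrete, here is a minimal sketch in Python. It is not one of the course programs: the test problem (a particle in an infinite well with walls at x = ±1, in units where H = -d²/dx²), the polynomial basis φn(x) = xⁿ(1 - x²) and all function names are assumptions chosen for illustration. The library routine used is `scipy.linalg.eigh`, which solves HC = ESC when given both matrices.

```python
import numpy as np
from numpy.polynomial import Polynomial
from scipy.linalg import eigh


def basis_function(n):
    """Assumed basis: phi_n(x) = x^n (1 - x^2), which vanishes at x = +1 and -1."""
    return Polynomial([0.0] * n + [1.0]) * Polynomial([1.0, 0.0, -1.0])


def integrate_over_well(p):
    """Exact integral of a polynomial over the well, -1 <= x <= 1."""
    antiderivative = p.integ()
    return antiderivative(1.0) - antiderivative(-1.0)


def build_matrices(n_basis):
    """Hamiltonian (H = -d^2/dx^2) and overlap matrices in the assumed basis."""
    phi = [basis_function(n) for n in range(n_basis)]
    H = np.array([[integrate_over_well(-phi[n] * phi[m].deriv(2))
                   for m in range(n_basis)] for n in range(n_basis)])
    S = np.array([[integrate_over_well(phi[n] * phi[m])
                   for m in range(n_basis)] for n in range(n_basis)])
    return H, S


def variational_energies(n_basis):
    """Solve the generalised eigenvalue problem H C = E S C."""
    H, S = build_matrices(n_basis)
    energies, coefficients = eigh(H, S)  # eigenvalues returned in ascending order
    return energies, coefficients


if __name__ == "__main__":
    exact_ground_state = np.pi ** 2 / 4  # exact result for this well
    for n_basis in (1, 3, 5):
        energies, _ = variational_energies(n_basis)
        print(n_basis, energies[0], exact_ground_state)
```

With a single basis function the estimate is 2.5, compared with the exact π²/4 ≈ 2.467; enlarging the basis lowers the estimate towards the exact value from above, as the variational principle requires.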
In a crystal, the electronic wave functions can be written in the form
\[ \psi_{\mathbf{k}}(\mathbf{r}) = e^{i \mathbf{k} \cdot \mathbf{r}}\, u(\mathbf{r}) \]
in virtue of Bloch's theorem. Here exp(ik·r) is a plane wave of wave vector k, and u(r) is a periodic function having the same periodicity as the crystal. In practice, because crystals are very large on the atomic scale, the components of the wave vector vary essentially continuously. However, since exp(iK·R) = 1 for every lattice vector R, the wave functions ψk and ψk+K describe the same state whenever K is a reciprocal lattice vector. Remember, from lecture 3, that the symbols K, G and g are all used for reciprocal lattice vectors by different authors.
A large crystal consists of very many atoms, with even more electrons, and this is clearly an insoluble problem if tackled directly. But now Bloch's theorem comes to the rescue, because it tells us that we do not need to consider the whole crystal; rather we can focus only on a unit cell, and bring in k-space to account for the presence of the remaining, translationally symmetric, part of the crystal. But we must still solve the problem for the unit cell at different values of the wave vector k, and in principle we ought to do this an infinite number of times! This does not sound very good either, but fortunately we do not need to solve the problem for every single k vector of the infinite set in reciprocal space. The reason is this: nearby k vectors have very similar wave functions, and those wave functions have very similar energies.
So, we typically solve the problem on a finite grid of k vectors, and then interpolate between our solutions to get a good approximation to the infinite crystal solution. Indeed in some formulations, such as the tight binding method discussed in section 6.3, the form of the band structure is analytic; the question then reduces to how many coefficients, multiplying these analytic functions, are included. Similar remarks apply to the moment expansion methods discussed by Sutton and Pettifor. In that case one can relate the low order moments of the local density of states to the number and strength of the overlap integrals between near neighbours.
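The idea of sampling on a grid and interpolating can be tried out on a toy problem. The sketch below is only an illustration: the band `model_band` is a hypothetical smooth function of k (lattice spacing set to 1), standing in for energies that would in practice come from solving the unit-cell problem at each k, and simple linear interpolation is an assumed choice.

```python
import numpy as np


def model_band(k):
    """Hypothetical smooth band, periodic in k with period 2*pi (spacing a = 1)."""
    return -2.0 * np.cos(k) - 0.5 * np.cos(2.0 * k)


def interpolation_error(n_k, n_fine=2001):
    """Largest error made by linear interpolation from n_k + 1 grid points."""
    k_grid = np.linspace(-np.pi, np.pi, n_k + 1)   # coarse grid over the zone
    e_grid = model_band(k_grid)                    # "solved" energies on the grid
    k_fine = np.linspace(-np.pi, np.pi, n_fine)    # where the band is wanted
    e_interp = np.interp(k_fine, k_grid, e_grid)
    return np.max(np.abs(e_interp - model_band(k_fine)))


# Error of the interpolated band for increasingly dense k grids.
errors = {n_k: interpolation_error(n_k) for n_k in (8, 16, 64)}

if __name__ == "__main__":
    for n_k, err in errors.items():
        print(n_k, err)
```

Because neighbouring k vectors have similar energies, the error falls quickly as the grid is refined, so a modest number of k points is usually enough for a smooth band.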
Because the wave functions depend on the wave vector k, the energies of the crystal are also functions of k. Each value of k will correspond to an energy value within a band of allowed energies in the crystal. One frequently comes across plots of the variation of the different allowed energies as a function of the wave vector. These plots are known as band structure diagrams. In Figure 6.1 we show a schematic example of a band structure plot, and how it is related to the different electronic bands in the crystal.
Figure 6.1: Schematic diagram of how the band structure of molecules and solids (E(k)) arises, when the starting atomic states are s- and p-electrons. In this case the band structure corresponds to a semiconductor, with a small gap between valence p-derived states and conduction s-derived states.
The band structure of a solid reflects the features of the potential that is felt by the electrons as they move through the crystal. This is illustrated in Figure 6.2, where we represent the band structures of two model one-dimensional solids, which differ only in the depth of the potential wells.
Figure 6.2: Schematic diagram showing the difference between band structures with weak (upper) and strong (lower) potential wells. Note the difference in curvature of the lowest band, and in the splitting between the upper bands.
As can be clearly seen, in the deep periodic potential, the low lying states are hardly affected at all by the presence of other wells nearby, and as a consequence there is very little dispersion of the corresponding band, which is essentially flat. These are core levels. However, for the bound states in the shallow periodic potential, and also for the valence states of the deep potential, dispersion is substantial, which indicates that these states actually split into wide bands of allowed energy states. In between allowed (narrow or wide) bands there are regions of forbidden energy, or band gaps.
You are all familiar with the classification of solids into metals, semiconductors and insulators. Now you can easily see where this classification comes from, and what it means in terms of the electronic properties of the crystal in question. Metals have a valence band which is filled only up to the Fermi level, so that the unoccupied states are easily accessible, e.g. on application of an electric field. Pure semiconductors have filled bands, but a small band gap exists, which is of the order of several kT (where k is Boltzmann's constant and T is the absolute temperature). Thus it is not so difficult to promote electrons from the occupied states to the conduction band, and then one can obtain a conducting crystal, even if the material would not conduct at T = 0. Real devices of course use impurities to manipulate the Fermi level. Finally, insulators are materials for which the band gap between the highest occupied band and the conduction band is too large to allow for the easy promotion of electrons to the conduction states, and thus it is extremely difficult to make these materials conduct.
These broad classifications do, of course, hide a lot of subtlety. For example, in the 1D diagrams shown in this section, a diatomic 'metal' from the second column of the periodic table would have the valence band full; thus the reason why Mg is a metal is that the E(k) curve is different in different directions. This raises the question of how such metals arise, and how big a cluster has to be before it shows 'metallic' behaviour. Similar considerations arise in respect of other properties, e.g. magnetism. The important point at this stage is to realise that such questions can be posed. Their solution is often difficult, and is usually the subject of ongoing research.
In Pettifor, the LCAO method first appears (chapter 3, page 51) in the context of molecules, in the same context as we have used here in lecture 1, section 1.2 and more specifically section 1.3. A full treatment of the TB method is given in chapter 7, pages 173-178, but only after all the relevant terms for molecular overlap, or hopping, integrals have been given. Jumping into chapter 7 may therefore require some prior work on chapters 3 and 4. In particular, one needs to understand the notation of σ, π and δ bonds, explained on pages 66-74.
The general feature of the TB method is already clear from the 1D case of a line or ring of atoms. When overlaps are allowed only between nearest neighbours, the energy depends on k as cos(ka), i.e.
\[ E(k) = \alpha + 2 \beta \cos(ka) \]
where -π/a < k < π/a and a is the spacing along the chain. This says that k is confined to the first Brillouin zone. Note also that α is the self-energy term, and β the overlap energy, entirely as the terms V and W in lecture 1, section 1.2.2.
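The nearest-neighbour band E(k) = α + 2β cos(ka), with α the self-energy and β the overlap energy, can be checked numerically for a finite ring. The sketch below is an illustration under stated assumptions: one orbital per atom, an orthonormal basis, and periodic boundary conditions, so that the allowed wave vectors are k = 2πm/(Na). The function names are hypothetical.

```python
import numpy as np


def ring_hamiltonian(n_atoms, alpha, beta):
    """Nearest-neighbour tight-binding matrix for a ring of n_atoms atoms."""
    H = alpha * np.eye(n_atoms)
    for i in range(n_atoms):
        j = (i + 1) % n_atoms      # right-hand neighbour, wrapping round the ring
        H[i, j] += beta
        H[j, i] += beta
    return H


def analytic_band(n_atoms, alpha, beta, a=1.0):
    """Allowed k values, folded into the first Brillouin zone, and E(k)."""
    m = np.arange(n_atoms)
    k = 2.0 * np.pi * m / (n_atoms * a)
    k = (k + np.pi / a) % (2.0 * np.pi / a) - np.pi / a   # fold into [-pi/a, pi/a)
    return k, alpha + 2.0 * beta * np.cos(k * a)


if __name__ == "__main__":
    n_atoms, alpha, beta = 12, 0.0, -1.0
    numerical = np.linalg.eigvalsh(ring_hamiltonian(n_atoms, alpha, beta))
    k, analytic = analytic_band(n_atoms, alpha, beta)
    print(np.allclose(numerical, np.sort(analytic)))
```

The eigenvalues of the ring matrix coincide with the cosine band sampled at the allowed k values; as the ring grows, these points fill in the continuous curve.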
When more overlaps are allowed, e.g. with second nearest neighbours at distance 2a, then the term in β becomes a sum of different βs, each with its own value, and corresponding cosine terms with argument (2ka), etc.
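Written out, and labelling the overlap energy with the j-th neighbour (at distance ja) as β_j, a notation assumed here for illustration, the series takes the form

\[
E(k) = \alpha + 2\beta_1 \cos(ka) + 2\beta_2 \cos(2ka) + \dots
     = \alpha + 2 \sum_{j \ge 1} \beta_j \cos(jka) ,
\]

which reduces to the nearest-neighbour result when only β_1 is non-zero.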
Generating the corresponding series for 2D and 3D geometries of course involves the reciprocal lattice and the vector k, but is otherwise analogous. Thus the simplest 2D or 3D case, for centro-symmetric crystals, has an energy structure
\[ E(\mathbf{k}) = \alpha + 2 \sum_{\mathbf{R}} \beta_{\mathbf{R}} \cos(\mathbf{k} \cdot \mathbf{R}) \]
where the atom neighbours are found at positions +R and -R with respect to the atom under consideration, and the sum counts each such pair once. Thus the band structure depends on the crystal structure via R, and the energy is a function of both the magnitude and the direction of k. It also depends on the type of bonding, which determines the values of the constants β. Some of these points are illustrated in the following two lectures, in the context of the LCAO bands of silicon.
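The dependence on the direction of k is easy to see in a small example. The sketch below assumes a square lattice of spacing a = 1 with a single overlap energy β between nearest neighbours; the function `tb_energy` and the neighbour list are hypothetical names. It sums exp(ik·R) over all neighbours, which for neighbours occurring in ±R pairs is the same as the cosine sum 2β Σ cos(k·R) over pairs.

```python
import numpy as np

# Nearest neighbours of the square lattice (spacing a = 1), in +R / -R pairs.
SQUARE_NEIGHBOURS = np.array([[1.0, 0.0], [-1.0, 0.0],
                              [0.0, 1.0], [0.0, -1.0]])


def tb_energy(k, neighbours, beta, alpha=0.0):
    """E(k) = alpha + beta * sum_R exp(i k.R), summed over all listed neighbours.

    For a centro-symmetric neighbour list the imaginary parts cancel, leaving
    alpha + 2 * beta * sum over pairs of cos(k.R).
    """
    k = np.asarray(k, dtype=float)
    R = np.asarray(neighbours, dtype=float)
    phases = np.exp(1j * (R @ k))
    return alpha + beta * phases.sum().real


if __name__ == "__main__":
    beta = -1.0
    # Same magnitude of k, two different directions in the square lattice.
    along_axis = tb_energy([1.0, 0.0], SQUARE_NEIGHBOURS, beta)
    along_diagonal = tb_energy([np.sqrt(0.5), np.sqrt(0.5)], SQUARE_NEIGHBOURS, beta)
    print(along_axis, along_diagonal)
```

The two energies differ although |k| is the same, which is the direction dependence referred to above. Replacing the neighbour list changes the crystal structure, and changing β changes the bonding strength.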