Schrödinger Equation (1)

Daniel Fleisch, A Student's Guide to the Schrödinger Equation (Cambridge University Press, 2020)

1. Vectors and Functions

According to the author, students in his quantum mechanics course find it helpful to see a combination of Erwin Schrödinger's wave mechanics approach, Werner Heisenberg's matrix mechanics approach, and Paul Dirac's bra and ket notation.

          1.2 Dirac Notation
A quantum wavefunction can be expressed as a weighted combination of basis wavefunctions. A generalized version of the dot or scalar product, called the inner product, can be used to calculate how much each component wavefunction contributes to the sum, and this determines the probability of various measurement outcomes.
   The notation was developed by Paul Dirac in 1939, while he was working with this generalized version of the dot product, written as ⟨A∣B⟩. In this context "generalized" means that the inner product can be used with higher-dimensional abstract vectors with complex components. Dirac realized that the inner product bracket ⟨A∣B⟩ could be conceptually divided into two pieces, a left half which he called a "bra" and a right half which he called a "ket". In conventional notation, an inner product between vectors 𝐴̄ and 𝛣̄ might be written as 𝐴̄ ⋅ 𝛣̄ or (𝐴̄, 𝛣̄), but in Dirac notation the inner product is written as
    (1.12)      Inner product of ∣𝐴⟩ and ∣𝛣⟩ = ⟨𝐴∣ times ∣𝛣⟩ = ⟨𝐴∣𝛣⟩.                            
    (1.13)      ∣𝐴⟩ = column matrix(𝐴x  𝐴y  𝐴z) .                                     
    (1.14)      ⟨𝐴∣ = (𝐴*x  𝐴*y  𝐴*z),
where the components with a superscript * are complex conjugates, whose explicit form is given in Section 1.4. The inner product ⟨𝐴∣𝛣⟩ is
    (1.15)      ⟨𝐴∣ times ∣𝛣⟩ = ⟨𝐴∣𝛣⟩ = (𝐴*x  𝐴*y  𝐴*z) column matrix(𝛣x  𝛣y  𝛣z),
    (1.16)      ⟨𝐴∣𝛣⟩ = (𝐴*x  𝐴*y  𝐴*z) column matrix(𝛣x  𝛣y  𝛣z) = 𝐴*x𝛣x + 𝐴*y𝛣y + 𝐴*z𝛣z.
   Since we will be dealing with generalized vectors, instead of the components 𝐴x, 𝐴y, 𝐴z and the Cartesian unit vectors 𝑖̄, 𝑗̄, 𝑘̄, we will use components 𝐴1, 𝐴2, ... 𝐴𝑁 and basis vectors 𝜖̄1, 𝜖̄2, ... 𝜖̄𝑁. So the equation
    (1.17-18)      ∣𝐴⟩ = 𝐴x∣𝑖⟩ + 𝐴y∣𝑗⟩ + 𝐴z∣𝑘⟩     becomes     ∣𝐴⟩ = 𝐴1∣𝜖1⟩ + 𝐴2∣𝜖2⟩ + ∙∙∙ + 𝐴𝑁∣𝜖𝑁⟩ = ∑_{𝑖=1}^{𝑁} 𝐴𝑖∣𝜖𝑖⟩.
   A bra, on the other hand, is a "linear functional" (also called a "covector" or a "one-form") that combines with a ket to produce a scalar; mathematicians say bras map vectors to the field of scalars. A linear functional is essentially a mathematical device (or an instruction) that operates on another object. Hence a bra operates on a ket, and the result is a scalar. Bras don't inhabit the same vector space as kets; they live in their own vector space, called the "dual space" to the space of kets. Within their space, bras can be added together and multiplied by scalars to produce new bras.
   One reason that the space is called "dual" to the space of kets is that for every ket there is a corresponding bra, and when a bra operates on its corresponding (dual) ket, the scalar result is the square of the norm of the ket:
              ⟨𝐴∣𝐴⟩ = (𝐴*1  𝐴*2  . . .  𝐴*𝑁) column matrix(𝐴1  𝐴2  . . .  𝐴𝑁) = ∣𝐴̄∣².
   The bra ⟨𝐴∣ corresponds to the ket ∣𝐴⟩, but the components of the bra ⟨𝐴∣ are the complex conjugates of the components of ∣𝐴⟩ (complex conjugation is reviewed in Section 1.4). So a bra is a device for turning a vector (ket) into a scalar: the bra ⟨𝐴∣ corresponding to ∣𝐴⟩, combined with a ket ∣𝛣⟩, produces the scalar given in Eq. (1.16).
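   As a rough illustration of Eq. (1.16), the sketch below treats a ket as a plain Python list of complex components and builds the bra by conjugating each component. The helper names `bra` and `inner` and the sample kets are assumptions made for this example, not notation from the book.

```python
def bra(ket):
    """Return the components of the bra for a ket: each component conjugated."""
    return [component.conjugate() for component in ket]


def inner(ket_a, ket_b):
    """Inner product <A|B> as in Eq. (1.16): sum over i of conj(A_i) * B_i."""
    return sum(a_conj * b for a_conj, b in zip(bra(ket_a), ket_b))


# Illustrative kets (hypothetical values), written as lists of components.
ket_a = [1 + 2j, 3j, 2]
ket_b = [2, 1 - 1j, 1j]

braket_ab = inner(ket_a, ket_b)   # <A|B>, a complex scalar
norm_sq_a = inner(ket_a, ket_a)   # <A|A>, the squared norm of |A>
```

   Note that `norm_sq_a` comes out with zero imaginary part, as expected for the square of a norm.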

          1.3 Abstract Vectors and Functions
To understand the use of bras and kets in quantum mechanics, it's necessary to generalize the concepts of vector components and basis vectors to functions. Instead of attempting to replicate three-dimensional physical space as in Fig. 1.4a, simply line up the vector components along the horizontal axis of a two-dimensional graph, with the vertical axis representing the amplitude of the components, as in Fig. 1.4b.
   Higher-dimensional abstract spaces turn out to be very useful tools for solving problems in several areas of physics, including classical and quantum mechanics. These spaces are called "abstract" because they are nonphysical; that is, their dimensions don't represent the physical dimensions of the universe we inhabit.
   Now imagine drawing a set of axes in an abstract space and marking each axis with the values of a parameter. In this way we can graph the components of an N-dimensional vector. The multidimensional space most useful in quantum mechanics is an abstract vector space called "Hilbert space" after the German mathematician David Hilbert.
   To understand the characteristics of Hilbert space, imagine a vector with an extremely large number of components. The components may then be treated as a continuous function rather than a set of discrete values, and the function (call it "𝑓") is depicted as the curvy line connecting the tips of the vector components. Such functions can be added together and multiplied by scalars, just like vectors: two functions 𝑓(𝑥) and 𝑔(𝑥) are added at every 𝑥, and multiplying 𝑓(𝑥) by a scalar scales its value at every 𝑥.
   Finally, the inner product between two functions 𝑓(𝑥) and 𝑔(𝑥) is as follows
    (1.19)      ⟨𝑓(𝑥)∣𝑔(𝑥)⟩ = ∫_{-∞}^{∞} 𝑓*(𝑥)𝑔(𝑥) 𝑑𝑥,
where 𝑓*(𝑥) represents the complex conjugate as in Eq. (1.16). The reason for taking the complex conjugate is explained in the next section.
   There is one more condition that must be satisfied before we can call that vector space a Hilbert space. That condition is that the functions must have a finite norm:
    (1.20)      ∣𝑓(𝑥)∣² = ⟨𝑓(𝑥)∣𝑓(𝑥)⟩ = ∫_{-∞}^{∞} 𝑓*(𝑥)𝑓(𝑥) 𝑑𝑥 < ∞.
Such functions are said to be "square summable" or "square integrable". Thus continuous functions have a generalized "length" and "direction" and obey the rules of vector addition, scalar multiplication, and the inner product. Hilbert space is a collection of such functions that also have finite norm.
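   The finite-norm condition of Eq. (1.20) can be checked numerically. The sketch below is only an approximation under stated assumptions: the infinite limits are replaced by the interval [-10, 10], the integral is estimated with the trapezoid rule, and the Gaussian test function is chosen for illustration (it is not an example from the book).

```python
import math


def inner(f, g, a=-10.0, b=10.0, n=4000):
    """Trapezoid-rule estimate of <f|g> = integral of conj(f(x)) * g(x) dx.

    The finite interval [a, b] stands in for (-inf, inf), so this is only
    adequate for functions that are negligible outside that interval.
    """
    h = (b - a) / n
    total = 0.0
    for k in range(n + 1):
        x = a + k * h
        weight = 0.5 if k in (0, n) else 1.0   # end points count half
        total += weight * complex(f(x)).conjugate() * g(x)
    return total * h


def gaussian(x):
    """Illustrative square-integrable function exp(-x^2 / 2)."""
    return math.exp(-x * x / 2)


norm_sq = inner(gaussian, gaussian)   # analytic value is sqrt(pi)
```

   Because the result is finite, this Gaussian qualifies as a member of the Hilbert space described above.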

          1.4 Complex Numbers, Vectors, and Functions
One important difference between the Schrödinger equation and the classical wave equation is the presence of the imaginary unit "𝑖" (the square root of minus one). So this section contains a short review of complex numbers and their use in the context of vector components and Dirac notation.
   The most general way of representing a complex quantity 𝑧 is
    (1.21)      𝑧 = 𝑥 + 𝑖𝑦, where 𝑥 is the real part, 𝑖 = √(-1) and 𝑦 is the imaginary part.
   Imaginary numbers lie along a different number line, perpendicular to the real number line, and a two-dimensional plot of both number lines represents the "complex plane" shown in Fig. 1.7.
   The magnitude ∣𝑧∣ of a complex number satisfies
    (1.22)      ∣𝑧∣² = 𝑥² + 𝑦².
   To find the magnitude of a complex number, it's necessary to multiply the quantity not by itself, but by its complex conjugate. The complex conjugate of 𝑧 = 𝑥 + 𝑖𝑦, usually indicated by an asterisk, is
    (1.24)      𝑧* = 𝑥 - 𝑖𝑦.
So we can obtain the magnitude as follows:
    (1.25)      ∣𝑧∣² = 𝑧 ⨯ 𝑧* = (𝑥 + 𝑖𝑦) ⨯ (𝑥 - 𝑖𝑦) = 𝑥² - 𝑥𝑖𝑦 + 𝑖𝑦𝑥 + 𝑦² = 𝑥² + 𝑦².
And since the magnitude (or norm) of a vector 𝐴̄ can be found by taking the square root of the inner product of the vector with itself, the complex conjugate is built into the process of taking the inner product between complex quantities:
    (1.26)      ∣𝐴̄∣ = √(𝐴̄ ⋅ 𝐴̄) = √(𝐴*x𝐴x + 𝐴*y𝐴y + 𝐴*z𝐴z) = √(∑_{𝑖=1}^{𝑁} 𝐴*𝑖 𝐴𝑖).
This also applies to complex functions:
    (1.27)      ∣𝑓(𝑥)∣ = √(⟨𝑓(𝑥)∣𝑓(𝑥)⟩) = √(∫_{-∞}^{∞} 𝑓(𝑥)*𝑓(𝑥) 𝑑𝑥).
If the inner product involves two different vectors or functions, by convention the complex conjugate is taken of the first member of the pair:
    (1.28)      𝐴̄ ⋅ 𝐵̄ = ∑_{𝑖=1}^{𝑁} 𝐴*𝑖 𝐵𝑖      ⟨𝑓(𝑥)∣𝑔(𝑥)⟩ = ∫_{-∞}^{∞} 𝑓(𝑥)*𝑔(𝑥) 𝑑𝑥.
   The requirement to take the complex conjugate of one member of the inner product for complex vectors means that the order matters. So 𝐴̄ ⋅ 𝐵̄ is not, in general, the same as 𝐵̄ ⋅ 𝐴̄. That's because
    (1.29)      𝐴̄ ⋅ 𝐵̄ = ∑_{𝑖=1}^{𝑁} 𝐴*𝑖 𝐵𝑖 = ∑_{𝑖=1}^{𝑁} (𝐴𝑖 𝐵*𝑖)* = ∑_{𝑖=1}^{𝑁} (𝐵*𝑖 𝐴𝑖)* = (𝐵̄ ⋅ 𝐴̄)*
                   ⟨𝑓(𝑥)∣𝑔(𝑥)⟩ = ∫_{-∞}^{∞} 𝑓(𝑥)*𝑔(𝑥) 𝑑𝑥 = ∫_{-∞}^{∞} [𝑔(𝑥)*𝑓(𝑥)]* 𝑑𝑥 = (⟨𝑔(𝑥)∣𝑓(𝑥)⟩)*.
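   A small numerical check of Eqs. (1.25) and (1.29) may help here. The sketch uses Python's built-in complex type; the sample values `z`, `A` and `B` and the helper `inner` are assumptions chosen for this example.

```python
def inner(a, b):
    """Inner product of Eq. (1.28): conjugate the first member, then sum."""
    return sum(x.conjugate() * y for x, y in zip(a, b))


# Eq. (1.25): the squared magnitude is z times its complex conjugate.
z = 3 + 4j
magnitude_sq = (z * z.conjugate()).real   # x^2 + y^2 = 9 + 16

# Illustrative complex vectors (hypothetical values).
A = [1 + 1j, 2j]
B = [3, 1 - 1j]

ab = inner(A, B)   # A . B
ba = inner(B, A)   # B . A, the complex conjugate of A . B per Eq. (1.29)
```

   Swapping the order changes the sign of the imaginary part only, which is exactly what Eq. (1.29) states.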

          1.5 Orthogonal Functions  
For vectors, the concept of orthogonality is straightforward: two vectors are orthogonal if their scalar product is zero. Similar considerations apply to 𝑁-dimensional vectors as well as continuous functions, as shown in Fig. 1.9. So if the abstract vectors 𝐴̄ and 𝐵̄ are orthogonal, their inner product (𝐴̄, 𝐵̄) must equal zero:
               (𝐴̄, 𝐵̄) = ∑_{𝑖=1}^{𝑁} 𝐴*𝑖 𝐵𝑖 = 0.
Likewise, two functions 𝑓(𝑥) and 𝑔(𝑥) are orthogonal if
               ⟨𝑓(𝑥)∣𝑔(𝑥)⟩ = ∫_{-∞}^{∞} 𝑓(𝑥)*𝑔(𝑥) 𝑑𝑥 = 0.
As we will see in section 2.5, orthogonal basis functions play an important role in determining the possible outcomes of measurements.
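   To make the orthogonality condition for functions concrete, the following sketch estimates the overlap integral of an even and an odd function numerically. The two test functions, the finite integration interval and the trapezoid rule are all assumptions made for illustration.

```python
import math


def inner(f, g, a=-10.0, b=10.0, n=2000):
    """Trapezoid-rule estimate of the integral of conj(f(x)) * g(x) over [a, b]."""
    h = (b - a) / n
    xs = [a + k * h for k in range(n + 1)]
    values = [complex(f(x)).conjugate() * g(x) for x in xs]
    return h * (sum(values) - 0.5 * (values[0] + values[-1]))


def even(x):
    return math.exp(-x * x / 2)        # symmetric about x = 0


def odd(x):
    return x * math.exp(-x * x / 2)    # antisymmetric about x = 0


overlap = inner(even, odd)         # vanishes up to rounding: orthogonal
self_overlap = inner(even, even)   # clearly nonzero
```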

          1.6 Finding Components Using the Inner Product
The components of vectors can be written as the scalar product of each unit vector with the vector:
    (1.30)      𝐴x = 𝑖̄ ⋅ 𝐴̄     𝐴y = 𝑗̄ ⋅ 𝐴̄     𝐴z = 𝑘̄ ⋅ 𝐴̄
which can be concisely written as
    (1.31)      𝐴𝑖 = 𝜖̄𝑖 ⋅ 𝐴̄     𝑖 = 1, 2, 3,
in which 𝜖̄1 represents 𝑖̄, 𝜖̄2 represents 𝑗̄, and 𝜖̄3 represents 𝑘̄.
   This can be generalized to find the components of an 𝑁-dimensional abstract vector represented by the ket ∣𝐴⟩ in a basis system with orthogonal basis vectors 𝜖̄1, 𝜖̄2, . . . 𝜖̄𝑁:
    (1.32)      𝐴𝑖 = (𝜖̄𝑖 ⋅ 𝐴̄)/∣𝜖̄𝑖∣² = ⟨𝜖𝑖∣𝐴⟩/⟨𝜖𝑖∣𝜖𝑖⟩.
Notice that the basis vectors in this case are orthogonal but they don't necessarily have unit length. The process of dividing by the square of the norm of a vector or function is called "normalization", and orthogonal vectors or functions that have a length of one unit are "orthonormal". The condition of orthonormality for basis vectors 𝜖̄ is often written as
    (1.34)      𝜖̄𝑖 ⋅ 𝜖̄𝑗 = ⟨𝜖̄𝑖∣𝜖̄𝑗⟩ = 𝛿𝑖𝑗,
in which 𝛿𝑖𝑗 represents the Kronecker delta, which has a value of one if 𝑖 = 𝑗 or zero if 𝑖 ≠ 𝑗.
   The expansion of a vector as a weighted combination of a set of basis vectors, and the use of the normalized scalar product to find the vector's components for a specified basis, can be extended to the functions of Hilbert space. So the expansion of function ∣𝜓⟩ using basis functions ∣𝜓𝑛⟩ is
    (1.35)      ∣𝜓⟩ = 𝑐1∣𝜓1⟩ + 𝑐2∣𝜓2⟩ + ⋅ ⋅ ⋅ + 𝑐𝑁∣𝜓𝑁⟩ = ∑_{𝑛=1}^{𝑁} 𝑐𝑛∣𝜓𝑛⟩,
in which 𝑐𝑛 tells you the "amount" of basis function ∣𝜓𝑛⟩ in ∣𝜓⟩. As long as the basis functions ∣𝜓1⟩, ∣𝜓2⟩ . . . ∣𝜓𝑁⟩ are orthogonal, the components 𝑐1, 𝑐2 . . . 𝑐𝑁 can be found using the normalized inner product:
                    𝑐1 = ⟨𝜓1∣𝜓⟩/⟨𝜓1∣𝜓1⟩ = ∫_{-∞}^{∞} 𝜓*1(𝑥)𝜓(𝑥) 𝑑𝑥 / ∫_{-∞}^{∞} 𝜓*1(𝑥)𝜓1(𝑥) 𝑑𝑥
    (1.36)      𝑐2 = ⟨𝜓2∣𝜓⟩/⟨𝜓2∣𝜓2⟩ = ∫_{-∞}^{∞} 𝜓*2(𝑥)𝜓(𝑥) 𝑑𝑥 / ∫_{-∞}^{∞} 𝜓*2(𝑥)𝜓2(𝑥) 𝑑𝑥
                    𝑐𝑁 = ⟨𝜓𝑁∣𝜓⟩/⟨𝜓𝑁∣𝜓𝑁⟩ = ∫_{-∞}^{∞} 𝜓*𝑁(𝑥)𝜓(𝑥) 𝑑𝑥 / ∫_{-∞}^{∞} 𝜓*𝑁(𝑥)𝜓𝑁(𝑥) 𝑑𝑥.
   This approach to finding the components of a function (using sinusoidal basis functions) was pioneered by the French mathematician and physicist Jean-Baptiste Fourier (1768-1830). In quantum mechanics texts, this process is sometimes called "spectral decomposition", since the weighting coefficients (𝑐𝑛) are called the "spectrum" of a function.
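   The normalized inner product of Eq. (1.32) can be tried out on a small finite-dimensional case. In the sketch below the basis is orthogonal but deliberately not normalized; the basis, the vector `A` and the helper names are assumptions chosen for illustration.

```python
def inner(a, b):
    """Inner product: conjugate the first member, then sum the products."""
    return sum(x.conjugate() * y for x, y in zip(a, b))


def components(vector, basis):
    """Components in an orthogonal basis, A_i = <e_i|A> / <e_i|e_i> (Eq. 1.32)."""
    return [inner(e, vector) / inner(e, e) for e in basis]


# Orthogonal basis whose vectors have length sqrt(2), not 1 (hypothetical choice).
basis = [[1, 1], [1, -1]]
A = [1, 3]

coeffs = components(A, basis)

# Rebuild the vector from the weighted sum of basis vectors, as in Eq. (1.35).
rebuilt = [sum(c * e[k] for c, e in zip(coeffs, basis)) for k in range(len(A))]
```

   Dividing by ⟨𝜖𝑖∣𝜖𝑖⟩ is what allows the weighted sum to reproduce the original vector even though the basis vectors are not of unit length.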

2. Operators and Eigenfunctions

In quantum mechanics, every physical observable is associated with a linear "operator" that can be used to determine the possible measurement outcomes and their probabilities for a given quantum state.

          2.1 Operators, Eigenvectors, and Eigenfunctions
   The operators in quantum mechanics are called "linear" because applying them to a sum of vectors or functions gives the same result as applying them to the individual vectors or functions and then summing the results. So if O is a linear operator (in quantum texts operators are most commonly written with a caret "hat") and 𝑓1 and 𝑓2 are functions, then
    (2.1)      O(𝑓1 + 𝑓2) = O(𝑓1) + O(𝑓2).
Linear operators also have the property that multiplying a function by a scalar and then applying the operator gives the same result as first applying the operator and then multiplying the result by the scalar. So if O is a linear operator, 𝑐 is a (potentially complex) scalar and 𝑓 is a function, then
    (2.2)      O(𝑐𝑓) = 𝑐O(𝑓).
   To understand the operators used in quantum mechanics, we will consider an operator as a square matrix multiplied by a vector like this
    (2.3)      matrix𝑅𝐴̄ = matrix[row-1(𝑅11   𝑅12) row-2(𝑅21  𝑅22)] column matrix(𝐴1  𝐴2) = column matrix[(𝑅11𝐴1 + 𝑅12𝐴2)  (𝑅21𝐴1 + 𝑅22𝐴2)].
   Consider, for example,
                 matrix𝑅 = matrix[row-1(4   -2) row-2(-2  4)]
and the vector 𝐴̄ = 𝑖̄ + 3𝑗̄. Writing the components of 𝐴̄ as a column vector and multiplying gives
    (2.4)      matrix𝑅𝐴̄ = matrix[row-1(4   -2) row-2(-2  4)] column matrix(1  3) = column matrix[(4)(1) + (-2)(3)  (-2)(1) + (4)(3)] = column matrix(-2  10).
So the operation of 𝑅matrix on vector 𝐴̄ produces another vector that has a different length and points in a different direction.
   Now consider the effect of 𝑅matrix on a different vector - for example 𝐵̄ = 𝑖̄ + 𝑗̄ shown in Fig. 2.2a.
In this case, the multiplication looks like this:
    (2.5)      matrix𝑅𝐵̄ = matrix[row-1(4   -2) row-2(-2  4)] column matrix(1  1) = column matrix[(4)(1) + (-2)(1)  (-2)(1) + (4)(1)] = column matrix(2  2)
                 = 2 column matrix(1  1) = 2𝐵̄.
   A vector for which the direction is not changed after multiplication by a matrix is called an "eigenvector" of that matrix, and the factor by which the length of the vector is scaled is called the "eigenvalue" for that eigenvector. So, as shown in Fig. 2.2b, the vector 𝐵̄ = 𝑖̄ + 𝑗̄ is an eigenvector of 𝑅matrix with an eigenvalue of 2.
   Eq. (2.5) is an example of an "eigenvalue equation"; the general form is
    (2.6)      𝑅matrix𝐴̄ = 𝜆𝐴̄,
where 𝐴̄ represents an eigenvector of 𝑅matrix with eigenvalue 𝜆.
   The procedure for determining the eigenvalues and eigenvectors of a matrix is not difficult, and we omit it here. Working through that process for 𝑅matrix shows that the vector 𝐶̄ = 𝑖̄ - 𝑗̄ is also an eigenvector of 𝑅matrix; its eigenvalue is 6.
   It is worth noting that the sum of the eigenvalues of a matrix is equal to the trace of the matrix (which is 8 in this case) and the product of the eigenvalues is equal to the determinant of the matrix (which is 12 in this case).
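   The numbers quoted above for 𝑅matrix can be reproduced with a few lines of plain Python. This is only a sketch: the helper `matvec` is a name introduced here, and the matrix and vectors are the ones from Eqs. (2.4) and (2.5).

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a column vector (list)."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


R = [[4, -2],
     [-2, 4]]

A = [1, 3]    # not an eigenvector: R changes its direction
B = [1, 1]    # eigenvector with eigenvalue 2
C = [1, -1]   # eigenvector with eigenvalue 6

RA = matvec(R, A)
RB = matvec(R, B)
RC = matvec(R, C)

trace = R[0][0] + R[1][1]                            # sum of the eigenvalues
determinant = R[0][0] * R[1][1] - R[0][1] * R[1][0]  # product of the eigenvalues
```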
   There are mathematical processes that act as operators on functions to produce new functions, and if the new function is a scalar multiple of the original function, that function is called an "eigenfunction" of the operator. The eigenfunction equation corresponding to the eigenvector equation (Eq. 2.6) is
    (2.7)      O𝜓 = 𝜆𝜓,
in which 𝜓 represents an eigenfunction of operator O with eigenvalue 𝜆.
   For example, apply the second-derivative operator Ĝ2 = 𝑑²/𝑑𝑥² to the function 𝑓(𝑥) = sin 𝑘𝑥:
    (2.9)      Ĝ2𝑓(𝑥) = 𝑑²(sin 𝑘𝑥)/𝑑𝑥² = -𝑘² sin 𝑘𝑥 = 𝜆(sin 𝑘𝑥).
That means that sin 𝑘𝑥 is an eigenfunction of the second-derivative operator Ĝ2 = 𝑑²/𝑑𝑥², and the eigenvalue for this eigenfunction is 𝜆 = -𝑘².
   In quantum mechanics, the eigenvalues of the operator associated with an observable represent the possible outcomes of measurements of that observable.
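   Eq. (2.9) can be checked numerically by approximating the second derivative with a central finite difference. The step size `h`, the wavenumber `k = 2` and the sample points are assumptions made for this sketch; the finite difference is an approximation, so the ratio is only close to -𝑘², not exactly equal.

```python
import math


def second_derivative(f, x, h=1e-3):
    """Central finite-difference approximation to d^2 f / dx^2 at x."""
    return (f(x + h) - 2 * f(x) + f(x - h)) / (h * h)


k = 2.0


def f(x):
    return math.sin(k * x)


# If sin(kx) is an eigenfunction, (d^2 f / dx^2) / f is the same constant
# (the eigenvalue -k^2) wherever f(x) is nonzero.
sample_points = [0.3, 1.0, 2.5]
ratios = [second_derivative(f, x) / f(x) for x in sample_points]
```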

          2.2 Operators in Dirac Notations
It's helpful to become familiar with the way operators fit into Dirac notation. Using that notation makes the general eigenvalue equation look like this:
    (2.10)      O ∣𝜓⟩ = 𝜆 ∣𝜓⟩
in which ket ∣𝜓⟩ is called an "eigenket" of operator O.
   Now consider what happens when you form the inner product of ket ∣𝜙⟩ with both sides of this equation:
              (∣𝜙⟩ , O ∣𝜓⟩) = (∣𝜙⟩ , 𝜆 ∣𝜓⟩).
Because the first member of an inner product becomes a bra, this is
    (2.11)      ⟨𝜙∣ O ∣𝜓⟩ = ⟨𝜙∣ 𝜆 ∣𝜓⟩.
Expressions like this, with an operator "sandwiched" between a bra and a ket, are extremely common (and useful) in quantum mechanics.
   Just as a matrix operating on a column vector gives another column vector, letting operator O work on ket ∣𝜓⟩ gives another ket, which we'll call ∣𝜓'⟩:
    (2.12,13)      O ∣𝜓⟩ = ∣𝜓'⟩.     ⟨𝜙∣ O ∣𝜓⟩ = ⟨𝜙∣𝜓'⟩.
This inner product is proportional to the projection of ket ∣𝜓'⟩ onto the direction of ket ∣𝜙⟩, and that projection is a scalar.
   To see how that works, consider an operator Â, which can be represented as a 2 ⨯ 2 matrix:
              matrix𝐴 = matrix[row-1(𝐴11  𝐴12)   row-2(𝐴21  𝐴22)]
in which the elements 𝐴11, 𝐴12, 𝐴21 and 𝐴22 (collectively referred to as 𝐴𝑖𝑗) depend on the basis system. For example, when operator Â is applied to each of the orthonormal basis vectors 𝜖1 and 𝜖2, represented by kets ∣𝜖1⟩ and ∣𝜖2⟩, the matrix elements determine the "amount" of each basis vector in the result:
    (2.14)     Â ∣𝜖1⟩ = 𝐴11∣𝜖1⟩ + 𝐴21∣𝜖2⟩     Â ∣𝜖2⟩ = 𝐴12∣𝜖1⟩ + 𝐴22∣𝜖2⟩.
Notice that it's the columns of Â that determine the amount of each basis vector. Now take the inner product of the first of these equations with the first basis ket ∣𝜖1⟩:
              ⟨𝜖1∣ Â ∣𝜖1⟩ = ⟨𝜖1∣ 𝐴11 ∣𝜖1⟩ + ⟨𝜖1∣ 𝐴21 ∣𝜖2⟩ = 𝐴11 ⟨𝜖1∣𝜖1⟩ + 𝐴21 ⟨𝜖1∣𝜖2⟩ = 𝐴11,
since ⟨𝜖1∣𝜖1⟩ = 1 and ⟨𝜖1∣𝜖2⟩ = 0 for an orthonormal basis system.
   Taking the inner product of the second equation in (2.14) with the first basis ket ∣𝜖1⟩ gives
              ⟨𝜖1∣ Â ∣𝜖2⟩ = ⟨𝜖1∣ 𝐴12 ∣𝜖1⟩ + ⟨𝜖1∣ 𝐴22 ∣𝜖2⟩ = 𝐴12 ⟨𝜖1∣𝜖1⟩ + 𝐴22 ⟨𝜖1∣𝜖2⟩ = 𝐴12.
   Forming the inner products of both equations in (2.14) with the second basis ket ∣𝜖2⟩ yields 𝐴21 = ⟨𝜖2∣ Â ∣𝜖1⟩ and 𝐴22 = ⟨𝜖2∣ Â ∣𝜖2⟩, so
    (2.15)      matrix𝐴 = matrix[row-1(⟨𝜖1∣ Â ∣𝜖1⟩  ⟨𝜖1∣ Â ∣𝜖2⟩)   row-2(⟨𝜖2∣ Â ∣𝜖1⟩  ⟨𝜖2∣ Â ∣𝜖2⟩)]
which can be written concisely as
    (2.16)      𝐴𝑖𝑗 = ⟨𝜖𝑖∣ Â ∣𝜖𝑗⟩.
   Here's an example. Consider the operator represented in the Cartesian coordinate system by matrix𝑅 = matrix[row-1(4  -2)   row-2(-2  4)]. Imagine that you're interested in determining the elements of the matrix representing that operator in the two-dimensional orthonormal basis system with basis vectors 𝜖̄1 = 1/√2 (𝑖̄ + 𝑗̄) = 1/√2 column matrix(1 1) and 𝜖̄2 = 1/√2 (𝑖̄ - 𝑗̄) = 1/√2 column matrix(1 -1). Using (2.16), the elements of 𝑅matrix in the (𝜖̄1, 𝜖̄2) basis are found as
              𝑅11 = ⟨𝜖1∣ R̂ ∣𝜖1⟩ = 2,     𝑅12 = ⟨𝜖1∣ R̂ ∣𝜖2⟩ = 0,     𝑅21 = ⟨𝜖2∣ R̂ ∣𝜖1⟩ = 0,     𝑅22 = ⟨𝜖2∣ R̂ ∣𝜖2⟩ = 6.
              matrix𝑅 = matrix[row-1(2  0)   row-2(0  6)]     in the 𝜖̄1, 𝜖̄2 basis.
Here the basis vectors 𝜖̄1 and 𝜖̄2 are the (normalized) eigenvectors of the matrix. When an operator matrix with nondegenerate eigenvalues (that is, no eigenvalue is shared by two or more eigenfunctions) is expressed using its eigenfunctions as basis functions, the matrix is diagonal (that is, all off-diagonal elements are zero), and the diagonal elements are the eigenvalues of the matrix.
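   The change of basis in this example can be reproduced by evaluating Eq. (2.16) element by element. The sketch below uses plain Python lists; the helper names `matvec` and `inner` are assumptions introduced here, and floating-point rounding means the results are only approximately 2, 0 and 6.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a column vector (list)."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def inner(a, b):
    """Inner product: conjugate the first member, then sum the products."""
    return sum(x.conjugate() * y for x, y in zip(a, b))


R = [[4, -2],
     [-2, 4]]

s = 1 / math.sqrt(2)
basis = [[s, s], [s, -s]]   # the orthonormal basis vectors of the example

# Eq. (2.16): element (i, j) is <e_i| R |e_j>.
R_new = [[inner(e_i, matvec(R, e_j)) for e_j in basis] for e_i in basis]
```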
   One additional bit of operator mathematics that you will encounter in quantum mechanics is "commutation". Two operators Ĝ and Ĥ are said to "commute" if the order of their application can be switched without changing the result. This can be written as
    (2.17)      Ĝ(Ĥ ∣𝜓⟩) = Ĥ(Ĝ ∣𝜓⟩)     if Ĝ and Ĥ commute
              Ĝ(Ĥ ∣𝜓⟩) - Ĥ(Ĝ ∣𝜓⟩) = 0     (ĜĤ - ĤĜ) ∣𝜓⟩ = 0.
The quantity in parentheses (ĜĤ - ĤĜ) is called the commutator of operators Ĝ and Ĥ and is commonly written as
    (2.18)      [Ĝ, Ĥ] = ĜĤ - ĤĜ.
So the bigger the change in the result caused by switching the order of operation, the bigger the commutator.
   The Heisenberg Uncertainty Principle limits the precision with which two observables whose operators do not commute may be simultaneously known.
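   For operators represented as matrices, the commutator of Eq. (2.18) is a matrix that can be computed directly. The three 2 ⨯ 2 matrices below are hypothetical examples chosen for this sketch (they do not come from the book), and `matmul` and `commutator` are helper names introduced here.

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def commutator(g, h):
    """[G, H] = GH - HG, as in Eq. (2.18)."""
    gh, hg = matmul(g, h), matmul(h, g)
    n = len(g)
    return [[gh[i][j] - hg[i][j] for j in range(n)] for i in range(n)]


# Illustrative matrices (assumed for this example).
G = [[0, 1], [1, 0]]
H = [[1, 0], [0, -1]]
D = [[2, 0], [0, 5]]

gh_comm = commutator(G, H)   # nonzero: G and H do not commute
hd_comm = commutator(H, D)   # zero matrix: two diagonal matrices commute
```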

          2.3 Hermitian Operators
An important characteristic of quantum operators may be understood by considering both sides of Eq. 2.11:
    (2.11)      ⟨𝜙∣ O ∣𝜓⟩ = ⟨𝜙∣ 𝜆 ∣𝜓⟩.
in which ∣𝜙⟩ and ∣𝜓⟩ represent quantum wavefunctions.
   Since the constant 𝜆 is outside both bra ⟨𝜙∣ and ket ∣𝜓⟩, the right side of (2.11) can be written as  
    (2.19)      ⟨𝜙∣ 𝜆 ∣𝜓⟩ = ⟨𝜙∣𝜓⟩ 𝜆 = 𝜆 ⟨𝜙∣𝜓⟩.
   The left side of (2.11) contains some interesting and useful concepts. We can think of this expression in either of two ways.
One is
              ⟨𝜙∣  →  O ∣𝜓⟩,
in which bra ⟨𝜙∣ is destined to run into the ket produced by O operating on ∣𝜓⟩.
   Alternatively, you can view Eq. 2.11 like this:
              ⟨𝜙∣ O  →  ∣𝜓⟩
in which bra ⟨𝜙∣ is operated upon by O and the result (another bra, remember) is destined to run into ket ∣𝜓⟩.
   In the first approach, if we like, we can move operator O right inside the ket brackets with the label 𝜓, making a new ket:
    (2.20)      O ∣𝜓⟩ = ∣O𝜓⟩,
in which all we are doing is changing the vector to which the ket refers, from 𝜓̄ to the vector produced by operating O on 𝜓̄. It's that new ket that forms an inner product with ⟨𝜙∣ in the expression ⟨𝜙∣ O ∣𝜓⟩.
   But when we move the operator O inside a bra, we must change the operator, and that change is called taking the "adjoint" of the operator, written as O†. So the process of moving operator O from outside to inside a bra looks like this:
    (2.21)      ⟨𝜓∣ O = ⟨O†𝜓∣.
So the bra ⟨O†𝜓∣ is the dual of ket ∣O†𝜓⟩.
   Finding the adjoint of an operator in matrix form is straightforward. Just take the complex conjugate of each element of the matrix, and then form the transpose of the matrix. If operator O has the matrix representation
    (2.22)      matrixO = matrix[row-1(𝑂11  𝑂12  𝑂13)  row-2(𝑂21  𝑂22  𝑂23)  row-3(𝑂31  𝑂32  𝑂33)]
then its adjoint O† is
    (2.23)      matrixO† = matrix[row-1(𝑂*11  𝑂*21  𝑂*31)  row-2(𝑂*12  𝑂*22  𝑂*32)  row-3(𝑂*13  𝑂*23  𝑂*33)].
If we apply the conjugate-transpose process to a column vector, we'll see that the Hermitian adjoint of a ket is the associated bra:
             ∣𝐴⟩ = column matrix(𝐴1  𝐴2  𝐴3)
             ∣𝐴⟩† = (𝐴*1  𝐴*2  𝐴*3) = ⟨𝐴∣.
If O transforms ket ∣𝜓⟩ into ket ∣𝜓'⟩, then O† transforms bra ⟨𝜓∣ into bra ⟨𝜓'∣. In equations this is
    (2.24)      O ∣𝜓⟩ = ∣𝜓'⟩     ⟨𝜓∣ O† = ⟨𝜓'∣.
   We should also be aware that it's perfectly acceptable to evaluate an expression such as ⟨𝜓∣ O without moving the operator inside the bra. So if ∣𝜓⟩, ⟨𝜓∣ and O are given by
             ∣𝜓⟩ = column matrix(𝜓1  𝜓2)     ⟨𝜓∣ = (𝜓*1  𝜓*2)     matrixO = [row-1(𝑂11  𝑂12),  row-2(𝑂21  𝑂22)]
then
    (2.25)      ⟨𝜓∣ O = (𝜓*1  𝜓*2) matrix[row-1(𝑂11  𝑂12)  row-2(𝑂21  𝑂22)] = [(𝜓*1𝑂11 + 𝜓*2𝑂21)  (𝜓*1𝑂12 + 𝜓*2𝑂22)],
which is the same result as ⟨O†𝜓∣:
    (2.26)      matrixO† = [row-1(𝑂*11  𝑂*21)  row-2(𝑂*12  𝑂*22)]
               ⟨O†𝜓∣ = ∣O†𝜓⟩† = (O† ∣𝜓⟩)† = [matrix{row-1(𝑂*11  𝑂*21)  row-2(𝑂*12  𝑂*22)} column matrix(𝜓1  𝜓2)]†
                          = column matrix[(𝜓1𝑂*11 + 𝜓2𝑂*21)  (𝜓1𝑂*12 + 𝜓2𝑂*22)]† = [(𝜓*1𝑂11 + 𝜓*2𝑂21)  (𝜓*1𝑂12 + 𝜓*2𝑂22)], in agreement with (2.25).
   So we can see the equivalence of the following expressions:
    (2.27)      ⟨𝜙∣ O ∣𝜓⟩ = ⟨𝜙∣O𝜓⟩ = ⟨O†𝜙∣𝜓⟩.
   Operators that equal their own adjoints are called "Hermitian"; that is their defining characteristic. So if O is a Hermitian operator, then
    (2.28)      O = O†     (Hermitian O).
Comparing Eqs. 2.22 and 2.23, we can see that for a Hermitian operator the diagonal elements must all be real and each off-diagonal element must equal the complex conjugate of the corresponding element on the other side of the diagonal.
   Thus if operator O equals its adjoint O†, then
    (2.29)      ⟨𝜙∣ O ∣𝜓⟩ = ⟨𝜙∣O𝜓⟩ = ⟨O†𝜙∣𝜓⟩ = ⟨O𝜙∣𝜓⟩,
which means that a Hermitian operator may be applied to either member of an inner product with the same result.
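   The conjugate-transpose recipe for the adjoint, and the statement that an operator acting on the ket equals its adjoint acting on the bra, can be checked on small matrices. In this sketch the matrices `O` and `Hm`, the kets `phi` and `psi`, and the helper names are all assumptions made for illustration.

```python
def adjoint(matrix):
    """Conjugate transpose: element (c, r) is the conjugate of element (r, c)."""
    rows, cols = len(matrix), len(matrix[0])
    return [[matrix[r][c].conjugate() for r in range(rows)] for c in range(cols)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a column vector (list)."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def inner(a, b):
    """Inner product: conjugate the first member, then sum the products."""
    return sum(x.conjugate() * y for x, y in zip(a, b))


# A non-Hermitian operator and two kets (hypothetical values).
O = [[1 + 1j, 2], [3j, 4 - 1j]]
phi = [1, 1j]
psi = [2 - 1j, 1]

lhs = inner(phi, matvec(O, psi))            # <phi|O psi>
rhs = inner(matvec(adjoint(O), phi), psi)   # <O(dagger) phi|psi>

# A Hermitian matrix: real diagonal, off-diagonal elements conjugate to each other.
Hm = [[2, 1j], [-1j, 3]]
is_hermitian = adjoint(Hm) == Hm
```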
For complex continuous functions such as 𝑓(𝑥) and 𝑔(𝑥), the equivalent to Eq. 2.29 is
    (2.30)      ∫_{-∞}^{∞} 𝑓(𝑥)*[O𝑔(𝑥)] 𝑑𝑥 = ∫_{-∞}^{∞} [O†𝑓(𝑥)]*𝑔(𝑥) 𝑑𝑥 = ∫_{-∞}^{∞} [O𝑓(𝑥)]*𝑔(𝑥) 𝑑𝑥.
   Now let ∣𝜓⟩ be an eigenket of the Hermitian operator O with eigenvalue 𝜆, and set ∣𝜙⟩ = ∣𝜓⟩ in Eq. 2.29. Then ∣O𝜓⟩ = ∣𝜆𝜓⟩ and ⟨O𝜓∣ = ⟨𝜆𝜓∣, so
    (2.32)      ⟨𝜓∣𝜆𝜓⟩ = ⟨𝜆𝜓∣𝜓⟩.
For kets, we can move a constant, even a complex one, into or out of the ket without changing the constant. So
    (2.33)      𝑐∣𝐴⟩ = ∣𝑐𝐴⟩,     because 𝑐∣𝐴⟩ = 𝑐 column matrix(𝐴x  𝐴y  𝐴z) = column matrix(𝑐𝐴x  𝑐𝐴y  𝑐𝐴z) = ∣𝑐𝐴⟩.
But for bras, if we want to move a constant from outside to inside a bra (or vice versa), it's necessary to take the complex conjugate of that constant:
    (2.34)      𝑐⟨𝐴∣ = ⟨𝑐*𝐴∣,
because in this case
              𝑐⟨𝐴∣ = 𝑐(𝐴*x  𝐴*y  𝐴*z) =  (𝑐𝐴*x  𝑐𝐴*y  𝑐𝐴*z) = ((𝑐*𝐴x)*  (𝑐*𝐴y)*  (𝑐*𝐴z)*) = ⟨𝑐*𝐴∣.
   Thus in (2.32), pulling the constant 𝜆 out of the ket on the left side and out of the bra on the right side gives
    (2.35)     ⟨𝜓∣𝜆𝜓⟩ = 𝜆⟨𝜓∣𝜓⟩     ⟨𝜆𝜓∣𝜓⟩ = 𝜆*⟨𝜓∣𝜓⟩.
Substituting these into Eq. 2.32 gives
    (2.36)     𝜆⟨𝜓∣𝜓⟩ = 𝜆*⟨𝜓∣𝜓⟩.
Since ⟨𝜓∣𝜓⟩ is nonzero for any nonzero ket, 𝜆 = 𝜆*, which means that the eigenvalue 𝜆 must be real. So Hermitian operators must have real eigenvalues.
   Consider the case in which 𝜙 is an eigenfunction of Hermitian operator O with eigenvalue 𝜆_𝜙 and 𝜓 is also an eigenfunction of O with eigenvalue 𝜆_𝜓. Eq. 2.29 is then
             ⟨𝜙∣ O ∣𝜓⟩ = ⟨𝜙∣𝜆_𝜓 𝜓⟩ = ⟨𝜆_𝜙 𝜙∣𝜓⟩     𝜆_𝜓⟨𝜙∣𝜓⟩ = 𝜆_𝜙*⟨𝜙∣𝜓⟩ = 𝜆_𝜙⟨𝜙∣𝜓⟩     (𝜆_𝜓 - 𝜆_𝜙)⟨𝜙∣𝜓⟩ = 0.
This means that if the eigenvalues differ, so that (𝜆_𝜓 - 𝜆_𝜙) is not zero, then ⟨𝜙∣𝜓⟩ = 0: the eigenfunctions of a Hermitian operator with different eigenvalues must be orthogonal.
   If two or more eigenfunctions share an eigenvalue, that's called the "degenerate" case, and the eigenfunctions will not, in general, be orthogonal. But in the degenerate case there are an infinite number of non-orthogonal eigenfunctions, from which you can always construct an orthogonal set.
   There's one more useful characteristic of the eigenfunctions of a Hermitian operator: they form a complete set. That means that any function in the abstract vector space containing the eigenfunctions of a Hermitian operator may be made up of a linear combination of those eigenfunctions.
   The discussion of the solutions to the Schrödinger equation in Chapter 4 will show that every quantum observable (such as position, momentum, and energy) is associated with an operator, and the possible results of any measurement are given by the eigenvalues of that operator. Since the results of measurements must be real, operators associated with observables must be Hermitian.
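   Both properties derived in this section (real eigenvalues, and orthogonal eigenvectors for different eigenvalues) can be verified on a small case. The 2 ⨯ 2 Hermitian matrix `H` and its eigenpairs below were worked out by hand for this sketch and are assumptions, not an example from the book.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a column vector (list)."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def inner(a, b):
    """Inner product: conjugate the first member, then sum the products."""
    return sum(x.conjugate() * y for x, y in zip(a, b))


# Hermitian: real diagonal, and H[1][0] is the complex conjugate of H[0][1].
H = [[2, 1j],
     [-1j, 2]]

# Candidate (eigenvalue, eigenvector) pairs; both eigenvalues are real.
eigenpairs = [(3, [1j, 1]), (1, [-1j, 1])]

# H v - lambda v should vanish component by component for each pair.
residuals = []
for eigenvalue, vector in eigenpairs:
    Hv = matvec(H, vector)
    residuals.append([hv - eigenvalue * v for hv, v in zip(Hv, vector)])

# Eigenvectors belonging to different eigenvalues should be orthogonal.
overlap = inner(eigenpairs[0][1], eigenpairs[1][1])
```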

          2.4 Projection Operators
A very useful Hermitian operator that we will encounter in quantum mechanics is the "projection operator". To understand it, consider the ket ∣𝐴⟩ representing a three-dimensional vector 𝐴̄. Expanding that ket using the basis kets representing orthonormal vectors 𝜖̄1, 𝜖̄2 and 𝜖̄3 looks like this:
    (2.37, 38, 39)     ∣𝐴⟩ = 𝐴1∣𝜖1⟩ + 𝐴2∣𝜖2⟩ + 𝐴3∣𝜖3⟩ = ⟨𝜖1∣𝐴⟩ ∣𝜖1⟩ + ⟨𝜖2∣𝐴⟩ ∣𝜖2⟩ + ⟨𝜖3∣𝐴⟩ ∣𝜖3⟩ = ∣𝜖1⟩ ⟨𝜖1∣𝐴⟩ + ∣𝜖2⟩ ⟨𝜖2∣𝐴⟩ + ∣𝜖3⟩ ⟨𝜖3∣𝐴⟩
in which the groupings ∣𝜖1⟩ ⟨𝜖1∣, ∣𝜖2⟩ ⟨𝜖2∣ and ∣𝜖3⟩ ⟨𝜖3∣ can be regarded as projection operators. The general expression for a projection operator is
    (2.41)     𝑃^𝑖 = ∣𝜖𝑖⟩ ⟨𝜖𝑖∣,
where 𝜖̄𝑖 is any normalized vector. Feeding operator 𝑃^1 the ket representing vector 𝐴̄ shows what's happening:
    (2.42)     𝑃^1 ∣𝐴⟩ = ∣𝜖1⟩ ⟨𝜖1∣𝐴⟩ = 𝐴1 ∣𝜖1⟩.
So applying the projection operator to ∣𝐴⟩ produces the new ket 𝐴1 ∣𝜖1⟩. For completeness, the results of applying 𝑃^2 and 𝑃^3 to ∣𝐴⟩ are
    (2.43)     𝑃^2 ∣𝐴⟩ = ∣𝜖2⟩ ⟨𝜖2∣𝐴⟩ = 𝐴2 ∣𝜖2⟩     𝑃^3 ∣𝐴⟩ = ∣𝜖3⟩ ⟨𝜖3∣𝐴⟩ = 𝐴3 ∣𝜖3⟩.
If we sum the results, then the result is
              𝑃^1 ∣𝐴⟩ + 𝑃^2 ∣𝐴⟩ + 𝑃^3 ∣𝐴⟩ = 𝐴1 ∣𝜖1⟩ + 𝐴2 ∣𝜖2⟩ + 𝐴3 ∣𝜖3⟩ = ∣𝐴⟩  or  (𝑃^1 + 𝑃^2 + 𝑃^3) ∣𝐴⟩ = ∣𝐴⟩.
Writing this for the general case in an 𝑁-dimensional space:
    (2.44)     ∑_{𝑛=1}^{𝑁} 𝑃^𝑛 ∣𝐴⟩ = ∣𝐴⟩.
This means that the sum of the projection operators using all of the basis vectors equals the "identity operator" I. The identity operator is the Hermitian operator that produces a ket that is equal to the ket that is fed into the operator:
    (2.45)     I ∣𝐴⟩ = ∣𝐴⟩.
The matrix representation 𝐼matrix of the identity operator in three dimensions is
    (2.46)     matrix𝐼 = [row-1(1  0  0)  row-2(0  1  0) row-3(0  0  1)].
   The relation
    (2.47)     ∑_{𝑛=1}^{𝑁} 𝑃^𝑛 = ∑_{𝑛=1}^{𝑁} ∣𝜖𝑛⟩ ⟨𝜖𝑛∣ = I
is called the "completeness" or "closure" relation, since it holds true when applied to any ket in an 𝑁-dimensional space. That means that any ket in that space can be represented as the sum of 𝑁 basis kets weighted by 𝑁 components.
   The projection operator in an 𝑁-dimensional space may be represented by an 𝑁⨯𝑁 matrix whose elements are found as in (2.16), so it's necessary to decide which basis system you'd like to use.
   One option is to use the basis system consisting of the eigenkets of the operator. In that basis the matrix representing an operator is diagonal, and each of the diagonal elements is an eigenvalue of the matrix.
   For 𝑃^1 the eigenket equation is
    (2.48)    𝑃^1 ∣𝐴⟩ = 𝜆1 ∣𝐴⟩,
where ∣𝐴⟩ is an eigenket of 𝑃^1 with eigenvalue 𝜆1. Applying 𝑃^1 to ∣𝜖1⟩ gives
              𝑃^1 ∣𝜖1⟩ = ∣𝜖1⟩ ⟨𝜖1∣𝜖1⟩ = ∣𝜖1⟩ = 𝜆1 ∣𝜖1⟩,     so 𝜆1 = 1.
Hence ∣𝜖1⟩ is indeed an eigenket of 𝑃^1, and the eigenvalue is one. Similarly ∣𝜖2⟩ and ∣𝜖3⟩ are eigenkets of 𝑃^1, each with an eigenvalue of 0. With these eigenkets the matrix elements (𝑃1)𝑖𝑗 can be found using Eq. 2.16:
    (2.49)    (𝑃1)𝑖𝑗 = ⟨𝜖𝑖∣ 𝑃^1 ∣𝜖𝑗⟩.
Then (𝑃1)11 = 1 and the rest of (𝑃1)𝑖𝑗 are all zero.
Thus the matrix representing 𝑃^1 in the basis of its eigenkets  ∣𝜖1⟩,  ∣𝜖2⟩, and  ∣𝜖3⟩ is
    (2.50)    matrix𝑃1 = matrix[row-1(1  0  0)  row-2(0  0  0)  row-3(0  0  0)].
   Through a similar analysis we obtain
    (2.51, 52)    matrix𝑃2 = matrix[row-1(0  0  0)  row-2(0  1  0)  row-3(0  0  0)]     matrix𝑃3 = matrix[row-1(0  0  0)  row-2(0  0  0)  row-3(0  0  1)].
According to the completeness relation Eq. 2.47, the matrix representations of the projection operators should add up to the matrix of the identity operator:
    (2.53)    matrix𝑃1 + matrix𝑃2 + matrix𝑃3 = matrix𝐼.
   An alternative method of finding the matrix elements of the projection operator is to use the outer product rule for matrix multiplication:
              𝑃^1 =  ∣𝜖1⟩ ⟨𝜖1∣ = column matrix(1  0  0) (1  0  0) = matrix[row-1(1  0  0)  row-2(0  0  0)  row-3(0  0  0)]
              𝑃^2 =  ∣𝜖2⟩ ⟨𝜖2∣ = column matrix(0  1  0) (0  1  0) = matrix[row-1(0  0  0)  row-2(0  1  0)  row-3(0  0  0)]
              𝑃^3 =  ∣𝜖3⟩ ⟨𝜖3∣ = column matrix(0  0  1) (0  0  1) = matrix[row-1(0  0  0)  row-2(0  0  0)  row-3(0  0  1)].
   The projection operator is useful in determining the probability of measurement outcomes for a quantum observable by projecting the state of a system onto the eigenstates of the operator for that observable.
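   The outer-product construction of the projection operators and the completeness relation can be reproduced in a few lines. This is a sketch under stated assumptions: the basis is the standard three-dimensional one used in the matrices above, while the test vector `A` and the helper names `outer` and `matvec` are introduced here for illustration.

```python
def outer(ket_a, ket_b):
    """Outer product |a><b| as a matrix: element (i, j) is a_i * conj(b_j)."""
    return [[a * b.conjugate() for b in ket_b] for a in ket_a]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a column vector (list)."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


basis = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
projectors = [outer(e, e) for e in basis]   # P_n = |e_n><e_n|

A = [2, -1, 3]                        # illustrative vector (hypothetical values)
P1_A = matvec(projectors[0], A)       # keeps only the first component of A

# Completeness: the projectors for all basis vectors sum to the identity matrix.
total = [[sum(P[i][j] for P in projectors) for j in range(3)] for i in range(3)]
```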
