Abstract

The bare rudiments of the principle of mathematical induction as a method of proof date back to ancient times. In the contemporary university milieu, the demonstrative scheme is taught as part of courses in discrete mathematics, set theory, number theory, graph theory, group theory, game theory, linear algebra, logic, and combinatorics. In theoretical computer science, it plays a pivotal role in developing the cognitive skills necessary for the effective design and implementation of algorithms and for assessing both their correctness and complexity. Pure mathematics and computer science aside, the scope of its utility in the physical sciences remains limited. Following an outline of some elementary concepts from vector algebra and phasor analysis, proofs by induction of two salient results in multiple-slit interferometry are presented, viz., the fringe intensity distribution formula and the upper bound of the total fringe count. These specific optical instantiations serve to illustrate the versatility and power of the principle in tackling real-world problems. They thereby mark a welcome departure from the popular view of induction as a mere last resort for proving abstract mathematical statements.

1. Introduction

1.1. A Brief History of the Principle of Mathematical Induction

The exact origins and naming of the process of logical reasoning known today as the principle of mathematical induction (PMI) are the subject of much scholarly debate [1, 2]. There is evidence suggesting that the ancient Hindus and Greeks possessed vague hints of this principle [2, 3]. Its full inception as a robust formal procedure for demonstrating the validity of a proposition concerning natural numbers can be attributed to no single individual or culture. This is because, for a long time, the method of proof was used prolifically across continents in diverse contexts without being given a special name. The works of al-Karaji (Arab), Levi Gershon (French Jew), Franciscus Maurolycus (Italian), Blaise Pascal (French), Pierre de Fermat (French), and Jakob Bernoulli (Swiss) stand testament to this historical oddity [4–6]. The earliest appearance of the term "induction" dates back to a treatise on arithmetic by the 17th century Englishman John Wallis. But it was his fellow countryman Augustus De Morgan's more ornate choice of nomenclature, "mathematical induction", in his Penny Cyclopedia article, some 200 years later, that was destined to rise to prominence and persist up to the present day.

1.2. An Outline of the Method of Proof

Many variations in the formulation of the PMI can be found in the literature [7–13]. The one most commonly enunciated in introductory courses entails two principal steps, viz., the base case (or basis/initial case) and the induction (or inductive) step (see Box 1). For pedagogical purposes, it is often convenient to split the latter into two further steps, the induction hypothesis and the induction step proper, thus making a total of three steps instead of two. This approach is adopted in the later sections. The natural numbers are defined here as the set of all positive integers {1, 2, 3, …}, denoted by $\mathbb{N}$ [14, 15].
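As a compact reference, the two-step formulation described above is commonly written in the following symbolic form. The predicate name $P$ and the variables $n$ and $k$ are notational assumptions introduced here for illustration only.

```latex
\[
  \bigl[\, P(1) \;\wedge\; \forall k \in \mathbb{N}\,
  \bigl( P(k) \Rightarrow P(k+1) \bigr) \bigr]
  \;\Longrightarrow\; \forall n \in \mathbb{N}\; P(n)
\]
```

Here $P(1)$ is the base case, $P(k)$ plays the role of the induction hypothesis, and the implication $P(k) \Rightarrow P(k+1)$ is the induction step proper.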

1.3. Vector Algebra: Rules of Summation

A vector is an abstract object possessing the properties of magnitude and direction; force, velocity, displacement, and acceleration are familiar examples [16]. Vectors obey some very well-defined algebraic and geometric laws. Among these are two equivalent rules of summation, viz., the parallelogram law and the triangle law [17, 18]. According to the former, if two vectors are represented in magnitude and direction by the two adjacent sides of a parallelogram that share a common vertex, then their resultant is represented in magnitude and direction by the diagonal passing through that shared point. According to the latter, if two vectors are represented in magnitude and direction by the two sides of a triangle taken in the same order, then their resultant is represented in magnitude and direction by the third side of the triangle taken in the opposite order [19]. Consider two randomly oriented vectors and of magnitudes and that are summed using either one of the above rules, to yield a resultant of magnitude (see Figure 1). Denote the angles and by and , respectively (not shown in the figure). The magnitude and direction of the resultant may be given by the formulae
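The parallelogram law can be checked numerically with a short sketch. It assumes the standard textbook form of the law, $R = \sqrt{a^2 + b^2 + 2ab\cos\theta}$ and $\tan\varphi = b\sin\theta / (a + b\cos\theta)$, where $\theta$ is the angle between the two vectors and $\varphi$ is the angle of the resultant measured from the first vector. The names `a`, `b`, `theta`, and `phi` are illustrative assumptions and need not match the notation of Equation (1).

```python
import math

def parallelogram_law(a, b, theta):
    """Resultant of two vectors of magnitudes a and b separated by angle theta (radians).

    Returns (r, phi): the magnitude of the resultant and its angle measured
    from the first vector. Uses the standard textbook form of the law
    (assumed notation, not taken from the paper's Equation (1)).
    """
    r = math.sqrt(a * a + b * b + 2.0 * a * b * math.cos(theta))
    phi = math.atan2(b * math.sin(theta), a + b * math.cos(theta))
    return r, phi

def component_sum(a, b, theta):
    """Same resultant, found by adding Cartesian components (first vector along x)."""
    x = a + b * math.cos(theta)
    y = b * math.sin(theta)
    return math.hypot(x, y), math.atan2(y, x)
```

For two perpendicular vectors of magnitudes 3 and 4, both functions return a resultant of magnitude 5 (to floating-point precision), as expected from the 3-4-5 right triangle.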

1.4. Phasor Analysis

Phasor analysis is an extremely handy tool for describing physical quantities that vary sinusoidally with time such as light intensity in physical optics and voltage (or current) in electrical engineering. A phasor can be thought of as a special kind of vector that rotates (counter-clockwise) about the origin of a rectangular (Cartesian) coordinate system with uniform angular speed [20]. Its graphical plot is called a phasor diagram (see Figure 2) [18]. As rotates about , the length of its projection onto the ordinate and abscissa varies sinusoidally. The frequency of linear oscillation of these projections about along either axis and the rotational speed of the phasor itself are related: . In Section 2, the parallelogram law of vector addition (Equation (1)) is applied in a successive fashion over many phasors that are of equal magnitude and equal angular spacing, in order to arrive at a general recurrence relation for the resultant.

2. Iteration of the Parallelogram Law in a Phasor Diagram

Consider phasors with angular spacings between them, where (see Figure 3). The resultant of the first two phasors is found with the 1st iteration of the parallelogram law; the resultant of three phasors is found with the 2nd iteration; the resultant of four phasors is found with the 3rd iteration and so on. These successive resultants lie at angular spacings . It thus follows that the final resultant of phasors can be determined by iterating the parallelogram law a total of times over each successive phasor and the resultant that immediately precedes it (see Table 1).

Let us assume that all the phasors have an equal magnitude and equal angular spacing . That is,

The magnitude and direction of successive resultants for the first four iterations, satisfying the conditions (2) and (3), are presented in Table 2. The detailed calculations of all the entries are available in the Supplementary Material. From inspection of this small sample of cases, the recurrence relations (4) and (5) for the iteration may be conjectured, subject to the initial condition . In Section 3, the standard formulation of the PMI is used to prove the closed form expressions of these relations.

3. Theorem 1 and Some Corollaries

Theorem 1. The sequence of resultants and their corresponding angular spacings obtained from the iterative summation of multiple phasors of equal magnitude and equal angular spacing using the parallelogram law can be expressed by the following recurrence relations and their corresponding closed forms, subject to the initial condition :

Proof. Step 1 (Base case): for , Note the initial condition . The base case thus satisfies the expressions for and .
Step 2 (Induction hypothesis): for some such that , let us assume that
Step 3 (Induction step): for ,

Hence, by the principle of mathematical induction, we may conclude that the magnitude and direction of the resultant obtained after any number of iterations of the parallelogram law of vector addition are given by the recurrence relations and corresponding closed forms stated in Theorem 1, subject to the same initial condition.

Corollary 2. For phasors of equal magnitude and equal angular spacing , the final resultant may be expressed as .

Proof. By Theorem 1, the resultant obtained after -iterations of the parallelogram law of vector addition is given by . Based on Table 1, it is clear that the final resultant of a total of phasors is obtained upon substituting . Hence, QED
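Corollary 2 can be sanity-checked numerically by iterating the parallelogram law and comparing the outcome with a closed form. The sketch below assumes the widely used expression $A\sin(N\delta/2)/\sin(\delta/2)$ for the magnitude and $(N-1)\delta/2$ for the direction, with the first phasor placed at angle zero. The symbols $N$, $A$, and $\delta$ and all function names are assumptions made for illustration.

```python
import math

def iterate_parallelogram(n_phasors, amplitude, delta):
    """Add phasors at angles 0, delta, 2*delta, ... one at a time.

    Each pass applies the parallelogram law to the running resultant and the
    next phasor, so n_phasors - 1 iterations are performed in total.
    Returns (magnitude, direction) of the final resultant.
    """
    r, phi = amplitude, 0.0  # a single phasor is its own resultant
    for j in range(1, n_phasors):
        gap = j * delta - phi  # angle between running resultant and next phasor
        sq = r * r + amplitude * amplitude + 2.0 * r * amplitude * math.cos(gap)
        r_new = math.sqrt(max(0.0, sq))  # guard against tiny negative round-off
        phi += math.atan2(amplitude * math.sin(gap), r + amplitude * math.cos(gap))
        r = r_new
    return r, phi

def closed_form(n_phasors, amplitude, delta):
    """Assumed closed form: |A sin(N*delta/2) / sin(delta/2)| at angle (N-1)*delta/2.

    The direction is valid as written while sin(N*delta/2)/sin(delta/2) > 0.
    """
    r = abs(amplitude * math.sin(n_phasors * delta / 2.0) / math.sin(delta / 2.0))
    return r, (n_phasors - 1) * delta / 2.0
```

Under these assumptions, the iterated and closed-form results agree for, say, five unit phasors spaced 0.4 rad apart, and the iterated magnitude vanishes (to round-off) for six phasors spaced $2\pi/6$ apart, which is the situation addressed by Corollary 3.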

Corollary 3. For phasors of equal magnitude and equal angular spacing , the final resultant has a null magnitude.

Proof. Upon substituting in , we get QED

Corollary 4. For phasors of equal magnitude and equal angular spacing , the final resultant has a magnitude .

Proof. Upon directly substituting in , we get the indeterminate form . It is therefore necessary to apply the appropriate limit operations for evaluating the final resultant at .
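The limit in question can be sketched as follows, assuming that the closed form of Corollary 2 reads $R_N = A\sin(N\delta/2)/\sin(\delta/2)$ and that the angular spacing in Corollary 4 is an integer multiple of $2\pi$. The symbols $R_N$, $A$, $N$, $\delta$, and $m$ are notational assumptions. Applying L'Hôpital's rule to the $0/0$ form gives

```latex
\[
  \lim_{\delta \to 2\pi m} \frac{A \sin(N\delta/2)}{\sin(\delta/2)}
  = \lim_{\delta \to 2\pi m} \frac{A\,(N/2)\cos(N\delta/2)}{(1/2)\cos(\delta/2)}
  = N A \,\frac{\cos(N\pi m)}{\cos(\pi m)}
  = (-1)^{(N-1)m}\, N A ,
\]
\[
  \text{so that } \lvert R_N \rvert = N A .
\]
```

The sign factor $(-1)^{(N-1)m}$ only fixes the sense of the resultant along its line of action; its magnitude is $N$ times that of a single phasor, as expected when all phasors are aligned.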

4. Analysis of Multiple Slit Interference

4.1. Fringe Intensity Distribution Formula

The closed form expression for the final resultant of phasors (Corollary 2) can be directly imported into physical optics for the purpose of deriving the fringe intensity distribution formula that is found in many standard treatments of multiple slit interference (see Figure 4) [21, 22]. There are four premises upon which this calculation is based. Firstly, the identical and equally spaced slits are of negligible widths and behave like coherent point sources of light that are in phase with each other at their respective spatial locations. Secondly, the instantaneous field contributions from each of the slits at some arbitrary point on the detection screen can be graphically represented as distinct phasors, all having the same angular frequency (see Figure 3). Thirdly, the distance between slits is negligible compared to the distance between the multiple slit-barrier and the screen (far field condition). Consequently, light rays that are convergent at a single point on the distant screen may be considered as nearly parallel in the vicinity of the slits [23–25]. Fourthly, the relative fall in the intensity of light that occurs as it propagates away from the slits is negligible. So, the field amplitudes may be taken as nearly equal and undiminished over all points on the screen.

The projection of onto the -axis yields the instantaneous amplitude of oscillation .

By Theorem 1, we may substitute into Equation (11).

Substituting Corollary 2 into Equation (12),

Invoking the (time averaged) intensity-amplitude relationship,

Substituting into Equation (14) and defining the relative intensity as the ratio of the time averaged intensity to the intensity due to a single slit .

From Corollary 3 and Equation (15), corresponds to a minimum intensity, since the final resultant has a null magnitude.

From Corollary 4 and Equation (15), corresponds to a maximum intensity, since all the phasors are in perfect alignment (i.e., unidirectional).

By defining the normalized relative intensity ratio as , we may finally infer from Equations (16a) and (16b), the desired -slit interference formula that succinctly describes the variation of intensity of the bright fringes captured on a distant screen, after light is diffracted through a grating.

Equation (17) is conventionally derived by means of complex exponential representation followed by normalization [26]. However, it was arrived at here by the exclusive use of the PMI in conjunction with the parallelogram law of vector addition and phasor diagrams.
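For readers who wish to plot or tabulate the fringe profile, the normalized intensity can be evaluated with a few lines of code. The sketch assumes that Equation (17) takes the standard textbook form $\sin^2(N\delta/2) / \bigl(N^2 \sin^2(\delta/2)\bigr)$, where $\delta$ is the phase difference between adjacent slits. The function and parameter names are illustrative assumptions.

```python
import math

def normalized_intensity(n_slits, delta):
    """Normalized N-slit intensity sin^2(N*delta/2) / (N^2 * sin^2(delta/2)).

    delta is the phase difference (radians) between adjacent slits.
    The formula is the standard textbook form, assumed to match Equation (17).
    """
    s = math.sin(delta / 2.0)
    if abs(s) < 1e-12:
        # delta is (numerically) a multiple of 2*pi: all phasors align,
        # so the limiting value at a principal maximum is returned.
        return 1.0
    return (math.sin(n_slits * delta / 2.0) / (n_slits * s)) ** 2
```

The explicit branch for $\sin(\delta/2) \approx 0$ mirrors the limit argument of Corollary 4 and avoids a division by zero at the principal maxima. For two slits the function reduces to the familiar $\cos^2(\delta/2)$ profile.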

4.2. Total Fringe Count

In some recently published papers on the classical double-slit and multiple-slit experiments, a hyperbola-based analysis of wave interference was employed to study the distribution patterns of fringes on distant screens of varied shapes and orientations [23–25, 27]. It was shown there that in the double-slit scenario, the total fringe count depends on the ratio of the interslit distance to the wavelength of light . When the screen is oriented parallel to the line joining the two slit-sources and , the total number of hyperbolic shaped fringes formed is given by (see Figure 5) [24]

When the screen is oriented orthogonal to the line joining two point-sources and , the total number of circular shaped fringes formed is given by (see Figure 6) [27]

The special brackets $\lfloor \; \rfloor$ denote the floor function (see Supplementary Material for a formal proof of Equation (18)). Now, if there are instead slits (or equivalently, point-sources) under consideration, then some degree of overlap of the fringes formed from pair-wise interference may be expected. Nonetheless, an upper bound exists that the total count cannot exceed (i.e., ). A proof of its quantitative expression using the generalized formulation of the PMI (with the base case taken at 2) is presented below.

Theorem 5. In the multiple-slit experiment, wherein a linear series of equally spaced slits of negligible widths are treated as equivalent to a chain of coherent, in-phase, point-sources of light and the distant screen is oriented parallel to the line joining them, the upper bound of the total fringe count may be expressed as Here, denotes the combinatorial function, the total number of slit sources, the uniform inter-slit distance, the wavelength of light and the index of summation, respectively.

Proof. Step 1 (Base case): for , The base case thus satisfies Equation (18) for the double-slit scenario.
Step 2 (Induction hypothesis): for some such that , let us assume that
Step 3 (Induction step): for ,

The induction step implies that the total fringe count for slit sources is equal to the total fringe count for the first slit sources added to the counts for the pair-wise combinations of the slit source with each of the other slit sources . This is clearly true from inspection of Figure 7. Thus, by the principle of mathematical induction, we may conclude that the upper bound stated in Theorem 5 holds.

Theorem 6. In the multiple point-source scenario, wherein a linear series of equally spaced point sized sources emanate spherical wavefronts of light in a coherent, in-phase manner and the distant screen is oriented orthogonal to the line joining them, the upper bound of the total fringe count may be expressed as Here, denotes the total number of point sources, the uniform intersource distance, the wavelength of light, and the index of summation, respectively.

Proof. Step 1 (Base case): for , The base case thus satisfies Equation (19) for the double point-source scenario.
Step 2 (Induction hypothesis): for some such that , let us assume that
Step 3 (Induction step): for ,

The induction step implies that the total fringe count for point sources is equal to the total fringe count for the first point sources added to the counts for the pair-wise combinations of the point source with each of the other point sources . This is clearly true from inspection of Figure 8. Thus, by the principle of mathematical induction, we may conclude that the upper bound stated in Theorem 6 holds.
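The counting argument behind the induction steps of Theorems 5 and 6 can be illustrated with a short sketch. It builds the upper bound in two ways: by summing a double-source count over every pair of sources, and by starting from two sources and adding, for each new source, its pairings with all earlier ones. The pair-count function is passed in as a parameter. The default used here, $2\lfloor s/\lambda \rfloor + 1$ for a pair separated by a distance $s$, is an assumed stand-in for the parallel-screen double-slit count and is not taken from Equation (18); all function names are hypothetical.

```python
import math
from itertools import combinations

def pair_count_parallel(separation, wavelength):
    """Assumed double-slit fringe count for a parallel screen: 2*floor(s/lambda) + 1."""
    return 2 * math.floor(separation / wavelength) + 1

def upper_bound_direct(n_sources, d, wavelength, pair_count=pair_count_parallel):
    """Sum the pairwise counts over all C(n, 2) pairs of equally spaced sources."""
    return sum(pair_count((j - i) * d, wavelength)
               for i, j in combinations(range(n_sources), 2))

def upper_bound_inductive(n_sources, d, wavelength, pair_count=pair_count_parallel):
    """Build the same bound step by step, mirroring the induction step.

    Start from the two-source base case, then for each newly added source
    add the counts of its pairings with every earlier source.
    """
    total = pair_count(d, wavelength)   # base case: two sources
    for k in range(2, n_sources):       # adding source number k + 1
        total += sum(pair_count(s * d, wavelength) for s in range(1, k + 1))
    return total
```

Under the assumed pair count, four sources with $d/\lambda = 2.5$ give a bound of 52 by either route. Passing a different `pair_count` function allows the same check to be repeated for the orthogonal-screen scenario.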

5. Discussion

In this paper, the principle of mathematical induction was enunciated as a method of proof of formal propositions, and its historical roots were briefly touched upon. The demonstrative scheme has traditionally been taught to students as part of courses in pure mathematics and theoretical computer science. In the former stream, it is used to prove statements like "the sum of the first $n$ natural numbers is equal to $n(n+1)/2$". But in the latter stream, it serves a much deeper purpose of an instructional nature, mainly in developing the cognitive skills necessary for the effective design and implementation of algorithms and for assessing both their correctness and complexity [28]. However, the scope of its utility in the physical sciences remains limited. This is evident from the scarcity of available literature linking the two disparate domains and the overall exclusion of the topic from the core physics curriculum at the university level in most parts of the world.

In recent theoretical work on the multiple-slit experiment, the PMI was used to prove two important theorems—the generalized hyperbola and hyperboloid (or community) theorems for a linear array of point/slit sources [25]. The exact path (and phase) differences between any source pair could then be computed directly from these theorems and the intensity distribution curves plotted for varied orientations of the distant screen. The formalism was shown to encompass single-slit diffraction and double-slit interference, the near and far field conditions, the small and large angle scenarios, and a two-dimensional square array of point sources.

The new analysis offers several advantages over the conventional approach. These include firstly a unified geometrical framework that treats both interference and diffraction phenomena on the same classical footing, viz., the Huygens-Fresnel principle; secondly, a robust scheme for the counting of fringes and the stipulation of the laws governing their spatial distribution; thirdly, a visually more intuitive interpretation of wave superposition that bears pedagogical significance; fourthly, greater accuracy in the description of double-slit interference, multiple-slit interference, and single-slit diffraction of the Fraunhofer class; fifthly, novel proposals for the measurement of the wavelength of light, the refractive index of a liquid medium, the study of 2D materials and crystals, and the detection of gravitational waves—all based on the characteristics of concentric circular fringes formed by the interference of light from two (or more) point-sources. It is worth mentioning here that as part of a forthcoming project, the statement of Theorem 1 is employed to rigorously analyze single-slit diffraction of the Fresnel class.

6. Conclusion

The current paper takes the application of the PMI still further by furnishing proofs of two more salient results in multiple-slit interferometry, viz., the fringe intensity distribution formula and the upper bound of the total fringe count. These specific optical instantiations serve to illustrate the versatility and power of the principle in solving real-world, physical problems. They thereby mark a welcome departure from the popular view of induction as a mere last resort for proving abstract mathematical statements.

Data Availability

All supporting data is contained within this article and its accompanying supplementary material.

Conflicts of Interest

The author declares no conflict of interest regarding the publication of this paper.

Acknowledgments

Gloria in excelsis Deo.

Supplementary Materials

The supplementary material contains additional information pertaining to the parallelogram law of vector addition and phasor diagrams (Sections 2 and 3) and the expression for the total fringe count (Section 4.2).