[Discrete Math] The Pigeonhole Principle: A Beginner's Introduction

Overview

The Pigeonhole Principle (also sometimes shortened to simply 'PHP') is a mathematical principle in the large sphere called "discrete mathematics".

The Pigeonhole Principle can be first understood by seeing that, at its simplest form, it makes a very trivial observation: Given some number of objects and some lesser number of boxes to put the objects into (assuming all objects find their way into a box), there necessarily exists at least one box with multiple objects in it. This concept manifests itself in very simple cases; for example, that in the following section.

Note that this article is not extremely rigorous in that it does not mathematically prove many of the concepts and instead relies on intuition, as it is meant to grant understanding as opposed to a very rigorous education of the subject (which itself is very important and should be pursued by readers).

A Basic Example

Given 3 objects (represented as 'o's) and 2 boxes to put the objects into (represented as '[]'s), there is no possible configuration where all objects are in a box and there is not a box with at least 2 objects.

Attempting a counterexample

This can be demonstrated by attempting to make a system in which this is not true. This attempt can be started with the original situation that has been described.

o  o  o     } 3 objects
[]   []     } 2 boxes

Step 1

It is trivially true that to move towards the counterexample, one object must move into a box first. Note that which box is chosen is completely arbitrary (imagine that, were the other box to be chosen, the two boxes could swap and would leave an identical situation).

 o   o      } 2 objects left
[o]  []     } 1 object has been put in one of the boxes

Step 2

From this point, there are two possible actions that can be done in order to attempt to construct the system in which there is no box with 2 objects.

   o        } 1 object left
[oo] []     } 1 object has been put in the same box as the first object

In case 1, it is trivially true that the case in which all objects are inside a box and there is no box with greater than 1 element has not been created. Case 1, then, cannot be a possible course of action to prove this proposition incorrect.

   o        } 1 object left
[o] [o]     } 1 object has filled the empty box

In case 2, it can be seen that such a course of action creates a situation in which a system that demonstrates the proposition to be false cannot be made, as if one were to attempt to move the final object into a box, there would, in fact, be a box with more than 1 object inside of it.

This illustrates the truth of the Pigeonhole Principle in a very simple case, but note that it has not proven its truth for any other case.

Generalizing the Simple Case

It can be noted that given the truth of the above situation, similar situations arise when a different number of objects and boxes are used.

For example, given 21 objects and 20 boxes, it follows fairly intuitively that there must be one box with at least two objects inside of it; imagine a one-to-one mapping between boxes and objects that determined which box an object would be placed in. There would be one object paired to no box, and therefore, to place all objects into boxes, that object would necessarily be placed in a box with an object already in it, thus creating the box with two objects.

There are more cases, however, than just having n+1 objects and n boxes. Suppose that there exists n objects and m boxes.

What if n < m?

If n < m, it follows fairly intuitively that there must exist one box with one object inside of it. Imagine the simple case with 2 boxes and 1 object:

   o       } 1 object
[]   []    } 2 boxes

Regardless of which box that one object is placed in, the one box that the object is placed in will satisfy the aforementioned proposition.

What if n > m?

A subset of the cases where n > m has already been discussed; specifically, the situation in which n = m+1. In that situation, it has been shown that with n objects and m boxes, there will exist at least one box that contains two objects.

This can be extended to the case where n = m+k, where k is an integer in the range [1, m]. In that case, while it is certainly possible that a box contains more than 2 objects, the Pigeonhole Principle does not grant any authority to claim that such a configuration of objects and boxes must be present.

Imagine the case where n = 7 and m = 4, which satisfies the above generalized case (where n = m+k for some integer k in [1, m]). In such a case, the following situation would be present:

o o o o o o o           } 7 objects
 [] [] [] []            } 4 boxes

Were one to attempt to place all objects into the boxes evenly, the following situation would occur:

      o o o           } 3 objects remaining
 [o] [o] [o] [o]      } 1 object has been placed in each box

It can be seen that, while it is possible to stack all 3 remaining objects into one box, that is not necessarily guaranteed to occur. At the very minimum, the Pigeonhole Principle still only grants the authority to claim that there is at least one box with more than 1 object, as that is a case that will be met in the above scenario with complete certainty.

Given that n is the number of objects, m is the number of objects, and n = m+k for some integer k, what would happen if k surpassed m? That is, k is no longer within [1, m], but rather is within [m+1, 2*m]. Consider another simple case, where n = 7 and m = 3:

  o o o o o o o           } 7 objects
    [] [] []              } 3 boxes

It can be seen that, when each object is placed into a box, the guaranteed number of objects inside at least one box increases:

   o o o o        } 4 objects remaining
 [o] [o] [o]      } 1 object has been placed in each box

When each box has at least one object in it, the following case is inevitable (even when attempting to minimize the number of objects per box).

                  } 0 objects remaining
 [ooo] [oo] [oo]  } The remaining objects have been placed in the boxes in the fairest possible way

In Situation (3), it can be seen that there is, in fact, a box with 3 objects. This is necessarily the case, as the above case is one where the number of objects per box has been minimized (the objects have been evenly spread to the best possible extent).

Based on what has been demonstrated thus far, it can be intuitively reasoned (but not claimed with complete mathematical certainty--that would require a proof of these concepts, which may be done in a later article less suitable for complete beginners) that, given n objects and m boxes and given that n = m+k for some integer k:

  • When k is an integer in [1, m], there must be one box with 2 objects
  • When k is an integer in [m+1, 2*m], there must be one box with 3 objects

The pattern will continue. For example, when k is within [2m+1, 3m], say, when n = 7 and m = 2:

 [ooo] [oooo]  } 7 objects have been placed in the 2 boxes in the fairest way possible

It can be seen that there must be one box with at least 4 objects.

Defining a mathematical representation for what has been seen

To generalize what we can claim even further, it can be noted that the minimum number of objects in at least one box is equal to the ceiling of n/m (ceil(n/m)). For example, in the last case mentioned, when n = 7 and m = 2, ceil(n/m) = ceil(7/2) = ceil(3.5) = 4. This holds true in all following cases:

Case 1

n = 4, m = 3

 o o o o     } 4 objects
[] [] []     } 3 boxes
-------------------------
[oo] [o] [o]

By the Pigeonhole Principle, there must be one box with ceil(n/m) = ceil(4/3) = ceil(~1.333) = 2 objects.

Case 2

n = 8, m = 3

 o o o o o o o o     } 8 objects
[] [] []             } 3 boxes
-------------------------
[ooo] [ooo] [oo]

By the Pigeonhole Principle, there must be one box with ceil(n/m) = ceil(8/3) = ceil(~2.667) = 3 objects.

Case 3

n = 2, m = 3

  o  o               } 2 objects
[] [] []             } 3 boxes
-------------------------
[o] [o] []

By the Pigeonhole Principle, there must be one box with ceil(n/m) = ceil(2/3) = ceil(~0.667) = 1 object.

Definition

Thus far, the concepts demonstrated thus far have provided a definition of the Pigeonhole Principle that is the following:

Given n objects and m boxes, when all of the objects are placed into the boxes, there must exist at least one box with ceil(n/m) objects inside of it.

This article has not done much in the way of rigorously proving the above definition but has instead shown that the above definition is fairly intuitively obvious.

Sample Problems

The Pigeonhole Principle might seem fairly trivial, but it can provide interesting insight into seemingly difficult problems. Here are a few example problems in which the Pigeonhole Principle proves useful:

Problem 1

Given 5 points randomly placed inside of a 2x2 box, prove that there must exist a pair of two points that are within sqrt(2) of each other.

___________
|     *   |
|  *    * |
|         | 2
|*    *   |
|_________|
     2

The above case is one in which 5 points (objects) are being placed into 4 sub-squares (boxes), and therefore, by the Pigeonhole Principle, there must exist at least one box (sub-square) that contains 2 objects (points):

___________
|    |*   |
|  * |  * | 1
|----|----| 
|*   |*   | 1
|____|____|
  1     1

If it is considered that each sub-square is a 1x1 section of the original 2x2 square, which has a maximum distance of sqrt(2) (the diagonal of the square; this can be proved with calculus, but is fairly intuitively obvious), it then follows that the 2 points sharing the same sub-square are at least within sqrt(2) of each other.

Problem 2

Given 367 humans, prove that at least 2 share the same birthday.

It can be considered that 367 objects (humans) are being placed into 366 boxes (possible birthdays). By the Pigeonhole Principle, there must be at least one box (birthday) with two objects (humans) inside of it. Therefore, there must be at least two humans in the 367 human group that have the same birthday.

The above two problems are simple examples, but the Pigeonhole Principle proves its use in far more complicated scenarios as well. The reader is encouraged to research more problems to apply this principle further.


Edited by Ryan Fleury on