Home » Equidistribution Theory

Category Archives: Equidistribution Theory


On power free numbers

Entry 1. Let f(r) be any divergent series of positive terms, q_{r,k} be the r^{th} k-power free number, \zeta(k) be the Riemann Zeta function. Let S_f(r) = \sum_{i=1}^{r} f(i). If  g(x) is Riemann integrable in (0, \infty) then,

\displaystyle{ \sum_{r=1}^{n} f(q_{r,k}) g \Big(\frac{S_f(r)}{\zeta(k)}\Big) \sim \int_{f(1)}^{\frac{S_f(n)}{\zeta(k)}} g(x)dx.}

Corollary 1. As k \rightarrow \infty, \zeta(k) \rightarrow 1. Also every natural number is k-power free when k \rightarrow \infty. Hence the above result reduces to

 \displaystyle{ \sum_{r=1}^{n} f(r) g(S_f(r))\sim \int_{f(1)}^{S_f(n)} g(x)dx}.

Entry 2. Further let q'_{k,n} be the n^{th} k-power containing number and f be any function Riemann integrable in (1,\infty); then,

\displaystyle{ \sum_{k=2}^{\infty}\frac{1}{k} \Big\{\frac{f(q'_{k,1}) + f(q'_{k,2}) + f(q'_{k,3}) +\ldots}{f(q_{k,1}) + f(a_{k,2}) + f(q_{k,3}) +\ldots}\Big\}= 1 - \gamma}.


Benford process tends to minimize product

Consider the simple question, what is the largest product of two numbers that can be formed using all the ten digits of the decimal system, each digit used exactly once. We have two answers \{96420, 87531\} and \{9642, 875310\}. Continuing on the same line, we can seek three numbers that give the maximum product and so on. In general we can ask: Given a set A of n single digits numbers, not necessarily distinct, form a set of k numbers using all the given numbers i.e. of the digit 2 occurs thrice in the set A then we can use the digit 2 exactly three times while forming our set with the maximum product. Trivially we can see that there can be one of more answer depending on weather the number 0 is a element of A because  we can form more and one set with same product by placing the 0 in the unit’s place of any number that does not end in a 0 .

Suppose that a certain random process can generate one set of k number per run using all the elements of a given a set of numbers A. Hence an exhaustive run of this process will generate all possible set of numbers with k elements. Let the set of all numbers generated by this random process satisfy the conditions of Benford Law. We shall call such a process as a Benford process and formally define it as:

Benford Process: A Benford process is a random process which generates numbers that satisfy the conditions of Benford Law.

Now if we run a Benford process once, how likely is that the process generates a set of numbers that give the maximum product. In this article we shall see that under the conditions of Benford Law, the probability that the k numbers so generated will have the largest product should be the least. In other words, numbers generated a random process which satisfies the conditions of Benford Law are least likely give the maximum product.

Theorem 1. For any give k and a set of single digits numbers A, not necessarily distinct, the numbers generated by a Benford process, using all the elements of A, is least likely to have the maximum product.

Proof: The proof of this theorem follows trivially form the following lemma.

Lemma 1Let A=\{d_1, d_2, ... , d_n\} be a set of n digits in base 10 notation, not necessarily distinct, such that 0\le d_i \le d_{i+1}\le 9. If a set M_k (A) of k elements is formed using only and all the elements of the set A then the elements of M_k (A) are given by

a_j = d_{n-k+j}d_{n-2k+j}...d_{n-rk+j}

where n-rk+j \ge 1, j=1, 2, ... , k.

Thus larger a digit, the more significant position it occupies. According to the General Benford Law, the probability of a number starting with a larger digit in the most significant place is lower than that of number starting with a smaller number in the most significant place. Hence the theorem follows.

Asymptotic and partial equidistribution modulo one

1. Introduction

A sequence of real numbers a_n is equidistributed on a given interval if the probability of finding a_n in any subinterval is proportional to the subinterval length. In particular, if \{a_n\} denotes the fractional part of a_n by then a sequence a_n is said to be equidistributed modulo 1 or uniformly distributed modulo 1 if , for all 0 \le b < c \le 1,

\displaystyle{\lim_{N \rightarrow \infty}\frac{\#\{r \le N : b < \{a_n\} \le c\}}{N} = c - b.}

Often problems related to equidistribution modulo 1 can be established using a powerful theorem called Weyl’s criterion which gives the necessary and sufficient conditions for a sequence to be equidistributed mod 1. Weyl’s criterion can be stated in the following equivalent forms.

Theorem 1. (Weyl’s Criterion) Let a_n be a sequence of real numbers. The following are equivalent:

  1. The sequence a_n is equidistributed mod 1.
  2. For each nonzero integer k, we have \displaystyle{\lim_{n \rightarrow \infty}\frac{1}{n}\sum_{r<n}e^{2\pi i k a_r } = 0}.
  3. For each Riemann-integrable f:[0,1] \rightarrow \mathbb{C} we have

\displaystyle{\lim_{n \rightarrow \infty}\frac{1}{n}\sum_{r \le n}f(\{a_r\}) = \int_{0}^{1} f(x)dx}.

2. Asymptotic equidistribution mod 1

Lemma 2.1. If u_n is a bounded sequence of real numbers such that for every fixed \epsilon, 0 < \epsilon \le 1\lim_{n \rightarrow \infty}u_{[n\epsilon]} = \epsilon then u_n is equidistributed mod 1.

Example. Taking u_r = r/n in Lemma 2.1, we have

\displaystyle{\lim_{n \rightarrow \infty}\frac{1}{n}\sum_{r \le n}f\Big(\frac{r}{n}\Big) = \int_{0}^{1} f(x)dx}.

Recall that the above example, also known as the rectangle formula, is a standard method in elementary calculus for evaluating the limit of a sum using definite integral. Since the rectangle formula holds for all functions f Riemann integrable in (0,1) it implies that the sequence r/n approaches equidistribution mod 1 as n \rightarrow \infty. This leads us to the concept of asymptotic equidistribution mod 1 which is defined as follows.

Definition 2.2. A sequence of positive real numbers s_n is said to be asymptotically equidistributed modulo one if s_n is increasing and the sequence of ratios


approach uniform distribution modulo one as n \rightarrow \infty.

Theorem 2.3. The sequence s_n = a n(\ln (b n))^{c}, where a, b and c are positive constants, is asymptotically equidistributed mod 1 and therefore by Weyl’s Criterion we have

\displaystyle{\lim_{n \rightarrow \infty}\frac{1}{n}\sum_{r \le n}f\Big(\frac{s_r}{s_n}\Big) = \int_{0}^{1} f(x)dx}.

Thus the sequence of natural numbers, the sequence of primes p_n, the sequence of composite numbers c_n and the sequence of the imaginary parts of the non-trivial zeroes of the Riemann zeta function \gamma _n are asymptotically equidistributed mod 1.


\displaystyle{\lim_{n \rightarrow \infty} \frac{1}{n} \sum_{r \le n}\frac{p_n ^{2}}{p_n ^{2}+ p_r ^{2}}=\lim_{n \rightarrow \infty} \frac{1}{n} \sum_{r \le n}\frac{\gamma _n ^{2}}{\gamma _n ^{2}+ \gamma _r ^{2}} = \frac{\pi}{4}.}

Trivially if a_n equidistributed modulo 1 then a_n is also asymptotically equidistributed modulo one. The converse of this however is highly non trivial and to appreciate this fact we shall look at three celebrated results of analytic number theory in the field of equidistribution mod 1.

Let \alpha be any irrational number and \beta be any real number. We have

  1. H. Wely: The sequence \alpha n is equidistributed mod 1.
  2. I. Vinogradov: The sequence \alpha p_n is equidistributed mod 1
  3. E. Hlawka: The sequence \beta \gamma_n is equidistributed mod 1.

This three results would follow as special cases if the following conjecture on asymptotic equidistribution were true.

Conjecture 2.4. If a_n is also asymptotically equidistributed modulo 1 and \alpha is any irrational number then the sequence \alpha a_n is equidistributed modulo 1.

A proof or disproof of this conjecture has eluded me so far.

3. Partial equidistribution modulo 1

Consider any the set S_0 = {a_1, a_2, ... , a_n} where a_n is any sequence equidistributed modulo 1. Now from the set S_0 let us remove all elements  that satisfy the condition 0.2 \le a_i < 0.3 to obtain a new set S_1. Clearly no element of S_1 will have the digit 2 in the first place after the decimal point and therefore the set S_1 no longer equidistributed modulo 1 because no element of S_0 will lie in the interval (0.2,0.3[. In general we can remove from the set S_0 all the elements which begins with a given sequence of digits after the decimal point. The remaining elements of S_1 will form a partially equidistributed sequence.We call the set S_1 to be partially equidistributed modulo 1 because as we shall see below that the set S_1 shows properties analogous to the set S_0 which is equidistributed modulo 1.

If numbers are represented in decimal nation, the first digit after the decimal point can have 10 different values, the first two digits after the decimal point can have 100 different values and in general, if we consider the first m digits after the decimal point, we can have 10^m different possibilities.  From these 10^m different possibilities let us form a sequence b_n partially equidistributed modulo 1 such that the first m digits of b_r after the decimal point can take k different values from the set D=\{d_1, d_2, ... , d_k\}.

Theorem 3.1. If b_r is partially equidistributed modulo 1 and D=\{d_1, d_2, ... , d_k\} is the set of allowed values of the first m digits after the decimal point then

\displaystyle{\lim_{n \rightarrow \infty}\frac{1}{n}\sum_{r=1}^{n}f(b_r) = \frac{10^m}{k} \sum_{d \in D}\int_{\frac{d}{10^m}}^{\frac{d+1}{10^m}}f(x)dx}.

Theorem 3.1 is a generalization of (3) of Theorem 1. If k=10^m then set D will contain all possible combinations of m digits. Hence b_r will be equidistributed modulo 1 and RHS of Theorem 3.1 reduces to

\displaystyle{\int_{0}^{1} f(x)dx}.

To be contd.

Analogues of the van der Corput’s sequence

van der Corput sequence

A van der Corput sequence is a low-discrepancy sequence over the unit interval first published in 1935 by the Dutch mathematician J. G. van der Corput. It is constructed by reversing the base b representation of the sequence of natural numbers. For example, the decimal van der Corput sequence begins:

0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.01, 0.11, \ldots

van der Corput proved the following theorem:

Theorem 1: The sequence v(n) is equidistributed modulo one.

As a corollary of the above theorem, it follows from the property of sequence equidistributed modulo one that the mean value of v(n) approaches 0.5 as n tends to infinity i.e.

\displaystyle{\lim_{n \rightarrow \infty}\frac{1}{n}\sum_{r=1}^{n}v(r) = \frac{1}{2}.}

Generalized van der Corput sequence

In the same spirit, we can form van der Corput like sequence like by reversing the base b representation of the sequence of primes or the sequence of composite numbers or any other sequence and study the properties of these analogues of the van der Corput sequence. We define a generalization of the van der Corput type sequences as follows:

Definition: Let a_n be the base b representation of a sequence of positive reals defined for n \ge 1. We define v(a_n) as the sequence of numbers formed by placing a decimal point, and writing all the digits in the base b representation of the integer part of a_n in the reverse order after the decimal point.

Example: v(p_n) is the sequence

0.2, 0.3, 0.5, 0.7, 0.11, 0.31, 0.71, 0.91, 0.32, 0.92, \ldots.

Since primes other than 2 do not end in an even number in the unit’s place so clearly the sequence v(p_n) is clearly not equidistributed mod 1. In this case what will the limiting value of the mean value of v(p_n)? In general we want to know how for a give a_n, the sequence v(a_n) will behave. Before we proceed with this, we will first need to little bit about normal numbers.

Normal numbers.

A number is said to be simply normal to base b if its base b expansion has each digit appearing with average frequency tending to b^{-1}.

Champernowne’s number 0.123456789101112...

obtained by concatenating the decimal representations of the natural numbers in order, is normal in base 10, but it might not be normal in some other bases.

Definition: Let a_n be a sequence of natural number in b. We define by C_b (a_n) as the number formed by placing a decimal point and the concatenating the digits of the b representation of a_n.

Example: For the sequence of prime numbers p_n,

C_{10} (p_n) = 0.23571113171923...,

which is also known as the Copeland–Erdős constant.

In this regard, we have the following result.

Theorem 2. If a_n is a sequence of natural numbers in base 10 such that

  1. the unit’s digit of a_n can take any value form the set D=\{d_1, d_2, ... , d_k\} where each d_i \in \{0, 1, 2, ... , 9\}.
  2. C_{10} (a_n) is normal in base 10


\displaystyle{\lim_{n \rightarrow \infty}\frac{1}{n}\sum_{r=1}^{n}v(a_r) = \frac{1}{20} + \frac{1}{10k}\sum_{i=1}^{k}d_i}.

Corollary 1. Since the Copeland–Erdős constant C_{10} (p_n) is normal in base 10 and all primes greater than 5 have 1, 3, 7 or 9 in the unit’s place,

\displaystyle{\lim_{n \rightarrow \infty}\frac{1}{n}\sum_{r=1}^{n}v(p_r)=\frac{11}{20}.}

Corollary 2.

\displaystyle{\lim_{n \rightarrow \infty}\frac{1}{n}\sum_{r=1}^{n}v(p_r)=\lim_{n \rightarrow \infty}\frac{1}{n}\sum_{r=1}^{n}v(2r+1)=\frac{11}{20}}.

The above theorem is a special case of a more general result on partially equidistributed sequences about which I shal write in my next post. Interested readers can refer to my paper ‘Contributions to equidistribution theory.’

A mathematician’s mind-2

Link to Chapter 1.



Section 2.1. Method of Decomposing N

As the name of the chapter say, here I will write about how to see things that are not visible. This is best explained by the following example. Recall Lemma 1.2 from Chapter 1 which states that

Lemma 1.2. If the sequence a_n is uniformly distributed modulo 1 then

\displaystyle{\lim _{n \rightarrow \infty} \frac{1}{n}\sum_{r=1}^{n} f(a_r)  = \int_{0}^{1}f(x)dx. }

Now let us rewrite the above result as an asymptotic formula

\displaystyle{\sum_{r=1}^{n} 1. f(a_r) \sim  n \int_{0}^{1}f(x)dx. }

This form looks pretty much normal and fails to reveal any hidden pattern. But if we break the above asymptotic form one step further as shown below;

\displaystyle{\sum_{r=1}^{n} 1. f(a_r) \sim  \sum_{r=1}^{n}1. \int_{0}^{1}f(x)dx }

then it begins to look more promising and starts revealing inner hidden patterns. Such a pattern should should immediately ignite a mathematical mind to ask oneself to ask how would the formula look if I replace the coefficient of f(a_r); which is currently 1; by a general term say d_r. Once we reach this question, we know that we are on the right track; we have found possible clues to a hidden pattern that could lead to a missing mathematical formula.

At this point we have no idea how the actual formula; if at all it exists; would look like. So the best approach at this point would be to a reasonable guess based on intuition and logic. In the above example, it is not hard to see that a reasonable guess should be

\displaystyle{\sum_{r=1}^{n} d_r f(a_r) \sim  \sum_{r=1}^{n}d_r \int_{0}^{1}f(x)dx. }

If it happens that more than one guess seems reasonable, take all of them for consideration for the time begin. Later we can eliminate the incorrect models in the verification step. A good mathematician should be a great guesser. If you can make ten good guess, probably one is right and you have found something to work on. Ramanujan himself relied very strongly on his intuition and made discoveries that probably would never have been made had he not possessed this ability.

Section 2.2. More on the method of decomposing N

Let us see one more example of the method of decomposing N this time slightly more advanced then in our previous example. The following result is well known.

Lemma 2.1If f is monotonic and continuous and defined in [1; n] then

\displaystyle{\sum_{r=1}^{n}f(n) = \int_{1}^{n}f(x)dx + O(|f(n)| + |f(1)|.}

Take a good look at this result and try to find out if there is any possibility of improvisation. Observe that f(n) and the integral on the RHS both contain an n. Let us decompose this as the sum of n 1’s. We write

\displaystyle{\sum_{r=1}^{n} 1.f(1+...+1) = \int_{1}^{1+...+1}f(x)dx + O(|f(1+...+1)| + |f(1)|).}

Now lets us work with our intuition and guess. Let d_r be a sequence of positive reals and let \sum_{k=1}^{r} d_k = S_r. The above decomposition of n should at one point or another lead us to guess that there could be a result which would look like

\displaystyle{\sum_{r=1}^{n} d_r f(S_r) = \int_{S_1}^{S_n}f(x)dx + O(|f(S_n)| + |f(S_1)|).}

Later we shall see that we have guessed the integral on the RHS is correctly while the error term is slightly incorrect. Nonetheless we have made a good guess based on which we can begin working.

Section 2.3. Verify your formula numerically

Now that our observation and intuition have led us to guess some new mathematical results, we should verify or test our guesses with a few numerical examples. The reason for doing this is that in case of explicit formulas like in our examples, it is advisable to gain some confidence in our guesses before we attempt to prove them rigorously. Rigorous proof are usually harder and more time consuming and we want attempt a proof only when we have sufficient confidence in our guesses. Remember that while verification for any number of cases cases does not prove a formula, one counter example is sufficient to disprove the formula.

I verified my guesses using simple d_r = 1/r, r^2 and f(a_r) = 1/(1+a_r ^2) where a_r is the sequence of the fractional part of r\sqrt 2 which is uniformly distributed modulo one as shown my the Equidistribution Theorem. In the age of computers we can verify a formula easily by writing a program (PARI, Mathematica or even using MS Excel).

Section 2.4. Prove your formula rigorously

After successful verification of the formula, we can begin to construct a rigorous mathematical proof. Yes there is absolutely no guarantee that we will indeed find a proof. But there is bright side of this. The harder it is to prove something, the more likely it is that we have stumbled upon something important. If we are unable to find a rigorous proof but our formula passes all verification tests thenwe can conjecture the result and hope that someone else finds a rigorous proof of disproof of our result. This is how mathematicians usually come up with conjectures. Perhaps the most example of this method is the Prime Number Theorem. Based on numerical observation, young Gauss conjectured that the number of primes not exceeding x is approximately equal to Li(x). Almost a century later, this conjecture was proved true by Hadamard and Charles de la Vallee Poussin.

Continuing with our example, after verifying the formulas, I became more confident that the formulas are indeed true and I began looking for a mathematical proof. This stage is the most important, not only because we might end up proving a new result but also because we may encounter many unforeseen cases that needs to be taken care of, especially the boundary conditions, error terms, exceptions etc. For example when I was working on the proof of the first formula (or first guess), I found that the result is not true if d_r = \sin (r), 1/r^2 etc. It was then then I realized that the series \sum_{r=1}^{\infty}d_r should be divergent for the first formula. Similarly in the second formula, I had guessed the error term incorrectly and it was only during the proof stage that I could find the correct error term. Such intricate details might escape the verification stage and unless we are careful in the proof stage, our formula would be incomplete. Finally I could prove the following results rigorously; the details of which can be found in the paper ‘On a unified theory of numbers‘.

Lemma 2.2. If d_r is a divergent series of positive terms and a_r is uniformly distributed modulo 1 then

\displaystyle{\sum_{r=1}^{n} d_r f(a_r) \sim  \sum_{r=1}^{n}d_r \int_{0}^{1}f(x)dx. }

Notice that with Lemma 2.1, we have generalized Lemma 1.2.

Lemma 2.3If f is monotonic and continuous and defined in [1; n] and d_r is a series of positive terms such that \sum_{k=1}^{r} d_k = S_r; then

\displaystyle{\sum_{r=1}^{n} d_r f(S_r) = \int_{S_1}^{S_n}f(x)dx + O(\delta (n))}


\displaystyle{\delta (n) = d_n |f(S_n)| + d_1 |f(S_1)| + \int _{1}^{n} \frac{d(d_y)}{dy} S(y)dy.}


\displaystyle{\lim_{n \rightarrow \infty} \sum_{r < n} \frac{d_r}{S_n\sqrt{\ln S_n - \ln S_r}} = \sqrt{\pi}}.

\displaystyle{\sum_{r \le n}{p_r}^{a}({p_1}^{a}+{p_2}^{a} + \ldots + {p_r}^{a})^{b}\sim \frac{n^{b+1} p_n^{ab+a}}{(b+1)(a+1)^{b+1}}}.

Notes at the end of Chapter 2

In this chapter we saw how breaking numbers using the method of decomposition throws light on how well know results in mathematics could be improved. Writing n as \sum_{r=1}^{n} 1 is an explicit example of this. This does not mean that this is the only trick. What the reader should grasp here is the idea of decomposition. A mathematical explorer should always try to visualize results by improvising and by breaking the results into simpler forms in one or more ways.

A mathematician’s mind-1

I have always been thinking how mathematicians make discoveries. They come up with beautiful theorem and formulas that were unknown to mankind an hour ago until they discovered it first. Unlike the poets and writers, mathematicians cannot cook some stories or imagine some beautiful verses in the mind. So how does it work? When you are solving a problem you know where to begin, you know where to end and you will mostly have the tools that will help you solve the problem. Now let assume that you teacher enters to your class and instead of giving you a problem to solve from your course, asks you to discover something that you previously did not know. Now you don’t know where to begin where to end where to look for something new. So how do you go about making a mathematical discovery?

I am writing a series of article in this blog where I shall describe the thought process that I underwent as I made some interesting discoveries in analytical number theory. The purpose of this article is not to make a better problem solver but rather to encourage one to think like a mathematician who is on a journey of mathematical exploration and discovery. Therefore I shall not go into the precise technical details of the proofs of the results in this article but rather I shall give the sketch of the main ideas that led to these results and describe how one thought led to another and eventually culminated in a small theory comprising of many little discoveries along the way. Anyone interested in deriving the actual proof can easily reconstruct the proof from the central idea and thought process as well the descriptive details given for each result. The purpose of this article is to encourage the mathematical thought process that lead to new discoveries. In case there are any terminologies which I feel might be unfamiliar to my readers, I have provided hyperlink to a Wikipedia page on the same.




Many discoveries begin with an initial observation. For example thousands of years ago somebody might have observed that sum of the square of the sides of a right triangle equals the square of the hypotenuse. This observation stood the test of repeated experiments and only after that it would have inspired some ancient genius to establish this observation as a theorem by finding a rigorous proof. Similarly someone would have observed that the ratio of the circumference of a circle to its diameter remains constant and this eventually led to the discovery of \pi. A keen observation power is therefore one of the prerequisites of making a mathematical discovery.

Many years ago when I first learned definite integrals as a school student, I developed an affinity for one particular theorem which allowed us to evaluate the limits of a sum using definite integrals. The formula was

Lemma 1. If f(x) is Riemann integrable in [0,1] then

\displaystyle{\lim_{n \rightarrow \infty}\frac{1}{n}\sum_{r=1}^{n}f\Big (\frac{r}{n}\Big ) = \int_{0}^{1}f(x)dx}.

Three years ago I came across a type of sequence called equidistributed sequences or uniformly distributed sequences. There I found the following result theorem that had striking resemblance to Lemma 1.

Lemma 2. If the sequence a_n is uniformly distributed modulo 1 then

\displaystyle{\lim_{n \rightarrow \infty}\frac{1}{n}\sum_{r=1}^{n}f(a_r) = \int_{0}^{1}f(x)dx}.

As soon as I saw this result on uniformly distributed sequences modulo 1, I knew that I had seen something like it before; the result in Lemma 1. Replacing r/n in Lemma 1 by a_n we get Lemma 2. The similarity between Lemma 1 and Lemma 2 should automatically make us wonder why they are so similar. So at this point what are the options that we have? Either it is a coincidence or there is some hidden mathematical relation that both these results are following.  Since we are like explorers searching to make a discovery we would want to believe that this similarity is not a coincidence but because of some hidden mathematical relation that we seek to unearth.


Once we have found an observation to work on the next step is to find a mathematical explanation for the observation. If we can find an explanation within the existing mathematics then our observation is a new theorem or an extension or a generalization of one of the areas of existing mathematics. However if no known mathematics can give a satisfactory explanation of our observation then we will have to invent new mathematics all together.

Continuing with our example, what are the possible thoughts that can strike us when we try to explain the similarity between Lemma 1 and Lemma 2? The possible hypothesis are:

  • Lemma 1 and Lemma 2 are two different representations of the same phenomenon.
  • The sequence 1/n, 2/n, … , n/n approaches uniform distribution modulo 1 as n \rightarrow \infty.

To test our hypothesis we go back to the definition of uniform distribution modulo 1 and test if this definition applies to the sequence 1/n, 2/n, … , n/n as n \rightarrow \infty and we find that this actually is the case and our hypothesis is true.


So we have discovered that the sequence 1/n, 2/n, … , n/n approaches uniform distribution modulo 1 as n \rightarrow \infty. We shall use this starting point and a guiding star in our exploration. We now know that the theme of our exploration is going to be centered around sequences uniformly distributed modulo 1. The next logical questions that strikes (or rather should strike) our mind are:

  • Yes the sequence r/n uniform distribution modulo 1 but what about the sequence r^2/n^2 or in general r^a/n^a?
  • This will lead us to formulate the question in the most general form, when is the sequence b_r/b_n uniformly distributed modulo 1?

Again applying the definition of uniform distribution modulo 1, we can easily prove the following result. Let [x] denote the greatest integer function. We have

Lemma 3. As n \rightarrow \infty the sequence b_r/b_n,  r=1, 2, ..., n approaches uniform distribution modulo 1 if

\displaystyle{\lim_{n \rightarrow \infty}\frac{b_{[nt]}}{b_n} = t}.

for every t, 0<t<1.


So far so good but now we need show that our theory is valuable by showing its applications. We want to find the sequence that satisfy Lemma 3. It is easy to see that Lemma 3 implies b_n = o(n^{1+\epsilon}) for every \epsilon >0. Hence b_n grows much slower than any quadratic polynomial function of n.  At this point we sit an recall all well know sequences that grow at this  rate.

We can find many such sequences but one particular sequence that we cannot afford to miss out is the sequence of prime numbers. Recall that the prime number theorem implies that p_n \sim n\ln n. We find that the sequence of primes satisfy Lemma 3. Similarly we can show that the sequence of composite numbers denoted by c_n also satisfy Lemma 3.

We have found a lot of interesting insights so far. But can we be more adventurous? Notice that if two or more sequence satisfy Lemma 3 then their linear combination will also satisfy Lemma 3. So we can put together all our results so far in the form a beautiful and elegant theorem.

Theorem 1. Let p_n be the n-th prime, c_n be the n-th composite number and let \alpha, \beta and \gamma be any three constants such that \alpha p_n + \beta c_n + \gamma n \ne 0 then

\displaystyle{\lim_{n \rightarrow \infty}\frac{1}{n} \sum_{r \le n} f \Big(\frac{\alpha p_r + \beta c_r + \gamma r}{\alpha p_n + \beta c_n + \gamma n}\Big) = \int_{0}^{1}f(x)dx}.

Thus we see that starting with basic theorem on definite integrals and a fundamental theorem on uniform distribution modulo one, we by properly directing the thought process have ended up with a non-trivial theorem connecting prime numbers, composite numbers and natural numbers. Now we can let our imaginations run wild and we can use our theorem to derive many interesting results on prime or composites. For example

Example 1.

\displaystyle{\lim_{n \rightarrow \infty} \frac{1}{n} \sum_{r \le n}\frac{p_n ^{2}}{p_n ^{2}+ p_r ^{2}}=\lim_{n \rightarrow \infty} \frac{1}{n} \sum_{r \le n}\frac{c_n ^{2}}{c_n ^{2}+ c_r ^{2}}=\lim_{n \rightarrow \infty} \frac{1}{n} \sum_{r \le n}\frac{n ^{2}}{n ^{2}+ r ^{2}} = \frac{\pi}{4}}.

Example 2.

\displaystyle{\lim_{x \rightarrow \infty} \frac{1}{\pi (x)} \sum_{p \le x}\Big(\frac{p}{x}\Big)^{a-1} \ln^{b-1} \Big(\frac{cx}{p}\Big)= \frac{c^a}{a^b}\Gamma (b, a\ln c)}.

To read the second part of this post, click A mathematician’s mind-2.


In this chapter we saw how observing similarities between two different results of mathematics can reveal that both these results are special cases of a larger family of relations provides an ideal platform for beginning a mathematical exploration. In the second post on this topic in Chapter 2, we shall continue our exploration forward from here develop our theory further by extending our results to divergent series. The topic in this series of posts are based on my paper ‘On a unified theory of numbers‘ which I have co-authored with Prof. Marek Wolf, Institute of Theoritical Physics, Warsaw, Poland. Currently we are revising and re-formating our paper. If you spot any mathematical or typographical error please do let us know. Also any suggestion or comments to improve this blog are welcome.