Analytic number theory is so abysmally complex that we need a basic toolkit of summation formulas first in order to prove some of the most basic theorems of the theory.
Note: We need the Riemann integrability to be able to apply the fundamental theorem of calculus.
Proof 1 :
We prove the theorem by induction on {\displaystyle \lfloor x\rfloor }.
1. {\displaystyle \lfloor x\rfloor =1}:
First, we have in this case {\displaystyle \sum _{1\leq n\leq x}a_{n}f(n)=a_{1}f(1)}.
Then, we have {\displaystyle A(x)f(x)=a_{1}f(x)=a_{1}f(1)-a_{1}(f(1)-f(x))=a_{1}f(1)+\int _{1}^{x}A(y)f'(y)dy} by the fundamental theorem of calculus.
2. Induction step:
Define {\displaystyle N:=\lfloor x\rfloor }. We have
{\displaystyle {\begin{aligned}\sum _{1\leq n\leq x}a_{n}f(n)&=\sum _{1\leq n\leq x-1}a_{n}f(n)+a_{N}f(N)\\&=A(x-1)f(x-1)-\int _{1}^{x-1}A(y)f'(y)dy+a_{N}f(N)\end{aligned}}}
by the induction hypothesis. Further,
{\displaystyle {\begin{aligned}-\int _{1}^{x-1}A(y)f'(y)dy&=-\int _{1}^{x}A(y)f'(y)dy+\int _{x-1}^{x}A(y)f'(y)dy\\&=-\int _{1}^{x}A(y)f'(y)dy+\int _{x-1}^{N}A(y)f'(y)dy+\int _{N}^{x}A(y)f'(y)dy\\&=-\int _{1}^{x}A(y)f'(y)dy+A(N-1)\int _{x-1}^{N}f'(y)dy+A(N)\int _{N}^{x}f'(y)dy\\&=-\int _{1}^{x}A(y)f'(y)dy+A(N-1)(f(N)-f(x-1))+A(N)(f(x)-f(N))\end{aligned}}}
.
Putting things together, we obtain
{\displaystyle \sum _{1\leq n\leq x}a_{n}f(n)=A(x-1)f(x-1)-\int _{1}^{x}A(y)f'(y)dy+A(N-1)(f(N)-f(x-1))+A(N)(f(x)-f(N))+a_{N}f(N)}
and thus the desired formula.
The method of proof applied here was induction, followed by expressing the terms appearing in the induction hypothesis through the terms appearing in the desired formula.
{\displaystyle \Box }
Proof 2 :
We prove the theorem by direct manipulation of the term on the left.
Define {\displaystyle k:=\lfloor x\rfloor }.
{\displaystyle {\begin{aligned}\sum _{1\leq n\leq x}a_{n}f(n)&=\sum _{1\leq n\leq x}(A(n)-A(n-1))f(n)\\&=\sum _{1\leq n\leq x}A(n)f(n)-\sum _{0\leq n\leq x-1}A(n)f(n+1)=\sum _{1\leq n\leq x}A(n)f(n)-\sum _{1\leq n\leq x-1}A(n)f(n+1)\\&=\sum _{1\leq n\leq x-1}A(n)(f(n)-f(n+1))+A(k)f(k)\\&=\sum _{1\leq n\leq x-1}A(n)\left(-\int _{n}^{n+1}f'(t)dt\right)+A(x)f(x)-\int _{k}^{x}A(t)f'(t)dt\end{aligned}}}
{\displaystyle \Box }
Proof 3 :
We prove the formula by means of the Riemann-Stieltjes integral. Indeed, by integration by parts, we have
{\displaystyle {\begin{aligned}\sum _{1\leq n\leq x}a_{n}f(n)=\int _{0}^{x}f(t)dA(t)&=A(x)f(x)-\int _{0}^{x}A(t)df(t)\\&=A(x)f(x)-\int _{1}^{x}A(t)f'(t)dt\end{aligned}}}
.
{\displaystyle \Box }
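Theorem 1.1 (Abelian summation) can also be sanity-checked numerically. The sketch below is ours, not the book's: it takes {\displaystyle a_{n}=n} and {\displaystyle f(t)=1/t} and approximates the integral {\displaystyle \int _{1}^{x}A(y)f'(y)dy} with a simple midpoint rule.

```python
import math

def A(t, a):
    # partial sums A(t) = sum of a(n) over 1 <= n <= t
    return sum(a(n) for n in range(1, math.floor(t) + 1))

def abel_rhs(x, a, f, fprime, steps=100000):
    # A(x) f(x) - integral_1^x A(y) f'(y) dy, midpoint rule for the integral
    h = (x - 1.0) / steps
    integral = h * sum(A(1.0 + (i + 0.5) * h, a) * fprime(1.0 + (i + 0.5) * h)
                       for i in range(steps))
    return A(x, a) * f(x) - integral

a = lambda n: n
f = lambda t: 1.0 / t
fprime = lambda t: -1.0 / t ** 2

x = 10.7
lhs = sum(a(n) * f(n) for n in range(1, math.floor(x) + 1))  # sum_{1<=n<=x} a_n f(n)
rhs = abel_rhs(x, a, f, fprime)
print(lhs, rhs)  # the two numbers agree up to the quadrature error
```

Here the left-hand side is exactly 10, and the quadrature error comes only from the jumps of the step function {\displaystyle A}.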
Corollary 1.2 :
{\displaystyle \sum _{y<n\leq x}a_{n}f(n)=A(x)f(x)-A(y)f(y)-\int _{y}^{x}A(t)f'(t)dt}
.
Proof 1 :
We deduce the formula from integration by parts for the Riemann-Stieltjes integral.
{\displaystyle {\begin{aligned}\sum _{y<n\leq x}a_{n}f(n)=\int _{y}^{x}f(t)dA(t)&=f(x)A(x)-f(y)A(y)-\int _{y}^{x}A(t)df(t)\\&=f(x)A(x)-f(y)A(y)-\int _{y}^{x}A(t)f'(t)dt\end{aligned}}}
{\displaystyle \Box }
Proof 2 :
We directly manipulate the LHS (left hand side).
Define {\displaystyle k:=\lfloor x\rfloor } and {\displaystyle m:=\lfloor y\rfloor }.
{\displaystyle {\begin{aligned}\sum _{y<n\leq x}a_{n}f(n)&=\sum _{y<n\leq x}(A(n)-A(n-1))f(n)\\&=\sum _{y<n\leq x}A(n)f(n)-\sum _{y-1<n\leq x-1}A(n)f(n+1)\\&=\sum _{y<n\leq x-1}A(n)(f(n)-f(n+1))-A(m)f(m+1)+A(k)f(k)\\&=\sum _{y<n\leq x-1}A(n)\left(-\int _{n}^{n+1}f'(t)dt\right)-A(y)f(y)+A(x)f(x)-\int _{y}^{m+1}A(t)f'(t)dt-\int _{k}^{x}A(t)f'(t)dt\\&=A(x)f(x)-A(y)f(y)-\int _{y}^{x}A(t)f'(t)dt\end{aligned}}}
{\displaystyle \Box }
Two further proofs are given in exercises 1.1.1 and 1.1.5.
We note that induction and direct manipulation give the quicker proofs of theorem 1.1, while corollary 1.2 is proven more quickly from theorem 1.1 or by Riemann-Stieltjes integration.
Exercise 1.1.1 : Prove corollary 1.2 from theorem 1.1. Hint:
{\displaystyle \sum _{y<n\leq x}a_{n}f(n)=\sum _{1\leq n\leq x}a_{n}f(n)-\sum _{1\leq n\leq y}a_{n}f(n)}.
Exercise 1.1.2 : Compute {\displaystyle \sum _{n=1}^{N}n^{3}}. Hint: Use {\displaystyle a_{n}=n}, {\displaystyle f(x)=x^{2}}, apply Abelian summation and split the resulting integral into pieces where {\displaystyle A(t)} is constant. Then apply a similar process.
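The result of exercise 1.1.2 can be checked against the classical closed form {\displaystyle \left({\frac {N(N+1)}{2}}\right)^{2}}; the short script below (ours, not part of the exercise) compares it with direct summation.

```python
def sum_cubes(N):
    # direct summation of n^3 for n = 1..N
    return sum(n ** 3 for n in range(1, N + 1))

for N in (1, 5, 10, 100):
    closed_form = (N * (N + 1) // 2) ** 2  # the value Abelian summation should produce
    assert sum_cubes(N) == closed_form
print(sum_cubes(10))  # → 3025
```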
Exercise 1.1.3 : Prove that the limit {\displaystyle \lim _{n\to \infty }\left(-\log(n)+\sum _{k=1}^{n}{\frac {1}{k}}\right)} exists. This limit is called the Euler–Mascheroni constant . Hint: Use {\displaystyle a_{n}=1} and {\displaystyle f(x)={\frac {1}{x}}}.
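The convergence in exercise 1.1.3 can be observed numerically; the sketch below (ours) prints the partial expressions, which decrease toward {\displaystyle \gamma \approx 0.5772}.

```python
import math

def gamma_partial(n):
    # -log(n) + sum_{k=1}^n 1/k
    return -math.log(n) + sum(1.0 / k for k in range(1, n + 1))

for n in (10, 100, 10000, 1000000):
    print(n, gamma_partial(n))
# the printed values tend (from above) to the Euler-Mascheroni constant 0.57721...
```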
Exercise 1.1.4 : Prove theorem 1.1 from corollary 1.2.
Exercise 1.1.5 : Prove corollary 1.2 using induction on {\displaystyle \lfloor x\rfloor }.
Definition 1.3 :
For {\displaystyle x\in \mathbb {R} }, we define {\displaystyle \{x\}:=x-\lfloor x\rfloor }.
Theorem 1.4 (Euler's summation formula) :
Let {\displaystyle f:\mathbb {R} \to \mathbb {R} } be a differentiable function such that {\displaystyle f'} is Riemann integrable. Then
{\displaystyle \sum _{y<n\leq x}f(n)=-\{x\}f(x)+\{y\}f(y)+\int _{y}^{x}f(t)dt+\int _{y}^{x}\{t\}f'(t)dt}
.
Proof :
We prove the theorem from Corollary 1.2, setting {\displaystyle a_{n}=1} and using integration by parts (integration by parts is proven using the fundamental theorem of calculus).
Indeed,
{\displaystyle {\begin{aligned}\sum _{y<n\leq x}f(n)&=\lfloor x\rfloor f(x)-\lfloor y\rfloor f(y)-\int _{y}^{x}\lfloor t\rfloor f'(t)dt\\&=\lfloor x\rfloor f(x)-\lfloor y\rfloor f(y)-\int _{y}^{x}tf'(t)dt+\int _{y}^{x}\{t\}f'(t)dt\\&=-\{x\}f(x)+\{y\}f(y)+\int _{y}^{x}f(t)dt+\int _{y}^{x}\{t\}f'(t)dt\end{aligned}}}
,
where in the last line we used integration by parts on the integral
{\displaystyle \int _{y}^{x}tf'(t)dt}
.
{\displaystyle \Box }
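As a sanity check of Euler's summation formula, the sketch below (ours) compares both sides for {\displaystyle f(t)=1/t}; the two integrals are approximated by a midpoint rule. Note that the boundary terms enter as {\displaystyle -\{x\}f(x)+\{y\}f(y)}, which is the sign convention under which the identity checks out numerically (this is easily verified by hand with {\displaystyle f(t)=t}, {\displaystyle y=0.5}, {\displaystyle x=1.5}).

```python
import math

def frac(t):
    # fractional part {t} = t - floor(t)
    return t - math.floor(t)

def euler_rhs(y, x, f, fprime, steps=200000):
    # integral_y^x f(t) dt + integral_y^x {t} f'(t) dt - {x} f(x) + {y} f(y)
    h = (x - y) / steps
    mids = [y + (i + 0.5) * h for i in range(steps)]
    i1 = h * sum(f(t) for t in mids)
    i2 = h * sum(frac(t) * fprime(t) for t in mids)
    return i1 + i2 - frac(x) * f(x) + frac(y) * f(y)

f = lambda t: 1.0 / t
fprime = lambda t: -1.0 / t ** 2
y, x = 1.3, 9.6
lhs = sum(f(n) for n in range(math.floor(y) + 1, math.floor(x) + 1))  # sum_{y<n<=x} f(n)
rhs = euler_rhs(y, x, f, fprime)
print(lhs, rhs)  # the two numbers agree up to the quadrature error
```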
Prove corollary 1.5.
Proof 1 :
We prove the theorem by direct computation.
Proof 2 :
We prove the theorem from Euler's summation formula.
The Chebyshev ψ and ϑ functions
Proposition (the Chebyshev ψ function may be written as a sum of Chebyshev ϑ functions) :
We have the identity
{\displaystyle \psi (x)=\sum _{m=1}^{\log _{2}(x)}\vartheta (x^{1/m})}
.
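The identity can be verified by brute force for moderate {\displaystyle x}, using the standard definitions {\displaystyle \vartheta (x)=\sum _{p\leq x}\log p} and {\displaystyle \psi (x)=\sum _{p^{k}\leq x}\log p}; the sketch and its names are ours.

```python
import math

def primes_upto(n):
    # simple sieve of Eratosthenes
    sieve = [True] * (n + 1)
    sieve[0:2] = [False, False]
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i::i] = [False] * len(sieve[i * i::i])
    return [p for p in range(2, n + 1) if sieve[p]]

def theta(x):
    # Chebyshev theta: sum of log p over primes p <= x
    return sum(math.log(p) for p in primes_upto(math.floor(x)))

def psi(x):
    # Chebyshev psi: sum of log p over prime powers p^k <= x
    total = 0.0
    for p in primes_upto(math.floor(x)):
        pk = p
        while pk <= x:
            total += math.log(p)
            pk *= p
    return total

x = 1000
rhs = sum(theta(x ** (1.0 / m)) for m in range(1, math.floor(math.log2(x)) + 1))
print(psi(x), rhs)  # both sides of the identity
```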
Proposition (estimate of the distance between the Chebyshev ψ and ϑ functions) :
Whenever {\displaystyle x\geq 3,594,641}, we have
{\displaystyle \psi (x)-\vartheta (x)={\sqrt {x}}+O\left({\frac {\sqrt {x}}{\ln(x)^{2}}}\right)}
.
Note: The current proof gives an inferior error term; a subsequent version will remedy this. (Given the Riemann hypothesis, the error term can be made even smaller.)
Proof: We know that the formula
{\displaystyle \psi (x)=\sum _{m=1}^{\log _{2}(x)}\vartheta (x^{1/m})}
holds. Hence,
{\displaystyle \psi (x)-\vartheta (x)=\sum _{m=2}^{\log _{2}(x)}\vartheta (x^{1/m})}
.
By a result obtained by Pierre Dusart (based upon the computational verification of the Riemann hypothesis up to a certain height), we have
{\displaystyle \left|\theta (x)-x\right|\leq {\frac {0.2}{\ln(x)^{2}}}x}
whenever {\displaystyle x\geq 3,594,641}. If {\displaystyle x} is in that range, we hence conclude
{\displaystyle \psi (x)-\vartheta (x)=\sum _{m=2}^{\log _{2}(x)}\vartheta (x^{1/m})\leq \left(1+{\frac {0.2}{\ln(x)^{2}}}\right)\sum _{m=2}^{\log _{2}(x)}x^{1/m}}
.
By Euler's summation formula, we have
{\displaystyle \sum _{m=2}^{\log _{2}(x)}x^{1/m}=\int _{2}^{\log _{2}(x)}x^{1/t}dt+\int _{2}^{\log _{2}(x)}\{t\}\left({\frac {d}{dt}}x^{1/t}\right)dt-\{\log _{2}(x)\}x^{1/\log _{2}(x)}+\{2\}x^{1/2}}
.
Certainly {\displaystyle \{2\}=0} and {\displaystyle \{\log _{2}(x)\}\leq 1}. Moreover, {\displaystyle x^{1/\log _{2}(x)}=2}. Now differentiation shows that
{\displaystyle -\exp \left({\frac {\ln(x)}{t}}\right){\frac {t^{2}}{\ln(x)}}} is an antiderivative, with respect to {\displaystyle t}, of the function {\displaystyle x^{1/t}-{\frac {2t}{\ln(x)}}x^{1/t}}. By the fundamental theorem of calculus, it follows that
{\displaystyle \int _{a}^{b}\left(1-{\frac {2t}{\ln(x)}}\right)x^{1/t}dt=\left[-\exp \left({\frac {\ln(x)}{t}}\right){\frac {t^{2}}{\ln(x)}}\right]_{t=a}^{t=b}}
for real numbers {\displaystyle a,b\in \mathbb {R} } such that {\displaystyle a<b}. This integral is not precisely the one we want to estimate. Hence, some analytical trickery will be necessary in order to obtain the estimate we want.
We start by noting that if only the bracketed term in the integral were absent, we would have the estimate we desire. In order to proceed, we replace {\displaystyle x} by the more general expression {\displaystyle xy} (where {\displaystyle y\geq 1}), and obtain
{\displaystyle \int _{a}^{b}\left(1-{\frac {2t}{\ln(xy)}}\right)x^{1/t}y^{1/t}dt=\left[-\exp \left({\frac {\ln(xy)}{t}}\right){\frac {t^{2}}{\ln(xy)}}\right]_{t=a}^{t=b}}
.
The integrand is non-negative so long as {\displaystyle t\leq {\frac {\ln(xy)}{2}}}.
Moreover, if {\displaystyle t_{0}} is strictly within that range, we obtain
{\displaystyle \int _{2}^{t_{0}}x^{1/t}y^{1/t}dt\leq \left(1-{\frac {2t_{0}}{\ln(xy)}}\right)^{-1}\int _{2}^{t_{0}}\left(1-{\frac {2t}{\ln(xy)}}\right)x^{1/t}y^{1/t}dt=\left[-\exp \left({\frac {\ln(xy)}{t}}\right){\frac {t^{2}}{\ln(xy)}}\right]_{t=2}^{t=t_{0}}}
.
We now introduce a constant {\displaystyle K\in (2,t_{0})} and obtain the integrals
{\displaystyle \int _{2}^{K}x^{1/t}y^{1/t}dt} and {\displaystyle \int _{K}^{t_{0}}x^{1/t}y^{1/t}dt}.
The first integral majorises the integral {\displaystyle y^{1/K}\int _{2}^{K}x^{1/t}dt},
whereas the second integral majorises the integral {\displaystyle \int _{K}^{t_{0}}x^{1/t}dt}.
We obtain that
{\displaystyle \int _{2}^{t_{0}}x^{1/t}dt\leq {\frac {1}{y^{1/K}}}\int _{2}^{K}x^{1/t}y_{1}^{1/t}dt+\int _{K}^{t_{0}}x^{1/t}y_{2}^{1/t}dt}
.
Now we would like to set {\displaystyle t_{0}=\log _{2}(x)}. To do so, we must ensure that {\displaystyle y} is sufficiently large so that {\displaystyle K} resp. {\displaystyle t_{0}} is strictly within the admissible interval.
The two summands on the right are now estimated using our computation above, where {\displaystyle t_{0}} is replaced by {\displaystyle K} for the first computation: Indeed,
{\displaystyle \int _{2}^{K}x^{1/t}y_{1}^{1/t}dt\leq \left(1-{\frac {2K}{\ln(xy_{1})}}\right)^{-1}\left[-\exp \left({\frac {\ln(xy_{1})}{t}}\right){\frac {t^{2}}{\ln(xy_{1})}}\right]_{t=2}^{t=K}}
and
{\displaystyle \int _{K}^{t_{0}}x^{1/t}y_{2}^{1/t}dt\leq \left(1-{\frac {2t_{0}}{\ln(xy_{2})}}\right)^{-1}\left[-\exp \left({\frac {\ln(xy_{2})}{t}}\right){\frac {t^{2}}{\ln(xy_{2})}}\right]_{t=K}^{t=t_{0}}}
.
Putting the estimates together and setting {\displaystyle t_{0}=\log _{2}(x)}, we obtain
{\displaystyle \int _{2}^{\log _{2}(x)}x^{1/t}dt\leq {\frac {1}{y_{1}^{1/K}}}\left(1-{\frac {2K}{\ln(xy_{1})}}\right)^{-1}\left[-\exp \left({\frac {\ln(xy_{1})}{t}}\right){\frac {t^{2}}{\ln(xy_{1})}}\right]_{t=2}^{t=K}+\left(1-{\frac {2t_{0}}{\ln(xy_{2})}}\right)^{-1}\left[-\exp \left({\frac {\ln(xy_{2})}{t}}\right){\frac {t^{2}}{\ln(xy_{2})}}\right]_{t=K}^{t=\log _{2}(x)}}
whenever {\displaystyle K\leq {\frac {\ln(xy_{1})}{2}}} and {\displaystyle \log _{2}(x)\leq {\frac {\ln(xy_{2})}{2}}}.
We now choose the ansatz {\displaystyle 1-{\frac {2K}{\ln(xy_{1})}}=C} and {\displaystyle 1-{\frac {2t_{0}}{\ln(xy_{2})}}=D} for constants {\displaystyle C} and {\displaystyle D}. These equations are readily seen to imply
{\displaystyle y_{1}={\frac {1}{x}}\exp \left({\frac {2K}{1-C}}\right)}
and
{\displaystyle y_{2}={\frac {1}{x}}\exp \left({\frac {2\log _{2}(x)}{1-D}}\right)}
.
Note though that {\displaystyle y_{1}\geq 1} and {\displaystyle y_{2}\geq 1} are needed. The first condition yields {\displaystyle K\geq {\frac {1-C}{2}}\ln(x)}.
The equations for {\displaystyle y_{1}} and {\displaystyle y_{2}} may be inserted into the above constraints on {\displaystyle K} and {\displaystyle \log _{2}(x)}; this yields
{\displaystyle K\leq {\frac {2K}{1-C}}} and {\displaystyle \log _{2}(x)\leq {\frac {2\log _{2}(x)}{1-D}}}, that is, {\displaystyle C\geq {\frac {1}{2}}} and {\displaystyle D\geq {\frac {1}{2}}}.
If all these conditions are true, the ansatz immediately yields
{\displaystyle \int _{2}^{\log _{2}(x)}x^{1/t}dt\leq {\frac {C^{-1}}{y_{1}^{1/K}}}\left[-\exp \left({\frac {\ln(xy_{1})}{t}}\right){\frac {t^{2}}{\ln(xy_{1})}}\right]_{t=2}^{t=K}+D^{-1}\left[-\exp \left({\frac {\ln(xy_{2})}{t}}\right){\frac {t^{2}}{\ln(xy_{2})}}\right]_{t=K}^{t=\log _{2}(x)}}
.
We now amend our ansatz by further postulating {\displaystyle K=\left({\frac {1-C}{2}}+\alpha \right)\ln(x)}.
This yields {\displaystyle y_{1}={\frac {1}{x}}\exp \left({\frac {(1-C+2\alpha )\ln(x)}{1-C}}\right)} and
{\displaystyle {\frac {C^{-1}}{y_{1}^{1/K}}}\left[-\exp \left({\frac {\ln(xy_{1})}{t}}\right){\frac {t^{2}}{\ln(xy_{1})}}\right]_{t=2}^{t=K}={\frac {C^{-1}}{y_{1}^{1/K}}}\left[-\exp \left({\frac {\frac {(1-C+2\alpha )\ln(x)}{1-C}}{t}}\right){\frac {t^{2}}{\ln(xy_{1})}}\right]_{t=2}^{t=K}}
.
From this we deduce that in order to obtain an asymptotically sharp error term, we need to set {\displaystyle \alpha =0}. But doing so yields the desired result.
{\displaystyle \Box }
Arithmetic functions
In this chapter, we shall set up the basic theory of arithmetic functions. This theory will be seen in action in later chapters, in particular in chapter 9.
Definition 2.1 :
An arithmetical function is a function {\displaystyle f:\mathbb {N} \to \mathbb {C} }.
Definition 2.2 (important arithmetical functions) :
The Kronecker delta : {\displaystyle \delta (n):={\begin{cases}1&n=1\\0&n\neq 1\end{cases}}}
Euler's totient function : {\displaystyle \varphi (n):=\left|\{1\leq k\leq n|\gcd(k,n)=1\}\right|}
Möbius' {\displaystyle \mu }-function : {\displaystyle \mu (n):={\begin{cases}0&{\text{ there exists }}m>1{\text{ such that }}m^{2}|n\\(-1)^{r}&n{\text{ is product of }}r{\text{ pairwise different prime numbers}}\\1&n=1\end{cases}}}
The von Mangoldt function : {\displaystyle \Lambda (n):={\begin{cases}\log(p)&\exists p{\text{ prime}},k\in \mathbb {N} :n=p^{k}\\0&{\text{otherwise}}\end{cases}}}
The monomials : {\displaystyle I_{k}(n):=n^{k}}
The number of distinct prime divisors : {\displaystyle \omega (n):=r,n=p_{1}^{k_{1}}\cdots p_{r}^{k_{r}}}, {\displaystyle k_{1},\ldots ,k_{r}\in \mathbb {N} }
The number of prime factors counted with multiplicity : {\displaystyle \Omega (n):=k_{1}+\cdots +k_{r},n=p_{1}^{k_{1}}\cdots p_{r}^{k_{r}}}, {\displaystyle k_{1},\ldots ,k_{r}\in \mathbb {N} }
The Liouville function : {\displaystyle \lambda (n):=(-1)^{\Omega (n)}}
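The definitions above can be implemented directly; the brute-force sketch below (all names ours, not the book's) is convenient for experimenting with small values.

```python
from math import gcd, log

def factorize(n):
    # list of (prime, exponent) pairs of n, by trial division
    factors, d = [], 2
    while d * d <= n:
        e = 0
        while n % d == 0:
            n //= d
            e += 1
        if e:
            factors.append((d, e))
        d += 1
    if n > 1:
        factors.append((n, 1))
    return factors

def delta(n):
    return 1 if n == 1 else 0

def phi(n):
    # Euler's totient, by counting
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

def mu(n):
    f = factorize(n)
    if any(e >= 2 for _, e in f):
        return 0
    return (-1) ** len(f)

def Lambda(n):
    f = factorize(n)
    return log(f[0][0]) if len(f) == 1 else 0.0  # nonzero exactly for n = p^k

def omega(n):
    return len(factorize(n))

def big_omega(n):
    return sum(e for _, e in factorize(n))

def liouville(n):
    return (-1) ** big_omega(n)

print(phi(12), mu(30), omega(360), big_omega(360), liouville(12))  # → 4 -1 3 6 -1
```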
Exercise 2.1.1 : Compute {\displaystyle \varphi (20)}, {\displaystyle \varphi (17)} and {\displaystyle \varphi (22)}.
Exercise 2.1.2 : Compute {\displaystyle \mu (4278)}. Hint: {\displaystyle 4278=2\cdot 3\cdot 23\cdot 31}.
Exercise 2.1.3 : Compute {\displaystyle \Lambda (49)} up to three decimal places. Hint: Use a Taylor expansion.
Exercise 2.1.4 : Prove that for each {\displaystyle n\in \mathbb {N} } and {\displaystyle k_{1},k_{2},k_{3},k_{4}\in \mathbb {N} }, {\displaystyle \mu (n+4k_{1})\mu (n+4k_{2}+1)\mu (n+4k_{3}+2)\mu (n+4k_{4}+3)=0}.
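Exercise 2.1.4 rests on the observation that one of the four arguments is divisible by {\displaystyle 4=2^{2}}; a brute-force check over small parameters (with a straightforward {\displaystyle \mu }, names ours) is reassuring:

```python
def mu(n):
    # Moebius function via trial division
    result, d = 1, 2
    while d * d <= n:
        if n % d == 0:
            n //= d
            if n % d == 0:
                return 0  # square factor found
            result = -result
        d += 1
    return -result if n > 1 else result

for n in range(1, 25):
    for k1 in range(1, 4):
        for k2 in range(1, 4):
            for k3 in range(1, 4):
                for k4 in range(1, 4):
                    product = (mu(n + 4 * k1) * mu(n + 4 * k2 + 1)
                               * mu(n + 4 * k3 + 2) * mu(n + 4 * k4 + 3))
                    assert product == 0
print("verified for all n < 25 and k_i < 4")
```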
The convolution and the ring of arithmetic functions
In the following theorem, we show that the arithmetical functions form an Abelian monoid , where the monoid operation is given by the convolution. Further, since the sum of two arithmetic functions is again an arithmetic function, the arithmetic functions form a commutative ring. In fact, as we shall also see, they form an integral domain.
Proof :
1.:
{\displaystyle {\begin{aligned}f*g(n)&=\sum _{d|n}f(d)g\left({\frac {n}{d}}\right)=\sum _{d|n}f(\psi (d))g\left(\psi \left({\frac {n}{d}}\right)\right)\\&=\sum _{d|n}f\left({\frac {n}{d}}\right)g(d)=g*f(n)\end{aligned}}}
,
where {\displaystyle \psi (d):={\frac {n}{d}}} is a bijection from the set of divisors of {\displaystyle n} to itself.
2.:
{\displaystyle {\begin{aligned}f*(g*h)=f*(h*g)&=\sum _{d_{1}|n}f(d_{1})\left(\sum _{d_{2}|{\frac {n}{d_{1}}}}g\left({\frac {n}{d_{1}d_{2}}}\right)h(d_{2})\right)\\&=\sum _{d_{1}|n}\sum _{d_{2}|{\frac {n}{d_{1}}}}f(d_{1})g\left({\frac {n}{d_{1}d_{2}}}\right)h(d_{2})\\&=\sum _{d_{2}|n}\sum _{d_{1}|{\frac {n}{d_{2}}}}f(d_{1})g\left({\frac {n}{d_{1}d_{2}}}\right)h(d_{2})\end{aligned}}}
,
where the last equality follows from the identity function {\displaystyle Id:\left\{(d_{1},d_{2}):d_{2}|n,d_{1}{\big |}{\frac {n}{d_{2}}}\right\}\to \left\{(d_{1},d_{2}):d_{1}|n,d_{2}{\big |}{\frac {n}{d_{1}}}\right\}} being a bijection. But
{\displaystyle \sum _{d_{2}|n}\sum _{d_{1}|{\frac {n}{d_{2}}}}f(d_{1})g\left({\frac {n}{d_{1}d_{2}}}\right)h(d_{2})=(f*g)*h}
and hence associativity holds.
3.:
{\displaystyle \delta *f(n)=\sum _{d|n}\delta (d)f\left({\frac {n}{d}}\right)=f(n)}
{\displaystyle \Box }
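The monoid laws just proven can be spot-checked numerically using the Dirichlet convolution {\displaystyle f*g(n)=\sum _{d|n}f(d)g(n/d)} that appears in the proof; the sketch below (names ours) verifies the unit, commutativity and associativity for a few sample functions.

```python
def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]

def conv(f, g):
    # Dirichlet convolution: (f*g)(n) = sum over d|n of f(d) g(n/d)
    return lambda n: sum(f(d) * g(n // d) for d in divisors(n))

delta = lambda n: 1 if n == 1 else 0
f = lambda n: n          # I_1
g = lambda n: n * n      # I_2
h = lambda n: 1          # I_0

for n in range(1, 50):
    assert conv(delta, f)(n) == f(n)                          # delta is the unit
    assert conv(f, g)(n) == conv(g, f)(n)                     # commutativity
    assert conv(f, conv(g, h))(n) == conv(conv(f, g), h)(n)   # associativity
print("monoid laws verified for n < 50")
```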
Theorem 2.5 :
The ring of arithmetic functions is an integral domain.
Proof :
Let {\displaystyle f,g\neq 0} be arithmetic functions, and let {\displaystyle n,k\in \mathbb {N} } be minimal such that {\displaystyle f(n)\neq 0}, {\displaystyle g(k)\neq 0}. Then
{\displaystyle f*g(nk)=\sum _{d|nk}f(d)g\left({\frac {nk}{d}}\right)=f(n)g(k)\neq 0}, since any other divisor {\displaystyle d} of {\displaystyle nk} satisfies either {\displaystyle d<n} (whence {\displaystyle f(d)=0}) or {\displaystyle nk/d<k} (whence {\displaystyle g(nk/d)=0}).
{\displaystyle \Box }
We shall now determine the units of the ring of arithmetic functions.
Theorem 2.6 :
Let {\displaystyle f} be an arithmetic function. Then {\displaystyle f} is invertible (with respect to convolution) if and only if {\displaystyle f(1)\neq 0}.
Proof :
Assume first {\displaystyle f(1)=0}. Then for any arithmetic function {\displaystyle g}, {\displaystyle f*g(1)=0\neq 1=\delta (1)}.
Assume now {\displaystyle f(1)\neq 0}. Then {\displaystyle g}, given by the recursive formula
{\displaystyle g(1)=f(1)^{-1}},
{\displaystyle g(n)=-f(1)^{-1}\sum _{d|n \atop d<n}g(d)f\left({\frac {n}{d}}\right)=g(n)-f(1)^{-1}g*f(n)}
,
{\displaystyle n>1}, is an inverse (and thus the inverse) of {\displaystyle f}, since
{\displaystyle f*g(1)=f(1)g(1)=1}
and for {\displaystyle n>1} inductively
{\displaystyle {\begin{aligned}f*g(n)&=f*g(n)-f(1)^{-1}f*g*f(n)\\&=f*g(n)-f(1)^{-1}\sum _{d|n \atop d<n}f*g(d)f\left({\frac {n}{d}}\right)=0\end{aligned}}}
{\displaystyle \Box }
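The recursive inverse formula from the proof translates directly into code; the sketch below (names ours) tabulates the inverse of the constant-one function and checks {\displaystyle f*g=\delta }. The inverse obtained is in fact the Möbius {\displaystyle \mu } function, a classical fact.

```python
def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]

def dirichlet_inverse(f, N):
    # tabulate g = f^{-1} on 1..N via the recursion from the proof of theorem 2.6
    g = {1: 1.0 / f(1)}
    for n in range(2, N + 1):
        g[n] = -1.0 / f(1) * sum(g[d] * f(n // d) for d in divisors(n) if d < n)
    return g

f = lambda n: 1  # the constant-one function
g = dirichlet_inverse(f, 30)

for n in range(1, 31):  # check that f * g is the Kronecker delta
    value = sum(f(d) * g[n // d] for d in divisors(n))
    assert abs(value - (1 if n == 1 else 0)) < 1e-12

print([round(g[n]) for n in range(1, 11)])  # → [1, -1, -1, 0, -1, 1, -1, 0, 0, 1]
```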
Exercise 2.2.1 :
Exercise 2.2.2 :
Multiplicative functions
Definition 2.7 :
An arithmetical function {\displaystyle f:\mathbb {N} \to \mathbb {C} } is called multiplicative iff it satisfies {\displaystyle \forall n,k\in \mathbb {N} :\gcd(n,k)=1\Rightarrow f(nk)=f(n)f(k)}, and {\displaystyle f(1)=1}.
Theorem 2.8 :
Let {\displaystyle f,g} be multiplicative arithmetical functions. Then {\displaystyle f*g} is multiplicative.
Proof :
Let {\displaystyle \gcd(k,n)=1}. Then
{\displaystyle f*g(kn)=\sum _{d|(kn)}f(d)g\left({\frac {kn}{d}}\right)=\sum _{d_{1}|k \atop d_{2}|n}f(d_{1}d_{2})g\left({\frac {kn}{d_{1}d_{2}}}\right)}
,
since the function {\displaystyle \theta (d):=(\gcd(d,n),\gcd(d,k))} is a bijection from the divisors of {\displaystyle nk} to the Cartesian product of the divisors of {\displaystyle n} and the divisors of {\displaystyle k}; this is because multiplication is the inverse:
{\displaystyle \gcd(d,n)\gcd(d,k)=d}, {\displaystyle (\gcd(d_{1}d_{2},n),\gcd(d_{1}d_{2},k))=(d_{1},d_{2})}.
Proving this rigorously is an exercise in itself. But due to the multiplicativity of {\displaystyle f} and {\displaystyle g},
{\displaystyle \sum _{d_{1}|k \atop d_{2}|n}f(d_{1}d_{2})g\left({\frac {kn}{d_{1}d_{2}}}\right)=\sum _{d_{1}|k \atop d_{2}|n}f(d_{1})g\left({\frac {k}{d_{1}}}\right)f(d_{2})g\left({\frac {n}{d_{2}}}\right)=f*g(k)\cdot f*g(n)}
.
Furthermore, {\displaystyle f*g(1)=f(1)g(1)=1}.
{\displaystyle \Box }
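Theorem 2.8 can be spot-checked; the sketch below (names ours) takes the multiplicative functions {\displaystyle \varphi } and the constant-one function and verifies that their convolution is again multiplicative on a range of coprime pairs.

```python
from math import gcd

def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]

def conv(f, g, n):
    # Dirichlet convolution evaluated at n
    return sum(f(d) * g(n // d) for d in divisors(n))

def phi(n):
    # Euler's totient, by counting
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

one = lambda n: 1

for n in range(1, 30):
    for k in range(1, 30):
        if gcd(n, k) == 1:
            assert conv(phi, one, n * k) == conv(phi, one, n) * conv(phi, one, k)
print("phi * 1 is multiplicative on the tested range")
```

In fact {\displaystyle \sum _{d|n}\varphi (d)=n}, a classical identity of Gauss, so the convolution computed here is simply {\displaystyle I_{1}}.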
Since {\displaystyle \delta } is multiplicative, we conclude that the multiplicative functions form an Abelian submonoid of the arithmetic functions with convolution. Unfortunately, we do not have a subring, since the sum of two multiplicative functions is never multiplicative (look at {\displaystyle n=1}).
Theorem 2.9 :
Let {\displaystyle f} be a multiplicative function such that {\displaystyle \sum _{j=1}^{\infty }f(j)} converges absolutely. Then
{\displaystyle \sum _{j=1}^{\infty }f(j)=\prod _{p{\text{ prime}}}\left(\sum _{k=0}^{\infty }f(p^{k})\right)}
.
Proof : Let {\displaystyle p_{1},p_{2},p_{3},\ldots =2,3,5,\ldots } be the ordered sequence of all prime numbers. For all {\displaystyle m_{1},\ldots ,m_{r},r\in \mathbb {N} } we have
{\displaystyle S_{f,r,m_{1},\ldots ,m_{r}}:=\sum _{d|\left(p_{1}^{m_{1}}\cdots p_{r}^{m_{r}}\right)}f(d)=\sum _{0\leq k_{1}\leq m_{1}}\cdots \sum _{0\leq k_{r}\leq m_{r}}f\left(p_{1}^{k_{1}}\right)\cdots f\left(p_{r}^{k_{r}}\right)=\prod _{l=1}^{r}\left(\sum _{k=0}^{m_{l}}f(p_{l}^{k})\right)}
due to the multiplicativity of {\displaystyle f}. For each {\displaystyle r}, we successively take {\displaystyle m_{1}\to \infty }, ..., {\displaystyle m_{r}\to \infty } and then {\displaystyle r\to \infty }. It follows from the definitions and the rule
{\displaystyle x_{n}\to x\Rightarrow yx_{n}\to yx}
that the right hand side converges to
{\displaystyle \prod _{l=1}^{\infty }\left(\sum _{k=0}^{\infty }f(p_{l}^{k})\right)}
.
We claim that
{\displaystyle \lim _{r\to \infty }\lim _{m_{1}\to \infty }\cdots \lim _{m_{r}\to \infty }S_{f,r,m_{1},\ldots ,m_{r}}=\sum _{j=1}^{\infty }f(j)}
.
Indeed, choose {\displaystyle N\in \mathbb {N} } such that {\displaystyle \sum _{j=N+1}^{\infty }|f(j)|<\epsilon }.
Then by the fundamental theorem of arithmetic, there exist {\displaystyle R\in \mathbb {N} } and {\displaystyle M_{1},\ldots ,M_{R}\in \mathbb {N} } such that {\displaystyle \forall j\in \{1,\ldots ,N\}:j|p_{1}^{M_{1}}\cdots p_{R}^{M_{R}}}.
Then we have by the triangle inequality, for {\displaystyle T>R}, {\displaystyle L_{1}>M_{1},\ldots ,L_{R}>M_{R}} and {\displaystyle L_{R+1},\ldots ,L_{T}} arbitrary, that
{\displaystyle {\begin{aligned}\left|S_{f,T,L_{1},\ldots ,L_{T}}-\sum _{j=1}^{\infty }f(j)\right|&\leq \left|\sum _{d|\left(p_{1}^{L_{1}}\cdots p_{T}^{L_{T}}\right) \atop d\leq N}f(d)-\sum _{j=1}^{N}f(j)\right|+\left|\sum _{d|\left(p_{1}^{L_{1}}\cdots p_{T}^{L_{T}}\right) \atop d>N}f(d)-\sum _{j=N+1}^{\infty }f(j)\right|\\&<0+\epsilon .\end{aligned}}}
From this the claim easily follows.
It is left to show that the product is independent of the order of multiplication. But this is clear since, if the sequence {\displaystyle (p_{n})_{n\in \mathbb {N} }}
is enumerated differently, the argument works in just the same way and the left hand side remains the same.
{\displaystyle \Box }
Definition 2.10 :
An arithmetical function {\displaystyle f:\mathbb {N} \to \mathbb {C} } is called strongly multiplicative (also: completely multiplicative) iff it satisfies {\displaystyle \forall n,k\in \mathbb {N} :f(nk)=f(n)f(k)}, and {\displaystyle f(1)=1}.
Equivalently, a strongly multiplicative function is a monoid homomorphism
{\displaystyle \mathbb {N} \to \mathbb {C} ^{\times }}
.
Theorem 2.11 :
Let {\displaystyle f} be a strongly multiplicative function such that {\displaystyle \sum _{j=1}^{\infty }f(j)} converges absolutely. Then
{\displaystyle \sum _{j=1}^{\infty }f(j)=\prod _{p{\text{ prime}}}{\frac {1}{1-f(p)}}}
.
Proof :
Due to theorem 2.9, we have
{\displaystyle \sum _{n=1}^{\infty }f(n)=\prod _{p{\text{ prime}}}\left(\sum _{k=0}^{\infty }f(p^{k})\right)}
.
Due to strong multiplicativity and the geometric series (note that absolute convergence forces {\displaystyle |f(p)|<1}), the latter expression equals
{\displaystyle \prod _{p{\text{ prime}}}{\frac {1}{1-f(p)}}}
.
{\displaystyle \Box }
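Theorem 2.11 contains the classical Euler product as a special case; the sketch below (ours) takes {\displaystyle f(n)=n^{-2}}, which is strongly multiplicative and sums absolutely, and compares a truncated sum with a truncated product. Both approach {\displaystyle \pi ^{2}/6}, Euler's classical value of {\displaystyle \zeta (2)}.

```python
def primes_upto(n):
    # simple sieve of Eratosthenes
    sieve = [True] * (n + 1)
    sieve[0:2] = [False, False]
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i::i] = [False] * len(sieve[i * i::i])
    return [p for p in range(2, n + 1) if sieve[p]]

N = 200000
series = sum(1.0 / n ** 2 for n in range(1, N + 1))    # truncated sum of f(j)
product = 1.0
for p in primes_upto(1000):
    product *= 1.0 / (1.0 - 1.0 / p ** 2)              # truncated Euler product
print(series, product)  # both are close to pi^2/6 = 1.6449...
```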
Exercise 2.3.1 : Let {\displaystyle h} be an arithmetic function such that for all {\displaystyle m,n\in \mathbb {N} }, {\displaystyle h(nm)=h(n)+h(m)}, and let {\displaystyle s\in \mathbb {C} \setminus \{0\}}. Prove that the function {\displaystyle f(n):=\exp(\log(s)h(n))} is multiplicative.
Examples 2.13 :
We shall here compute the Bell series of some important arithmetic functions.
We note that in general, for a completely multiplicative function {\displaystyle f}, we have
{\displaystyle f_{p}(x)=\sum _{j=0}^{\infty }f(p)^{j}x^{j}={\frac {1}{1-f(p)x}}}.
In particular, in this case the Bell series defines a function.
1. The Kronecker delta:
{\displaystyle \delta _{p}(x)=\sum _{j=0}^{\infty }\delta (p^{j})x^{j}=1}
2. Euler's totient function (we use lemma 9.?):
{\displaystyle {\begin{aligned}\varphi _{p}(x)&=\sum _{j=0}^{\infty }\varphi (p^{j})x^{j}\\&=1+\sum _{j=1}^{\infty }(p^{j}-p^{j-1})x^{j}\\&=1+{\frac {(p-1)x}{1-px}}\\&={\frac {1-x}{1-px}}\end{aligned}}}
3. The Möbius {\displaystyle \mu } function:
{\displaystyle \mu _{p}(x)=\sum _{j=0}^{\infty }\mu (p^{j})x^{j}=1-x}
4. The von Mangoldt function:
{\displaystyle \Lambda _{p}(x)=\sum _{j=0}^{\infty }\Lambda (p^{j})x^{j}=\sum _{j=1}^{\infty }\log(p)x^{j}={\frac {\log(p)x}{1-x}}}
5. The monomials:
{\displaystyle (I_{k})_{p}(x)=\sum _{j=0}^{\infty }p^{kj}x^{j}={\frac {1}{1-p^{k}x}}}
6. The number of distinct prime divisors:
{\displaystyle \omega _{p}(x)=\sum _{j=1}^{\infty }x^{j}={\frac {1}{1-x}}-1}
7. The number of prime divisors including multiplicity:
{\displaystyle {\begin{aligned}\Omega _{p}(x)&=\sum _{j=1}^{\infty }jx^{j}\\&=x\sum _{j=1}^{\infty }(x^{j})'\\&=x\left(\sum _{j=0}^{\infty }x^{j}\right)'\\&=x\left({\frac {1}{1-x}}\right)'\\&={\frac {x}{(1-x)^{2}}}\end{aligned}}}
8. The Liouville function:
{\displaystyle \lambda _{p}(x)=\sum _{j=0}^{\infty }(-1)^{j}x^{j}={\frac {1}{1+x}}}
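Example 2 above can be verified numerically (a sketch, assuming the closed form {\displaystyle (1-x)/(1-px)}): the power-series coefficients of that rational function should equal {\displaystyle \varphi (p^{j})}.

```python
# Check the Bell series of Euler's totient: the coefficients of
# (1 - x)/(1 - p x) should be phi(p^j).

from math import gcd

def phi(n):
    """Euler's totient function, by brute force."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

def bell_phi_coeffs(p, terms):
    """Coefficients of (1 - x)/(1 - p x): 1/(1 - p x) has coefficients p^j,
    multiplying by (1 - x) shifts and subtracts."""
    geo = [p ** j for j in range(terms)]
    return [geo[0]] + [geo[j] - geo[j - 1] for j in range(1, terms)]

p = 3
coeffs = bell_phi_coeffs(p, 5)
direct = [phi(p ** j) for j in range(5)]
```

Here `p = 3` and five terms are arbitrary choices; the two coefficient lists coincide.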
Theorem 2.14 (compatibility of Bell series and convolution) :
Let {\displaystyle f,g} be arithmetic functions, and let {\displaystyle p} be a prime. Then
{\displaystyle f_{p}\cdot g_{p}=(f*g)_{p}}.
Proof :
{\displaystyle {\begin{aligned}f_{p}(x)g_{p}(x)&=\left(\sum _{j=0}^{\infty }f(p^{j})x^{j}\right)\left(\sum _{j=0}^{\infty }g(p^{j})x^{j}\right)\\&=\sum _{k=0}^{\infty }x^{k}\left(\sum _{j=0}^{k}f(p^{j})g(p^{k-j})\right)\\&=\sum _{k=0}^{\infty }x^{k}(f*g)(p^{k})\end{aligned}}}
◻
{\displaystyle \Box }
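Theorem 2.14 can be checked on a small example (a sketch, assuming the book's notation: {\displaystyle E} the constant-one function, so that {\displaystyle \mu *E=\delta } as in lemma 2.14): the Cauchy product of the truncated Bell series of {\displaystyle \mu } and {\displaystyle E} should equal the truncated Bell series of their convolution.

```python
# Verify f_p * g_p = (f*g)_p for f = mu, g = E (all ones), at p = 2.

def mu(n):
    """Moebius function via trial-division factorization."""
    result, d = 1, 2
    while d * d <= n:
        if n % d == 0:
            n //= d
            if n % d == 0:
                return 0          # square factor => mu = 0
            result = -result
        d += 1
    if n > 1:
        result = -result
    return result

def conv(f, g, n):
    """Dirichlet convolution (f * g)(n)."""
    return sum(f(d) * g(n // d) for d in range(1, n + 1) if n % d == 0)

def bell(f, p, terms):
    return [f(p ** j) for j in range(terms)]

def cauchy_product(a, b):
    return [sum(a[j] * b[k - j] for j in range(k + 1)) for k in range(len(a))]

p, terms = 2, 6
E = lambda n: 1
lhs = cauchy_product(bell(mu, p, terms), bell(E, p, terms))
rhs = bell(lambda n: conv(mu, E, n), p, terms)
```

Since {\displaystyle \mu *E=\delta }, both sides come out as the coefficient list of the constant series 1.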
In case of multiplicativity, we have the following theorem:
Theorem 2.15 (Uniqueness theorem) :
Let {\displaystyle f,g} be multiplicative functions. Then {\displaystyle f=g} if and only if {\displaystyle f_{p}=g_{p}} for all primes {\displaystyle p}.
Proof :
{\displaystyle \Rightarrow } is obvious.
{\displaystyle \Leftarrow } : {\displaystyle f_{p}(x)=g_{p}(x)} as formal power series is equivalent to saying {\displaystyle \forall k\in \mathbb {N} :f(p^{k})=g(p^{k})}. If now {\displaystyle n=p_{1}^{k_{1}}\cdots p_{r}^{k_{r}}}, then
{\displaystyle f(n)=f(p_{1}^{k_{1}}\cdots p_{r}^{k_{r}})=f(p_{1}^{k_{1}})\cdots f(p_{r}^{k_{r}})=g(p_{1}^{k_{1}})\cdots g(p_{r}^{k_{r}})=g(n)}
due to the multiplicativity of {\displaystyle f} and {\displaystyle g}.
◻
{\displaystyle \Box }
In chapter 9, we will use Bell series to obtain equations for number-theoretic functions.
Definition 2.16 :
Let {\displaystyle f} be an arithmetic function. Then the derivative of {\displaystyle f} is defined to be the function
{\displaystyle f'(n):=f(n)\log(n)}.
Theorem 2.17 :
Let {\displaystyle f,g} be arithmetic functions. Then:
1. {\displaystyle (f+g)'=f'+g'}
2. {\displaystyle (f*g)'=f'*g+f*g'}
3. If {\displaystyle f(1)\neq 0}, then {\displaystyle (f^{-1})'=-f'*(f*f)^{-1}}.
Proof :
1. is easily checked.
2.:
{\displaystyle {\begin{aligned}(f*g)'(n)&=(f*g)(n)\log(n)\\&=\sum _{d|n}f(d)g(n/d)(\log(d)+\log(n/d))\\&=\sum _{d|n}f(d)g(n/d)\log(d)+\sum _{d|n}f(d)g(n/d)\log(n/d)\\&=(f'*g)(n)+(f*g')(n)\end{aligned}}}
3.
We have {\displaystyle \delta '=0} and {\displaystyle (f*f^{-1})=\delta }. Hence, by 2.,
{\displaystyle 0=(f*f^{-1})'=f'*f^{-1}+f*(f^{-1})'}.
Convolving with {\displaystyle f^{-1}} and using {\displaystyle f^{-1}*f^{-1}=(f*f)^{-1}} yields the desired formula.
◻
{\displaystyle \Box }
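The product rule from 2. above is easy to test numerically (a sketch, using the book's {\displaystyle f'(n)=f(n)\log(n)} from definition 2.16 and taking {\displaystyle f=E}, the constant-one function, and {\displaystyle g=I_{1}} with {\displaystyle I_{1}(n)=n} as an assumed example pair):

```python
# Check (f*g)' = f'*g + f*g' for f = E (all ones), g = I_1 (identity).

from math import log

def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]

def conv(f, g, n):
    """Dirichlet convolution (f * g)(n)."""
    return sum(f(d) * g(n // d) for d in divisors(n))

def deriv(f):
    """Arithmetic derivative: f'(n) = f(n) log n."""
    return lambda n: f(n) * log(n)

E = lambda n: 1.0
I1 = lambda n: float(n)

fg = lambda n: conv(E, I1, n)          # (E * I_1)(n) = sigma(n)
lhs = deriv(fg)                        # (f*g)'
rhs = lambda n: conv(deriv(E), I1, n) + conv(E, deriv(I1), n)  # f'*g + f*g'
```

For instance at {\displaystyle n=2} both sides equal {\displaystyle 3\log 2}.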
Note that a chain rule wouldn't make much sense, since an arithmetic function {\displaystyle f} may map anywhere but into {\displaystyle \mathbb {N} }, and thus {\displaystyle g\circ f} doesn't make sense in general.
Characters and Dirichlet characters
Definitions, basic properties
Lemma 4.2 :
Let {\displaystyle G} be a finite group and let {\displaystyle f:G\to \mathbb {C} } be a character. Then {\displaystyle \forall \sigma \in G:|f(\sigma )|=1}.
In particular, {\displaystyle \forall \sigma \in G:f(\sigma )\neq 0}.
Proof :
Since {\displaystyle G} is finite, each {\displaystyle \sigma \in G} has finite order {\displaystyle n:=\mathrm {ord} (\sigma )}, so that {\displaystyle \sigma ^{n}=1}. Furthermore, let {\displaystyle \rho \in G} such that {\displaystyle f(\rho )\neq 0}; then {\displaystyle f(\rho )=f(\sigma )f(\sigma ^{-1}\rho )} and thus {\displaystyle f(\sigma )\neq 0}. Hence, we are allowed to cancel, and
{\displaystyle |f(\sigma )|=|f(\sigma ^{n+1})|=|f(\sigma )|^{n+1}\Rightarrow |f(\sigma )|=1}.
◻
{\displaystyle \Box }
Lemma 4.3 :
Let {\displaystyle G} be a finite group and let {\displaystyle f,g:G\to \mathbb {C} } be characters. Then the function {\displaystyle h:G\to \mathbb {C} ,h(\tau ):=f(\tau )\cdot g(\tau )} is also a character.
Proof :
{\displaystyle h(\sigma \tau )=f(\sigma \tau )g(\sigma \tau )=f(\sigma )g(\sigma )f(\tau )g(\tau )=h(\sigma )h(\tau )\neq 0},
since {\displaystyle \mathbb {C} } is a field and thus free of zero divisors.
◻
{\displaystyle \Box }
Lemma 4.4 :
Let {\displaystyle G} be a finite group and let {\displaystyle f:G\to \mathbb {C} } be a character. Then the function {\displaystyle g:G\to \mathbb {C} ,g(\tau ):={\frac {1}{f(\tau )}}} is also a character.
Proof : Trivial, since {\displaystyle \forall \tau \in G:f(\tau )\neq 0} as shown in lemma 4.2.
◻
{\displaystyle \Box }
The previous three lemmas (or only the first, together with a few lemmas from elementary group theory) justify the following definition.
Definition 4.5 :
Let {\displaystyle G} be a finite group. Then the group
{\displaystyle \{f:G\to \mathbb {C} |f{\text{ is a character }}\}=\mathrm {Hom} (G,\mathbb {C} ^{\times })}
is called the character group of {\displaystyle G}.
We need the following result from group theory:
Proof :
Since {\displaystyle G} is the disjoint union of the cosets of {\displaystyle H}, {\displaystyle N} is the disjoint union {\displaystyle \bigcup _{j=0}^{n-1}\tau ^{j}H}, as {\displaystyle \rho H=H\Leftrightarrow \rho \in H} and {\displaystyle \tau ^{l}H=\tau ^{m}H\Leftrightarrow \tau ^{l-m}\in H\Leftrightarrow k|(l-m)}. Hence, the cardinality of {\displaystyle N} equals {\displaystyle k\cdot n}.
Furthermore, if {\displaystyle \tau ^{l}\sigma ,\tau ^{m}\rho \in N}, then {\displaystyle \tau ^{l}\sigma (\tau ^{m}\rho )^{-1}=\tau ^{l-m}\sigma \rho ^{-1}\in N}, and hence {\displaystyle N} is a subgroup.
◻
{\displaystyle \Box }
Theorems about characters
Dirichlet characters
Dirichlet series
For the remainder of this book, we shall use Riemann's convention for denoting complex numbers:
{\displaystyle s=\sigma +it}
Definition 5.1 :
Let {\displaystyle f} be an arithmetic function. Then the Dirichlet series associated to {\displaystyle f} is the series
{\displaystyle \sum _{n=1}^{\infty }{\frac {f(n)}{n^{s}}}},
where {\displaystyle s} ranges over the complex numbers.
Convergence considerations
Proof :
Denote by {\displaystyle S} the set of all real numbers {\displaystyle \sigma } such that
{\displaystyle \sum _{n=1}^{\infty }\left|{\frac {f(n)}{n^{\sigma }}}\right|}
diverges. Due to the assumption, this set is neither empty nor all of {\displaystyle \mathbb {R} }. Further, if {\displaystyle \sigma _{0}\notin S}, then {\displaystyle \sigma \notin S} for all {\displaystyle \sigma >\sigma _{0}}, since for {\displaystyle s_{0}=\sigma _{0}+it_{0}} and {\displaystyle s=\sigma +it}
{\displaystyle \left|{\frac {f(n)}{n^{s_{0}}}}\right|={\frac {|f(n)|}{n^{\sigma _{0}}}}\geq {\frac {|f(n)|}{n^{\sigma }}}=\left|{\frac {f(n)}{n^{s}}}\right|}
and due to the comparison test. It follows that {\displaystyle S} is bounded above and hence has a supremum. Let {\displaystyle \sigma _{a}} be that supremum. By definition, for {\displaystyle \sigma >\sigma _{a}} we have absolute convergence, and if we had absolute convergence for some {\displaystyle \sigma <\sigma _{a}}, the above argument would give an upper bound for {\displaystyle S} smaller than {\displaystyle \sigma _{a}}, contradicting the definition of {\displaystyle \sigma _{a}}.
◻
{\displaystyle \Box }
Theorem 5.3 (abscissa of conditional convergence) :
Theorem 8.4 (Euler product) :
Let {\displaystyle f} be a strongly multiplicative function, and let {\displaystyle s\in \mathbb {C} } be such that the corresponding Dirichlet series converges absolutely. Then for that series we have the formula
{\displaystyle \sum _{n=1}^{\infty }{\frac {f(n)}{n^{s}}}=\prod _{p{\text{ prime}}}{\frac {1}{1-{\frac {f(p)}{p^{s}}}}}}.
Proof :
This follows directly from theorem 2.11 and the fact that if {\displaystyle f} is strongly multiplicative, then so is {\displaystyle n\mapsto {\frac {f(n)}{n^{s}}}}.
◻
{\displaystyle \Box }
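Theorem 8.4 can be illustrated numerically (a sketch, not part of the text) by taking {\displaystyle f} to be the constant-one function, so that the Dirichlet series is the Riemann zeta function, and evaluating both sides at a point of absolute convergence such as {\displaystyle s=2+i} (an arbitrary choice):

```python
# Truncated Dirichlet series vs. truncated Euler product at s = 2 + i.

def primes_up_to(limit):
    """Sieve of Eratosthenes."""
    sieve = [True] * (limit + 1)
    sieve[0] = sieve[1] = False
    for i in range(2, int(limit ** 0.5) + 1):
        if sieve[i]:
            for j in range(i * i, limit + 1, i):
                sieve[j] = False
    return [i for i, is_prime in enumerate(sieve) if is_prime]

s = 2 + 1j
zeta_sum = sum(n ** -s for n in range(1, 100000))

euler_product = 1
for p in primes_up_to(10000):
    euler_product *= 1 / (1 - p ** -s)
```

The two complex values agree to roughly four decimal places, limited only by the truncations.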
Lemma 2.9 :
For {\displaystyle n=p_{1}^{k_{1}}\cdots p_{r}^{k_{r}}}, we have
{\displaystyle \sum _{d|n}\mu (d){\frac {n}{d}}=\prod _{j=1}^{r}\left(p_{j}^{k_{j}}-p_{j}^{k_{j}-1}\right)}.
Proof :
For {\displaystyle \kappa \in \mathbb {Z} ^{r}} a multiindex, {\displaystyle \alpha \in \{0,1\}^{r}} and {\displaystyle Q\in \mathbb {C} ^{r}} a vector, define
{\displaystyle Q^{\alpha }:=\prod _{j=1}^{r}q_{j}^{\alpha _{j}}} and {\displaystyle Q^{\kappa }:=(q_{1}^{\kappa _{1}},\ldots ,q_{r}^{\kappa _{r}})}.
Let {\displaystyle n=(P^{\kappa })^{1}=p_{1}^{k_{1}}\cdots p_{r}^{k_{r}}}. Then
{\displaystyle {\begin{aligned}\prod _{j=1}^{r}\left(p_{j}^{k_{j}}-p_{j}^{k_{j}-1}\right)&=\sum _{\alpha \in \{0,1\}^{r}}(P^{\kappa })^{\alpha }(P^{\kappa -1})^{1-\alpha }(-1)^{|1-\alpha |}\\&=\sum _{d|n}\mu (d){\frac {n}{d}}\end{aligned}}}.
◻
{\displaystyle \Box }
Lemma 2.10 :
{\displaystyle \varphi =\mu *I_{1}}.
Proof 1 :
We prove the lemma from lemma 2.14. We have by lemma 2.14
{\displaystyle {\begin{aligned}\varphi (n)&=\sum _{k=1}^{n}\delta (\gcd(k,n))\\&=\sum _{k=1}^{n}\sum _{d|\gcd(k,n)}\mu (d)\\&=\sum _{d|n}\sum _{k=1}^{n}[d|k]\mu (d)\\&=\sum _{d|n}\sum _{j=1}^{n/d}\mu (d)\\&=\sum _{d|n}\mu (d){\frac {n}{d}}=(\mu *I_{1})(n)\end{aligned}}}
◻
{\displaystyle \Box }
Proof 2 :
We prove the lemma from the product formula for Euler's totient function and lemma 2.9. Indeed, for {\displaystyle n=p_{1}^{k_{1}}\cdots p_{r}^{k_{r}}},
{\displaystyle \varphi (n)=\prod _{j=1}^{r}(p_{j}^{k_{j}}-p_{j}^{k_{j}-1})=(\mu *I_{1})(n)}.
◻
{\displaystyle \Box }
Lemma 2.14 :
{\displaystyle E*\mu =\delta }.
Proof 1 :
We use the Möbius inversion formula.
Indeed, {\displaystyle S_{\delta }(n)=\sum _{d|n}\delta (d)=1=E(n)}, and hence {\displaystyle \delta =\mu *S_{\delta }=\mu *E}.
◻
{\displaystyle \Box }
Proof 2 :
We use multiplicativity.
Indeed, for a prime {\displaystyle p} and {\displaystyle k\in \mathbb {N} } we have
{\displaystyle E*\mu (p^{k})=\sum _{j=0}^{k}\mu (p^{j})=\mu (1)+\mu (p)=0},
and thus, due to the multiplicativity of {\displaystyle \mu } and {\displaystyle E}, {\displaystyle E*\mu (n)=0} if {\displaystyle n} contains at least one prime factor. Since further {\displaystyle E*\mu (1)=1}, the claim follows.
◻
{\displaystyle \Box }
Proof 3 :
We prove the lemma by direct computation. Indeed, if {\displaystyle 1\neq n=P^{\kappa }}, then
{\displaystyle {\begin{aligned}E*\mu (n)&=\sum _{d|n}\mu (d)\\&=\sum _{\alpha \in \{0,1\}^{r}}\mu (P^{\alpha })\\&=\sum _{\beta \in \{0,1\}^{r-1}}\left(\mu (P^{(0,\beta )})+\mu (P^{(1,\beta )})\right)=0\end{aligned}}}.
◻
{\displaystyle \Box }
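A brute-force check of lemma 2.14 (a sketch, not part of the text): the divisor sum of the Möbius function should equal 1 at {\displaystyle n=1} and vanish for every {\displaystyle n>1}.

```python
# Verify (E * mu)(n) = delta(n) for n up to 200.

def mu(n):
    """Moebius function via trial-division factorization."""
    result, d = 1, 2
    while d * d <= n:
        if n % d == 0:
            n //= d
            if n % d == 0:
                return 0          # square factor
            result = -result
        d += 1
    if n > 1:
        result = -result
    return result

def E_conv_mu(n):
    """(E * mu)(n) = sum of mu(d) over the divisors d of n."""
    return sum(mu(d) for d in range(1, n + 1) if n % d == 0)

values = [E_conv_mu(n) for n in range(1, 201)]
```

The bound 200 is an arbitrary choice; any range works.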
Proof 4 :
We prove the lemma from the Binomial theorem and combinatorics.
Let {\displaystyle n=p_{1}^{k_{1}}\cdots p_{r}^{k_{r}}}. From combinatorics we note that for {\displaystyle m\leq r}, there exist {\displaystyle {\binom {r}{m}}} distinct ways to pick a subset {\displaystyle I\subseteq \{1,\ldots ,r\}} such that {\displaystyle |I|=m}. Define {\displaystyle \alpha _{I}=(\alpha _{1},\ldots ,\alpha _{r})\in \{0,1\}^{r}} where {\displaystyle \alpha _{j}=1\Leftrightarrow j\in I}. Then, by the Binomial theorem,
{\displaystyle {\begin{aligned}E*\mu (n)&=\sum _{d|n}\mu (d)\\&=\sum _{I\subseteq \{1,\ldots ,r\}}\mu (P^{\alpha _{I}})\\&=\sum _{I\subseteq \{1,\ldots ,r\}}(-1)^{|I|}\\&=\sum _{j=0}^{r}{\binom {r}{j}}(-1)^{j}1^{r-j}=(1-1)^{r}=0\end{aligned}}}.
◻
{\displaystyle \Box }
Lemma 2.11 (Gauß 1801) :
{\displaystyle \forall n\in \mathbb {N} :n=\sum _{d|n}\varphi (d)}.
Proof 1 :
We use the Möbius inversion formula, proven below without using this lemma, and lemma 2.10.
We have {\displaystyle \sum _{d|n}\varphi (d)=S_{\varphi }(n)} and hence {\displaystyle \varphi =\mu *S_{\varphi }} by the Möbius inversion formula. On the other hand, {\displaystyle \varphi =\mu *I_{1}} by lemma 2.10. Hence, we obtain {\displaystyle \mu *S_{\varphi }=\mu *I_{1}}, and by cancellation of {\displaystyle \mu } (the arithmetic functions form an integral domain) we get {\displaystyle S_{\varphi }=I_{1}}, which is the lemma.
◻
{\displaystyle \Box }
Proof 2 :
We use the converse of the Möbius inversion formula, proven below without using this lemma, and lemma 2.10.
Since {\displaystyle \varphi =\mu *I_{1}} by lemma 2.10, we obtain from the converse of the Möbius inversion formula that {\displaystyle I_{1}(n)=S_{\varphi }(n)}.
◻
{\displaystyle \Box }
Proof 3 :
We prove the lemma by double counting.
We first note that there are {\displaystyle n} many fractions of the form {\displaystyle {\frac {m}{n}}}, {\displaystyle 1\leq m\leq n}.
We now prove that there are also {\displaystyle \sum _{d|n}\varphi (d)} many fractions of this form. Indeed, each fraction {\displaystyle {\frac {m}{n}}}, {\displaystyle 1\leq m\leq n}, can be reduced to {\displaystyle {\frac {b}{d}}}, where {\displaystyle \gcd(b,d)=1}. Here {\displaystyle d} is a divisor of {\displaystyle n}, since it is obtained by dividing {\displaystyle n} by {\displaystyle \gcd(m,n)}. Furthermore, for each divisor {\displaystyle d} of {\displaystyle n} there exist precisely {\displaystyle \varphi (d)} many such reduced fractions, by definition of {\displaystyle \varphi }.
◻
{\displaystyle \Box }
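The double-counting argument of proof 3 can be mirrored in a short sketch (with {\displaystyle n=36} as an arbitrary choice): classify each fraction {\displaystyle m/n} by the denominator of its reduced form; every divisor {\displaystyle d} of {\displaystyle n} should account for exactly {\displaystyle \varphi (d)} fractions.

```python
# Classify m/n, 1 <= m <= n, by the reduced denominator d = n/gcd(m, n);
# the counts should be phi(d), and they should add up to n.

from collections import Counter
from math import gcd

def phi(n):
    """Euler's totient, by brute force."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

n = 36
reduced_denominators = Counter(n // gcd(m, n) for m in range(1, n + 1))
```

Summing the counts recovers {\displaystyle n=\sum _{d|n}\varphi (d)} for this example.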
Proof 4 :
We prove the lemma by the means of set theory.
Define {\displaystyle S_{n,d}:=\{1\leq l\leq n|\gcd(n,l)=d\}}. Then {\displaystyle S_{n,d}=\{dk|1\leq k\leq n/d,\gcd(k,n/d)=1\}=dS_{n/d,1}}. Since {\displaystyle |S_{n/d,1}|=\varphi (n/d)} and {\displaystyle \{1,\ldots ,n\}} is the disjoint union of the sets {\displaystyle S_{n,d},d|n}, we thus have
{\displaystyle n=\sum _{d|n}|S_{n,d}|=\sum _{d|n}\varphi \left({\frac {n}{d}}\right)=\sum _{d|n}\varphi (d)}.
◻
{\displaystyle \Box }
The next theorem comprises one of the most important examples of a multiplicative function.
Theorem 2.12 (Euler 1761) :
Euler's totient function is multiplicative.
Proof 1 :
We prove the theorem using double counting (due to Kronecker).
By definition of {\displaystyle \varphi }, there are {\displaystyle \varphi (m)\varphi (n)} sums of the form
{\displaystyle {\frac {k}{m}}+{\frac {l}{n}},1\leq k\leq m,1\leq l\leq n},
where both summands are reduced. We claim that there is a bijection
{\displaystyle \left\{{\frac {k}{m}}+{\frac {l}{n}}{\big |}1\leq k\leq m,1\leq l\leq n,{\frac {k}{m}},{\frac {l}{n}}{\text{ reduced}}\right\}\to \left\{{\frac {r}{mn}}{\big |}1\leq r\leq mn,{\frac {r}{mn}}{\text{ reduced}}\right\}}.
From this would follow {\displaystyle \varphi (m)\varphi (n)=\varphi (mn)}.
We claim that such a bijection is given by
{\displaystyle {\frac {k}{m}}+{\frac {l}{n}}\mapsto {\frac {nk+ml\mod mn}{nm}}}.
Well-definedness: Let {\displaystyle {\frac {k}{m}}}, {\displaystyle {\frac {l}{n}}} be reduced. Then the image {\displaystyle {\frac {kn+lm\mod mn}{nm}}} is also reduced, for if {\displaystyle p|(nm)}, then without loss of generality {\displaystyle p|n}, and from {\displaystyle p|(kn+lm-cnm)} follows {\displaystyle p|l} or {\displaystyle p|m}. In both cases we obtain a contradiction, either to {\displaystyle \gcd(m,n)=1} or to the fact that {\displaystyle {\frac {l}{n}}} is reduced.
Surjectivity: Let {\displaystyle {\frac {r}{mn}}} be reduced. Using the Euclidean algorithm, we find {\displaystyle a,b\in \mathbb {Z} } such that {\displaystyle an+bm=1}. Then {\displaystyle ran+rbm=r}. Define {\displaystyle k:=ra\mod m} and {\displaystyle l:=rb\mod n}. Then, for suitable {\displaystyle s,t\in \mathbb {Z} },
{\displaystyle kn+lm=(ra+tm)n+(rb+sn)m\equiv r\mod mn}.
Injectivity: Let {\displaystyle kn+lm\equiv k'n+l'm\mod mn}. We show {\displaystyle k=k'}; the proof of {\displaystyle l=l'} is the same.
Indeed, from {\displaystyle kn+lm\equiv k'n+l'm\mod mn} follows {\displaystyle kn\equiv k'n\mod m}, and since {\displaystyle \gcd(m,n)=1}, {\displaystyle n} is invertible modulo {\displaystyle m}, which is why we may multiply by this inverse to obtain {\displaystyle k\equiv k'\mod m}. Since {\displaystyle 1\leq k,k'\leq m}, the claim follows.
◻
{\displaystyle \Box }
Proof 2 :
We prove the theorem from the Chinese remainder theorem.
Let {\displaystyle n=p_{1}^{k_{1}}\cdots p_{r}^{k_{r}}}. From the Chinese remainder theorem, we obtain a ring isomorphism
{\displaystyle \mathbb {Z} /n\to \mathbb {Z} /p_{1}^{k_{1}}\times \cdots \times \mathbb {Z} /p_{r}^{k_{r}}},
which induces a group isomorphism
{\displaystyle (\mathbb {Z} /n)^{\times }\to \left(\mathbb {Z} /p_{1}^{k_{1}}\right)^{\times }\times \cdots \times \left(\mathbb {Z} /p_{r}^{k_{r}}\right)^{\times }}.
Hence, {\displaystyle \left|(\mathbb {Z} /n)^{*}\right|=\prod _{j=1}^{r}\left|\left(\mathbb {Z} /p_{j}^{k_{j}}\right)^{*}\right|}, and from {\displaystyle \forall m\in \mathbb {N} :\varphi (m)=\left|(\mathbb {Z} /m)^{*}\right|} the claim follows.
◻
{\displaystyle \Box }
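The explicit map from proof 1 can be tested directly (a sketch; the coprime pair {\displaystyle m=8,n=15} is an arbitrary choice): the images of all reduced pairs {\displaystyle (k,l)} should be exactly the {\displaystyle \varphi (mn)} reduced residues modulo {\displaystyle mn}.

```python
# Check the bijection (k, l) -> nk + ml mod mn on reduced pairs.

from math import gcd

def phi(q):
    """Euler's totient, by brute force."""
    return sum(1 for a in range(1, q + 1) if gcd(a, q) == 1)

m, n = 8, 15                       # arbitrary coprime pair
images = set()
for k in range(1, m + 1):
    for l in range(1, n + 1):
        if gcd(k, m) == 1 and gcd(l, n) == 1:
            images.add((n * k + m * l) % (m * n))
```

Every image is coprime to {\displaystyle mn}, and their number equals both {\displaystyle \varphi (m)\varphi (n)} and {\displaystyle \varphi (mn)}, confirming injectivity and surjectivity for this pair.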
Proof 3 : We prove the theorem from lemma 2.11 and induction (due to Hensel).
Let {\displaystyle m,n\in \mathbb {N} } such that {\displaystyle \gcd(m,n)=1}. By lemma 2.11, we have {\displaystyle m=\sum _{e|m}\varphi (e)} and {\displaystyle n=\sum _{d|n}\varphi (d)}, and hence
{\displaystyle mn=\varphi (m)\varphi (n)+\varphi (m)\sum _{d|n,d<n}\varphi (d)+\varphi (n)\sum _{e|m,e<m}\varphi (e)+\sum _{d|n,e|m \atop d<n,e<m}\varphi (d)\varphi (e)}.
Furthermore, by lemma 2.11 and the bijection from the proof of theorem 2.8,
{\displaystyle mn=\sum _{f|mn}\varphi (f)=\sum _{e|m,d|n}\varphi (ed)}.
By the induction hypothesis, applied to the products {\displaystyle ed,en,md<mn}, we thus have
{\displaystyle mn=\varphi (mn)+\varphi (m)\sum _{d|n,d<n}\varphi (d)+\varphi (n)\sum _{e|m,e<m}\varphi (e)+\sum _{e|m,d|n \atop e<m,d<n}\varphi (e)\varphi (d)}.
Comparing the two expressions for {\displaystyle mn} yields {\displaystyle \varphi (mn)=\varphi (m)\varphi (n)}.
◻
{\displaystyle \Box }
Proof 4 : We prove the theorem from lemma 2.11 and the Möbius inversion formula.
Indeed, from lemma 2.11 and the Möbius inversion formula, we obtain {\displaystyle \varphi =\mu *S_{\varphi }=\mu *I_{1}}, which is why {\displaystyle \varphi } is multiplicative as the convolution of two multiplicative functions.
◻
{\displaystyle \Box }
Proof 5 : We prove the theorem from Euler's product formula.
Indeed, if {\displaystyle m=P^{\kappa }} and {\displaystyle n=Q^{\iota }} and {\displaystyle \gcd(m,n)=1}, then {\displaystyle P\cap Q=\emptyset } and hence
{\displaystyle \varphi (m)\varphi (n)=\prod _{p|m}(p^{\kappa _{p}}-p^{\kappa _{p}-1})\prod _{q|n}(q^{\iota _{q}}-q^{\iota _{q}-1})=\varphi (mn)}.
◻
{\displaystyle \Box }
Theorem 2.15 (Möbius inversion formula) :
Let {\displaystyle f} be an arithmetical function and define {\displaystyle S_{f}(n):=\sum _{d|n}f(d)=f*E(n)}.
Then {\displaystyle f=\mu *S_{f}}, i.e. {\displaystyle f(n)=\sum _{d|n}\mu (d)S_{f}\left({\frac {n}{d}}\right)}.
Proof :
By lemma 2.14 and the associativity and commutativity of convolution,
{\displaystyle \mu *S_{f}=\mu *f*E=\mu *E*f=\delta *f=f}.
◻
{\displaystyle \Box }
Proof 1 :
We prove the theorem from lemma 2.10 and the fact that {\displaystyle \varphi } is multiplicative.
Indeed, let {\displaystyle p} be a prime number and let {\displaystyle k\in \mathbb {N} }. Then {\displaystyle \varphi (p^{k})=p^{k}-p^{k-1}}, since by lemma 2.10 {\displaystyle \varphi (p^{k})=(\mu *I_{1})(p^{k})=\sum _{j=0}^{k}\mu (p^{j})p^{k-j}=p^{k}-p^{k-1}}. Therefore,
{\displaystyle \varphi (n)=\prod _{j=1}^{r}\left(p_{j}^{k_{j}}-p_{j}^{k_{j}-1}\right)=n\prod _{j=1}^{r}\left(1-{\frac {1}{p_{j}}}\right)},
where the latter equation follows from
{\displaystyle {\begin{aligned}n\prod _{j=1}^{r}\left(1-{\frac {1}{p_{j}}}\right)&=p_{1}^{k_{1}}\cdots p_{r}^{k_{r}}\prod _{j=1}^{r}\left(1-{\frac {1}{p_{j}}}\right)\\&=\prod _{j=1}^{r}p_{j}^{k_{j}}\left(1-{\frac {1}{p_{j}}}\right)\\&=\prod _{j=1}^{r}\left(p_{j}^{k_{j}}-p_{j}^{k_{j}-1}\right)\end{aligned}}}.
◻
{\displaystyle \Box }
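The product formula {\displaystyle \varphi (n)=n\prod _{p|n}\left(1-{\frac {1}{p}}\right)} is easy to check in exact integer arithmetic (a sketch, not part of the text):

```python
# Compare the brute-force totient with the product over distinct prime
# divisors, computed exactly via result // p * (p - 1).

from math import gcd

def phi(n):
    """Euler's totient, by brute force."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

def phi_product(n):
    """n * prod over distinct prime divisors p of n of (1 - 1/p)."""
    result, m, p = n, n, 2
    while p * p <= m:
        if m % p == 0:
            result = result // p * (p - 1)   # exact: p divides result here
            while m % p == 0:
                m //= p
        p += 1
    if m > 1:                                # leftover prime factor
        result = result // m * (m - 1)
    return result
```

The division is exact at each step because {\displaystyle p} divides the running value, so no floating point is needed.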
Proof 2 :
We prove the identity by the means of probability theory.
Let {\displaystyle n\in \mathbb {N} }, {\displaystyle n=p_{1}^{k_{1}}\cdots p_{r}^{k_{r}}}. Choose {\displaystyle \Omega =\{1,\ldots ,n\}}, {\displaystyle {\mathcal {F}}=2^{\Omega }}, {\displaystyle P(A):={\frac {|A|}{n}}}. For {\displaystyle j\in \{1,\ldots ,r\}} define the event {\displaystyle E_{p_{j}}:=\{1\leq k\leq n|p_{j}|k\}}. Then we have
{\displaystyle P\left({\overline {E_{p_{1}}}}\cap \cdots \cap {\overline {E_{p_{r}}}}\right)={\frac {\varphi (n)}{n}}}.
On the other hand, for each {\displaystyle J=\{j_{1},\ldots ,j_{l}\}\subseteq \{1,\ldots ,r\}}, we have
{\displaystyle {\begin{aligned}P\left(E_{p_{j_{1}}}\cap \cdots \cap E_{p_{j_{l}}}\right)&=P\left(\left\{1\leq k\leq n{\big |}\prod _{j\in J}p_{j}|k\right\}\right)\\&={\frac {1}{\prod _{j\in J}p_{j}}}=\prod _{j\in J}P\left(E_{p_{j}}\right)\end{aligned}}}.
Thus, it follows that {\displaystyle E_{p_{1}},\ldots ,E_{p_{r}}} are independent. But since events are independent if and only if their complements are, we obtain
{\displaystyle {\frac {\varphi (n)}{n}}=P\left({\overline {E_{p_{1}}}}\cap \cdots \cap {\overline {E_{p_{r}}}}\right)=P\left({\overline {E_{p_{1}}}}\right)\cdots P\left({\overline {E_{p_{r}}}}\right)=\prod _{j=1}^{r}\left(1-{\frac {1}{p_{j}}}\right)}.
◻
{\displaystyle \Box }
Proof 3 :
We prove the identity from the Möbius inversion formula and lemmas 2.9 and 2.11.
By the Möbius inversion formula and since by lemma 2.11 {\displaystyle S_{\varphi }=I_{1}},
{\displaystyle \sum _{d|n}\mu (d){\frac {n}{d}}=\mu *S_{\varphi }(n)=\varphi (n)},
and the claim follows from lemma 2.9.
◻
{\displaystyle \Box }
Proof 4 :
We prove the identity from the inclusion–exclusion principle.
Indeed, by one of de Morgan's rules and the inclusion–exclusion principle, we have for sets {\displaystyle A_{1},\ldots ,A_{r}\subseteq S}
{\displaystyle {\begin{aligned}\left|\bigcap _{j=1}^{r}A_{j}\right|&=\left|S\setminus \bigcup _{j=1}^{r}\left(S\setminus A_{j}\right)\right|\\&=|S|-\left|\bigcup _{j=1}^{r}\left(S\setminus A_{j}\right)\right|\\&=|S|+\sum _{\emptyset \neq J\subseteq \{1,\ldots ,r\}}(-1)^{|J|}\left|\bigcap _{j\in J}\left(S\setminus A_{j}\right)\right|\\&=\sum _{J\subseteq \{1,\ldots ,r\}}(-1)^{|J|}\left|\bigcap _{j\in J}\left(S\setminus A_{j}\right)\right|\end{aligned}}},
where we use the convention that the empty intersection equals the universal set {\displaystyle S}.
Let now {\displaystyle n=p_{1}^{k_{1}}\cdots p_{r}^{k_{r}}}, and define {\displaystyle S=\{m|1\leq m\leq n\}} and {\displaystyle A_{j}:=\{l\in S|p_{j}\not |l\}} for {\displaystyle 1\leq j\leq r}. Since {\displaystyle \varphi (n)=\left|\bigcap _{j=1}^{r}A_{j}\right|},
we then have
{\displaystyle \varphi (n)=\sum _{J\subseteq \{1,\ldots ,r\}}(-1)^{|J|}\left|\bigcap _{j\in J}\left(S\setminus A_{j}\right)\right|}.
But for each {\displaystyle J\subseteq \{1,\ldots ,r\}}, we have
{\displaystyle {\begin{aligned}\left|\bigcap _{j\in J}\left(S\setminus A_{j}\right)\right|&=\left|\left\{1\leq l\leq n{\big |}\forall j\in J:p_{j}|l\right\}\right|\\&=\left|\left\{1\leq l\leq n{\big |}\left(\prod _{j\in J}p_{j}\right)|l\right\}\right|={\frac {n}{\prod _{j\in J}p_{j}}}\end{aligned}}}.
It follows that
{\displaystyle \varphi (n)=n\sum _{J\subseteq \{1,\ldots ,r\}}(-1)^{|J|}{\frac {1}{\prod _{j\in J}p_{j}}}},
and since
{\displaystyle \prod _{m=1}^{r}\left(1-{\frac {1}{p_{m}}}\right)=\sum _{J\subseteq \{1,\ldots ,r\}}(-1)^{|J|}{\frac {1}{\prod _{j\in J}p_{j}}}},
the theorem is proven.
◻
{\displaystyle \Box }
Theorem 8.? (The Selberg identity) :
Partial fraction decomposition
Proof :
We proceed by induction on {\displaystyle n}. For {\displaystyle n=1}, the statement is true since, by division with remainder, we may write {\displaystyle f(x)=q_{1}(x)p_{1}(x)+r(x)} with {\displaystyle \deg(r)<\deg(p_{1})} to obtain
{\displaystyle {\frac {f(x)}{g(x)}}={\frac {q_{1}(x)}{p_{1}(x)^{k_{1}-1}}}+{\frac {r(x)}{g(x)}}},
and we have reduced the degree of the denominator by one (the latter summand already satisfies the required condition). By repetition of this process, we eventually obtain a denominator of one and thus a polynomial.
Let now the hypothesis be true for {\displaystyle n\in \mathbb {N} }, and assume that {\displaystyle g=\prod _{j=1}^{n+1}p_{j}^{k_{j}}}. Write {\displaystyle G=\prod _{j=1}^{n}p_{j}^{k_{j}}} and {\displaystyle H=p_{n+1}^{k_{n+1}}}. By irreducibility, {\displaystyle \gcd(G,H)=1}. Hence, we find polynomials {\displaystyle S,T} such that {\displaystyle 1=SG+TH}. Then
{\displaystyle {\frac {f}{g}}={\frac {f(SG+TH)}{g}}={\frac {fSG}{g}}+{\frac {fTH}{g}}={\frac {fS}{H}}+{\frac {fT}{G}}}.
Each of the summands of the last term can by the induction hypothesis be written in the desired form.
◻
{\displaystyle \Box }
No matter how complicated our fraction of polynomials {\displaystyle {\frac {f}{g}}} may be, we can give the partial fraction decomposition in finite time, using easy techniques. The method, which for the sake of simplicity differs from the one given in the above constructive existence proof, goes as follows:
Split the polynomial
g
{\displaystyle g}
into irreducible factors.
Using division with remainder of
f
{\displaystyle f}
by
g
{\displaystyle g}
, reduce to the case
deg
(
f
)
<
deg
(
g
)
{\displaystyle \deg(f)<\deg(g)}
(the resulting polynomial
q
{\displaystyle q}
is allowed in the formula of theorem 2.1).
Solve the equation given in theorem 2.1 for the
a
l
,
j
{\displaystyle a_{l,j}}
(this is equivalent to solving a system of linear equations; namely multiply by
g
{\displaystyle g}
and then equate coefficients).
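Step 3 can be sketched as follows; the example fraction and the resulting 2×2 linear system are chosen here for illustration and are not taken from the text. We decompose {\displaystyle (3x+1)/((x-1)(x+2))} with exact rational arithmetic:

```python
from fractions import Fraction

# Decompose (3x + 1)/((x - 1)(x + 2)) = A/(x - 1) + B/(x + 2).
# Multiplying by g(x) = (x - 1)(x + 2) gives A(x + 2) + B(x - 1) = 3x + 1;
# equating coefficients yields the linear system
#   A + B  = 3   (coefficient of x)
#   2A - B = 1   (constant coefficient)
A = Fraction(4, 3)        # adding the two equations gives 3A = 4
B = Fraction(3) - A       # back-substitution into A + B = 3
assert A + B == 3 and 2 * A - B == 1

# sanity check: both sides agree at a sample point
x = Fraction(5)
lhs = (3 * x + 1) / ((x - 1) * (x + 2))
rhs = A / (x - 1) + B / (x + 2)
assert lhs == rhs
```

The same coefficient comparison works for any denominator already split into irreducible factors, only the linear system grows.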
Theorem 2.2 :
The algorithm given above always terminates and gives the partial fraction decomposition of {\displaystyle {\frac {f}{g}}}.
Proof : Due to theorem 2.1, in step three we do obtain a system of linear equations which is solvable. Hence termination and correctness follow. {\displaystyle \Box }
Lemma 5.1 (Convergence of real products) :
Let {\displaystyle (a_{n})_{n\in \mathbb {N} }} be a sequence such that {\displaystyle \sum _{n=1}^{\infty }|a_{n}|} converges. Then for any sequence {\displaystyle (b_{n})_{n\in \mathbb {N} }} of real numbers with {\displaystyle \forall n\in \mathbb {N} :|b_{n}|\leq |a_{n}|}, the product {\displaystyle \prod _{n=1}^{\infty }(1+b_{n})} converges.
Proof :
Without loss of generality, we assume {\displaystyle |b_{n}|<{\frac {1}{2}}} for all {\displaystyle n\in \mathbb {N} } (since {\displaystyle |b_{n}|\leq |a_{n}|\to 0}, only finitely many factors can violate this, and they do not affect convergence).
Denote {\displaystyle p_{n}:=\prod _{j=1}^{n}(1+b_{j})}.
Then we have
{\displaystyle q_{n}:=\log(p_{n})=\sum _{j=1}^{n}\log(1+b_{j})}.
We now apply the Taylor formula of first degree with Lagrange remainder to {\displaystyle \log(1+x)} around {\displaystyle 0} to obtain for {\displaystyle |x|<{\frac {1}{2}}}
{\displaystyle \log(1+x)=x-{\frac {1}{2(1+\xi )^{2}}}x^{2}},
where {\displaystyle \xi } lies between {\displaystyle 0} and {\displaystyle x}, so that {\displaystyle 1+\xi \in \left({\frac {1}{2}},{\frac {3}{2}}\right)}.
Hence, we have for {\displaystyle |x|<{\frac {1}{2}}}
{\displaystyle |\log(1+x)|\leq |x|+{\frac {x^{2}}{2(1+\xi )^{2}}}\leq |x|+2x^{2}\leq 2|x|}.
Hence, {\displaystyle |\log(1+b_{j})|\leq 2|b_{j}|}, and thus we obtain the (even absolute) convergence of the {\displaystyle q_{n}} by comparison with {\displaystyle \sum 2|a_{n}|}; thus, by the continuity of the exponential, also the {\displaystyle p_{n}} converge. {\displaystyle \Box }
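As a quick numerical sanity check of the lemma (the choice {\displaystyle b_{n}=1/n^{2}} and the closed form {\displaystyle \sinh(\pi )/\pi }, which comes from the classical sine product, are supplied here for illustration and are not from the text):

```python
import math

# With b_n = 1/n^2 the hypothesis of lemma 5.1 holds (sum 1/n^2 converges),
# and the product converges -- here to the classical value
# prod_{n>=1} (1 + 1/n^2) = sinh(pi)/pi ~= 3.676.
p = 1.0
for n in range(1, 100001):
    p *= 1 + 1 / n ** 2

assert abs(p - math.sinh(math.pi) / math.pi) < 1e-3
```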
Proof :
We define {\displaystyle p_{n}:=\prod _{j=1}^{n}(1+s_{j})} and {\displaystyle q_{n}:=\prod _{j=1}^{n}(1+a_{j})}. We note that
{\displaystyle |p_{n}|\leq \prod _{j=1}^{n}(1+|s_{j}|)\leq q_{n}}.
Without loss of generality we may assume that all the factors are nonzero; otherwise the sequence {\displaystyle (p_{n})_{n\in \mathbb {N} }} is eventually zero and we have immediate convergence (to zero).
We now prove that {\displaystyle (p_{n})_{n\in \mathbb {N} }} is a Cauchy sequence. Indeed, we have
{\displaystyle |p_{n+k}-p_{n}|=|p_{n}|\left|{\frac {p_{n+k}}{p_{n}}}-1\right|}
and furthermore
{\displaystyle {\begin{aligned}\left|{\frac {p_{n+k}}{p_{n}}}-1\right|&=\left|s_{n+1}+\cdots +s_{n+k}+s_{n+1}s_{n+2}+\cdots +s_{n+1}\cdots s_{n+k}\right|\\&\leq |s_{n+1}|+\cdots +|s_{n+k}|+|s_{n+1}s_{n+2}|+\cdots +|s_{n+1}\cdots s_{n+k}|\\&\leq a_{n+1}+\cdots +a_{n+k}+a_{n+1}a_{n+2}+\cdots +a_{n+1}\cdots a_{n+k}\\&={\frac {q_{n+k}}{q_{n}}}-1=\left|{\frac {q_{n+k}}{q_{n}}}-1\right|\end{aligned}}}
and therefore
{\displaystyle |p_{n+k}-p_{n}|=|p_{n}|\left|{\frac {p_{n+k}}{p_{n}}}-1\right|\leq |q_{n}|\left|{\frac {q_{n+k}}{q_{n}}}-1\right|=|q_{n+k}-q_{n}|}.
Since {\displaystyle q_{n}\to q}, it is a Cauchy sequence, and thus, by the above inequality, so is {\displaystyle (p_{n})_{n\in \mathbb {N} }}. The last claim of the theorem follows by taking {\displaystyle k\to \infty } in the above inequality. {\displaystyle \Box }
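The domination {\displaystyle |p_{n+k}-p_{n}|\leq q_{n+k}-q_{n}} used in the proof can be observed numerically; the concrete choice {\displaystyle s_{n}=i/n^{2}} dominated by {\displaystyle a_{n}=1/n^{2}} is an illustration supplied here, not from the text:

```python
# complex partial products p_n dominated by real partial products q_n
def p(n):
    prod = complex(1, 0)
    for j in range(1, n + 1):
        prod *= 1 + 1j / j ** 2   # s_j = i / j^2
    return prod

def q(n):
    prod = 1.0
    for j in range(1, n + 1):
        prod *= 1 + 1.0 / j ** 2  # a_j = 1 / j^2 >= |s_j|
    return prod

# the Cauchy-difference of p is controlled by that of q
for n in (5, 10, 20):
    for k in (1, 5, 50):
        assert abs(p(n + k) - p(n)) <= (q(n + k) - q(n)) + 1e-12
```

The small additive slack only absorbs floating-point rounding; the exact inequality is what the proof establishes.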
Proof 1 :
We prove the theorem using lemma 5.1 and the comparison test.
Indeed, by lemma 5.1 the product {\displaystyle \prod _{j=1}^{\infty }(1+|a_{j}|)} converges. Hence by theorem 5.2, we obtain convergence and the desired inequality. {\displaystyle \Box }
Proof 2 (without the inequality) :
We prove the theorem, except for the inequality at the end, from lemma 5.1 and by using the Taylor formula on {\displaystyle \arcsin }.
We define {\displaystyle p_{n}:=\prod _{j=1}^{n}(1+s_{j})}. Then since every complex number satisfies {\displaystyle z=|z|e^{i\arg(z)}}, we need to prove the convergence of the sequences {\displaystyle (|p_{n}|)_{n\in \mathbb {N} }} and {\displaystyle (\arg(p_{n}))_{n\in \mathbb {N} }}.
For the first sequence, we note that the convergence of {\displaystyle (|p_{n}|)_{n\in \mathbb {N} }} is equivalent to the convergence of {\displaystyle (|p_{n}|^{2})_{n\in \mathbb {N} }}. Now for each {\displaystyle k\in \mathbb {N} }
{\displaystyle |1+}
Proof :
First, we note that {\displaystyle g(z)} is well-defined for each {\displaystyle z} due to theorem 5.2. In order to prove that the product is holomorphic, we use the fact from complex analysis that if a sequence of functions converging locally uniformly to another function has infinitely many holomorphic members, then the limit is holomorphic as well. Indeed, we note by the inequality in theorem 5.3 that we are given locally uniform convergence. Hence, the theorem follows. {\displaystyle \Box }
The Weierstraß factorisation
The following lemma is of great importance, since we can deduce three important theorems from it:
The existence of holomorphic functions with prescribed zeroes
The Weierstraß factorisation theorem (a way to write any holomorphic function made up from linear factors and the exponential)
The Mittag-Leffler theorem (named after Gösta Mittag-Leffler, a single mathematician despite the double name)
Lemma 5.5 :
Let {\displaystyle (a_{n})_{n\in \mathbb {N} }} be a sequence of complex numbers such that {\displaystyle 0<|a_{1}|\leq |a_{2}|\leq \cdots } and {\displaystyle \lim _{n\to \infty }|a_{n}|=\infty }.
Then the function
{\displaystyle \prod _{n=1}^{\infty }\left(1-{\frac {s}{a_{n}}}\right)e^{\sum _{k=1}^{n-1}{\frac {s^{k}}{ka_{n}^{k}}}}}
has exactly the zeroes {\displaystyle \{a_{n}|n\in \mathbb {N} \}} in the correct multiplicity.
Proof :
Define for each {\displaystyle n\in \mathbb {N} }
{\displaystyle u_{n}(s):=\left(1-{\frac {s}{a_{n}}}\right)e^{\sum _{k=1}^{n-1}{\frac {s^{k}}{ka_{n}^{k}}}}}.
Our plan is to prove that {\displaystyle \prod _{n=1}^{\infty }u_{n}(s)} converges uniformly in every subcircle of the circle of radius {\displaystyle |a_{N}|} for every {\displaystyle N\in \mathbb {N} }.
Since the function {\displaystyle z\mapsto \log(1-z)} is holomorphic in the unit ball around zero, it is equal to its Taylor series there, i.e.
{\displaystyle \log(1-z)=-\sum _{k=1}^{\infty }{\frac {z^{k}}{k}}}.
Hence, for {\displaystyle |s|<|a_{n}|}
{\displaystyle \log \left(u_{n}(s)\right)=-\sum _{k=n}^{\infty }{\frac {s^{k}}{ka_{n}^{k}}}}.
Let now {\displaystyle N\in \mathbb {N} } be given and {\displaystyle n\geq N} be arbitrary. Then we have for {\displaystyle |s|<(1-\epsilon )|a_{N}|}, {\displaystyle \epsilon >0} arbitrary,
{\displaystyle {\begin{aligned}\left|\log \left(u_{n}(s)\right)\right|&=\left|\sum _{k=n}^{\infty }{\frac {s^{k}}{ka_{n}^{k}}}\right|\\&\leq \sum _{k=n}^{\infty }\left|{\frac {s^{k}}{ka_{n}^{k}}}\right|\\&\leq \sum _{k=n}^{\infty }(1-\epsilon )^{k}\\&=(1-\epsilon )^{n}{\frac {1}{\epsilon }}\end{aligned}}}.
Now summing over {\displaystyle n\geq N}, we obtain
{\displaystyle \left|\sum _{n=N}^{\infty }\log \left(u_{n}(s)\right)\right|\leq \sum _{n=N}^{\infty }(1-\epsilon )^{n}{\frac {1}{\epsilon }}<\infty }
for all {\displaystyle |s|<(1-\epsilon )|a_{N}|}. Hence, we have uniform convergence in that circle; thus the sum of the logarithms is holomorphic, and so is the original product if we plug everything into the exponential function (note that we do have {\displaystyle \exp(\log(z))=z} even if {\displaystyle z} is an arbitrary nonzero complex number). {\displaystyle \Box }
Note that our method of proof was similar to how we proved lemma 5.1. In spite of this, it is not possible to prove the above lemma directly from theorem 5.4, since the corresponding series does not converge if the {\displaystyle a_{n}} are chosen to increase too slowly.
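The canonical factors of lemma 5.5 can be observed numerically; the sequence {\displaystyle a_{n}=n} and the sample points below are illustrations chosen here, not from the text. The factor is normalised so that the exponential compensates the tail of {\displaystyle \log(1-s/a_{n})}:

```python
import math

def weierstrass_factor(s, a, n):
    # u_n(s) = (1 - s/a) * exp(sum_{k=1}^{n-1} s^k / (k * a^k));
    # the exponential cancels all but the tail of -log(1 - s/a)
    expo = sum((s / a) ** k / k for k in range(1, n))
    return (1 - s / a) * math.exp(expo)

def partial_product(s, N):
    # zeros prescribed at a_n = n
    p = 1.0
    for n in range(1, N + 1):
        p *= weierstrass_factor(s, n, n)
    return p

# the factor for n = 5 vanishes at s = 5, so every partial product
# containing it is exactly zero there ...
assert partial_product(5.0, 50) == 0.0
# ... while away from the prescribed zeroes the product is nonzero
assert abs(partial_product(4.5, 50)) > 0
```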
Proof :
We order {\displaystyle (s_{n})_{n\in \mathbb {N} }} increasingly according to the modulus {\displaystyle |s_{n}|}, using the standard order on the real numbers. We go on to observe that then {\displaystyle |s_{n}|\to \infty }, since if the sequence were to remain bounded, it would have an accumulation point by the Bolzano–Weierstrass theorem. Also, the sequence takes the value zero only finitely often (otherwise zero would be an accumulation point). After eliminating the zeroes from the sequence {\displaystyle (s_{n})_{n\in \mathbb {N} }}, we call the remaining sequence {\displaystyle (a_{n})_{n\in \mathbb {N} }}. Let {\displaystyle m\in \mathbb {N} } be the number of zeroes in {\displaystyle (s_{n})_{n\in \mathbb {N} }}. Then due to lemma 5.5, the function
{\displaystyle s^{m}\prod _{n=1}^{\infty }\left(1-{\frac {s}{a_{n}}}\right)e^{\sum _{k=1}^{n-1}{\frac {s^{k}}{ka_{n}^{k}}}}}
has the required properties. {\displaystyle \Box }
Proof :
First, we note that {\displaystyle (s_{n})_{n\in \mathbb {N} }} does not have an accumulation point, since otherwise {\displaystyle f} would be the constant zero function by the identity theorem from complex analysis. From theorem 5.6, we obtain that the function
{\displaystyle g(s):=s^{m}\prod _{n=1}^{\infty }\left(1-{\frac {s}{a_{n}}}\right)e^{\sum _{k=1}^{n-1}{\frac {s^{k}}{ka_{n}^{k}}}}}
has exactly the zeroes {\displaystyle \{s_{n}|n\in \mathbb {N} \}} with the right multiplicity, where the sequence {\displaystyle (a_{n})_{n\in \mathbb {N} }} consists of the nonzero elements of the sequence {\displaystyle (s_{n})_{n\in \mathbb {N} }} ordered ascendingly with respect to their absolute value, and {\displaystyle m\in \mathbb {N} } is the number of zeroes within the sequence {\displaystyle (s_{n})_{n\in \mathbb {N} }}. We have that {\displaystyle f/g} has no zeroes and is bounded, and hence holomorphic due to Riemann's theorem on removable singularities. For, if {\displaystyle f/g} were unbounded, it would have a singularity at a zero {\displaystyle z_{0}} of {\displaystyle g}. This singularity cannot be essential, since dividing {\displaystyle g} by finitely many linear factors would eliminate that singularity. Hence we would have a pole, and it could be removed by multiplying {\displaystyle f/g} by linear factors. But then {\displaystyle g/f} has a zero of the order of that pole, which is not possible, since we may eliminate all the zeroes of {\displaystyle g/f} by writing {\displaystyle f=(z-z_{0})^{l}h}, where {\displaystyle h} is holomorphic and nonzero at {\displaystyle z_{0}} and {\displaystyle l} is the order of the zero of {\displaystyle f} at {\displaystyle z_{0}}.
Hence, {\displaystyle f/g} has a holomorphic logarithm on {\displaystyle \mathbb {C} }, which we shall denote by {\displaystyle H}. This satisfies
{\displaystyle z^{m}e^{H(z)}\prod _{n=1}^{\infty }\left(1-{\frac {z}{a_{n}}}\right)e^{\sum _{k=1}^{n-1}{\frac {z^{k}}{ka_{n}^{k}}}}=f(z)}. {\displaystyle \Box }
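A classical instance of such a factorisation, stated here for illustration (it is not derived in the text), is the product representation of the sine; its zeroes {\displaystyle \pm n} pair up so that the convergence-producing exponential factors cancel:

```latex
\sin(\pi s) \;=\; \pi s \prod_{n=1}^{\infty}\left(1-\frac{s^{2}}{n^{2}}\right)
```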
Proof :
From theorem 5.7, we obtain a function {\displaystyle g} with zeroes {\displaystyle \{s_{n}|n\in \mathbb {N} \}} in the right multiplicity. Set {\displaystyle f=1/g}. {\displaystyle \Box }
The Hadamard factorisation
In this subsection, we strive to factor certain holomorphic functions in a way that makes them even easier to deal with than the Weierstraß factorisation. This is the Hadamard factorisation. It only works for functions satisfying a certain growth estimate, but in fact, many important functions occurring in analytic number theory do satisfy this estimate, and thus this factorisation will give us ways to prove certain theorems about those functions.
In order to prove that we may carry out a Hadamard factorisation, we need some estimates for holomorphic functions as well as some preparatory lemmata.
Estimates for holomorphic functions
Proof :
Set {\displaystyle m:=N(r)} and define the function {\displaystyle g:\mathbb {C} \to \mathbb {C} } by
{\displaystyle g(s):={\begin{cases}f(s)\prod _{j=1}^{m}{\frac {R^{2}-s{\overline {s_{j}}}}{R(s-s_{j})}}&s\notin \{s_{1},\ldots ,s_{m}\}\\\lim _{t\to s}f(t)\prod _{j=1}^{m}{\frac {R^{2}-t{\overline {s_{j}}}}{R(t-s_{j})}}&{\text{otherwise}}\end{cases}}},
where the latter limit exists by developing {\displaystyle f} into a power series at {\displaystyle s} and observing that the constant coefficient vanishes. By Riemann's theorem on removable singularities, {\displaystyle g} is holomorphic. We now have
{\displaystyle |g(0)|=|f(0)|\prod _{j=1}^{m}{\frac {R}{|s_{j}|}}},
and if further {\displaystyle |s|=R}, then {\displaystyle {\frac {|s|}{R}}=1}, and hence we may multiply by that number without changing anything to obtain for {\displaystyle j\in \{1,\ldots ,m\}}
{\displaystyle {\begin{aligned}\left|{\frac {R^{2}-s{\overline {s_{j}}}}{R(s-s_{j})}}\right|=1&\Leftrightarrow {\frac {|s|}{R}}\left|{\frac {R^{2}-s{\overline {s_{j}}}}{R(s-s_{j})}}\right|=1\\&\Leftrightarrow \left|{\frac {R^{2}s-s^{2}{\overline {s_{j}}}}{R^{2}s-R^{2}s_{j}}}\right|=1\\&\Leftrightarrow \left|{\frac {R^{2}-s{\overline {s_{j}}}}{R^{2}-R^{2}{\frac {s_{j}}{s}}}}\right|=1\end{aligned}}}.
Now writing {\displaystyle s=\sigma +it} and {\displaystyle s_{j}=\sigma _{j}+it_{j}}, we obtain on the one hand
{\displaystyle s{\overline {s_{j}}}=\sigma \sigma _{j}+tt_{j}+i(t\sigma _{j}-\sigma t_{j})}
and on the other hand, using {\displaystyle |s|^{2}=\sigma ^{2}+t^{2}=R^{2}},
{\displaystyle R^{2}{\frac {s_{j}}{s}}=R^{2}{\frac {\sigma _{j}\sigma +t_{j}t+i(t_{j}\sigma -\sigma _{j}t)}{\sigma ^{2}+t^{2}}}=\sigma \sigma _{j}+tt_{j}+i(t_{j}\sigma -\sigma _{j}t)}.
Hence,
{\displaystyle {\overline {s{\overline {s_{j}}}}}=R^{2}{\frac {s_{j}}{s}}},
which is why both {\displaystyle s{\overline {s_{j}}}} and {\displaystyle R^{2}{\frac {s_{j}}{s}}} have the same distance to {\displaystyle R^{2}}, since {\displaystyle R^{2}} lies on the real axis.
Hence, due to the maximum principle, we have
{\displaystyle |f(0)|\prod _{j=1}^{m}{\frac {R}{|s_{j}|}}=|g(0)|\leq \max _{|s|=R}|g(s)|=\max _{|s|=R}|f(s)|}. {\displaystyle \Box }
Proof :
First, we consider the case {\displaystyle s_{0}=0} and {\displaystyle f(0)=0}. We may write {\displaystyle f} in its power series form
{\displaystyle f(s)=\sum _{j=1}^{\infty }a_{j}s^{j},\quad s\in B_{R}(0)},
where {\displaystyle a_{j}={\frac {f^{(j)}(0)}{j!}}}. If we write {\displaystyle \partial B_{R}(0)\ni s=Re^{i\varphi }} and {\displaystyle a_{j}=|a_{j}|e^{i\varphi _{j}}}, we obtain by Euler's formula
{\displaystyle \Re \left(a_{j}s^{j}\right)=|a_{j}|R^{j}\cos(j\varphi +\varphi _{j})}
and thus
{\displaystyle \Re f(s)=\sum _{j=1}^{\infty }|a_{j}|R^{j}\cos(j\varphi +\varphi _{j})}.
Since the latter sum is majorised by the sum {\displaystyle \sum _{j=1}^{\infty }|a_{j}|R^{j}}, it converges absolutely and uniformly in {\displaystyle \varphi }. Hence, by exchanging the order of integration and summation, we obtain
{\displaystyle \int _{0}^{2\pi }\Re f(Re^{i\varphi })d\varphi =0}
due to
{\displaystyle \int _{0}^{2\pi }\cos(j\varphi +\varphi _{j})d\varphi =\left[{\frac {1}{j}}\sin \left(j\varphi +\varphi _{j}\right)\right]_{\varphi =0}^{\varphi =2\pi }=0}
and further for all {\displaystyle n\in \mathbb {N} }
{\displaystyle \int _{0}^{2\pi }\Re f(Re^{i\varphi })\cos(n\varphi +\varphi _{n})d\varphi =\pi |a_{n}|R^{n}}
due to
{\displaystyle \int _{0}^{2\pi }\cos(j\varphi +\varphi _{j})\cos(n\varphi +\varphi _{n})d\varphi =\pi \delta _{j,n}},
as can be seen using integration by parts twice and {\displaystyle 1=\sin ^{2}+\cos ^{2}}. By monotonicity of the integral, we now have
{\displaystyle \pi |a_{n}|R^{n}\leq \int _{0}^{2\pi }\Re f(Re^{i\varphi })(1+\cos(n\varphi +\varphi _{n}))d\varphi \leq 2\pi M}.
This proves the theorem in the case {\displaystyle s_{0}=0=f(0)}
. For the general case, we define
{\displaystyle g(s):=f(s+s_{0})-f(s_{0})}.
Then {\displaystyle g(0)=0}, hence by the case we already proved,
{\displaystyle \left|{\frac {f^{(n)}(s_{0})}{n!}}\right|=\left|{\frac {g^{(n)}(0)}{n!}}\right|\leq {\frac {2}{R^{n}}}\max _{|s|=R}\Re g(s)={\frac {2}{R^{n}}}(M-\Re f(s_{0}))}. {\displaystyle \Box }
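The resulting coefficient bound {\displaystyle |a_{n}|\leq 2M/R^{n}} (for {\displaystyle s_{0}=0}, {\displaystyle f(0)=0}) can be sanity-checked numerically; the test function {\displaystyle f(z)=e^{z}-1} and the radius below are choices made here for illustration, not from the text:

```python
import math

# Check |f^(n)(0)/n!| <= (2/R^n) * max_{|s|=R} Re f(s) for f(z) = e^z - 1,
# whose Taylor coefficients are a_n = 1/n!.
R = 2.0
# max over |s| = R of Re(e^s - 1) = e^{R cos(phi)} cos(R sin(phi)) - 1
# is attained at phi = 0, giving M = e^R - 1
M = math.exp(R) - 1
for n in range(1, 10):
    a_n = 1 / math.factorial(n)
    assert a_n <= 2 * M / R ** n
```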
Further preparations
Definition 5.12 (exponent of convergence) :
Let {\displaystyle (s_{n})_{n\in \mathbb {N} }} be a sequence of complex numbers not containing zero such that
{\displaystyle \sum _{n=1}^{\infty }{\frac {1}{|s_{n}|^{\beta }}}}
converges for some {\displaystyle \beta >0}. Then
{\displaystyle \sigma :=\inf \left\{\beta >0{\big |}\sum _{n=1}^{\infty }{\frac {1}{|s_{n}|^{\beta }}}<\infty \right\}}
is called the exponent of convergence of the sequence {\displaystyle (s_{n})_{n\in \mathbb {N} }}.
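Two worked examples, supplied here for illustration (not from the text): for {\displaystyle s_{n}=n} the series converges precisely for {\displaystyle \beta >1}, while for {\displaystyle s_{n}=2^{n}} it converges for every {\displaystyle \beta >0}, so

```latex
\sigma\left((n)_{n\in\mathbb{N}}\right)=1,
\qquad
\sigma\left((2^{n})_{n\in\mathbb{N}}\right)=0.
```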
Lemma 5.14 :