Useful summation formulas
Analytic number theory is so abysmally complex that we need a basic toolkit of summation formulas first in order to prove some of the most basic theorems of the theory.
Abel's summation formula Edit
Note: We need the Riemann integrability to be able to apply the fundamental theorem of calculus.
Proof 1 :
We prove the theorem by induction on
⌊
x
⌋
{\displaystyle \lfloor x\rfloor }
.
1.
⌊
x
⌋
=
1
{\displaystyle \lfloor x\rfloor =1}
:
First, we have in this case
∑
1
≤
n
≤
x
a
n
f
(
n
)
=
a
1
f
(
1
)
{\displaystyle \sum _{1\leq n\leq x}a_{n}f(n)=a_{1}f(1)}
.Then, we have
A
(
x
)
f
(
x
)
=
a
1
f
(
x
)
=
a
1
f
(
1
)
−
a
1
(
f
(
1
)
−
f
(
x
)
)
=
a
1
f
(
1
)
+
∫
1
x
A
(
y
)
f
′
(
y
)
d
y
{\displaystyle A(x)f(x)=a_{1}f(x)=a_{1}f(1)-a_{1}(f(1)-f(x))=a_{1}f(1)+\int _{1}^{x}A(y)f'(y)dy}
by the fundamental theorem of calculus.
2. Induction step:
Define
N
:=
⌊
x
⌋
{\displaystyle N:=\lfloor x\rfloor }
. We have
∑
1
≤
n
≤
x
a
n
f
(
n
)
=
∑
1
≤
n
≤
x
−
1
a
n
f
(
n
)
+
a
N
f
(
N
)
=
A
(
x
−
1
)
f
(
x
−
1
)
−
∫
1
x
−
1
A
(
y
)
f
′
(
y
)
d
y
+
a
N
f
(
N
)
{\displaystyle {\begin{aligned}\sum _{1\leq n\leq x}a_{n}f(n)&=\sum _{1\leq n\leq x-1}a_{n}f(n)+a_{N}f(N)\\&=A(x-1)f(x-1)-\int _{1}^{x-1}A(y)f'(y)dy+a_{N}f(N)\end{aligned}}}
by the induction hypothesis. Further,
−
∫
1
x
−
1
A
(
y
)
f
′
(
y
)
d
y
=
−
∫
1
x
A
(
y
)
f
′
(
y
)
d
y
+
∫
x
−
1
x
A
(
y
)
f
′
(
y
)
d
y
=
−
∫
1
x
A
(
y
)
f
′
(
y
)
d
y
+
∫
x
−
1
N
A
(
y
)
f
′
(
y
)
d
y
+
∫
N
x
A
(
y
)
f
′
(
y
)
d
y
=
−
∫
1
x
A
(
y
)
f
′
(
y
)
d
y
+
A
(
N
−
1
)
∫
x
−
1
N
f
′
(
y
)
d
y
+
A
(
N
)
∫
N
x
f
′
(
y
)
d
y
=
−
∫
1
x
A
(
y
)
f
′
(
y
)
d
y
+
A
(
N
−
1
)
(
f
(
N
)
−
f
(
x
−
1
)
)
+
A
(
N
)
(
f
(
x
)
−
f
(
N
)
)
{\displaystyle {\begin{aligned}-\int _{1}^{x-1}A(y)f'(y)dy&=-\int _{1}^{x}A(y)f'(y)dy+\int _{x-1}^{x}A(y)f'(y)dy\\&=-\int _{1}^{x}A(y)f'(y)dy+\int _{x-1}^{N}A(y)f'(y)dy+\int _{N}^{x}A(y)f'(y)dy\\&=-\int _{1}^{x}A(y)f'(y)dy+A(N-1)\int _{x-1}^{N}f'(y)dy+A(N)\int _{N}^{x}f'(y)dy\\&=-\int _{1}^{x}A(y)f'(y)dy+A(N-1)(f(N)-f(x-1))+A(N)(f(x)-f(N))\end{aligned}}}
.Putting things together, we obtain
∑
1
≤
n
≤
x
a
n
f
(
n
)
=
A
(
x
−
1
)
f
(
x
−
1
)
−
∫
1
x
A
(
y
)
f
′
(
y
)
d
y
+
A
(
N
−
1
)
(
f
(
N
)
−
f
(
x
−
1
)
)
+
A
(
N
)
(
f
(
x
)
−
f
(
N
)
)
+
a
N
f
(
N
)
{\displaystyle \sum _{1\leq n\leq x}a_{n}f(n)=A(x-1)f(x-1)-\int _{1}^{x}A(y)f'(y)dy+A(N-1)(f(N)-f(x-1))+A(N)(f(x)-f(N))+a_{N}f(N)}
and thus the desired formula.
The method of proof we applied here was using induction and then trying to express the terms from the induction hypothesis in terms of the terms from the desired formula.
◻
{\displaystyle \Box }
Proof 2 :
We prove the theorem by direct manipulation of the term on the left.
Define
k
:=
⌊
x
⌋
{\displaystyle k:=\lfloor x\rfloor }
.
∑
1
≤
n
≤
x
a
n
f
(
n
)
=
∑
1
≤
n
≤
x
(
A
(
n
)
−
A
(
n
−
1
)
)
f
(
n
)
=
∑
1
≤
n
≤
x
A
(
n
)
f
(
n
)
−
∑
0
≤
n
≤
x
−
1
A
(
n
)
f
(
n
+
1
)
=
∑
1
≤
n
≤
x
A
(
n
)
f
(
n
)
−
∑
1
≤
n
≤
x
−
1
A
(
n
)
f
(
n
+
1
)
=
∑
1
≤
n
≤
x
−
1
A
(
n
)
(
f
(
n
)
−
f
(
n
+
1
)
)
+
A
(
k
)
f
(
k
)
=
∑
1
≤
n
≤
x
−
1
A
(
n
)
(
−
∫
n
n
+
1
f
′
(
t
)
d
t
)
+
A
(
x
)
f
(
x
)
−
∫
k
x
A
(
t
)
f
′
(
t
)
d
t
{\displaystyle {\begin{aligned}\sum _{1\leq n\leq x}a_{n}f(n)&=\sum _{1\leq n\leq x}(A(n)-A(n-1))f(n)\\&=\sum _{1\leq n\leq x}A(n)f(n)-\sum _{0\leq n\leq x-1}A(n)f(n+1)=\sum _{1\leq n\leq x}A(n)f(n)-\sum _{1\leq n\leq x-1}A(n)f(n+1)\\&=\sum _{1\leq n\leq x-1}A(n)(f(n)-f(n+1))+A(k)f(k)\\&=\sum _{1\leq n\leq x-1}A(n)\left(-\int _{n}^{n+1}f'(t)dt\right)+A(x)f(x)-\int _{k}^{x}A(t)f'(t)dt\end{aligned}}}
◻
{\displaystyle \Box }
Proof 3 :
We prove the formula by the means of the Riemann-Stieltjes integral. Indeed, by integration by parts, we have
∑
1
≤
n
≤
x
a
n
f
(
n
)
=
∫
0
x
f
(
t
)
d
A
(
t
)
=
A
(
x
)
f
(
x
)
−
∫
0
x
A
(
t
)
d
f
(
t
)
=
A
(
x
)
f
(
x
)
−
∫
1
x
A
(
t
)
f
′
(
t
)
d
t
{\displaystyle {\begin{aligned}\sum _{1\leq n\leq x}a_{n}f(n)=\int _{0}^{x}f(t)dA(t)&=A(x)f(x)-\int _{0}^{x}A(t)df(t)\\&=A(x)f(x)-\int _{1}^{x}A(t)f'(t)dt\end{aligned}}}
.
◻
{\displaystyle \Box }
Corollary 1.2 :
∑
y
<
n
≤
x
a
n
f
(
n
)
=
A
(
x
)
f
(
x
)
−
A
(
y
)
f
(
y
)
−
∫
y
x
A
(
t
)
f
′
(
t
)
d
t
{\displaystyle \sum _{y<n\leq x}a_{n}f(n)=A(x)f(x)-A(y)f(y)-\int _{y}^{x}A(t)f'(t)dt}
.
Proof 1 :
We deduce the formula from integration by parts for the Riemann-Stieltjes integral.
∑
y
<
n
≤
x
a
n
f
(
n
)
=
∫
y
x
f
(
t
)
d
A
(
t
)
=
f
(
x
)
A
(
x
)
−
f
(
y
)
A
(
y
)
−
∫
y
x
A
(
t
)
d
f
(
t
)
=
f
(
x
)
A
(
x
)
−
f
(
y
)
A
(
y
)
−
∫
y
x
A
(
t
)
f
′
(
t
)
d
t
{\displaystyle {\begin{aligned}\sum _{y<n\leq x}a_{n}f(n)=\int _{y}^{x}f(t)dA(t)&=f(x)A(x)-f(y)A(y)-\int _{y}^{x}A(t)df(t)\\&=f(x)A(x)-f(y)A(y)-\int _{y}^{x}A(t)f'(t)dt\end{aligned}}}
◻
{\displaystyle \Box }
Proof 2 :
We directly manipulate the LHS (left hand side).
Define
k
:=
⌊
x
⌋
{\displaystyle k:=\lfloor x\rfloor }
and
m
:=
⌊
y
⌋
{\displaystyle m:=\lfloor y\rfloor }
.
∑
y
<
n
≤
x
a
n
f
(
n
)
=
∑
y
<
n
≤
x
(
A
(
n
)
−
A
(
n
−
1
)
)
f
(
n
)
=
∑
y
<
n
≤
x
A
(
n
)
f
(
n
)
−
∑
y
−
1
<
n
≤
x
−
1
A
(
n
)
f
(
n
+
1
)
=
∑
y
<
n
≤
x
−
1
A
(
n
)
(
f
(
n
)
−
f
(
n
+
1
)
)
−
A
(
m
)
f
(
m
)
+
A
(
k
)
f
(
k
)
=
∑
y
<
n
≤
x
−
1
A
(
n
)
−
(
∫
n
n
+
1
f
′
(
t
)
d
t
)
−
A
(
y
)
f
(
y
)
+
A
(
x
)
f
(
x
)
−
∫
y
m
A
(
t
)
f
′
(
t
)
d
t
−
∫
k
x
A
(
t
)
f
′
(
t
)
d
t
{\displaystyle {\begin{aligned}\sum _{y<n\leq x}a_{n}f(n)&=\sum _{y<n\leq x}(A(n)-A(n-1))f(n)\\&=\sum _{y<n\leq x}A(n)f(n)-\sum _{y-1<n\leq x-1}A(n)f(n+1)\\&=\sum _{y<n\leq x-1}A(n)(f(n)-f(n+1))-A(m)f(m)+A(k)f(k)\\&=\sum _{y<n\leq x-1}A(n)-\left(\int _{n}^{n+1}f'(t)dt\right)-A(y)f(y)+A(x)f(x)-\int _{y}^{m}A(t)f'(t)dt-\int _{k}^{x}A(t)f'(t)dt\end{aligned}}}
◻
{\displaystyle \Box }
Two further proofs are given in exercises 1.1.1 and 1.1.5.
We note that induction and direct manipulation are quicker proofs for theorem 1.1, while corollary 1.2 is quicker proven from theorem 1.1 or Riemann-Stieltjes integration.
Exercises Edit
Exercise 1.1.1 : Prove corollary 1.2 from theorem 1.1. Hint:
∑
y
<
n
≤
x
a
n
f
(
n
)
=
∑
1
≤
n
≤
x
a
n
f
(
n
)
−
∑
1
<
n
≤
y
a
n
f
(
n
)
{\displaystyle \sum _{y<n\leq x}a_{n}f(n)=\sum _{1\leq n\leq x}a_{n}f(n)-\sum _{1<n\leq y}a_{n}f(n)}
.
Exercise 1.1.2 : Compute
∑
n
=
1
N
n
3
{\displaystyle \sum _{n=1}^{N}n^{3}}
. Hint: Use
a
n
=
n
{\displaystyle a_{n}=n}
,
f
(
x
)
=
x
2
{\displaystyle f(x)=x^{2}}
, apply Abelian summation and split the resulting integral into pieces where
A
(
t
)
{\displaystyle A(t)}
is constant. Then apply a similar process.
Exercise 1.1.3 : Prove that the limit
lim
n
→
∞
(
−
log
(
n
)
+
∑
k
=
1
n
1
k
)
{\displaystyle \lim _{n\to \infty }\left(-\log(n)+\sum _{k=1}^{n}{\frac {1}{k}}\right)}
exists. This limit is called the Euler–Mascheroni constant . Hint: Use
a
n
=
1
{\displaystyle a_{n}=1}
and
f
(
x
)
=
1
x
{\displaystyle f(x)={\frac {1}{x}}}
.
Exercise 1.1.4 : Prove theorem 1.1 from corollary 1.2.
Exercise 1.1.5 : Prove corollary 1.2 using induction on
⌊
x
⌋
{\displaystyle \lfloor x\rfloor }
.Euler's summation formula Edit
Definition 1.3 :
For
x
∈
R
{\displaystyle x\in \mathbb {R} }
, we define
{
x
}
:=
x
−
⌊
x
⌋
{\displaystyle \{x\}:=x-\lfloor x\rfloor }
.
Theorem 1.4 (Euler's summation formula) :
Let
f
:
R
→
R
{\displaystyle f:\mathbb {R} \to \mathbb {R} }
be a differentiable function, such that
f
′
{\displaystyle f'}
is Riemann integrable. Then
∑
y
<
n
≤
x
f
(
n
)
=
{
x
}
f
(
x
)
−
{
y
}
f
(
y
)
+
∫
y
x
f
(
t
)
d
t
+
∫
y
x
{
t
}
f
′
(
t
)
d
t
{\displaystyle \sum _{y<n\leq x}f(n)=\{x\}f(x)-\{y\}f(y)+\int _{y}^{x}f(t)dt+\int _{y}^{x}\{t\}f'(t)dt}
.
Proof :
We prove the theorem from Corollary 1.2, setting
a
n
=
1
{\displaystyle a_{n}=1}
and using integration by parts (integration by parts is proven using the fundamental theorem of calculus).
Indeed,
∑
y
<
n
≤
x
f
(
n
)
=
⌊
x
⌋
f
(
x
)
−
⌊
y
⌋
f
(
y
)
−
∫
y
x
⌊
t
⌋
f
′
(
t
)
d
t
=
⌊
x
⌋
f
(
x
)
−
⌊
y
⌋
f
(
y
)
−
∫
y
x
t
f
′
(
t
)
d
t
+
∫
y
x
{
t
}
f
′
(
t
)
d
t
=
{
x
}
f
(
x
)
−
{
y
}
f
(
y
)
+
∫
y
x
f
(
t
)
d
t
+
∫
y
x
{
t
}
f
′
(
t
)
d
t
{\displaystyle {\begin{aligned}\sum _{y<n\leq x}f(n)&=\lfloor x\rfloor f(x)-\lfloor y\rfloor f(y)-\int _{y}^{x}\lfloor t\rfloor f'(t)dt\\&=\lfloor x\rfloor f(x)-\lfloor y\rfloor f(y)-\int _{y}^{x}tf'(t)dt+\int _{y}^{x}\{t\}f'(t)dt\\&=\{x\}f(x)-\{y\}f(y)+\int _{y}^{x}f(t)dt+\int _{y}^{x}\{t\}f'(t)dt\end{aligned}}}
,where in the last line we used integration by parts on the integral
∫
y
x
t
f
′
(
t
)
d
t
{\displaystyle \int _{y}^{x}tf'(t)dt}
.
◻
{\displaystyle \Box }
Exercises Edit
Prove corollary 1.5.
Euler–Maclaurin formula Edit
Proof 1 :
We prove the theorem by direct computation.
{\displaystyle {\begin{aligned}\end{aligned}}}
Proof 2 :
We prove the theorem from Euler's summation formula.
The Chebychev ψ and ϑ functions
Proposition (the Chebychev ψ function may be written as the sum of Chebyshev ϑ functions) :
We have the identity
ψ
(
x
)
=
∑
m
=
1
log
2
(
x
)
ϑ
(
x
1
/
m
)
{\displaystyle \psi (x)=\sum _{m=1}^{\log _{2}(x)}\vartheta (x^{1/m})}
.
Proposition (estimate of the distance between the Chebychev ψ and ϑ functions) :
Whenever
x
≥
3
,
594
,
641
{\displaystyle x\geq 3,594,641}
, we have
ψ
(
x
)
−
ϑ
(
x
)
=
x
+
O
(
x
ln
(
x
)
2
)
{\displaystyle \psi (x)-\vartheta (x)={\sqrt {x}}+O\left({\frac {\sqrt {x}}{\ln(x)^{2}}}\right)}
.
Note: The current proof gives an inferior error term. A subsequent version will redeem this issue. (Given the Riemann hypothesis, the error term can be made even smaller.)
Proof: We know that the formula
ψ
(
x
)
=
∑
m
=
1
log
2
(
x
)
ϑ
(
x
1
/
m
)
{\displaystyle \psi (x)=\sum _{m=1}^{\log _{2}(x)}\vartheta (x^{1/m})}
holds. Hence,
ψ
(
x
)
−
ϑ
(
x
)
=
∑
m
=
2
log
2
(
x
)
ϑ
(
x
1
/
m
)
{\displaystyle \psi (x)-\vartheta (x)=\sum _{m=2}^{\log _{2}(x)}\vartheta (x^{1/m})}
.By a result obtained by Pierre Dusart (based upon the computational verification of the Riemann hypothesis for small moduli), we have
|
θ
(
x
)
−
x
|
≤
0.2
ln
(
x
)
2
x
{\displaystyle \left|\theta (x)-x\right|\leq {\frac {0.2}{\ln(x)^{2}}}x}
whenever
x
≥
3
,
594
,
641
{\displaystyle x\geq 3,594,641}
. If
x
{\displaystyle x}
is in that range, we hence conclude
ψ
(
x
)
−
ϑ
(
x
)
=
∑
m
=
2
log
2
(
x
)
ϑ
(
x
1
/
m
)
≤
(
1
+
0.2
ln
(
x
)
2
)
∑
m
=
2
log
2
(
x
)
x
1
/
m
{\displaystyle \psi (x)-\vartheta (x)=\sum _{m=2}^{\log _{2}(x)}\vartheta (x^{1/m})\leq \left(1+{\frac {0.2}{\ln(x)^{2}}}\right)\sum _{m=2}^{\log _{2}(x)}x^{1/m}}
.By Euler's summation formula , we have
∑
m
=
2
log
2
(
x
)
x
1
/
m
=
∫
2
log
2
(
x
)
x
1
/
t
d
t
+
∫
2
log
2
(
x
)
{
t
}
(
d
d
t
x
1
/
t
)
d
t
+
{
log
2
(
x
)
}
x
1
/
log
2
(
x
)
−
{
2
}
x
1
/
2
{\displaystyle \sum _{m=2}^{\log _{2}(x)}x^{1/m}=\int _{2}^{\log _{2}(x)}x^{1/t}dt+\int _{2}^{\log _{2}(x)}\{t\}\left({\frac {d}{dt}}x^{1/t}\right)dt+\{\log _{2}(x)\}x^{1/\log _{2}(x)}-\{2\}x^{1/2}}
.Certainly
{
2
}
=
0
{\displaystyle \{2\}=0}
and
{
log
2
(
x
)
}
≤
1
{\displaystyle \{\log _{2}(x)\}\leq 1}
. Moreover,
x
1
/
log
2
(
x
)
=
2
{\displaystyle x^{1/\log _{2}(x)}=2}
. Now derivation shows that
−
exp
(
ln
(
x
)
t
)
t
2
ln
(
x
)
{\displaystyle -\exp \left({\frac {\ln(x)}{t}}\right){\frac {t^{2}}{\ln(x)}}}
is an anti-derivative of the function
x
1
/
t
−
2
t
ln
(
x
)
x
1
/
t
{\displaystyle x^{1/t}-{\frac {2t}{\ln(x)}}x^{1/t}}
of
t
{\displaystyle t}
. By the fundamental theorem of calculus, it follows that
∫
a
b
(
1
−
2
t
ln
(
x
)
)
x
1
/
t
d
t
=
[
−
exp
(
ln
(
x
)
t
)
t
2
ln
(
x
)
]
t
=
a
t
=
b
{\displaystyle \int _{a}^{b}\left(1-{\frac {2t}{\ln(x)}}\right)x^{1/t}dt=\left[-\exp \left({\frac {\ln(x)}{t}}\right){\frac {t^{2}}{\ln(x)}}\right]_{t=a}^{t=b}}
for real numbers
a
,
b
∈
R
{\displaystyle a,b\in \mathbb {R} }
such that
a
<
b
{\displaystyle a<b}
. This integral is not precisely the one we want to estimate. Hence, some analytical trickery will be necessary in order to obtain the estimate we want.
We start by noting that if only the bracketed term in the integral were absent, we would have the estimate we desire. In order to proceed, we replace
x
{\displaystyle x}
by the more general expression
x
y
{\displaystyle xy}
(where
y
≥
1
{\displaystyle y\geq 1}
), and obtain
∫
a
b
(
1
−
2
t
ln
(
x
y
)
)
x
1
/
t
y
1
/
t
d
t
=
[
−
exp
(
ln
(
x
y
)
t
)
t
2
ln
(
x
y
)
]
t
=
a
t
=
b
{\displaystyle \int _{a}^{b}\left(1-{\frac {2t}{\ln(xy)}}\right)x^{1/t}y^{1/t}dt=\left[-\exp \left({\frac {\ln(xy)}{t}}\right){\frac {t^{2}}{\ln(xy)}}\right]_{t=a}^{t=b}}
.The integrand is non-negative so long as
t
≤
ln
(
x
y
)
2
{\displaystyle t\leq {\frac {\ln(xy)}{2}}}
.Moreover, if
t
0
{\displaystyle t_{0}}
is strictly within that range, we obtain
∫
2
t
0
x
1
/
t
y
1
/
t
d
t
≤
(
1
−
2
t
0
ln
(
x
y
)
)
−
1
∫
2
t
0
(
1
−
2
t
ln
(
x
y
)
)
x
1
/
t
y
1
/
t
d
t
=
[
−
exp
(
ln
(
x
y
)
t
)
t
2
ln
(
x
y
)
]
t
=
2
t
=
t
0
{\displaystyle \int _{2}^{t_{0}}x^{1/t}y^{1/t}dt\leq \left(1-{\frac {2t_{0}}{\ln(xy)}}\right)^{-1}\int _{2}^{t_{0}}\left(1-{\frac {2t}{\ln(xy)}}\right)x^{1/t}y^{1/t}dt=\left[-\exp \left({\frac {\ln(xy)}{t}}\right){\frac {t^{2}}{\ln(xy)}}\right]_{t=2}^{t=t_{0}}}
.We now introduce a constant
K
∈
(
2
,
t
0
)
{\displaystyle K\in (2,t_{0})}
and obtain the integrals
∫
2
K
x
1
/
t
y
1
/
t
d
t
{\displaystyle \int _{2}^{K}x^{1/t}y^{1/t}dt}
and
∫
K
t
0
x
1
/
t
y
1
/
t
d
t
{\displaystyle \int _{K}^{t_{0}}x^{1/t}y^{1/t}dt}
.The first integral majorises the integral
y
1
/
K
∫
2
K
x
1
/
t
d
t
{\displaystyle y^{1/K}\int _{2}^{K}x^{1/t}dt}
,whereas the second integral majorises the integral
∫
K
t
0
x
1
/
t
d
t
{\displaystyle \int _{K}^{t_{0}}x^{1/t}dt}
.We obtain that
∫
2
t
0
x
1
/
t
d
t
≤
1
y
1
/
K
∫
2
K
x
1
/
t
y
1
1
/
t
d
t
+
∫
K
t
0
x
1
/
t
y
2
1
/
t
d
t
{\displaystyle \int _{2}^{t_{0}}x^{1/t}dt\leq {\frac {1}{y^{1/K}}}\int _{2}^{K}x^{1/t}y_{1}^{1/t}dt+\int _{K}^{t_{0}}x^{1/t}y_{2}^{1/t}dt}
.Now we would like to set
t
0
=
log
2
(
x
)
{\displaystyle t_{0}=\log _{2}(x)}
. To do so, we must ensure that
y
{\displaystyle y}
is sufficiently large so that
K
{\displaystyle K}
resp.
t
0
{\displaystyle t_{0}}
is strictly within the admissible interval.
The two summands on the left are now estimated using our computation above, where
t
0
{\displaystyle t_{0}}
is replaced by
K
{\displaystyle K}
for the first computation: Indeed,
∫
2
K
x
1
/
t
y
1
1
/
t
d
t
≤
(
1
−
2
K
ln
(
x
y
1
)
)
−
1
[
−
exp
(
ln
(
x
y
1
)
t
)
t
2
ln
(
x
y
1
)
]
t
=
2
t
=
K
{\displaystyle \int _{2}^{K}x^{1/t}y_{1}^{1/t}dt\leq \left(1-{\frac {2K}{\ln(xy_{1})}}\right)^{-1}\left[-\exp \left({\frac {\ln(xy_{1})}{t}}\right){\frac {t^{2}}{\ln(xy_{1})}}\right]_{t=2}^{t=K}}
and
∫
K
t
0
x
1
/
t
y
2
1
/
t
d
t
≤
(
1
−
2
t
0
ln
(
x
y
2
)
)
−
1
[
−
exp
(
ln
(
x
y
2
)
t
)
t
2
ln
(
x
y
2
)
]
t
=
K
t
=
t
0
{\displaystyle \int _{K}^{t_{0}}x^{1/t}y_{2}^{1/t}dt\leq \left(1-{\frac {2t_{0}}{\ln(xy_{2})}}\right)^{-1}\left[-\exp \left({\frac {\ln(xy_{2})}{t}}\right){\frac {t^{2}}{\ln(xy_{2})}}\right]_{t=K}^{t=t_{0}}}
.Putting the estimates together and setting
t
0
=
log
2
(
x
)
{\displaystyle t_{0}=\log _{2}(x)}
, we obtain
∫
2
log
2
(
x
)
x
1
/
t
d
t
≤
1
y
1
1
/
K
(
1
−
2
K
ln
(
x
y
1
)
)
−
1
[
−
exp
(
ln
(
x
y
1
)
t
)
t
2
ln
(
x
y
1
)
]
t
=
2
t
=
K
+
(
1
−
2
t
0
ln
(
x
y
2
)
)
−
1
[
−
exp
(
ln
(
x
y
2
)
t
)
t
2
ln
(
x
y
2
)
]
t
=
K
t
=
log
2
(
x
)
{\displaystyle \int _{2}^{\log _{2}(x)}x^{1/t}dt\leq {\frac {1}{y_{1}^{1/K}}}\left(1-{\frac {2K}{\ln(xy_{1})}}\right)^{-1}\left[-\exp \left({\frac {\ln(xy_{1})}{t}}\right){\frac {t^{2}}{\ln(xy_{1})}}\right]_{t=2}^{t=K}+\left(1-{\frac {2t_{0}}{\ln(xy_{2})}}\right)^{-1}\left[-\exp \left({\frac {\ln(xy_{2})}{t}}\right){\frac {t^{2}}{\ln(xy_{2})}}\right]_{t=K}^{t=\log _{2}(x)}}
whenever
K
≤
ln
(
x
y
1
)
2
{\displaystyle K\leq {\frac {\ln(xy_{1})}{2}}}
and
log
2
(
x
)
≤
ln
(
x
y
2
)
2
{\displaystyle \log _{2}(x)\leq {\frac {\ln(xy_{2})}{2}}}
.We now choose the ansatz
1
−
2
K
ln
(
x
y
1
)
=
C
{\displaystyle 1-{\frac {2K}{\ln(xy_{1})}}=C}
and
1
−
2
t
0
ln
(
x
y
2
)
=
D
{\displaystyle 1-{\frac {2t_{0}}{\ln(xy_{2})}}=D}
for constants
C
{\displaystyle C}
and
D
{\displaystyle D}
. These equations are readily seen to imply
y
1
=
1
x
exp
(
2
K
1
−
C
)
{\displaystyle y_{1}={\frac {1}{x}}\exp \left({\frac {2K}{1-C}}\right)}
and
y
2
=
1
x
exp
(
2
log
2
(
x
)
1
−
D
)
{\displaystyle y_{2}={\frac {1}{x}}\exp \left({\frac {2\log _{2}(x)}{1-D}}\right)}
.Note though that
y
1
≥
1
{\displaystyle y_{1}\geq 1}
and
y
2
≥
1
{\displaystyle y_{2}\geq 1}
is needed. The first condition yields
K
≥
1
−
C
2
ln
(
x
)
{\displaystyle K\geq {\frac {1-C}{2}}\ln(x)}
.The equations for
y
1
{\displaystyle y_{1}}
and
y
2
{\displaystyle y_{2}}
may be inserted into the above constraints on
K
{\displaystyle K}
and
log
2
(
x
)
{\displaystyle \log _{2}(x)}
; this yields
K
≤
2
K
1
−
C
{\displaystyle K\leq {\frac {2K}{1-C}}}
and
log
2
(
x
)
≤
2
log
2
(
x
)
1
−
D
{\displaystyle \log _{2}(x)\leq {\frac {2\log _{2}(x)}{1-D}}}
, that is,
C
≥
1
2
{\displaystyle C\geq {\frac {1}{2}}}
and
D
≥
1
2
{\displaystyle D\geq {\frac {1}{2}}}
.If all these conditions are true, the ansatz immediately yields
∫
2
log
2
(
x
)
x
1
/
t
d
t
≤
C
−
1
y
1
1
/
K
[
−
exp
(
ln
(
x
y
1
)
t
)
t
2
ln
(
x
y
1
)
]
t
=
2
t
=
K
+
D
−
1
[
−
exp
(
ln
(
x
y
2
)
t
)
t
2
ln
(
x
y
2
)
]
t
=
K
t
=
log
2
(
x
)
{\displaystyle \int _{2}^{\log _{2}(x)}x^{1/t}dt\leq {\frac {C^{-1}}{y_{1}^{1/K}}}\left[-\exp \left({\frac {\ln(xy_{1})}{t}}\right){\frac {t^{2}}{\ln(xy_{1})}}\right]_{t=2}^{t=K}+D^{-1}\left[-\exp \left({\frac {\ln(xy_{2})}{t}}\right){\frac {t^{2}}{\ln(xy_{2})}}\right]_{t=K}^{t=\log _{2}(x)}}
.We now amend our ansatz by further postulating
K
=
(
1
−
C
2
+
α
)
ln
(
x
)
{\displaystyle K=\left({\frac {1-C}{2}}+\alpha \right)\ln(x)}
.This yields
y
1
=
1
x
exp
(
(
1
−
C
+
2
α
)
ln
(
x
)
1
−
C
)
{\displaystyle y_{1}={\frac {1}{x}}\exp \left({\frac {(1-C+2\alpha )\ln(x)}{1-C}}\right)}
and
C
−
1
y
1
1
/
K
[
−
exp
(
ln
(
x
y
1
)
t
)
t
2
ln
(
x
y
1
)
]
t
=
2
t
=
K
=
C
−
1
y
1
1
/
K
[
−
exp
(
(
1
−
C
+
2
α
)
ln
(
x
)
1
−
C
t
)
t
2
ln
(
x
y
1
)
]
t
=
2
t
=
K
{\displaystyle {\frac {C^{-1}}{y_{1}^{1/K}}}\left[-\exp \left({\frac {\ln(xy_{1})}{t}}\right){\frac {t^{2}}{\ln(xy_{1})}}\right]_{t=2}^{t=K}={\frac {C^{-1}}{y_{1}^{1/K}}}\left[-\exp \left({\frac {\frac {(1-C+2\alpha )\ln(x)}{1-C}}{t}}\right){\frac {t^{2}}{\ln(xy_{1})}}\right]_{t=2}^{t=K}}
.From this we deduce that in order to obtain an asymptotically sharp error term, we need to set
α
=
0
{\displaystyle \alpha =0}
. But doing so yields the desired result.
◻
{\displaystyle \Box }
Arithmetic functions
In this chapter, we shall set up the basic theory of arithmetic functions. This theory will be seen in action in later chapters, but in particular in chapter 9.
Definitions Edit
Definition 2.1 :
An arithmetical function is a function
f
:
N
→
C
{\displaystyle f:\mathbb {N} \to \mathbb {C} }
.
Definition 2.2 (important arithmetical functions) :
The Kronecker delta :
δ
(
n
)
:=
{
1
n
=
1
0
n
≠
1
{\displaystyle \delta (n):={\begin{cases}1&n=1\\0&n\neq 1\end{cases}}}
Euler's totient function :
φ
(
n
)
:=
|
{
1
≤
k
≤
n
|
gcd
(
k
,
n
)
=
1
}
|
{\displaystyle \varphi (n):=\left|\{1\leq k\leq n|\gcd(k,n)=1\}\right|}
Möbius'
μ
{\displaystyle \mu }
-function :
μ
(
n
)
:=
{
0
there exists
m
∈
N
such that
m
2
|
n
(
−
1
)
r
n
is product of
r
pairwise different prime numbers
1
n
=
1
{\displaystyle \mu (n):={\begin{cases}0&{\text{ there exists }}m\in \mathbb {N} {\text{ such that }}m^{2}|n\\(-1)^{r}&n{\text{ is product of }}r{\text{ pairwise different prime numbers}}\\1&n=1\end{cases}}}
The von Mangoldt function :
Λ
(
n
)
:=
{
log
(
p
)
∃
p
prime
,
k
∈
N
:
n
=
p
k
0
otherwise
{\displaystyle \Lambda (n):={\begin{cases}\log(p)&\exists p{\text{ prime}},k\in \mathbb {N} :n=p^{k}\\0&{\text{otherwise}}\end{cases}}}
The monomials :
I
k
(
n
)
:=
n
k
{\displaystyle I_{k}(n):=n^{k}}
The number of distinct prime divisors :
ω
(
n
)
:=
r
,
n
=
p
1
k
1
⋯
p
r
k
r
{\displaystyle \omega (n):=r,n=p_{1}^{k_{1}}\cdots p_{r}^{k_{r}}}
,
k
1
,
…
,
k
r
∈
N
{\displaystyle k_{1},\ldots ,k_{r}\in \mathbb {N} }
The sum of prime factors with multiplicity :
Ω
(
n
)
:=
k
1
+
⋯
+
k
r
,
n
=
p
1
k
1
⋯
p
r
k
r
{\displaystyle \Omega (n):=k_{1}+\cdots +k_{r},n=p_{1}^{k_{1}}\cdots p_{r}^{k_{r}}}
,
k
1
,
…
,
k
r
∈
N
{\displaystyle k_{1},\ldots ,k_{r}\in \mathbb {N} }
The Liouville function :
λ
(
n
)
:=
(
−
1
)
Ω
(
n
)
{\displaystyle \lambda (n):=(-1)^{\Omega (n)}}
Exercises Edit
Exercise 2.1.1 : Compute
φ
(
20
)
{\displaystyle \varphi (20)}
,
φ
(
17
)
{\displaystyle \varphi (17)}
and
φ
(
22
)
{\displaystyle \varphi (22)}
.
Exercise 2.1.2 : Compute
μ
(
4278
)
{\displaystyle \mu (4278)}
. Hint:
4278
=
2
⋅
3
⋅
23
⋅
31
{\displaystyle 4278=2\cdot 3\cdot 23\cdot 31}
.
Exercise 2.1.3 : Compute
Λ
(
49
)
{\displaystyle \Lambda (49)}
up to three decimal places. Hint: Use a Taylor expansion.
Exercise 2.1.4 : Prove that for each
n
∈
N
{\displaystyle n\in \mathbb {N} }
and
k
1
,
k
2
,
k
3
,
k
4
∈
N
{\displaystyle k_{1},k_{2},k_{3},k_{4}\in \mathbb {N} }
μ
(
n
+
4
k
1
)
μ
(
n
+
4
k
2
+
1
)
μ
(
n
+
4
k
3
+
2
)
μ
(
n
+
4
k
4
+
3
)
=
0
{\displaystyle \mu (n+4k_{1})\mu (n+4k_{2}+1)\mu (n+4k_{3}+2)\mu (n+4k_{4}+3)=0}
.The convolution and the ring of arithmetic functions Edit
In the following theorem, we show that the arithmetical functions form an Abelian monoid , where the monoid operation is given by the convolution. Further, since the sum of two arithmetic functions is again an arithmetic function, the arithmetic functions form a commutative ring. In fact, as we shall also see, they form an integral domain.
Proof :
1.:
f
∗
g
(
n
)
=
∑
d
|
n
f
(
d
)
g
(
n
d
)
=
∑
d
|
n
f
(
ψ
(
d
)
)
g
(
ψ
(
n
d
)
)
=
∑
d
|
n
f
(
n
d
)
g
(
d
)
=
g
∗
f
(
n
)
{\displaystyle {\begin{aligned}f*g(n)&=\sum _{d|n}f(d)g\left({\frac {n}{d}}\right)=\sum _{d|n}f(\psi (d))g\left(\psi \left({\frac {n}{d}}\right)\right)\\&=\sum _{d|n}f\left({\frac {n}{d}}\right)g(d)=g*f(n)\end{aligned}}}
,where
ψ
(
d
)
:=
n
d
{\displaystyle \psi (d):={\frac {n}{d}}}
is a bijection from the set of divisors of
n
{\displaystyle n}
to itself.
2.:
f
∗
(
g
∗
h
)
=
f
∗
(
h
∗
g
)
=
∑
d
1
|
n
f
(
d
1
)
(
∑
d
2
|
n
d
1
g
(
n
d
1
d
2
)
h
(
d
2
)
)
=
∑
d
1
|
n
∑
d
2
|
n
d
1
f
(
d
1
)
g
(
n
d
1
d
2
)
h
(
d
2
)
=
∑
d
2
|
n
∑
d
1
|
n
d
2
f
(
d
1
)
g
(
n
d
1
d
2
)
h
(
d
2
)
{\displaystyle {\begin{aligned}f*(g*h)=f*(h*g)&=\sum _{d_{1}|n}f(d_{1})\left(\sum _{d_{2}|{\frac {n}{d_{1}}}}g\left({\frac {n}{d_{1}d_{2}}}\right)h(d_{2})\right)\\&=\sum _{d_{1}|n}\sum _{d_{2}|{\frac {n}{d_{1}}}}f(d_{1})g\left({\frac {n}{d_{1}d_{2}}}\right)h(d_{2})\\&=\sum _{d_{2}|n}\sum _{d_{1}|{\frac {n}{d_{2}}}}f(d_{1})g\left({\frac {n}{d_{1}d_{2}}}\right)h(d_{2})\end{aligned}}}
,where the last equality follows from the identity function
I
d
:
{
(
d
1
,
d
2
)
:
d
2
|
n
,
d
1
|
n
d
2
}
→
{
(
d
1
,
d
2
)
:
d
1
|
n
,
d
2
|
n
d
1
}
{\displaystyle Id:\left\{(d_{1},d_{2}):d_{2}|n,d_{1}{\big |}{\frac {n}{d_{2}}}\right\}\to \left\{(d_{1},d_{2}):d_{1}|n,d_{2}{\big |}{\frac {n}{d_{1}}}\right\}}
being a bijection. But
∑
d
2
|
n
∑
d
1
|
n
d
2
f
(
d
1
)
g
(
n
d
1
d
2
)
h
(
d
2
)
=
(
f
∗
g
)
∗
h
{\displaystyle \sum _{d_{2}|n}\sum _{d_{1}|{\frac {n}{d_{2}}}}f(d_{1})g\left({\frac {n}{d_{1}d_{2}}}\right)h(d_{2})=(f*g)*h}
and hence associativity.
3.:
δ
∗
f
(
n
)
=
∑
d
|
n
δ
(
d
)
f
(
n
d
)
=
f
(
n
)
{\displaystyle \delta *f(n)=\sum _{d|n}\delta (d)f\left({\frac {n}{d}}\right)=f(n)}
◻
{\displaystyle \Box }
Theorem 2.5 :
The ring of arithmetic functions is an integral domain.
Proof :
Let
f
,
g
≠
0
{\displaystyle f,g\neq 0}
be arithmetic functions, and let
n
,
k
∈
N
{\displaystyle n,k\in \mathbb {N} }
be minimal such that
f
(
n
)
≠
0
{\displaystyle f(n)\neq 0}
,
g
(
k
)
≠
0
{\displaystyle g(k)\neq 0}
. Then
f
∗
g
(
n
k
)
=
∑
d
|
n
k
f
(
d
)
g
(
n
k
d
)
=
f
(
n
)
g
(
k
)
{\displaystyle f*g(nk)=\sum _{d|nk}f(d)g\left({\frac {nk}{d}}\right)=f(n)g(k)}
.
◻
{\displaystyle \Box }
We shall now determine the units of the ring of arithmetic functions.
Theorem 2.6 :
Let
f
{\displaystyle f}
be an arithmetic function. Then
f
{\displaystyle f}
is invertible (with respect to convolution) if and only if
f
(
1
)
≠
0
{\displaystyle f(1)\neq 0}
.
Proof :
Assume first
f
(
1
)
=
0
{\displaystyle f(1)=0}
. Then for any arithmetic function
g
{\displaystyle g}
,
f
∗
g
(
1
)
=
0
≠
1
=
δ
(
1
)
{\displaystyle f*g(1)=0\neq 1=\delta (1)}
.
Assume now
f
(
1
)
≠
0
{\displaystyle f(1)\neq 0}
. Then
g
{\displaystyle g}
, given by the recursive formula
g
(
1
)
=
f
(
1
)
−
1
{\displaystyle g(1)=f(1)^{-1}}
,
g
(
n
)
=
−
f
(
1
)
−
1
∑
d
|
n
d
<
n
g
(
d
)
f
(
n
d
)
=
g
(
n
)
−
f
(
1
)
−
1
g
∗
f
(
n
)
{\displaystyle g(n)=-f(1)^{-1}\sum _{d|n \atop d<n}g(d)f\left({\frac {n}{d}}\right)=g(n)-f(1)^{-1}g*f(n)}
,
n
>
1
{\displaystyle n>1}
is an inverse (and thus the inverse) of
f
{\displaystyle f}
, since
f
∗
g
(
1
)
=
f
(
1
)
g
(
1
)
=
1
{\displaystyle f*g(1)=f(1)g(1)=1}
and for
n
>
1
{\displaystyle n>1}
inductively
f
∗
g
(
n
)
=
f
∗
g
(
n
)
−
f
(
1
)
−
1
f
∗
g
∗
f
(
n
)
=
f
∗
g
(
n
)
−
f
−
1
∑
d
|
n
d
<
n
f
∗
g
(
d
)
f
(
n
d
)
=
0
{\displaystyle {\begin{aligned}f*g(n)&=f*g(n)-f(1)^{-1}f*g*f(n)\\&=f*g(n)-f^{-1}\sum _{d|n \atop d<n}f*g(d)f\left({\frac {n}{d}}\right)=0\end{aligned}}}
◻
{\displaystyle \Box }
Exercises Edit
Exercise 2.2.1 :
Exercise 2.2.2 :Multiplicative functions Edit
Definition 2.7 :
An arithmetical function
f
:
N
→
C
{\displaystyle f:\mathbb {N} \to \mathbb {C} }
is called multiplicative iff it satisfies
∀
n
,
k
∈
N
:
gcd
(
n
,
k
)
=
1
⇒
f
(
n
k
)
=
f
(
n
)
f
(
k
)
{\displaystyle \forall n,k\in \mathbb {N} :\gcd(n,k)=1\Rightarrow f(nk)=f(n)f(k)}
, and
f
(
1
)
=
1
{\displaystyle f(1)=1}
.
Theorem 2.8 :
Let
f
,
g
{\displaystyle f,g}
be multiplicative arithmetical functions. Then
f
∗
g
{\displaystyle f*g}
is multiplicative.
Proof :
Let
gcd
(
k
,
n
)
=
1
{\displaystyle \gcd(k,n)=1}
. Then
f
∗
g
(
k
n
)
=
∑
d
|
(
k
n
)
f
(
d
)
g
(
k
n
d
)
=
∑
d
1
|
k
d
2
|
n
f
(
d
1
d
2
)
g
(
k
n
d
1
d
2
)
{\displaystyle f*g(kn)=\sum _{d|(kn)}f(d)g\left({\frac {kn}{d}}\right)=\sum _{d_{1}|k \atop d_{2}|n}f(d_{1}d_{2})g\left({\frac {kn}{d_{1}d_{2}}}\right)}
,since the function
θ
(
d
)
:=
(
gcd
(
d
,
n
)
,
gcd
(
d
,
k
)
)
{\displaystyle \theta (d):=(\gcd(d,n),\gcd(d,k))}
is a bijection from the divisors of
n
k
{\displaystyle nk}
to the Cartesian product of the divisors of
n
{\displaystyle n}
and the divisors of
k
{\displaystyle k}
; this is because multiplication is the inverse:
gcd
(
d
,
n
)
gcd
(
d
,
k
)
=
d
{\displaystyle \gcd(d,n)\gcd(d,k)=d}
,
(
gcd
(
d
1
d
2
,
n
)
,
gcd
(
d
1
d
2
,
k
)
)
=
(
d
1
,
d
2
)
{\displaystyle (\gcd(d_{1}d_{2},n),\gcd(d_{1}d_{2},k))=(d_{1},d_{2})}
.To rigorously prove this actually is an exercise in itself. But due to the multiplicativity of
f
{\displaystyle f}
and
g
{\displaystyle g}
,
∑
d
1
|
k
d
2
|
n
f
(
d
1
d
2
)
g
(
k
n
d
1
d
2
)
=
∑
d
1
|
k
d
2
|
n
f
(
d
1
)
g
(
k
d
1
)
f
(
d
2
)
g
(
n
d
2
)
=
f
∗
g
(
k
)
⋅
f
∗
g
(
n
)
{\displaystyle \sum _{d_{1}|k \atop d_{2}|n}f(d_{1}d_{2})g\left({\frac {kn}{d_{1}d_{2}}}\right)=\sum _{d_{1}|k \atop d_{2}|n}f(d_{1})g\left({\frac {k}{d_{1}}}\right)f(d_{2})g\left({\frac {n}{d_{2}}}\right)=f*g(k)\cdot f*g(n)}
.Furthermore,
f
∗
g
(
1
)
=
f
(
1
)
g
(
1
)
=
1
{\displaystyle f*g(1)=f(1)g(1)=1}
.
◻
{\displaystyle \Box }
Since
δ
{\displaystyle \delta }
is multiplicative, we conclude that the multiplicative functions form an Abelian submonoid of the arithmetic functions with convolution. Unfortunately, we do not have a subring since the sum of two multiplicative functions is never multiplicative (look at
n
=
1
{\displaystyle n=1}
).
Theorem 2.9 :
Let
f
{\displaystyle f}
be a multiplicative function such that
∑
j
=
1
∞
f
(
j
)
{\displaystyle \sum _{j=1}^{\infty }f(j)}
converges absolutely. Then
∑
j
=
1
∞
f
(
j
)
=
∏
p
prime
(
∑
k
=
0
∞
f
(
p
k
)
)
{\displaystyle \sum _{j=1}^{\infty }f(j)=\prod _{p{\text{ prime}}}\left(\sum _{k=0}^{\infty }f(p^{k})\right)}
.
Proof : Let
p
1
,
p
2
,
p
3
,
…
=
2
,
3
,
5
,
…
{\displaystyle p_{1},p_{2},p_{3},\ldots =2,3,5,\ldots }
be the ordered sequence of all prime numbers. For all
m
1
,
…
,
m
r
,
r
∈
N
{\displaystyle m_{1},\ldots ,m_{r},r\in \mathbb {N} }
we have
S
f
,
r
,
m
1
,
…
,
m
r
:=
∑
d
|
(
p
1
m
1
⋯
p
r
m
r
)
f
(
d
)
=
∑
0
≤
k
1
≤
m
1
⋯
∑
0
≤
k
r
≤
m
r
f
(
p
1
k
1
)
⋯
f
(
p
r
k
r
)
=
∏
l
=
1
r
(
∑
k
=
0
m
l
f
(
p
l
k
)
)
{\displaystyle S_{f,r,m_{1},\ldots ,m_{r}}:=\sum _{d|\left(p_{1}^{m_{1}}\cdots p_{r}^{m_{r}}\right)}f(d)=\sum _{0\leq k_{1}\leq m_{1}}\cdots \sum _{0\leq k_{r}\leq m_{r}}f\left(p_{1}^{k_{1}}\right)\cdots f\left(p_{r}^{k_{r}}\right)=\prod _{l=1}^{r}\left(\sum _{k=0}^{m_{l}}f(p_{l}^{k})\right)}
due to the multiplicativity of
f
{\displaystyle f}
. For each
r
{\displaystyle r}
, we successively take
m
1
→
∞
{\displaystyle m_{1}\to \infty }
, ...,
m
r
→
∞
{\displaystyle m_{r}\to \infty }
and then
r
→
∞
{\displaystyle r\to \infty }
. It follows from the definitions and the rule
x
n
→
x
⇒
y
x
n
→
y
x
{\displaystyle x_{n}\to x\Rightarrow yx_{n}\to yx}
that the right hand side converges to
∏
l
=
1
∞
(
∑
k
=
0
∞
f
(
p
l
k
)
)
{\displaystyle \prod _{l=1}^{\infty }\left(\sum _{k=0}^{\infty }f(p_{l}^{k})\right)}
.We claim that
lim
r
→
∞
lim
m
1
→
∞
⋯
lim
m
r
→
∞
S
f
,
r
,
m
1
,
…
,
m
r
=
∑
j
=
1
∞
f
(
j
)
{\displaystyle \lim _{r\to \infty }\lim _{m_{1}\to \infty }\cdots \lim _{m_{r}\to \infty }S_{f,r,m_{1},\ldots ,m_{r}}=\sum _{j=1}^{\infty }f(j)}
.Indeed, choose
N
∈
N
{\displaystyle N\in \mathbb {N} }
such that
∑
j
=
N
+
1
∞
|
f
(
j
)
|
<
ϵ
{\displaystyle \sum _{j=N+1}^{\infty }|f(j)|<\epsilon }
.Then by the fundamental theorem of arithmetic, there exists an
R
∈
N
{\displaystyle R\in \mathbb {N} }
and
M
1
,
…
,
M
R
∈
N
{\displaystyle M_{1},\ldots ,M_{R}\in \mathbb {N} }
such that
∀
j
∈
{
1
,
…
,
N
}
:
j
|
p
1
M
1
⋯
p
R
M
R
{\displaystyle \forall j\in \{1,\ldots ,N\}:j|p_{1}^{M_{1}}\cdots p_{R}^{M_{R}}}
.Then we have by the triangle inequality for
T
>
R
{\displaystyle T>R}
,
L
1
>
M
1
,
…
,
L
R
>
M
R
{\displaystyle L_{1}>M_{1},\ldots ,L_{R}>M_{R}}
and
L
R
+
1
,
…
,
L
T
{\displaystyle L_{R+1},\ldots ,L_{T}}
arbitrary that
|
S
f
,
T
,
L
1
,
…
,
L
T
−
∑
j
=
1
∞
f
(
j
)
|
≤
|
∑
d
|
(
p
1
L
1
⋯
p
T
L
T
)
d
≤
N
f
(
d
)
−
∑
j
=
1
N
f
(
j
)
|
+
|
∑
d
|
(
p
1
L
1
⋯
p
T
L
T
)
d
>
N
f
(
d
)
−
∑
j
=
N
+
1
∞
f
(
j
)
|
<
0
+
ϵ
.
{\displaystyle {\begin{aligned}\left|S_{f,T,L_{1},\ldots ,L_{T}}-\sum _{j=1}^{\infty }f(j)\right|&\leq \left|\sum _{d|\left(p_{1}^{L_{1}}\cdots p_{T}^{L_{T}}\right) \atop d\leq N}f(d)-\sum _{j=1}^{N}f(j)\right|+\left|\sum _{d|\left(p_{1}^{L_{1}}\cdots p_{T}^{L_{T}}\right) \atop d>N}f(d)-\sum _{j=N+1}^{\infty }f(j)\right|\\&<0+\epsilon .\end{aligned}}}
From this easily follows the claim.
It is left to show that the product on the left is independent of the order of multiplication. But this is clear since if the sequence
(
p
n
)
n
∈
N
{\displaystyle (p_{n})_{n\in \mathbb {N} }}
is enumerated differently, the argument works in just the same way and the left hand side remains the same.
◻
{\displaystyle \Box }
Definition 2.10 :
An arithmetical function
f
:
N
→
C
{\displaystyle f:\mathbb {N} \to \mathbb {C} }
is called strongly multiplicative iff it satisfies
∀
n
,
k
∈
N
:
f
(
n
k
)
=
f
(
n
)
f
(
k
)
{\displaystyle \forall n,k\in \mathbb {N} :f(nk)=f(n)f(k)}
, and
f
(
1
)
=
1
{\displaystyle f(1)=1}
.
Equivalently, a strongly multiplicative function is a monoid homomorphism
N
→
C
×
{\displaystyle \mathbb {N} \to \mathbb {C} ^{\times }}
.
Theorem 2.11 :
Let
f
{\displaystyle f}
be a strongly multiplicative function such that
∑
j
=
1
∞
f
(
j
)
{\displaystyle \sum _{j=1}^{\infty }f(j)}
converges absolutely. Then
∑
j
=
1
∞
f
(
j
)
=
∏
p
prime
1
1
−
f
(
p
)
{\displaystyle \sum _{j=1}^{\infty }f(j)=\prod _{p{\text{ prime}}}{\frac {1}{1-f(p)}}}
.
Proof :
Due to theorem 2.9, we have
∑
n
=
1
∞
f
(
n
)
=
∏
p
prime
(
∑
k
=
0
∞
f
(
p
k
)
)
{\displaystyle \sum _{n=1}^{\infty }f(n)=\prod _{p{\text{ prime}}}\left(\sum _{k=0}^{\infty }f(p^{k})\right)}
.Due to strong multiplicativity and the geometric series, the latter expression equals
∏
p
prime
1
1
−
f
(
p
)
{\displaystyle \prod _{p{\text{ prime}}}{\frac {1}{1-f(p)}}}
.
◻
{\displaystyle \Box }
Exercises Edit
Exercise 2.3.1 : Let
h
{\displaystyle h}
be an arithmetic function such that for all
m
,
n
∈
N
{\displaystyle m,n\in \mathbb {N} }
h
(
n
m
)
=
h
(
n
)
+
h
(
m
)
{\displaystyle h(nm)=h(n)+h(m)}
, and let
s
∈
C
∖
{
0
}
{\displaystyle s\in \mathbb {C} \setminus \{0\}}
. Prove that the function
f
(
n
)
:=
exp
(
log
(
s
)
h
(
n
)
)
{\displaystyle f(n):=\exp(\log(s)h(n))}
is multiplicative.Bell series Edit
Examples 2.13 :
We shall here compute the Bell series for some important arithmetic functions.
We note that in general for a completely multiplicative function
f
{\displaystyle f}
, we have
f
p
(
x
)
=
∑
j
=
0
∞
f
(
p
)
j
x
j
=
1
1
−
f
(
p
)
x
{\displaystyle f_{p}(x)=\sum _{j=0}^{\infty }f(p)^{j}x^{j}={\frac {1}{1-f(p)x}}}
.In particular, in this case the Bell series defines a function.
1. The Kronecker delta:
δ
p
(
x
)
=
∑
j
=
0
∞
δ
(
p
j
)
x
j
=
1
{\displaystyle \delta _{p}(x)=\sum _{j=0}^{\infty }\delta (p^{j})x^{j}=1}
2. Euler' totient function (we use lemma 9.?):
φ
p
(
x
)
=
∑
j
=
0
∞
φ
(
p
j
)
x
j
=
∑
j
=
0
∞
φ
(
p
j
)
x
j
=
1
+
∑
j
=
1
∞
(
p
j
−
p
j
−
1
)
x
j
=
1
+
x
1
−
p
x
{\displaystyle {\begin{aligned}\varphi _{p}(x)&=\sum _{j=0}^{\infty }\varphi (p^{j})x^{j}\\&=\sum _{j=0}^{\infty }\varphi (p^{j})x^{j}\\&=1+\sum _{j=1}^{\infty }(p^{j}-p^{j-1})x^{j}\\&={\frac {1+x}{1-px}}\end{aligned}}}
3. The Möbius
m
u
{\displaystyle mu}
function:
μ
p
(
x
)
=
∑
j
=
0
∞
μ
(
p
j
)
x
j
=
1
−
x
{\displaystyle \mu _{p}(x)=\sum _{j=0}^{\infty }\mu (p^{j})x^{j}=1-x}
4. The von Mangoldt function:
Λ
p
(
x
)
=
∑
j
=
0
∞
Λ
(
p
j
)
x
j
=
∑
j
=
0
∞
log
(
p
)
x
j
{\displaystyle \Lambda _{p}(x)=\sum _{j=0}^{\infty }\Lambda (p^{j})x^{j}=\sum _{j=0}^{\infty }\log(p)x^{j}}
5. The monomials:
(
I
k
)
p
=
∑
j
=
0
∞
p
k
j
x
j
=
1
1
−
p
k
x
{\displaystyle (I_{k})_{p}=\sum _{j=0}^{\infty }p^{kj}xj={\frac {1}{1-p^{k}x}}}
6. The number of distinct prime divisors:
ω
p
(
x
)
=
∑
j
=
1
∞
x
j
=
1
1
−
x
−
1
{\displaystyle \omega _{p}(x)=\sum _{j=1}^{\infty }x^{j}={\frac {1}{1-x}}-1}
7. The number of prime divisors including multiplicity:
Ω
p
(
x
)
=
∑
j
=
1
∞
j
x
j
=
1
x
∑
j
=
0
∞
(
x
j
)
′
=
1
x
(
∑
j
=
0
∞
x
j
)
′
=
1
x
(
1
1
−
x
)
′
=
−
1
x
1
(
1
−
x
)
2
{\displaystyle {\begin{aligned}\Omega _{p}(x)&=\sum _{j=1}^{\infty }jx^{j}\\&={\frac {1}{x}}\sum _{j=0}^{\infty }(x^{j})'\\&={\frac {1}{x}}\left(\sum _{j=0}^{\infty }x^{j}\right)'\\&={\frac {1}{x}}\left({\frac {1}{1-x}}\right)'\\&=-{\frac {1}{x}}{\frac {1}{(1-x)^{2}}}\end{aligned}}}
8. The Liouville function:
λ
p
(
x
)
=
∑
j
=
0
∞
(
−
1
)
j
x
j
=
1
1
+
x
{\displaystyle \lambda _{p}(x)=\sum _{j=0}^{\infty }(-1)^{j}x^{j}={\frac {1}{1+x}}}
Theorem 2.14 (compatibility of Bell series and convolution) :
Let
f
,
g
{\displaystyle f,g}
arithmetic functions, and
p
{\displaystyle p}
be a prime. Then
f
p
⋅
g
p
=
(
f
∗
g
)
p
{\displaystyle f_{p}\cdot g_{p}=(f*g)_{p}}
.
Proof :
f
p
(
x
)
g
p
(
x
)
=
(
∑
j
=
0
∞
f
(
p
j
)
x
j
)
(
∑
j
=
0
∞
g
(
p
j
)
x
j
)
=
∑
k
=
0
∞
x
k
(
∑
j
=
0
k
f
(
p
j
)
g
(
p
k
−
j
)
)
=
∑
k
=
0
∞
x
k
(
f
∗
g
)
(
p
k
)
{\displaystyle {\begin{aligned}f_{p}(x)g_{p}(x)&=\left(\sum _{j=0}^{\infty }f(p^{j})x^{j}\right)\left(\sum _{j=0}^{\infty }g(p^{j})x^{j}\right)\\&=\sum _{k=0}^{\infty }x^{k}\left(\sum _{j=0}^{k}f(p^{j})g(p^{k-j})\right)\\&=\sum _{k=0}^{\infty }x^{k}(f*g)(p^{k})\end{aligned}}}
◻
{\displaystyle \Box }
In case of multiplicativity, we have the following theorem:
Theorem 2.15 (Uniqueness theorem) :
Let
f
,
g
{\displaystyle f,g}
be multiplicative functions. Then
f
=
g
⇔
f
p
=
g
p
{\displaystyle f=g\Leftrightarrow f_{p}=g_{p}}
.
Proof :
⇒
{\displaystyle \Rightarrow }
is pretty obvious;
⇐
{\displaystyle \Leftarrow }
:
f
p
(
x
)
=
g
p
(
x
)
{\displaystyle f_{p}(x)=g_{p}(x)}
as formal power series is equivalent to saying
∀
k
∈
N
:
f
(
p
k
)
=
g
(
p
k
)
{\displaystyle \forall k\in \mathbb {N} :f(p^{k})=g(p^{k})}
. If now
n
=
p
1
k
1
⋯
p
r
k
r
{\displaystyle n=p_{1}^{k_{1}}\cdots p_{r}^{k_{r}}}
, then
f
(
n
)
=
f
(
p
1
k
1
⋯
p
r
k
r
)
=
f
(
p
1
k
1
)
⋯
f
(
p
r
k
r
)
=
g
(
p
1
k
1
)
⋯
g
(
p
r
k
r
)
=
g
(
n
)
{\displaystyle f(n)=f(p_{1}^{k_{1}}\cdots p_{r}^{k_{r}})=f(p_{1}^{k_{1}})\cdots f(p_{r}^{k_{r}})=g(p_{1}^{k_{1}})\cdots g(p_{r}^{k_{r}})=g(n)}
due to the multiplicativity of
f
{\displaystyle f}
and
g
{\displaystyle g}
.
◻
{\displaystyle \Box }
In chapter 9, we will use Bell series to obtain equations for number-theoretic functions.
Derivatives Edit
Definition 2.16 :
Let
f
{\displaystyle f}
be an arithmetic function. Then the derivative of
f
{\displaystyle f}
is defined to be the function
f
′
(
n
)
:=
f
(
n
)
log
(
n
)
{\displaystyle f'(n):=f(n)\log(n)}
.
Proof :
1. is easily checked.
2.:
(
f
∗
g
)
′
(
n
)
=
(
f
∗
g
)
(
n
)
log
(
n
)
=
∑
d
|
n
f
(
d
)
g
(
n
/
d
)
(
log
(
d
)
+
log
(
n
/
d
)
)
=
∑
d
|
n
f
(
d
)
g
(
n
/
d
)
log
(
d
)
+
∑
d
|
n
f
(
d
)
g
(
n
/
d
)
log
(
n
/
d
)
{\displaystyle {\begin{aligned}(f*g)'(n)&=(f*g)(n)\log(n)\\&=\sum _{d|n}f(d)g(n/d)(\log(d)+\log(n/d))\\&=\sum _{d|n}f(d)g(n/d)\log(d)+\sum _{d|n}f(d)g(n/d)\log(n/d)\end{aligned}}}
3.
We have
δ
′
=
0
{\displaystyle \delta '=0}
and
(
f
∗
f
−
1
)
=
δ
{\displaystyle (f*f^{-1})=\delta }
. Hence, by 2.
0
=
(
f
∗
f
−
1
)
′
=
f
′
∗
f
−
1
+
f
∗
(
f
−
1
)
′
{\displaystyle 0=(f*f^{-1})'=f'*f^{-1}+f*(f^{-1})'}
.Convolving with
f
−
1
{\displaystyle f^{-1}}
and using
f
−
1
∗
f
−
1
=
(
f
∗
f
)
−
1
{\displaystyle f^{-1}*f^{-1}=(f*f)^{-1}}
yields the desired formula.
◻
{\displaystyle \Box }
Note that a chain rule wouldn't make much sense, since
f
{\displaystyle f}
arithmetic may map anywhere but to
N
{\displaystyle \mathbb {N} }
and thus
g
∘
f
{\displaystyle g\circ f}
doesn't make a lot of sense in general.
Further reading Edit
Characters and Dirichlet characters
Definitions, basic properties Edit
Lemma 4.2 :
Let
G
{\displaystyle G}
be a finite group and let
f
:
G
→
C
{\displaystyle f:G\to \mathbb {C} }
be a character. Then
∀
σ
∈
G
:
|
f
(
σ
)
|
=
1
{\displaystyle \forall \sigma \in G:|f(\sigma )|=1}
.In particular,
∀
σ
∈
G
:
f
(
σ
)
≠
0
{\displaystyle \forall \sigma \in G:f(\sigma )\neq 0}
.
Proof :
Since
G
{\displaystyle G}
is finite, each
σ
∈
G
{\displaystyle \sigma \in G}
has finite order
n
:=
o
r
d
(
σ
)
{\displaystyle n:=\mathrm {ord} (\sigma )}
. Furthermore, let
ρ
∈
G
{\displaystyle \rho \in G}
such that
f
(
ρ
)
≠
0
{\displaystyle f(\rho )\neq 0}
; then
f
(
ρ
)
=
f
(
σ
)
f
(
σ
−
1
ρ
)
{\displaystyle f(\rho )=f(\sigma )f(\sigma ^{-1}\rho )}
and thus
f
(
σ
)
≠
0
{\displaystyle f(\sigma )\neq 0}
. Hence, we are allowed to cancel and
|
f
(
σ
)
|
=
|
f
(
σ
n
+
1
)
|
=
|
f
(
σ
)
|
n
+
1
⇒
|
f
(
σ
)
|
=
1
{\displaystyle |f(\sigma )|=|f(\sigma ^{n+1})|=|f(\sigma )|^{n+1}\Rightarrow |f(\sigma )|=1}
.
◻
{\displaystyle \Box }
Lemma 4.3 :
Let
G
{\displaystyle G}
be a finite group and let
f
,
g
:
G
→
C
{\displaystyle f,g:G\to \mathbb {C} }
be characters. Then the function
h
:
G
→
C
,
h
(
τ
)
:=
f
(
τ
)
⋅
g
(
τ
)
{\displaystyle h:G\to \mathbb {C} ,h(\tau ):=f(\tau )\cdot g(\tau )}
is also a character.
Proof :
h
(
σ
τ
)
=
f
(
σ
τ
)
g
(
σ
τ
)
=
f
(
σ
)
g
(
σ
)
f
(
τ
)
g
(
τ
)
=
h
(
σ
)
h
(
τ
)
≠
0
{\displaystyle h(\sigma \tau )=f(\sigma \tau )g(\sigma \tau )=f(\sigma )g(\sigma )f(\tau )g(\tau )=h(\sigma )h(\tau )\neq 0}
,since
C
{\displaystyle \mathbb {C} }
is a field and thus free of zero divisors.
◻
{\displaystyle \Box }
Lemma 4.4 :
Let
G
{\displaystyle G}
be a finite group and let
f
:
G
→
C
{\displaystyle f:G\to \mathbb {C} }
be a character. Then the function
g
:
G
→
C
,
g
(
τ
)
:=
1
f
(
τ
)
{\displaystyle g:G\to \mathbb {C} ,g(\tau ):={\frac {1}{f(\tau )}}}
is also a character.
Proof : Trivial, since
∀
τ
∈
G
:
f
(
τ
)
≠
0
{\displaystyle \forall \tau \in G:f(\tau )\neq 0}
as shown by the previous lemma.
◻
{\displaystyle \Box }
The previous three lemmas (or only the first, together with a few lemmas from elementary group theory) justify the following definition.
Definition 4.5
Let
G
{\displaystyle G}
be a finite group. Then the group
{
f
:
G
→
C
|
f
is a character
}
=
H
o
m
(
G
,
C
×
)
{\displaystyle \{f:G\to \mathbb {C} |f{\text{ is a character }}\}=\mathrm {Hom} (G,\mathbb {C} ^{\times })}
is called the character group of
G
{\displaystyle G}
.
Required algebra Edit
We need the following result from group theory:
Proof :
Since
G
{\displaystyle G}
is the disjoint union of the cosets of
H
{\displaystyle H}
,
N
{\displaystyle N}
is the disjoint union
⋃
j
=
0
n
−
1
τ
j
H
{\displaystyle \bigcup _{j=0}^{n-1}\tau ^{j}H}
, as
ρ
H
=
H
⇔
ρ
∈
H
{\displaystyle \rho H=H\Leftrightarrow \rho \in H}
and
τ
l
H
=
τ
m
H
⇔
τ
l
−
m
∈
H
⇔
k
|
(
l
−
m
)
{\displaystyle \tau ^{l}H=\tau ^{m}H\Leftrightarrow \tau ^{l-m}\in H\Leftrightarrow k|(l-m)}
. Hence, the cardinality of
N
{\displaystyle N}
equals
k
⋅
n
{\displaystyle k\cdot n}
.
Furthermore, if
τ
l
σ
,
τ
m
ρ
∈
N
{\displaystyle \tau ^{l}\sigma ,\tau ^{m}\rho \in N}
, then
τ
l
σ
(
τ
m
ρ
)
−
1
=
τ
l
−
m
σ
ρ
−
1
∈
N
{\displaystyle \tau ^{l}\sigma (\tau ^{m}\rho )^{-1}=\tau ^{l-m}\sigma \rho ^{-1}\in N}
, and hence
N
{\displaystyle N}
is a subgroup.
◻
{\displaystyle \Box }
Theorems about characters Edit
Dirichlet characters Edit
Dirichlet series
For the remainder of this book, we shall use Riemann 's convention of denoting complex numbers :
s
=
σ
+
i
t
{\displaystyle s=\sigma +it}
Definition Edit
Definition 5.1 :
Let
f
{\displaystyle f}
be an arithmetic function. Then the Dirichlet series associated to
f
{\displaystyle f}
is the series
∑
n
=
1
∞
f
(
n
)
n
s
{\displaystyle \sum _{n=1}^{\infty }{\frac {f(n)}{n^{s}}}}
,where
s
{\displaystyle s}
ranges over the complex numbers.
Convergence considerations Edit
Proof :
Denote by
S
{\displaystyle S}
the set of all real numbers
σ
{\displaystyle \sigma }
such that
∑
n
=
1
∞
|
f
(
n
)
n
s
|
{\displaystyle \sum _{n=1}^{\infty }\left|{\frac {f(n)}{n^{s}}}\right|}
diverges. Due to the assumption, this set is neither empty nor equal to
C
{\displaystyle \mathbb {C} }
. Further, if
σ
0
+
i
t
0
∉
S
{\displaystyle \sigma _{0}+it_{0}\notin S}
, then for all
σ
>
σ
0
{\displaystyle \sigma >\sigma _{0}}
and all
t
{\displaystyle t}
σ
+
i
t
∉
S
{\displaystyle \sigma +it\notin S}
, since
|
f
(
n
)
n
s
0
|
=
|
f
(
n
)
|
n
σ
0
≥
|
f
(
n
)
|
n
σ
=
|
f
(
n
)
n
s
|
{\displaystyle \left|{\frac {f(n)}{n^{s_{0}}}}\right|={\frac {|f(n)|}{n^{\sigma _{0}}}}\geq {\frac {|f(n)|}{n^{\sigma }}}=\left|{\frac {f(n)}{n^{s}}}\right|}
and due to the comparison test. It follows that
S
{\displaystyle S}
has a supremum. Let
σ
a
{\displaystyle \sigma _{a}}
be that supremum. By definition, for
σ
>
σ
a
{\displaystyle \sigma >\sigma _{a}}
we have convergence, and if we had convergence for
σ
<
σ
a
{\displaystyle \sigma <\sigma _{a}}
we would have found a lower upper bound due to the above argument, contradicting the definition of
σ
a
{\displaystyle \sigma _{a}}
.
◻
{\displaystyle \Box }
Theorem 5.3 (abscissa of conditional convergence) :
Theorem 8.4 (Euler product) :
Let
f
{\displaystyle f}
be a strongly multiplicative function, and let
s
∈
C
{\displaystyle s\in \mathbb {C} }
such that the corresponding Dirichlet series converges absolutely. Then for that series we have the formula
∑
n
=
1
∞
f
(
n
)
n
s
=
∏
p
prime
1
1
−
f
(
p
)
p
s
{\displaystyle \sum _{n=1}^{\infty }{\frac {f(n)}{n^{s}}}=\prod _{p{\text{ prime}}}{\frac {1}{1-{\frac {f(p)}{p^{s}}}}}}
.
Proof :
This follows directly from theorem 2.11 and the fact that
f
{\displaystyle f}
strongly multiplicative
⇒
{\displaystyle \Rightarrow }
f
(
n
)
n
s
{\displaystyle {\frac {f(n)}{n^{s}}}}
strongly multiplicative.
◻
{\displaystyle \Box }
Formulas for number-theoretic functions
Formulas for the Möbius μ function Edit
Lemma 2.9 :
∑
d
|
n
μ
(
d
)
n
d
=
∏
j
=
1
r
(
p
j
k
j
−
p
j
k
j
−
1
)
{\displaystyle \sum _{d|n}\mu (d){\frac {n}{d}}=\prod _{j=1}^{r}\left(p_{j}^{k_{j}}-p_{j}^{k_{j}-1}\right)}
.Proof :
For
κ
∈
Z
r
{\displaystyle \kappa \in \mathbb {Z} ^{r}}
a multiindex,
α
∈
{
0
,
1
}
r
{\displaystyle \alpha \in \{0,1\}^{r}}
and
Q
∈
C
r
{\displaystyle Q\in \mathbb {C} ^{r}}
a vector define
Q
α
:=
∏
j
=
1
r
q
j
α
j
{\displaystyle Q^{\alpha }:=\prod _{j=1}^{r}q_{j}^{\alpha _{j}}}
,
Q
κ
:=
(
q
1
k
1
,
…
,
q
r
k
r
)
{\displaystyle Q^{\kappa }:=(q_{1}^{k_{1}},\ldots ,q_{r}^{k_{r}})}
.Let
n
=
(
P
κ
)
1
=
p
1
k
1
⋯
p
r
k
r
{\displaystyle n=(P^{\kappa })^{1}=p_{1}^{k_{1}}\cdots p_{r}^{k_{r}}}
. Then
∏
j
=
1
r
(
p
j
k
j
−
p
j
k
j
−
1
)
=
∑
α
∈
{
0
,
1
}
r
(
P
κ
)
α
(
P
κ
−
1
)
1
−
α
(
−
1
)
|
1
−
α
|
=
∑
d
|
n
μ
(
d
)
n
d
{\displaystyle {\begin{aligned}\prod _{j=1}^{r}\left(p_{j}^{k_{j}}-p_{j}^{k_{j}-1}\right)&=\sum _{\alpha \in \{0,1\}^{r}}(P^{\kappa })^{\alpha }(P^{\kappa -1})^{1-\alpha }(-1)^{|}{1-\alpha |}\\&=\sum _{d|n}\mu (d){\frac {n}{d}}\end{aligned}}}
.
◻
{\displaystyle \Box }
Lemma 2.10 :
φ
=
μ
∗
I
1
{\displaystyle \varphi =\mu *I_{1}}
.Proof 1 :
We prove the lemma from lemma 2.14.
We have by lemma 2.14
φ
(
n
)
=
∑
k
=
1
n
δ
(
gcd
(
k
,
n
)
)
=
∑
k
=
1
n
∑
d
|
gcd
(
k
,
n
)
μ
(
d
)
=
∑
d
|
n
∑
k
=
1
n
[
d
|
k
]
μ
(
d
)
=
∑
d
|
n
∑
j
=
1
n
/
d
μ
(
d
)
{\displaystyle {\begin{aligned}\varphi (n)&=\sum _{k=1}^{n}\delta (\gcd(k,n))\\&=\sum _{k=1}^{n}\sum _{d|\gcd(k,n)}\mu (d)\\&=\sum _{d|n}\sum _{k=1}^{n}[d|k]\mu (d)\\&=\sum _{d|n}\sum _{j=1}^{n/d}\mu (d)\end{aligned}}}
◻
{\displaystyle \Box }
Proof 2 :
We prove the lemma from the product formula for Euler's totient function and lemma 2.9. Indeed, for
n
=
p
1
k
1
⋯
p
r
k
r
{\displaystyle n=p_{1}^{k_{1}}\cdots p_{r}^{k_{r}}}
φ
(
n
)
=
∏
j
=
1
r
(
p
j
k
j
−
p
j
k
j
−
1
)
=
μ
∗
I
1
{\displaystyle \varphi (n)=\prod _{j=1}^{r}(p_{j}^{k_{j}}-p_{j}^{k_{j}-1})=\mu *I_{1}}
.
◻
{\displaystyle \Box }
Lemma 2.14 :
E
∗
μ
=
δ
{\displaystyle E*\mu =\delta }
.Proof 1 :
We use the Möbius inversion formula.
Indeed,
E
(
n
)
=
∑
d
|
n
δ
(
d
)
=
1
{\displaystyle E(n)=\sum _{d|n}\delta (d)=1}
, and hence
δ
=
μ
∗
E
{\displaystyle \delta =\mu *E}
.
◻
{\displaystyle \Box }
Proof 2 :
We use multiplicativity.
Indeed, for a prime
p
{\displaystyle p}
,
k
∈
N
{\displaystyle k\in \mathbb {N} }
we have
E
∗
μ
(
p
k
)
=
∑
j
=
0
k
μ
(
p
j
)
=
μ
(
1
)
+
μ
(
p
)
=
0
{\displaystyle E*\mu (p^{k})=\sum _{j=0}^{k}\mu (p^{j})=\mu (1)+\mu (p)=0}
,and thus due to the multiplicativity of
μ
{\displaystyle \mu }
and
E
{\displaystyle E}
E
∗
μ
(
n
)
=
0
{\displaystyle E*\mu (n)=0}
if
n
{\displaystyle n}
contains at least one prime factor. Since further
E
∗
μ
(
1
)
=
1
{\displaystyle E*\mu (1)=1}
the claim follows.
◻
{\displaystyle \Box }
Proof 3 :
We prove the lemma by direct computation. Indeed, if
1
≠
n
=
P
κ
{\displaystyle 1\neq n=P^{\kappa }}
, then
E
∗
μ
(
n
)
=
∑
d
|
n
μ
(
d
)
=
∑
α
∈
{
0
,
1
}
r
μ
(
P
α
)
=
∑
β
∈
{
0
,
1
}
r
−
1
(
μ
(
P
(
0
,
β
)
)
+
μ
(
P
(
1
,
β
)
)
)
=
0
{\displaystyle {\begin{aligned}E*\mu (n)&=\sum _{d|n}\mu (d)\\&=\sum _{\alpha \in \{0,1\}^{r}}\mu (P^{\alpha })\\&=\sum _{\beta \in \{0,1\}^{r-1}}\left(\mu (P^{(0,\beta )})+\mu (P^{(1,\beta )})\right)=0\end{aligned}}}
.
◻
{\displaystyle \Box }
Proof 4 :
We prove the lemma from the Binomial theorem and combinatorics.
Let
n
=
p
1
k
1
⋯
p
r
k
r
{\displaystyle n=p_{1}^{k_{1}}\cdots p_{r}^{k_{r}}}
. From combinatorics we note that for
m
≤
r
{\displaystyle m\leq r}
, there exist
(
m
r
)
{\displaystyle {\binom {m}{r}}}
distinct ways to pick a subset
I
⊆
{
1
,
…
,
r
}
{\displaystyle I\subseteq \{1,\ldots ,r\}}
such that
|
I
|
=
m
{\displaystyle |I|=m}
. Define
α
I
=
(
α
1
,
…
,
α
r
)
∈
{
0
,
1
}
r
{\displaystyle \alpha _{I}=(\alpha _{1},\ldots ,\alpha _{r})\in \{0,1\}^{r}}
where
α
j
=
1
⇔
j
∈
I
{\displaystyle \alpha _{j}=1\Leftrightarrow j\in I}
. Then, by the Binomial theorem
E
∗
μ
(
n
)
=
∑
d
|
n
μ
(
d
)
=
∑
I
⊆
{
1
,
…
,
r
}
μ
(
P
α
I
)
=
∑
I
⊆
{
1
,
…
,
r
}
(
−
1
)
|
I
|
=
∑
j
=
0
r
(
j
r
)
(
−
1
)
j
1
j
=
(
1
−
1
)
r
=
0
{\displaystyle {\begin{aligned}E*\mu (n)&=\sum _{d|n}\mu (d)\\&=\sum _{I\subseteq \{1,\ldots ,r\}}\mu (P^{\alpha _{I}})\\&=\sum _{I\subseteq \{1,\ldots ,r\}}(-1)^{|I|}\\&=\sum _{j=0}^{r}{\binom {j}{r}}(-1)^{j}1^{j}=(1-1)^{r}=0\end{aligned}}}
.
◻
{\displaystyle \Box }
Formulas for Euler's totient function Edit
Lemma 2.11 (Gauß 1801) :
∀
n
∈
N
:
n
=
∑
d
|
n
φ
(
d
)
{\displaystyle \forall n\in \mathbb {N} :n=\sum _{d|n}\varphi (d)}
.Proof 1 :
We use the Möbius inversion formula, proven below without using this lemma, and lemma 2.10.
We have
∑
d
|
n
φ
(
d
)
=
S
φ
(
n
)
{\displaystyle \sum _{d|n}\varphi (d)=S_{\varphi }(n)}
and hence
φ
=
μ
∗
S
φ
{\displaystyle \varphi =\mu *S_{\varphi }}
by the Möbius inversion formula. On the other hand,
φ
(
n
)
=
μ
∗
I
1
(
n
)
{\displaystyle \varphi (n)=\mu *I_{1}(n)}
by lemma 2.10.
Hence, we obtain
μ
∗
S
φ
=
μ
∗
I
1
{\displaystyle \mu *S_{\varphi }=\mu *I_{1}}
, and by cancellation of
μ
{\displaystyle \mu }
(the arithmetic functions form an integral domain) we get the lemma.
◻
{\displaystyle \Box }
Proof 2 :
We use the converse of the Möbius inversion formula, proven below without using this lemma, and lemma 2.10.
Since
φ
(
n
)
=
μ
∗
I
1
(
n
)
{\displaystyle \varphi (n)=\mu *I_{1}(n)}
by lemma 2.10, we obtain from the converse of the Möbius inversion formula that
I
1
(
n
)
=
S
φ
(
n
)
{\displaystyle I_{1}(n)=S_{\varphi }(n)}
.
◻
{\displaystyle \Box }
Proof 3 :
We prove the lemma by double counting.
We first note that there are
n
{\displaystyle n}
many fractions of the form
m
n
{\displaystyle {\frac {m}{n}}}
,
1
≤
m
≤
n
{\displaystyle 1\leq m\leq n}
.
We now prove that there are also
∑
d
|
n
φ
(
d
)
{\displaystyle \sum _{d|n}\varphi (d)}
many fractions of this form. Indeed, each fraction
m
n
{\displaystyle {\frac {m}{n}}}
,
1
≤
m
≤
n
{\displaystyle 1\leq m\leq n}
can be reduced to
b
d
{\displaystyle {\frac {b}{d}}}
, where
gcd
(
b
,
d
)
=
1
{\displaystyle \gcd(b,d)=1}
.
d
{\displaystyle d}
is a divisor of
n
{\displaystyle n}
, since it is obtained by dividing
n
{\displaystyle n}
. Furthermore, for each divisor
d
{\displaystyle d}
of
n
{\displaystyle n}
there exist precisely
φ
(
d
)
{\displaystyle \varphi (d)}
many such fractions by definition of
φ
{\displaystyle \varphi }
.
◻
{\displaystyle \Box }
Proof 4 :
We prove the lemma by the means of set theory.
Define
S
n
,
d
:=
{
1
≤
l
≤
n
|
gcd
(
d
,
l
)
=
1
}
{\displaystyle S_{n,d}:=\{1\leq l\leq n|\gcd(d,l)=1\}}
. Then
S
n
,
d
=
{
d
k
|
1
≤
k
≤
n
/
d
,
gcd
(
k
,
n
)
=
1
}
=
d
S
n
/
d
,
1
{\displaystyle S_{n,d}=\{dk|1\leq k\leq n/d,\gcd(k,n)=1\}=dS_{n/d,1}}
. Since
|
S
n
/
d
,
1
|
=
φ
(
n
/
d
)
{\displaystyle |S_{n/d,1}|=\varphi (n/d)}
and
{
1
,
…
,
n
}
{\displaystyle \{1,\ldots ,n\}}
is the disjoint union of the sets
S
n
,
d
,
d
|
n
{\displaystyle S_{n,d},d|n}
, we thus have
n
=
∑
d
|
n
|
S
n
,
d
|
=
∑
d
|
n
d
φ
(
n
d
)
{\displaystyle n=\sum _{d|n}|S_{n,d}|=\sum _{d|n}d\varphi \left({\frac {n}{d}}\right)}
.
◻
{\displaystyle \Box }
The next theorem comprises one of the most important examples for a multiplicative function.
Theorem 2.12 (Euler 1761) :
Euler's totient function is multiplicative.
Proof 1 :
We prove the theorem using double counting (due to Kronecker ).
By definition of
φ
{\displaystyle \varphi }
, there are
φ
(
m
)
φ
(
n
)
{\displaystyle \varphi (m)\varphi (n)}
sums of the form
k
m
+
l
n
,
1
≤
k
≤
m
,
1
≤
l
≤
n
{\displaystyle {\frac {k}{m}}+{\frac {l}{n}},1\leq k\leq m,1\leq l\leq n}
,where both summands are reduced. We claim that there is a bijection
{
k
m
+
l
n
|
1
≤
k
≤
m
,
1
≤
l
≤
n
,
k
m
,
l
n
reduced
}
→
{
r
m
n
|
1
≤
r
≤
r
m
n
reduced
}
{\displaystyle \left\{{\frac {k}{m}}+{\frac {l}{n}}{\big |}1\leq k\leq m,1\leq l\leq n,{\frac {k}{m}},{\frac {l}{n}}{\text{ reduced}}\right\}\to \left\{{\frac {r}{mn}}{\big |}1\leq r\leq {\frac {r}{mn}}{\text{ reduced}}\right\}}
.From this would follow
φ
(
m
)
φ
(
n
)
=
φ
(
m
n
)
{\displaystyle \varphi (m)\varphi (n)=\varphi (mn)}
.
We claim that such a bijection is given by
k
m
+
l
n
↦
n
k
+
m
l
mod
m
n
n
m
{\displaystyle {\frac {k}{m}}+{\frac {l}{n}}\mapsto {\frac {nk+ml\mod mn}{nm}}}
.
Well-definedness: Let
k
m
{\displaystyle {\frac {k}{m}}}
,
l
n
{\displaystyle {\frac {l}{n}}}
be reduced. Then
k
m
+
l
n
=
k
n
+
l
m
mod
m
n
n
m
{\displaystyle {\frac {k}{m}}+{\frac {l}{n}}={\frac {kn+lm\mod mn}{nm}}}
is also reduced, for if
p
|
(
n
m
)
{\displaystyle p|(nm)}
, then without loss of generality
p
|
n
{\displaystyle p|n}
, and from
p
|
(
k
n
+
l
m
−
c
n
m
)
{\displaystyle p|(kn+lm-cnm)}
follows
p
|
l
{\displaystyle p|l}
or
p
|
m
{\displaystyle p|m}
. In both cases we obtain a contradiction, either to
gcd
(
m
,
n
)
=
1
{\displaystyle \gcd(m,n)=1}
or to
l
n
{\displaystyle {\frac {l}{n}}}
is reduced.
Surjectivity: Let
r
m
n
{\displaystyle {\frac {r}{mn}}}
be reduced. Using the Euclidean algorithm, we find
a
,
b
∈
N
{\displaystyle a,b\in \mathbb {N} }
such that
a
n
+
b
m
=
1
{\displaystyle an+bm=1}
. Then
r
a
n
+
r
b
m
=
r
{\displaystyle ran+rbm=r}
. Define
k
=
r
a
mod
m
{\displaystyle k=ra\mod m}
,
l
=
r
b
mod
n
{\displaystyle l=rb\mod n}
. Then
k
n
+
l
m
=
(
r
a
+
t
m
)
n
+
(
r
b
+
s
n
)
m
≡
r
mod
m
n
{\displaystyle kn+lm=(ra+tm)n+(rb+sn)m\equiv r\mod mn}
.Injectivity: Let
k
n
+
l
m
≡
k
′
n
+
l
′
m
mod
m
n
{\displaystyle kn+lm\equiv k'n+l'm\mod mn}
. We show
k
=
k
′
{\displaystyle k=k'}
; the proof for
l
=
l
′
{\displaystyle l=l'}
is the same.
Indeed, from
k
n
+
l
m
≡
k
′
n
+
l
′
m
mod
m
n
{\displaystyle kn+lm\equiv k'n+l'm\mod mn}
follows
k
n
≡
k
′
n
mod
m
{\displaystyle kn\equiv k'n\mod m}
, and since
gcd
(
m
,
n
)
=
1
{\displaystyle \gcd(m,n)=1}
,
n
{\displaystyle n}
is invertible modulo
m
{\displaystyle m}
, which is why we may multiply this inverse on the right to obtain
k
≡
k
′
mod
m
{\displaystyle k\equiv k'\mod m}
. Since
1
≤
k
,
k
′
≤
m
{\displaystyle 1\leq k,k'\leq m}
, the claim follows.
◻
{\displaystyle \Box }
Proof 2 :
We prove the theorem from the Chinese remainder theorem .
Let
n
=
p
1
k
1
⋯
p
r
k
r
{\displaystyle n=p_{1}^{k_{1}}\cdots p_{r}^{k_{r}}}
. From the Chinese remainder theorem, we obtain a ring isomorphism
Z
/
n
→
Z
/
p
1
k
1
×
⋯
×
Z
/
p
r
k
r
{\displaystyle \mathbb {Z} /n\to \mathbb {Z} /p_{1}^{k_{1}}\times \cdots \times \mathbb {Z} /p_{r}^{k_{r}}}
,which induces a group isomorphism
(
Z
/
n
)
×
→
(
Z
/
p
1
k
1
)
×
×
⋯
×
(
Z
/
p
r
k
r
)
×
{\displaystyle (\mathbb {Z} /n)^{\times }\to \left(\mathbb {Z} /p_{1}^{k_{1}}\right)^{\times }\times \cdots \times \left(\mathbb {Z} /p_{r}^{k_{r}}\right)^{\times }}
.Hence,
|
(
Z
/
n
)
∗
|
=
∏
j
=
1
r
|
(
Z
/
p
j
k
j
)
∗
|
{\displaystyle \left|(\mathbb {Z} /n)^{*}\right|=\prod _{j=1}^{r}\left|\left(\mathbb {Z} /p_{j}^{k_{j}}\right)^{*}\right|}
, and from
∀
m
∈
N
:
φ
(
m
)
=
|
(
Z
/
m
)
∗
|
{\displaystyle \forall m\in \mathbb {N} :\varphi (m)=\left|(\mathbb {Z} /m)^{*}\right|}
follows the claim.
◻
{\displaystyle \Box }
Proof 3 : We prove the theorem from lemma 2.11 and induction (due to Hensel ).
Let
m
,
n
∈
N
{\displaystyle m,n\in \mathbb {N} }
such that
gcd
(
m
,
n
)
=
1
{\displaystyle \gcd(m,n)=1}
. By lemma 2.11, we have
m
=
∑
e
|
m
φ
(
e
)
{\displaystyle m=\sum _{e|m}\varphi (e)}
and
n
=
∑
d
|
m
φ
(
d
)
{\displaystyle n=\sum _{d|m}\varphi (d)}
and hence
m
n
=
φ
(
m
)
φ
(
n
)
+
φ
(
m
)
∑
d
|
n
,
d
<
n
φ
(
d
)
+
φ
(
n
)
∑
e
|
m
,
e
<
m
φ
(
e
)
+
∑
d
|
n
,
e
|
m
d
<
n
,
e
<
m
φ
(
d
)
φ
(
e
)
{\displaystyle mn=\varphi (m)\varphi (n)+\varphi (m)\sum _{d|n,d<n}\varphi (d)+\varphi (n)\sum _{e|m,e<m}\varphi (e)+\sum _{d|n,e|m \atop d<n,e<m}\varphi (d)\varphi (e)}
.Furthermore, by lemma 2.11 and the bijection from the proof of theorem 2.8,
m
n
=
∑
f
|
m
n
φ
(
f
)
=
∑
e
|
m
,
d
|
n
φ
(
e
d
)
{\displaystyle mn=\sum _{f|mn}\varphi (f)=\sum _{e|m,d|n}\varphi (ed)}
.By induction on
e
d
,
e
n
,
m
d
<
m
n
{\displaystyle ed,en,md<mn}
we thus have
m
n
=
φ
(
m
n
)
+
φ
(
m
)
∑
d
|
n
φ
(
d
)
+
φ
(
n
)
∑
e
|
m
φ
(
e
)
+
∑
e
|
m
,
d
|
n
e
<
m
,
d
<
n
φ
(
e
)
φ
(
d
)
{\displaystyle mn=\varphi (mn)+\varphi (m)\sum _{d|n}\varphi (d)+\varphi (n)\sum _{e|m}\varphi (e)+\sum _{e|m,d|n \atop e<m,d<n}\varphi (e)\varphi (d)}
.
◻
{\displaystyle \Box }
Proof 4 : We prove the theorem from lemma 2.11 and the Möbius inversion formula.
Indeed, from lemma 2.10 and the Möbius inversion formula, we obtain
φ
=
μ
∗
I
1
{\displaystyle \varphi =\mu *I_{1}}
,which is why
φ
{\displaystyle \varphi }
is multiplicative as the convolution of two multiplicative functions.
◻
{\displaystyle \Box }
Proof 5 : We prove the theorem from Euler's product formula.
Indeed, if
m
=
P
κ
{\displaystyle m=P^{\kappa }}
and
n
=
Q
ι
{\displaystyle n=Q^{\iota }}
and
gcd
(
m
,
n
)
=
1
{\displaystyle \gcd(m,n)=1}
, then
P
∩
Q
=
∅
{\displaystyle P\cap Q=\emptyset }
and hence
∏
p
|
n
(
p
κ
p
−
p
κ
p
−
1
)
∏
q
|
n
(
q
ι
−
q
ι
q
−
1
)
=
φ
(
m
n
)
{\displaystyle \prod _{p|n}(p^{\kappa _{p}}-p^{\kappa _{p}-1})\prod _{q|n}(q^{\iota }-q^{\iota _{q}-1})=\varphi (mn)}
.
◻
{\displaystyle \Box }
Theorem 2.15 (Möbius inversion formula) :
Let
f
{\displaystyle f}
be an arithmetical function and define
S
f
(
n
)
:=
∑
d
|
n
f
(
d
)
=
f
∗
E
(
n
)
{\displaystyle S_{f}(n):=\sum _{d|n}f(d)=f*E(n)}
.Then
f
=
μ
∗
S
f
=
∑
d
|
n
μ
(
d
)
S
f
(
n
d
)
{\displaystyle f=\mu *S_{f}=\sum _{d|n}\mu (d)S_{f}\left({\frac {n}{d}}\right)}
.
Proof :
By lemma 2.14 and associativity of convolution,
μ
∗
S
f
=
μ
∗
f
∗
E
=
μ
∗
E
∗
f
=
δ
∗
f
=
f
{\displaystyle \mu *S_{f}=\mu *f*E=\mu *E*f=\delta *f=f}
.
◻
{\displaystyle \Box }
Theorem 2.16 (Product formula for Euler's totient function) :
Let
n
=
p
1
k
1
⋯
p
r
k
r
{\displaystyle n=p_{1}^{k_{1}}\cdots p_{r}^{k_{r}}}
, where
p
1
,
…
,
p
r
{\displaystyle p_{1},\ldots ,p_{r}}
are pairwise different prime numbers and
k
1
,
…
,
k
r
∈
N
{\displaystyle k_{1},\ldots ,k_{r}\in \mathbb {N} }
(recall that every number has such a decomposition by the fundamental theorem of arithmetic . Then
φ
(
n
)
=
n
∏
j
=
1
r