CS2955-Prob&Comp-Notes

简单整理一下 CS2955 这门四周小学期课的笔记，不是很细致。

Intro

Why Randomness?

Simplicity
Efficiency
Better in adversarial situation: 由于随机性很难造数据卡掉

Toy Example

array $A: A[i]=0$ for $\frac{n}{3}$ indices $i$. Output some $i$ s.t. $A[i]=0$.

Deterministic: check $\frac{2n}{3}-1$ positions. Runtime $\Omega(n)$.
Randomized: choose a uniformly random $i$. $Pr\{A[i]=0\} = \frac{1}{3}$. Repeat 10 times.
Failure Prob $(\frac{1}{3})^{10}$ Runtime $O(1)$.

Monte-Carlo & Las Vegas

Monte-Carlo: finite time, but the output could be WRONG.
Las Vegas: output is always CORRECT. (Runtime could be infinite, but the $\mathbb{E}(\text{runtime})$ must be finite).

可以被 Monte-Carlo 解决的复杂性类: BPP (Bounded error Probabilistic Polynomial time).

可以被 Las Vegas 解决的复杂性类: ZPP (Zero-error Probabilistic Polynomial time).

$$ZPP \subseteq BPP$$ (Las Vegas 无限长运行时间，可以设一个 Threshold 截断. )

e.g. 一个 Las Vegas 算法有期望运行时间 $\mathbb{E}[T]$, 设置一个 Threshold $T_0 = 3\mathbb{E}[T]$ ，那么有
$$
Pr(T > T_0) \le \frac{1}{3}
$$
在此 Threshold 截断得到的新算法，w.p. $\frac{1}{3}$ 没有输出或者输出错误.

Polynomial Identity Testing (P.I.T)

Problem Description

Alice has
$$
f(x) = \sum_{i=0}^{n-1} a_i x^i
$$
Bob has
$$
g(x) = \sum_{i=0}^{n-1} b_i x^i
$$
Alice send sth to Bob, and Bob answer the question "whether $f(x)$ and $g(x)$ are identical".

A Simple Deterministic Algorithm

Alice send all the coefficients $a_i$ to Bob.

Assumption: $a_i$, $b_i$ are of size poly($n$), so we can use $\log(n)$ bits to represent them. Totally it takes

$O(n \log n)$.

Randomized Algorithm

Idea: Select a random $r$. check if $f(r) = g(r)$.
Analysis:
- Depends on $r$ (trade off): large range of $r$ $\Longrightarrow$ smaller failure prob; large communication complexity.
- If $f(x)=g(x)$, this algorithm is always correct.
- If $f(x) \neq g(x)$, the algorithm fails only when $r$ is a root of $f(x)-g(x)$.$f(x)-g(x)$ has at most $n$ roots hence if we choose $r$ from $\{1,2, \dots, 100n\}$, the failure prob is $\frac{1}{100}$.
Communication Complexity:
- $r$ : $O(\log n)$
- $f(r)$: $O(n \log n)$

Fingerprinting

Choose a random prime $p$, if $a \equiv b \mod p$ then there is a good chance $a = b$.

Fingerprints: $a\ \text{mod}\ p$, $b\ \text{mod}\ p$

New Algorithm (Random with Fingerprinting)

Check if
$$
f(r) \equiv g(r) \mod p
$$
Only need to pass $f(r)\ \text{mod}\ p$.

Analysis: Failure probability comes from
- Case1: $r$ is a root of $f(x)-g(x)$
- Case 2: $p$ is a factor of $f(r)-g(r)$
  - $f(r)-g(r)$ is a number with size $n^n$, has $\le n \log n$ prime factors.
  - There are $O(\frac{n^2}{\ln n})$ primes $\le n^2$.Then choose a prime $p \le n^2$.
    $$
    Pr{p \mid f(r)-g(r)} = \frac{n \ln n}{\frac{n^2}{\ln n}} = O(\frac{\ln^2 n}{n})
    $$
Facts used in Analysis:
- (Prime Number Thm) 设 $\pi(x)$ 为不超过 $x$ 的素数个数,
  $$
  \pi(x) \sim \frac{x}{\ln x}
  $$
- $N$ has $\le \ln N$ prime factors. 考虑前 $m$ 个质数连乘, 数量级为
  $$
  e^{(1+o(1))m \ln m} = \#m
  $$
  $m \sim \frac{\ln N}{\ln m} \le \ln N$.
上面的部分简单推导设 $\theta(x) := \sum_{k=1}^{\pi(x)} \ln p_k$ (Chebyshev Function), 素数定理等价于
$$
\theta(n) \sim n
$$
又有
$$
p_m \sim m \ln m
$$
那么 $\theta(p_m) \sim p_m$, $\ln (\#m) \sim p_m \sim m \ln m$
Communication Complexity
- $r$ : $O(\log n)$
- $p$: $O(\log n)$
- $f(r)\ \text{mod}\ p$: $O(\log n)$

Repeat $O(\log (\frac{1}{\delta}))$ times to achieve suc prob $\ge 1-\delta$.

Multiple Variables

Input: $f, g \in \mathbb{F}[x_1, x_2, \dots, x_n]$ of total degree $d$.
$$
f(x_1, \dots, x_n) = \sum_{i_1, \dots, i_n \ge 0,\ i_1+ \dots+i_n \le d} a_{i_1, \dots, i_n} x_1^{i_1} x_2^{i_2} \dots x_n^{i_n}
$$
Output: $f$ and $g$ are identical?

这个问题等价于输入 $f \in \mathbb{F}[x_1, x_2, \dots, x_n]$, 输出 $f$ 是否等于 $0$.

输入 $f$ 方式：Blackbox Model: Given $x_1, \dots, x_n$, return $f(x_1, \dots, x_n)$.

Schwartz-Zippel Lemma

Let $P \in \mathbb{F}[x_1, x_2, \dots, x_n$, $P \neq 0$, total degree $d$ over field $\mathbb{F}$. Let $S$ be a finite subset of $\mathbb{F}$ and let $r_1, r_2, \dots, r_n$ be selected at random independently and uniformly from $S$. Then
$$
Pr[P(r_1, r_2, \dots, r_n) = 0] \le \frac{d}{|S|}
$$

证明略。（其实我笔记上也有，但是为了内容简洁就略掉吧。）

Application: Bipartite Matching

check whether G has a perfect matching (algebra method)

$A: n \times n$ Tutte matrix:
$$
A_{ij} = \begin{cases}
x_{ij} & (i,j) \in E \\
0 & (i, j) \not\in E
\end{cases}
$$
Claim: $\text{det} A = 0$ iff G has no perfect matching. 因为把行列式写开（对排列 $\pi$ sigma），每个排列描述了一个完美匹配。

Finding a Perfect Matching

Trivial Alg

爆搜。每次找一个配对，然后把这条边以及这两个点都删了，查找删掉的图是否有完美匹配。

Isolation Lemma

Let $S$ be a family of subsets on a ground set $U$ with $|U|=n$. For each $x \in U$, assign a weight

$w(x)$ from $\{1, 2, \dots, 100n\}$ uniformly.

Then there exists a unique minimum weight subset in $S$ w.p. $\ge 0.99$.

这一事实说明了最小权完美匹配大概率唯一。

Counterintuitive:

一共有 $2^n$ 个 subsets, 权值 $\le 100n^2$，那么每个权值感觉上对应了 $\frac{2^n}{100n^2}$ 个子集，为什么最小权只对应一个子集？

Parallel Alg

给每条边 $(i, j) \in E$ 随机赋一个权重 $2^{w_{ij}}$, 那么最小完美匹配的权重 $W$ 即为整除行列式的最大的 $2^W$.

算法：先计算出 $W$. 我们的目标是找一组完美匹配的边。怎样判断一条边 $(i, j)$ 是否是完美匹配呢？答案是只要

$$
\text{det}(A') \frac{2^{w_{ij}}}{2^W}
$$

是 odd （奇）的即可。$A'$ 是去掉这条边（以及两端点）后的图的 Tutte 矩阵。（回忆行列式写开来，被除剩奇数说明有一项是 $2^W/2^W=1$，即最小权刚好是这个）

Balls and Bins

TODO

Derandomization

TODO

Concentration Inequalities

TODO

Random Vectors

TODO

没有风会经过的旅店

没有风会经过的旅店

Intro

Why Randomness?

Toy Example

Monte-Carlo & Las Vegas

Polynomial Identity Testing (P.I.T)

Problem Description

A Simple Deterministic Algorithm

Randomized Algorithm

Fingerprinting

New Algorithm (Random with Fingerprinting)

Multiple Variables

Schwartz-Zippel Lemma

Application: Bipartite Matching

Finding a Perfect Matching

Trivial Alg

Isolation Lemma

Counterintuitive:

Parallel Alg

Balls and Bins

Derandomization

Concentration Inequalities

Random Vectors

大数定律！

Computer Networking A Top-Down Approach 笔记

Sirius

Comments | NOTHING