1. Events
• $\mathbb{P}[A\ or\ B] \le \mathbb{P}[A] + \mathbb{P}[B]$
• $\mathbb{P}[A\cup B] \le \mathbb{P}[A] + \mathbb{P}[B]$
• If $A \Rightarrow B$, then $\mathbb{P}[A] \le \mathbb{P}[B]$
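As a small sanity check of the union bound and of monotonicity, the sketch below enumerates two fair dice exactly. The events `A`, `B`, `C` and the helper `prob` are illustrative choices, not part of the list above.

```python
from fractions import Fraction
from itertools import product

# Sample space: two fair dice, every outcome equally likely.
omega = list(product(range(1, 7), repeat=2))

def prob(event):
    """Exact probability of an event given as a predicate on outcomes."""
    return Fraction(sum(1 for w in omega if event(w)), len(omega))

A = lambda w: w[0] == 6            # first die shows 6
B = lambda w: w[0] + w[1] >= 11    # sum is at least 11
C = lambda w: w[0] + w[1] == 12    # double six, which implies A

p_union = prob(lambda w: A(w) or B(w))
print(p_union, "<=", prob(A) + prob(B))   # union bound
print(prob(C), "<=", prob(A))             # C implies A, so P[C] <= P[A]
```

The union bound is strict here because `A` and `B` overlap on two outcomes.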
2. Expectations
• $E[X] = \int xf(x)dx$, where $f$ is the density of $X$
• $E[c] = c$
• $E[X + c] = E[X] + c$
• $E[cX] = cE[X]$
• $E[X+Y] = E[X] + E[Y]$
• $E[XY] = E[X]E[Y]$ if $X$ and $Y$ are independent
• $E[\vec{X}] = (E[X_1] \ E[X_2] \ ... \ E[X_d])^T$
• If $X \ge 0$, $E[X] =\int_0^\infty \mathbb{P}[X\ge t]dt$
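A minimal sketch checking two of these identities on a fair die, using exact fractions. For an integer-valued $X \ge 0$ the tail integral $\int_0^\infty \mathbb{P}[X\ge t]dt$ reduces to $\sum_{k\ge 1}\mathbb{P}[X\ge k]$, which is what the code sums. The die example and variable names are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)
p = Fraction(1, 6)                      # fair die: each face has probability 1/6

mean = sum(x * p for x in faces)        # E[X] from the definition (discrete case)

# Tail formula: P[X >= t] is constant on each interval (k-1, k],
# so the integral over t collapses to a sum over k = 1, ..., 6.
tail_sum = sum(sum(p for x in faces if x >= k) for k in faces)

# Product rule for two independent dice X and Y.
e_xy = sum(x * y * p * p for x, y in product(faces, repeat=2))

print(mean, tail_sum, e_xy, mean * mean)
```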
3. Variance
• $Var(X) = E[(X - E[X])^2]$
• $Var(X) = E[X^2] - (E[X])^2$
• $Var(c) = 0$
• $Var(X + c) = Var(X)$
• $Var(cX) = c^2Var(X)$
• $Var(X+Y) = Var(X) + Var(Y)$ if $X$ and $Y$ are independent
• $Var(\sum\limits_{i=1}^N X_i) = \sum\limits_{i=1}^N Var(X_i)$ if $X_1,\dots,X_N$ are independent
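The variance rules can be checked exactly on dice. The sketch below is one way to do it; the helper `var` and the choice of a fair die are assumptions for illustration only.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)
p = Fraction(1, 6)

def var(values_probs):
    """Exact variance E[Z^2] - (E[Z])^2 from (value, probability) pairs."""
    pairs = list(values_probs)
    m1 = sum(z * q for z, q in pairs)
    m2 = sum(z * z * q for z, q in pairs)
    return m2 - m1 * m1

var_x = var((x, p) for x in faces)                                   # one die
var_sum = var((x + y, p * p) for x, y in product(faces, repeat=2))   # two independent dice
var_3x_plus_1 = var((3 * x + 1, p) for x in faces)                   # shift and scale

print(var_x, var_sum, var_3x_plus_1)
```

Here `var_sum` equals `2 * var_x` (additivity under independence) and `var_3x_plus_1` equals `9 * var_x` (the shift drops out, the scale is squared).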
4. Covariance
• $Cov(X,Y) = E[(X - E[X])(Y - E[Y])] = E[XY] - E[X]E[Y]$
• $Var(X) = Cov(X,X)$
• $Cov(X,c) = 0$
• $Cov(aX, bY) = abCov(X,Y)$
• $Cov(\vec{X}) = E[(\vec{X} - E[\vec{X}])(\vec{X} - E[\vec{X}])^T] = E[\vec{X}\vec{X}^T] - E[\vec{X}](E[\vec{X}])^T$
• $Cov(\vec{X}|\vec{Y}) = E[(\vec{X} - E[\vec{X}|\vec{Y}])(\vec{X} - E[\vec{X}|\vec{Y}])^T|\vec{Y}]$
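A short sketch of the scalar covariance rules and the covariance matrix of a two-dimensional vector, computed exactly from a small joint distribution. The joint law ($X$ a fair die, $Y = X + Z$ with $Z$ an independent die) and the helpers `expect` and `cov` are hypothetical choices for illustration.

```python
from fractions import Fraction
from itertools import product

# Joint law of (X, Y): X is a fair die and Y = X + Z for an independent fair die Z.
outcomes = [((x, x + z), Fraction(1, 36)) for x, z in product(range(1, 7), repeat=2)]

def expect(g):
    """Exact E[g(X, Y)] under the joint law above."""
    return sum(g(x, y) * q for (x, y), q in outcomes)

def cov(g, h):
    """Cov(g, h) = E[gh] - E[g]E[h]."""
    return expect(lambda x, y: g(x, y) * h(x, y)) - expect(g) * expect(h)

X = lambda x, y: x
Y = lambda x, y: y

cov_xy = cov(X, Y)
cov_scaled = cov(lambda x, y: 2 * x, lambda x, y: -3 * y)    # Cov(2X, -3Y)
cov_matrix = [[cov(U, V) for V in (X, Y)] for U in (X, Y)]   # Cov of the vector (X, Y)^T

print(cov_xy, cov_scaled, cov_matrix)
```

The diagonal of `cov_matrix` holds $Var(X)$ and $Var(Y)$, and the matrix is symmetric, as the outer-product definition requires.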
5. Mean Squared Error: $MSE(\hat{\theta}) = E[(\hat{\theta} - \theta)^2] = E[(\hat{\theta}-E[\hat{\theta}])^2] + (E[\hat{\theta}]-\theta)^2 = Var(\hat{\theta}) + Bias(\hat{\theta})^2$. A more detailed explanation can be found on Wikipedia.
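The decomposition follows from adding and subtracting $E[\hat{\theta}]$ inside the square. A short derivation, assuming $\theta$ is a fixed (non-random) parameter so that $E[\hat{\theta}]-\theta$ is a constant:

$$
\begin{aligned}
E[(\hat{\theta}-\theta)^2]
&= E\big[\big((\hat{\theta}-E[\hat{\theta}]) + (E[\hat{\theta}]-\theta)\big)^2\big] \\
&= E[(\hat{\theta}-E[\hat{\theta}])^2] + 2(E[\hat{\theta}]-\theta)\,E[\hat{\theta}-E[\hat{\theta}]] + (E[\hat{\theta}]-\theta)^2 \\
&= Var(\hat{\theta}) + 0 + Bias(\hat{\theta})^2
\end{aligned}
$$

The cross term vanishes because $E[\hat{\theta}-E[\hat{\theta}]] = 0$.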
6. Law of Total Expectation: $E[X] = E[E[X|Y]]$
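A minimal sketch of the law of total expectation on a discrete example: $Y$ is a fair die and $X = Y + Z$ for a second independent die $Z$. The example and all variable names are assumptions for illustration.

```python
from fractions import Fraction
from itertools import product
from collections import defaultdict

# Joint pmf of (X, Y), keyed by (x, y), with X = Y + Z.
joint = {(y + z, y): Fraction(1, 36) for y, z in product(range(1, 7), repeat=2)}

p_y = defaultdict(Fraction)             # marginal P[Y = y]
sum_x_given_y = defaultdict(Fraction)   # sum over x of x * P[X = x, Y = y]
for (x, y), q in joint.items():
    p_y[y] += q
    sum_x_given_y[y] += x * q

cond_mean = {y: sum_x_given_y[y] / p_y[y] for y in p_y}   # E[X | Y = y]

lhs = sum(x * q for (x, _), q in joint.items())           # E[X]
rhs = sum(cond_mean[y] * p_y[y] for y in p_y)             # E[E[X | Y]]

print(lhs, rhs)
```

In this example `cond_mean[y]` is `y + 7/2`, and averaging it over the six values of `y` recovers `E[X]`.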
7. Inequalities
• Jensen: for convex $f$, $f(E[X]) \le E[f(X)]$
• Markov: If $X \ge 0$ then for all $t >0$, $\mathbb{P}[X\ge t] \le \frac{E[X]}{t}$
• Chebyshev: for $t>0$, $\mathbb{P}[|X-E[X]|\ge t] \le \frac{Var(X)}{t^2}$
• Chebyshev’s Association: let $f$ and $g$ be real-valued functions defined on the real line and let $X$ be a real-valued random variable. If $f$ and $g$ are both nondecreasing (or both nonincreasing), then $E[f(X)g(X)] \ge E[f(X)]E[g(X)]$. If one is nondecreasing and the other nonincreasing, then $E[f(X)g(X)] \le E[f(X)]E[g(X)]$.
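• Harris’: extends Chebyshev’s Association to functions $f,g:\mathbb{R}^n\rightarrow \mathbb{R}$ that are nondecreasing in each coordinate, applied to a vector of independent random variables $X_1,\dots,X_n$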
• Chernoff: for all $t\in \mathbb{R}$, $\mathbb{P}[X\ge t] \le \inf\limits_{\lambda \ge 0}E[e^{\lambda (X-t)}]$
• Cauchy-Schwarz: if $E[X^2]<\infty$ and $E[Y^2]<\infty$, then $|E[XY]|\le \sqrt{E[X^2]E[Y^2]}$
• Hoeffding’s tail: Let $X_1,\dots,X_n$ be independent bounded random variables such that $X_i$ falls in the interval $[a_i,b_i]$ with probability one, and let $S_n = \sum\limits_{i=1}^n X_i$. Then for any $t>0$, $\mathbb{P}[S_n-E[S_n]\ge t] \le \exp(\frac{-2t^2}{\sum\limits_{i=1}^n(b_i-a_i)^2})$ and $\mathbb{P}[S_n-E[S_n]\le -t] \le \exp(\frac{-2t^2}{\sum\limits_{i=1}^n(b_i-a_i)^2})$
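To get a feel for how tight these bounds are, the sketch below compares the Markov, variance-based (Chebyshev-type), and Hoeffding bounds with the exact upper tail of a binomial sum. The parameters `n`, `p`, and `t` are arbitrary illustrative values, and the ordering of the bounds seen here is specific to them, not a general fact.

```python
import math

n, p = 100, 0.5            # S_n ~ Binomial(n, p): sum of n independent {0,1} variables
mean, var = n * p, n * p * (1 - p)
t = 15                     # deviation above the mean

# Exact upper tail P[S_n - E[S_n] >= t] from the binomial pmf.
k0 = math.ceil(mean + t)
exact = sum(math.comb(n, k) * p**k * (1 - p)**(n - k) for k in range(k0, n + 1))

markov = mean / (mean + t)             # P[S_n >= mean + t] <= E[S_n] / (mean + t)
chebyshev = var / t**2                 # bounds the two-sided deviation, hence the upper tail too
hoeffding = math.exp(-2 * t**2 / n)    # each X_i lies in [0, 1], so sum (b_i - a_i)^2 = n

print(f"exact={exact:.2e} hoeffding={hoeffding:.2e} "
      f"chebyshev={chebyshev:.2e} markov={markov:.2e}")
```

Markov uses only the mean and Chebyshev only the variance, while Hoeffding exploits independence and boundedness, which is why it decays exponentially in $t^2$.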
8. Other
• If $\mathbb{P}[X > t]\le F(t)$ for an invertible $F$, then with probability at least $1 - \delta$, $X\le F^{-1}(\delta)$ (set $t = F^{-1}(\delta)$ so that $F(t) = \delta$).
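As a worked instance of this inversion, take $F(t) = \exp(-2t^2/\sum_i (b_i-a_i)^2)$ from Hoeffding’s tail bound. Solving $F(t)=\delta$ gives $t = \sqrt{\sum_i (b_i-a_i)^2 \log(1/\delta)/2}$. The sketch below computes this radius; the function name `hoeffding_radius` and the example values are hypothetical.

```python
import math

def hoeffding_radius(ranges, delta):
    """Return t solving exp(-2 t^2 / sum (b_i - a_i)^2) = delta."""
    total = sum((b - a) ** 2 for a, b in ranges)
    return math.sqrt(total * math.log(1 / delta) / 2)

ranges = [(0.0, 1.0)] * 100      # 100 independent variables, each in [0, 1]
delta = 0.05
t = hoeffding_radius(ranges, delta)

# Plugging t back into the tail bound recovers delta (up to rounding).
tail = math.exp(-2 * t**2 / sum((b - a) ** 2 for a, b in ranges))
print(t, tail)
```

Read as a statement about the sum: with probability at least $1-\delta$, $S_n - E[S_n] \le t$ for the returned `t`.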
