Basic Concepts of Statistics

This post summarizes the basic terms and concepts used in statistics.

  • Sample space (표본공간)
: The sample space $S$ is a set that contains all possible experimental outcomes.

  • Experiment (실험)
: Any process for which more than one outcome is possible. (Any process that generates data)

  • Event (사건)
: A subset of the sample space $S$

  • Random variable
: A function that assigns a real number to each element of the sample space.
$$\text{Real numbers} = f(\text{Elements of the sample space})$$



Types of random variables

1. Discrete random variables (이산확률변수)
: A random variable whose possible values are discrete (countable), e.g. 0, 1, 2, ...

2. Continuous random variables (연속확률변수)
: A random variable whose possible values are continuous (uncountable).



  • Probability function ($f$)
$$p = f(x)$$
$$x\;:\;\text{real numbers that the random variable can take}$$
$$0 \leq p \leq 1$$

If $X$ is a discrete random variable, $f$ is a probability mass function (p.m.f., 확률질량함수);
if $X$ is a continuous random variable, $f$ is a probability density function (p.d.f., 확률밀도함수).



Probability mass function (pmf : 확률 질량 함수)

  • For a discrete random variable $X$
  • Let $x$ be a possible value of $X$
  • The probability assigned to each value $x$ of $X$ is $P[X=x]$
  • A discrete random variable $X$ has the probability mass function (p.m.f.)
$$f(x) = P[X = x]$$
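As a minimal sketch (the fair-die distribution below is an illustrative assumption, not part of the text), a p.m.f. can be written as a function mapping each possible value to its probability:

```python
# p.m.f. of a fair six-sided die: f(x) = P[X = x] = 1/6 for x in {1, ..., 6}
def pmf_die(x):
    return 1 / 6 if x in {1, 2, 3, 4, 5, 6} else 0.0

# Each probability lies in [0, 1], and they sum to 1 over all possible values.
total = sum(pmf_die(x) for x in range(1, 7))
```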



Expectation (E, mean)

For a discrete random variable $X$ with p.m.f. $p(x)$
$$E[X] = \sum_i x_i p (x_i) $$
$E[X]$ is the expected value of the random variable $X$.
More precisely, $E[X]$ is the weighted average of the possible values of $X$, where each value is weighted by the probability that it will occur.

  • $E[c] = c,\;\;\;\;c\;:\;constant$
  • $E[cX] = cE[X]$
  • $E[cX+d] = cE[X] + d,\;\;\;\;d\;:\;constant$
  • $E\big[\sum_{i=1}^n X_i \big] = \sum_{i=1}^n E[X_i] $
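The weighted-average definition and the linearity property above can be checked numerically; the fair-die distribution below is an illustrative assumption:

```python
# E[X] = sum_i x_i p(x_i): each value weighted by its probability.
values = [1, 2, 3, 4, 5, 6]      # possible values of a fair die (illustrative)
probs = [1 / 6] * 6              # p.m.f. weights, summing to 1

E_X = sum(x * p for x, p in zip(values, probs))   # close to 3.5

# Linearity: E[cX + d] = c E[X] + d
c, d = 2, 1
E_cXd = sum((c * x + d) * p for x, p in zip(values, probs))
```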

Variance (V, Var)

For a discrete random variable $X$ with mean $\mu$, the variance of $X$ denoted by $V(X)$ is defined by
$$V[X] = E[(X-\mu )^2] $$
Variance can be interpreted as the expected squared deviation of $X$ about its mean.
It shows how much the values of $X$ vary (or disperse) around the expected value.
Variance cannot be negative.
$$\begin{align*} V(X) &= E[(X-E[X] )^2]\\ &= E[X^2 - 2XE[X] + E[X]^2 ] \\ &= E[X^2] - 2E[X]^2 + E[X]^2 \\ &= E[X^2] - E[X]^2 \end{align*}$$

  • $V(c) = 0$
  • $V(cX) = c^2V(X)$
  • $V(cX+d) = c^2V(X)$

The square root of $V(X)$ is called the standard deviation of the random variable $X$.
$$SD[X] = \sigma = \sqrt{V(X)} $$
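A quick numerical check of both variance formulas and the standard deviation, again on an illustrative fair-die distribution:

```python
import math

# Variance via the definition and via the shortcut E[X^2] - E[X]^2.
values = [1, 2, 3, 4, 5, 6]
probs = [1 / 6] * 6

mu = sum(x * p for x, p in zip(values, probs))                        # E[X]
var_def = sum((x - mu) ** 2 * p for x, p in zip(values, probs))       # E[(X - mu)^2]
var_short = sum(x ** 2 * p for x, p in zip(values, probs)) - mu ** 2  # E[X^2] - E[X]^2

sd = math.sqrt(var_def)   # SD[X] = sqrt(V(X))
```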



Mean Vectors

Let $X$ and $Y$ be random matrices of the same dimension, and let $A$ and $B$ be constant matrices.
$$E(X) = \begin{bmatrix}E(X_1)\\E(X_2)\\ \vdots \\ E(X_p) \end{bmatrix} = \begin{bmatrix} \mu_1 \\ \mu_2 \\ \vdots \\ \mu_p \end{bmatrix} = \boldsymbol{\mu} $$

$$E(X+Y) = E(X)+E(Y)$$
$$E(AXB) = AE(X)B$$
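The identity $E(AXB) = AE(X)B$ can be verified on a small discrete random matrix; all matrices and probabilities below are illustrative assumptions:

```python
import numpy as np

# A random 2x2 matrix X taking two values with equal probability (illustrative).
X1 = np.array([[1.0, 2.0], [3.0, 4.0]])
X2 = np.array([[0.0, 1.0], [1.0, 0.0]])
probs = [0.5, 0.5]

A = np.array([[1.0, -1.0], [0.0, 2.0]])   # constant matrices
B = np.array([[2.0, 0.0], [1.0, 1.0]])

E_X = probs[0] * X1 + probs[1] * X2                      # E(X), element-wise
lhs = probs[0] * (A @ X1 @ B) + probs[1] * (A @ X2 @ B)  # E(AXB)
rhs = A @ E_X @ B                                        # A E(X) B
```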



Covariance Matrix (공분산 행렬)

$$\begin{align*}\Sigma(X) &= E[(X-\mu)(X-\mu)^T] \\ & \\ &= \begin{bmatrix} E[(X_1-\mu_1)(X_1-\mu_1)] & E[(X_1-\mu_1)(X_2-\mu_2)] & \cdots & E[(X_1-\mu_1)(X_p-\mu_p)]\\ E[(X_2-\mu_2)(X_1-\mu_1)] & E[(X_2-\mu_2)(X_2-\mu_2)] & \cdots & E[(X_2-\mu_2)(X_p-\mu_p)]\\ \vdots & \vdots &  \ddots & \vdots \\ E[(X_p-\mu_p)(X_1-\mu_1)] & E[(X_p-\mu_p)(X_2-\mu_2)] & \cdots &  E[(X_p-\mu_p)(X_p-\mu_p)] \end{bmatrix} \\ & \\ &= \begin{bmatrix} \sigma_{11} & \sigma_{12} & \cdots & \sigma_{1p} \\ \sigma_{21} & \sigma_{22} & \cdots & \sigma_{2p} \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_{p1} & \sigma_{p2} & \cdots &  \sigma_{pp} \end{bmatrix} \end{align*}$$
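As a sketch of how this matrix arises in practice, the empirical covariance matrix of sampled data is the average outer product of the centered sample vectors (the simulated data below are an illustrative assumption):

```python
import numpy as np

# Empirical covariance matrix: (1/n) sum over samples of (x - mu)(x - mu)^T.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))          # 1000 samples of a p = 3 random vector

mu = X.mean(axis=0)                     # sample mean vector
centered = X - mu
Sigma = centered.T @ centered / len(X)  # p x p covariance matrix

# np.cov with bias=True computes the same population-style estimate.
Sigma_np = np.cov(X, rowvar=False, bias=True)
```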


Correlation Matrix (상관계수 행렬)

$$\begin{align*} \rho_X &= \begin{bmatrix} \frac{\sigma_{11}}{\sqrt{\sigma_{11}}\sqrt{\sigma_{11}}} & \frac{\sigma_{12}}{\sqrt{\sigma_{11}}\sqrt{\sigma_{22}}} & \cdots & \frac{\sigma_{1p}}{\sqrt{\sigma_{11}}\sqrt{\sigma_{pp}}} \\  \frac{\sigma_{21}}{\sqrt{\sigma_{22}}\sqrt{\sigma_{11}}} & \frac{\sigma_{22}}{\sqrt{\sigma_{22}}\sqrt{\sigma_{22}}} & \cdots & \frac{\sigma_{2p}}{\sqrt{\sigma_{22}}\sqrt{\sigma_{pp}}} \\ \vdots &\vdots & \ddots & \vdots \\ \frac{\sigma_{1p}}{\sqrt{\sigma_{11}}\sqrt{\sigma_{pp}}} & \frac{\sigma_{2p}}{\sqrt{\sigma_{22}}\sqrt{\sigma_{pp}}} & \cdots & \frac{\sigma_{pp}}{\sqrt{\sigma_{pp}}\sqrt{\sigma_{pp}}} \end{bmatrix} \\ & \\ &= \begin{bmatrix} 1& \rho_{12} & \cdots & \rho_{1p} \\  \rho_{21} & 1 & \cdots & \rho_{2p} \\ \vdots & \vdots & \ddots & \vdots \\ \rho_{p1} & \rho_{p2} & \cdots & 1 \end{bmatrix} \end{align*}$$
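Each entry divides $\sigma_{ij}$ by $\sqrt{\sigma_{ii}}\sqrt{\sigma_{jj}}$, which can be sketched with a hypothetical covariance matrix:

```python
import numpy as np

# rho_ij = sigma_ij / (sqrt(sigma_ii) * sqrt(sigma_jj))
Sigma = np.array([[4.0, 2.0, 0.5],
                  [2.0, 9.0, 1.5],
                  [0.5, 1.5, 1.0]])     # illustrative covariance matrix

d = np.sqrt(np.diag(Sigma))             # standard deviations sqrt(sigma_ii)
rho = Sigma / np.outer(d, d)            # entrywise division by sqrt(sigma_ii)sqrt(sigma_jj)
```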



Let $X_1$ and $X_2$ be random variables and $a,\,b,\,c$ be constants.
$$E(cX_1) = cE(X_1) = c\mu_1$$
$$Var(cX_1) = E[(cX_1 - c\mu_1)^2] = c^2 Var(X_1) = c^2 \sigma_{11}$$
$$Cov(aX_1,bX_2) = ab\,Cov(X_1,X_2)$$

For the linear combination $aX_1 + bX_2$
$$E(aX_1 + bX_2) = aE(X_1) + bE(X_2) = a \mu_1 + b \mu_2 $$
$$\begin{align*} Var(aX_1 + bX_2) &= a^2 Var(X_1) + b^2 Var(X_2) + 2ab \, Cov(X_1,X_2) \\ &= a^2 \sigma_{11} + b^2 \sigma_{22} + 2ab\sigma_{12} \end{align*}$$


With $C^T = [a,b]$, $aX_1 + bX_2$ can be written as
$$aX_1 + bX_2 = \begin{bmatrix} a & b \end{bmatrix} \begin{bmatrix} X_1 \\ X_2 \end{bmatrix} = C^T X $$
$$E(aX_1 + bX_2) = E(C^T X) = C^T E(X) = \begin{bmatrix} a & b \end{bmatrix} \begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix} = a \mu_1 + b \mu_2$$
$$ Var(aX_1 + bX_2) = V(C^T X) = C^TV(X)C = a^2 \sigma_{11} + b^2 \sigma_{22} + 2ab\sigma_{12}$$
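A numerical check that the quadratic form $C^T \Sigma C$ reproduces the scalar formula $a^2\sigma_{11} + b^2\sigma_{22} + 2ab\sigma_{12}$ (the covariance values and coefficients are illustrative):

```python
import numpy as np

# Var(aX1 + bX2) as a quadratic form versus the expanded scalar formula.
Sigma = np.array([[4.0, 1.5],
                  [1.5, 9.0]])          # covariance matrix of (X1, X2)
a, b = 2.0, -1.0
C = np.array([a, b])

quad = C @ Sigma @ C                    # C^T Sigma C
scalar = a**2 * Sigma[0, 0] + b**2 * Sigma[1, 1] + 2 * a * b * Sigma[0, 1]
```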



For the linear combination $C^TX = C_1X_1 + C_2X_2 +\cdots + C_pX_p$,
$$E[C^TX] = C^T\mu_X$$
$$Var[C^TX] = C^T\Sigma_X C$$


In general, consider the $q$ linear combinations of the $p$ random variables $X_1,X_2,\cdots,X_p$. Let $C$ be a $q \times p$ constant matrix.
$$Z = \begin{bmatrix}Z_1\\Z_2\\\vdots \\ Z_q \end{bmatrix} = \begin{bmatrix} C_{11} & C_{12} & \cdots & C_{1p} \\ C_{21} & C_{22} & \cdots & C_{2p} \\ \vdots & \vdots & \ddots & \vdots \\ C_{q1} & C_{q2} & \cdots & C_{qp} \end{bmatrix} \begin{bmatrix} X_1 \\ X_2 \\ \vdots \\ X_p \end{bmatrix} = CX$$
$$\mu_Z = E(Z) = E(CX) = CE(X) = C \mu_X$$
$$\Sigma_Z = Cov(Z) = Cov(CX) = C\,Cov(X)C^T = C \Sigma_X C^T$$
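These identities can be checked empirically on simulated data (the matrix $C$ and the simulated samples are illustrative assumptions):

```python
import numpy as np

# q linear combinations of p variables: Z = C X, with
# mu_Z = C mu_X and Cov(Z) = C Sigma_X C^T.
rng = np.random.default_rng(1)
X = rng.normal(size=(10000, 3))         # n samples of a p = 3 random vector
C = np.array([[1.0, 2.0, 0.0],
              [0.0, -1.0, 3.0]])        # q = 2 combinations, C is 2 x 3

Z = X @ C.T                             # each row is z = C x
mu_X = X.mean(axis=0)
mu_Z = Z.mean(axis=0)
Sigma_X = np.cov(X, rowvar=False, bias=True)
Sigma_Z = np.cov(Z, rowvar=False, bias=True)
# Sample mean and covariance of Z match C mu_X and C Sigma_X C^T up to rounding.
```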



※ This post was written based on Professor 김성범's Predictive Models lecture at the Department of Industrial Management Engineering, Korea University, together with my own study notes.
