Basics of Statistics
This post summarizes the basic terms and concepts used in statistics.
- Sample space
: The sample space S is a set that contains all possible experimental outcomes.
- Experiment
: Any process for which more than one outcome is possible (any process that generates data).
- Event
: A subset of the sample space S
- Random variable
: A function that assigns a real number to each element of the sample space.
Real\;numbers = f(Elements\;of\;the\;sample\;space)
Types of random variables
1. Discrete random variables
: The random variable takes countably many values (finite or countably infinite), e.g. 0, 1, 2, ...
2. Continuous random variables
: The random variable takes values in a continuum (an uncountable set).
- Probability function (f)
p = f(x)
x\;:\;a\;real\;number\;that\;the\;random\;variable\;can\;take
0 \leq p \leq 1
If X is a discrete random variable, f is a probability mass function (p.m.f.);
if X is a continuous random variable, f is a probability density function (p.d.f.).
Probability mass function (p.m.f.)
- For a discrete random variable X
- Let x be a possible value of X
- The probability assigned to the value x of X is denoted P[X = x]
- A discrete random variable X has probability mass function (p.m.f.)
f(x) = P[X = x]
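As a concrete illustration (a fair six-sided die, which is not part of the original notes), a p.m.f. can be written as a function that returns P[X = x] for each value:

```python
# Hypothetical example: p.m.f. of a fair six-sided die.
# f(x) = P[X = x] = 1/6 for x in {1, ..., 6}, and 0 otherwise.
from fractions import Fraction

def pmf(x):
    """p.m.f. of a fair die: assigns probability 1/6 to each face."""
    return Fraction(1, 6) if x in {1, 2, 3, 4, 5, 6} else Fraction(0)

# Every probability lies in [0, 1], and the total mass sums to 1.
assert all(0 <= pmf(x) <= 1 for x in range(8))
assert sum(pmf(x) for x in range(1, 7)) == 1
```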
Expectation (E, mean)
For a discrete random variable X with p.m.f. p(x)
E[X] = \sum_i x_i p (x_i)
E[X] is the expected value of the random variable X.
More precisely, E[X] is the weighted average of the possible values of X, where each value is weighted by the probability that it occurs.
- E[c] = c,\;\;\;\;c\;:\;constant
- E[cX] = cE[X]
- E[cX+d] = cE[X] + d,\;\;\;\;d\;:\;constant
- E\big[\sum_{i=1}^n X_i \big] = \sum_{i=1}^n E[X_i]
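The definition and the linearity properties above can be checked on a small example (a fair six-sided die, an illustrative choice rather than anything from the lecture):

```python
from fractions import Fraction

# Possible values of X and their probabilities (a fair die).
xs = [1, 2, 3, 4, 5, 6]
p = [Fraction(1, 6)] * 6

def E(values, probs):
    """E[X] = sum_i x_i * p(x_i): probability-weighted average."""
    return sum(x * q for x, q in zip(values, probs))

mu = E(xs, p)                      # E[X] = 7/2
assert mu == Fraction(7, 2)

# Linearity: E[cX + d] = c E[X] + d, here with c = 3 and d = 1.
assert E([3 * x + 1 for x in xs], p) == 3 * mu + 1
```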
Variance (V, Var)
For a discrete random variable X with mean \mu, the variance of X, denoted by V(X), is defined by
V[X] = E[(X-\mu )^2]
Variance can be interpreted as the expected squared deviation about its mean.
Variance measures how much the values of X vary (or disperse) around the expected value.
Variance cannot be negative.
\begin{align*} V(X) &= E[(X-E[X] )^2]\\ &= E[X^2 - 2XE[X] + E[X]^2 ] \\ &= E[X^2] - 2E[X]^2 + E[X]^2 \\ &= E[X^2] - E[X]^2 \end{align*}
- V(c) = 0
- V(cX) = c^2V(X)
- V(cX+d) = c^2V(X)
The square root of V(X) is called the standard deviation of the random variable X.
SD[X] = \sigma = \sqrt{V(X)}
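A short sketch verifying the definition V[X] = E[(X-\mu)^2], the shortcut E[X^2] - E[X]^2, and the scaling property V(cX+d) = c^2 V(X), using a fair six-sided die as a made-up example distribution:

```python
from fractions import Fraction
from math import sqrt, isclose

# Fair six-sided die as the example distribution (an illustrative choice).
xs = [1, 2, 3, 4, 5, 6]
p = [Fraction(1, 6)] * 6

def E(vals):
    """Expectation of a function of X, given as its list of values."""
    return sum(v * q for v, q in zip(vals, p))

mu = E(xs)                                    # E[X] = 7/2

# Definition: V[X] = E[(X - mu)^2]
var_def = E([(x - mu) ** 2 for x in xs])
# Shortcut:   V[X] = E[X^2] - E[X]^2
var_short = E([x ** 2 for x in xs]) - mu ** 2
assert var_def == var_short == Fraction(35, 12)

# V(cX + d) = c^2 V(X): the shift d drops out, the scale enters squared.
c, d = 3, 10
shifted = [c * x + d for x in xs]
mu_s = E(shifted)
assert E([(v - mu_s) ** 2 for v in shifted]) == c ** 2 * var_def

# Standard deviation: the square root of the variance.
sd = sqrt(var_def)
assert isclose(sd ** 2, float(var_def))
```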
Mean Vectors
Let X and Y be random vectors (or random matrices) of the same dimension, and let A and B be constant matrices.
E(X) = \begin{bmatrix}E(X_1)\\E(X_2)\\ \vdots \\ E(X_p) \end{bmatrix} = \begin{bmatrix} \mu_1 \\ \mu_2 \\ \vdots \\ \mu_p \end{bmatrix} = \mathbf{\mu}
E(X+Y) = E(X)+E(Y)
E(AXB) = AE(X)B
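A quick numerical sketch of these rules (the means and the matrix A below are arbitrary made-up values): componentwise sample means obey the same linearity as expectations.

```python
import numpy as np

rng = np.random.default_rng(0)
# Two random 3-vectors, sampled many times (a small simulation sketch).
X = rng.normal(loc=[1.0, 2.0, 3.0], scale=1.0, size=(100_000, 3))
Y = rng.normal(loc=[0.0, -1.0, 1.0], scale=1.0, size=(100_000, 3))

# E(X) is the vector of componentwise means, and E(X + Y) = E(X) + E(Y).
EX, EY = X.mean(axis=0), Y.mean(axis=0)
assert np.allclose((X + Y).mean(axis=0), EX + EY)

# E(AX) = A E(X) for a constant matrix A (linearity of expectation).
A = np.array([[1.0, 0.0, 2.0],
              [0.0, 1.0, -1.0]])
assert np.allclose((X @ A.T).mean(axis=0), A @ EX)
```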
Covariance Matrix
\begin{align*}\Sigma(X) &= E[(X-\mu)(X-\mu)^T] \\ & \\ &= \begin{bmatrix} E[(X_1-\mu_1)(X_1-\mu_1)] & E[(X_1-\mu_1)(X_2-\mu_2)] & \cdots & E[(X_1-\mu_1)(X_p-\mu_p)]\\ E[(X_2-\mu_2)(X_1-\mu_1)] & E[(X_2-\mu_2)(X_2-\mu_2)] & \cdots & E[(X_2-\mu_2)(X_p-\mu_p)]\\ \vdots & \vdots & \ddots & \vdots \\ E[(X_p-\mu_p)(X_1-\mu_1)] & E[(X_p-\mu_p)(X_2-\mu_2)] & \cdots & E[(X_p-\mu_p)(X_p-\mu_p)] \end{bmatrix} \\ & \\ &= \begin{bmatrix} \sigma_{11} & \sigma_{12} & \cdots & \sigma_{1p} \\ \sigma_{21} & \sigma_{22} & \cdots & \sigma_{2p} \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_{p1} & \sigma_{p2} & \cdots & \sigma_{pp} \end{bmatrix} \end{align*}
Correlation Matrix
\begin{align*} \rho_X &= \begin{bmatrix} \frac{\sigma_{11}}{\sqrt{\sigma_{11}}\sqrt{\sigma_{11}}} & \frac{\sigma_{12}}{\sqrt{\sigma_{11}}\sqrt{\sigma_{22}}} & \cdots & \frac{\sigma_{1p}}{\sqrt{\sigma_{11}}\sqrt{\sigma_{pp}}} \\ \frac{\sigma_{21}}{\sqrt{\sigma_{22}}\sqrt{\sigma_{11}}} & \frac{\sigma_{22}}{\sqrt{\sigma_{22}}\sqrt{\sigma_{22}}} & \cdots & \frac{\sigma_{2p}}{\sqrt{\sigma_{22}}\sqrt{\sigma_{pp}}} \\ \vdots &\vdots & \ddots & \vdots \\ \frac{\sigma_{1p}}{\sqrt{\sigma_{11}}\sqrt{\sigma_{pp}}} & \frac{\sigma_{2p}}{\sqrt{\sigma_{22}}\sqrt{\sigma_{pp}}} & \cdots & \frac{\sigma_{pp}}{\sqrt{\sigma_{pp}}\sqrt{\sigma_{pp}}} \end{bmatrix} \\ & \\ &= \begin{bmatrix} 1& \rho_{12} & \cdots & \rho_{1p} \\ \rho_{21} & 1 & \cdots & \rho_{2p} \\ \vdots & \vdots & \ddots & \vdots \\ \rho_{p1} & \rho_{p2} & \cdots & 1 \end{bmatrix} \end{align*}
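In code, the correlation matrix is obtained by dividing each \sigma_{ij} by \sqrt{\sigma_{ii}}\sqrt{\sigma_{jj}}; a small sketch with a made-up covariance matrix:

```python
import numpy as np

# A hypothetical 3x3 covariance matrix Sigma (illustrative values only).
Sigma = np.array([[4.0, 2.0, 0.4],
                  [2.0, 9.0, 1.2],
                  [0.4, 1.2, 1.0]])

# rho_ij = sigma_ij / (sqrt(sigma_ii) * sqrt(sigma_jj)),
# i.e. divide row i and column j by the corresponding standard deviations.
d = np.sqrt(np.diag(Sigma))
rho = Sigma / np.outer(d, d)

assert np.allclose(np.diag(rho), 1.0)      # unit diagonal
assert np.allclose(rho, rho.T)             # symmetric, like Sigma
assert np.all(np.abs(rho) <= 1.0)          # correlations lie in [-1, 1]
```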
Let X_1 and X_2 be random variables and let a,\,b,\,c be constants.
E(cX_1) = cE(X_1) = c\mu_1
Var(cX_1) = E[(cX_1 - c\mu_1)^2] = c^2 Var(X_1) = c^2 \sigma_{11}
Cov(aX_1,bX_2) = ab\,Cov(X_1,X_2)
For the linear combination aX_1 + bX_2
E(aX_1 + bX_2) = aE(X_1) + bE(X_2) = a \mu_1 + b \mu_2
\begin{align*} Var(aX_1 + bX_2) &= a^2 Var(X_1) + b^2 Var(X_2) + 2ab \, Cov(X_1,X_2) \\ &= a^2 \sigma_{11} + b^2 \sigma_{22} + 2ab\sigma_{12} \end{align*}
With C^T = [a,b], aX_1 + bX_2 can be written as
aX_1 + bX_2 = \begin{bmatrix} a & b \end{bmatrix} \begin{bmatrix} X_1 \\ X_2 \end{bmatrix} = C^T X
E(aX_1 + bX_2) = E(C^T X) = C^T E(X) = \begin{bmatrix} a & b \end{bmatrix} \begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix} = a \mu_1 + b \mu_2
Var(aX_1 + bX_2) = V(C^T X) = C^TV(X)C = a^2 \sigma_{11} + b^2 \sigma_{22} + 2ab\sigma_{12}
The linear combination C^TX = C_1X_1 + C_2X_2 +\cdots + C_pX_p
E[C^TX] = C^T\mu_X
Var[C^TX] = C^T\Sigma_X C
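As a small check (the mean vector, covariance matrix, and coefficients below are illustrative, not from the lecture), the vector identities E[C^TX] = C^T\mu_X and Var[C^TX] = C^T\Sigma_X C agree with the scalar formula a^2\sigma_{11} + b^2\sigma_{22} + 2ab\,\sigma_{12}:

```python
import numpy as np

# Hypothetical mean vector and covariance matrix for X = (X_1, X_2).
mu = np.array([1.0, 2.0])
Sigma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])      # sigma_11 = 2, sigma_12 = 0.5, sigma_22 = 1
a, b = 3.0, -1.0
c = np.array([a, b])                # coefficient vector C

# E[C^T X] = C^T mu  and  Var[C^T X] = C^T Sigma C.
mean_lc = c @ mu
var_lc = c @ Sigma @ c

# Same results as the scalar formulas for a X_1 + b X_2.
assert np.isclose(mean_lc, a * mu[0] + b * mu[1])
assert np.isclose(var_lc, a**2 * 2.0 + b**2 * 1.0 + 2 * a * b * 0.5)
```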
In general, consider q linear combinations of the p random variables X_1,X_2,\cdots,X_p. Let C be a q \times p matrix of constants.
Z = \begin{bmatrix}Z_1\\Z_2\\\vdots \\ Z_q \end{bmatrix} = \begin{bmatrix} C_{11} & C_{12} & \cdots & C_{1p} \\ C_{21} & C_{22} & \cdots & C_{2p} \\ \vdots & \vdots & \ddots & \vdots \\ C_{q1} & C_{q2} & \cdots & C_{qp} \end{bmatrix} \begin{bmatrix} X_1 \\ X_2 \\ \vdots \\ X_p \end{bmatrix} = CX
\mu_Z = E(Z) = E(CX) = CE(X) = C \mu_X
\Sigma_Z = Cov(Z) = Cov(CX) = C\,Cov(X)C^T = C \Sigma_X C^T
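These identities can be checked numerically with a small simulation (the mean vector, covariance matrix, and coefficient matrix C below are made-up values). The sample mean and sample covariance of Z = CX match C\mu_X and C\Sigma_X C^T exactly, because the sample mean is linear and the sample covariance is bilinear in the data:

```python
import numpy as np

rng = np.random.default_rng(1)
# Simulate n draws of a p = 3 dimensional random vector X.
n, p = 200_000, 3
X = rng.multivariate_normal(mean=[0.0, 1.0, 2.0],
                            cov=[[2.0, 0.3, 0.0],
                                 [0.3, 1.0, 0.2],
                                 [0.0, 0.2, 1.5]],
                            size=n)

# q = 2 linear combinations: Z = CX (each row of C gives one combination).
C = np.array([[1.0, -1.0, 0.0],
              [0.5, 0.5, 1.0]])
Z = X @ C.T

mu_X = X.mean(axis=0)
Sigma_X = np.cov(X, rowvar=False)

# mu_Z = C mu_X  and  Sigma_Z = C Sigma_X C^T hold for the sample statistics.
assert np.allclose(Z.mean(axis=0), C @ mu_X)
assert np.allclose(np.cov(Z, rowvar=False), C @ Sigma_X @ C.T)
```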
※ This post was written based on my notes from Professor 김성범's Predictive Models lectures in the Department of Industrial and Management Engineering at Korea University.