- Regression Analysis
- Analyzes the statistical causal relationships between related variables in order to explain a variable of interest
- independent variable: causes
- dependent variable: outcomes
- Regression Model
- Simple Regression Model
- 𝑿 ⇨ 𝒀
- Observations: (𝑿₁,𝒀₁), (𝑿₂,𝒀₂), ... , (𝑿𝘯,𝒀𝘯) (𝑛 is the number of observations)
- Simple Regression Model:
- 𝒀𝑖 = 𝜷₀ + 𝜷₁𝑿𝑖 + ℇ𝑖, 𝑖 = 1,2, ... , 𝑛
- ℇ𝑖: error term.
- Assume that it follows a normal distribution with mean 0 and variance 𝛔²
- ℇ𝑖~𝙉𝙤𝙧(0,𝛔²)
- 𝑿 is not a random variable but a given (fixed) value
- So three parameters need to be estimated:
- 𝜷₁: slope of the linear equation
- 𝜷₀: intercept
- 𝛔²: variance of the error term
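To make the model assumptions concrete, here is a minimal sketch that generates observations from 𝒀𝑖 = 𝜷₀ + 𝜷₁𝑿𝑖 + ℇ𝑖 with normally distributed errors. The function name `simulate_observations`, the seed, and the parameter values (𝜷₀ = 1, 𝜷₁ = 2, 𝛔 = 0.5) are assumptions chosen for illustration, not values from the lecture.

```python
import random


def simulate_observations(xs, beta0, beta1, sigma, seed=0):
    """Draw Y_i = beta0 + beta1 * X_i + e_i with e_i ~ Nor(0, sigma^2)."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    ys = []
    for x in xs:
        # Error term: mean 0, standard deviation sigma (variance sigma^2).
        error = rng.gauss(0.0, sigma)
        ys.append(beta0 + beta1 * x + error)
    return ys


# X is treated as given (not random); only the error term is random.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = simulate_observations(xs, beta0=1.0, beta1=2.0, sigma=0.5)
```

In practice the three parameters are unknown; the simulation only shows what kind of data the model is assumed to produce.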
- Estimation of intercept 𝜷₀ and slope 𝜷₁
- Use the least squares method
- Choose 𝜷₀ and 𝜷₁ to minimize the objective function 𝐐
- Objective function 𝐐
- Sum of the squared differences between each observed value 𝒀𝑖 of the dependent variable and the fitted value on the line, 𝜷₀ + 𝜷₁𝑿𝑖
- 𝐐 = ∑(𝒀𝑖 - 𝜷₀ - 𝜷₁𝑿𝑖)²
- How?
- The (𝑿𝑖,𝒀𝑖) are observed values, so 𝐐 is a function of 𝜷₀ and 𝜷₁ only
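- Partially differentiate 𝐐 with respect to 𝜷₀ and set the result to 0
- ∂𝐐/∂𝜷₀ = -2∑(𝒀𝑖 - 𝜷₀ - 𝜷₁𝑿𝑖) = 0
- Partially differentiate 𝐐 with respect to 𝜷₁ and set the result to 0
- ∂𝐐/∂𝜷₁ = -2∑(𝒀𝑖 - 𝜷₀ - 𝜷₁𝑿𝑖)𝑿𝑖 = 0
- The values of 𝜷₀ and 𝜷₁ that solve these two equations are the estimates 𝜷₀-hat and 𝜷₁-hat
- Estimated equation: 𝒀-hat = 𝜷₀-hat + 𝜷₁-hat * 𝑿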
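Solving the two first-order conditions gives the standard closed-form estimates 𝜷₁-hat = ∑(𝑿𝑖 - 𝑿-bar)(𝒀𝑖 - 𝒀-bar) / ∑(𝑿𝑖 - 𝑿-bar)² and 𝜷₀-hat = 𝒀-bar - 𝜷₁-hat * 𝑿-bar. The notes do not write this solution out, so the sketch below is one way to compute it; the function name `fit_simple_regression` and the toy data are assumptions for illustration.

```python
def fit_simple_regression(xs, ys):
    """Least squares estimates (b0, b1) for Y = b0 + b1 * X."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sum of cross-deviations and sum of squared deviations of X.
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b1 = s_xy / s_xx           # slope estimate
    b0 = y_bar - b1 * x_bar    # intercept estimate
    return b0, b1


# Toy data (hypothetical), not taken from the lecture.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 5.0, 8.0]
b0_hat, b1_hat = fit_simple_regression(xs, ys)

# Both first-order conditions should hold (up to rounding) at the estimates.
residuals = [y - b0_hat - b1_hat * x for x, y in zip(xs, ys)]
sum_resid = sum(residuals)
sum_resid_x = sum(e * x for e, x in zip(residuals, xs))
```

Here `sum_resid` and `sum_resid_x` correspond to the two sums that the partial derivatives set to zero, so checking that they are close to 0 is a quick sanity test of the fit.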
- Estimation of variance of the error term 𝛔²
- Use the sample variance of the residuals
- Residual
- Subtract the estimated value from the observed value of 𝒀
- 𝒆𝑖 = 𝒀𝑖 - 𝒀𝑖-hat = 𝒀𝑖 - (𝜷₀-hat + 𝜷₁-hat * 𝑿𝑖)
- SSE
- Residual (error) sum of squares
- SSE = ∑(𝒀𝑖 - 𝒀𝑖-hat)² = ∑𝒆𝑖²
- Estimate 𝛔² by using MSE
- 𝛔²-hat = MSE (Mean Squared Error) = SSE / (𝑛 - 2)
- (𝑛 - 2) is the degrees of freedom: two parameters, 𝜷₀ and 𝜷₁, are estimated from the 𝑛 observations
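The residual, SSE, and MSE steps can be sketched as follows. The function name `estimate_error_variance` and the toy data are assumptions for illustration; the values 𝜷₀-hat = 0.0 and 𝜷₁-hat = 1.9 passed in are the least squares estimates for this particular toy data set, not results from the lecture.

```python
def estimate_error_variance(xs, ys, b0_hat, b1_hat):
    """Return (SSE, MSE) for the fitted line Y-hat = b0_hat + b1_hat * X."""
    n = len(xs)
    if n <= 2:
        raise ValueError("need more than 2 observations for n - 2 degrees of freedom")
    # Residual: observed value minus fitted value.
    residuals = [y - (b0_hat + b1_hat * x) for x, y in zip(xs, ys)]
    sse = sum(e ** 2 for e in residuals)  # residual (error) sum of squares
    mse = sse / (n - 2)                   # estimate of the error variance
    return sse, mse


# Toy data (hypothetical) with its least squares estimates.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 5.0, 8.0]
sse, mse = estimate_error_variance(xs, ys, b0_hat=0.0, b1_hat=1.9)
```

For this data the residuals are roughly 0.1, 0.2, -0.7, and 0.4, so SSE is about 0.70 and MSE is about 0.70 / (4 - 2) = 0.35.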
Friday, March 29, 2019
[K-MOOC] Data Analytics for Forecasting and Classification: 1-1. Regression analysis, Simple regression model, Model estimation