[K-MOOC] Data Analytics for Forecasting and Classification: 1-1. Regression analysis, Simple regression model, Model estimation - Hubert Life (out2)

Friday, March 29, 2019

Posted at 5:56 AM in analysis, Bioinformatics, estimation, K-MOOC, regression, study

Regression Analysis

Purpose: to explain a variable by analyzing the statistical causal relationships between it and related variables
independent variable: causes
dependent variable: outcomes

Regression Model

Simple Regression Model

𝑿 ⇨ 𝒀
Observations: (𝑿₁,𝒀₁), (𝑿₂,𝒀₂), ... , (𝑿𝘯,𝒀𝘯) (𝑛 is the number of observations)
Simple Regression Model:

𝒀𝑖 = 𝜷₀ + 𝜷₁𝑿𝑖 + ℇ𝑖, 𝑖 = 1,2, ... , 𝑛

ℇ𝑖: error term.

Assume that it follows a normal distribution with mean 0 and variance 𝛔²
ℇ𝑖~𝙉𝙤𝙧(0,𝛔²)

𝑿 is not a random variable, but a given (fixed) value
so, three parameters need to be estimated

𝜷₁: slope of the linear equation
𝜷₀: intercept
𝛔²: variance of the error term
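
To make the model concrete, here is a minimal sketch that generates data from 𝒀𝑖 = 𝜷₀ + 𝜷₁𝑿𝑖 + ℇ𝑖. The function name `simulate`, the 𝑿 values, and the parameter values (2.0, 0.5, 1.0) are made up for illustration and do not come from the lecture.

```python
import random

def simulate(x_values, beta0, beta1, sigma, seed=0):
    """Draw Y_i = beta0 + beta1 * X_i + e_i with e_i ~ Normal(0, sigma^2)."""
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    return [beta0 + beta1 * x + rng.gauss(0.0, sigma) for x in x_values]

# X is treated as given (not random); these values are hypothetical.
x = [1.0, 2.0, 3.0, 4.0, 5.0]

# Noisy observations: the three unknowns are beta0, beta1 and sigma^2.
y = simulate(x, beta0=2.0, beta1=0.5, sigma=1.0)

# With sigma = 0 every point lies exactly on the line beta0 + beta1 * X.
y_noise_free = simulate(x, beta0=2.0, beta1=0.5, sigma=0.0)
print(y_noise_free)
```

In practice only (𝑿𝑖, 𝒀𝑖) are seen, and the three parameters have to be recovered from them.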

Estimation of intercept 𝜷₀ and slope 𝜷₁

Use the least squares method
to minimize the objective function 𝐐
objective function 𝐐

the sum of squared differences between each observed value of the dependent variable 𝒀𝑖 and the fitted value 𝜷₀ + 𝜷₁𝑿𝑖 on the regression line
𝐐 = ∑(𝒀𝑖 - 𝜷₀ - 𝜷₁𝑿𝑖)²
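
As a rough sketch, 𝐐 can be evaluated for any candidate pair (𝜷₀, 𝜷₁). The function name `q` and the toy data are assumptions for illustration; the data were chosen to lie exactly on the line 1 + 2𝑿.

```python
def q(beta0, beta1, x, y):
    """Objective Q: sum of squared gaps between Y_i and beta0 + beta1 * X_i."""
    return sum((yi - beta0 - beta1 * xi) ** 2 for xi, yi in zip(x, y))

# Hypothetical data lying exactly on the line Y = 1 + 2X.
x = [1.0, 2.0, 3.0, 4.0]
y = [3.0, 5.0, 7.0, 9.0]

q_true = q(1.0, 2.0, x, y)  # the true line leaves no gap at any point
q_off = q(0.0, 2.0, x, y)   # intercept off by 1: each of the 4 gaps equals 1

print(q_true, q_off)
```

A worse candidate line gives a larger 𝐐, which is why minimizing 𝐐 picks out the best-fitting line.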

How to?

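(𝑿,𝒀) are observed values, so 𝐐 is a function of 𝜷₀ and 𝜷₁ only
partially differentiate 𝐐 with respect to 𝜷₀ and set the result to zero
∂𝐐/∂𝜷₀ = -2∑(𝒀𝑖 - 𝜷₀ - 𝜷₁𝑿𝑖) = 0
partially differentiate 𝐐 with respect to 𝜷₁ and set the result to zero
∂𝐐/∂𝜷₁ = -2∑(𝒀𝑖 - 𝜷₀ - 𝜷₁𝑿𝑖)𝑿𝑖 = 0
solving these two equations for 𝜷₀ and 𝜷₁ gives the estimates 𝜷₀-hat and 𝜷₁-hat
estimated equation: 𝒀-hat = 𝜷₀-hat + 𝜷₁-hat * 𝑿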
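
The two zero-derivative conditions can be solved by hand: the first gives 𝜷₀-hat = 𝒀-bar - 𝜷₁-hat * 𝑿-bar, and substituting that into the second gives 𝜷₁-hat = ∑(𝑿𝑖 - 𝑿-bar)(𝒀𝑖 - 𝒀-bar) / ∑(𝑿𝑖 - 𝑿-bar)². The notes do not write out this closed form, so the sketch below is one standard way to compute it; the function name `fit_simple_regression` and the data are hypothetical.

```python
def fit_simple_regression(x, y):
    """Return (b0_hat, b1_hat) minimizing Q = sum((y_i - b0 - b1 * x_i)^2).

    Assumes the x values are not all identical (otherwise s_xx is zero).
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1_hat = s_xy / s_xx             # slope
    b0_hat = y_bar - b1_hat * x_bar  # intercept
    return b0_hat, b1_hat

# Hypothetical observations.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

b0_hat, b1_hat = fit_simple_regression(x, y)

# Both partial-derivative conditions should hold (up to rounding).
resid = [yi - b0_hat - b1_hat * xi for xi, yi in zip(x, y)]
cond_b0 = sum(resid)                                 # from dQ/d(beta0) = 0
cond_b1 = sum(r * xi for r, xi in zip(resid, x))     # from dQ/d(beta1) = 0

print(b0_hat, b1_hat)  # approximately 2.2 and 0.6
```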

Estimation of variance of the error term 𝛔²

Use the sample variance of the residuals

residual
subtract the estimated value from the observed value of 𝒀
𝒆𝑖 = 𝒀𝑖 - 𝒀𝑖-hat = 𝒀𝑖 - (𝜷₀-hat + 𝜷₁-hat * 𝑿𝑖)
SSE
residual/error sum of squares
= ∑(𝒀𝑖 - 𝒀𝑖-hat)²
estimate 𝛔² by using MSE
𝛔²-hat = MSE (Mean Squared Error) = SSE / (𝑛 - 2)
(𝑛 - 2) is the degrees of freedom: 𝑛 observations minus the two estimated parameters 𝜷₀ and 𝜷₁
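
Putting the pieces together, a small sketch that computes the residuals, SSE, and MSE for a fitted line. The data and variable names are hypothetical, and the fit uses the usual closed-form least squares solution, which is an assumption since the notes do not spell it out.

```python
# Hypothetical observations.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
n = len(x)

# Least squares estimates of the slope and intercept (closed form).
x_bar = sum(x) / n
y_bar = sum(y) / n
b1_hat = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
          / sum((xi - x_bar) ** 2 for xi in x))
b0_hat = y_bar - b1_hat * x_bar

# Residual e_i = Y_i - Y_i-hat, where Y_i-hat = b0_hat + b1_hat * X_i.
y_hat = [b0_hat + b1_hat * xi for xi in x]
residuals = [yi - yh for yi, yh in zip(y, y_hat)]

# SSE: sum of squared residuals. MSE divides by the n - 2 degrees of freedom.
sse = sum(e ** 2 for e in residuals)
mse = sse / (n - 2)  # estimate of sigma^2; needs n > 2

print(sse, mse)  # approximately 2.4 and 0.8
```

Dividing by 𝑛 - 2 rather than 𝑛 accounts for the two parameters already estimated from the same data.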
