Chapter 2 Linear Regression Model

An Economic Model: the observations y are random variables:

Assume: Y follows a normal distribution.

  1. How about ? The distribution keeps the same variance but is shifted toward the new mean, $2000.

NOTE: From f(y) we can get confidence intervals, point estimates and probabilities.

Simple linear Regression (SLP)

Parameters of the model E(y|x) = β_1 + β_2 x: β_1 = intercept, β_2 = slope = dE(y|x)/dx

If the constant variance assumption holds, the data are said to be homoskedastic; otherwise, they are said to be heteroskedastic.

Characteristic


  1. Independence → cov(y_i, y_j) = 0, but cov(y_i, y_j) = 0 ≠ Independence

Optional ~ compare:
meals

Error term

Decompose the observation y into 2 components:

  1. E(y|x) = β_1 + β_2 x = systematic component of y.

  2. e = y − E(y|x) = random component.

  3. Data = Model + Uncertainty, i.e. y = β_1 + β_2 x + e.

Error term has an expectation of zero

e = y − E(y|x); take the expected value of both sides: E(e|x) = E(y|x) − E(y|x) = 0.

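Notes: x is not a random variable (r.v.), so E(y|x) is a constant for a given x.

e and y differ only by that constant, so they have the same variance: var(e) = var(y) = σ^2. If y is normal, so is e.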

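Ordinary Least Squares: estimating the model parameters

The error is measured by the vertical distances between the data points and the line.

  1. Find the best-fit line such that the sum of squared vertical distances is a minimum.

  2. Result: ŷ_i = b_1 + b_2 x_i, where ê_i = y_i − ŷ_i are the LS residuals.

  3. Let SSE (sum of squared errors) = Σ ê_i^2 = Σ (y_i − b_1 − b_2 x_i)^2.

  4. Thus, find b_1 and b_2 such that SSE is a minimum.

  5. Solution (least squares estimators): b_2 = Σ (x_i − x̄)(y_i − ȳ) / Σ (x_i − x̄)^2 and b_1 = ȳ − b_2 x̄; the fitted line passes through (x̄, ȳ) = centroid.

    Notes: to avoid a vertical slope (division by zero in b_2), x must take at least 2 different values.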
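
The least squares solution above can be sketched in a few lines of Python. This is a minimal illustration, assuming the standard closed-form estimators for a line with one regressor; the function name `least_squares_fit` and the demo data are made up for illustration and do not come from the notes.

```python
def least_squares_fit(x, y):
    """Return (b1, b2), the intercept and slope that minimize
    SSE = sum((y_i - b1 - b2 * x_i) ** 2)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need at least 2 paired observations")
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of squared deviations of x; zero means all x values are equal
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must take at least 2 different values")
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b2 = sxy / sxx            # slope
    b1 = y_bar - b2 * x_bar   # intercept: line passes through the centroid
    return b1, b2


# Made-up illustrative data (hypothetical, not from the notes)
x_demo = [1.0, 2.0, 3.0, 4.0, 5.0]
y_demo = [2.1, 3.9, 6.2, 7.8, 10.1]
b1_demo, b2_demo = least_squares_fit(x_demo, y_demo)

# LS residuals and their sum of squares
residuals_demo = [yi - (b1_demo + b2_demo * xi) for xi, yi in zip(x_demo, y_demo)]
sse_demo = sum(r ** 2 for r in residuals_demo)
```

The guard on `sxx` mirrors the note about needing at least two different x values: with a single x value the slope formula divides by zero.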


SLP application and goal: to predict

Food expenditure example: y = weekly food expenditure in $, x = weekly income in $100s.

Interpretation:

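  1. Slope b_2 = 10.21: if weekly income increases by $100 (x increases by 1), we expect food expenditure to increase by $10.21;

  2. y-intercept b_1: no interpretation, since x = 0 (zero income) is not within the range of the data;

  3. Mathematically, ŷ is computable for any x, but is it valid? Only for x within the range observed in the sample (in-sample); for x outside that range (out-of-sample) the prediction is an extrapolation.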

Elasticity

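ε = (percentage change in y) / (percentage change in x) = (dy/dx)·(x/y), where dy/dx is the slope along a specified curve.

For the linear relationship E(y|x) = β_1 + β_2 x, the slope is β_2, so ε = β_2 · x / E(y|x).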

The elasticity will depend on the choice of x and y, thus it is usually evaluated at the point of the means (x̄, ȳ): ε̂ = b_2 x̄ / ȳ.
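
As a rough numerical illustration of the elasticity of a linear relationship, the sketch below evaluates slope · x / y at one point. The slope 10.21 is taken from the food expenditure example; the evaluation point (x = 20, y = 300) is a hypothetical choice made for this sketch and is not a value from the notes, and `point_elasticity` is an illustrative name.

```python
def point_elasticity(slope, x, y):
    """Elasticity of y with respect to x at the point (x, y):
    (dy/dx) * (x / y)."""
    if y == 0:
        raise ValueError("elasticity is undefined at y = 0")
    return slope * x / y


# Slope from the food expenditure example; the point (x, y) is hypothetical.
# x = 20 means weekly income of $2000 (income is measured in $100s).
elasticity_demo = point_elasticity(10.21, x=20.0, y=300.0)

# The same slope gives a different elasticity at a different point,
# which is why the evaluation point has to be stated.
elasticity_other = point_elasticity(10.21, x=10.0, y=200.0)
```

Because the slope is constant along a line while x / y is not, the elasticity changes from point to point even though the marginal effect does not.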

Estimators

Gauss Markov Theorem

Assuming SR1-SR5 hold, the least squares estimators b_1 and b_2 are the best linear unbiased estimators (BLUE) of β_1 and β_2.

Variance, Covariance and the Least Squares Estimators

Unbiased estimator of the error variance: σ̂^2 = Σ ê_i^2 / (N − 2) = SSE / (N − 2), where N is the number of observations.
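
A minimal sketch of the error variance estimate, assuming the usual simple-regression formulas: SSE divided by N − 2, and the slope variance σ^2 / Σ (x_i − x̄)^2 with σ^2 replaced by its estimate. The function names and the demo data are hypothetical; the values `b1_demo` and `b2_demo` are the least squares estimates for this made-up data set.

```python
import math


def error_variance_estimate(x, y, b1, b2):
    """Unbiased estimate of the error variance: SSE / (N - 2)."""
    n = len(x)
    if n != len(y) or n <= 2:
        raise ValueError("need more than 2 paired observations")
    sse = sum((yi - b1 - b2 * xi) ** 2 for xi, yi in zip(x, y))
    return sse / (n - 2)


def slope_standard_error(x, sigma2_hat):
    """Estimated standard error of the slope:
    sqrt(sigma2_hat / sum((x_i - x_bar) ** 2))."""
    x_bar = sum(x) / len(x)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must take at least 2 different values")
    return math.sqrt(sigma2_hat / sxx)


# Made-up illustrative data (hypothetical, not from the notes)
x_demo = [1.0, 2.0, 3.0, 4.0, 5.0]
y_demo = [2.1, 3.9, 6.2, 7.8, 10.1]
b1_demo, b2_demo = 0.05, 1.99  # least squares estimates for this data

sigma2_demo = error_variance_estimate(x_demo, y_demo, b1_demo, b2_demo)
se_b2_demo = slope_standard_error(x_demo, sigma2_demo)
```

Dividing by N − 2 rather than N accounts for the two parameters estimated before the residuals can be computed.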

Nonlinearities in Simple Regression

Log-linear function: ln(y) = β_1 + β_2 x

Quadratic Function,

Notes: β_1 > 0, and β_3 < 0, diminishing marginal effect
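
To make the diminishing marginal effect concrete, here is a small sketch. It assumes a quadratic of the generic form y = a + (linear_coef)·x + (quad_coef)·x^2, which may not match the exact parameterization used in the notes; the coefficient values are hypothetical and chosen only for illustration.

```python
def quadratic_marginal_effect(linear_coef, quad_coef, x):
    """Slope dy/dx of y = a + linear_coef * x + quad_coef * x**2 at x."""
    return linear_coef + 2.0 * quad_coef * x


def turning_point(linear_coef, quad_coef):
    """Value of x at which the marginal effect becomes zero."""
    if quad_coef == 0:
        raise ValueError("no turning point when the quadratic term is zero")
    return -linear_coef / (2.0 * quad_coef)


# Hypothetical coefficients: positive linear term, negative quadratic term
linear_demo, quad_demo = 0.3, -0.006

effect_at_1 = quadratic_marginal_effect(linear_demo, quad_demo, 1.0)
effect_at_10 = quadratic_marginal_effect(linear_demo, quad_demo, 10.0)
peak_x = turning_point(linear_demo, quad_demo)
```

With a negative coefficient on the squared term, each additional unit of x adds less to y than the previous one, and beyond `peak_x` the effect turns negative.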

Regression with indicator variables (dummy variables):

Interpretation:

, Standard error to and are and respectively, total numbers of observation is 526, , and

  1. Male (dummy = 0): y-intercept = avg. hourly wage for men;
    Female (dummy = 1): y-intercept + slope = avg. hourly wage for women;

  2. Is the hourly wage difference between men and women statistically significant?
    Yes. The t-statistic on the slope is large in absolute value, so we reject H_0 (no wage difference); p-value ≈ 0.
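
The interpretation above can be checked numerically: in a regression on a single 0/1 indicator, the least squares intercept equals the mean of the base group and the slope equals the difference in group means. The sketch below uses made-up wages (hypothetical, not the 526-observation data set referred to in the notes), and `dummy_regression` is an illustrative name.

```python
import math


def dummy_regression(y, d):
    """Least squares fit of y on an intercept and a 0/1 indicator d.
    Returns (intercept, slope, se_slope, t_stat)."""
    n = len(y)
    n1 = sum(d)          # observations with d = 1
    n0 = n - n1          # observations with d = 0
    if n != len(d) or n0 < 1 or n1 < 1 or n <= 2:
        raise ValueError("need both groups and more than 2 observations")
    mean0 = sum(yi for yi, di in zip(y, d) if di == 0) / n0
    mean1 = sum(yi for yi, di in zip(y, d) if di == 1) / n1
    intercept = mean0            # average of the base group (d = 0)
    slope = mean1 - mean0        # difference in group averages
    sse = sum((yi - intercept - slope * di) ** 2 for yi, di in zip(y, d))
    sigma2_hat = sse / (n - 2)   # unbiased error variance estimate
    d_bar = n1 / n
    sxx = sum((di - d_bar) ** 2 for di in d)
    se_slope = math.sqrt(sigma2_hat / sxx)
    # t-statistic for H_0: slope = 0 (undefined if the fit is perfect)
    t_stat = slope / se_slope if se_slope > 0 else float("nan")
    return intercept, slope, se_slope, t_stat


# Made-up hourly wages (hypothetical); indicator is 1 for women, 0 for men
wage_demo = [8.0, 7.0, 9.0, 8.0, 5.0, 6.0, 4.0, 5.0]
female_demo = [0, 0, 0, 0, 1, 1, 1, 1]
intercept_demo, slope_demo, se_demo, t_demo = dummy_regression(wage_demo, female_demo)
```

A t-statistic that is large in absolute value relative to the usual critical values is what leads to rejecting the hypothesis of no difference between the two groups.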