深入机器学习系列14-FactorAnalysis

1. Introduction

An extension ofprincipal component analysis(PCA)in the sense of approximating covariance matrix.

Goal

To describe the covariance relationships among many variables in terms of a few underlying unobservable random variables, called factors.

To reduce dimensions and solve the problem with n

2. Orthogonal Factor Model(正交因子模型)

A Factor Analysis Example

We have a training data. Here is its scatter plot.

Generate a k dimension variable

There exists a transformation matrixwhich maps F into n dimension space:

Add a meanon

For real instance has errors, add error

Factor Analysis Model

Suppose

The factor model postulates thatis linearly related to a few unobservable random variables, calledcommon factors(共同因子), through

whereis the matrix offactor loading(因子载荷),is the loading of variableon factor,,are called errors orspecific factors(特殊因子).

Assume:

If, it becomes oblique factor model(斜交因子模型)

Define thecommunity(变量共同度,或公因子方差):

Define thespecific variance(特殊因子方差):

Ambiguity of L

Let T be any m × m orthogonal matrix. Then, we can express

where,

Since,,andform another pair of factor and factor loading matrix.

After rotation, communitydoesn’t change.

3. Estimation

3.1 Principal Component Method

1) Get correlation matrix

2) Spectral Decompositions

3) Determine

Rule of thumb: choose

4) Estimation

The contribution to the total sample variance tr(S) from the first common factor is then(公共因子的方差贡献)

In general, the proportion of total sample variance(after standardization) due to thefactor=

3.2 Maximum Likelihood Method

1) Joint distribution:

2) Marginal distribution:

3) Conditional distribution:

4) Log likelihood:

EM estimation

E Step:

M Step:

Parameter Iteration:

Get more detail on【机器学习-斯坦福】因子分析(Factor Analysis)

4. Factor Rotation

An orthogonal matrix, and let.

Goal:to rotatesuch that a ‘simple’ structure is achieved.

Kaiser (1958)’svarimaxcriterion(方差最大旋转) :

1) define

2) chooses.t.

5. Factor Scores

Weighted Least Squares Method

Suppose that,, andare known.

Then

Regression Method

From the mean of the conditional distribution ofis

  • 发表于:
  • 原文链接:http://kuaibao.qq.com/s/20180108G0EPLX00?refer=cp_1026

扫码关注云+社区