What is PCA?

Figures are cited from the following article, which is recommended reading: A step by step explanation of Principal Component Analysis

PCA, Principal Component Analysis, is a dimensionality-reduction method. It reduces the number of variables in a data set by using one or more components to represent the original data.

Principal components are constructed as linear combinations of the initial variables.

Geometrically speaking, principal components are new axes chosen so that the projections of the data points onto them are as spread out as possible.

The more spread out the projections are, the more variance the axis carries and the more information it keeps. This is how PCA can reduce the dimensionality while preserving as much information as possible.
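
As a small illustration of "more spread out means more variance", the sketch below projects a few 2-D points onto two different axes and compares the variance of the projections. The points and the helper name projected_variance are made up for this example; NumPy is assumed to be available.

```python
import numpy as np

# Hypothetical 2-D points that lie close to the line y = x.
points = np.array([
    [-2.0, -1.8],
    [-1.0, -1.1],
    [0.0, 0.1],
    [1.0, 0.9],
    [2.0, 1.9],
])
centered = points - points.mean(axis=0)

def projected_variance(data, angle_degrees):
    """Variance of the data after projecting it onto a unit axis at the given angle."""
    theta = np.radians(angle_degrees)
    axis = np.array([np.cos(theta), np.sin(theta)])  # unit vector
    return (data @ axis).var(ddof=1)

along = projected_variance(centered, 45)    # axis along the cloud of points
across = projected_variance(centered, 135)  # axis perpendicular to it
print(along, across)
```

The axis at 45 degrees spreads the points out far more than the one at 135 degrees, so it is the better candidate for a first principal component of this toy data.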

Step 1: Standardization

This step transforms all the variables to the same scale, because PCA is quite sensitive to the variances of the initial variables: a variable with a much larger range would otherwise dominate the result.
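
A minimal sketch of this step, assuming the usual z-score form of standardization (subtract each variable's mean, divide by its standard deviation). The data and the function name standardize are hypothetical.

```python
import numpy as np

# Hypothetical toy data: 5 samples (rows), 2 variables (columns) on very different scales.
X = np.array([
    [1.0, 200.0],
    [2.0, 240.0],
    [3.0, 310.0],
    [4.0, 390.0],
    [5.0, 460.0],
])

def standardize(X):
    """Return z-scores: each column has mean 0 and standard deviation 1."""
    mean = X.mean(axis=0)
    std = X.std(axis=0, ddof=1)  # sample standard deviation; ddof=0 is also a common choice
    return (X - mean) / std

Z = standardize(X)
print(Z)
```

After this step both columns contribute on an equal footing, regardless of their original units.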

Step 2: Compute the Covariance Matrix

This matrix shows how each pair of variables varies together. A high correlation between two variables means they carry redundant information.
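
One way to compute the matrix, sketched with NumPy on made-up data. The variable names are assumptions for this example. Because the data is standardized first, the result here coincides with the correlation matrix of the original variables.

```python
import numpy as np

# Hypothetical data: 5 samples (rows), 3 variables (columns).
X = np.array([
    [1.0, 2.1, 9.0],
    [2.0, 3.9, 7.5],
    [3.0, 6.2, 8.0],
    [4.0, 8.1, 6.0],
    [5.0, 9.8, 7.0],
])

# Standardize (Step 1), then compute the covariance matrix by hand.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
n_samples = Z.shape[0]
C = Z.T @ Z / (n_samples - 1)  # 3 x 3, symmetric

print(C)
# The same matrix via the library routine (columns are variables, hence rowvar=False).
print(np.cov(Z, rowvar=False))
```

Entry (i, j) of C describes how variables i and j move together; values near 1 or -1 off the diagonal point to redundancy between those two variables.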

Step 3: Compute the eigenvectors and eigenvalues of the covariance matrix

The eigenvectors of the covariance matrix give the directions of the principal components, since these are the directions along which the data has the most variance. Each eigenvalue is the amount of variance carried by the corresponding principal component.
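
A short sketch of the decomposition using np.linalg.eigh, which is intended for symmetric matrices such as a covariance matrix. The 2 x 2 matrix below is a made-up example, not taken from the article.

```python
import numpy as np

# Hypothetical covariance matrix of two standardized, strongly correlated variables.
C = np.array([
    [1.0, 0.8],
    [0.8, 1.0],
])

# eigh returns eigenvalues in ascending order; eigenvectors are the columns.
eigenvalues, eigenvectors = np.linalg.eigh(C)

# Reorder so that the component with the most variance comes first.
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

print(eigenvalues)         # variance carried by each principal component
print(eigenvectors[:, 0])  # direction of the first principal component
```

The sign of each eigenvector is arbitrary, so a direction and its negation describe the same principal axis.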

Step 4: Keep p components

Rank the eigenvalues from highest to lowest. For example, PC1 may carry 95% of the variance and PC2 the remaining 5%. We can keep all the components or discard the less significant ones.
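
Putting the steps together, the sketch below ranks the components by their share of the variance and projects the data onto the first p of them. The function name keep_components and the data are hypothetical, and the input is assumed to be standardized already.

```python
import numpy as np

def keep_components(Z, p):
    """Project standardized data Z onto its first p principal components.

    Returns the projected data and the share of variance carried by every component.
    """
    C = np.cov(Z, rowvar=False)
    eigenvalues, eigenvectors = np.linalg.eigh(C)
    order = np.argsort(eigenvalues)[::-1]          # highest eigenvalue first
    eigenvalues = eigenvalues[order]
    eigenvectors = eigenvectors[:, order]
    explained_ratio = eigenvalues / eigenvalues.sum()
    W = eigenvectors[:, :p]                        # keep the first p directions
    return Z @ W, explained_ratio

# Hypothetical data: 6 samples, 2 correlated variables.
X = np.array([
    [2.5, 2.4],
    [0.5, 0.7],
    [2.2, 2.9],
    [1.9, 2.2],
    [3.1, 3.0],
    [2.3, 2.7],
])
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

scores, explained_ratio = keep_components(Z, p=1)
print(explained_ratio)  # share of variance per component, highest first
print(scores.shape)     # 6 samples described by 1 component instead of 2 variables
```

How many components to keep is a judgment call; a common approach is to keep enough of them for the cumulative share of variance to pass a chosen threshold.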


