TF-IDF Algorithm Code Example
0. Import dependencies
import numpy as np  # numerical computing: matrix and vector operations
import pandas as pd  # data analysis on tabular data
1. Define the data and preprocess it
# Define the documents
docA = 'The cat sat on my bed'
docB = 'The dog sat on my knees'
# Split each document into a list of words (bag of words)
bowA = docA.split(' ')
bowB = docB.split(' ')
# bowA # ['The', 'cat', 'sat', '