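Can I get sample code for reading data from a CSV? I need to generate training and test data from a CSV in TensorFlow.
A single CSV contains both the training and the test data: the first 10 rows are for training and the next 10 rows are for testing. Thanks.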
Posted on 2017-08-18 20:58:12
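The TensorFlow team has written an excellent tutorial that covers this. It shows how to read census data from a CSV, convert it to tensors, and fit and evaluate a machine learning model with the high-level Estimator API.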
However, I ran into an error when I tried to use the urllib function, so I modified the code slightly to read the data directly with pandas.
Original code
import tempfile
import urllib
train_file = tempfile.NamedTemporaryFile()
test_file = tempfile.NamedTemporaryFile()
urllib.urlretrieve("https://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.data", train_file.name)
urllib.urlretrieve("https://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.test", test_file.name)
import pandas as pd
CSV_COLUMNS = [
"age", "workclass", "fnlwgt", "education", "education_num",
"marital_status", "occupation", "relationship", "race", "gender",
"capital_gain", "capital_loss", "hours_per_week", "native_country",
"income_bracket"]
df_train = pd.read_csv(train_file.name, names=CSV_COLUMNS, skipinitialspace=True)
df_test = pd.read_csv(test_file.name, names=CSV_COLUMNS, skipinitialspace=True, skiprows=1)
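The answer does not say which error urllib raised. One possible cause, assuming the code was run under Python 3, is that urlretrieve lives in urllib.request there rather than directly in urllib. A minimal sketch under that assumption follows; download_to_temp is a hypothetical helper name, not part of the tutorial.

```python
import tempfile
import urllib.request  # Python 3 location of urlretrieve (assumes Python 3)


def download_to_temp(url):
    """Download `url` into a named temporary file and return the file object.

    Hypothetical helper for illustration only.
    """
    tmp = tempfile.NamedTemporaryFile()
    urllib.request.urlretrieve(url, tmp.name)
    # Keep a reference to the returned object: the file is deleted on close.
    return tmp


# Usage with the URLs from the original code (requires network access):
# train_file = download_to_temp(
#     "https://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.data")
# test_file = download_to_temp(
#     "https://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.test")
```

The returned objects can then be passed to pd.read_csv through their .name attribute, as in the original code.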
Modified code
import pandas as pd

# Column names for the census CSV files, which have no header row.
COLUMNS = ["age", "workclass", "fnlwgt", "education", "education_num",
           "marital_status", "occupation", "relationship", "race", "gender",
           "capital_gain", "capital_loss", "hours_per_week", "native_country",
           "income_bracket"]

# Read the training data straight from the URL; skipinitialspace strips the
# space that follows each comma in the files.
df_train = pd.read_csv(
    'http://mlr.cs.umass.edu/ml/machine-learning-databases/adult/adult.data',
    names=COLUMNS,
    skipinitialspace=True)

# Same for the test data, skipping the first line of the file.
df_test = pd.read_csv(
    'http://mlr.cs.umass.edu/ml/machine-learning-databases/adult/adult.test',
    names=COLUMNS,
    skipinitialspace=True,
    skiprows=1)
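The question asks for a single CSV to be split so that the first 10 rows are training data and the next 10 are test data, which the code above does not show. Here is a minimal sketch of that split using only the standard-library csv module. It assumes a headerless file; split_rows and the in-memory sample data are hypothetical and only illustrate the idea.

```python
import csv
import io
from itertools import islice


def split_rows(csv_file, n_train=10, n_test=10):
    """Return (train_rows, test_rows) read from a headerless CSV file object."""
    reader = csv.reader(csv_file, skipinitialspace=True)
    train_rows = list(islice(reader, n_train))  # first n_train rows
    test_rows = list(islice(reader, n_test))    # the next n_test rows
    return train_rows, test_rows


# Hypothetical in-memory CSV with 20 rows, standing in for a real file.
sample = io.StringIO("".join(f"{i}, {i * 2}\n" for i in range(20)))
train_rows, test_rows = split_rows(sample)
```

If the whole file is already loaded into a pandas DataFrame df, the same split can be written as df.iloc[:10] and df.iloc[10:20].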
https://stackoverflow.com/questions/45764762