I'm a beginner with ML.NET and I have a problem with my data. When I pass it to pipeline.Fit(...), I get the following error:
Column 'Temperature' has values of I4which is not the same as earlier observed type of R4.
Here is my code:
try
{
    var mlContext = new MLContext();
    var reader = mlContext.Data.CreateTextReader<TrainData>(separatorChar: ',', hasHeader: false);
    var trainData = _context.Datas.Last();
    IDataView trainingdataView = reader.Read(Path.Combine(hostingEnvironment.WebRootPath, "data010220192341.txt"));
    var pipeline = mlContext.Transforms.Conversion.MapValueToKey("Delay")
        .Append(mlContext.Transforms.Categorical.OneHotEncoding("StationDepart"))
        .Append(mlContext.Transforms.Categorical.OneHotEncoding("StationArrival"))
        .Append(mlContext.Transforms.Categorical.OneHotEncoding("Day"))
        .Append(mlContext.Transforms.Categorical.OneHotEncoding("Train"))
        .Append(mlContext.Transforms.Categorical.OneHotEncoding("WeatherText"))
        .Append(mlContext.Transforms.Categorical.OneHotEncoding("HasPrecipitation"))
        .Append(mlContext.Transforms.Categorical.OneHotEncoding("PrecipitationType"))
        .Append(mlContext.Transforms.Concatenate("Features", "StationDepart", "StationArrival", "Day", "Train", "WeatherText", "Temperature", "Humidity", "HasPrecipitation", "PrecipitationType", "Time"))
        .Append(mlContext.MulticlassClassification.Trainers.StochasticDualCoordinateAscent(labelColumn: "Delay", featureColumn: "Features"))
        .Append(mlContext.Transforms.Conversion.MapKeyToValue("PredictedTime"));
    var model = pipeline.Fit(trainingdataView);
    var prediction = model.CreatePredictionEngine<TrainData, TrainPrediction>(mlContext).Predict(
        new TrainData()
        {
            StationDepart = "Charleroi-Sud",
            StationArrival = "Mons",
            Day = "Friday",
            Train = "BE.NMBS.IC3825",
            WeatherText = "Partly cloudy",
            Temperature = -1,
            Humidity = 0,
            HasPrecipitation = false,
            PrecipitationType = null,
            Time = 0444
        });
    return prediction.PredictedTime.ToString();
}
catch (Exception e)
{
    return e.Message;
}
So after loading the data from a text file, I encode the string columns and then try to train the model, and that is where I get the error. My data is:
Charleroi-Sud,Mons,Thursday,BE.NMBS.IC3831,Partly sunny,-2,0,False,1044,0
Charleroi-Sud,Mons,Thursday,BE.NMBS.IC932,Cloudy,-2,0,False,1112,0
Charleroi-Sud,Mons,Thursday,BE.NMBS.IC3832,Mostly cloudy,-1,0,False,1144,0
Charleroi-Sud,Mons,Thursday,BE.NMBS.IC933,Cloudy,-1,0,False,1212,0
Charleroi-Sud,Mons,Thursday,BE.NMBS.IC3842,Mostly cloudy,-1,0,False,2144,0
Charleroi-Sud,Mons,Thursday,BE.NMBS.IC943,Mostly cloudy,-1,0,False,2212,0
Charleroi-Sud,Mons,Thursday,BE.NMBS.IC3843,Mostly cloudy,-1,0,000,False,False,2247,0
Charleroi-Sud,Mons,Friday,BE.NMBS.IC3825,Partly cloudy,-1,0,0
Charleroi-Sud,Mons,Friday,BE.NMBS.IC3826,Mostly cloudy,-1,0,False,0544,0
Charleroi-Sud,Mons,Friday,BE.NMBS.IC927,Cloudy,-1,0,False,062,0
As you can see, the fields are separated by ',' and the temperature is an integer. The TrainData class looks like this:
public class TrainData
{
    [LoadColumn(0)]
    public string StationDepart { get; set; }
    [LoadColumn(1)]
    public string StationArrival { get; set; }
    [LoadColumn(2)]
    public string Day { get; set; }
    [LoadColumn(3)]
    public string Train { get; set; }
    [LoadColumn(4)]
    public string WeatherText { get; set; }
    [LoadColumn(5)]
    public int Temperature { get; set; }
    [LoadColumn(6)]
    public int Humidity { get; set; }
    [LoadColumn(7)]
    public bool HasPrecipitation { get; set; }
    [LoadColumn(8)]
    public string PrecipitationType { get; set; }
    [LoadColumn(9)]
    public int Time { get; set; }
    [LoadColumn(10)]
    public int Delay { get; set; }
}
Answered 2019-02-04 13:59:50
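The problem was that Time, Delay, and Temperature must be float instead of int (Humidity is changed as well in the class below). Concatenate requires all of its input columns to share the same type, and the one-hot encoded columns are float (R4), so an int (I4) column such as Temperature triggers the type mismatch.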
public class TrainData
{
    [LoadColumn(0)]
    public string StationDepart { get; set; }
    [LoadColumn(1)]
    public string StationArrival { get; set; }
    [LoadColumn(2)]
    public string Day { get; set; }
    [LoadColumn(3)]
    public string Train { get; set; }
    [LoadColumn(4)]
    public string WeatherText { get; set; }
    [LoadColumn(5)]
    public float Temperature { get; set; }
    [LoadColumn(6)]
    public float Humidity { get; set; }
    [LoadColumn(7)]
    public bool HasPrecipitation { get; set; }
    [LoadColumn(8)]
    public string PrecipitationType { get; set; }
    [LoadColumn(9)]
    public float Time { get; set; }
    [LoadColumn(10)]
    public float Delay { get; set; }
}
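If changing the property types is not an option, another possibility is to convert the column inside the pipeline before it reaches Concatenate. The following is only a minimal sketch under stated assumptions: it targets ML.NET 1.x or later (the question uses an older preview API, where the names differ), and the Row class and BuildFeatures helper are hypothetical names invented for this illustration, not part of the original code.

```csharp
using System;
using Microsoft.ML;
using Microsoft.ML.Data;

// Hypothetical two-column row type, used only for this illustration.
public class Row
{
    public int Temperature { get; set; }   // loaded as Int32 (I4)
    public float Humidity { get; set; }    // loaded as Single (R4)
}

public static class ConvertSketch
{
    // Converts the int column to float, then concatenates it with the
    // float column into a single "Features" vector.
    public static IDataView BuildFeatures(MLContext mlContext, IDataView data)
    {
        var pipeline = mlContext.Transforms.Conversion
            .ConvertType("Temperature", outputKind: DataKind.Single)
            .Append(mlContext.Transforms.Concatenate("Features", "Temperature", "Humidity"));

        return pipeline.Fit(data).Transform(data);
    }

    public static void Main()
    {
        var mlContext = new MLContext();
        var data = mlContext.Data.LoadFromEnumerable(new[]
        {
            new Row { Temperature = -2, Humidity = 0f },
            new Row { Temperature = -1, Humidity = 0f },
        });

        var transformed = BuildFeatures(mlContext, data);

        // Print the type of the concatenated column to inspect the result.
        Console.WriteLine(transformed.Schema["Features"].Type);
    }
}
```

Declaring the properties as float, as in the answer above, is the simpler fix when you control the data class; the in-pipeline conversion is mainly useful when the class is shared with other code that expects int.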
https://stackoverflow.com/questions/54488258