I have a DataFrame called training, shown below:
+------------------+------+
| features| MEDV|
+------------------+------+
| [6.575,4.98,15.3]|504000|
| [6.421,9.14,17.8]|453600|
| [7.185,4.03,17.8]|728700|
| [6.998,2.94,18.7]|701400|
+------------------+------+

I ran a linear regression on this dataset:
from pyspark.ml.regression import LinearRegression
lr=LinearRegression(featuresCol='features',
                    predictionCol='predictions')
lrModel=lr.fit(training)

Error:
Py4JJavaError: An error occurred while calling o51.fit.
: java.lang.IllegalArgumentException: label does not exist. Available: features, MEDV
at org.apache.spark.sql.types.StructType.$anonfun$apply$1(StructType.scala:275)
at scala.collection.immutable.Map$Map2.getOrElse(Map.scala:147)
at org.apache.spark.sql.types.StructType.apply(StructType.scala:274)
at org.apache.spark.ml.util.SchemaUtils$.checkNumericType(SchemaUtils.scala:75)
at org.apache.spark.ml.PredictorParams.validateAndTransformSchema(Predictor.scala:53)
at org.apache.spark.ml.PredictorParams.validateAndTransformSchema$(Predictor.scala:46)
at org.apache.spark.ml.regression.LinearRegression.org$apache$spark$ml$regression$LinearRegressionParams$$super$validateAndTransformSchema(LinearRegression.scala:176)
at org.apache.spark.ml.regression.LinearRegressionParams.validateAndTransformSchema(LinearRegression.scala:119)
at org.apache.spark.ml.regression.LinearRegressionParams.validateAndTransformSchema$(LinearRegression.scala:107)
at org.apache.spark.ml.regression.LinearRegression.validateAndTransformSchema(LinearRegression.scala:176)
at org.apache.spark.ml.Predictor.transformSchema(Predictor.scala:178)
at org.apache.spark.ml.PipelineStage.transformSchema(Pipeline.scala:75)
at org.apache.spark.ml.Predictor.fit(Predictor.scala:134)
at org.apache.spark.ml.Predictor.fit(Predictor.scala:116)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)

What is this label that does not exist?
Posted on 2020-06-15 19:28:04
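The parameter that names the label column is labelCol, and its default value is label. Because the code never sets labelCol, Spark looks for a column named label, which does not exist in the DataFrame (the only columns are features and MEDV).
Replacing predictionCol='predictions' with labelCol='MEDV' should solve the problem.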
Here is the link to the API documentation.
https://stackoverflow.com/questions/62387133