My requirement is as follows.
I join two DataFrames as shown below:
     var c = a.join(b,keys,"fullouter")
c.printSchema() gives the following:
     |-- add: string (nullable = true)
     |-- sub: string (nullable = true)
     |-- delete: string (nullable = true)
     |-- mul: long (nullable = true)
     |-- ADD: string (nullable = true)
     |-- SUB: string (nullable = true)
     |-- DELETE: string (nullable = true)
     |-- MUL: long (nullable = true)
It's good until here. Now I am adding a column with a when condition, as follows:
     val d = c.withColumn("column", when(c("a.add") === c("b.ADD"), "Neardata"))
The error message is as follows:
    Exception in thread "main" org.apache.spark.sql.AnalysisException: 
    Cannot resolve column name "a.add"
I also tried the following:
     val d = c.withColumn("column", when(col("a.add") === col("b.ADD"), "Neardata"))
Again error.
Please suggest.

Posted on 2020-04-23 04:54:02
You have to define the aliases with dataframe.as("a") and dataframe1.as("b").
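Applied to the join from the question, the alias approach might look like the minimal sketch below. The sample rows, the `id` key column, and the names `AliasJoinSketch` and `labelMatches` are assumptions made for illustration; only `a`, `b`, `keys`, the `add`/`ADD` columns and the `"Neardata"` label come from the question.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, when}

object AliasJoinSketch {
  // Hypothetical helper: joins a and b under the aliases "a" and "b",
  // then labels rows where a.add equals b.ADD. Other rows get null.
  def labelMatches(a: DataFrame, b: DataFrame, keys: Seq[String]): DataFrame = {
    // The aliases are what make the "a." and "b." qualifiers resolvable.
    val c = a.as("a").join(b.as("b"), keys, "fullouter")
    c.withColumn("column", when(col("a.add") === col("b.ADD"), "Neardata"))
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("alias-join-sketch")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample data; "id" is an assumed join key.
    val a = Seq((1, "x", 10L), (2, "y", 20L)).toDF("id", "add", "mul")
    val b = Seq((1, "x", 10L), (3, "z", 30L)).toDF("id", "ADD", "MUL")

    labelMatches(a, b, Seq("id")).show()
    spark.stop()
  }
}
```

The qualifiers matter here because Spark resolves column names case-insensitively by default, so an unqualified `add` would be ambiguous between the two sides of the join.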
Example:
  import spark.sqlContext.implicits._
  import org.apache.spark.sql.functions.{col, when}

  val data = List(("James","","Smith","36636","M",60000),
    ("Michael","Rose","","40288","M",70000),
    ("Robert","","Williams","42114","",400000),
    ("Maria","Anne","Jones","39192","F",500000),
    ("Jen","Mary","Brown","","F",0))
  val cols = Seq("first_name","middle_name","last_name","dob","gender","salary")
  // Alias the DataFrame as "a" so that col("a.gender") can be resolved
  val df = spark.createDataFrame(data).toDF(cols:_*).as("a")
  // The new column is literally named "a.new_gender", dot included
  val df2 = df.withColumn("a.new_gender", when(col("a.gender") === "M","Male")
    .when(col("a.gender") === "F","Female")
    .otherwise("Unknown"))
  df2.show()
Output:
+----------+-----------+---------+-----+------+------+------------+
|first_name|middle_name|last_name|  dob|gender|salary|a.new_gender|
+----------+-----------+---------+-----+------+------+------------+
|     James|           |    Smith|36636|     M| 60000|        Male|
|   Michael|       Rose|         |40288|     M| 70000|        Male|
|    Robert|           | Williams|42114|      |400000|     Unknown|
|     Maria|       Anne|    Jones|39192|     F|500000|      Female|
|       Jen|       Mary|    Brown|     |     F|     0|      Female|
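+----------+-----------+---------+-----+------+------+------------+
I think that without an alias you are effectively accessing the column like this, which may be the cause of the error:
  // "df" is only a Scala variable name, not a DataFrame alias,
  // so the qualifier in "df.gender" cannot be resolved
  val df2 = df.withColumn("df.new_gender", when(col("df.gender") === "M","Male")
    .when(col("df.gender") === "F","Female")
    .otherwise("Unknown"))
  df2.show()

https://stackoverflow.com/questions/61374524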