Code:
df1 = df.withColumn("Col3",
    when(col("Col2") == "Tree", exp(-50 * col("Col1"))))

Error message: TypeError: 'Column' object is not callable
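How do I use the exponential function on a column? I need to compute the exponential both with and without a column, for example (1 - exp(-50)).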
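Posted on 2021-07-28 20:05:47
This works fine for me; I can't see anything wrong with it.
Possibly your col1 is a string, in which case you need to cast it to float.
String to float conversion: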
from pyspark.sql import functions as F
from pyspark.sql.types import DoubleType

# withColumn returns a new DataFrame, so assign the result back
sparkDF = sparkDF.withColumn('col1', F.col('col1').cast(DoubleType()))

sample_col1_values = list(map(float,"""5.000001911924775E-4| |6.999999313363147E-4| |7.000007191664842E-4| |8.999982752162926E-4| |9.000003378596106E-4
| |4.000000531183995E-4| |3.000005084098568E-4| |2.999999164269311E-4| |2.999999999999999E-4""".split("| |")))
sample_col2_values = ['Tree', 'Tree', 'Tree', 'Root', 'Root', 'Root', 'Leaf', 'Leaf', 'Leaf']
input_list = list(zip(sample_col1_values, sample_col2_values))
# sql is an already-created Spark session / SQL context
sparkDF = sql.createDataFrame(input_list, ['col1', 'col2'])
# Rows where col2 is not "Tree" get null because there is no otherwise() branch
sparkDF = sparkDF.withColumn(
    "col3",
    F.when(F.col("col2") == "Tree", F.exp(-50 * F.col("col1")))
)
sparkDF.show()
+--------------------+----+------------------+
|                col1|col2|              col3|
+--------------------+----+------------------+
|5.000001911924776E-4|Tree|0.9753099027047368|
|6.999999313363147E-4|Tree|0.9656054195726678|
|7.000007191664842E-4|Tree|0.9656053815360145|
|8.999982752162926E-4|Root|              null|
|9.000003378596106E-4|Root|              null|
|4.000000531183995E-4|Root|              null|
|3.000005084098568E-4|Leaf|              null|
|2.999999164269311E-4|Leaf|              null|
|2.999999999999999E-4|Leaf|              null|
+--------------------+----+------------------+

https://stackoverflow.com/questions/68559949
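
The question also asks about computing 1 - exp(-50) with and without the column. A minimal sketch of one way to do both is given here, assuming col1 is already numeric. The helper name add_exp_columns, the output column names, and the rate parameter are all made up for illustration and are not part of the original post. The column-free term is an ordinary Python float, so it is wrapped in F.lit; the column-dependent term uses F.exp so it is evaluated per row.

```python
import math

# Column-free term: 1 - exp(-50). No Spark column is involved, so plain
# Python math is enough. exp(-50) is about 1.9e-22, far below double
# precision resolution near 1, so this evaluates to exactly 1.0.
ONE_MINUS_EXP_CONST = 1.0 - math.exp(-50.0)


def add_exp_columns(df, value_col="col1", rate=50.0):
    """Return df with two extra columns (hypothetical names, for illustration)."""
    # Imported inside the function so the constant above can be used
    # even where Spark is not installed.
    from pyspark.sql import functions as F

    return (
        df
        # With the column: 1 - exp(-rate * value), computed row by row.
        .withColumn("one_minus_exp_col", 1 - F.exp(-rate * F.col(value_col)))
        # Without the column: the same constant on every row, wrapped in F.lit.
        .withColumn("one_minus_exp_const", F.lit(ONE_MINUS_EXP_CONST))
    )
```

With the sparkDF built above, add_exp_columns(sparkDF).show() should add both columns. Note that the constant term comes out as 1.0, so if the intent was to scale by the column, the first form is probably the one you want.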