Posted on 2022-04-09 16:46:22
This should work for you:
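for i in range(len(df)):
    # e.g. "$8,500 - $10,500 a month" -> ["$8,500", "-", "$10,500", "a", "month"]
    splitted_value = df["salary"].iloc[i].split()
    # the last word is the period: "year" -> "Yearly", "hour" -> "Hourly"
    salary_type = (splitted_value[-1] + "ly").title()
    if "-" in splitted_value:
        # a range: average all dollar amounts
        ranged_salary = [int(x.replace("$", "").replace(",", "")) for x in splitted_value if "$" in x]
        salary = sum(ranged_salary) / len(ranged_salary)
    else:
        # a single amount is the third token from the end, e.g. "$1,546 a week"
        salary = int(splitted_value[-3].replace("$", "").replace(",", ""))
    # look up the label for position i so a non-default index still works
    df.loc[df.index[i], "salary_value"] = salary
    df.loc[df.index[i], "salary_type"] = salary_type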
Posted on 2022-04-09 16:50:11
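This is an interesting question, but next time please provide the input data so that we can copy/paste it.
What you need is a function that converts a salary string into a value and a salary type.
Parse the characters of the string to find the digits, and flip a boolean switch when a - (dash) character is encountered, in case an average needs to be computed.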
lst = [
    "Up to $80,000 a year",
    "$8,500 - $10,500 a month",
    "$25 - $40 an hour",
    "$1,546 a week"
]

def convert(salary_data: str):
    value = ""
    value_max = ""
    need_average = False
    # iterate over the characters in the string
    for c in salary_data:
        if c.isdigit():
            if need_average:
                value_max += c
            else:
                value += c
        elif c == "-":
            # switch to adding to value_max after finding the dash
            need_average = True
    if not need_average:
        # slight cheating for the f-string below
        value_max = value
    value = f"{(int(value) + int(value_max)) / 2:.2f}"
    if "hour" in salary_data:
        salary_type = "hourly"
    elif "week" in salary_data:
        salary_type = "weekly"
    elif "month" in salary_data:
        salary_type = "monthly"
    else:
        # use this as fallback
        salary_type = "yearly"
    return value, salary_type

for element in lst:
    value, salary_type = convert(element)
    print(value, salary_type)
Output
80000.00 yearly
9500.00 monthly
32.50 hourly
1546.00 weekly
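One caveat of the character-by-character approach is that every digit counts toward the amount, so a string such as "$12.50 an hour" would be read as 1250. Below is a minimal sketch of a regex-based variant that keeps decimals. The names `AMOUNT_RE`, `PERIODS` and `parse_salary` are hypothetical, the "day" entry is an assumption that does not appear in the sample data, and the sketch assumes every amount is prefixed with `$`.

```python
import re
from statistics import mean

# Matches dollar amounts such as "$80,000", "$25" or "$12.50" (assumed format).
AMOUNT_RE = re.compile(r"\$(\d[\d,]*(?:\.\d+)?)")

# Hypothetical mapping from the period word to the salary type label.
PERIODS = {
    "hour": "hourly",
    "day": "daily",
    "week": "weekly",
    "month": "monthly",
    "year": "yearly",
}


def parse_salary(text: str):
    """Return (average amount as float, salary type) for one salary string."""
    # Strip thousands separators, then convert each matched amount to float.
    amounts = [float(m.replace(",", "")) for m in AMOUNT_RE.findall(text)]
    if not amounts:
        raise ValueError(f"no dollar amount found in {text!r}")
    # Fall back to "yearly" when no known period word is present.
    salary_type = next(
        (label for word, label in PERIODS.items() if word in text), "yearly"
    )
    # A single amount averages to itself; a range averages its endpoints.
    return mean(amounts), salary_type


for element in ["$8,500 - $10,500 a month", "$12.50 an hour"]:
    print(parse_salary(element))
```

With pandas, something like `df["salary"].apply(parse_salary)` would give a Series of `(value, type)` tuples that can then be split into two columns.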
https://stackoverflow.com/questions/71812599