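I have a large string that I need to convert into a DataFrame. For example, the string is:
meals_string = "APPETIZERS Southern Fried Quail with Greens, Huckleberries, Pecans & Blue Cheese 14.00 Park Avenue Cafe Chopped Salad Goat Feta Cheese, Nigoise Olives, Marinated White. ENTREES Horseradish Crusted Canadian Salmon, Potato Fritters, Marinated Cucumbers, Chive Vinaigrette 27.00 Sautéed Prawns with Mushroom Tortellini, Grilled Tomato Vinaigrette & Sweet Corn 29.50"
meals = meals_string.splitlines() gives me a list, but I need to turn the string into a DataFrame with 3 columns: Category; Meal_name; Price.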
Posted on 2018-01-21 23:09:25
You can build a relatively simple parser for your string and pass its output directly to pandas.DataFrame, like this:
Code:
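def meal_string_parser(meal_string):
    """Yield (category, meal name, price) tuples from a menu string."""
    category = ''
    meal = []
    price = 0
    for word in meal_string.split():
        if word:
            try:
                # a word that parses as a number is a price and ends the meal
                price = float(word)
                yield category, ' '.join(meal), price
                meal = []
            except ValueError:
                # this is not a number, so not a price
                if word.upper() == word and word.isalnum():
                    # found category
                    category = word
                else:
                    meal.append(word)
    if meal:
        # trailing words with no price of their own reuse the last price seen
        yield category, ' '.join(meal), price
Test Code: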
meals_string = """
APPETIZERS
Southern Fried Quail with Greens,Huckleberries,Pecans & Blue Cheese 14.00
Park Avenue Cafe Chopped Salad Goat Feta Cheese,Nigoise Olives,Marinated White 13.00
ENTREES
Horseradish Crusted Canadian Salmon,Potato Fritters, Marinated Cucumbers,Chive Vinaigrette 27.00
Sautéed Prawns with Mushroom Tortellini,Grilled Tomato Vinaigrette & Sweet Corn 29.50
"""
import pandas as pd
df = pd.DataFrame(meal_string_parser(meals_string),
                  columns='Category Meal_name Price'.split())
print(df)
Results:
     Category                                          Meal_name  Price
0  APPETIZERS  Southern Fried Quail with Greens,Huckleberries...   14.0
1  APPETIZERS  Park Avenue Cafe Chopped Salad Goat Feta Chees...   13.0
2     ENTREES  Horseradish Crusted Canadian Salmon,Potato Fri...   27.0
3     ENTREES  Sautéed Prawns with Mushroom Tortellini,Grille...   29.5
https://stackoverflow.com/questions/48367861
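If the menu text really does arrive with one item per line, as the splitlines() attempt in the question suggests, a line-based variant of the same idea may be easier to follow. The sketch below is only an illustration, not part of the original answer: the function name parse_menu_lines and the shortened sample text are hypothetical, and it assumes every meal line ends with its price and every line without a trailing number is a category heading.

```python
def parse_menu_lines(text):
    """Return (category, meal_name, price) tuples, one per priced line.

    Assumption: each meal sits on its own line and ends with its price;
    any line without a trailing number is treated as a category heading.
    """
    rows = []
    category = ''
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split off the last space-separated token; it may be a price.
        head, _, tail = line.rpartition(' ')
        try:
            price = float(tail)
        except ValueError:
            # No trailing number, so remember the line as the current category.
            category = line
            continue
        rows.append((category, head.strip(), price))
    return rows


# Shortened, hypothetical sample in the same shape as the test data above.
sample = """
APPETIZERS
Southern Fried Quail with Greens,Huckleberries,Pecans & Blue Cheese 14.00
ENTREES
Sautéed Prawns with Mushroom Tortellini,Grilled Tomato Vinaigrette & Sweet Corn 29.50
"""

rows = parse_menu_lines(sample)
for row in rows:
    print(row)
```

The resulting list of tuples can be passed to pd.DataFrame(rows, columns=['Category', 'Meal_name', 'Price']) in the same way as the generator above. Unlike the word-based parser, this version would also accept multi-word category headings, but it breaks down if the text has no line breaks at all.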