
Python - Reduce a dictionary's value lists to smaller lists

Stack Overflow user
Asked 2018-07-18 05:33:30
Answers: 3 | Views: 293 | Followers: 0 | Votes: 0

I have a dictionary whose keys are recipe IDs and whose values are lists of ingredients:

recipe_dictionary  = { 134: ['salt', 'chicken', 'tomato paste canned'],
                       523: ['toast whole grain', 'feta cheese', 'egg', 'salt'],
                       12: ['chicken', 'rice', 'parsley']}

I also have a static list of ingredients that I do not want to repeat within a day:

non_repeatable_ingredients = ['egg', 'chicken', 'beef']

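Currently, I iterate over each value in the dictionary, then over the ingredient names, compare each name against the non_repeatable_ingredients list, and build a list of the words they share. My reduced dictionary would then look like this: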

reduced_recipe_dictionary = {134: ['chicken'],
                             523: ['egg'],
                             12: ['chicken']}

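This process takes a long time because my real dictionary and ingredient list are both long. Is there a faster way than the one below?

Here is the get_reduced_meal_plans_dictionry method: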

from fuzzywuzzy import fuzz

reduced_meal_plans_dictionary = {}

# For each recipe
for recipe in meal_plans_dictionary:

    # Temp list for the overlapping ingredients found in each recipe
    overlapped_ingredients_list = []

    # For each complete name of ingredient in the recipe
    for ingredient_complete_name in meal_plans_dictionary[recipe]:

        # Clean up the ingredient name as it sometimes involves comma, parentheses or spaces
        ingredient_string = ingredient_complete_name.replace(',', '').replace('(', '').replace(')', '').lower().strip()

        # Compare each ingredient name against the list of ingredients that shall not be repeated in a day
        for each in PROTEIN_TAGS:

            # Compute the partial similarity
            partial_similarity = fuzz.partial_ratio(ingredient_string, each.lower())

            # If above 90, means one of the ingredients in the PROTEIN_TAGS exists in this recipe
            if partial_similarity > 90:
                # Make a list of such ingredients for this recipe
                overlapped_ingredients_list.append(each.lower())

    # Place the recipe ID as the key and the reduced overlapped list as the value
    reduced_meal_plans_dictionary[recipe] = overlapped_ingredients_list

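I use replace and a similarity ratio because the ingredient names are not as clean as in my example; for instance, an ingredient might appear as "egg" or as "boiled egg".

Thanks.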


3 Answers

Stack Overflow user

Accepted answer

Posted 2018-07-18 05:59:50

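By combining a regular expression with a defaultdict, you can get exactly what you are looking for. The regular expression checks an ingredient name against every non-repeatable ingredient in a single search, which reduces the number of for loops needed.

Note that I have modified key 12 to show that it picks up two matches.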

import re
from collections import defaultdict

recipe_dictionary = {134: ['salt', 'chicken', 'tomato paste canned'],
                     523: ['toast whole grain', 'feta cheese', 'egg', 'salt'],
                     12: ['whole chicken', 'rice', 'parsley', 'egg']}
non_repeatable_ingredients = ['egg', 'chicken', 'beef']
# Build one alternation pattern: '(egg|chicken|beef)'
non_repeat = '(' + '|'.join(non_repeatable_ingredients) + ')'

d = defaultdict(list)
for k, j in recipe_dictionary.items():
    for i in j:
        # Search each ingredient name for any of the non-repeatable words
        m = re.search(non_repeat, i)
        if m:
            d[k].append(m.group(1))

print(d)
# defaultdict(<class 'list'>, {134: ['chicken'], 523: ['egg'], 12: ['chicken', 'egg']})
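A possible refinement of the regex approach, offered only as a sketch: the pattern is compiled once, each tag is passed through re.escape, and \b word boundaries keep 'egg' from matching inside 'eggplant'. The function name reduce_recipes and the sample ingredient 'Boiled Egg (large)' are illustrative assumptions, not part of the original answer.

```python
import re

recipe_dictionary = {
    134: ['salt', 'chicken', 'tomato paste canned'],
    523: ['toast whole grain', 'feta cheese', 'egg', 'salt'],
    12: ['whole chicken', 'rice', 'parsley', 'Boiled Egg (large)'],  # made-up messy name
}
non_repeatable_ingredients = ['egg', 'chicken', 'beef']

# Compile once. re.escape guards against regex metacharacters in tag names,
# and \b restricts matches to whole words.
pattern = re.compile(
    r'\b(?:' + '|'.join(map(re.escape, non_repeatable_ingredients)) + r')\b',
    re.IGNORECASE,
)

def reduce_recipes(recipes):
    """Map each recipe ID to the non-repeatable tags found in its ingredients."""
    reduced = {}
    for recipe_id, ingredients in recipes.items():
        found = []
        for name in ingredients:
            # findall returns every tag in the name, not just the first one
            for match in pattern.findall(name):
                tag = match.lower()
                if tag not in found:
                    found.append(tag)
        reduced[recipe_id] = found
    return reduced

reduced = reduce_recipes(recipe_dictionary)
print(reduced)
# {134: ['chicken'], 523: ['egg'], 12: ['chicken', 'egg']}
```

Unlike the defaultdict version, this sketch keeps a key (with an empty list) for recipes that have no match. Whole-word matching also means plurals such as 'eggs' are not caught; that would need extra handling.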
Votes: 0

Stack Overflow user

Posted 2018-07-18 05:46:51

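How about using sets instead of lists, since the ingredients in each recipe are unique and their order does not matter?

A membership test on a set takes O(1) time on average, whereas on a list it takes O(n) time.

For example: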

recipe_dictionary = { 
    134: set(['salt', 'chicken', 'tomato paste canned']),
    523: set(['toast whole grain', 'feta cheese', 'egg', 'salt']),
    12: set(['chicken', 'rice', 'parsley'])
}

non_repeatable_ingredients = set(['egg', 'chicken', 'beef'])

You can test whether an element is in the set like this:

for ingredient in recipe_dictionary[134]:
    if ingredient in non_repeatable_ingredients:
        pass  # do something with the matching ingredient
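The membership test above only finds exact names, so 'whole chicken' would be missed. One way to keep the set-based speed is to split each ingredient name into words and intersect them with the non-repeatable set. This is a rough sketch under the assumption that every non-repeatable ingredient is a single word; the helper name words_in is made up for illustration.

```python
recipe_dictionary = {
    134: ['salt', 'chicken', 'tomato paste canned'],
    523: ['toast whole grain', 'feta cheese', 'egg', 'salt'],
    12: ['whole chicken', 'rice', 'parsley'],
}
non_repeatable_ingredients = {'egg', 'chicken', 'beef'}

def words_in(ingredients):
    """Collect the lowercase words of all ingredient names into one set."""
    words = set()
    for name in ingredients:
        # Turn commas and parentheses into spaces before splitting on whitespace
        cleaned = name.replace(',', ' ').replace('(', ' ').replace(')', ' ')
        words.update(cleaned.lower().split())
    return words

# Set intersection (&) keeps only the words that are non-repeatable;
# sorted() gives a stable order because sets are unordered.
reduced_recipe_dictionary = {
    recipe_id: sorted(words_in(ingredients) & non_repeatable_ingredients)
    for recipe_id, ingredients in recipe_dictionary.items()
}
print(reduced_recipe_dictionary)
# {134: ['chicken'], 523: ['egg'], 12: ['chicken']}
```

Multi-word entries such as 'feta cheese' in the non-repeatable set would not match with this approach, and misspelled names still need fuzzy matching.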
Votes: 1

Stack Overflow user

Posted 2018-07-18 06:12:42

>>> reduced_recipe_dictionary = {k: list(filter(lambda x: x in non_repeatable_ingredients, v)) for k,v in recipe_dictionary.items()}
>>> reduced_recipe_dictionary
{134: ['chicken'], 523: ['egg'], 12: ['egg']}
>>> 

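If your ingredient names are not clean enough to match the items in the non_repeatable_ingredients list exactly, you can use fuzz.partial_ratio from the fuzzywuzzy module to pick up the closest matches (for example, those with a ratio above 80). Install it beforehand with pip install fuzzywuzzy.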

>>> from fuzzywuzzy import fuzz
>>> reduced_recipe_dictionary = {k: list(filter(lambda x: fuzz.partial_ratio(v,x) >80, non_repeatable_ingredients)) for k,v in recipe_dictionary.items()}
>>> reduced_recipe_dictionary
{134: ['chicken'], 523: ['egg'], 12: ['chicken']}
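Because the question's loop recomputes the fuzzy score every time an ingredient name reappears, one inexpensive speed-up may be to cache the result per distinct name with functools.lru_cache. The sketch below is self-contained, so it uses a hypothetical standard-library stand-in, partial_score, in place of fuzz.partial_ratio; it is not the same algorithm and its scores can differ. The names tags_for and reduce_meal_plans and the sample data are also assumptions.

```python
from difflib import SequenceMatcher
from functools import lru_cache

PROTEIN_TAGS = ('egg', 'chicken', 'beef')

def partial_score(short, long_):
    """Hypothetical stand-in for fuzz.partial_ratio: best 0-100 similarity of
    the shorter string against any same-length window of the longer one."""
    if len(short) > len(long_):
        short, long_ = long_, short
    if not short:
        return 0
    best = 0.0
    for start in range(len(long_) - len(short) + 1):
        window = long_[start:start + len(short)]
        best = max(best, SequenceMatcher(None, short, window).ratio())
    return round(100 * best)

@lru_cache(maxsize=None)
def tags_for(ingredient_name):
    """Return the tags matched by one ingredient name; cached per distinct name."""
    cleaned = (ingredient_name.replace(',', '').replace('(', '')
               .replace(')', '').lower().strip())
    return tuple(tag for tag in PROTEIN_TAGS if partial_score(tag, cleaned) > 90)

def reduce_meal_plans(meal_plans):
    """Map each recipe ID to the list of tags found in its ingredients."""
    reduced = {}
    for recipe_id, ingredients in meal_plans.items():
        found = []
        for name in ingredients:
            found.extend(tags_for(name))  # repeated names hit the cache
        reduced[recipe_id] = found
    return reduced

meal_plans_dictionary = {
    134: ['salt', 'chicken', 'tomato paste canned'],
    523: ['toast whole grain', 'feta cheese', 'Egg, boiled', 'salt'],
    12: ['chicken', 'rice', 'parsley'],
}
reduced = reduce_meal_plans(meal_plans_dictionary)
print(reduced)
# {134: ['chicken'], 523: ['egg'], 12: ['chicken']}
print(tags_for.cache_info())
```

The gain depends on how often names repeat across recipes: with mostly unique names the cache helps little, and the exact-match or regex approaches are likely the better first step.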
Votes: 0
Original page content provided by Stack Overflow.
Original link: https://stackoverflow.com/questions/51390499
