
The <a href="http://docs.python.org/library/collections.html#collections.Counter" rel="noreferrer"><code>Counter</code> class</a> in the <code>collections</code> module is purpose-built to solve this type of problem:

<pre><code>from collections import Counter
words = "apple banana apple strawberry banana lemon"
Counter(words.split())
# Counter({'apple': 2, 'banana': 2, 'strawberry': 1, 'lemon': 1})
</code></pre>
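As a small illustration of what the resulting object offers beyond a plain dict, here is a sketch using two documented <code>Counter</code> behaviours: <code>most_common()</code> and the zero default for missing keys. The names <code>counts</code> and <code>top_two</code> are mine, not part of the answer above.

```python
from collections import Counter

words = "apple banana apple strawberry banana lemon"
counts = Counter(words.split())

# most_common(n) returns the n highest counts as (item, count) pairs.
top_two = counts.most_common(2)
print(top_two)

# A missing key reports 0 instead of raising KeyError,
# and looking it up does not add it to the Counter.
print(counts["kiwi"])
```

Because a <code>Counter</code> is a dict subclass, it can be passed to most code that expects an ordinary dictionary.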


<a href="http://docs.python.org/library/collections.html#defaultdict-objects" rel="noreferrer">defaultdict</a> to the rescue!

<pre><code>from collections import defaultdict

words = "apple banana apple strawberry banana lemon"

d = defaultdict(int)
for word in words.split():
    d[word] += 1
</code></pre>

This runs in O(n).
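One thing worth knowing when using this approach: merely reading a missing key on a <code>defaultdict</code> inserts it. The sketch below shows that side effect and one way around it, copying the result into a plain dict. The names <code>plain</code> and <code>"kiwi"</code> are just for illustration.

```python
from collections import defaultdict

words = "apple banana apple strawberry banana lemon"

d = defaultdict(int)
for word in words.split():
    d[word] += 1

# Snapshot the counts as an ordinary dict before doing any lookups.
plain = dict(d)

# Reading a missing key on the defaultdict creates it with value 0 ...
missing_count = d["kiwi"]
print("kiwi" in d)      # the lookup itself added the key

# ... while the plain copy is unaffected; use .get() for a safe lookup.
print(plain.get("kiwi", 0))
```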


<pre><code>words = "apple banana apple strawberry banana lemon".split()
freqs = {}
for word in words:
    freqs[word] = freqs.get(word, 0) + 1  # fetch and increment OR initialize
</code></pre>

I think this gives the same result as Triptych's solution, but without importing collections. It is also a bit like Selinap's solution, but more readable in my opinion, and almost identical to Thomas Weigel's solution, but without using exceptions.

However, this could be slower than using defaultdict() from the collections module, since the value is fetched, incremented, and then assigned again rather than just incremented. That said, += might do much the same internally.
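Rather than guess, the two variants can be timed. The following is only a rough sketch: the helper names <code>count_with_get</code> and <code>count_with_defaultdict</code>, the input size, and the repeat count are all my own choices, and the numbers will vary by machine and Python version.

```python
import timeit
from collections import defaultdict

# Hypothetical benchmark input: the sample words repeated many times.
words = "apple banana apple strawberry banana lemon".split() * 1000


def count_with_get(items):
    """Count items using dict.get with a default of 0."""
    freqs = {}
    for item in items:
        freqs[item] = freqs.get(item, 0) + 1
    return freqs


def count_with_defaultdict(items):
    """Count items using defaultdict(int)."""
    freqs = defaultdict(int)
    for item in items:
        freqs[item] += 1
    return freqs


# Time 50 runs of each variant over the same input.
get_seconds = timeit.timeit(lambda: count_with_get(words), number=50)
dd_seconds = timeit.timeit(lambda: count_with_defaultdict(words), number=50)

print(f"dict.get:    {get_seconds:.4f} s")
print(f"defaultdict: {dd_seconds:.4f} s")
```

Both functions make a single pass over the input, so any difference is a constant factor, not a change in complexity.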


Standard approach:

<pre><code>from collections import defaultdict

words = "apple banana apple strawberry banana lemon"
words = words.split()
result = defaultdict(int)
for word in words:
    result[word] += 1

print(result)
</code></pre>

Groupby oneliner:

<pre><code>from itertools import groupby

words = "apple banana apple strawberry banana lemon"
words = words.split()

result = dict((key, len(list(group))) for key, group in groupby(sorted(words)))
print(result)
</code></pre>


If you don't want to use the standard dictionary method (looping through the list and incrementing the matching dict key), you can try this:

<pre><code>&gt;&gt;&gt; from itertools import groupby
&gt;&gt;&gt; words = "apple banana apple strawberry banana lemon"
&gt;&gt;&gt; myList = words.split() # ['apple', 'banana', 'apple', 'strawberry', 'banana', 'lemon']
&gt;&gt;&gt; [(k, len(list(g))) for k, g in groupby(sorted(myList))]
[('apple', 2), ('banana', 2), ('lemon', 1), ('strawberry', 1)]
</code></pre>

It runs in O(n log n) time.
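The sort is where the O(n log n) comes from, and it is not optional: <code>groupby</code> only merges runs of adjacent equal items. A short sketch of the difference, with variable names of my own choosing:

```python
from itertools import groupby

words = "apple banana apple strawberry banana lemon".split()

# Without sorting, equal words that are not adjacent end up in separate groups.
unsorted_runs = [(key, len(list(group))) for key, group in groupby(words)]
print(unsorted_runs)

# Sorting first makes equal words adjacent, so each word forms one group.
sorted_runs = [(key, len(list(group))) for key, group in groupby(sorted(words))]
print(sorted_runs)
```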


Without defaultdict:

<pre><code>words = "apple banana apple strawberry banana lemon"
my_count = {}
for word in words.split():
    try:
        my_count[word] += 1
    except KeyError:
        my_count[word] = 1
</code></pre>


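<pre><code>user_input = input().split(' ')  # split() already returns a list

for word in user_input:
    # list.count rescans the whole list on every call,
    # and a repeated word is printed once per occurrence
    print('{} {}'.format(word, user_input.count(word)))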
</code></pre>


<pre><code>words = "apple banana apple strawberry banana lemon"
w = words.split()
e = list(set(w))
word_freqs = {}
for i in e:
    word_freqs[i] = w.count(i)
print(word_freqs)
</code></pre>

Hope this helps!
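For comparison, here is a sketch that checks the count-per-unique-word approach against a single-pass <code>Counter</code>. The results agree; the difference is that each <code>list.count</code> call rescans the whole list, so the work grows with (number of words) × (number of distinct words). The names <code>by_count</code> and <code>by_counter</code> are illustrative.

```python
from collections import Counter

words = "apple banana apple strawberry banana lemon".split()

# One list.count call per distinct word: each call scans the full list.
by_count = {word: words.count(word) for word in set(words)}

# Counter walks the list exactly once.
by_counter = Counter(words)

print(by_count == dict(by_counter))
```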


Can't you just use count?

<pre><code>words = 'the quick brown fox jumps over the lazy gray dog'
words.count('z')
#output: 1
</code></pre>
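A caveat with this approach: <code>str.count</code> counts substring occurrences, not whole words, so calling it on the raw string can overcount. A minimal sketch with a made-up sentence (<code>text</code> is my own example, not from the question):

```python
text = "the other theme"

# str.count matches the substring anywhere, including inside longer words.
substring_hits = text.count("the")

# Splitting first and using list.count only matches whole words.
word_hits = text.split().count("the")

print(substring_hits, word_hits)
```

Here "the" also occurs inside "other" and "theme", so the two counts differ.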


I happened to be working on a Spark exercise; here is my solution.

<pre><code>tokens = ['quick', 'brown', 'fox', 'jumps', 'lazy', 'dog']

print({n: float(tokens.count(n))/float(len(tokens)) for n in tokens})
</code></pre>

Output of the above:

<pre><code>{'brown': 0.16666666666666666, 'lazy': 0.16666666666666666, 'jumps': 0.16666666666666666, 'fox': 0.16666666666666666, 'dog': 0.16666666666666666, 'quick': 0.16666666666666666}
</code></pre>
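The same relative frequencies can be computed without calling <code>tokens.count</code> once per token. This is only a sketch, assuming Python 3 (where <code>/</code> is true division) and swapping in <code>collections.Counter</code>; the names <code>counts</code>, <code>total</code>, and <code>relative</code> are mine.

```python
from collections import Counter

tokens = ['quick', 'brown', 'fox', 'jumps', 'lazy', 'dog']

counts = Counter(tokens)  # single pass over the tokens
total = len(tokens)

# Divide each count by the total number of tokens (true division in Python 3).
relative = {token: count / total for token, count in counts.items()}
print(relative)
```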


Use reduce() to convert the list to a single dict.
<pre><code>from functools import reduce

words = &quot;apple banana apple strawberry banana lemon&quot;
reduce(lambda d, c: d.update([(c, d.get(c, 0) + 1)]) or d, words.split(), {})
</code></pre>
returns
<pre><code>{'strawberry': 1, 'lemon': 1, 'apple': 2, 'banana': 2}
</code></pre>


<pre><code>line = input()  # Providing user input passes multiple tests
text = line.split()  # renamed from "list" to avoid shadowing the built-in

for word in text:
    freq = text.count(word)
    print(word, freq)
</code></pre>

Assume I have a list of words, and I want to find the number of times each word appears in that list.

An obvious way to do this is:

<pre><code>words = "apple banana apple strawberry banana lemon"
uniques = set(words.split())
freqs = [(item, words.split().count(item)) for item in uniques]
print(freqs)
</code></pre>

But I don't find this code very good, because the program runs through the word list twice: once to build the set, and a second time to count the number of appearances.

Of course, I could write a function to run through the list and do the counting, but that wouldn't be so Pythonic. So, is there a more efficient and Pythonic way?

Item frequency count in Python
