Here is the complete code:
s = 'life is short, stunt it!!?'
from string import punctuation
tbl = str.maketrans({ord(ch):" " for ch in punctuation})
print(s.translate(tbl).split())

I'd like to know what tbl = str.maketrans({ord(ch):" " for ch in punctuation}) means, both in this code and in general.
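Posted on 2015-12-04 17:49:52
It builds a dictionary that maps punctuation characters to spaces, translates the string (effectively removing the punctuation), and then splits on whitespace to produce a list of words.
Step by step... First, build a character translation dictionary whose keys are the ordinals of the punctuation characters and whose replacement value is a space. This uses a dictionary comprehension to build the dictionary: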
from string import punctuation
s = 'life is short, stunt it!!?'
D = {ord(ch):" " for ch in punctuation}
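print(D)

Result:
{64: ' ', 124: ' ', 125: ' ', 91: ' ', 92: ' ', 93: ' ', 94: ' ', 95: ' ', 96: ' ', 33: ' ', 34: ' ', 35: ' ', 36: ' ', 37: ' ', 38: ' ', 39: ' ', 40: ' ', 41: ' ', 42: ' ', 43: ' ', 44: ' ', 45: ' ', 46: ' ', 47: ' ', 123: ' ', 126: ' ', 58: ' ', 59: ' ', 60: ' ', 61: ' ', 62: ' ', 63: ' '}

The next step is redundant. Although the resulting dictionary prints in a different order, its keys and values are identical, and dictionary equality does not depend on ordering. All maketrans can do here is convert character keys to the ordinal values that translate requires, but that was already done when the dictionary was created. maketrans has other use cases that are not used here, so the call can be dropped.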
tbl = str.maketrans(D)
print(tbl)
print(D == tbl)

Result:
{64: ' ', 60: ' ', 61: ' ', 91: ' ', 92: ' ', 93: ' ', 94: ' ', 95: ' ', 96: ' ', 33: ' ', 34: ' ', 35: ' ', 36: ' ', 37: ' ', 38: ' ', 39: ' ', 40: ' ', 41: ' ', 42: ' ', 43: ' ', 44: ' ', 45: ' ', 46: ' ', 47: ' ', 59: ' ', 62: ' ', 58: ' ', 123: ' ', 124: ' ', 125: ' ', 126: ' ', 63: ' '}
True

Now do the translation:
s = s.translate(tbl)
print(s)

Result:
life is short  stunt it

Split it into a list of words:
print(s.split())

Result:
['life', 'is', 'short', 'stunt', 'it']

Posted on 2015-12-04 18:00:08
{ord(ch):" " for ch in punctuation} is a dictionary comprehension.
Dictionary comprehensions are similar to (and based on) list comprehensions.
Blog post I wrote explaining list comprehensions
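As a rough illustration of that relationship, the sketch below builds the same table three ways: with the dictionary comprehension from the question, with an explicit loop, and alongside the analogous list comprehension. The variable names are made up for this example.

```python
from string import punctuation

# Dictionary comprehension, as in the question.
table_from_comprehension = {ord(ch): " " for ch in punctuation}

# The same dictionary built with an explicit loop.
table_from_loop = {}
for ch in punctuation:
    table_from_loop[ord(ch)] = " "

# The analogous list comprehension produces only the keys.
ordinals = [ord(ch) for ch in punctuation]

print(table_from_comprehension == table_from_loop)  # True
print(len(ordinals))                                # 32
```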
You can run the following code from the Python shell to see what each line does:
>>> from string import punctuation
>>> punctuation
'!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~'
>>> punctuation_to_spaces = {ord(ch): " " for ch in punctuation}
>>> punctuation_to_spaces
{64: ' ', 124: ' ', 125: ' ', 91: ' ', 92: ' ', 93: ' ', 94: ' ', 95: ' ', 96: ' ', 33: ' ', 34: ' ', 35: ' ', 36: ' ', 37: ' ', 38: ' ', 39: ' ', 40: ' ', 41: ' ', 42: ' ', 43: ' ', 44: ' ', 45: ' ', 46: ' ', 47: ' ', 123: ' ', 126: ' ', 58: ' ', 59: ' ', 60: ' ', 61: ' ', 62: ' ', 63: ' '}
>>> punctuation_removal = str.maketrans(punctuation_to_spaces)
>>> punctuation_removal
{64: ' ', 60: ' ', 61: ' ', 91: ' ', 92: ' ', 93: ' ', 94: ' ', 95: ' ', 96: ' ', 33: ' ', 34: ' ', 35: ' ', 36: ' ', 37: ' ', 38: ' ', 39: ' ', 40: ' ', 41: ' ', 42: ' ', 43: ' ', 44: ' ', 45: ' ', 46: ' ', 47: ' ', 59: ' ', 62: ' ', 58: ' ', 123: ' ', 124: ' ', 125: ' ', 126: ' ', 63: ' '}
>>> s = 'life is short, stunt it!!?'
>>> s.translate(punctuation_removal)
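'life is short  stunt it   '

The dictionary comprehension essentially builds a dictionary with the ASCII values of the punctuation characters as keys and a space character as each value. The .translate call on the string s then uses that dictionary to convert punctuation to spaces.
The ord function converts each punctuation character to its Unicode code point, which for these characters is the same as its ASCII value.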
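To make the role of ord concrete, here is a small sketch showing that str.translate looks characters up by integer code point, so a table keyed by one-character strings has no effect. The names str_keyed and int_keyed are invented for this example, and the table covers only three punctuation characters to keep it short.

```python
s = 'life is short, stunt it!!?'

print(ord(','))  # 44
print(chr(44))   # ,

# A table keyed by characters is never matched by translate,
# because translate looks up integer code points.
str_keyed = {',': ' ', '!': ' ', '?': ' '}

# Converting the keys with ord gives a table translate can use.
int_keyed = {ord(k): v for k, v in str_keyed.items()}

unchanged = s.translate(str_keyed)
translated = s.translate(int_keyed)

print(unchanged == s)      # True
print(translated.split())  # ['life', 'is', 'short', 'stunt', 'it']
```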
Note that using both ord and maketrans is redundant. Either of these two solutions works fine and avoids doing the conversion twice:
tbl = str.maketrans({ch:" " for ch in punctuation})
print(s.translate(tbl).split())

or
tbl = {ord(ch):" " for ch in punctuation}
print(s.translate(tbl).split())

https://stackoverflow.com/questions/34085201
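For comparison, the sketch below checks that the two variants discussed in the answers yield the same table, and shows the two- and three-argument forms of str.maketrans as examples of the other use cases the first answer alludes to. The variable names are illustrative only.

```python
from string import punctuation

s = 'life is short, stunt it!!?'

# One-argument form with character keys: maketrans converts them to ordinals.
from_chars = str.maketrans({ch: " " for ch in punctuation})
# Plain dictionary already keyed by ordinals: no maketrans needed.
from_ordinals = {ord(ch): " " for ch in punctuation}
print(from_chars == from_ordinals)  # True

# Two-argument form: two equal-length strings, mapped character by character.
# The values are stored as ordinals here, but translate treats them the same.
two_arg = str.maketrans(punctuation, " " * len(punctuation))
print(s.translate(two_arg).split())  # ['life', 'is', 'short', 'stunt', 'it']

# Three-argument form: characters in the third string are deleted outright.
three_arg = str.maketrans("", "", punctuation)
print(s.translate(three_arg))  # life is short stunt it
```

Deleting punctuation rather than replacing it with spaces would join words separated only by punctuation (for example 'a,b' becomes 'ab'), which is one reason to prefer the space mapping when the goal is a word list.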