I have some scraped data that is littered with annoying escape characters:
{"website": "http://www.zebrawebworks.com/zebra/bluetavern/day.cfm?&year=2018&month=7&day=10", "headliner": ["\"Roda Vibe\" with the Tallahassee Choro Society"], "data": [" \r\n ", "\r\n\t\r\n\r\n\t", "\r\n\t\r\n\t\r\n\t", "\r\n\t", "\r\n\t", "\r\n\t", "8:00 PM", "\r\n\t\r\n\tFEE: $2 \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 ", "\r\n\tEvery 2nd & 4th Tuesday of the month, the Choro Society returns to Blue Tavern with that subtly infectious Brazilian rhythm and beautiful melodies that will stay with you for days. The perfect antidote to Taylor Swift. $2 for musicians; tips appreciated. ", "\r\n\t", "\r\n\t\r\n\t", "\r\n\t", "\r\n\t", "\r\n\t\r\n\t\r\n\r\n\t\r\n\t", "\r\n\t\r\n\t\t", "\r\n", "\r\n", "\r\n", "\r\n"]},
I'm trying to write a function that strips these characters, but neither of my two strategies works:
# strategy 1
escapes = ''.join([chr(char) for char in range(1, 32)])
table = {ord(char): None for char in escapes}
for item in concert['data']:
    item = item.translate(table)
# strategy 2
for item in concert['data']:
    for char in item:
        char = char.replace("\r", "").replace("\t", "").replace("\n", "")
Why is my data still full of the escape characters that I tried to remove with two different approaches?
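
For reference, here is a minimal sketch of what seems to be going on and one possible way around it. Python strings are immutable, so `translate` and `replace` return new strings, and assigning that result to the loop variable (`item` or `char`) only rebinds a local name; the list inside `concert['data']` is never touched. The helper name `strip_control_chars` and the shortened sample data are made up for illustration, and dropping entries that end up empty is an extra assumption about what the cleaned data should look like.

```python
# Translation table that deletes ASCII control characters 1-31,
# the same range used in strategy 1.
CONTROL_TABLE = {code: None for code in range(1, 32)}


def strip_control_chars(items):
    """Return a NEW list of cleaned strings; the input list is not modified."""
    cleaned = []
    for item in items:
        # translate() returns a new string, so the result has to be kept.
        text = item.translate(CONTROL_TABLE).strip()
        # Assumption: entries that are empty after cleaning are not wanted.
        if text:
            cleaned.append(text)
    return cleaned


# Shortened, hypothetical version of the scraped record.
concert = {"data": [" \r\n ", "\r\n\t", "8:00 PM", "\r\n\t\r\n\tFEE: $2 \u00a0\u00a0 "]}

# Rebinding the loop variable leaves the list unchanged.
for item in concert["data"]:
    item = item.translate(CONTROL_TABLE)
unchanged = list(concert["data"])

# Storing the result back into the dict is what actually changes the data.
concert["data"] = strip_control_chars(concert["data"])
```

Note that this deletes `\r`, `\n`, and `\t` outright, which fits the sample because they only appear at the edges of each string. If they can occur between words, replacing them with a space may be the safer choice.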
https://stackoverflow.com/questions/50958937