Checking Google Maps spelling for a location name

Stack Overflow user
Asked 2017-03-30 16:55:11
1 answer · 195 views · 0 followers · 0 votes

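I want to get travel times between some airports in Europe using Google Maps. The problem is that some of the names are not spelled accurately, so I first want to check the wrong names and get Google's version of each name. From this question, I gathered that I can do this with selenium, but my code has two problems: 1) the output is not always complete (the last two results contain only a single letter), and 2) it throws an exception at the end of the list (see below). Please help me fix the code. Automation is the only option because the list of airports is very long.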

Input:

wronglySpelled = ['Treviso (San Angelo) Airport', 'Milano - Malpensa Airport',
                  'Venezia - Tessera Airport', 'Milano - Linate Airport',
                  'Treviso (San Angelo) Airport', 'Treviso (San Angelo) Airport',
                  'Milano - Malpensa Airport', 'Venezia - Tessera Airport',
                  'Guernsey Channel Is. Airport', 'Jersey Channel Is. Airport',
                  'Treviso (San Angelo) Airport']

Code:

import time
from selenium import webdriver

def setup():
    driver = webdriver.Chrome()
    driver.get("http://maps.google.com")
    driver.maximize_window() # For maximizing window
    driver.implicitly_wait(20) # gives an implicit wait for 20 seconds
    return driver

def correct_name(driver, name_to_check):
    searchBox = driver.find_element_by_name('q')
    searchBox.send_keys(name_to_check)
    correct_name = driver.find_element_by_class_name('suggest-bold')
    return correct_name.text.encode('utf-8')

driver = setup()
for item in wronglySpelled:
    print item,':', correct_name(driver, item)
    time.sleep(5)
    driver.find_element_by_id('searchboxinput').clear()
driver.quit()

Error message:

Traceback (most recent call last):
  File "C:/...", line 60, in <module>
    print item,':', correct_name(driver, item)
  File "C:/...", line 41, in correct_name
    correct_name = driver.find_element_by_class_name('suggest-bold')
  File "C:\...", line 415, in find_element_by_class_name
    return self.find_element(by=By.CLASS_NAME, value=name)
  File "C:\...", line 756, in find_element
    'value': value})['value']
  File "C:\...", line 238, in execute
    self.error_handler.check_response(response)
  File "C:\...", line 193, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"class name","selector":"suggest-bold"}
  (Session info: chrome=56.0.2924.87)
  (Driver info: chromedriver=2.28.455520 (cc17746adff54984afff480136733114c6b3704b),platform=Windows NT 10.0.14393 x86_64)

Output:

## Formatted as Input name : Google maps version
Treviso (San Angelo) Airport : Aeroporto di Treviso Canova
Milano - Malpensa Airport : Milano Malpensa Airport
Venezia - Tessera Airport : Venice Marco Polo Airport
Milano - Linate Airport : Aeroporto Milano Linate
Treviso (San Angelo) Airport : Aeroporto di Treviso Canova
Treviso (San Angelo) Airport : Aeroporto di Treviso Canova
Milano - Malpensa Airport : m
Venezia - Tessera Airport : V
Guernsey Channel Is. Airport :

1 Answer

Stack Overflow user

Accepted answer

Posted 2017-03-31 15:45:51

As @JeffC mentioned in his comment, you would be better off using the Google API for whatever it is you are trying to do.
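The answer does not say which Google API it has in mind. As a rough sketch only, assuming the legacy Places API "Find Place from Text" endpoint is a suitable choice, the request for one airport name could be built and its response read as below. The helper names `build_find_place_url` and `first_candidate_name` are made up for illustration, the code targets Python 3, and nothing here sends a request, since that would need a real API key.

```python
from urllib.parse import urlencode

# Legacy Places API "Find Place from Text" endpoint (an assumed choice of API).
FIND_PLACE_URL = "https://maps.googleapis.com/maps/api/place/findplacefromtext/json"


def build_find_place_url(name, api_key):
    """Build the request URL that asks Google for its own name for `name`."""
    params = {
        "input": name,
        "inputtype": "textquery",
        "fields": "name,formatted_address",
        "key": api_key,
    }
    return FIND_PLACE_URL + "?" + urlencode(params)


def first_candidate_name(payload):
    """Return the first candidate's name from a decoded JSON response, or None."""
    candidates = payload.get("candidates") or []
    return candidates[0].get("name") if candidates else None


if __name__ == "__main__":
    # "YOUR_API_KEY" is a placeholder; a real key is needed to send the request.
    print(build_find_place_url("Milano - Malpensa Airport", "YOUR_API_KEY"))
```

Fetching the URL and decoding the JSON body would give the dictionary that `first_candidate_name` expects. An empty `candidates` list plays the same role as the "no suggestion" case in the browser.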

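A few things about your script:

  • I find implicit waits to be mostly worthless. In my opinion it is worth the time to learn about explicit waits, how they work, and when they are useful. Google tends to build highly dynamic sites that pull in much of their content via AJAX, so things like Maps tend to work best with explicit waits.
  • Your script throws the NoSuchElementException because Google has no suggestion for that term, so your locator, the class name suggest-bold, never matches anything. Generally, when I see an NSEE, it is a red flag that I need to re-evaluate how I am searching.
  • I think (but am not positive) that the Milano - Malpensa Airport : m output is the result of selenium collecting the text from the element before it has fully loaded.
  • I personally avoid hard sleeps (time.sleep(5)) at all costs, because they can easily break your script as soon as you hit any network latency. This is another area where explicit waits shine: they keep retrying until the element has actually loaded, and then continue.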
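To make the difference between a hard sleep and an explicit wait concrete, here is a minimal pure-Python sketch of the polling idea behind an explicit wait. It is not Selenium's implementation; `wait_until` and `FakeSuggestionBox` are hypothetical names invented for this illustration, and the code assumes Python 3. Selenium's `WebDriverWait.until` works on the same principle: it calls a condition repeatedly until the result is truthy and raises a timeout exception otherwise.

```python
import time


def wait_until(condition, timeout=30.0, poll=0.5):
    """Call condition() until it returns a truthy value, then return that value.

    Raises TimeoutError if nothing truthy shows up within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within %.1f s" % timeout)
        time.sleep(poll)


class FakeSuggestionBox(object):
    """Hypothetical stand-in for a page element that fills in after a few polls."""

    def __init__(self, empty_polls, text):
        self._remaining = empty_polls
        self._text = text

    def read(self):
        if self._remaining > 0:
            self._remaining -= 1
            return ""  # not loaded yet
        return self._text


if __name__ == "__main__":
    box = FakeSuggestionBox(empty_polls=3, text="Milano Malpensa Airport")
    print(wait_until(box.read, timeout=5.0, poll=0.01))
```

Unlike `time.sleep(5)`, the loop returns as soon as the value is available and fails loudly if it never arrives, instead of silently continuing with whatever happens to be on the page.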

With that said, here is how I would do it:

from explicit import waiter, ID
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait as Wait
from selenium.common.exceptions import StaleElementReferenceException

NO_SUGGESTION = 'Add a missing place to Google Maps.'

original_names = [
    'Treviso (San Angelo) Airport',
    'Milano - Malpensa Airport',
    'Venezia - Tessera Airport',
    'Milano - Linate Airport',
    'Treviso (San Angelo) Airport',
    'Treviso (San Angelo) Airport',
    'Milano - Malpensa Airport',
    'Venezia - Tessera Airport',
    'Guernsey Channel Is. Airport',
    'Jersey Channel Is. Airport',
    'Treviso (San Angelo) Airport'
]


def get_name_suggestion(driver, name):
    # Find the search box, clear it, write the name
    waiter.find_write(driver, 'searchboxinput', name, by=ID, clear_first=True)

    class SuggestionLoads(object):
        def __init__(self):
            self._last_seen = None

        def __call__(self, driver):
            """ Custom expected condition.
                Returns either the first suggested name, or '<No Suggestion>'
                Raises a TimeoutException in the event the page source is different
            """
            suggestion_icon = 'div.suggest-icon-container'
            suggest_css = 'div.suggest-left-cell > span.suggest-query'
            try:

                # Only want suggestions that have the location icon next to them, and not the
                # magnifying glass. Return False if we don't find any so as to retry
                icons = driver.find_elements_by_css_selector(suggestion_icon)
                if len(icons) < 1:
                    return False

                elems = driver.find_elements_by_css_selector(suggest_css)

                if len(elems) == 0:
                    # No suggestions have loaded yet, return False so the Wait can retry
                    return False

                suggest_text = elems[0].text
                if len(suggest_text) == 1:
                    # Sometimes we catch text mid-update. Return False to retry
                    # and hopefully get the whole suggestion
                    return False
                elif suggest_text == NO_SUGGESTION:
                    # Google has no suggestion for us, return '<No Suggestion>', which the Wait
                    # will evaluate as True and exit
                    return '<No Suggestion>'
                else:
                    # We found a valid suggestion. We need to make sure nothing else is going to
                    # get AJAXed in, so compare it to our _last_seen property. If they match,
                    # everything has stabilized and we return the string, which will be evaluated
                    # as True and cause the Wait to exit
                    # If you don't do this, you wind up with junk suggestions like "Traffic"
                    if suggest_text == self._last_seen:
                        return suggest_text
                    else:
                        self._last_seen = suggest_text
                        return False

            except StaleElementReferenceException:
                # Because the DOM is constantly updating, there is a pretty decent chance that a
                # SERE will get thrown. Catch it if it does and return False so the Wait
                # can try again
                return False

    return Wait(driver, 30).until(SuggestionLoads())


def main():
    driver = webdriver.Chrome()
    try:
        driver.get("http://maps.google.com")
        driver.maximize_window()
        for orig_name in original_names:
            suggested_name = get_name_suggestion(driver, orig_name)
            print("{0}: {1}".format(orig_name, suggested_name))
    finally:  # This is useful to make sure the browsers get closed, even if an exception is thrown
        driver.quit()


if __name__ == "__main__":
    main()

Which returns:

 (.venv27) ➜  tmp python google_maps.py
 Treviso (San Angelo) Airport: Aeroporto di Treviso Canova
 Milano - Malpensa Airport: Milano Malpensa Airport
 Venezia - Tessera Airport: Venice Marco Polo Airport
 Milano - Linate Airport: Aeroporto Milano Linate
 Treviso (San Angelo) Airport: Aeroporto di Treviso Canova
 Treviso (San Angelo) Airport: Aeroporto di Treviso Canova
 Milano - Malpensa Airport: Milano Malpensa Airport
 Venezia - Tessera Airport: Venice Marco Polo Airport
 Guernsey Channel Is. Airport: <No Suggestion>
 Jersey Channel Is. Airport: <No Suggestion>
 Treviso (San Angelo) Airport: Aeroporto di Treviso Canova

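Full disclosure: explicit is a library I maintain, available from PyPI: pip install explicit. It is designed to make explicit waits easier to use, but you can replace it with garden-variety waits.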
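The key trick in the custom condition above is to accept a suggestion only once the same text has been read on two consecutive polls. The following is a stripped-down sketch of that idea with no browser involved; `StableText` and its `read_text` parameter are hypothetical names for this illustration, and the sequence of reads is simulated rather than captured from Google Maps.

```python
class StableText(object):
    """Condition that succeeds once the same non-trivial text is read twice in a row."""

    def __init__(self, read_text, min_length=2):
        self._read_text = read_text    # callable(driver) -> current text
        self._min_length = min_length  # shorter reads are treated as partial
        self._last_seen = None

    def __call__(self, driver):
        text = self._read_text(driver)
        if len(text) < self._min_length:
            return False               # empty or mid-update: retry
        if text == self._last_seen:
            return text                # unchanged since last poll: stable
        self._last_seen = text         # remember it and poll once more
        return False


if __name__ == "__main__":
    # Simulated successive reads of a suggestion while the page is still updating.
    reads = iter(["", "m", "Traffic", "Milano Malpensa Airport", "Milano Malpensa Airport"])
    condition = StableText(lambda driver: next(reads))
    result = False
    while not result:
        result = condition(None)  # no real driver is needed for this simulation
    print(result)
```

Because the object is a callable that takes the driver and returns a falsy value to request a retry, it has the same shape as the condition passed to `Wait(driver, 30).until(...)` in the answer's code, so a real `read_text` that queries the page could be plugged in the same way.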

Votes: 1

Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/43123993
