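I want to run a Selenium script on AWS Lambda from a Docker container.
I build the image on AWS EC2 and then test the container locally with the Runtime Interface Emulator (RIE). Once the test passes, I push the image to ECR so that AWS Lambda can use it.
The local RIE test on EC2 always succeeds, but I cannot get the function to work on Lambda. The Lambda test currently always fails with the following error message: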
{
"errorMessage": "Message: session not created\nfrom tab crashed\n (Session info: headless chrome=93.0.4577.63)\n",
"errorType": "SessionNotCreatedException",
"stackTrace": [
" File \"/var/task/app.py\", line 32, in handler\n driver = webdriver.Chrome(\n",
" File \"/var/task/selenium/webdriver/chrome/webdriver.py\", line 76, in __init__\n RemoteWebDriver.__init__(\n",
" File \"/var/task/selenium/webdriver/remote/webdriver.py\", line 157, in __init__\n self.start_session(capabilities, browser_profile)\n",
" File \"/var/task/selenium/webdriver/remote/webdriver.py\", line 252, in start_session\n response = self.execute(Command.NEW_SESSION, parameters)\n",
" File \"/var/task/selenium/webdriver/remote/webdriver.py\", line 321, in execute\n self.error_handler.check_response(response)\n",
" File \"/var/task/selenium/webdriver/remote/errorhandler.py\", line 242, in check_response\n raise exception_class(message, screen, stacktrace)\n"
]
}
Here is all the code I am actually using:
Dockerfile
FROM public.ecr.aws/lambda/python:3.8
#Download and install Chrome
RUN curl https://dl.google.com/linux/direct/google-chrome-stable_current_x86_64.rpm > ./google-chrome-stable_current_x86_64.rpm
RUN yum install -y ./google-chrome-stable_current_x86_64.rpm
RUN rm ./google-chrome-stable_current_x86_64.rpm
#Download and install chromedriver
RUN yum install -y unzip
RUN curl http://chromedriver.storage.googleapis.com/`curl -sS chromedriver.storage.googleapis.com/LATEST_RELEASE`/chromedriver_linux64.zip > /tmp/chromedriver.zip
RUN unzip /tmp/chromedriver.zip chromedriver -d /usr/local/bin/
RUN rm /tmp/chromedriver.zip
RUN yum remove -y unzip
#Upgrade pip and install Python dependencies
RUN pip3 install --upgrade pip
RUN pip3 install selenium --target "${LAMBDA_TASK_ROOT}"
#Copy app.py
COPY app.py ${LAMBDA_TASK_ROOT}
CMD ["app.handler"]
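The Dockerfile above installs the current stable Chrome and whatever chromedriver `LATEST_RELEASE` points to, so nothing pins the two to each other. A mismatch is not necessarily what causes the "tab crashed" error, but it is cheap to rule out. Below is a minimal sketch of such a check; the helper names (`major_version`, `versions_match`, `read_version_output`) are made up for illustration, and it assumes both binaries print a four-part version number when called with `--version`.

```python
import re
import subprocess
from typing import Optional


def major_version(version_output: str) -> Optional[int]:
    """Return the major version found in `--version` output, or None."""
    match = re.search(r"(\d+)\.\d+\.\d+\.\d+", version_output)
    return int(match.group(1)) if match else None


def versions_match(chrome_output: str, driver_output: str) -> bool:
    """True only if both outputs contain a version and the majors agree."""
    chrome = major_version(chrome_output)
    driver = major_version(driver_output)
    return chrome is not None and chrome == driver


def read_version_output(binary: str) -> str:
    """Run `<binary> --version` and return its stdout (hypothetical helper)."""
    result = subprocess.run(
        [binary, "--version"], capture_output=True, text=True, check=True
    )
    return result.stdout
```

Inside the container this could be called as `versions_match(read_version_output('/usr/bin/google-chrome-stable'), read_version_output('/usr/local/bin/chromedriver'))`, using the paths from the Dockerfile and app.py.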
app.py
import json

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC


def handler(event, context):
    # Chrome flags for running headless inside the container
    chrome_options = Options()
    chrome_options.add_argument("--allow-running-insecure-content")
    chrome_options.add_argument("--ignore-certificate-errors")
    chrome_options.add_argument("--headless")
    chrome_options.add_argument("--no-sandbox")
    chrome_options.add_argument("--disable-dev-shm-usage")
    chrome_options.add_argument("--disable-gpu")
    chrome_options.add_argument("--disable-dev-tools")
    chrome_options.add_argument("--no-zygote")
    chrome_options.add_argument("--v=99")
    chrome_options.add_argument("--single-process")
    chrome_options.binary_location = '/usr/bin/google-chrome-stable'

    capabilities = webdriver.DesiredCapabilities().CHROME
    capabilities['acceptSslCerts'] = True
    capabilities['acceptInsecureCerts'] = True

    # This is the call that raises SessionNotCreatedException on Lambda
    driver = webdriver.Chrome(
        executable_path='/usr/local/bin/chromedriver',
        options=chrome_options,
        desired_capabilities=capabilities)

    if driver:
        response = {
            "statusCode": 200,
            "body": json.dumps("Selenium Driver Initiated")
        }

    return response
Local container test with RIE
$ docker run -p 9000:8080 aws-scraper
results in > time="2021-09-03T15:24:13.269" level=info msg="exec '/var/runtime/bootstrap' (cwd=/var/task, handler=)"
$ curl -XPOST "http://localhost:9000/2015-03-31/functions/function/invocations" -d '{}'
results in > {"statusCode": 200, "body": "\"Selenium Driver Initiated\""}
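The same local invocation can be scripted, which is handy when the RIE test is repeated after every rebuild. The following is a rough sketch using only the standard library; it assumes the container is already running with port 9000 mapped as in the `docker run` command above, and the names `build_invocation` and `invoke` are illustrative.

```python
import json
import urllib.request

# Endpoint exposed by the RIE when the container runs with -p 9000:8080
RIE_URL = "http://localhost:9000/2015-03-31/functions/function/invocations"


def build_invocation(event: dict, url: str = RIE_URL) -> urllib.request.Request:
    """Build the POST request that the curl command above sends."""
    body = json.dumps(event).encode("utf-8")
    return urllib.request.Request(url, data=body, method="POST")


def invoke(event: dict, url: str = RIE_URL, timeout: float = 60.0) -> dict:
    """Send the event to the local RIE endpoint and decode the JSON reply."""
    with urllib.request.urlopen(build_invocation(event, url), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `invoke({})` against the running container should return the same dictionary that curl prints. A passing local run only shows that the handler works under the RIE; it does not reproduce the restrictions of the real Lambda environment.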
I really can't figure this out. I also tried following "Selenium works on AWS EC2 but not on AWS Lambda", but with no luck.
Any help would be greatly appreciated. Thank you in advance.
Answered 2021-09-15 14:45:53
I solved this by borrowing the Dockerfile and the Chrome options from this repo: https://github.com/rchauhan9/image-scraper-lambda-container.git
The Dockerfile now looks like this:
# Define global args
ARG FUNCTION_DIR="/home/app/"
ARG RUNTIME_VERSION="3.9"
ARG DISTRO_VERSION="3.12"
# Stage 1
FROM python:${RUNTIME_VERSION}-alpine${DISTRO_VERSION} AS python-alpine
RUN apk add --no-cache \
    libstdc++
# Stage 2
FROM python-alpine AS build-image
RUN apk add --no-cache \
    build-base \
    libtool \
    autoconf \
    automake \
    libexecinfo-dev \
    make \
    cmake \
    libcurl
ARG FUNCTION_DIR
ARG RUNTIME_VERSION
RUN mkdir -p ${FUNCTION_DIR}
RUN python${RUNTIME_VERSION} -m pip install awslambdaric --target ${FUNCTION_DIR}
# Stage 3
FROM python-alpine AS build-image2
ARG FUNCTION_DIR
WORKDIR ${FUNCTION_DIR}
COPY --from=build-image ${FUNCTION_DIR} ${FUNCTION_DIR}
RUN apk update \
    && apk add gcc python3-dev musl-dev \
    && apk add jpeg-dev zlib-dev libjpeg-turbo-dev
COPY requirements.txt .
RUN python${RUNTIME_VERSION} -m pip install -r requirements.txt --target ${FUNCTION_DIR}
# Stage 4
FROM python-alpine
ARG FUNCTION_DIR
WORKDIR ${FUNCTION_DIR}
COPY --from=build-image2 ${FUNCTION_DIR} ${FUNCTION_DIR}
RUN apk add jpeg-dev zlib-dev libjpeg-turbo-dev \
    && apk add chromium chromium-chromedriver
ADD https://github.com/aws/aws-lambda-runtime-interface-emulator/releases/latest/download/aws-lambda-rie /usr/bin/aws-lambda-rie
RUN chmod 755 /usr/bin/aws-lambda-rie
COPY app/* ${FUNCTION_DIR}
COPY entry.sh /
ENTRYPOINT [ "/entry.sh" ]
CMD [ "app.handler" ]
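The Dockerfile copies an `entry.sh` that is not shown in this post. In images built on a non-AWS base, such a script usually decides between starting the runtime interface client directly (on Lambda) and wrapping it in the RIE (locally), depending on whether `AWS_LAMBDA_RUNTIME_API` is set. The sketch below expresses that decision in Python purely as an illustration; it is an assumption about what `entry.sh` does, not its actual contents, and the function name and the Python interpreter path are hypothetical.

```python
from typing import List, Mapping

# Path where the Dockerfile above installs the emulator
RIE_PATH = "/usr/bin/aws-lambda-rie"


def entry_command(
    handler: str,
    env: Mapping[str, str],
    python: str = "/usr/local/bin/python",  # assumed interpreter location
) -> List[str]:
    """Return the argv to exec for the given handler and environment."""
    runtime = [python, "-m", "awslambdaric", handler]
    if env.get("AWS_LAMBDA_RUNTIME_API"):
        # Running on Lambda: the service provides the runtime API
        return runtime
    # Running locally: let the emulator provide the runtime API
    return [RIE_PATH, *runtime]
```

A real entrypoint would then replace the current process with that command, for example with `os.execv(cmd[0], cmd)`.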
app.py now looks like this:
import json

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC


def handler(event, context):
    chrome_options = Options()
    chrome_options.add_argument('--autoplay-policy=user-gesture-required')
    chrome_options.add_argument('--disable-background-networking')
    chrome_options.add_argument('--disable-background-timer-throttling')
    chrome_options.add_argument('--disable-backgrounding-occluded-windows')
    chrome_options.add_argument('--disable-breakpad')
    chrome_options.add_argument('--disable-client-side-phishing-detection')
    chrome_options.add_argument('--disable-component-update')
    chrome_options.add_argument('--disable-default-apps')
    chrome_options.add_argument('--disable-dev-shm-usage')
    chrome_options.add_argument('--disable-domain-reliability')
    chrome_options.add_argument('--disable-extensions')
    chrome_options.add_argument('--disable-features=AudioServiceOutOfProcess')
    chrome_options.add_argument('--disable-hang-monitor')
    chrome_options.add_argument('--disable-ipc-flooding-protection')
    chrome_options.add_argument('--disable-notifications')
    chrome_options.add_argument('--disable-offer-store-unmasked-wallet-cards')
    chrome_options.add_argument('--disable-popup-blocking')
    chrome_options.add_argument('--disable-print-preview')
    chrome_options.add_argument('--disable-prompt-on-repost')
    chrome_options.add_argument('--disable-renderer-backgrounding')
    chrome_options.add_argument('--disable-setuid-sandbox')
    chrome_options.add_argument('--disable-speech-api')
    chrome_options.add_argument('--disable-sync')
    chrome_options.add_argument('--disk-cache-size=33554432')
    chrome_options.add_argument('--hide-scrollbars')
    chrome_options.add_argument('--ignore-gpu-blacklist')
    chrome_options.add_argument('--ignore-certificate-errors')
    chrome_options.add_argument('--metrics-recording-only')
    chrome_options.add_argument('--mute-audio')
    chrome_options.add_argument('--no-default-browser-check')
    chrome_options.add_argument('--no-first-run')
    chrome_options.add_argument('--no-pings')
    chrome_options.add_argument('--no-sandbox')
    chrome_options.add_argument('--no-zygote')
    chrome_options.add_argument('--password-store=basic')
    chrome_options.add_argument('--use-gl=swiftshader')
    chrome_options.add_argument('--use-mock-keychain')
    chrome_options.add_argument('--single-process')
    chrome_options.add_argument('--headless')
    # Point every directory Chrome writes to at /tmp
    chrome_options.add_argument('--user-data-dir={}'.format('/tmp/user-data'))
    chrome_options.add_argument('--data-path={}'.format('/tmp/data-path'))
    chrome_options.add_argument('--homedir={}'.format('/tmp'))
    chrome_options.add_argument('--disk-cache-dir={}'.format('/tmp/cache-dir'))

    driver = webdriver.Chrome(
        executable_path='/usr/bin/chromedriver',
        options=chrome_options)

    if driver:
        print("Selenium Driver Initiated")

    # The posted snippet never defined `html`; reading the page source here
    # is an assumption made so that the handler runs as written.
    html = driver.page_source
    driver.quit()

    response = {
        "statusCode": 200,
        "body": json.dumps(html, ensure_ascii=False)
    }
    return response
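To be honest, I still don't understand why these changes work. Any thoughts on that would be very welcome!
Thanks again to everyone for the help and support.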
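On the open question of why the changes helped: one plausible contributor, not a confirmed diagnosis, is that the Lambda filesystem is read-only apart from `/tmp`, and the new options explicitly send Chrome's profile, data, and cache directories there. A local RIE run does not enforce that restriction, which would fit a test that passes locally and fails on Lambda. The sketch below builds those directory flags from a single base path and creates the directories up front; the function name is hypothetical, and the flag names are simply the ones used in the app.py above.

```python
import os
from typing import List


def tmp_backed_chrome_args(base: str = "/tmp") -> List[str]:
    """Create writable directories under `base` and return matching Chrome flags."""
    directories = {
        "--user-data-dir": "user-data",
        "--data-path": "data-path",
        "--disk-cache-dir": "cache-dir",
    }
    args = []
    for flag, name in directories.items():
        path = os.path.join(base, name)
        # Creating the directory first surfaces permission problems early
        os.makedirs(path, exist_ok=True)
        args.append("{}={}".format(flag, path))
    return args
```

Each returned string can be passed to `chrome_options.add_argument(...)` in the handler. Trying the original image with only these flags added would be one way to test whether the writable paths were the deciding change, as opposed to the switch from Google Chrome on the Lambda base image to Chromium on Alpine.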
https://stackoverflow.com/questions/69047401