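In today's fast-moving internet landscape, the LAMP stack (Linux, Apache, MySQL, PHP/Python/Perl) and the LEMP stack (which swaps Apache for Nginx) remain classic web solutions that still play an important role in organizations of every size. Maintaining these stacks by hand, however, no longer meets modern business needs: once a team manages dozens or even hundreds of servers, manual deployment and maintenance become both an efficiency bottleneck and a source of failures.
Practical experience suggests that an Ansible-based automation system can greatly reduce the time needed to deploy a LAMP stack compared with doing it by hand, and noticeably lower the error rate during deployment. The gains are most visible in environments with more than 50 servers. Automated operations cover not only the initial deployment but the full lifecycle: configuration management, continuous monitoring, self-healing, and security updates.
This article looks at how to use Ansible, CI/CD, and other modern DevOps tooling to build a complete, robust automated operations system for LAMP/LEMP stacks, covering the whole path from architecture design to deployment practice and from monitoring and alerting to incident handling.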
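LAMP and LEMP are both classic stacks for building dynamic websites and web applications. They differ in the web server: LAMP uses Apache, while LEMP uses Nginx.
The core difference between Nginx and Apache lies in how they handle concurrent connections. Nginx uses an event-driven, asynchronous architecture, which lets it serve more concurrent connections on the same hardware. Apache, in its traditional prefork and worker modes, uses a process- or thread-based model in which each connection is handled by its own process or thread, so it consumes more resources under high concurrency.
In a modern production environment, the choice of each component is a trade-off driven by the specific business requirements: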
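Web server: Apache's .htaccess per-directory configuration and rich module ecosystem suit scenarios that need a high degree of configurability; Nginx is the better fit for high concurrency, static content, and reverse proxying.
PHP handler: with Apache, PHP can run through the mod_php module; with Nginx, PHP-FPM (the FastCGI Process Manager) has to be configured, and the latter has the edge in resource isolation and performance tuning.
In modern LAMP/LEMP automation, Infrastructure as Code is the core idea: infrastructure configuration is defined and managed in the form of code.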
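Ansible is an agentless automation tool that manages remote systems over SSH, so no client software has to be installed on the managed hosts.
A typical Ansible automation setup consists of a control node, an inventory of managed hosts, playbooks, roles, and the modules they call.
A well-structured Ansible project directory is essential for maintaining complex automation. The following layout is recommended: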
lamp-automation/
├── inventories/                  # environment inventories
│   ├── production/               # production environment
│   │   ├── hosts                 # host definitions
│   │   └── group_vars/           # group variables
│   │       ├── all.yml           # global variables
│   │       ├── webservers.yml    # web server group variables
│   │       └── dbservers.yml     # database group variables
│   └── staging/                  # staging environment
├── roles/
│   ├── common/                   # base configuration role
│   │   ├── tasks/
│   │   ├── handlers/
│   │   ├── templates/
│   │   └── vars/
│   ├── apache/                   # Apache role
│   ├── nginx/                    # Nginx role
│   ├── php/                      # PHP role
│   ├── mysql/                    # MySQL role
│   ├── composer/                 # PHP dependency management
│   └── deploy/                   # application deployment role
├── site.yml                      # main playbook
├── webservers.yml                # playbook for web servers only
├── dbservers.yml                 # playbook for database servers only
└── requirements.yml              # role dependency declarations
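The layout above can be bootstrapped with a few lines of shell. This is a minimal sketch, not part of the original project: the function name `scaffold_project` is made up for illustration, and it assumes every role gets the same four subdirectories that the tree shows for `common`.

```shell
#!/bin/sh
# Create the skeleton of the Ansible project under the given base directory.
# scaffold_project is a hypothetical helper name used only in this sketch.
scaffold_project() {
  base="${1:?usage: scaffold_project <base-dir>}"

  # One inventory per environment, each with a hosts file and group_vars/.
  for env in production staging; do
    mkdir -p "$base/inventories/$env/group_vars"
    touch "$base/inventories/$env/hosts"
  done

  # Assumption: every role uses the same subdirectories as the common role.
  for role in common apache nginx php mysql composer deploy; do
    for sub in tasks handlers templates vars; do
      mkdir -p "$base/roles/$role/$sub"
    done
  done

  # Top-level playbooks and the role dependency file.
  touch "$base/site.yml" "$base/webservers.yml" \
        "$base/dbservers.yml" "$base/requirements.yml"
}
```

Calling `scaffold_project ./lamp-automation` is idempotent: `mkdir -p` and `touch` leave existing files and directories in place.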
Ansible roles let you modularize the individual components of the LAMP/LEMP stack, which separates concerns and makes the code reusable. Each role is responsible for one service or function, such as the web server, the database, or the PHP configuration.
The common role handles the configuration shared by every server and is the foundation for the other roles:
# roles/common/tasks/main.yml
- name: Update package cache
  apt:
    update_cache: yes
    cache_valid_time: 3600
  when: ansible_os_family == "Debian"

- name: Install base packages
  package:
    name: "{{ base_packages }}"
    state: present

- name: Configure timezone
  timezone:
    name: "{{ timezone }}"

- name: Configure hostname
  hostname:
    name: "{{ inventory_hostname }}"

- name: Configure sysctl parameters
  sysctl:
    name: "{{ item.name }}"
    value: "{{ item.value }}"
    state: present
    reload: yes
  loop: "{{ sysctl_parameters }}"

- name: Configure limits.conf
  template:
    src: limits.conf.j2
    dest: /etc/security/limits.conf
  notify: reboot required

- name: Create admin user
  user:
    name: "{{ admin_user }}"
    groups: sudo        # supplementary group; the primary group is left alone
    append: yes
    shell: /bin/bash
    generate_ssh_key: yes
    ssh_key_bits: 2048
    ssh_key_file: .ssh/id_rsa

- name: Configure sudoers
  copy:
    src: sudoers
    dest: "/etc/sudoers.d/{{ admin_user }}"
    mode: "0440"
    validate: /usr/sbin/visudo -cf %s   # refuse to install a broken sudoers file
The Apache role installs and configures the Apache HTTP server:
# roles/apache/tasks/main.yml
- name: Install Apache
  package:
    name: "{{ apache_package }}"
    state: present

- name: Create Apache log directory
  file:
    path: "{{ apache_log_dir }}"
    state: directory
    owner: root
    group: root
    mode: "0755"

- name: Configure Apache modules
  template:
    src: modules.conf.j2
    dest: "{{ apache_mods_enabled_dir }}/modules.conf"
  notify: restart apache

- name: Enable Apache modules
  apache2_module:
    name: "{{ item }}"
    state: present
  loop: "{{ apache_modules }}"
  notify: restart apache

- name: Configure virtual host
  template:
    src: vhost.conf.j2
    dest: "{{ apache_sites_available_dir }}/{{ apache_vhost_filename }}"
  notify: restart apache

- name: Enable site
  file:
    src: "{{ apache_sites_available_dir }}/{{ apache_vhost_filename }}"
    dest: "{{ apache_sites_enabled_dir }}/{{ apache_vhost_filename }}"
    state: link
  notify: restart apache

- name: Configure security settings
  template:
    src: security.conf.j2
    dest: "{{ apache_conf_dir }}/conf-available/security.conf"
  notify: restart apache

- name: Ensure Apache is started and enabled
  service:
    name: "{{ apache_service }}"
    state: started
    enabled: yes
The matching variables file defines the component-specific settings:
# roles/apache/vars/main.yml
apache_package: apache2
apache_service: apache2
apache_user: www-data
apache_group: www-data
apache_log_dir: /var/log/apache2
apache_conf_dir: /etc/apache2
apache_mods_available_dir: /etc/apache2/mods-available
apache_mods_enabled_dir: /etc/apache2/mods-enabled
apache_sites_available_dir: /etc/apache2/sites-available
apache_sites_enabled_dir: /etc/apache2/sites-enabled
apache_vhost_filename: "{{ domain_name | default('default') }}.conf"
apache_modules:
- rewrite
- ssl
- headers
apache_keepalive: "On"  # quoted: a bare On is parsed by YAML as the boolean true
apache_keepalive_timeout: 5
apache_max_keepalive_requests: 100
For the LEMP stack, the Nginx role provides the equivalent functionality for the Nginx server:
# roles/nginx/tasks/main.yml
- name: Install Nginx
  package:
    name: nginx
    state: present

- name: Create Nginx log directory
  file:
    path: /var/log/nginx
    state: directory
    owner: root
    group: root
    mode: "0755"

- name: Configure nginx.conf
  template:
    src: nginx.conf.j2
    dest: /etc/nginx/nginx.conf
  notify: restart nginx

- name: Configure virtual host
  template:
    src: vhost.conf.j2
    dest: "/etc/nginx/sites-available/{{ domain_name | default('default') }}.conf"
  notify: reload nginx

- name: Enable site
  file:
    src: "/etc/nginx/sites-available/{{ domain_name | default('default') }}.conf"
    dest: "/etc/nginx/sites-enabled/{{ domain_name | default('default') }}.conf"
    state: link
  notify: reload nginx

- name: Ensure Nginx is started and enabled
  service:
    name: nginx
    state: started
    enabled: yes
The PHP-FPM role installs and configures PHP. It serves the LEMP stack and can also be paired with Apache:
# roles/php/tasks/main.yml
- name: Install PHP and extensions
  package:
    name: "{{ php_packages }}"   # pass the whole list so it installs in one transaction
    state: present

- name: Configure PHP-FPM pool
  template:
    src: www.conf.j2
    dest: "{{ php_fpm_pool_dir }}/www.conf"
  notify: reload php-fpm

- name: Configure php.ini
  template:
    src: php.ini.j2
    dest: "{{ php_ini_path }}"
  notify: reload php-fpm

- name: Configure PHP-FPM main config
  template:
    src: php-fpm.conf.j2
    dest: "{{ php_fpm_conf_path }}"
  notify: reload php-fpm

- name: Ensure PHP-FPM is started and enabled
  service:
    name: "{{ php_fpm_service }}"
    state: started
    enabled: yes
The corresponding variables file:
# roles/php/vars/main.yml
php_packages:
- php
- php-fpm
- php-cli
- php-mysql
- php-curl
- php-gd
- php-mbstring
- php-xml
- php-zip
- php-json
- php-bcmath
php_fpm_service: php7.4-fpm
php_ini_path: /etc/php/7.4/fpm/php.ini
php_fpm_conf_path: /etc/php/7.4/fpm/php-fpm.conf
php_fpm_pool_dir: /etc/php/7.4/fpm/pool.d
php_memory_limit: 256M
php_max_execution_time: 120
php_upload_max_filesize: 64M
php_post_max_size: 64M
php_date_timezone: "Asia/Shanghai"
The MySQL role installs and configures the database server:
# roles/mysql/tasks/main.yml
- name: Install MySQL server
  package:
    name: "{{ mysql_packages }}"
    state: present

- name: Ensure MySQL is started and enabled
  service:
    name: mysql
    state: started
    enabled: yes

- name: Update MySQL root password
  mysql_user:
    login_user: root
    login_password: "{{ mysql_root_password }}"
    check_implicit_admin: yes   # first run: try root with no password before the supplied one
    name: root
    password: "{{ mysql_root_password }}"
    host: "{{ item }}"
  loop:
    - 127.0.0.1
    - "::1"
    - localhost
  when: mysql_root_password is defined
  no_log: true

- name: Remove anonymous MySQL users
  mysql_user:
    login_user: root
    login_password: "{{ mysql_root_password }}"
    name: ""
    host: "{{ item }}"
    state: absent
  loop:
    - localhost
    - "{{ ansible_hostname }}"

- name: Create application database
  mysql_db:
    login_user: root
    login_password: "{{ mysql_root_password }}"
    name: "{{ mysql_database_name }}"
    state: present

- name: Create application database user
  mysql_user:
    login_user: root
    login_password: "{{ mysql_root_password }}"
    name: "{{ mysql_database_user }}"
    password: "{{ mysql_database_password }}"
    host: "%"
    priv: "{{ mysql_database_name }}.*:ALL"
    state: present
  no_log: true

- name: Configure my.cnf
  template:
    src: my.cnf.j2
    dest: /etc/mysql/my.cnf
  notify: restart mysql
The main playbook ties the roles together and runs them in a logical order:
# site.yml
- name: Configure all servers with base configuration
  hosts: all
  become: yes
  roles:
    - common

- name: Configure database servers
  hosts: dbservers
  become: yes
  roles:
    - mysql
  environment:
    MYSQL_ROOT_PASSWORD: "{{ mysql_root_password }}"

- name: Configure web servers
  hosts: webservers
  become: yes
  roles:
    - role: apache
      when: web_server_type == 'apache'
    - role: nginx
      when: web_server_type == 'nginx'
    - php
    - deploy

- name: Configure load balancers
  hosts: loadbalancers
  become: yes
  roles:
    - haproxy
    - keepalived
Group variables and host variables provide the per-environment differences in configuration:
# inventories/production/group_vars/all.yml
# Global settings
timezone: Asia/Shanghai
admin_user: deploy
mysql_root_password: "{{ vault_mysql_root_password }}"
# Network settings
domain_name: example.com
# Service versions
php_version: "7.4"
mysql_version: "8.0"

# inventories/production/group_vars/webservers.yml
# Web server group settings
web_server_type: nginx  # or apache
# Resource limits
php_memory_limit: 512M
php_max_execution_time: 180
# Monitoring settings
enable_monitoring: yes
monitoring_agent: zabbix

# inventories/production/group_vars/dbservers.yml
# Database server group settings
mysql_bind_address: 0.0.0.0
mysql_max_connections: 500
mysql_buffer_pool_size: "2G"
mysql_innodb_log_file_size: "512M"
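Use Ansible Vault to protect sensitive values:
# Create the encrypted file. It lives in group_vars/all/ so that it is loaded for
# every host; a file placed directly at group_vars/vault.yml would only apply to
# a group named "vault".
ansible-vault create inventories/production/group_vars/all/vault.yml
# Edit the encrypted file
ansible-vault edit inventories/production/group_vars/all/vault.yml
# Contents of vault.yml (placeholder values)
vault_mysql_root_password: "SecurePassword123!"
vault_mysql_database_password: "AppUserPassword456!"
vault_ssl_certificate_key: |
  -----BEGIN PRIVATE KEY-----
  ...
  -----END PRIVATE KEY-----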
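A cheap safeguard is to check, before committing, that every vault file really is encrypted. The sketch below relies only on the fact that a file encrypted by `ansible-vault` begins with a `$ANSIBLE_VAULT;` header line. The function names, and the convention that secrets live in files called `vault.yml`, are assumptions made for this example.

```shell
#!/bin/sh
# Return 0 if the file's first line is an Ansible Vault header, 1 otherwise.
is_vault_encrypted() {
  first_line=""
  # read fails on a missing final newline, so also accept a non-empty line.
  IFS= read -r first_line < "$1" || [ -n "$first_line" ] || return 1
  case "$first_line" in
    '$ANSIBLE_VAULT;'*) return 0 ;;
    *) return 1 ;;
  esac
}

# Print every vault.yml below the given directory that is NOT encrypted.
# The exit status is non-zero if at least one plaintext file was found.
find_plaintext_vaults() {
  find "${1:-inventories}" -type f -name 'vault.yml' | (
    status=0
    while IFS= read -r file; do
      if ! is_vault_encrypted "$file"; then
        echo "$file"
        status=1
      fi
    done
    exit "$status"
  )
}
```

Run from a Git pre-commit hook or an early CI stage, `find_plaintext_vaults inventories` fails the check as soon as someone saves a decrypted vault file by mistake.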
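Automated operations for a modern LAMP/LEMP stack depend on a continuous integration and continuous deployment (CI/CD) pipeline. In the pipeline below the stages are: checkout, code quality checks, unit tests, frontend build, image build, deployment to staging, and, after manual approval, deployment to production.
Jenkins, a popular CI/CD tool, can define the entire build-and-deploy flow with Pipeline-as-Code: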
// Jenkinsfile
pipeline {
agent any
environment {
REGISTRY = "registry.example.com"
IMAGE_PREFIX = "lamp-app"
DEPLOY_ENV = "${params.DEPLOY_ENV}"
ANSIBLE_VAULT_PASSWORD = credentials('ansible-vault-password')
}
parameters {
choice(
name: 'DEPLOY_ENV',
choices: ['staging', 'production'],
description: 'Target deployment environment'
)
}
stages {
stage('Checkout') {
steps {
checkout scm
}
}
stage('Code Quality') {
steps {
sh 'composer validate --no-check-all'
sh 'phpcs --standard=PSR2 src/'
}
}
stage('Unit Tests') {
steps {
sh 'phpunit --coverage-text --colors=never'
}
post {
always {
junit 'build/logs/junit.xml'
}
}
}
stage('Build Frontend') {
steps {
sh 'npm install'
sh 'npm run production'
}
}
stage('Build Docker Image') {
when {
expression {
return DEPLOY_ENV == 'staging' || DEPLOY_ENV == 'production'
}
}
steps {
script {
docker.build("${REGISTRY}/${IMAGE_PREFIX}:${env.BUILD_TAG}")
}
}
}
stage('Deploy to Staging') {
when {
branch 'develop'
environment name: 'DEPLOY_ENV', value: 'staging'
}
steps {
withCredentials([file(credentialsId: 'staging-ssh-key', variable: 'SSH_KEY')]) {
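// Single quotes: the shell, not Groovy, expands the secrets, so they are not interpolated into the script text.
// Assumes 'ansible-vault-password' is a "Secret file" credential, so the variable holds a file path.
sh '''
export ANSIBLE_HOST_KEY_CHECKING=False
ansible-playbook -i inventories/staging/hosts \
--private-key="$SSH_KEY" \
--vault-password-file="$ANSIBLE_VAULT_PASSWORD" \
site.yml
'''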
}
}
}
stage('Deploy to Production') {
when {
branch 'main'
environment name: 'DEPLOY_ENV', value: 'production'
}
steps {
input message: 'Deploy to production?', ok: 'Confirm'
withCredentials([file(credentialsId: 'production-ssh-key', variable: 'SSH_KEY')]) {
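// Single quotes: the shell, not Groovy, expands the secrets.
// Note: --tags deployment runs only tasks tagged "deployment"; the role tasks must carry that tag.
sh '''
export ANSIBLE_HOST_KEY_CHECKING=False
ansible-playbook -i inventories/production/hosts \
--private-key="$SSH_KEY" \
--vault-password-file="$ANSIBLE_VAULT_PASSWORD" \
--tags deployment \
site.yml
'''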
}
}
}
}
post {
always {
cleanWs()
}
success {
emailext (
subject: "SUCCESS: Job '${env.JOB_NAME} [${env.BUILD_NUMBER}]'",
body: "Good news! The build ${env.BUILD_URL} completed successfully.",
to: "${env.CHANGE_AUTHOR_EMAIL}"
)
}
failure {
emailext (
subject: "FAILED: Job '${env.JOB_NAME} [${env.BUILD_NUMBER}]'",
body: "Bad news! The build ${env.BUILD_URL} failed. Please investigate.",
to: "${env.CHANGE_AUTHOR_EMAIL}"
)
}
}
}
Ansible can release each version into its own directory and then switch a symlink to it, a pattern close in spirit to blue-green deployment that reduces release risk:
# roles/deploy/tasks/main.yml
# deploy_timestamp and old_releases are expected to be defined outside this file.
- name: Get current deployment version
  stat:
    path: /var/www/current
  register: current_symlink

- name: Determine deployment type
  set_fact:
    deployment_type: "{{ 'update' if current_symlink.stat.exists else 'initial' }}"

- name: Create deployment directory
  file:
    path: "/var/www/releases/{{ deploy_timestamp }}"
    state: directory
    owner: "{{ deploy_user }}"
    group: "{{ deploy_group }}"
    mode: "0755"

- name: Sync application code
  synchronize:
    src: "{{ playbook_dir }}/../application/"
    dest: "/var/www/releases/{{ deploy_timestamp }}"
    rsync_opts:
      - "--exclude=node_modules"
      - "--exclude=*.git*"
      - "--exclude=.env"

- name: Install Composer dependencies
  composer:
    command: install
    working_dir: "/var/www/releases/{{ deploy_timestamp }}"
    no_dev: yes
    optimize_autoloader: yes

- name: Run database migrations
  command: php artisan migrate --force
  args:
    chdir: "/var/www/releases/{{ deploy_timestamp }}"
  when: deployment_type == 'update'

- name: Update symlink to new release
  file:
    src: "/var/www/releases/{{ deploy_timestamp }}"
    dest: /var/www/current
    state: link
    force: yes

- name: Flush OPcache
  uri:
    url: "http://localhost/opcache-flush.php"
    status_code: 200
  ignore_errors: yes

- name: Clean up old releases
  file:
    path: "/var/www/releases/{{ item }}"
    state: absent
  loop: "{{ old_releases }}"
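The deploy tasks rely on two values they never compute: a release name (`deploy_timestamp`) and the list of releases to delete (`old_releases`). One way to derive both is sketched below. The function names and the default of keeping five releases are assumptions, and the listing only works if release directories are named so that they sort chronologically.

```shell
#!/bin/sh
# Print a sortable release name such as 20240131120000.
new_release_name() {
  date +%Y%m%d%H%M%S
}

# list_old_releases <releases-dir> [keep]
# Print the names of all releases except the newest <keep> (default 5),
# oldest first, one per line. Prints nothing if there is nothing to remove.
list_old_releases() {
  releases_dir="$1"
  keep="${2:-5}"
  total=$(( $(ls -1 "$releases_dir" | wc -l) ))
  if [ "$total" -gt "$keep" ]; then
    ls -1 "$releases_dir" | sort | head -n "$(( total - keep ))"
  fi
}
```

In a playbook, the output of `list_old_releases /var/www/releases` could be captured with `register` and its `stdout_lines` passed to the clean-up task as `old_releases`.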
Handle database schema changes safely:
# roles/db-migration/tasks/main.yml
- name: Check if migration is needed
  # --pretend prints the SQL that would run without executing it
  command: php artisan migrate --pretend --force
  args:
    chdir: "/var/www/current"
  register: migration_status
  changed_when: false

- name: Check for pending migrations
  set_fact:
    pending_migrations: "{{ 'Nothing to migrate' not in migration_status.stdout }}"

- name: Create database backup before migration
  # shell, not command: the pipe and redirect need a shell
  shell: >
    set -o pipefail &&
    mysqldump -u {{ mysql_database_user }} {{ mysql_database_name }}
    | gzip > /backup/{{ mysql_database_name }}_{{ ansible_date_time.epoch }}.sql.gz
  args:
    executable: /bin/bash
  environment:
    MYSQL_PWD: "{{ mysql_database_password }}"   # keeps the password off the command line
  when: pending_migrations | bool

- name: Run database migrations
  command: php artisan migrate --force
  args:
    chdir: "/var/www/current"
  when: pending_migrations | bool
  register: migration_result
  ignore_errors: yes   # let the notification below run; the final task fails the play

- name: Notify migration result
  slack:
    token: "{{ slack_token }}"
    msg: "Database migration {{ 'failed' if migration_result is failed else 'succeeded' }} on {{ inventory_hostname }}"
  when: migration_result is not skipped

- name: Fail the play if the migration failed
  fail:
    msg: "Database migration failed on {{ inventory_hostname }}"
  when: migration_result is failed
Integrate comprehensive automated testing into the CI/CD pipeline:
# roles/tests/tasks/main.yml
- name: Install testing dependencies
  composer:
    command: require
    arguments: "--dev phpunit/phpunit:^9.0 codeception/codeception:^4.0"
    working_dir: "/var/www/current"

- name: Run unit tests
  command: vendor/bin/phpunit
  args:
    chdir: "/var/www/current"
  register: unit_test_result

- name: Run feature tests
  command: vendor/bin/codecept run
  args:
    chdir: "/var/www/current"
  register: feature_test_result

- name: Run security check
  # composer audit ships with Composer 2.4+; it stands in for the security-checker
  # binary, which the dependencies installed above do not provide
  command: composer audit
  args:
    chdir: "/var/www/current"
  register: security_check

- name: Generate code coverage report
  command: vendor/bin/phpunit --coverage-html reports/coverage
  args:
    chdir: "/var/www/current"
  when: unit_test_result.rc == 0
A complete monitoring setup should cover three layers: infrastructure, application, and business logic.
Deploy and configure Zabbix monitoring with Ansible:
# roles/monitoring/tasks/main.yml
- name: Install Zabbix agent
  package:
    name: zabbix-agent
    state: present

- name: Configure Zabbix agent
  template:
    src: zabbix_agentd.conf.j2
    dest: /etc/zabbix/zabbix_agentd.conf
  notify: restart zabbix-agent

- name: Enable Zabbix agent
  service:
    name: zabbix-agent
    state: started
    enabled: yes

- name: Add custom monitoring scripts
  copy:
    src: "{{ item }}"
    dest: /etc/zabbix/scripts/
    mode: "0755"
  with_fileglob:
    - "monitoring/scripts/*.sh"

- name: Configure user parameters
  copy:
    src: user_parameters.conf
    dest: /etc/zabbix/zabbix_agentd.d/user_parameters.conf
  notify: restart zabbix-agent
An example of a custom monitoring script:
#!/bin/bash
# monitoring/scripts/check_mysql_connections.sh
# Monitors MySQL connection counts
MYSQL_USER="monitor"
MYSQL_PASS="monitor_pass"
MYSQL_HOST="localhost"
case $1 in
max_used)
mysql -h$MYSQL_HOST -u$MYSQL_USER -p$MYSQL_PASS -N -e "SHOW GLOBAL STATUS LIKE 'Max_used_connections'" 2>/dev/null | awk '{print $2}'
;;
threads_connected)
mysql -h$MYSQL_HOST -u$MYSQL_USER -p$MYSQL_PASS -N -e "SHOW GLOBAL STATUS LIKE 'Threads_connected'" 2>/dev/null | awk '{print $2}'
;;
connection_errors)
mysql -h$MYSQL_HOST -u$MYSQL_USER -p$MYSQL_PASS -N -e "SHOW GLOBAL STATUS LIKE 'Connection_errors%'" 2>/dev/null | awk '{sum+=$2} END {print sum}'
;;
*)
echo "Invalid parameter"
exit 1
;;
esac
Deploy the ELK Stack for centralized log management:
# roles/elk/tasks/main.yml
- name: Install Java
  package:
    name: openjdk-11-jdk
    state: present

# apt_repository has no key parameter, so the signing key is fetched separately
- name: Download Elasticsearch signing key
  get_url:
    url: https://artifacts.elastic.co/GPG-KEY-elasticsearch
    dest: /usr/share/keyrings/elasticsearch.asc
    mode: "0644"

- name: Add Elasticsearch repository
  apt_repository:
    repo: "deb [signed-by=/usr/share/keyrings/elasticsearch.asc] https://artifacts.elastic.co/packages/7.x/apt stable main"
    state: present
    filename: elasticsearch

- name: Install Elasticsearch
  package:
    name: elasticsearch
    state: present

- name: Configure Elasticsearch
  template:
    src: elasticsearch.yml.j2
    dest: /etc/elasticsearch/elasticsearch.yml
  notify: restart elasticsearch

- name: Install Logstash
  package:
    name: logstash
    state: present

- name: Configure Logstash pipeline
  template:
    src: logstash.conf.j2
    dest: /etc/logstash/conf.d/lamp.conf
  notify: restart logstash

- name: Install Kibana
  package:
    name: kibana
    state: present

- name: Configure Kibana
  template:
    src: kibana.yml.j2
    dest: /etc/kibana/kibana.yml
  notify: restart kibana

- name: Configure Filebeat on application servers
  include_tasks: filebeat.yml
  when: "'webservers' in group_names or 'dbservers' in group_names"
Integrate an application performance monitoring (APM) tool:
# roles/apm/tasks/main.yml
- name: Install New Relic infrastructure agent
  package:
    name: newrelic-infra
    state: present

- name: Configure New Relic license
  copy:
    content: "license_key: {{ newrelic_license_key }}"
    dest: /etc/newrelic-infra.yml
    mode: "0600"
  notify: restart newrelic-infra

- name: Install New Relic PHP agent
  # Ansible has no php_extension module; the agent is installed as a package.
  # Assumes the New Relic package repository is already configured.
  package:
    name: newrelic-php5
    state: present

- name: Configure New Relic PHP agent
  template:
    src: newrelic.ini.j2
    dest: /etc/php/7.4/mods-available/newrelic.ini
  notify: reload php-fpm
Automate the security baseline:
# roles/security/tasks/main.yml
- name: Apply security patches
  apt:
    upgrade: dist
    update_cache: yes
    cache_valid_time: 3600
  when: ansible_os_family == "Debian"

- name: Allow SSH through firewall
  ufw:
    rule: allow
    name: OpenSSH

- name: Allow HTTP through firewall
  ufw:
    rule: allow
    port: "80"
    proto: tcp

- name: Allow HTTPS through firewall
  ufw:
    rule: allow
    port: "443"
    proto: tcp

# Enabled last, so the SSH rule exists before default-deny takes effect
- name: Configure UFW firewall
  ufw:
    state: enabled
    policy: deny
    direction: incoming

- name: Configure SSH hardening
  lineinfile:
    path: /etc/ssh/sshd_config
    regexp: "{{ item.regexp }}"
    line: "{{ item.line }}"
    state: present
    validate: /usr/sbin/sshd -t -f %s   # reject a config that sshd cannot parse
  loop:
    - { regexp: '^#?PermitRootLogin', line: 'PermitRootLogin no' }
    - { regexp: '^#?PasswordAuthentication', line: 'PasswordAuthentication no' }
    - { regexp: '^#?ChallengeResponseAuthentication', line: 'ChallengeResponseAuthentication no' }
    - { regexp: '^#?UsePAM', line: 'UsePAM no' }
    - { regexp: '^#?MaxAuthTries', line: 'MaxAuthTries 3' }
  notify: restart ssh

- name: Configure fail2ban
  package:
    name: fail2ban
    state: present

- name: Configure fail2ban jail.local
  template:
    src: jail.local.j2
    dest: /etc/fail2ban/jail.local
  notify: restart fail2ban

- name: Configure system auditd
  package:
    name: auditd
    state: present

- name: Configure audit rules
  copy:
    src: audit.rules
    dest: /etc/audit/rules.d/audit.rules
  notify: restart auditd
Harden the web application layer:
# roles/security/tasks/web-security.yml
- name: Configure security headers for Apache
  template:
    src: security.conf.j2
    dest: "{{ apache_conf_dir }}/conf-available/security.conf"
  when: web_server_type == 'apache'
  notify: restart apache

- name: Configure security headers for Nginx
  template:
    src: security-headers.conf.j2
    dest: /etc/nginx/conf.d/security-headers.conf
  when: web_server_type == 'nginx'
  notify: reload nginx

- name: Configure PHP security settings
  lineinfile:
    path: "{{ php_ini_path }}"
    regexp: "{{ item.regexp }}"
    line: "{{ item.line }}"
    state: present
  loop:
    - { regexp: '^expose_php', line: 'expose_php = Off' }
    - { regexp: '^allow_url_fopen', line: 'allow_url_fopen = Off' }
    - { regexp: '^allow_url_include', line: 'allow_url_include = Off' }
    - { regexp: '^disable_functions', line: 'disable_functions = exec,passthru,shell_exec,system,proc_open,popen,show_source' }
    - { regexp: '^open_basedir', line: 'open_basedir = /var/www/:/tmp/' }
  notify: reload php-fpm

- name: Install and configure ModSecurity
  package:
    # Jinja's inline conditional is "if ... else", not "when ... else"
    name: "{{ 'modsecurity-crs' if web_server_type == 'apache' else 'modsecurity-nginx' }}"
    state: present
  when: enable_waf | bool

- name: Configure ModSecurity rules
  template:
    src: modsecurity.conf.j2
    dest: /etc/modsecurity/modsecurity.conf
  when: enable_waf | bool
  notify: "{{ 'restart apache' if web_server_type == 'apache' else 'reload nginx' }}"
Automate compliance checks and vulnerability scanning:
# roles/compliance/tasks/main.yml
- name: Install Lynis for system auditing
  package:
    name: lynis
    state: present

- name: Run Lynis system audit
  command: lynis audit system
  register: lynis_report
  changed_when: false

- name: Extract Lynis warnings
  set_fact:
    lynis_warnings: "{{ lynis_report.stdout | regex_findall('Warning\\s*\\(\\s*[0-9]+\\s*\\)\\s*\\:[^\\n]+') }}"

- name: Report Lynis results
  debug:
    msg: "Lynis found {{ lynis_warnings | length }} warnings that need attention"
  when: lynis_warnings | length > 0

- name: Install ClamAV
  package:
    name: clamav
    state: present

- name: Update ClamAV definitions
  command: freshclam
  changed_when: false

- name: Run malware scan
  command: clamscan --recursive --infected /var/www/
  register: clamscan_result
  changed_when: false
  # clamscan exits 0 when clean, 1 when a virus is found, 2 on error;
  # only a real error should fail here so that the report task below can run
  failed_when: clamscan_result.rc > 1

- name: Report malware scan results
  fail:
    msg: "Malware detected! {{ clamscan_result.stdout }}"
  when: clamscan_result.rc == 1
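Outside Ansible, for example in a cron job, the same scan result can be interpreted from `clamscan`'s exit code and its summary line. The helper names below are made up for this sketch, which assumes only the "Infected files: N" summary line and the 0/1/2 exit codes that `clamscan` documents.

```shell
#!/bin/sh
# Read clamscan output on stdin and print the number from the
# "Infected files: N" summary line (nothing if the line is absent).
infected_count() {
  sed -n 's/^Infected files:[[:space:]]*\([0-9][0-9]*\).*$/\1/p' | tail -n 1
}

# scan_and_report <path>: run clamscan and print a one-line verdict.
# Returns clamscan's own exit code: 0 clean, 1 infected, 2 scan error.
scan_and_report() {
  rc=0
  output=$(clamscan --recursive --infected "$1" 2>&1) || rc=$?
  case "$rc" in
    0) echo "clean" ;;
    1) echo "infected files: $(printf '%s\n' "$output" | infected_count)" ;;
    *) echo "scan error (exit code $rc)" ;;
  esac
  return "$rc"
}
```

Keying the decision on the exit code rather than on string matching avoids treating a failed scan as a clean one.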
Back up data across the whole stack:
# roles/backup/tasks/main.yml
- name: Create backup directory
  file:
    path: /backup
    state: directory
    owner: root
    group: root
    mode: "0755"

- name: Install and configure BorgBackup
  package:
    name: borgbackup
    state: present

- name: Configure BorgBackup repository
  command: >
    borg init --encryption=repokey {{ borg_repository }}
  environment:
    BORG_PASSPHRASE: "{{ borg_encryption_password }}"
  args:
    creates: "{{ borg_repository }}/config"

- name: Create BorgBackup script
  template:
    src: borg-backup.sh.j2
    dest: /usr/local/bin/borg-backup.sh
    mode: "0755"

- name: Configure backup cron job
  cron:
    name: "Automated BorgBackup"
    minute: "0"
    hour: "2"
    job: "/usr/local/bin/borg-backup.sh > /var/log/borg-backup.log 2>&1"

- name: Configure database backups
  template:
    src: mysql-backup.sh.j2
    dest: /usr/local/bin/mysql-backup.sh
    mode: "0755"

- name: Schedule database backups
  cron:
    name: "MySQL daily backup"
    minute: "30"
    hour: "1"
    job: "/usr/local/bin/mysql-backup.sh > /var/log/mysql-backup.log 2>&1"

- name: Configure filesystem backups
  template:
    src: filesystem-backup.sh.j2
    dest: /usr/local/bin/filesystem-backup.sh
    mode: "0755"

- name: Verify backup repository integrity
  # An archive named after the current epoch would not exist, so the
  # repository as a whole is checked instead of dry-running one archive
  command: >
    borg check {{ borg_repository }}
  environment:
    BORG_PASSPHRASE: "{{ borg_encryption_password }}"
  changed_when: false
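The role installs `borg-backup.sh` from a template that the article does not show. A minimal sketch of what such a script might contain follows. The backed-up paths, the retention counts, and the archive naming scheme are all assumptions, and the script expects `BORG_REPO` and `BORG_PASSPHRASE` to be provided through the environment.

```shell
#!/bin/sh
# Hypothetical body for borg-backup.sh; adjust paths and retention to taste.

# Archive names combine the host name and a timestamp, e.g. web01-2024-01-31T020000.
archive_name() {
  printf '%s-%s\n' "$(uname -n)" "$(date +%Y-%m-%dT%H%M%S)"
}

# Create a new archive, then thin out old ones.
run_backup() {
  repo="${BORG_REPO:?BORG_REPO must be set}"
  # Assumed backup set: system configuration and the web root.
  borg create --stats --compression lz4 "${repo}::$(archive_name)" /etc /var/www &&
    borg prune --keep-daily 7 --keep-weekly 4 --keep-monthly 6 "$repo"
}

# The installed script would end with a call to run_backup.
```

Pruning in the same run keeps the repository from growing without bound, and chaining with `&&` ensures nothing is pruned when the new archive could not be created.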
An automated disaster recovery playbook:
# disaster-recovery.yml
- name: Disaster Recovery - Database Restoration
  hosts: dbservers
  become: yes
  vars_files:
    - vault.yml
  tasks:
    - name: Stop application
      uri:
        url: "http://localhost/maintenance-mode/start"
        method: POST
      delegate_to: "{{ item }}"
      loop: "{{ groups.webservers }}"
      ignore_errors: yes

    - name: Identify latest backup
      command: >
        borg list --short {{ borg_repository }}
      environment:
        BORG_PASSPHRASE: "{{ borg_encryption_password }}"
      register: backup_list
      changed_when: false

    # The dump is streamed straight into mysql; a registered variable
    # cannot carry the binary gzip data safely
    - name: Restore latest database backup
      shell: >
        set -o pipefail &&
        borg extract --stdout {{ borg_repository }}::{{ backup_list.stdout_lines | last }} db.sql.gz
        | gunzip | mysql -u root
      args:
        executable: /bin/bash
      environment:
        BORG_PASSPHRASE: "{{ borg_encryption_password }}"
        MYSQL_PWD: "{{ mysql_root_password }}"
      no_log: true

    - name: Start application
      uri:
        url: "http://localhost/maintenance-mode/stop"
        method: POST
      delegate_to: "{{ item }}"
      loop: "{{ groups.webservers }}"
      ignore_errors: yes

- name: Verify service recovery
  hosts: all
  become: yes
  tasks:
    - name: Check service status
      systemd:
        name: "{{ item }}"
        state: started
      loop:
        - mysql
        - nginx
        - php7.4-fpm

    - name: Run application health check
      uri:
        url: "http://localhost/health"
        return_content: yes
      register: health_check
      until: health_check.status == 200
      retries: 10
      delay: 5
Implement rule-based automatic fault detection:
# roles/self-healing/tasks/main.yml
- name: Configure health checks
  template:
    src: health-check.sh.j2
    dest: /usr/local/bin/health-check.sh
    mode: "0755"

- name: Create healing actions directory
  file:
    path: /usr/local/bin/healing-actions
    state: directory

- name: Deploy service restart healing action
  copy:
    src: healing-actions/restart-service.sh
    dest: /usr/local/bin/healing-actions/restart-service.sh
    mode: "0755"

- name: Deploy cache clear healing action
  copy:
    src: healing-actions/clear-cache.sh
    dest: /usr/local/bin/healing-actions/clear-cache.sh
    mode: "0755"

- name: Configure self-healing cron
  cron:
    name: "Service health check and self-healing"
    minute: "*/5"
    job: "/usr/local/bin/health-check.sh > /var/log/health-check.log 2>&1"

- name: Configure log monitoring for common errors
  template:
    src: log-monitor.sh.j2
    dest: /usr/local/bin/log-monitor.sh
    mode: "0755"

- name: Schedule log monitoring
  cron:
    name: "Error log monitoring"
    minute: "*/10"
    job: "/usr/local/bin/log-monitor.sh > /var/log/log-monitor.log 2>&1"
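The cron job above runs `health-check.sh`, whose template is not shown. The sketch below illustrates one possible shape for it: probe each service and hand unhealthy ones to the healing action. The service list, the probe logic, and the variable names are assumptions. Only the path of `restart-service.sh` comes from the role above.

```shell
#!/bin/sh
# Hypothetical body for health-check.sh.
SERVICES="${SERVICES:-mysql nginx php-fpm}"
HEAL_CMD="${HEAL_CMD:-/usr/local/bin/healing-actions/restart-service.sh}"

# is_healthy <service>: the default probe; replace it per environment.
is_healthy() {
  case "$1" in
    php-fpm) pgrep -f php-fpm > /dev/null ;;
    *) systemctl is-active --quiet "$1" ;;
  esac
}

# Probe every service; call the healing action for each unhealthy one.
# The return value is the number of healing actions that failed.
run_health_checks() {
  failed=0
  for svc in $SERVICES; do
    if ! is_healthy "$svc"; then
      echo "$(date): $svc unhealthy, invoking healing action"
      "$HEAL_CMD" "$svc" || failed=$((failed + 1))
    fi
  done
  return "$failed"
}

# The installed script would end with a call to run_health_checks.
```

Keeping the probe in its own function makes it easy to swap in an HTTP check or a `mysql -e "SELECT 1"` test without touching the loop.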
A script that handles common failures automatically:
#!/bin/bash
# healing-actions/restart-service.sh
SERVICE=$1
LOG_FILE="/var/log/self-healing.log"
log_message() {
echo "$(date): $1" >> $LOG_FILE
}
case $SERVICE in
mysql)
if systemctl is-active --quiet mysql; then
if ! mysql -e "SELECT 1" > /dev/null 2>&1; then
log_message "MySQL is running but not responsive, restarting"
systemctl restart mysql
# Verify recovery
sleep 10
if mysql -e "SELECT 1" > /dev/null 2>&1; then
log_message "MySQL recovery successful"
# Send notification
[ -n "${SLACK_WEBHOOK:-}" ] && curl -X POST -H 'Content-type: application/json' \
--data "{\"text\":\"MySQL automatically recovered on $(hostname)\"}" \
"$SLACK_WEBHOOK"
else
log_message "MySQL recovery failed"
fi
fi
else
log_message "MySQL is not running, attempting to start"
systemctl start mysql
fi
;;
nginx)
if systemctl is-active --quiet nginx; then
if ! curl -f http://localhost/nginx-status > /dev/null 2>&1; then
log_message "Nginx is running but not responsive, restarting"
systemctl restart nginx
fi
else
log_message "Nginx is not running, attempting to start"
systemctl start nginx
fi
;;
php-fpm)
CURRENT_CHILDREN=$(ps aux | grep php-fpm | grep -v grep | wc -l)
if [ $CURRENT_CHILDREN -lt 2 ]; then
log_message "PHP-FPM has low child processes ($CURRENT_CHILDREN), restarting"
systemctl restart php7.4-fpm
fi
;;
*)
log_message "Unknown service: $SERVICE"
exit 1
;;
esac
A metrics-driven auto-scaling mechanism:
# roles/auto-scaling/tasks/main.yml
- name: Install monitoring agent for auto-scaling
  package:
    name: sysstat
    state: present

- name: Configure resource monitoring
  template:
    src: resource-monitor.sh.j2
    dest: /usr/local/bin/resource-monitor.sh
    mode: "0755"

- name: Schedule resource monitoring
  cron:
    name: "Resource monitoring for auto-scaling"
    minute: "*/2"
    job: "/usr/local/bin/resource-monitor.sh > /var/log/resource-monitor.log 2>&1"

- name: Configure scale-up actions
  template:
    src: scale-up.sh.j2
    dest: /usr/local/bin/scale-up.sh
    mode: "0755"

- name: Configure scale-down actions
  template:
    src: scale-down.sh.j2
    dest: /usr/local/bin/scale-down.sh
    mode: "0755"
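The decision logic inside `resource-monitor.sh` is not shown in the article. A minimal sketch, assuming CPU utilisation is the only signal, might look like this. The 80% and 20% thresholds and the function names are illustrative assumptions. `mpstat` comes from the sysstat package installed above.

```shell
#!/bin/sh
# Illustrative thresholds; tune them for the actual workload.
SCALE_UP_CPU="${SCALE_UP_CPU:-80}"
SCALE_DOWN_CPU="${SCALE_DOWN_CPU:-20}"

# Average CPU busy percentage over five one-second samples.
# The last column of mpstat's "Average" line is %idle.
cpu_busy_percent() {
  LC_ALL=C mpstat 1 5 | awk '/^Average/ { printf "%d\n", 100 - $NF }'
}

# scale_decision <cpu-percent>: print scale-up, scale-down, or hold.
scale_decision() {
  cpu="${1%.*}"   # drop any fractional part
  if [ "$cpu" -ge "$SCALE_UP_CPU" ]; then
    echo scale-up
  elif [ "$cpu" -le "$SCALE_DOWN_CPU" ]; then
    echo scale-down
  else
    echo hold
  fi
}
```

A script built on this would run `scale_decision "$(cpu_busy_percent)"` and call `scale-up.sh` or `scale-down.sh` accordingly. The wide gap between the two thresholds acts as a hold band, so a load hovering around one value does not trigger alternating scale actions.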
Performance tuning based on production experience:
# roles/performance/tasks/main.yml
- name: Configure MySQL performance tuning
  template:
    src: my.cnf.j2
    dest: /etc/mysql/my.cnf
  notify: restart mysql

- name: Configure PHP-FPM performance tuning
  template:
    src: php-fpm-pool.conf.j2
    dest: /etc/php/7.4/fpm/pool.d/www.conf
  notify: reload php-fpm

- name: Configure OPcache for PHP
  lineinfile:
    path: "{{ php_ini_path }}"
    regexp: "{{ item.regexp }}"
    line: "{{ item.line }}"
    state: present
  # Anchored on "=" so opcache.enable does not also match opcache.enable_cli;
  # the optional ";" picks up the commented-out defaults in php.ini
  loop:
    - { regexp: '^;?opcache\.enable\s*=', line: 'opcache.enable=1' }
    - { regexp: '^;?opcache\.memory_consumption\s*=', line: 'opcache.memory_consumption=256' }
    - { regexp: '^;?opcache\.max_accelerated_files\s*=', line: 'opcache.max_accelerated_files=20000' }
    - { regexp: '^;?opcache\.validate_timestamps\s*=', line: 'opcache.validate_timestamps=0' }
  notify: reload php-fpm

- name: Configure Nginx performance tuning
  template:
    src: nginx.conf.j2
    dest: /etc/nginx/nginx.conf
  notify: reload nginx

- name: Configure kernel parameters for high performance
  sysctl:
    name: "{{ item.name }}"
    value: "{{ item.value }}"
    state: present
    reload: yes
  loop:
    - { name: net.core.somaxconn, value: 65535 }
    - { name: net.ipv4.tcp_max_syn_backlog, value: 65535 }
    - { name: net.core.netdev_max_backlog, value: 65535 }
    - { name: fs.file-max, value: 2097152 }
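The PHP-FPM pool template above has to pick a value for `pm.max_children`. A common rule of thumb, not taken from the article, is to divide the memory left over for PHP by the average size of one worker process. The sketch below does that arithmetic; the function name and the sample figures are illustrative assumptions.

```shell
#!/bin/sh
# fpm_max_children <total_mb> <reserved_mb> <avg_worker_mb>
# total_mb:      RAM on the host
# reserved_mb:   RAM set aside for the OS, Nginx, MySQL, and so on
# avg_worker_mb: measured average resident size of one PHP-FPM worker
fpm_max_children() {
  echo $(( ($1 - $2) / $3 ))
}

# Example: an 8 GB host, 2 GB reserved, 64 MB per worker.
fpm_max_children 8192 2048 64
```

With these sample figures the function prints 96, since (8192 - 2048) / 64 = 96. The result is a starting point to validate under real load, not a guarantee that the host stays out of swap.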
A maturity assessment for automated operations, distilled from practical experience:
Establish a continuous improvement process for automated operations:
# roles/improvement/tasks/main.yml
- name: Collect performance metrics
  template:
    src: collect-metrics.sh.j2
    dest: /usr/local/bin/collect-metrics.sh
    mode: "0755"

- name: Schedule metrics collection
  cron:
    name: "Performance metrics collection"
    minute: "*/5"
    job: "/usr/local/bin/collect-metrics.sh > /var/log/collect-metrics.log 2>&1"

- name: Generate weekly performance report
  template:
    src: generate-report.sh.j2
    dest: /usr/local/bin/generate-report.sh
    mode: "0755"

- name: Schedule weekly reporting
  cron:
    name: "Weekly performance report"
    minute: "0"
    hour: "6"
    weekday: "1"
    job: "/usr/local/bin/generate-report.sh | mail -s 'Weekly Performance Report' admin@example.com"

- name: Configure automated improvement suggestions
  template:
    src: improvement-suggestions.sh.j2
    dest: /usr/local/bin/improvement-suggestions.sh
    mode: "0755"
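Automating the maintenance of a LAMP/LEMP stack is a systems-level undertaking that has to be considered from every angle: architecture design, tool selection, process standards, and continuous optimization. Combining Ansible's configuration management with the continuous delivery mindset of CI/CD makes it possible to build an automated operations system that is efficient, stable, and scalable.
Those three qualities, efficiency, stability, and scalability, are the core strengths of the approach described in this article.
As cloud-native technology and AIOps mature, automated operations for LAMP/LEMP stacks will move in a more intelligent and more predictive direction. Operations teams should keep track of new technology, keep refining their existing processes, and build operations systems that are smarter and more efficient.
This article is based on production experience and industry best practice; adapt and test everything against your own environment before applying it. Automated operations is a process of continuous improvement that has to keep adapting to new technical challenges.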
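Original-content notice: this article is published on the Tencent Cloud Developer Community with the author's authorization and may not be reproduced without permission.
In case of infringement, contact cloudcommunity@tencent.com to request removal.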