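One part of my application has a big problem. I am using SQLAlchemy together with MySQL, and most things work fine, but one thing takes forever to load, sometimes as long as 5-6 minutes: the customer list. The table has about 3,000 rows, which should be quite small by database standards, and I am doing a simple join against a slightly larger table (25k rows).
The query in SQLAlchemy is as follows: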
last_inv = db.session.query(Sales.id).order_by(Sales.invoice_date.desc()).filter(Customer.email == Sales.email).limit(1).correlate(Customer)
results = db.session.query(Customer, last_inv.as_scalar()).filter_by(archive=0)
The raw SQL looks like this:
SELECT customer.id AS customer_id
, customer.first_name AS customer_first_name
, customer.middle_name AS customer_middle_name
, customer.last_name AS customer_last_name
, customer.email AS customer_email
, customer.password AS customer_password
, customer.address1 AS customer_address1
, customer.address2 AS customer_address2
, customer.city AS customer_city
, customer.state AS customer_state
, customer.zip AS customer_zip
, customer.country AS customer_country
, customer.phone AS customer_phone
, customer.cell_phone AS customer_cell_phone
, customer.current_plan AS customer_current_plan
, customer.minutes_current_plan AS customer_minutes_current_plan
, customer.orig_sales_id AS customer_orig_sales_id
, customer.sales_id AS customer_sales_id
, customer.team_id AS customer_team_id
, customer.refill_date AS customer_refill_date
, customer.minutes_refill_date AS customer_minutes_refill_date
, customer.active AS customer_active
, customer.archive AS customer_archive
, customer.imported AS customer_imported
, customer.ipaddress AS customer_ipaddress
, customer.auto_renewal AS customer_auto_renewal
, customer.signup_date AS customer_signup_date
, customer.esn AS customer_esn
, customer.last_update_date AS customer_last_update_date
, customer.last_update_by AS customer_last_update_by
, customer.notes AS customer_notes
, customer.current_pin AS customer_current_pin
, customer.minutes_current_pin AS customer_minutes_current_pin
, customer.security_pin AS customer_security_pin
, (SELECT sales.id
FROM sales
WHERE customer.email = sales.email
ORDER
BY sales.invoice_date DESC LIMIT 1) AS anon_1
FROM customer
WHERE customer.team_id = 1
AND customer.archive = 0
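I have tried many things, and this is really starting to make me despair. Everything runs on Amazon, and while the query is running, htop shows mysql at 100% CPU usage. Running the query through the phpMyAdmin profiler and through HeidiSQL shows it completing in under two seconds (when there is no cache hit), so as far as I understand, the query itself is not what is causing this.
Here is what EXPLAIN shows: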
id  select_type         table     type  possible_keys  key   key_len  ref   rows   Extra
1   PRIMARY             customer  ALL   NULL           NULL  NULL     NULL  3621   Using where
2   DEPENDENT SUBQUERY  sales     ALL   NULL           NULL  NULL     NULL  22619  Using where; Using filesort
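The profiler output from phpMyAdmin is here, and a visual representation is here.
I am running an m1.small instance on EC2 with 1650MB of memory.
I also ran a mysqlprofiler; here are the results before and after the optimizations I made. My my.cnf file is here.
I tried running OPTIMIZE on the tables, but for some reason the number of unoptimized tables always stays at 98, so I guess I am doing something wrong. I used this script for that, as well as raw SQL in phpMyAdmin, with no luck.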
Answered 2014-04-21 13:27:13
Try creating this multi-column index, which should make the query faster:
CREATE INDEX sales_eml_invdat ON sales( email, invoice_date );
Or even on three columns:
CREATE INDEX sales_eml_invdat_id ON sales( email, invoice_date, id );
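but only if id is not the primary key column.
If id is the primary key, the first index is enough (with InnoDB, every secondary index already carries the primary key).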
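The EXPLAIN output in the question shows type ALL for both tables, so each of the roughly 3,621 customer rows triggers a scan and filesort over roughly 22,619 sales rows. The sketch below illustrates what the composite index changes. It is only an approximation under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in for MySQL, and a reduced, hypothetical version of the two tables, so the plan wording will differ from MySQL's EXPLAIN.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL here, and the schema is cut down
# to the columns the correlated subquery actually touches.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, email TEXT, archive INTEGER);
    CREATE TABLE sales (id INTEGER PRIMARY KEY, email TEXT, invoice_date TEXT);
""")

QUERY = """
    SELECT customer.id,
           (SELECT sales.id
              FROM sales
             WHERE customer.email = sales.email
             ORDER BY sales.invoice_date DESC
             LIMIT 1) AS last_sale_id
      FROM customer
     WHERE customer.archive = 0
"""


def plan(connection):
    """Return the human-readable steps of the query plan for QUERY."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + QUERY).fetchall()
    return [row[-1] for row in rows]  # the last column holds the detail text


plan_before = plan(conn)

# The index suggested in the answer: equality column first, sort column second.
conn.execute("CREATE INDEX sales_eml_invdat ON sales (email, invoice_date)")
plan_after = plan(conn)

print("before:", plan_before)
print("after: ", plan_after)
```

On the real server, the equivalent check would be to rerun EXPLAIN after creating the index and confirm that the sales row no longer shows type ALL with Using filesort.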
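-- Edit --
Sorry, I forgot that MySQL is not as smart as other database management systems.
It cannot detect this case on its own; it has to be told explicitly how to do it.
Please rewrite the subquery as: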
SELECT sales.id
FROM sales
WHERE customer.email = sales.email
ORDER BY sales.email DESC, sales.invoice_date DESC
LIMIT 1
This change lets MySQL use the ( email, invoice_date ) index and skip the filesort. Please give it a try.
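Adding sales.email to the ORDER BY does not change which row comes back, because the WHERE clause already pins email to a single value; it only makes the sort order match the index column order. A minimal sketch of that equivalence follows, again assuming SQLite via Python's sqlite3 as a stand-in for MySQL, with made-up sample rows and a bound parameter in place of the correlated customer.email.

```python
import sqlite3

# Assumption: hypothetical sample data; SQLite is used instead of MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (id INTEGER PRIMARY KEY, email TEXT, invoice_date TEXT);
    CREATE INDEX sales_eml_invdat ON sales (email, invoice_date);
    INSERT INTO sales (id, email, invoice_date) VALUES
        (1, 'a@example.com', '2014-01-05'),
        (2, 'a@example.com', '2014-03-17'),
        (3, 'b@example.com', '2014-04-01'),
        (4, 'a@example.com', '2014-02-11');
""")

# Subquery as written in the question (email bound as a parameter).
ORIGINAL = """
    SELECT id FROM sales
     WHERE email = ?
     ORDER BY invoice_date DESC
     LIMIT 1
"""

# Subquery as rewritten in the answer: ORDER BY follows the index columns.
REWRITTEN = """
    SELECT id FROM sales
     WHERE email = ?
     ORDER BY email DESC, invoice_date DESC
     LIMIT 1
"""


def latest_sale(sql, email):
    """Return the id of the most recent sale for email, or None if there is none."""
    row = conn.execute(sql, (email,)).fetchone()
    return row[0] if row else None


emails = ["a@example.com", "b@example.com", "nobody@example.com"]
original_ids = [latest_sale(ORIGINAL, e) for e in emails]
rewritten_ids = [latest_sale(REWRITTEN, e) for e in emails]
print(original_ids, rewritten_ids)
```

Both forms pick the same latest sale per email, so the rewrite is safe to apply purely as an optimizer hint.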
https://stackoverflow.com/questions/23188121