最近几天,我一直在开发一个非常简单的网页,我决定使用SQLAlchemy和Flask作为后端堆栈。
我试图找到一种方法来插入数据。我在一个问题中发现了session.merge
,我的同事也告诉我要使用session.bulk_update_mappings
。但是,这两种方法都比“查询所有的->费力地更新所有”慢得多。
表DDL:
create table if not exists testdb.table_datetime
(
id int auto_increment
primary key,
col1 varchar(50) null,
col2 varchar(50) null,
col3 varchar(50) null,
col4 varchar(50) null,
col5 varchar(50) null,
col6 varchar(50) null,
local_modified datetime null
)
collate=utf8mb4_bin;
我的ORM模型:
class DemoClass(Base):
__tablename__ = 'table_datetime'
local_modified = Column(DATETIME)
# __tablename__ = 'table_varchar'
# local_modified = Column(String)
id = Column(Integer, primary_key=True)
col1 = Column(String)
col2 = Column(String)
col3 = Column(String)
col4 = Column(String)
col5 = Column(String)
col6 = Column(String)
代码:
# Method 1: bulk update'
# Slow, 6~8s for 500 lines update
# with Session() as session:
# session.bulk_update_mappings(DemoClass, data)
# session.commit()
# Method 2: query all once, manully update and commit
# Fast, less than 1s
with Session() as session:
db_rows = session.query(DemoClass).all() # 500 records in all
for item in data:
for db_row in db_rows:
if item['id'] == db_row.id:
db_row.col1 = item['col1']
db_row.col2 = item['col2']
db_row.col3 = item['col3']
db_row.col4 = item['col4']
db_row.col5 = item['col5']
db_row.col6 = item['col6']
# db_row.local_modified = item['local_modified']
db_row.local_modified = datetime.strptime(item['local_modified'], '%Y-%m-%d %H:%M:%S')
break
session.commit()
# Method 3 construct ORM class and session merge each, then commit
# Slow, 6~8s for 500 lines update
# try:
# items = [DemoClass(id=d['id'], col1=d['col1'], col2=d['col2'],col3=d['col3'],
# col4=d['col4'],col5=d['col5'],col6=d['col6']
# ) for d in data]
# except KeyError as e:
# return {'status_code': 400, 'error_msg': f"update failed, JSON payload missing key: {e}"}, 400
# try:
# with Session() as session:
# for m in items:
# # print(3) # merge is very slow
# session.merge(m)
# session.commit()
# # print(4)
# except exc.SQLAlchemyError as e:
# return {'status_code': 500, 'error_msg': 'server error, please contact admin'}, 500
# # print(5)
我的测试环境:
Flask==2.0.1
PyMySQL==1.0.2
MySQL==1.0.2
SQLAlchemy==1.4.25
Python 3.6.5 Windows
MySQL remote server, Server version 5.7.33-0ubuntu0.18.04.1
update
完整的测试代码和设置可以在这里获得:https://github.com/ajfg93/sqlalchemy-demo/tree/datetime_slow
为什么session.merge
和session.bulk_update_mappings
要慢得多?从SQLAlchemy打印的回显消息中,我可以看出session.merge
发出了许多select
,我想这就是原因。但我不知道bulk_update_mappings
的原因。
顺便问一下,在SQLAlchemy中正确的插入方式是什么?我在谷歌上搜索了一下,但它把我带到了session.merge
,这让我失望了。上面的“方法2”是正确的吗?
发布于 2021-10-01 19:52:03
时间差的原因与您在另一个问题https://stackoverflow.com/a/69411298/7589014中的原因相同,即由于没有对对象进行更改,所以不会对db进行任何更改。
https://stackoverflow.com/questions/69410467
复制相似问题