我想直接从SQL语句中提取出结果表的列名:
query = """
select
sales.order_id as id,
p.product_name,
sum(p.price) as sales_volume
from sales
right join products as p
on sales.product_id=p.product_id
group by id, p.product_name;
"""
column_names = parse_sql(query)
# column_names:
# ['id', 'product_name', 'sales_volume']
知道在parse_sql()
里该做什么吗?结果函数应该能够识别别名并删除表别名/标识符(例如“sales”)。或"p.")。
提前感谢!
发布于 2022-01-17 17:35:12
我使用库平方解析做了类似的事情。基本上,这个库接受您的SQL查询并将其标记化。完成之后,您可以搜索select查询令牌并解析底层令牌。在代码中,这读起来像
import sqlparse
def find_selected_columns(query) -> list[str]:
tokens = sqlparse.parse(query)[0].tokens
found_select = False
for token in tokens:
if found_select:
if isinstance(token, sqlparse.sql.IdentifierList):
return [
col.value.split(" ")[-1].strip("`").rpartition('.')[-1]
for col in token.tokens
if isinstance(col, sqlparse.sql.Identifier)
]
else:
found_select = token.match(sqlparse.tokens.Keyword.DML, ["select", "SELECT"])
raise Exception("Could not find a select statement. Weired query :)")
此代码也适用于具有公共表表达式的查询,即它只返回最终的select列。根据您正在使用的SQL方言和引号,您可能必须修改行col.value.split(“”)-1.条带(“`”).rpartition(‘.’)-1
发布于 2022-04-30 05:58:03
尝试SQLGlot
它比sqlparse更容易出错,也更容易出错。
import sqlglot
import sqlglot.expressions as exp
query = """
select
sales.order_id as id,
p.product_name,
sum(p.price) as sales_volume
from sales
right join products as p
on sales.product_id=p.product_id
group by id, p.product_name;
"""
column_names = []
for expression in sqlglot.parse_one(query).find(exp.Select).args["expressions"]:
if isinstance(expression, exp.Alias):
column_names.append(expression.text("alias"))
elif isinstance(expression, exp.Column):
column_names.append(expression.text("this"))
print(column_names)
https://stackoverflow.com/questions/70745252
复制相似问题