Python and Elasticsearch, a Perfect Combination: elasticsearch-dsl-py in Detail

Source: Penguin Account (企鹅号) - 孙导TV

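Elasticsearch is a powerful open-source search and analytics engine, widely used for log analysis, full-text search, monitoring, and similar workloads. elasticsearch-dsl-py is the official high-level Python client library for Elasticsearch. It is built on top of elasticsearch-py (the low-level client) and provides higher-level abstractions that make working with Elasticsearch simpler and more intuitive.

「How elasticsearch-py and elasticsearch-dsl-py relate」

「elasticsearch-py:」 the official low-level Python client for Elasticsearch. It maps directly onto the Elasticsearch REST API, which gives maximum flexibility but makes it relatively verbose to use.

「elasticsearch-dsl-py:」 a high-level client built on elasticsearch-py. It offers a more Pythonic way to build and run queries and can map Elasticsearch documents to Python objects, which simplifies development considerably.

Put simply, using elasticsearch-py is close to working with the HTTP requests directly, while elasticsearch-dsl-py adds a friendlier set of tools and abstractions so that you can concentrate on business logic rather than low-level HTTP details.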
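To make the contrast concrete, here is a rough sketch of the kind of request body one assembles by hand when working at the REST level. It uses only plain Python dictionaries and the standard library. The helper name `build_match_body` is hypothetical, and the field name and the `/my-blog/_search` path are illustrative.

```python
import json


def build_match_body(field, text, size=10):
    """Hand-build the JSON body for a simple match search (hypothetical helper)."""
    return {
        "query": {"match": {field: text}},
        "size": size,  # maximum number of hits to return
    }


body = build_match_body("content", "elasticsearch")

# This is the payload that would be sent to /my-blog/_search.
print(json.dumps(body, indent=2))
```

With the high-level library, the same intent is written as `Search().query("match", content="elasticsearch")`, and the library builds the query dictionary for you.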

「Installing elasticsearch-dsl-py」

elasticsearch-dsl-py can be installed with pip:

pip install elasticsearch-dsl

This also installs elasticsearch-py automatically as a dependency.

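「Core concepts」

「Document:」 maps an Elasticsearch document to a Python class, which makes the data easier to work with and manage.

「Search:」 the object used to build and run search requests, with rich support for the query DSL (Domain Specific Language).

「Query:」 represents the various query types, such as match, term, and bool.

「Filter:」 a condition that narrows the search results. Filters are applied through Search.filter() and run in filter context, so they include or exclude documents without affecting relevance scores.

「Aggregation:」 computes aggregate analytics over the search results, such as statistics and grouping.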
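Aggregation is the one concept in this list that the examples in this article do not exercise, so a small sketch may help. Assuming a keyword field named `tags` (as in the Article class defined in Example 1), the request body for counting documents per tag looks roughly like the dictionary built below. The aggregation name `per_tag` is an arbitrary label, both helper functions are hypothetical, and the sample response is hand-written for illustration rather than taken from a real cluster. In elasticsearch-dsl the same aggregation is attached with `s.aggs.bucket("per_tag", "terms", field="tags")`.

```python
def terms_agg_body(name, field, size=10):
    """Build a search body that returns only a terms aggregation."""
    return {
        "size": 0,  # skip the hits, return aggregation results only
        "aggs": {name: {"terms": {"field": field, "size": size}}},
    }


def bucket_counts(response, name):
    """Turn the buckets of a terms aggregation into a {key: doc_count} dict."""
    buckets = response["aggregations"][name]["buckets"]
    return {bucket["key"]: bucket["doc_count"] for bucket in buckets}


body = terms_agg_body("per_tag", "tags")

# Hand-written, trimmed sample in the shape of a terms aggregation response
# (illustrative values, not output from a real cluster).
sample_response = {
    "aggregations": {
        "per_tag": {
            "buckets": [
                {"key": "python", "doc_count": 2},
                {"key": "elasticsearch", "doc_count": 1},
            ]
        }
    }
}

print(body)
print(bucket_counts(sample_response, "per_tag"))
```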

「Example 1: Defining a Document and creating the index」

from elasticsearch import Elasticsearch
from elasticsearch_dsl import Document, Text, Date, Integer, Keyword, Index

# Connect to Elasticsearch
client = Elasticsearch(hosts=["http://localhost:9200"])

# Define the Document class
class Article(Document):
    title = Text(analyzer='snowball')  # use the snowball analyzer
    content = Text()
    publish_date = Date()
    author_id = Integer()
    tags = Keyword()

    class Index:
        name = 'my-blog'  # index name

# Create the index (if it does not exist) together with the Document's mapping
Article.init(using=client)

# Print the index information
index = Index('my-blog', using=client)
print(index.get())

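In this example we define an Article class that inherits from Document and declares the type of each field. The inner Index class specifies the index name. Article.init() creates the index together with its mapping.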
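For orientation, the field declarations in this example correspond roughly to the index mapping below, written out as a plain dictionary. This is an approximation for reading purposes: the exact body the library sends may differ in detail, and the `field_types` helper is hypothetical.

```python
# Approximate mapping implied by the Article class (field names from Example 1).
article_mapping = {
    "mappings": {
        "properties": {
            "title": {"type": "text", "analyzer": "snowball"},
            "content": {"type": "text"},
            "publish_date": {"type": "date"},
            "author_id": {"type": "integer"},
            "tags": {"type": "keyword"},
        }
    }
}


def field_types(mapping):
    """Return a {field_name: type} summary of a mapping dictionary."""
    properties = mapping["mappings"]["properties"]
    return {name: spec["type"] for name, spec in properties.items()}


print(field_types(article_mapping))
```

The distinction between `text` and `keyword` matters for the searches later on: `text` fields are analyzed for full-text matching, while `keyword` fields are stored as exact values, which suits term-level filters.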

「Example 2: Adding a document」

from datetime import datetime

# Article and client are defined in Example 1
article = Article(
    title="Using elasticsearch-dsl-py",
    content="This article introduces the basic usage of elasticsearch-dsl-py.",
    publish_date=datetime.now(),
    author_id=123,
    tags=["elasticsearch", "python"]
)

article.save(using=client)  # save the document to Elasticsearch
print(f"Document ID: {article.meta.id}")

We create an Article instance and store it in Elasticsearch with the save() method.
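Conceptually, save() sends the instance's fields to Elasticsearch as a JSON document. The sketch below shows roughly what that payload looks like, assuming datetimes are serialized as ISO 8601 strings (the usual representation for a date field). The `to_source` helper is hypothetical and not part of the library, and the timestamp is an arbitrary fixed value so that the output is reproducible.

```python
import json
from datetime import datetime


def to_source(title, content, publish_date, author_id, tags):
    """Build a JSON-serializable document body (hypothetical helper)."""
    return {
        "title": title,
        "content": content,
        "publish_date": publish_date.isoformat(),  # datetime -> ISO 8601 string
        "author_id": author_id,
        "tags": list(tags),
    }


source = to_source(
    title="Using elasticsearch-dsl-py",
    content="This article introduces the basic usage of elasticsearch-dsl-py.",
    publish_date=datetime(2025, 1, 3, 9, 4, 41),
    author_id=123,
    tags=["elasticsearch", "python"],
)

print(json.dumps(source, indent=2))
```

Because the example does not set an ID, Elasticsearch generates one, and that generated value is what `article.meta.id` prints. A newly saved document becomes visible to searches only after the next index refresh, which by default happens about once per second.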

「Example 3: Running a search」

from elasticsearch_dsl import Search

# Build the search (client is defined in Example 1)
s = Search(using=client, index="my-blog")
s = s.query("match", content="elasticsearch")  # documents whose content field matches "elasticsearch"

# Add a filter on a single exact value
s = s.filter("term", author_id=123)

# Add a filter that accepts several values: tags must contain "python" or "elasticsearch"
s = s.filter("terms", tags=["python", "elasticsearch"])

# Run the search
response = s.execute()

# Iterate over the hits
print(f"Found {response.hits.total.value} results:")
for hit in response:
    print(f"Title: {hit.title}")
    print(f"Content: {hit.content}")
    print(f"Publish date: {hit.publish_date}")
    print("---")

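In this example we build a search with the Search object and use query() to add a match query. We then narrow the results with filter(): a term filter for a single exact value and a terms filter that accepts any of several values. The execute() method sends the request and returns the response.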
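Chaining query() and filter() in this way produces a single bool query: the match clause goes into `must`, where it contributes to the relevance score, and the filters go into `filter`, where they only include or exclude documents. The sketch below writes that body out by hand as a plain dictionary (calling `s.to_dict()` on the Search object is the way to inspect what the library actually built). It also adds a small, hypothetical `terms_match` helper that models the any-of semantics of `terms` in a simplified way.

```python
# Hand-written equivalent of the request body for Example 3.
search_body = {
    "query": {
        "bool": {
            "must": [{"match": {"content": "elasticsearch"}}],
            "filter": [
                {"term": {"author_id": 123}},
                {"terms": {"tags": ["python", "elasticsearch"]}},
            ],
        }
    }
}


def terms_match(doc_values, wanted):
    """Simplified model of a terms filter: true if any wanted value is present."""
    return any(value in doc_values for value in wanted)


print(terms_match(["python", "tutorial"], ["python", "elasticsearch"]))  # True
print(terms_match(["java"], ["python", "elasticsearch"]))                # False
```

A document tagged only with "python" therefore passes the `terms` filter. Requiring both tags would take two separate `term` filters, one per tag.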

「Example 4: Building complex queries with Query objects」

from elasticsearch_dsl import Search, Q

s = Search(using=client, index="my-blog")

# Build a bool query
q = Q(
    "bool",
    must=[Q("match", content="python")],  # content must match "python"
    should=[Q("match", title="elasticsearch")],  # title should match "elasticsearch"
    minimum_should_match=1,  # at least one should clause has to match
    must_not=[Q("term", author_id=456)]  # exclude documents whose author_id is 456
)

s = s.query(q)
response = s.execute()

print(f"The complex query found {response.hits.total.value} results:")
for hit in response:
    print(f"Title: {hit.title}")
    print("---")

This example shows how to use Q objects to build a more complex bool query that combines must, should, minimum_should_match, and must_not clauses.
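How these clauses interact is easier to see in a toy model. The sketch below is a simplified, score-free imitation of bool matching over in-memory dictionaries. It is not how Elasticsearch is implemented: it ignores scoring, text analysis, and Elasticsearch's own defaulting rules for minimum_should_match. The function names `bool_matches`, `contains`, and `equals` are hypothetical, and the sample documents are made up.

```python
def bool_matches(doc, must=(), should=(), must_not=(), minimum_should_match=0):
    """Simplified model of bool query matching.

    Each clause is a predicate that takes the document dict and returns a bool.
    """
    if not all(clause(doc) for clause in must):
        return False  # every must clause has to match
    if any(clause(doc) for clause in must_not):
        return False  # no must_not clause may match
    should_hits = sum(1 for clause in should if clause(doc))
    return should_hits >= minimum_should_match


# Crude stand-ins for match and term: substring and equality checks.
def contains(field, word):
    return lambda doc: word in doc.get(field, "").lower()


def equals(field, value):
    return lambda doc: doc.get(field) == value


clauses = dict(
    must=[contains("content", "python")],
    should=[contains("title", "elasticsearch")],
    must_not=[equals("author_id", 456)],
    minimum_should_match=1,
)

doc_ok = {"title": "Elasticsearch tips", "content": "Python client", "author_id": 123}
doc_excluded = dict(doc_ok, author_id=456)      # hits the must_not clause
doc_no_should = dict(doc_ok, title="Search tips")  # no should clause matches

print(bool_matches(doc_ok, **clauses))         # True
print(bool_matches(doc_excluded, **clauses))   # False
print(bool_matches(doc_no_should, **clauses))  # False
```

Without `minimum_should_match=1`, the third document would match in this model, because the should clause would then be optional.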

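「Advantages of elasticsearch-dsl-py」

「More concise syntax:」 queries are built in a more Pythonic way, so the code is easier to read and maintain.

「Higher-level abstractions:」 objects such as Document, Search, and Query simplify interaction with Elasticsearch.

「Typed fields:」 each field's type is declared on the Document, which gives the index an explicit mapping and helps avoid data-type mistakes.

「Better readability:」 queries are built from named methods and keyword arguments instead of nested JSON, which makes them easier to understand.

elasticsearch-dsl-py is an excellent Python client library for Elasticsearch. Its higher-level abstractions make working with Elasticsearch simpler and more efficient, and it is strongly recommended for Python projects that use Elasticsearch.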

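Published: 2025-01-03 09:04:41
Original link: https://page.om.qq.com/page/Oqu9N7gB4GOpY4xRnMOD3jiA0
Tencent Cloud Developer Community (腾讯云开发者社区) is one of the distribution channels for Tencent Content Open Platform accounts (企鹅号). This content is republished under the Tencent Content Open Platform Service Agreement.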