Elasticsearch (ES) offers three main ways to paginate query results: from + size, search_after, and scroll. Each has its own use cases, strengths, and weaknesses.
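The from + size approach uses from (the starting offset) and size (the number of hits per page) to fetch a page of data. It is simple to use and fits cases where the data volume is small or deep pagination is not needed. Performance degrades as from grows, however, because every shard has to produce its top from + size hits and the coordinating node then has to merge and sort all of them. The index setting index.max_result_window, 10000 by default, caps how deep this can go: from + size must not exceed it. For large result sets or deep pagination, this approach may not be the best choice.
Which approach to choose depends on the specific requirements and scenario. For most common pagination needs, from + size is likely sufficient. For large data volumes or deep pagination, scroll or search_after may be the better choice. In practice, weigh data volume, query frequency, and real-time requirements together.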
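The arithmetic behind from + size can be sketched in a few lines of Python. This is a minimal illustration, not part of any Elasticsearch client: the helper names page_to_from_size and candidates_to_merge are hypothetical, and the default of 10000 assumes index.max_result_window has not been changed on the index.

```python
from typing import Tuple

# Assumption: the index uses the default index.max_result_window.
DEFAULT_MAX_RESULT_WINDOW = 10000


def page_to_from_size(page: int, page_size: int,
                      max_result_window: int = DEFAULT_MAX_RESULT_WINDOW) -> Tuple[int, int]:
    """Convert a 1-based page number into (from, size).

    Raises ValueError when from + size would exceed the result window,
    which is the condition under which Elasticsearch rejects the request.
    """
    if page < 1 or page_size <= 0:
        raise ValueError("page must be >= 1 and page_size must be > 0")
    offset = (page - 1) * page_size
    if offset + page_size > max_result_window:
        raise ValueError(
            f"from + size = {offset + page_size} exceeds "
            f"max_result_window = {max_result_window}"
        )
    return offset, page_size


def candidates_to_merge(offset: int, size: int, num_shards: int) -> int:
    """Upper bound on the hits the coordinating node has to merge.

    Each shard returns up to from + size candidates, so the cost grows
    with the offset even though only `size` hits reach the caller.
    """
    return num_shards * (offset + size)
```

With these helpers, page 1000 at 10 hits per page maps to from = 9990 and still fits the default window, while page 1001 is rejected. On a five-shard index, that last allowed page can require merging up to 50000 candidates to return 10 hits.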
GET _search
{
  "query": {
    "match_all": {}
  }
}
GET /
GET /_cluster/health
GET /_cat/health?v
GET /db01_v1_20240903-index/_search
GET /db01_v1_20240903-index/_search?from=0&size=10
# Wildcard search on the source field, first page of 10 hits
POST /db01_v1_20240903-index/_search
{
  "from": 0,
  "size": 10,
  "_source": ["_id", "id", "source", "target", "description", "weight"],
  "query": {
    "query_string": {
      "query": "source:*應用*",
      "default_field": "source",
      "fuzziness": 1
    }
  }
}
GET /db01_v1_20240903-index/_search?from=0&size=10
# Wildcard search on the subject_id field, first page of 10 hits
POST /db01_v1_20240903-index/_search
{
  "from": 0,
  "size": 10,
  "query": {
    "query_string": {
      "query": "subject_id:*我照顧的人*",
      "default_field": "subject_id",
      "fuzziness": 1
    }
  }
}
# Fuzzy match query on the subject_id field, first page of 10 hits
POST /db01_v1_20240903-index/_search
{
  "from": 0,
  "size": 10,
  "query": {
    "match": {
      "subject_id": {
        "query": "照顧",
        "fuzziness": 1
      }
    }
  }
}
def list_label_readable(self, name, page, page_size, label):
    """Return one page of documents matching `name` for `label`, plus the total hit count."""
    all_docs = []
    # Reject invalid paging arguments up front.
    if page < 1 or page_size <= 0:
        return all_docs, 0
    label_dict = QueryEnum.query_info.value[label]
    index_name = self.index_prefix + label_dict['index_name']
    field = label_dict['query_name']
    if label in (QueryEnum.ENTITIES.value, QueryEnum.RELATIONSHIPS.value):
        # Entities and relationships: wildcard (substring) search via query_string.
        query = {
            "query_string": {
                "query": f"{field}:*{name}*",
                "default_field": field,
                "fuzziness": label_dict['fuzziness'],
            }
        }
    else:
        # All other labels: match query with fuzziness.
        query = {
            "match": {
                field: {
                    "query": f"{name}",
                    "fuzziness": label_dict['fuzziness'],
                }
            }
        }
    response = self._es.search(
        index=index_name,
        body={
            # Convert the 1-based page number into an offset.
            "from": (page - 1) * page_size,
            "size": page_size,
            "_source": label_dict['_source'],
            "query": query,
        },
    )
    total = response['hits']['total']['value']
    hits = response['hits']['hits']
    for hit in hits:
        _source = hit['_source']
        # Copy the document ID into the returned record.
        _source['_id'] = hit['_id']
        all_docs.append(_source)
    return all_docs, total
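Because the method above pages with from + size, it is subject to the max_result_window limit. For deeper traversal, a search_after loop is one possible alternative. The sketch below makes several assumptions: iter_hits_search_after is a hypothetical helper name, es is any client object whose search(index=..., body=...) call behaves like the one used above, and the sort list ends in a field that is unique per document and sortable (the id field used in the usage note is only a guess based on the _source list shown earlier).

```python
from typing import Any, Dict, Iterator, List, Optional


def iter_hits_search_after(es: Any, index: str, query: Dict[str, Any],
                           sort: List[Dict[str, str]],
                           page_size: int = 100) -> Iterator[Dict[str, Any]]:
    """Yield every hit for `query`, paging with search_after instead of from.

    Assumption: `sort` ends with a tiebreaker field that is unique per
    document, otherwise hits with equal sort values may be skipped.
    """
    search_after: Optional[List[Any]] = None
    while True:
        body: Dict[str, Any] = {"size": page_size, "query": query, "sort": sort}
        if search_after is not None:
            # Resume right after the last hit of the previous page.
            body["search_after"] = search_after
        response = es.search(index=index, body=body)
        hits = response["hits"]["hits"]
        if not hits:
            return
        for hit in hits:
            yield hit
        # The sort values of the last hit become the cursor for the next page.
        search_after = hits[-1]["sort"]
```

A call might look like iter_hits_search_after(es, index_name, {"match_all": {}}, [{"id": "asc"}]). Unlike from + size, this cannot jump to an arbitrary page; it only moves forward from the previous page's last hit, which is the trade-off for avoiding the growing per-shard cost.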
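Original-content statement: this article is published on the Tencent Cloud Developer Community with the author's authorization and may not be reproduced without permission.
In case of infringement, please contact cloudcommunity@tencent.com for removal.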