Using Elasticsearch: Scripting API (Part 2)

HLee
Last modified: 2021-02-04 16:34:10
Included in the column: 房东的猫

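Introduction

In the previous article we saw that an ingest node can transform documents with a set of processors. Each processor does one specific, well-defined job. Is there a more flexible way to process data programmatically in Elasticsearch? And if so, which language does it use?

Elasticsearch uses a language called Painless, built specifically for this purpose. Painless is a simple, secure scripting language designed for use with Elasticsearch. It is the default scripting language and can safely be used for both inline and stored scripts. Its syntax resembles Groovy. Since Elasticsearch 6.0, Groovy, JavaScript, and Python are no longer supported as scripting languages.

With scripts you can evaluate custom expressions in Elasticsearch. For example, you can use a script to return "script fields" as part of a search request, or to compute a custom score for a query.

Scripts

The general syntax of a script is:

"script": {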
    "lang":   "...",  
    "source" | "id": "...",  
    "params": { ... }  
  }
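  • lang defaults to "painless". You can omit it unless you use another scripting language.
  • source holds an inline script; alternatively, id refers to a stored script with that ID.
  • params holds named parameters that are passed to the script as input variables.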
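
The structure above can also be assembled programmatically. The following Python sketch is only an illustration: build_script is a hypothetical helper, not part of any Elasticsearch client, and it assumes a script is referenced either by source or by id, never both. It returns the request fragment as a plain dict.

```python
import json


def build_script(source=None, script_id=None, params=None, lang="painless"):
    """Return a script object; exactly one of source or script_id must be given."""
    if (source is None) == (script_id is None):
        raise ValueError("provide exactly one of source or script_id")
    script = {}
    if source is not None:
        # Inline script: the language and the code travel with the request.
        script["lang"] = lang
        script["source"] = source
    else:
        # Stored script: only the ID is sent.
        script["id"] = script_id
    if params:
        script["params"] = params
    return {"script": script}


inline = build_script(source="ctx._source.age = params.value", params={"value": 34})
stored = build_script(script_id="add_age", params={"value": 2})
print(json.dumps(inline, indent=2))
print(json.dumps(stored, indent=2))
```

The resulting dicts have the same shape as the request bodies used in the rest of this article, so they could be passed to whatever HTTP client you use.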

Using scripts

1. Inline scripts

First, let's create a simple document:

PUT twitter/_doc/1
{
  "user": "双榆树-张三",
  "message": "今儿天气不错啊,出去转转去",
  "uid": 2,
  "age": 20,
  "city": "北京",
  "province": "北京",
  "country": "中国",
  "address": "中国北京市海淀区",
  "location": {
    "lat": "39.970718",
    "lon": "116.325747"
  }
}

Suppose we now want to change age to 30. One way is to read the whole document, set age to 30, and write it back the same way. That takes three steps (read, modify, write), which is clearly cumbersome. With Painless we can modify the field directly:

POST twitter/_update/1
{
  "script": {
    "source": "ctx._source.age = 30"
  }
}

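Here source holds our Painless code. Only a small snippet is embedded in the request DSL; code written this way is called inline. We access the age field of _source through ctx._source.age, so the age is changed programmatically. Fetching the document afterwards (for example with GET twitter/_doc/1) returns:

{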
    "_index":"twitter",
    "_type":"_doc",
    "_id":"1",
    "_version":16,
    "_seq_no":20,
    "_primary_term":1,
    "found":true,
    "_source":{
        "user":"双榆树-张三",
        "message":"今儿天气不错啊,出去转转去",
        "uid":2,
        "age":30,
        "city":"北京",
        "province":"北京",
        "country":"中国",
        "address":"中国北京市海淀区",
        "location":{
            "lat":"39.970718",
            "lon":"116.325747"
        }
    }
}

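Clearly age is now 30. This works, but Elasticsearch must compile every script source it has not seen before; compiled scripts are then cached for later use. Because the value is hard-coded in the source above, changing the age produces a different source that has to be compiled again. A better approach is:

POST twitter/_update/1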
{
  "script": {
    "source": "ctx._source.age = params.value",
    "params": {
      "value": 34
    }
  }
}

This way the script source never changes and is compiled only once. On later calls we only change the values in params.
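
Why parameters help can be sketched with a toy cache in Python. This is not how Elasticsearch implements its cache; CompileCache is a hypothetical class that only assumes compiled scripts are looked up by their source text.

```python
class CompileCache:
    """Toy stand-in for a script cache keyed by script source (illustration only)."""

    def __init__(self):
        self.compiled = {}      # source text -> "compiled" placeholder
        self.compilations = 0   # how many times a compile was needed

    def run(self, source, params=None):
        # params never affect the lookup: only the source text is the key.
        if source not in self.compiled:
            self.compilations += 1
            self.compiled[source] = f"compiled<{source}>"
        return self.compiled[source]


# Hard-coded values: every new value is a new source text.
hard_coded = CompileCache()
for age in (30, 34, 40):
    hard_coded.run(f"ctx._source.age = {age}")

# Parameterized: the source text stays the same, only params change.
parameterized = CompileCache()
for age in (30, 34, 40):
    parameterized.run("ctx._source.age = params.value", {"value": age})

print(hard_coded.compilations, parameterized.compilations)  # prints: 3 1
```

Three different hard-coded values cost three compilations, while the parameterized form costs one.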

Elasticsearch treats the following two as different scripts that must be compiled separately, so it is best to pass values in through params:

"script": {  "source": "ctx._source.num_of_views += 2"}

"script": {  "source": "ctx._source.num_of_views += 3"}

Besides updates, we can also use a script query to search our documents:

GET twitter/_search
{
  "query": {
    "script": {
      "script": {
        "source": "doc['city'].contains(params.name)",
        "lang": "painless",
        "params": {
          "name": "北京"
        }
      }
    }
  }
}

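The script above matches every document whose city field contains the value "北京". Note that contains here tests for an exact value among the field's doc values, not for a substring. doc['city'] also requires city to be mapped as keyword; with the default dynamic mapping, use doc['city.keyword'] instead.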

2. Stored scripts

Scripts can also be stored in the cluster state and later invoked by ID:

PUT _scripts/add_age
{
  "script": {
    "lang": "painless",
    "source": "ctx._source.age += params.value"
  }
}

Here we define a script named add_age. It adds a given value to the age field in _source. We can call it later:

POST twitter/_update/1
{
  "script": {
    "id": "add_age",
    "params": {
      "value": 2
    }
  }
}

After running this, age is increased by 2.

3. Accessing fields in _source

The syntax Painless uses to access field values depends on the context. Elasticsearch has many Painless contexts, including ingest processor, update, update by query, sort, filter, and more.

Context        Field access
Ingest node    ctx.field_name
Updates        ctx._source.field_name

Here, updates cover _update, _reindex, and _update_by_query. Understanding the context matters: depending on the API, ctx exposes different fields. The examples below look at specific cases.

Example 1

First we create a pipeline named add_field_c:

PUT _ingest/pipeline/add_field_c
{
  "processors": [
    {
      "script": {
        "lang": "painless",
        "source": "ctx.field_c = (ctx.field_a + ctx.field_b) * params.value",
        "params": {
          "value": 2
        }
      }
    }
  ]
}

This pipeline creates a new field, field_c, whose value is the sum of field_a and field_b multiplied by 2. Now let's index the following document:

PUT test_script/_doc/1?pipeline=add_field_c
{
  "field_a": 10,
  "field_b": 20
}

Here we used the add_field_c pipeline. We can then search for the document:

POST test_script/_search
{
  "query": {
    "match_all": {}
  }
}

Result:
{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "test_script",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "field_c" : 60,
          "field_a" : 10,
          "field_b" : 20
        }
      }
    ]
  }
}

Clearly, field_c was created successfully: (10 + 20) * 2 = 60.

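Example 2

During ingest, a script processor can also modify metadata such as _index and _type. The following ingest pipeline sets the index to my_index and the type to _doc, regardless of what the original index request specified. Note that mapping types, and with them _type, are deprecated in 7.x and removed in 8.0.

PUT _ingest/pipeline/my_index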
{
  "description": "use index:my_index and type:_doc",
  "processors": [
    {
      "script": {
        "source": "ctx._index = 'my_index'; ctx._type = '_doc';"
      }
    }
  ]
}

With this pipeline in place, we can try to index a document into any_index:

PUT any_index/_doc/1?pipeline=my_index
{
  "message": "text"
}

Searching my_index afterwards (for example with GET my_index/_search) returns:
{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "my_index",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "message" : "text"
        }
      }
    ]
  }
}

In other words, the document is actually stored in my_index, not in any_index.

Example 3

PUT _ingest/pipeline/blogs_pipeline
{
  "processors": [
    {
      "script": {
        "source": """ if (ctx.category == "") { ctx.category = "None"} """
      }
    }
  ]
}

The pipeline defined above checks whether the category field is an empty string and, if so, changes it to "None". Using the test_script index from before:

PUT test_script/_doc/2?pipeline=blogs_pipeline
{
  "field_a": 5,
  "field_b": 10,
  "category": ""
} 

GET test_script/_doc/2

Result:
{
  "_index" : "test_script",
  "_type" : "_doc",
  "_id" : "2",
  "_version" : 1,
  "_seq_no" : 1,
  "_primary_term" : 1,
  "found" : true,
  "_source" : {
    "field_a" : 5,
    "field_b" : 10,
    "category" : "None"
  }
}

Clearly, the empty category value was changed to "None".

Example 4

POST _reindex
{
  "source": {
    "index": "blogs"
  },
  "dest": {
    "index": "blogs_fixed"
  },
  "script": {
    "source": """ if (ctx._source.category == "") { ctx._source.category = "None" }"""
  }
}

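This example writes "None" during reindexing whenever category is empty. Comparing the last two examples: in a pipeline we operate directly on ctx.field, whereas in update-style APIs (here _reindex) we operate on the fields under ctx._source. This is the context difference mentioned earlier.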
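
The difference between the two contexts can be mimicked in plain Python. This is only a sketch of the shape of ctx, not of Elasticsearch internals: the two functions are hypothetical stand-ins for the Painless scripts in Example 3 and Example 4, and the sample documents are made up.

```python
def ingest_script(ctx):
    """Mimics the pipeline script: document fields sit directly on ctx."""
    if ctx.get("category") == "":
        ctx["category"] = "None"


def reindex_script(ctx):
    """Mimics the _reindex script: document fields sit under ctx['_source']."""
    if ctx["_source"].get("category") == "":
        ctx["_source"]["category"] = "None"


# Ingest context: metadata and fields live side by side.
ingest_ctx = {"_index": "test_script", "field_a": 5, "field_b": 10, "category": ""}
ingest_script(ingest_ctx)

# Update-style context: fields are nested one level down.
reindex_ctx = {"_index": "blogs_fixed", "_source": {"title": "hello", "category": ""}}
reindex_script(reindex_ctx)

print(ingest_ctx["category"], reindex_ctx["_source"]["category"])  # prints: None None
```

The same rule is applied in both cases; only the path to the field changes.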

Example 5

PUT test/_doc/1
{
  "counter": 1,
  "tags": [
    "red"
  ]
}

You can use an update script to add a tag to the tags list (it is just a list, so the tag is added even if it is already present):

POST test/_update/1
{
  "script": {
    "source": "ctx._source.tags.add(params.tag)",
    "lang": "painless",
    "params": {
      "tag": "blue"
    }
  }
}

GET test/_doc/1

Result:
{
  "_index" : "test",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 2,
  "_seq_no" : 1,
  "_primary_term" : 1,
  "found" : true,
  "_source" : {
    "counter" : 1,
    "tags" : [
      "red",
      "blue"
    ]
  }
}

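As shown, "blue" has been added to the tags list.

You can also remove a tag from the tags list. The Painless function that removes a tag takes the array index of the element to delete. To avoid a possible runtime error, first make sure the tag exists. If the list contains duplicates of the tag, this script removes only one occurrence.

POST test/_update/1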
{
  "script": {
    "source": "if (ctx._source.tags.contains(params.tag)) { ctx._source.tags.remove(ctx._source.tags.indexOf(params.tag)) }",
    "lang": "painless",
    "params": {
      "tag": "blue"
    }
  }
} 

GET test/_doc/1

Result:
{
  "_index" : "test",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 3,
  "_seq_no" : 2,
  "_primary_term" : 1,
  "found" : true,
  "_source" : {
    "counter" : 1,
    "tags" : [
      "red"
    ]
  }
}

"blue" has clearly been removed.
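
The list semantics of the two update scripts (append without de-duplication, remove only the first match) can be checked with an equivalent Python sketch. The helper names add_tag and remove_tag are hypothetical; they mirror the Painless calls, not any Elasticsearch API.

```python
def add_tag(source, tag):
    """Like ctx._source.tags.add(params.tag): appends even if the tag exists."""
    source["tags"].append(tag)


def remove_tag(source, tag):
    """Like the guarded remove(indexOf(tag)): drops only the first match, if any."""
    tags = source["tags"]
    if tag in tags:
        tags.pop(tags.index(tag))


doc = {"counter": 1, "tags": ["red"]}
add_tag(doc, "blue")
add_tag(doc, "blue")      # duplicates are allowed
remove_tag(doc, "blue")   # removes a single occurrence
remove_tag(doc, "green")  # absent tag: no error, nothing changes
print(doc["tags"])        # prints: ['red', 'blue']
```

One "blue" survives because only a single occurrence is removed per call, and the guard keeps the call with the missing "green" tag from raising an error.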

4. Accessing doc values with Painless

To illustrate how Painless works, let's load some hockey statistics into an Elasticsearch index:

PUT hockey/_bulk?refresh
{"index":{"_id":1}}
{"first":"johnny","last":"gaudreau","goals":[9,27,1],"assists":[17,46,0],"gp":[26,82,1],"born":"1993/08/13"}
{"index":{"_id":2}}
{"first":"sean","last":"monohan","goals":[7,54,26],"assists":[11,26,13],"gp":[26,82,82],"born":"1994/10/12"}
{"index":{"_id":3}}
{"first":"jiri","last":"hudler","goals":[5,34,36],"assists":[11,62,42],"gp":[24,80,79],"born":"1984/01/04"}
{"index":{"_id":4}}
{"first":"micheal","last":"frolik","goals":[4,6,15],"assists":[8,23,15],"gp":[26,82,82],"born":"1988/02/17"}
{"index":{"_id":5}}
{"first":"sam","last":"bennett","goals":[5,0,0],"assists":[8,1,0],"gp":[26,1,0],"born":"1996/06/20"}
{"index":{"_id":6}}
{"first":"dennis","last":"wideman","goals":[0,26,15],"assists":[11,30,24],"gp":[26,81,82],"born":"1983/03/20"}
{"index":{"_id":7}}
{"first":"david","last":"jones","goals":[7,19,5],"assists":[3,17,4],"gp":[26,45,34],"born":"1984/08/10"}
{"index":{"_id":8}}
{"first":"tj","last":"brodie","goals":[2,14,7],"assists":[8,42,30],"gp":[26,82,82],"born":"1990/06/07"}
{"index":{"_id":39}}
{"first":"mark","last":"giordano","goals":[6,30,15],"assists":[3,30,24],"gp":[26,60,63],"born":"1983/10/03"}
{"index":{"_id":10}}
{"first":"mikael","last":"backlund","goals":[3,15,13],"assists":[6,24,18],"gp":[26,82,82],"born":"1989/03/17"}
{"index":{"_id":11}}
{"first":"joe","last":"colborne","goals":[3,18,13],"assists":[6,20,24],"gp":[26,67,82],"born":"1990/01/30"}

Document values can be accessed through a Map named doc. For example, the following script computes a player's total goals. It uses the int type and a for loop.

GET hockey/_search
{
  "query": {
    "function_score": {
      "script_score": {
        "script": {
          "lang": "painless",
          "source": "int total = 0; for (int i = 0; i < doc['goals'].length; ++i) { total += doc['goals'][i]; } return total;"
        }
      }
    }
  }
}

Here the script computes each document's _score by adding up all of the player's goals. doc is the Map, and doc['goals'] gives access to the values of the goals field. The result:

{
  "took" : 12,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 10,
      "relation" : "eq"
    },
    "max_score" : 87.0,
    "hits" : [
      {
        "_index" : "hockey",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 87.0,
        "_source" : {
          "first" : "sean",
          "last" : "monohan",
          "goals" : [
            7,
            54,
            26
          ],
          "assists" : [
            11,
            26,
            13
          ],
          "gp" : [
            26,
            82,
            82
          ],
          "born" : "1994/10/12"
        }
      },
      {
        "_index" : "hockey",
        "_type" : "_doc",
        "_id" : "3",
        "_score" : 75.0,
        "_source" : {
          "first" : "jiri",
          "last" : "hudler",
          "goals" : [
            5,
            34,
            36
          ],
          "assists" : [
            11,
            62,
            42
          ],
          "gp" : [
            24,
            80,
            79
          ],
          "born" : "1984/01/04"
        }
      },
      {
        "_index" : "hockey",
        "_type" : "_doc",
        "_id" : "39",
        "_score" : 51.0,
        "_source" : {
          "first" : "mark",
          "last" : "giordano",
          "goals" : [
            6,
            30,
            15
          ],
          "assists" : [
            3,
            30,
            24
          ],
          "gp" : [
            26,
            60,
            63
          ],
          "born" : "1983/10/03"
        }
      },
      {
        "_index" : "hockey",
        "_type" : "_doc",
        "_id" : "6",
        "_score" : 41.0,
        "_source" : {
          "first" : "dennis",
          "last" : "wideman",
          "goals" : [
            0,
            26,
            15
          ],
          "assists" : [
            11,
            30,
            24
          ],
          "gp" : [
            26,
            81,
            82
          ],
          "born" : "1983/03/20"
        }
      },
      {
        "_index" : "hockey",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 37.0,
        "_source" : {
          "first" : "johnny",
          "last" : "gaudreau",
          "goals" : [
            9,
            27,
            1
          ],
          "assists" : [
            17,
            46,
            0
          ],
          "gp" : [
            26,
            82,
            1
          ],
          "born" : "1993/08/13"
        }
      },
      {
        "_index" : "hockey",
        "_type" : "_doc",
        "_id" : "7",
        "_score" : 31.0,
        "_source" : {
          "first" : "david",
          "last" : "jones",
          "goals" : [
            7,
            19,
            5
          ],
          "assists" : [
            3,
            17,
            4
          ],
          "gp" : [
            26,
            45,
            34
          ],
          "born" : "1984/08/10"
        }
      },
      {
        "_index" : "hockey",
        "_type" : "_doc",
        "_id" : "10",
        "_score" : 31.0,
        "_source" : {
          "first" : "mikael",
          "last" : "backlund",
          "goals" : [
            3,
            15,
            13
          ],
          "assists" : [
            6,
            24,
            18
          ],
          "gp" : [
            26,
            82,
            82
          ],
          "born" : "1989/03/17"
        }
      },
      {
        "_index" : "hockey",
        "_type" : "_doc",
        "_id" : "4",
        "_score" : 25.0,
        "_source" : {
          "first" : "micheal",
          "last" : "frolik",
          "goals" : [
            4,
            6,
            15
          ],
          "assists" : [
            8,
            23,
            15
          ],
          "gp" : [
            26,
            82,
            82
          ],
          "born" : "1988/02/17"
        }
      },
      {
        "_index" : "hockey",
        "_type" : "_doc",
        "_id" : "8",
        "_score" : 23.0,
        "_source" : {
          "first" : "tj",
          "last" : "brodie",
          "goals" : [
            2,
            14,
            7
          ],
          "assists" : [
            8,
            42,
            30
          ],
          "gp" : [
            26,
            82,
            82
          ],
          "born" : "1990/06/07"
        }
      },
      {
        "_index" : "hockey",
        "_type" : "_doc",
        "_id" : "5",
        "_score" : 5.0,
        "_source" : {
          "first" : "sam",
          "last" : "bennett",
          "goals" : [
            5,
            0,
            0
          ],
          "assists" : [
            8,
            1,
            0
          ],
          "gp" : [
            26,
            1,
            0
          ],
          "born" : "1996/06/20"
        }
      }
    ]
  }
}
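
The scores above can be double-checked by hand. The following Python sketch repeats the same summation for a subset of the players, using the goals arrays copied from the bulk request; it is only a verification aid and does not talk to Elasticsearch.

```python
# goals arrays copied from the bulk request above (subset of players)
goals = {
    "gaudreau": [9, 27, 1],
    "monohan": [7, 54, 26],
    "hudler": [5, 34, 36],
    "bennett": [5, 0, 0],
}


def total_goals(values):
    """Same loop as the Painless script: add up every entry of the array."""
    total = 0
    for v in values:
        total += v
    return total


# Sort by score, highest first, which is the default order of search hits.
ranking = sorted(((total_goals(g), name) for name, g in goals.items()), reverse=True)
print(ranking)
# prints: [(87, 'monohan'), (75, 'hudler'), (37, 'gaudreau'), (5, 'bennett')]
```

The totals match the _score values 87.0, 75.0, 37.0, and 5.0 reported for these players.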

Alternatively, you can do the same thing with script_fields instead of function_score:

GET hockey/_search
{
  "query": {
    "match_all": {}
  },
  "script_fields": {
    "total_goals": {
      "script": {
        "lang": "painless",
        "source": "int total = 0; for (int i = 0; i < doc['goals'].length; ++i) { total += doc['goals'][i]; } return total;"
      }
    }
  }
}

Result:
{
  "took" : 7,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 10,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "hockey",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.0,
        "fields" : {
          "total_goals" : [
            37
          ]
        }
      },
      {
        "_index" : "hockey",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 1.0,
        "fields" : {
          "total_goals" : [
            87
          ]
        }
      },
      {
        "_index" : "hockey",
        "_type" : "_doc",
        "_id" : "3",
        "_score" : 1.0,
        "fields" : {
          "total_goals" : [
            75
          ]
        }
      },
      {
        "_index" : "hockey",
        "_type" : "_doc",
        "_id" : "4",
        "_score" : 1.0,
        "fields" : {
          "total_goals" : [
            25
          ]
        }
      },
      {
        "_index" : "hockey",
        "_type" : "_doc",
        "_id" : "5",
        "_score" : 1.0,
        "fields" : {
          "total_goals" : [
            5
          ]
        }
      },
      {
        "_index" : "hockey",
        "_type" : "_doc",
        "_id" : "6",
        "_score" : 1.0,
        "fields" : {
          "total_goals" : [
            41
          ]
        }
      },
      {
        "_index" : "hockey",
        "_type" : "_doc",
        "_id" : "7",
        "_score" : 1.0,
        "fields" : {
          "total_goals" : [
            31
          ]
        }
      },
      {
        "_index" : "hockey",
        "_type" : "_doc",
        "_id" : "8",
        "_score" : 1.0,
        "fields" : {
          "total_goals" : [
            23
          ]
        }
      },
      {
        "_index" : "hockey",
        "_type" : "_doc",
        "_id" : "39",
        "_score" : 1.0,
        "fields" : {
          "total_goals" : [
            51
          ]
        }
      },
      {
        "_index" : "hockey",
        "_type" : "_doc",
        "_id" : "10",
        "_score" : 1.0,
        "fields" : {
          "total_goals" : [
            31
          ]
        }
      }
    ]
  }
}

The following example uses a Painless script to sort the players by their combined first and last names. The names are accessed with doc['first.keyword'].value and doc['last.keyword'].value.

GET hockey/_search
{
  "query": {
    "match_all": {}
  },
  "sort": {
    "_script": {
      "type": "string",
      "order": "asc",
      "script": {
        "lang": "painless",
        "source": "doc['first.keyword'].value + ' ' + doc['last.keyword'].value"
      }
    }
  }
}

Result:
{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 10,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [
      {
        "_index" : "hockey",
        "_type" : "_doc",
        "_id" : "7",
        "_score" : null,
        "_source" : {
          "first" : "david",
          "last" : "jones",
          "goals" : [
            7,
            19,
            5
          ],
          "assists" : [
            3,
            17,
            4
          ],
          "gp" : [
            26,
            45,
            34
          ],
          "born" : "1984/08/10"
        },
        "sort" : [
          "david jones"
        ]
      },
      {
        "_index" : "hockey",
        "_type" : "_doc",
        "_id" : "6",
        "_score" : null,
        "_source" : {
          "first" : "dennis",
          "last" : "wideman",
          "goals" : [
            0,
            26,
            15
          ],
          "assists" : [
            11,
            30,
            24
          ],
          "gp" : [
            26,
            81,
            82
          ],
          "born" : "1983/03/20"
        },
        "sort" : [
          "dennis wideman"
        ]
      },
      {
        "_index" : "hockey",
        "_type" : "_doc",
        "_id" : "3",
        "_score" : null,
        "_source" : {
          "first" : "jiri",
          "last" : "hudler",
          "goals" : [
            5,
            34,
            36
          ],
          "assists" : [
            11,
            62,
            42
          ],
          "gp" : [
            24,
            80,
            79
          ],
          "born" : "1984/01/04"
        },
        "sort" : [
          "jiri hudler"
        ]
      },
      {
        "_index" : "hockey",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : null,
        "_source" : {
          "first" : "johnny",
          "last" : "gaudreau",
          "goals" : [
            9,
            27,
            1
          ],
          "assists" : [
            17,
            46,
            0
          ],
          "gp" : [
            26,
            82,
            1
          ],
          "born" : "1993/08/13"
        },
        "sort" : [
          "johnny gaudreau"
        ]
      },
      {
        "_index" : "hockey",
        "_type" : "_doc",
        "_id" : "39",
        "_score" : null,
        "_source" : {
          "first" : "mark",
          "last" : "giordano",
          "goals" : [
            6,
            30,
            15
          ],
          "assists" : [
            3,
            30,
            24
          ],
          "gp" : [
            26,
            60,
            63
          ],
          "born" : "1983/10/03"
        },
        "sort" : [
          "mark giordano"
        ]
      },
      {
        "_index" : "hockey",
        "_type" : "_doc",
        "_id" : "4",
        "_score" : null,
        "_source" : {
          "first" : "micheal",
          "last" : "frolik",
          "goals" : [
            4,
            6,
            15
          ],
          "assists" : [
            8,
            23,
            15
          ],
          "gp" : [
            26,
            82,
            82
          ],
          "born" : "1988/02/17"
        },
        "sort" : [
          "micheal frolik"
        ]
      },
      {
        "_index" : "hockey",
        "_type" : "_doc",
        "_id" : "10",
        "_score" : null,
        "_source" : {
          "first" : "mikael",
          "last" : "backlund",
          "goals" : [
            3,
            15,
            13
          ],
          "assists" : [
            6,
            24,
            18
          ],
          "gp" : [
            26,
            82,
            82
          ],
          "born" : "1989/03/17"
        },
        "sort" : [
          "mikael backlund"
        ]
      },
      {
        "_index" : "hockey",
        "_type" : "_doc",
        "_id" : "5",
        "_score" : null,
        "_source" : {
          "first" : "sam",
          "last" : "bennett",
          "goals" : [
            5,
            0,
            0
          ],
          "assists" : [
            8,
            1,
            0
          ],
          "gp" : [
            26,
            1,
            0
          ],
          "born" : "1996/06/20"
        },
        "sort" : [
          "sam bennett"
        ]
      },
      {
        "_index" : "hockey",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : null,
        "_source" : {
          "first" : "sean",
          "last" : "monohan",
          "goals" : [
            7,
            54,
            26
          ],
          "assists" : [
            11,
            26,
            13
          ],
          "gp" : [
            26,
            82,
            82
          ],
          "born" : "1994/10/12"
        },
        "sort" : [
          "sean monohan"
        ]
      },
      {
        "_index" : "hockey",
        "_type" : "_doc",
        "_id" : "8",
        "_score" : null,
        "_source" : {
          "first" : "tj",
          "last" : "brodie",
          "goals" : [
            2,
            14,
            7
          ],
          "assists" : [
            8,
            42,
            30
          ],
          "gp" : [
            26,
            82,
            82
          ],
          "born" : "1990/06/07"
        },
        "sort" : [
          "tj brodie"
        ]
      }
    ]
  }
}

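5. Checking for missing values

doc['field'].value throws an exception if the document has no value for the field. To check whether a document is missing a value, call doc['field'].size() == 0.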
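
One way to apply this check is to guard the access with a ternary expression inside the script. The Python sketch below only builds such a request body; guarded_value and the script field name goals_or_default are hypothetical, and the generated Painless has not been run against a cluster here.

```python
def guarded_value(field, default):
    """Build a Painless expression that falls back to default when field has no value."""
    return f"doc['{field}'].size() == 0 ? {default} : doc['{field}'].value"


body = {
    "script_fields": {
        "goals_or_default": {  # hypothetical script field name
            "script": {"lang": "painless", "source": guarded_value("goals", 0)}
        }
    }
}
print(body["script_fields"]["goals_or_default"]["script"]["source"])
```

The size() test comes first, so .value is only evaluated for documents that actually have a value.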

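Updating fields with Painless

You can also easily update fields. You can access a field's original source with ctx._source.<field-name>. First, let's look at a player's source data by submitting the following request:

GET hockey/_search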
{
  "stored_fields": [
    "_id",
    "_source"
  ],
  "query": {
    "term": {
      "_id": 1
    }
  }
}

Result:
{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "hockey",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "first" : "johnny",
          "last" : "gaudreau",
          "goals" : [
            9,
            27,
            1
          ],
          "assists" : [
            17,
            46,
            0
          ],
          "gp" : [
            26,
            82,
            1
          ],
          "born" : "1993/08/13"
        }
      }
    ]
  }
}

To change player 1's last name to hockey, simply set ctx._source.last to the new value:

POST hockey/_update/1
{
  "script": {
    "lang": "painless",
    "source": "ctx._source.last = params.last",
    "params": {
      "last": "hockey"
    }
  }
}

You can also add fields to a document. For example, this script adds a new field containing the player's nickname, hockey, and sets last back to its original value.

POST hockey/_update/1
{
  "script": {
    "lang": "painless",
    "source": "ctx._source.last = params.last; ctx._source.nick = params.nick",
    "params": {
      "last": "gaudreau",
      "nick": "hockey"
    }
  }
}

GET hockey/_doc/1

Result:
{
  "_index" : "hockey",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 4,
  "_seq_no" : 12,
  "_primary_term" : 1,
  "found" : true,
  "_source" : {
    "first" : "johnny",
    "last" : "gaudreau",
    "goals" : [
      9,
      27,
      1
    ],
    "assists" : [
      17,
      46,
      0
    ],
    "gp" : [
      26,
      82,
      1
    ],
    "born" : "1993/08/13",
    "nick" : "hockey"
  }
}

A new field named "nick" has been added.

We can even operate on date fields to extract information such as the year and month:

GET hockey/_search
{
  "script_fields": {
    "birth_year": {
      "script": {
        "source": "doc.born.value.year"
      }
    }
  }
}

Result:
{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 10,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "hockey",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 1.0,
        "fields" : {
          "birth_year" : [
            1994
          ]
        }
      },
      {
        "_index" : "hockey",
        "_type" : "_doc",
        "_id" : "3",
        "_score" : 1.0,
        "fields" : {
          "birth_year" : [
            1984
          ]
        }
      },
      {
        "_index" : "hockey",
        "_type" : "_doc",
        "_id" : "4",
        "_score" : 1.0,
        "fields" : {
          "birth_year" : [
            1988
          ]
        }
      },
      {
        "_index" : "hockey",
        "_type" : "_doc",
        "_id" : "5",
        "_score" : 1.0,
        "fields" : {
          "birth_year" : [
            1996
          ]
        }
      },
      {
        "_index" : "hockey",
        "_type" : "_doc",
        "_id" : "6",
        "_score" : 1.0,
        "fields" : {
          "birth_year" : [
            1983
          ]
        }
      },
      {
        "_index" : "hockey",
        "_type" : "_doc",
        "_id" : "7",
        "_score" : 1.0,
        "fields" : {
          "birth_year" : [
            1984
          ]
        }
      },
      {
        "_index" : "hockey",
        "_type" : "_doc",
        "_id" : "8",
        "_score" : 1.0,
        "fields" : {
          "birth_year" : [
            1990
          ]
        }
      },
      {
        "_index" : "hockey",
        "_type" : "_doc",
        "_id" : "39",
        "_score" : 1.0,
        "fields" : {
          "birth_year" : [
            1983
          ]
        }
      },
      {
        "_index" : "hockey",
        "_type" : "_doc",
        "_id" : "10",
        "_score" : 1.0,
        "fields" : {
          "birth_year" : [
            1989
          ]
        }
      },
      {
        "_index" : "hockey",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.0,
        "fields" : {
          "birth_year" : [
            1993
          ]
        }
      }
    ]
  }
}

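6. Script caching

The first time Elasticsearch sees a new script, it compiles it and stores the compiled version in a cache. Both inline and stored scripts are cached. New scripts can evict cached ones. By default the cache holds 100 scripts in the release used here; the default may differ in other releases. You can change its size with script.cache.max_size, or set an expiry time with script.cache.expire. These settings go in config/elasticsearch.yml.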
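
The eviction behavior can be pictured with a small bounded cache. The Python sketch below is a toy model, not Elasticsearch code: ScriptCache is a hypothetical class, and the least-recently-used eviction policy is an assumption made for the illustration. Its max_size argument plays the role of script.cache.max_size.

```python
from collections import OrderedDict


class ScriptCache:
    """Toy bounded cache; LRU eviction is an assumption made for this sketch."""

    def __init__(self, max_size=100):
        self.max_size = max_size
        self.entries = OrderedDict()   # source text -> "compiled" placeholder
        self.compilations = 0

    def get(self, source):
        if source in self.entries:
            self.entries.move_to_end(source)      # mark as recently used
            return self.entries[source]
        self.compilations += 1                    # cache miss: compile
        self.entries[source] = f"compiled<{source}>"
        if len(self.entries) > self.max_size:
            self.entries.popitem(last=False)      # evict the oldest entry
        return self.entries[source]


cache = ScriptCache(max_size=2)
cache.get("a")
cache.get("b")
cache.get("a")   # hit
cache.get("c")   # evicts "b", the least recently used
cache.get("b")   # miss again, recompiled
print(cache.compilations, list(cache.entries))  # prints: 4 ['c', 'b']
```

With many distinct script sources and a small cache, scripts keep being recompiled, which is one more reason to keep the source constant and vary only params.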

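7. Debugging scripts

Scripts that cannot be debugged are very hard to work with, so a good debugging technique is a great help when writing them.

Debug.explain

Painless has no REPL. It would be nice to have one someday, but even a REPL would not tell the whole story of debugging Painless scripts embedded in Elasticsearch, because the data, or "context", that a script can access is so important. For now, the best way to debug an embedded script is to throw an exception at a chosen location. You can throw your own exception (throw new Exception('whatever')), but Painless's sandbox prevents you from reaching useful information such as an object's type. So Painless provides a utility method, Debug.explain, that throws the exception for you. For example, you can use _explain to explore the context available to a script query.

PUT /hockey/_doc/1?refresh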
  "last": "gaudreau",
  "goals": [
    9,
    27,
    1
  ],
  "assists": [
    17,
    46,
    0
  ],
  "gp": [
    26,
    82,
    1
  ]
}

POST /hockey/_explain/1
{
  "query": {
    "script": {
      "script": "Debug.explain(doc.goals)"
    }
  }
}

The response shows that doc.goals is an instance of org.elasticsearch.index.fielddata.ScriptDocValues.Longs:

{
  "error" : {
    "root_cause" : [
      {
        "type" : "script_exception",
        "reason" : "runtime error",
        "painless_class" : "org.elasticsearch.index.fielddata.ScriptDocValues.Longs",
        "to_string" : "[1, 9, 27]",
        "java_class" : "org.elasticsearch.index.fielddata.ScriptDocValues$Longs",
        "script_stack" : [
          "Debug.explain(doc.goals)",
          "                 ^---- HERE"
        ],
        "script" : "Debug.explain(doc.goals)",
        "lang" : "painless",
        "position" : {
          "offset" : 17,
          "start" : 0,
          "end" : 24
        }
      }
    ],
    "type" : "script_exception",
    "reason" : "runtime error",
    "painless_class" : "org.elasticsearch.index.fielddata.ScriptDocValues.Longs",
    "to_string" : "[1, 9, 27]",
    "java_class" : "org.elasticsearch.index.fielddata.ScriptDocValues$Longs",
    "script_stack" : [
      "Debug.explain(doc.goals)",
      "                 ^---- HERE"
    ],
    "script" : "Debug.explain(doc.goals)",
    "lang" : "painless",
    "position" : {
      "offset" : 17,
      "start" : 0,
      "end" : 24
    },
    "caused_by" : {
      "type" : "painless_explain_error",
      "reason" : null
    }
  },
  "status" : 400
}

You can use the same trick to see that _source is a LinkedHashMap in the _update API:

POST /hockey/_update/1
{
  "script": "Debug.explain(ctx._source)"
}

The response:

{
  "error" : {
    "root_cause" : [
      {
        "type" : "illegal_argument_exception",
        "reason" : "failed to execute script"
      }
    ],
    "type" : "illegal_argument_exception",
    "reason" : "failed to execute script",
    "caused_by" : {
      "type" : "script_exception",
      "reason" : "runtime error",
      "painless_class" : "java.util.LinkedHashMap",
      "to_string" : "{first=johnny, last=gaudreau, goals=[9, 27, 1], assists=[17, 46, 0], gp=[26, 82, 1]}",
      "java_class" : "java.util.LinkedHashMap",
      "script_stack" : [
        "Debug.explain(ctx._source)",
        "                 ^---- HERE"
      ],
      "script" : "Debug.explain(ctx._source)",
      "lang" : "painless",
      "position" : {
        "offset" : 17,
        "start" : 0,
        "end" : 26
      },
      "caused_by" : {
        "type" : "painless_explain_error",
        "reason" : null
      }
    }
  },
  "status" : 400
}
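
In both responses the useful details sit at different depths: under root_cause for the script query, under caused_by for the update. The following Python sketch pulls them out of either shape. find_explain is a hypothetical helper, and the sample is a trimmed copy of the _update response above with to_string shortened.

```python
import json


def find_explain(error):
    """Walk root_cause/caused_by until a node carrying painless_class is found."""
    if not isinstance(error, dict):
        return None
    if "painless_class" in error:
        return error["painless_class"], error.get("to_string")
    for child in error.get("root_cause", []):
        found = find_explain(child)
        if found:
            return found
    return find_explain(error.get("caused_by"))


# Trimmed copy of the _update response above (to_string shortened)
response = json.loads("""
{
  "error": {
    "root_cause": [{"type": "illegal_argument_exception", "reason": "failed to execute script"}],
    "type": "illegal_argument_exception",
    "caused_by": {
      "type": "script_exception",
      "painless_class": "java.util.LinkedHashMap",
      "to_string": "{first=johnny, last=gaudreau}"
    }
  },
  "status": 400
}
""")
print(find_explain(response["error"]))
# prints: ('java.util.LinkedHashMap', '{first=johnny, last=gaudreau}')
```

The same function also handles the _explain response, where painless_class appears directly on the error object.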


This article is a repost.
