Elasticsearch Query DSL: Query Context and Filter Context

2022-07-19 · 131 · findumars

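Elasticsearch provides a comprehensive JSON-based Query DSL (Domain Specific Language) for defining queries. It centers on two concepts, query context and filter context, and on queries that combine the two.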

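1. Query context

Query clauses used in query context search by relevance: they answer the question "how well does this document match this clause?". The response lists all matching documents sorted by relevance score. That score is computed by the clauses in query context and reported as _score; it expresses how well a document matches relative to the other documents.

Query context is in effect whenever a query clause is passed to a query parameter, such as the query parameter of the search API. The following example runs in query context and returns all courses whose description contains the word science.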

GET /courses/_search 
{ 
  "query": { 
    "match": {  
      "course_description": "science"  
    } 
  } 
} 

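2. Filter context

Filter context can be seen as a binary test whose result is 0 or 1. Where query context answers "how well does it match?", filter context simply answers "yes or no".

Filter context is mostly used to filter structured data, for example range checks (a given date range) or status checks. Elasticsearch automatically caches frequently used filters, which improves query performance.

Filter context is in effect whenever a query clause is passed to a filter parameter, such as the filter or must_not parameter of a bool query, the filter parameter of a constant_score query, or the filter aggregation. The following filter-context clause returns all course documents with at least 33 enrolled students (students_enrolled greater than or equal to 33).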

GET /courses/_search 
{ 
  "query": { 
    "bool": { 
      "filter": {
        "range": { "students_enrolled": { "gte": 33 }}
      }
    } 
  } 
} 

Note: the basic difference is that query context is tied to _score (the relevance score), whereas filter context is tied to a binary outcome (true or false).
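The contrast can be sketched in plain Python. This is only a toy model under stated assumptions: the helper names (tokens, toy_score, passes_filter) are hypothetical, the documents are abbreviated versions of the sample data used later in the article, and the score is a bare term count, not Elasticsearch's real relevance formula.

```python
import re

# Two abbreviated course documents (field names follow the article's sample data).
DOCS = [
    {"name": "Marketing 101", "students_enrolled": 18,
     "course_description": "a course from the business school on the introduction to marketing"},
    {"name": "Capital Markets 350", "students_enrolled": 13,
     "course_description": "an advanced course on raising capital, bonds and shares"},
]

def tokens(text):
    """Lowercase and split on non-alphanumerics (a crude stand-in for an analyzer)."""
    return re.findall(r"[a-z0-9]+", text.lower())

def toy_score(doc, field, term):
    """Query context: return a number saying HOW WELL the document matches.
    This is just a term count, NOT Elasticsearch's actual scoring."""
    return float(tokens(doc[field]).count(term))

def passes_filter(doc, field, gte):
    """Filter context: return only True or False, with no score involved."""
    return doc[field] >= gte

scores = {d["name"]: toy_score(d, "course_description", "marketing") for d in DOCS}
kept = [d["name"] for d in DOCS if passes_filter(d, "students_enrolled", 16)]
print(scores)
print(kept)
```

The scoring function produces a graded value per document, while the filter only decides membership, which is what makes filter results easy to cache.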

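3. Query examples

This section works through several examples. To verify the results, some sample data is listed below as search hits; the _source of each hit can be bulk-indexed into the courses index for testing.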

{ 
"_index" : "courses", 
"_type" : "_doc", 
"_id" : "7G4TN3ABnUeCEegtv7VW", 
"_score" : 1.0, 
"_source" : { 
    "name" : "Marketing 101", 
    "room" : "E4", 
    "professor" : { 
    "name" : "William Smith", 
    "department" : "finance", 
    "facutly_type" : "part-time", 
    "email" : "wills@onuni.com" 
    }, 
    "students_enrolled" : 18, 
    "course_publish_date" : "2015-06-21", 
    "course_description" : "Mkt 101 is a course from the business school on the introduction to marketing that teaches students the fundamentals of market analysis, customer retention and online advertisements" 
} 
}, 
{ 
"_index" : "courses", 
"_type" : "_doc", 
"_id" : "7W4TN3ABnUeCEegtv7VW", 
"_score" : 1.0, 
"_source" : { 
    "name" : "Accounting 101", 
    "room" : "E3", 
    "professor" : { 
    "name" : "Thomas Baszo", 
    "department" : "finance", 
    "facutly_type" : "part-time", 
    "email" : "baszot@onuni.com" 
    }, 
    "students_enrolled" : 27, 
    "course_publish_date" : "2015-01-19", 
    "course_description" : "Act 101 is a course from the business school on the introduction to accounting that teaches students how to read and compose basic financial statements" 
} 
}, 
{ 
"_index" : "courses", 
"_type" : "_doc", 
"_id" : "7m4TN3ABnUeCEegtv7VW", 
"_score" : 1.0, 
"_source" : { 
    "name" : "Tax Accounting 200", 
    "room" : "E7", 
    "professor" : { 
    "name" : "Thomas Baszo", 
    "department" : "finance", 
    "facutly_type" : "part-time", 
    "email" : "baszot@onuni.com" 
    }, 
    "students_enrolled" : 17, 
    "course_publish_date" : "2016-06-15", 
    "course_description" : "Tax Act 200 is an intermediate course covering various aspects of tax law" 
} 
}, 
{ 
"_index" : "courses", 
"_type" : "_doc", 
"_id" : "724UN3ABnUeCEegtkLUq", 
"_score" : 1.0, 
"_source" : { 
    "name" : "Capital Markets 350", 
    "room" : "E3", 
    "professor" : { 
    "name" : "Thomas Baszo", 
    "department" : "finance", 
    "facutly_type" : "part-time", 
    "email" : "baszot@onuni.com" 
    }, 
    "students_enrolled" : 13, 
    "course_publish_date" : "2016-01-11", 
    "course_description" : "This is an advanced course teaching crucial topics related to raising capital and bonds, shares and other long-term equity and debt financial instrucments" 
} 
} 
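Because the data above is shown as search hits, only the _source part of each hit would be indexed. A minimal sketch of building a bulk payload in Python follows; bulk_body is a hypothetical helper and the two documents are abbreviated. The resulting text is newline-delimited JSON of the kind the _bulk endpoint accepts (sent with Content-Type application/x-ndjson); actually sending it is left out.

```python
import json

# Only the _source of each hit is indexed; two documents are shown, abbreviated.
sources = [
    {"name": "Marketing 101", "room": "E4", "students_enrolled": 18,
     "course_publish_date": "2015-06-21"},
    {"name": "Accounting 101", "room": "E3", "students_enrolled": 27,
     "course_publish_date": "2015-01-19"},
]

def bulk_body(index, docs):
    """Build an NDJSON payload: one action line, then one source line, per document."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index}}))
        lines.append(json.dumps(doc))
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"

body = bulk_body("courses", sources)
print(body)
```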
 

1) Query context only

GET /courses/_search 
{ 
  "query": { 
     
    "match": {  
      "course_description": "science"  
    } 
  } 
} 

Each hit in the response carries a _score, the document's relevance score.

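2) Query context with a filter placeholder

A bool query combines several match clauses. The filter parameter, which denotes filter context, is left empty here.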

GET /courses/_search 
{ 
  "query": {  
    "bool": {  
      "must": [ 
        { "match": { "professor.facutly_type": "part-time" }}, 
        { "match": { "professor.department": "finance" }} 
      ], 
      "filter": [  
          
      ] 
    } 
  } 
} 

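Every clause inside must has to match, so must behaves like a logical AND.

3) Query context with a filter

This adds a filter condition to the previous query. The range filter keeps only the documents that satisfy the condition and drops the rest, without affecting _score.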

GET /courses/_search 
{ 
  "query": {  
    "bool": {  
      "must": [ 
        { "match": { "professor.facutly_type": "part-time" }}, 
        { "match": { "professor.department": "finance" }} 
      ], 
      "filter": [  
         { "range":  { "students_enrolled": { "gte": 16 }}} 
      ] 
    } 
  } 
} 
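To see which of the four sample documents this request would keep, the must and filter logic can be imitated in plain Python. This is a sketch under assumptions: bool_must_filter is a hypothetical helper, the documents are flattened and abbreviated, and match is reduced to simple equality rather than analyzed full-text matching.

```python
# Abbreviated, flattened copies of the four sample documents.
DOCS = [
    {"name": "Marketing 101", "facutly_type": "part-time",
     "department": "finance", "students_enrolled": 18},
    {"name": "Accounting 101", "facutly_type": "part-time",
     "department": "finance", "students_enrolled": 27},
    {"name": "Tax Accounting 200", "facutly_type": "part-time",
     "department": "finance", "students_enrolled": 17},
    {"name": "Capital Markets 350", "facutly_type": "part-time",
     "department": "finance", "students_enrolled": 13},
]

def bool_must_filter(docs, must, filters):
    """Keep documents for which every must predicate and every filter predicate holds."""
    return [d for d in docs
            if all(p(d) for p in must) and all(p(d) for p in filters)]

# "match" is simplified to equality here; real matching is analyzed full text.
must = [lambda d: d["facutly_type"] == "part-time",
        lambda d: d["department"] == "finance"]
filters = [lambda d: d["students_enrolled"] >= 16]

hits = [d["name"] for d in bool_must_filter(DOCS, must, filters)]
print(hits)
```

All four documents satisfy the must clauses; the filter then drops Capital Markets 350, which has only 13 enrolled students.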

4) Using the must_not clause

The must_not clause removes documents that match its condition from the results.

GET /courses/_search 
{ 
  "query": {  
    "bool": {  
      "must": [ 
        { "match": { "professor.facutly_type": "part-time" }}, 
        { "match": { "professor.department": "finance" }} 
      ], 
      "must_not": [ 
        { "match": { "course_description": "business" }} 
      ],  
      "filter": [  
         { "range":  { "students_enrolled": { "gte": 16 }}} 
      ] 
    } 
  } 
} 

must_not behaves like a logical NOT: documents that match its clauses are excluded.
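The combined effect of must, must_not and filter on the sample data can be traced with a small Python sketch. The helper names (tokens, run_bool) are hypothetical, the descriptions are shortened, and matching is simplified to equality or token membership.

```python
import re

# Abbreviated copies of the four sample documents (descriptions shortened).
DOCS = [
    {"name": "Marketing 101", "facutly_type": "part-time", "department": "finance",
     "students_enrolled": 18,
     "course_description": "a course from the business school on the introduction to marketing"},
    {"name": "Accounting 101", "facutly_type": "part-time", "department": "finance",
     "students_enrolled": 27,
     "course_description": "a course from the business school on the introduction to accounting"},
    {"name": "Tax Accounting 200", "facutly_type": "part-time", "department": "finance",
     "students_enrolled": 17,
     "course_description": "an intermediate course covering various aspects of tax law"},
    {"name": "Capital Markets 350", "facutly_type": "part-time", "department": "finance",
     "students_enrolled": 13,
     "course_description": "an advanced course teaching crucial topics related to raising capital"},
]

def tokens(text):
    """Lowercase and split on non-alphanumerics (a crude stand-in for an analyzer)."""
    return re.findall(r"[a-z0-9]+", text.lower())

def run_bool(docs, must, must_not, filters):
    """Keep a document if all must and filter predicates hold and no must_not predicate does."""
    return [d for d in docs
            if all(p(d) for p in must)
            and not any(p(d) for p in must_not)
            and all(p(d) for p in filters)]

must = [lambda d: d["facutly_type"] == "part-time",
        lambda d: d["department"] == "finance"]
must_not = [lambda d: "business" in tokens(d["course_description"])]
filters = [lambda d: d["students_enrolled"] >= 16]

hits = [d["name"] for d in run_bool(DOCS, must, must_not, filters)]
print(hits)
```

Both 101 courses mention the business school and are excluded by must_not, and Capital Markets 350 fails the range filter, which leaves Tax Accounting 200.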

5) multi_match

multi_match runs the same query text against several fields:

GET /courses/_search 
{ 
  "query": { 
    "multi_match": { 
      "query": "computer", 
      "fields": ["name","professor.department"] 
    } 
  } 
} 
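A rough Python imitation of matching one query against several fields, including a dotted path such as professor.department, might look like this. get_path and multi_match are hypothetical helpers, the documents are abbreviated, and scoring is ignored entirely; only the "any listed field matches" idea is shown.

```python
import re

# Two abbreviated sample documents, keeping the nested professor object.
DOCS = [
    {"name": "Marketing 101", "professor": {"department": "finance"}},
    {"name": "Capital Markets 350", "professor": {"department": "finance"}},
]

def tokens(text):
    """Lowercase and split on non-alphanumerics (a crude stand-in for an analyzer)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def get_path(doc, path):
    """Resolve a dotted field path such as 'professor.department'."""
    for key in path.split("."):
        doc = doc[key]
    return doc

def multi_match(doc, query, fields):
    """True if any query token appears in any of the listed fields."""
    want = tokens(query)
    return any(want & tokens(str(get_path(doc, f))) for f in fields)

FIELDS = ["name", "professor.department"]
computer_hits = [d["name"] for d in DOCS if multi_match(d, "computer", FIELDS)]
finance_hits = [d["name"] for d in DOCS if multi_match(d, "finance", FIELDS)]
print(computer_hits)
print(finance_hits)
```

None of the sample documents listed in this article contains the word computer in either field, so that query finds nothing in this sketch, whereas finance matches through professor.department.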

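6) match_phrase

match_phrase requires the whole phrase to match: the terms must appear in the same order and next to each other. A partial phrase, or one interrupted by other words, does not match.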

GET /courses/_search 
{ 
  "query": { 
    "match_phrase": { 
      "course_description": "computer science introduction teaching" 
    } 
  } 
} 
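The phrase rule can be illustrated with a simplified Python check. phrase_match is a hypothetical helper, and tokenization here is a plain lowercase split, which is an assumption and not the real analysis chain.

```python
import re

def tokens(text):
    """Lowercase and split on non-alphanumerics (a crude stand-in for an analyzer)."""
    return re.findall(r"[a-z0-9]+", text.lower())

def phrase_match(text, phrase):
    """True if the phrase's tokens occur consecutively and in order in the text."""
    doc, want = tokens(text), tokens(phrase)
    n = len(want)
    return n > 0 and any(doc[i:i + n] == want for i in range(len(doc) - n + 1))

description = "Tax Act 200 is an intermediate course covering various aspects of tax law"

in_order = phrase_match(description, "intermediate course")       # adjacent, in order
reversed_order = phrase_match(description, "course intermediate")  # wrong order
interrupted = phrase_match(description, "intermediate covering")   # a word in between
print(in_order, reversed_order, interrupted)
```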

7) match_phrase_prefix

match_phrase_prefix works like match_phrase, except that the last term of the phrase is treated as a prefix.

GET /courses/_search 
{ 
  "query": { 
    "match_phrase_prefix": { 
      "course_description": "computer science" 
    } 
  } 
} 
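A simplified Python version of the prefix variant is sketched below: every term except the last must match exactly and in sequence, and the last one only needs to be a prefix of the next token. phrase_prefix_match is a hypothetical helper and the tokenization is an assumption.

```python
import re

def tokens(text):
    """Lowercase and split on non-alphanumerics (a crude stand-in for an analyzer)."""
    return re.findall(r"[a-z0-9]+", text.lower())

def phrase_prefix_match(text, phrase):
    """Like a phrase match, but the final phrase token may be a prefix of the document token."""
    doc, want = tokens(text), tokens(phrase)
    if not want:
        return False
    *head, last = want
    n = len(want)
    for i in range(len(doc) - n + 1):
        window = doc[i:i + n]
        if window[:-1] == head and window[-1].startswith(last):
            return True
    return False

description = "Tax Act 200 is an intermediate course covering various aspects of tax law"

prefix_ok = phrase_prefix_match(description, "intermediate cour")  # "cour" is a prefix of "course"
prefix_bad = phrase_prefix_match(description, "intermediate cov")  # next token is "course", not "cov..."
print(prefix_ok, prefix_bad)
```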

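8) Range clause

gte means greater than or equal to, and lte means less than or equal to. The other options are gt (greater than) and lt (less than).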

GET /courses/_search 
{ 
  "query": { 
    "range": { 
      "students_enrolled": { 
        "gte": 20, 
        "lte": 30 
      } 
    } 
  } 
} 
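The four bounds map directly onto comparison operators, as this small Python sketch shows. in_range is a hypothetical helper, and the enrollment numbers are taken from the sample documents.

```python
import operator

# Map each range option to the comparison it stands for.
OPS = {"gt": operator.gt, "gte": operator.ge, "lt": operator.lt, "lte": operator.le}

def in_range(value, bounds):
    """True if the value satisfies every bound, e.g. {"gte": 20, "lte": 30}."""
    return all(OPS[name](value, limit) for name, limit in bounds.items())

# students_enrolled values from the four sample documents.
enrolled = {
    "Marketing 101": 18,
    "Accounting 101": 27,
    "Tax Accounting 200": 17,
    "Capital Markets 350": 13,
}

hits = [name for name, n in enrolled.items() if in_range(n, {"gte": 20, "lte": 30})]
print(hits)
```

Only Accounting 101, with 27 enrolled students, falls between 20 and 30 inclusive.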

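9) should

should clauses are generally used to favor the most relevant documents. Because this bool query also has a must clause, the should clause is optional by default and only raises the score of documents that match it. With minimum_should_match set to 1, a document must match at least one should clause to be returned, so removing that parameter returns more documents.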

GET /courses/_search 
{ 
  "query": { 
    "bool": { 
      "must": [ 
        {"match": {"name":"101"}} 
      ],  
      "must_not": [ 
        {"match": {"room": "e7"}} 
      ], 
      "should": [ 
        { 
          "range": { 
            "students_enrolled": { 
              "gte": 10, 
              "lte": 20 
            } 
          } 
        } 
      ], 
      "minimum_should_match": 1 
    } 
  } 
} 

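should behaves like a logical OR. minimum_should_match is a parameter of the same bool query as should (its position in the JSON does not matter) and sets the minimum number of should clauses a document must match.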
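The effect of minimum_should_match on the sample data can be traced with a toy Python model. bool_query is a hypothetical helper, matching is simplified to token or equality checks, and ranking by the number of matching should clauses is only a stand-in for real scoring.

```python
# Abbreviated copies of the four sample documents.
DOCS = [
    {"name": "Marketing 101", "room": "E4", "students_enrolled": 18},
    {"name": "Accounting 101", "room": "E3", "students_enrolled": 27},
    {"name": "Tax Accounting 200", "room": "E7", "students_enrolled": 17},
    {"name": "Capital Markets 350", "room": "E3", "students_enrolled": 13},
]

def bool_query(docs, must, must_not, should, minimum_should_match=0):
    """Toy bool query: must and must_not decide membership, should clauses
    rank the hits, and minimum_should_match can make some of them required."""
    hits = []
    for d in docs:
        if not all(p(d) for p in must):
            continue
        if any(p(d) for p in must_not):
            continue
        matched = sum(1 for p in should if p(d))
        if matched < minimum_should_match:
            continue
        hits.append((matched, d["name"]))
    # More matching should clauses rank first (a stand-in for relevance scoring).
    hits.sort(key=lambda h: -h[0])
    return [name for _, name in hits]

must = [lambda d: "101" in d["name"].split()]
must_not = [lambda d: d["room"].lower() == "e7"]
should = [lambda d: 10 <= d["students_enrolled"] <= 20]

strict = bool_query(DOCS, must, must_not, should, minimum_should_match=1)
loose = bool_query(DOCS, must, must_not, should, minimum_should_match=0)
print(strict)
print(loose)
```

With the requirement in place only Marketing 101 is returned; without it Accounting 101 also comes back, ranked below Marketing 101 because it does not match the should clause.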

4. Summary

We covered the Elasticsearch Query DSL and used examples to illustrate query context, filter context, and how to combine the two.


Reference: https://blog.csdn.net/neweastsun/article/details/104278308