Elasticsearch SQL是可以通过SQL方式查询Elasticsearch的组件。你能认为它是转换器,既能理解SQL,又能理解Elasticsearch,并利用Elasticsearch功能易于实现实时读取和处理数据。
1. Elasticsearch SQL
原生整合
Elasticsearch SQL是为Elasticsearch官方提供,根据底层存储对相关节点有效地执行每个查询。
无外部组件
不需要额外的硬件,进程,运行时库去查询Elasticsearch;通过在Elasticsearch集群中运行,Elasticsearch SQL消除了额外的组件。
轻量有效
Elasticsearch SQL并不抽象Elasticsearch及其搜索功能——相反,它包含并公开SQL,以同样实时的、声明式、简洁的方式支持全文搜索。
2. 示例
增加示例数据:
PUT /library/_bulk?refresh
{"index":{"_id": "Leviathan Wakes"}}
{"name": "Leviathan Wakes", "author": "James S.A. Corey", "release_date": "2011-06-02", "page_count": 561}
{"index":{"_id": "Hyperion"}}
{"name": "Hyperion", "author": "Dan Simmons", "release_date": "1989-05-26", "page_count": 482}
{"index":{"_id": "Dune"}}
{"name": "Dune", "author": "Frank Herbert", "release_date": "1965-06-01", "page_count": 604}
现在执行SQL查询:
POST /_sql?format=txt
{
"query": "SELECT * FROM library WHERE release_date < '2000-01-01'"
}
这里format表示输出格式,还有其他类型,如json、csv、yaml等。txt以表格方式进行展示便于阅读。
author | name | page_count | release_date
---------------+---------------+---------------+------------------------
Dan Simmons |Hyperion |482 |1989-05-26T00:00:00.000Z
Frank Herbert |Dune |604 |1965-06-01T00:00:00.000Z
2.1 输出json列式数据
增加"columnar": true参数,结果以列式输出,一般用于图表展示数据。
POST /_sql?format=json
{
"query": "SELECT * FROM library ORDER BY page_count DESC",
"fetch_size": 5,
"columnar": true
}
{
"columns": [
{"name": "author", "type": "text"},
{"name": "name", "type": "text"},
{"name": "page_count", "type": "short"},
{"name": "release_date", "type": "datetime"}
],
"values": [
["Peter F. Hamilton", "Vernor Vinge", "Frank Herbert", "Alastair Reynolds", "James S.A. Corey"],
["Pandora's Star", "A Fire Upon the Deep", "Dune", "Revelation Space", "Leviathan Wakes"],
[768, 613, 604, 585, 561],
["2004-03-02T00:00:00.000Z", "1992-06-01T00:00:00.000Z", "1965-06-01T00:00:00.000Z", "2000-03-15T00:00:00.000Z", "2011-06-02T00:00:00.000Z"]
],
"cursor": "sDXF1ZXJ5QW5kRmV0Y2gBAAAAAAAAAAEWWWdrRlVfSS1TbDYtcW9lc1FJNmlYdw==:BAFmBmF1dGhvcgFmBG5hbWUBZgpwYWdlX2NvdW50AWYMcmVsZWFzZV9kYXRl+v///w8="
}
cursor是游标,用于读取下一页数据。但后续调用仍然需要columnar参数:
POST /_sql?format=json
{
"cursor": "sDXF1ZXJ5QW5kRmV0Y2gBAAAAAAAAAAEWWWdrRlVfSS1TbDYtcW9lc1FJNmlYdw==:BAFmBmF1dGhvcgFmBG5hbWUBZgpwYWdlX2NvdW50AWYMcmVsZWFzZV9kYXRl+v///w8=",
"columnar": true
}
2.2 传输查询参数
我们可以在SQL语句中直接写具体值进行查询:
POST /_sql?format=txt
{
"query": "SELECT YEAR(release_date) AS year FROM library WHERE page_count > 300 AND author = 'Frank Herbert' GROUP BY year HAVING COUNT(*) > 0"
}
但也可以抽取参数在单独行,值用占位符代替:
POST /_sql?format=txt
{
"query": "SELECT YEAR(release_date) AS year FROM library WHERE page_count > ? AND author = ? GROUP BY year HAVING COUNT(*) > ?",
"params": [300, "Frank Herbert", 0]
}
2.3 REST 参数
除了query和fetch_size外,还有其他一些参数,如请求超时等。下面列出常用参数:
参数名称 | 缺省值 | 描述 |
---|---|---|
query | 必须 | 执行的查询sql |
fetch_size | 1000 | 返回最大行数 |
filter | 无 | 可选,使用query dsl增加过滤功能 |
request_timeout | 90s | 请求超时时间 |
columnar | 1000 | 返回列式结果,仅这些格式可用:json, yaml, cbor, smile |
params | 无 | 列出占位符参数的值 |
2.4 函数和操作符
内容比较多,一般理解SQL语法掌握比较容易,读者直接阅读官网文档吧。
3. 总结
本文通过示例简要介绍SQL访问Elasticsearch。对于灵活定义sql输出结果以图表方式(如,eCharts)展示比较方便。
本文参考链接:https://blog.csdn.net/neweastsun/article/details/110338763