Elasticsearch (ES) Notes
2021-07-28 21:02:28
A consolidated summary of Elasticsearch knowledge
Main concepts (Elasticsearch term: rough relational-database analogue)
index: database
type: table
document: row
field: column
mapping: schema
Core operations
Index operations
Create/delete an index, open/close an index, add/view mappings, and set/view settings.
# Create an index
PUT /songs_v3
# Delete an index
DELETE /songs_v3
# Create an index with explicit settings
PUT /songs_v4
{
  "settings": {
    "number_of_shards": 6,
    "number_of_replicas": 1
  }
}
# Get the settings of an index
GET /songs_v4/_settings
# Update the settings of an index
# Index settings fall into two groups:
#   static: fixed at index creation or changeable only on a closed index (e.g. index.number_of_shards, index.shard.check_on_startup)
#   dynamic: can be changed while the index is open and serving requests (e.g. index.number_of_replicas)
PUT /songs_v4/_settings
{
  "number_of_replicas": 2
}
# A static setting cannot be updated while the index is open, so this request is rejected
PUT /songs_v4/_settings
{
  "index.shard.check_on_startup": true
}
# Close the index
POST /songs_v4/_close
# Open the index
POST /songs_v4/_open
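The static/dynamic distinction above can be sketched as a small toy model. This is not the Elasticsearch client API: `ToyIndex` and its methods are hypothetical names used only to illustrate the rule that dynamic settings may change on an open index, static settings only while the index is closed, and the shard count not at all after creation.

```python
# Toy model of index settings; ToyIndex is a hypothetical class, not an ES API.
STATIC_SETTINGS = {"index.shard.check_on_startup"}   # changeable only while closed
FIXED_AT_CREATION = {"index.number_of_shards"}       # never changeable afterwards


class ToyIndex:
    def __init__(self, shards, replicas):
        self.is_open = True
        self.settings = {
            "index.number_of_shards": shards,
            "index.number_of_replicas": replicas,
        }

    def close(self):
        self.is_open = False

    def open(self):
        self.is_open = True

    def update_settings(self, changes):
        # Validate every key first so a rejected request changes nothing.
        for key in changes:
            if key in FIXED_AT_CREATION:
                raise ValueError(f"{key} is fixed at index creation")
            if key in STATIC_SETTINGS and self.is_open:
                raise ValueError(f"{key} is static; close the index first")
        self.settings.update(changes)


idx = ToyIndex(shards=6, replicas=1)
idx.update_settings({"index.number_of_replicas": 2})   # dynamic: accepted while open
idx.close()
idx.update_settings({"index.shard.check_on_startup": True})  # static: accepted while closed
idx.open()
```

The model validates the whole request before applying it, which mirrors the close, update, reopen sequence shown in the console commands.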
# Get the mapping of an index
GET /songs_v4/_mapping
# Delete a mapping (not supported)
DELETE /songs_v4/_mapping
Document operations
Index/get/update/delete documents, search documents, and run scripts.
# Index a document
# Specify the document ID explicitly
PUT /songs_v4/_doc/5
{
  "songName": "could this be love",
  "singer": "Jennifer Lopez",
  "lyrics": "Could This Be love, work up This Morning Just..."
}
# Let ES generate the document ID
POST /songs_v4/_doc
{
  "songName": "could this be love",
  "singer": "Jennifer Lopez",
  "lyrics": "Could This Be love, work up This Morning Just..."
}
# Update a document (a PUT to an existing ID replaces the whole document)
PUT /songs_v4/_doc/5
{
  "songName": "could this be love",
  "singer": "zp",
  "lyrics": "Could This Be love, work up This Morning Just..."
}
# Get a specific document by ID
GET /songs_v4/_doc/5
# Delete a document by ID
DELETE /songs_v4/_doc/5
# Search for documents
GET /songs_v4/_search?q=singer:Jennifer
GET /songs_v4/_mapping
Mapping operations
# Create the index first, then add the mapping
PUT /books
PUT /books/_mapping
{
  "properties": {
    "bookName": {"type": "text"},
    "content": {"type": "text"}
  }
}
GET /books/_mapping
DELETE /books
# Create the index and specify the mapping in one request
PUT /books
{
  "mappings": {
    "properties": {
      "bookName": {"type": "text"},
      "content": {"type": "text"}
    }
  }
}
GET /books/_mapping
DELETE /books
# Add a field to an existing mapping (the books index must exist)
PUT /books/_mapping
{
  "properties": {
    "author": {"type": "text"}
  }
}
GET /books/_mapping
Multi-fields
PUT my_index
{
  "mappings": {
    "properties": {
      "city": {
        "type": "text",
        "fields": {
          "raw": {
            "type": "keyword"
          }
        }
      }
    }
  }
}
PUT my_index/_doc/1
{
  "city": "New York"
}
PUT my_index/_doc/2
{
  "city": "York"
}
GET my_index/_search
{
  "query": {
    "match": {
      "city": "york"
    }
  },
  "sort": [
    {
      "city.raw": {
        "order": "asc"
      }
    }
  ],
  "aggs": {
    "citys": {
      "terms": {
        "field": "city.raw"
      }
    }
  }
}
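A rough sketch of why the multi-field mapping helps: the `text` field is analyzed into tokens for full-text matching, while the `keyword` sub-field keeps the original value for sorting and aggregations. The `analyze` function below is a deliberately simplified stand-in for the standard analyzer (lowercase plus whitespace split), and the dictionaries are illustrative structures, not how ES stores data internally.

```python
from collections import Counter


def analyze(text):
    """Simplified stand-in for the standard analyzer: lowercase, split on whitespace."""
    return text.lower().split()


docs = {1: "New York", 2: "York"}

# "city" (text): inverted index from token to the IDs of documents containing it.
inverted = {}
for doc_id, city in docs.items():
    for token in analyze(city):
        inverted.setdefault(token, set()).add(doc_id)

# "city.raw" (keyword): the unanalyzed value, kept per document.
raw = dict(docs)

# match query on "city": the query text is analyzed the same way as the documents.
hits = set()
for token in analyze("york"):
    hits |= inverted.get(token, set())

# sort on "city.raw" ascending, then a terms aggregation on "city.raw".
sorted_hits = sorted(hits, key=lambda doc_id: raw[doc_id])
buckets = Counter(raw[doc_id] for doc_id in hits)

print(sorted_hits)    # document IDs ordered by the raw city value
print(dict(buckets))  # one bucket per distinct raw value
```

Both documents match "york" because the token appears in each analyzed value, while sorting and bucketing operate on the untouched strings "New York" and "York".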
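Specific optimizations
Index types
doc_values
Most fields are added to the inverted index by default, which makes them searchable. Sorting, aggregations, and scripts instead need a forward (document-to-value, column-oriented) structure, which doc_values provides.
fielddata
Most field types use doc_values for sorting, aggregations, and scripts, but text fields do not support doc_values and fall back to the fielddata mechanism instead. fielddata is held in heap memory and is very expensive, so it is disabled by default on text fields.
index
doc_values controls whether a field gets the forward (columnar) structure; index controls whether a field is added to the inverted index and is therefore searchable.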
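To make the two structures concrete, here is a minimal sketch of an inverted index next to a doc_values-style column. The `plays` field and the sample documents are made up for illustration, and real Lucene segments are far more compact than these Python containers.

```python
docs = [
    {"singer": "Jennifer Lopez", "plays": 120},
    {"singer": "zp", "plays": 45},
    {"singer": "Jennifer Lopez", "plays": 80},
]

# Inverted index (what "index": true builds): term -> list of document IDs.
inverted = {}
for doc_id, doc in enumerate(docs):
    inverted.setdefault(doc["singer"], []).append(doc_id)

# Forward, column-oriented structure (what doc_values builds): doc ID -> value.
plays_column = [doc["plays"] for doc in docs]

# Search uses the inverted index to find matching documents...
matching = inverted.get("Jennifer Lopez", [])

# ...while sorting and aggregating read values back by document ID.
by_plays = sorted(matching, key=lambda doc_id: plays_column[doc_id])
total_plays = sum(plays_column[doc_id] for doc_id in matching)
```

The inverted index answers "which documents contain this term", and the column answers "what is this document's value", which is the lookup direction that sorting and aggregations need.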
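store
By default, _source holds the whole original document. When a field's store attribute is set to true, ES additionally stores that field on its own, so it can be retrieved without going through _source.
Typical use: a book's content field may hold millions of characters. Pulling name and author out of a _source that carries the whole content is wasteful, so content is excluded from _source and kept as a separately stored field.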
PUT books
{
  "mappings": {
    "properties": {
      "name": {"type": "text"},
      "author": {"type": "text"},
      "content": {"type": "text", "store": true}
    },
    "_source": {
      "excludes": [
        "content"
      ]
    }
  }
}
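The effect of combining `store: true` with a `_source` exclude can be sketched as follows. `index_book` is a hypothetical helper that only models which copy of each field is kept; it is not an Elasticsearch call, and the sample book is invented.

```python
def index_book(doc, source_excludes, stored_fields):
    """Model what is kept for one document: a trimmed _source plus stored fields."""
    source = {k: v for k, v in doc.items() if k not in source_excludes}
    stored = {k: doc[k] for k in stored_fields if k in doc}
    return {"_source": source, "stored": stored}


book = {
    "name": "Sample Book",              # hypothetical sample data
    "author": "Sample Author",
    "content": "a very long text " * 1000,
}

entry = index_book(book, source_excludes={"content"}, stored_fields={"content"})

# Fetching _source stays cheap: only the small fields are in it.
print(sorted(entry["_source"]))         # field names kept in _source
# The large field is still retrievable, but only when asked for explicitly.
print(len(entry["stored"]["content"]))  # size of the separately stored content
```

Reading name and author then touches only the small `_source`, while the large content is loaded only on request.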
Meta-fields
Field name      Description
_index          The index the document belongs to
_id             The document's ID
_type           The type the document belongs to
_uid            Combination of _type and _id, in the form _type#_id
_source         The original JSON body of the document
_all            Automatically combines the values of all fields (deprecated)
_field_names    Indexes the name of every field
_parent         Defines a parent-child relationship between documents (deprecated)
_routing        Routes a document to a specific shard
_meta           Custom metadata
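The `_routing` meta-field decides which shard holds a document. A simplified form of the rule is `shard = hash(routing) % number_of_shards`, where the routing value defaults to the document ID. The sketch below uses `zlib.crc32` purely as a stand-in hash (Elasticsearch uses a Murmur3 hash, and newer versions add a routing-shard factor), so the shard numbers it produces will not match a real cluster.

```python
import zlib


def pick_shard(routing, number_of_shards):
    """Simplified routing rule; crc32 is a stand-in for the hash ES actually uses."""
    digest = zlib.crc32(routing.encode("utf-8"))
    return digest % number_of_shards


# The same routing value always lands on the same shard.
print(pick_shard("5", 6) == pick_shard("5", 6))

# Changing the shard count changes where existing documents would be looked up.
moved = [i for i in range(100) if pick_shard(str(i), 6) != pick_shard(str(i), 7)]
print(len(moved) > 0)
```

Because a document's location depends on the shard count, this also illustrates why `number_of_shards` is fixed once an index has been created.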
Basic operations
Cluster management
# Cluster health
curl 'http://localhost:9200/_cat/health?v'
# Nodes
curl 'http://localhost:9200/_cat/nodes?v'
# Shards
curl 'http://localhost:9200/_cat/shards?v'
# List the indices in the cluster
curl 'http://localhost:9200/_cat/indices?v'
# List the available _cat endpoints
curl 'http://localhost:9200/_cat'
Create, read, update, delete
# Create: PUT /index_name/type_name/id (typed path from 6.x and earlier; newer versions use _doc in place of the type name)
PUT /shop_index/productInfo/1
{
  "name": "HuaWei Mate8",
  "desc": "Cheap and easy to use",
  "price": 2500,
  "producer": "HuaWei Producer",
  "tags": [
    "Cheap",
    "Fast"
  ]
}
# Read: GET /index_name/type_name/id
GET /shop_index/productInfo/1
# Response
{
  "_index": "shop_index",
  "_type": "productInfo",
  "_id": "1",
  "_version": 1,
  "found": true,
  "_source": {
    "name": "HuaWei Mate8",
    "desc": "Cheap and easy to use",
    "price": 2500,
    "producer": "HuaWei Producer",
    "tags": [
      "Cheap",
      "Fast"
    ]
  }
}
# Update (full replacement): PUT /index_name/type_name/id
PUT /shop_index/productInfo/1
{
  "name": "HuaWei Mate8",
  "desc": "Cheap and easy to use",
  "price": 2400,
  "producer": "HuaWei Producer",
  "tags": [
    "Cheap",
    "Fast"
  ]
}
# Update (partial): POST /index_name/type_name/id/_update
POST /shop_index/productInfo/1/_update
{
  "doc": {
    "price": 2200
  }
}
# Delete: DELETE /index_name/type_name/id
DELETE /shop_index/productInfo/1
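The difference between the two update styles above (a full PUT versus a partial `_update` with a `doc` body) can be sketched with a plain dictionary acting as the document store. `index_document` and `partial_update` are hypothetical helpers, and the merge here is shallow, whereas Elasticsearch merges nested objects recursively.

```python
import copy


def index_document(store, doc_id, body):
    """PUT to an ID: the stored source is replaced wholesale."""
    version = store[doc_id]["_version"] + 1 if doc_id in store else 1
    store[doc_id] = {"_version": version, "_source": copy.deepcopy(body)}


def partial_update(store, doc_id, doc):
    """_update with a "doc" body: listed fields are merged into the existing source."""
    current = store[doc_id]
    merged = {**current["_source"], **doc}  # shallow merge only in this sketch
    store[doc_id] = {"_version": current["_version"] + 1, "_source": merged}


store = {}
index_document(store, "1", {"name": "HuaWei Mate8", "price": 2500, "tags": ["Cheap", "Fast"]})

partial_update(store, "1", {"price": 2200})
after_partial = copy.deepcopy(store["1"])   # name and tags survive, price changes

index_document(store, "1", {"price": 2400})
after_full = copy.deepcopy(store["1"])      # only price remains: the rest was replaced
```

A full PUT therefore has to resend every field that should be kept, while a partial update only needs the fields that change.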
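An all-round data product
Backed by data structures such as the inverted index and columnar storage, ES offers very flexible search and analysis capabilities.
It supports interactive analysis: even over trillions of log entries, ES search response times stay at the level of seconds.
ES has a complete logging solution (ELK) that takes data from collection to display within seconds.
Elasticsearch is a distributed, highly scalable, near-real-time search and data analytics engine. It makes it easy to search, analyze, and explore large volumes of data, and its horizontal scalability makes that data more valuable in production.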
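Advantages
High availability and high scalability
Scaling out is simple. The distributed architecture makes it easy to scale resources horizontally or vertically to match the hardware needs of different data volumes and query workloads. A cluster can range from hundreds to tens of thousands of machines serving fast search over PB-scale data, down to a single node serving a small company.
Fast queries and good performance
ES uses Lucene as its underlying search engine and adds multiple optimizations on top to meet users' query needs. It can "replace" a traditional relational database in some roles, and it also suits complex data analysis and near-real-time processing of massive data sets.
Powerful search that closely matches user intent
High relevance: ES has a built-in scoring mechanism that ranks documents by relevance using signals such as term frequency, so more relevant documents rank higher. It also offers fuzzy, prefix, wildcard, and other query types to help users search quickly and efficiently.
Feature-rich yet simple to use: works out of the box, and performance tuning is relatively straightforward
Rich ecosystem, active community, and integrations with many tools
To process logs and ship them to Elasticsearch you can use a logging tool such as Logstash (www.elastic.co/products/logstash); to search and visualize those logs you can use Kibana (www.elastic.co/products/kibana). Together these form the well-known ELK stack. Most mainstream big-data frameworks also support ES; Flink and ES, for example, pair well.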
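The relevance scoring mentioned under the search advantages can be illustrated with a textbook BM25 function, the family of scoring Elasticsearch uses by default. This is a minimal sketch under simplifying assumptions: whitespace tokenization, the common defaults `k1=1.2` and `b=0.75`, and no claim to match Lucene's implementation details. The sample documents are invented.

```python
import math


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each document against the query with textbook BM25."""
    if not docs:
        return []
    tokenized = [d.lower().split() for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs
    scores = [0.0] * n_docs
    for term in query.lower().split():
        # Document frequency: how many documents contain the term at all.
        df = sum(1 for tokens in tokenized if term in tokens)
        if df == 0:
            continue
        idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
        for i, tokens in enumerate(tokenized):
            tf = tokens.count(term)
            # Length normalization: long documents need more occurrences to score the same.
            norm = tf + k1 * (1 - b + b * len(tokens) / avg_len)
            scores[i] += idf * tf * (k1 + 1) / norm
    return scores


docs = [
    "could this be love",
    "love love love is all",
    "woke up this morning",
]
scores = bm25_scores("love", docs)
ranking = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
print(ranking)  # document indexes, most relevant first
```

Documents that mention the query term more often score higher, with diminishing returns and a penalty for length, and documents without the term score zero.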
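Use cases
Real-time log analysis
The most widespread use of ES, with support for log analysis across the whole stack
ES has a complete logging solution (ELK) that takes data from collection to display within seconds.
Search services
Full-text indexing
Product indexing
Time-series analysis
Time-series data is characterized by very high write throughput; ES handles that load and also provides a rich set of multi-dimensional statistical aggregations
A typical scenario is the analysis of monitoring data
IoT scenarios also produce large volumes of time-series data
Data analysis
Data monitoring
Query services
Back-end storage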
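For the time-series use case, the kind of statistic typically computed is a per-interval aggregate, similar in spirit to a date histogram with an average sub-aggregation. The sketch below shows only the bucketing idea in plain Python; `histogram_avg` and the sample monitoring points are illustrative assumptions, not an Elasticsearch API.

```python
from collections import defaultdict


def histogram_avg(points, interval_s):
    """Group (timestamp_seconds, value) points into fixed buckets and average each."""
    buckets = defaultdict(list)
    for ts, value in points:
        bucket_start = ts - ts % interval_s   # floor the timestamp to its bucket
        buckets[bucket_start].append(value)
    return {start: sum(vals) / len(vals) for start, vals in sorted(buckets.items())}


# Hypothetical monitoring samples: (seconds since start, CPU percent).
samples = [(0, 10.0), (30, 20.0), (60, 40.0), (119, 60.0), (120, 5.0)]
per_minute = histogram_avg(samples, interval_s=60)
print(per_minute)  # one average per 60-second bucket
```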
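Shortcomings
ES is strong at querying, but practice shows that when it is used as a database there is a delay between writing a document and being able to search for it. Search is near-real-time: new documents become visible only after a refresh, which by default runs once per second.
ClickHouse is a better fit than Elasticsearch for deep aggregations over hundreds of millions of rows.
It is not suited to workloads with frequent updates or transactions.
Access control in ES is still incomplete.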
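The write-then-search delay listed among the shortcomings comes from the refresh cycle: newly indexed documents sit in a buffer and become searchable only after a refresh. A toy model of that behavior, with `ToyShard` as a hypothetical class rather than anything from an Elasticsearch client:

```python
class ToyShard:
    """Toy model of near-real-time search: writes are buffered until a refresh."""

    def __init__(self):
        self.buffer = []       # indexed, but not yet visible to search
        self.searchable = []   # visible to search

    def index(self, doc):
        self.buffer.append(doc)

    def refresh(self):
        # A refresh makes everything buffered so far visible to search.
        self.searchable.extend(self.buffer)
        self.buffer.clear()

    def search(self, field, value):
        return [doc for doc in self.searchable if doc.get(field) == value]


shard = ToyShard()
shard.index({"singer": "zp", "songName": "could this be love"})
before = shard.search("singer", "zp")   # nothing yet: the write is still buffered
shard.refresh()
after = shard.search("singer", "zp")    # the document is now visible
```

In a real cluster the refresh runs periodically, and a write can also request one explicitly (for example with the `refresh` request parameter) at some cost to indexing throughput.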
Versions
Elasticsearch 6.x allows only one type per index; 7.x deprecates types in the APIs, and 8.x removes them entirely.
SQL is supported from version 6.3 onward.
References
Ruan Yifeng: https://www.ruanyifeng.com/blog/2017/08/elasticsearch.html
Newer versions: https://blog.csdn.net/ZYC88888/article/details/91463253
Access
Web: port 9200
curl localhost:9200