
ElasticSearch Series, Part 1: Basic Introduction

Date: 2023-04-06 03:40:01 Linux

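ElasticSearch (ES for short) is a highly scalable, open-source full-text search and analytics engine that can store, search, and analyze massive amounts of data quickly and in near real time.

Use cases

- Product search in online stores.
- Log analysis systems (ELK).
- Workloads over massive data (tens of millions of records) that need fast investigation, analysis, and visualization of results.

Installing and running ES

Java environment

Installing Elastic requires Java 8. If Java is not installed on your machine, follow a Java installation guide first.

Installing ElasticSearch

Once Java is in place, install ElasticSearch with the commands below, or follow the official documentation.

    wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-5.5.1.zip
    unzip elasticsearch-5.5.1.zip
    cd elasticsearch-5.5.1/

From the extracted directory, start ElasticSearch:

    ./bin/elasticsearch

If startup fails, it is probably due to one of the following errors.

Error 1

    OpenJDK 64-Bit Server VM warning: If the number of processors is expected to increase from one, then you should configure the number of parallel GC threads appropriately using -XX:ParallelGCThreads=N

Open elasticsearch-5.5.1/config/jvm.options and append this line at the end:

    -XX:-AssumeMP

Error 2

    OpenJDK 64-Bit Server VM warning: INFO: os::commit_memory(0x0000000085330000, 2060255232, 0) failed; error='Cannot allocate memory' (errno=12)

First run:

    sysctl -w vm.max_map_count=262144

Then open elasticsearch-5.5.1/config/jvm.options and lower the heap size:

    -Xmx512m
    -Xms512m

Error 3

    [2019-06-27T15:01:43,165][WARN ][o.e.b.ElasticsearchUncaughtExceptionHandler] [] uncaught exception in thread [main]
    org.elasticsearch.bootstrap.StartupException: java.lang.RuntimeException: can not run elasticsearch as root

Cause: starting with elasticsearch version 5, running as the root user is not allowed for security reasons.

Fix: create a regular user, give that user ownership of the elasticsearch installation directory, and run elasticsearch as that user.

    useradd elk
    chown -R elk:elk /usr/local/share/applications/elasticsearch-5.5.1
    su - elk
    cd /usr/local/share/applications/elasticsearch-5.5.1

Then restart:

    ./bin/elasticsearch

If everything is fine, Elastic runs on the default port 9200. Open another terminal window and request that port; you will get a description of the node.

    $ curl 'localhost:9200'
    {
      "name": "cWyaT72",
      "cluster_name": "elasticsearch",
      "cluster_uuid": "A7akNm1SRw2Gm-BdSBkdaw",
      "version": {
        "number": "5.5.1",
        "build_hash": "19c13d0",
        "build_date": "2017-07-18T20:44:24.823Z",
        "build_snapshot": false,
        "lucene_version": "6.6.0"
      },
      "tagline": "You Know, for Search"
    }

Access configuration

By default Elastic only accepts local connections. To allow remote access, edit config/elasticsearch.yml, uncomment network.host, set it to 0.0.0.0, and restart Elastic.

    network.host: 0.0.0.0

Setting it to 0.0.0.0 lets anyone connect. Do not use this setting for a production service; bind to a specific IP instead.

Basic concepts

Node and Cluster

Elastic is essentially a distributed database that lets multiple servers work together, and each server can run multiple Elastic instances. A single Elastic instance is called a node. A group of nodes forms a cluster.

Check cluster health:

    curl -XGET 'http://localhost:9200/_cat/health?v'

List all nodes in the cluster:

    curl -XGET 'http://localhost:9200/_cat/nodes?v'

Index

Elastic indexes all fields and, after processing, writes them to an inverted index. Searches look the data up directly in that index. (An Index is similar to a database in a traditional relational database system: it is the place where related documents are stored.)

The top-level unit of data management in Elastic is therefore called an Index. It is the equivalent of a single database. The name of every Index (that is, database) must be lowercase.

The following command lists all Indexes on the current node.

    curl -XGET 'http://localhost:9200/_cat/indices?v'

Document

A single record in an Index is called a Document. Many Documents make up an Index. Documents are represented as JSON, for example:

    {
      "goods_name": "Air Conditioner",
      "category_name": "Home Appliances",
      "price": "3999.00"
    }

Documents in the same Index are not required to share the same structure (schema), but keeping them consistent helps search efficiency.

Type

Documents can be grouped. In the goods_list Index, for example, they could be grouped by category (home appliances, clothing) or by price (>1000, <1000). Such a group is called a Type: a virtual, logical grouping used to filter Documents.

List the Types in every Index:

    curl 'http://localhost:9200/_mapping?pretty=true'

Elastic 6.x allows only one Type per Index. Types are deprecated in 7.x and removed entirely in 8.x.

Index operations

Create an Index

To create a new Index, send a PUT request directly to the Elastic server. The example below creates an Index named goods_list.

    curl -XPUT 'http://localhost:9200/goods_list'

The server returns a JSON object whose acknowledged field indicates that the operation succeeded.

    {"acknowledged": true, "shards_acknowledged": true}

Delete an Index

    curl -XDELETE 'http://localhost:9200/goods_list'

    {"acknowledged": true}

Data operations

The sections above covered the concepts of Index and Type and the basic Index operations. Next we create an Index with a complete structure and work with data in it.

Create the Index structure

    curl -XPUT 'localhost:9200/goods_list' -d '
    {
      "mappings": {
        "goods_info": {
          "properties": {
            "goods_name": {"type": "keyword"},
            "category_name": {"type": "keyword"},
            "price": {"type": "float"}
          }
        }
      }
    }'

    {"acknowledged": true}

Running the command above creates the new Index.

Add a record

To add a record to an Index, send a PUT request to /Index/Type/Id. For example, send a request under /goods_list/goods_info to add a product record.

    curl -XPUT 'localhost:9200/goods_list/goods_info/1' -d '
    {
      "goods_name": "Huawei Laptop",
      "category_name": "Computer",
      "price": "1000"
    }'

The JSON object returned by the server reports the Index, Type, Id, Version, and other information:

    {
      "_index": "goods_list",
      "_type": "goods_info",
      "_id": "1",
      "_version": 1,
      "result": "created",
      "_shards": {"total": 2, "successful": 1, "failed": 0},
      "created": true
    }

Notice the trailing 1 in /goods_list/goods_info/1: that 1 is the Id of the record. The Id does not have to be a number; any string works. You can also add a record without specifying an Id, in which case the request must be a POST.

    curl -XPOST 'localhost:9200/goods_list/goods_info' -d '
    {
      "goods_name": "Washing Machine",
      "category_name": "Home Appliances",
      "price": "899.99"
    }'

When no Id is specified, Elastic generates a random string to use as the Id.

    {
      "_index": "goods_list",
      "_type": "goods_info",
      "_id": "AWub5f7FFq1D5epJJhqT",
      "_version": 1,
      "result": "created",
      "_shards": {"total": 2, "successful": 1, "failed": 0},
      "created": true
    }

View a record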
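    curl 'localhost:9200/goods_list/goods_info/1?pretty=true'

The command above requests the record /goods_list/goods_info/1. The URL parameter pretty=true asks for the response in an easy-to-read format.

In the response, the found field indicates that the lookup succeeded, and the _source field holds the original record:

    {
      "_index": "goods_list",
      "_type": "goods_info",
      "_id": "1",
      "_version": 1,
      "found": true,
      "_source": {
        "goods_name": "Huawei Laptop",
        "category_name": "Computer",
        "price": "1000"
      }
    }

If the Id is wrong, no data is found and the found field is false.

    curl 'localhost:9200/goods_list/goods_info/2?pretty=true'

Id 2 does not exist, so the result is:

    {
      "_index": "goods_list",
      "_type": "goods_info",
      "_id": "2",
      "found": false
    }

Delete a record

    curl -XDELETE 'localhost:9200/goods_list/goods_info/1'

PS: do not actually delete this record yet; it is used again below.

Update a record

To update a record, send the data again with a PUT request.

    curl -XPUT 'localhost:9200/goods_list/goods_info/1' -d '
    {
      "user": "Huawei Laptop",
      "title": "Computer",
      "desc": "5000"
    }'

    {
      "_index": "goods_list",
      "_type": "goods_info",
      "_id": "1",
      "_version": 2,
      "result": "updated",
      "_shards": {"total": 2, "successful": 1, "failed": 0},
      "created": false
    }

Several fields in the response have changed:

    "_version": 2,
    "result": "updated",
    "created": false

Note that the PUT replaces the whole document, so record 1 now holds the fields user, title, and desc instead of the original ones.

Querying data

Return all records

    curl 'localhost:9200/goods_list/goods_info/_search'

    {
      "took": 127,
      "timed_out": false,
      "_shards": {"total": 5, "successful": 5, "failed": 0},
      "hits": {
        "total": 2,
        "max_score": 1,
        "hits": [
          {
            "_index": "goods_list",
            "_type": "goods_info",
            "_id": "AWub5f7FFq1D5epJJhqT",
            "_score": 1,
            "_source": {
              "goods_name": "Washing Machine",
              "category_name": "Home Appliances",
              "price": "899.99"
            }
          },
          {
            "_index": "goods_list",
            "_type": "goods_info",
            "_id": "1",
            "_score": 1,
            "_source": {
              "user": "Huawei Laptop",
              "title": "Computer",
              "desc": "5000"
            }
          }
        ]
      }
    }

In the response above, the took field is the time the operation took (in milliseconds), the timed_out field says whether the request timed out, and the hits field holds the matching records. Its subfields mean the following:

- total: the number of matching records, 2 in this example.
- max_score: the highest relevance score, 1.0 in this example.
- hits: the array of returned records.

Each returned record has a _score field that expresses how well it matches. By default, records are sorted by this field in descending order.

Summary

This part covered installing Elastic, its basic concepts, and basic data operations. The next part covers Elastic tokenization, full-text search, and related techniques.

Reference link: https://www.elastic.co/guide/...

Original article: https://github.com/WilburXu/b...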
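
Appendix: the same requests from Python

The curl requests in this article follow a simple pattern: PUT to /Index/Type/Id when the Id is known, POST to /Index/Type when Elastic should generate one. The sketch below builds such requests and picks apart a _search response. It is an illustration, not part of the original article. The function names build_index_request and summarize_search are hypothetical, the base URL assumes a local node on the default port, and the response shape assumed is the 5.x one shown in this article, where hits.total is a plain number.

```python
import json
from urllib.request import Request

# Assumption: a local node listening on the default port 9200.
BASE_URL = "http://localhost:9200"


def build_index_request(index, doc_type, doc, doc_id=None, base_url=BASE_URL):
    """Build (but do not send) the request that adds one document.

    Uses PUT /index/type/id when an Id is given, POST /index/type otherwise.
    """
    path = f"/{index}/{doc_type}"
    if doc_id is not None:
        path += f"/{doc_id}"
        method = "PUT"
    else:
        method = "POST"
    body = json.dumps(doc).encode("utf-8")
    return Request(
        base_url + path,
        data=body,
        method=method,
        headers={"Content-Type": "application/json"},
    )


def summarize_search(response):
    """Pull the fields discussed in this article out of a parsed _search response."""
    hits = response["hits"]
    return {
        "took_ms": response["took"],
        "timed_out": response["timed_out"],
        "total": hits["total"],  # a plain number in the 5.x response shape
        "ids": [hit["_id"] for hit in hits["hits"]],
    }
```

With a node running, a built request could be sent with urllib.request.urlopen(request) and the body parsed with json.loads. Nothing in the sketch touches the network by itself, so it can be tried without a server.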