Solr (3): Indexing MySQL data

 3 years ago
source link: https://wakzz.cn/2017/10/01/solr/(3)%E7%B4%A2%E5%BC%95mysql%E6%95%B0%E6%8D%AE/

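1. Modify the configuration

Copy the database driver jar, together with solr-dataimporthandler-x.x.x.jar from the solr/dist directory, into solr-x.x.x/server/solr-webapp/webapp/WEB-INF/lib.
In solr/server/solr/{core}/conf (where {core} is the name of your core), create the file data-config.xml with the following content (example):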

<dataConfig>
<!-- If the url contains XML special characters such as "&", they must be written as escaped entities (for example &amp;) -->
<dataSource driver="com.mysql.jdbc.Driver" url="jdbc:mysql://192.168.100.25:6660/tbcms" user="root" password="123456"/>
<document>
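<!-- query is the SQL used for a full import -->
<entity name="T_TBPackage" pk="TBPackageID" query="select * from T_TBPackage">
<!-- Each field maps a database column to a document field: column is the database column, name is the Solr field (which must already be defined in managed-schema) -->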
<field column="TBPackageID" name="id"/>
<field column="TBPackageName" name="TBPackageName"/>
<field column="PackageTypeID" name="PackageTypeID"/>
<field column="PINCount" name="PINCount"/>
<field column="PINCenterDistance" name="PINCenterDistance"/>
<field column="ElementBodyWidth" name="ElementBodyWidth"/>
<field column="ElementPlasticBodyLength" name="ElementPlasticBodyLength"/>
<field column="Height" name="Height"/>
<field column="ExposedPad" name="ExposedPad"/>
</entity>
</document>
</dataConfig>

Edit solrconfig.xml and add the following:

<requestHandler name="/dataimport" class="solr.DataImportHandler">
<lst name="defaults">
<str name="config">data-config.xml</str>
</lst>
</requestHandler>

Edit managed-schema and add the MySQL columns that should be stored in Solr (example):

<field name="TBPackageName" type="string" indexed="true" stored="true"/>
<field name="PackageTypeID" type="string" indexed="true" stored="true"/>
<field name="PINCount" type="string" indexed="true" stored="true"/>
<field name="PINCenterDistance" type="string" indexed="true" stored="true"/>
<field name="ElementBodyWidth" type="string" indexed="true" stored="true"/>
<field name="ElementPlasticBodyLength" type="string" indexed="true" stored="true"/>
<field name="Height" type="string" indexed="true" stored="true"/>
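Because every `name` used in data-config.xml has to exist in managed-schema, it can help to compare the two files before restarting. The following is a minimal sketch, not part of the original setup: the function names are made up for illustration, the XML strings are shortened stand-ins for the real files, and the check only looks at explicit `<field>` declarations (it ignores `dynamicField` rules).

```python
import xml.etree.ElementTree as ET


def mapped_solr_fields(data_config_xml):
    """Return the Solr field names (the `name` attributes) mapped in data-config.xml."""
    root = ET.fromstring(data_config_xml)
    return [field.get("name") for field in root.iter("field")]


def declared_schema_fields(schema_xml):
    """Return the set of field names explicitly declared in managed-schema."""
    root = ET.fromstring(schema_xml)
    return {field.get("name") for field in root.iter("field")}


def undeclared_fields(data_config_xml, schema_xml):
    """List mapped Solr fields that have no explicit <field> in the schema."""
    declared = declared_schema_fields(schema_xml)
    return [name for name in mapped_solr_fields(data_config_xml) if name not in declared]


# Shortened stand-ins for the two files (assumption: `id` comes from the default schema).
data_config = """
<dataConfig>
  <document>
    <entity name="T_TBPackage" pk="TBPackageID" query="select * from T_TBPackage">
      <field column="TBPackageID" name="id"/>
      <field column="Height" name="Height"/>
      <field column="ExposedPad" name="ExposedPad"/>
    </entity>
  </document>
</dataConfig>
"""
schema = """
<schema name="example" version="1.6">
  <field name="id" type="string" indexed="true" stored="true"/>
  <field name="Height" type="string" indexed="true" stored="true"/>
</schema>
"""
print(undeclared_fields(data_config, schema))  # ['ExposedPad']
```

With the snippets exactly as listed in this article, the same check would flag ExposedPad, which is mapped in data-config.xml but does not appear in the managed-schema example.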

Restart Solr.

2. Full import

2.1 Run a full import

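A full import can also be started without the admin UI, by calling the /dataimport handler configured in solrconfig.xml over HTTP. The following is a minimal sketch under stated assumptions: the base URL http://localhost:8983/solr and the core name mycore are placeholders rather than values from this article, and the helper names are made up for illustration. `command`, `clean` and `commit` are request parameters of the DataImportHandler.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def dataimport_url(solr_base, core, command, clean):
    """Build the URL for a DataImportHandler request."""
    params = {
        "command": command,                      # e.g. "full-import" or "status"
        "clean": "true" if clean else "false",   # true = delete the existing index first
        "commit": "true",                        # commit once the import finishes
        "wt": "json",                            # ask for a JSON response
    }
    return f"{solr_base}/{core}/dataimport?{urlencode(params)}"


def run_dataimport(url):
    """Send the request and return Solr's JSON response as a dict."""
    with urlopen(url) as resp:
        return json.load(resp)
```

Calling `run_dataimport(dataimport_url("http://localhost:8983/solr", "mycore", "full-import", clean=True))` starts the import; the same helper with `command="status"` can be used to check on its progress.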

2.2 Import succeeded


3. Incremental import

3.1 Solr uses UTC by default, which is 8 hours behind China Standard Time, so edit the configuration file bin/solr.in.sh:

SOLR_TIMEZONE="UTC+8"

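3.2 Alter the MySQL table to add a timestamp column that is automatically set to the modification time whenever a row is updated; Solr's incremental import relies on this column (example):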

last_modified timestamp not null on update current_timestamp default current_timestamp

3.3 Edit data-config.xml under solr/server/solr/{core}/conf and add the incremental SQL (example):

<dataConfig>
<!-- If the url contains XML special characters such as "&", they must be written as escaped entities (for example &amp;) -->
<dataSource driver="com.mysql.jdbc.Driver" url="jdbc:mysql://192.168.100.25:6660/tbcms" user="root" password="123456"/>
<document>
<!-- deltaQuery selects the primary keys of the rows that changed since the last import -->
<entity name="T_TBPackage" pk="TBPackageID" query="select * from T_TBPackage"
deltaQuery="select TBPackageID from T_TBPackage where last_modified > '${dih.last_index_time}'">
<!-- Each field maps a database column to a document field: column is the database column, name is the Solr field (which must already be defined in managed-schema) -->
<field column="TBPackageID" name="id"/>
<field column="TBPackageName" name="TBPackageName"/>
<field column="PackageTypeID" name="PackageTypeID"/>
<field column="PINCount" name="PINCount"/>
<field column="PINCenterDistance" name="PINCenterDistance"/>
<field column="ElementBodyWidth" name="ElementBodyWidth"/>
<field column="ElementPlasticBodyLength" name="ElementPlasticBodyLength"/>
<field column="Height" name="Height"/>
<field column="ExposedPad" name="ExposedPad"/>
</entity>
</document>
</dataConfig>

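3.4 Restart Solr.
3.5 Run the incremental import. The clean option must be unchecked here; otherwise, once the incremental import succeeds, every document that was not part of it is deleted!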
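When the incremental import is triggered over HTTP instead of through the admin UI, unchecking clean corresponds to passing `clean=false`. The sketch below only builds the request URL; the base URL and core name are placeholders, and the function name is hypothetical.

```python
from urllib.parse import urlencode


def delta_import_url(solr_base, core):
    """Build a delta-import URL that leaves documents outside the delta untouched."""
    params = {
        "command": "delta-import",
        "clean": "false",   # never "true" here: it would wipe documents outside the delta
        "commit": "true",   # commit once the import finishes
    }
    return f"{solr_base}/{core}/dataimport?{urlencode(params)}"


# Placeholder host and core name; adjust to your installation.
print(delta_import_url("http://localhost:8983/solr", "mycore"))
```

The resulting URL can be requested from a scheduled job so that changed rows are picked up periodically.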
3.6 Import succeeded
Note: the time of the last import is stored in solr/server/solr/{core}/conf/dataimport.properties.
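To check which timestamp the next deltaQuery will compare against, the file can be read directly. The sketch below assumes the file has the usual Java properties layout, with colons escaped as `\:` and keys such as `last_index_time` and `T_TBPackage.last_index_time`; the sample content is made up and not taken from a real run, and the function name is hypothetical.

```python
from datetime import datetime


def read_last_index_times(text):
    """Parse the text of dataimport.properties into {key: datetime}."""
    times = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "!")):
            continue  # skip blank lines and comments
        key, _, value = line.partition("=")
        # Java properties files escape ':' as '\:'; undo that before parsing
        value = value.strip().replace("\\:", ":")
        times[key.strip()] = datetime.strptime(value, "%Y-%m-%d %H:%M:%S")
    return times


# Made-up sample content in the assumed layout.
sample = """#Sun Oct 01 12:00:00 UTC 2017
last_index_time=2017-10-01 12\\:00\\:00
T_TBPackage.last_index_time=2017-10-01 12\\:00\\:00
"""
print(read_last_index_times(sample)["last_index_time"])  # 2017-10-01 12:00:00
```

If an incremental import unexpectedly finds no rows, comparing this value with the `last_modified` values in MySQL is a quick way to spot a time zone mismatch.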

