
Real-time sync from MySQL to Elasticsearch with go-mysql-elasticsearch

 4 years ago
source link: https://www.tuicool.com/articles/vuUjUfU

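Notes:

  1. The MySQL binlog must use the ROW format, and binlog_row_image must be set to full.
  2. Every table to be synced must have a primary key; tables without one are ignored.
  3. Changing a table's schema while the program is running is not supported.
  4. The account used to connect to MySQL needs the RELOAD, REPLICATION SLAVE and SUPER privileges:
GRANT REPLICATION SLAVE ON *.* TO 'elastic'@'127.0.0.1';
GRANT RELOAD ON *.* TO 'elastic'@'127.0.0.1';
UPDATE mysql.user SET Super_Priv='Y' WHERE user='elastic' AND host='127.0.0.1';
-- Direct edits to mysql.user only take effect after the grant tables are reloaded.
FLUSH PRIVILEGES;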
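
The binlog requirements in note 1 are easy to get wrong, so it can help to check them before starting the sync. The following is a minimal sketch, not part of go-mysql-elasticsearch: the helper name check_binlog_settings is hypothetical, and it assumes you have already fetched the output of SHOW VARIABLES LIKE 'binlog%' into a name-to-value dictionary with whatever MySQL client you use.

```python
# Hypothetical helper: validate the binlog settings listed in the notes.
# `variables` is assumed to hold rows from SHOW VARIABLES LIKE 'binlog%'
# as a name -> value mapping; fetching them is left to your MySQL client.

REQUIRED = {
    "binlog_format": "ROW",
    "binlog_row_image": "FULL",
}


def check_binlog_settings(variables):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    for name, expected in REQUIRED.items():
        actual = variables.get(name)
        if actual is None:
            problems.append(f"{name} is not set (expected {expected})")
        elif actual.upper() != expected:
            problems.append(f"{name} is {actual} (expected {expected})")
    return problems


if __name__ == "__main__":
    sample = {"binlog_format": "MIXED", "binlog_row_image": "FULL"}
    for problem in check_binlog_settings(sample):
        print(problem)
```

The comparison is case-insensitive because MySQL may report the values in either case.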

1. Install Go

Download: https://studygolang.com/dl

2. Download go-mysql-elasticsearch

Download: https://github.com/siddontang/go-mysql-elasticsearch

Or clone it with git, build it, and open the configuration file:

git clone https://github.com/siddontang/go-mysql-elasticsearch
cd go-mysql-elasticsearch
make
vi etc/river.toml

Set the following:

# MySQL address, user and password
# user must have replication privilege in MySQL.
my_addr = "127.0.0.1:3306"
my_user = "root"
my_pass = "root"
my_charset = "utf8"

# Set true when elasticsearch use https
#es_https = false
# Elasticsearch address
es_addr = "127.0.0.1:9200"
# Elasticsearch user and password, maybe set by shield, nginx, or x-pack
es_user = ""
es_pass = ""

# Path to store data such as master.info.
# Set it so that syncing can resume from the last saved position after a restart.
# TODO: support other storage, like etcd.
data_dir = "./var"

# Inner Http status address
stat_addr = "127.0.0.1:12800"

# pseudo server id like a slave
server_id = 1001

# mysql or mariadb
flavor = "mysql"

# mysqldump execution path
# if not set or empty, ignore mysqldump.
#mysqldump = "mysqldump"
mysqldump='/usr/local/mysql-8.0.16-macos10.14-x86_64/bin/mysqldump'

# if we have no privilege to use mysqldump with --master-data,
# we must skip it.
#skip_master_data = false

# minimal items to be inserted in one bulk
bulk_size = 128

# force flush the pending requests if we don't have enough items >= bulk_size
flush_bulk_time = "200ms"

# Ignore table without primary key
skip_no_pk_table = false

# MySQL data source
[[source]]
schema = "mmmdb"

# Only below tables will be synced into Elasticsearch.
# "t_[0-9]{4}" is a wildcard table format, you can use it if you have many sub tables, like table_0000 - table_1023
# I don't think it is necessary to sync all tables in a database.
#tables = ["t", "t_[0-9]{4}", "tfield", "tfilter"]
tables=["user"]

[[rule]]
schema = "mmmdb"
table = "user"
index = "mmmdb"
type = "user"

[rule.field]
# Map a MySQL column (left) to an Elasticsearch field name (right)
name="name"
age="age"
sex="sex"
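
In the configuration above, the table user appears both under [[source]] and under [[rule]]. A quick consistency check can catch a rule that refers to a table no source lists. The sketch below is an illustration only: the function unmatched_rules and the embedded SAMPLE config (including the orders table) are hypothetical, and treating each tables entry as a full-match regular expression is an assumption that only approximates how the tool resolves wildcard tables. It uses tomllib, which requires Python 3.11 or later (or the third-party tomli package on older versions).

```python
import re

try:
    import tomllib  # standard library in Python 3.11+
except ModuleNotFoundError:
    import tomli as tomllib  # assumption: third-party backport is installed

# Hypothetical config fragment; "orders" is deliberately missing from tables.
SAMPLE = '''
[[source]]
schema = "mmmdb"
tables = ["user", "t_[0-9]{4}"]

[[rule]]
schema = "mmmdb"
table = "user"
index = "mmmdb"
type = "user"

[[rule]]
schema = "mmmdb"
table = "orders"
index = "mmmdb"
type = "orders"
'''


def unmatched_rules(config):
    """Return (schema, table) pairs from [[rule]] that no [[source]] covers."""
    missing = []
    for rule in config.get("rule", []):
        covered = any(
            source.get("schema") == rule["schema"]
            and any(
                re.fullmatch(pattern, rule["table"])
                for pattern in source.get("tables", [])
            )
            for source in config.get("source", [])
        )
        if not covered:
            missing.append((rule["schema"], rule["table"]))
    return missing


if __name__ == "__main__":
    print(unmatched_rules(tomllib.loads(SAMPLE)))
```

To check a real file instead of the sample, open it in binary mode and pass the file object to tomllib.load.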

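Once the configuration is done, run:

./bin/go-mysql-elasticsearch -config=./etc/river.toml

Now try modifying data in the MySQL database; the changes are synced to Elasticsearch in real time.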
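
One way to confirm that rows are arriving is to ask Elasticsearch how many documents the index holds before and after a change in MySQL. The following is a rough sketch under stated assumptions: the URL is built from es_addr and the index name mmmdb in the sample configuration, no authentication is configured, and the helper names total_hits and fetch_total are hypothetical.

```python
import json
import urllib.request

# Assumed from the sample configuration: es_addr plus the rule's index name.
ES_SEARCH_URL = "http://127.0.0.1:9200/mmmdb/_search"


def total_hits(response):
    """Extract the hit count from a parsed _search response.

    Elasticsearch 7+ reports hits.total as an object with a "value" key;
    earlier versions report a plain integer.
    """
    total = response["hits"]["total"]
    return total["value"] if isinstance(total, dict) else total


def fetch_total(url=ES_SEARCH_URL):
    """Query Elasticsearch and return the number of matching documents."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return total_hits(json.load(resp))


if __name__ == "__main__":
    # Parsed offline here; call fetch_total() against a running cluster.
    print(total_hits({"hits": {"total": {"value": 3, "relation": "eq"}}}))
```

Calling fetch_total() before and after inserting a row into the user table should show the count increase if the sync is working.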

