
DBLE Sharding in Practice

source link: https://blog.51cto.com/lee90/2434496


Environment: DBLE 2.19.03.0

OS: CentOS Linux release 7.6.1810 (Core)

IP: 192.168.20.10/24

MySQL: Community Edition 5.7.26

Create two accounts on the backend MySQL and grant privileges:

create user 'rw'@'%' identified by 'rw123456';

create user 'rd'@'%' identified by 'rd123456';

GRANT SELECT, INSERT, UPDATE, DELETE, CREATE, REFERENCES, CREATE TEMPORARY TABLES, INDEX ON *.* TO 'rw'@'%';

GRANT SELECT ON *.* TO 'rd'@'%';

Connection commands:

mysql -urw -prw123456 --port 8066 -h 192.168.20.10 testdb 

mysql -urd -prd123456 --port 8066 -h 192.168.20.10 testdb 

DDL-only account:

mysql -uop -p123456 --port 8066 -h 192.168.20.10 testdb 

Management account:

mysql -uman1 -p654321 --port 9066 -h 192.168.20.10 

Extract DBLE:

tar xf dble-2.19.03.tar.gz -C /usr/local/

cd /usr/local

ln -s dble-2.19.03 dble

cd dble/conf/

vim schema.xml   # after editing, it looks like this:

<?xml version="1.0"?>
<!DOCTYPE dble:schema SYSTEM "schema.dtd">
<dble:schema xmlns:dble="http://dble.cloud/" version="2.19.03.0">

    <schema name="testdb">
        <!-- global table -->
        <table name="company" primaryKey="id" type="global" dataNode="dn1,dn2,dn3"/>

        <!-- range sharding -->
        <table name="travelrecord" primaryKey="id" dataNode="dn1,dn2,dn3" rule="sharding-by-range_t"/>

        <!-- hash mod 3 sharding -->
        <table name="hotnews" primaryKey="id" dataNode="dn1,dn2,dn3" rule="id-sharding-by-mod3"/>

        <!-- string hash (hashStringmod3) sharding -->
        <table name="user_auth" primaryKey="open_id" dataNode="dn1,dn2,dn3" rule="user-auth-sharding-by-open_id" />

        <!-- ER sharding -->
        <table name="order1" dataNode="dn1,dn2,dn3" rule="id-sharding-by-mod3"> 
            <childTable name="order_detail" primaryKey="id" joinKey="order_id" parentKey="id" /> 
        </table>
    </schema>
    
    <dataNode name="dn1" dataHost="192.168.20.10" database="db1"/>
    <dataNode name="dn2" dataHost="192.168.20.10" database="db2"/>
    <dataNode name="dn3" dataHost="192.168.20.10" database="db3"/>
    
    <dataHost name="192.168.20.10" maxCon="500" minCon="10" balance="0" switchType="-1" slaveThreshold="100">
        <heartbeat>select user()</heartbeat>
        <writeHost host="hostM" url="192.168.20.10:3306" user="rw" password="rw123456">
            <readHost host="hostS" url="192.168.20.10:3306" user="rd" password="rd123456"/>
        </writeHost>
    </dataHost>
    
</dble:schema>

vim rule.xml   # the modified parts are as follows:

    <tableRule name="sharding-by-range_t">
        <rule>
            <columns>id</columns>
            <algorithm>rangeLong2</algorithm>
        </rule>
    </tableRule>

    <tableRule name="id-sharding-by-mod3">
        <rule>
            <columns>id</columns>
            <algorithm>hashmod3</algorithm>
        </rule>
    </tableRule>

    <tableRule name="user-auth-sharding-by-open_id">
        <rule>                                                                                                                      
            <columns>open_id</columns>
            <algorithm>hashStringmod3</algorithm>
        </rule>
    </tableRule>

    <function name="rangeLong2" class="NumberRange">
        <property name="mapFile">autopartition-long_t.txt</property>
        <property name="defaultNode">0</property><!-- rows that match no range go to the first shard -->
    </function>
    
    <function name="hashmod3" class="Hash">
        <property name="partitionCount">3</property>
        <property name="partitionLength">1</property>
    </function>
    
    <function name="hashStringmod3" class="StringHash">
        <property name="partitionCount">3</property>
        <property name="partitionLength">1</property>
        <property name="hashSlice">0:20</property>  <!-- hash the first 20 characters, then take the modulo to decide which shard the row lands on -->
    </function>

[root@centos7 /usr/local/dble/conf ]#  vim autopartition-long_t.txt  # add a routing map file

# range start-end, data node index
# K=1000, M=10000.
# ranges are left-open, right-closed: (start, end]
0-1M=0
1M-2M=1
2M-3M=2
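As a sanity check, the routing implied by this map file (with M=10000 and left-open/right-closed ranges, per the comments above) can be sketched in shell. `node_for_id` is a hypothetical helper for illustration, not part of DBLE:

```shell
# Sketch of the NumberRange routing above: K=1000, M=10000,
# ranges are (start, end], and defaultNode=0 catches everything else.
node_for_id() {
  id=$1
  if   [ "$id" -le 0 ];     then echo 0   # defaultNode=0
  elif [ "$id" -le 10000 ]; then echo 0   # 0-1M=0
  elif [ "$id" -le 20000 ]; then echo 1   # 1M-2M=1
  elif [ "$id" -le 30000 ]; then echo 2   # 2M-3M=2
  else echo 0                             # defaultNode=0
  fi
}

node_for_id 10      # prints 0 -> dn1
node_for_id 13000   # prints 1 -> dn2
node_for_id 29800   # prints 2 -> dn3
```

These three ids are exactly the ones inserted into travelrecord later in this article, so each should land on a different shard.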

vim server.xml   # modify the user section as follows:
    <user name="man1">
        <property name="password">654321</property>
        <property name="manager">true</property>
        <!-- manager user can't set schema-->
    </user>
    <user name="op">
        <property name="password">123456</property>
        <property name="schemas">testdb</property>
    </user>
        <!-- table's DML privileges  INSERT/UPDATE/SELECT/DELETE -->
<!--
        <privileges check="true">
            <schema name="testdb" dml="0110" >
                <table name="employee" dml="1111"></table>
            </schema>
        </privileges>
-->
    <user name="rw">
        <property name="password">rw123456</property>
        <property name="schemas">testdb</property>
    </user>
    <user name="rd">
        <property name="password">rd123456</property>
        <property name="schemas">testdb</property>
        <property name="readOnly">true</property>
    </user>

Then reload DBLE (run `reload @@config_all;` on the 9066 management port) and start testing.

Using the DDL account:

    mysql -uop -p123456 --port 8066 -h 192.168.20.10 testdb 

Create the tables defined above and insert some test data:

## Test range sharding
(testdb) > create table travelrecord (
id bigint not null primary key,
user_id varchar(100),
traveldate DATE, 
fee decimal(10,2),
days int
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

(testdb) > insert into travelrecord (id,user_id,traveldate,fee,days) values(10,'wang','2014-01-05',510,3);
(testdb) > insert into travelrecord (id,user_id,traveldate,fee,days) values(13000,'lee','2011-01-05',26.5,3);
(testdb) > insert into travelrecord (id,user_id,traveldate,fee,days) values(29800,'zhang','2018-01-05',23.3,3);

(testdb) > select * from travelrecord ;
+-------+---------+------------+--------+------+
| id    | user_id | traveldate | fee    | days |
+-------+---------+------------+--------+------+
|    10 | wang    | 2014-01-05 | 510.00 |    3 |
| 13000 | lee     | 2011-01-05 |  26.50 |    3 |
| 29800 | zhang   | 2018-01-05 |  23.30 |    3 |
+-------+---------+------------+--------+------+
## Test the global table
(testdb) > create table company(id int not null primary key,name varchar(100)); 

(testdb) > insert into company(id,name) values(1,'hp');
(testdb) > insert into company(id,name) values(2,'ibm');
(testdb) > insert into company(id,name) values(3,'oracle');

(testdb) > select * from company ;
+----+--------+
| id | name   |
+----+--------+
|  1 | hp     |
|  2 | ibm    |
|  3 | oracle |
+----+--------+
3 rows in set (0.01 sec)

Check the backend databases and you will see the 3 rows on every one of the three shards, because company is defined as a global table.

(testdb) > explain insert into company(id,name) values(1,'hp');
+-----------+----------+---------------------------------------------+
| DATA_NODE | TYPE     | SQL/REF                    |
+-----------+----------+---------------------------------------------+
| dn1       | BASE SQL | insert into company(id,name) values(1,'hp') |
| dn2       | BASE SQL | insert into company(id,name) values(1,'hp') |
| dn3       | BASE SQL | insert into company(id,name) values(1,'hp') |
+-----------+----------+---------------------------------------------+
3 rows in set (0.00 sec)

Running `explain select * from company;` also shows the select being dispatched to a random one of the 3 nodes.
## Test hash-mod sharding
create table hotnews (id bigint unsigned not null primary key ,title varchar(400) ,created_time datetime) ENGINE=InnoDB DEFAULT CHARSET=utf8;

Next, a small script to batch-insert some rows and observe the distribution:

for i in {1..1000}; do 
  mysql -uop -p123456 --port 8066 -h 192.168.20.10 testdb  -e "insert into hotnews(id,title,created_time) values($i,'one',now());"
done
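The loop above opens a new connection for every row. A faster variant generates all the statements first and pipes them through a single connection; `gen_inserts` is an illustrative helper, not part of the original script:

```shell
# Hypothetical helper: emit N INSERT statements so they can be piped
# through one mysql connection instead of one connection per row.
gen_inserts() {
  for i in $(seq 1 "$1"); do
    echo "insert into hotnews(id,title,created_time) values($i,'one',now());"
  done
}

# Usage (one connection for all 1000 rows):
#   gen_inserts 1000 | mysql -uop -p123456 --port 8066 -h 192.168.20.10 testdb
```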

Then check the row counts on the 3 backend shards; the distribution is fairly even:
(db1) > select count(*)  from db1.hotnews;
+----------+
| count(*) |
+----------+
|      333 |
+----------+
1 row in set (0.00 sec)

(db1) > select count(*)  from db2.hotnews;
+----------+
| count(*) |
+----------+
|      334 |
+----------+
1 row in set (0.00 sec)

(db1) > select count(*)  from db3.hotnews;
+----------+
| count(*) |
+----------+
|      333 |
+----------+
1 row in set (0.00 sec)
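The even split follows from the rule: with partitionCount=3 and partitionLength=1, the hashmod3 function behaves like a plain `id % 3` for non-negative integer ids (an assumption of this sketch, consistent with the observed counts; `node_index` is a hypothetical helper, not part of DBLE):

```shell
# Assumed behavior of hashmod3 (partitionCount=3, partitionLength=1):
# the shard index is simply id % 3 for non-negative integer ids.
node_index() { echo $(( $1 % 3 )); }

# Distribution of ids 1..1000 under id % 3 -- compare with the
# 333 / 334 / 333 counts observed on db1 / db2 / db3 above.
seq 1 1000 | awk '{c[$1 % 3]++} END {for (k = 0; k < 3; k++) print "index", k ":", c[k]}'
```

Index 1 gets 334 rows because the range 1..1000 starts and ends on ids congruent to 1 mod 3.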
## Test string-hash-mod sharding
CREATE TABLE `user_auth` (
  `id` bigint unsigned NOT NULL AUTO_INCREMENT COMMENT 'primary key id',
  `open_id` varchar(100) NOT NULL DEFAULT '' COMMENT 'third-party auth id',
  `union_id` varchar(100) NOT NULL DEFAULT '' COMMENT 'union id of the auth',
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COMMENT='user auth info table' ;

#### Note: in a real production environment, primary key uniqueness must be guaranteed by the application (for example with a snowflake-style ID generator).
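To illustrate that note, here is a toy snowflake-style generator that packs a millisecond timestamp, a worker id, and a sequence into one 64-bit integer. This is a sketch only (the epoch, worker id, and field widths are arbitrary choices; a real generator must handle clock rollback and same-millisecond sequence overflow):

```shell
# Toy snowflake-style ID: 41 bits of ms-timestamp from a custom epoch,
# 10 bits of worker id, 12 bits of sequence. Illustration only.
EPOCH_MS=1546300800000     # 2019-01-01 00:00:00 UTC, arbitrary custom epoch
gen_id() {
  ts=$(( $(date +%s%3N) - EPOCH_MS ))
  worker=1                 # must be unique per application instance
  seq=0                    # real generators increment this within the same ms
  echo $(( (ts << 22) | (worker << 12) | seq ))
}

gen_id   # prints a 64-bit id; ids generated later compare greater
```

Because the timestamp occupies the high bits, ids stay roughly time-ordered, which keeps InnoDB primary-key inserts append-mostly.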

(testdb) > insert into user_auth (id,open_id,union_id) values(1,'331116828422393856','oy0IAj9mdPUr7bLMl879Jp37eV3Y');
(testdb) > insert into user_auth (id,open_id,union_id) values(2,'341170994247204864','oy0IA3Yj9mdPUr7bLMl879Jp37eV');
(testdb) > insert into user_auth (id,open_id,union_id) values(3,'330414325695332352','oy0IAj9mdPU3Yr7bLMl879Jp37eV');
(testdb) > insert into user_auth (id,open_id,union_id) values(4,'328588424011591680','oy0IAj9mdPUr7bLMl8Jp37e79V');
(testdb) > insert into user_auth (id,open_id,union_id) values(5,'330414325695332352','oy0IA3Yj9mdPUr7p37ebLMl879JV3Y');
(testdb) > insert into user_auth (id,open_id,union_id) values(6,'341172222247211111','oy0IAj9bLMl879Jp37eV3YmdPUr7');
(testdb) > insert into user_auth (id,open_id,union_id) values(7,'341173334247755464','Jp37eoy0IAj9mdPUr73YbLMl879V');

(testdb) > select id,open_id,union_id from user_auth order by id asc ;
+----+--------------------+--------------------------------+
| id | open_id            | union_id                       |
+----+--------------------+--------------------------------+
|  1 | 331116828422393856 | oy0IAj9mdPUr7bLMl879Jp37eV3Y   |
|  2 | 341170994247204864 | oy0IA3Yj9mdPUr7bLMl879Jp37eV   |
|  3 | 330414325695332352 | oy0IAj9mdPU3Yr7bLMl879Jp37eV   |
|  4 | 328588424011591680 | oy0IAj9mdPUr7bLMl8Jp37e79V     |
|  5 | 330414325695332352 | oy0IA3Yj9mdPUr7p37ebLMl879JV3Y |
|  6 | 341172222247211111 | oy0IAj9bLMl879Jp37eV3YmdPUr7   |
|  7 | 341173334247755464 | Jp37eoy0IAj9mdPUr73YbLMl879V   |
+----+--------------------+--------------------------------+
7 rows in set (0.00 sec)

(testdb) > explain select id,open_id,union_id from user_auth where open_id = '341173334247755464' ;
+-----------+----------+--------------------------------------------------------------------------------+
| DATA_NODE | TYPE     | SQL/REF                                                                        |
+-----------+----------+--------------------------------------------------------------------------------+
| dn2       | BASE SQL | select id,open_id,union_id from user_auth where open_id = '341173334247755464' |
+-----------+----------+--------------------------------------------------------------------------------+
1 row in set (0.00 sec)

(testdb) > explain select id,open_id,union_id from user_auth where open_id = '331116828422393856' ;
+-----------+----------+--------------------------------------------------------------------------------+
| DATA_NODE | TYPE     | SQL/REF                                                                        |
+-----------+----------+--------------------------------------------------------------------------------+
| dn1       | BASE SQL | select id,open_id,union_id from user_auth where open_id = '331116828422393856' |
+-----------+----------+--------------------------------------------------------------------------------+
1 row in set (0.00 sec)

(testdb) > explain select id,open_id,union_id from user_auth where open_id = '328588424011591680' ;
+-----------+----------+--------------------------------------------------------------------------------+
| DATA_NODE | TYPE     | SQL/REF                                                                        |
+-----------+----------+--------------------------------------------------------------------------------+
| dn3       | BASE SQL | select id,open_id,union_id from user_auth where open_id = '328588424011591680' |
+-----------+----------+--------------------------------------------------------------------------------+
1 row in set (0.00 sec)

############################################################################

Those are the most common sharding types. There is also date-based sharding, which is often useful for log-table scenarios.

Date-based sharding experiment:

First create the physical databases on the backend MySQL:

create database userdb1 ;
create database userdb2 ;
create database userdb3 ;
create database userdb4 ;
create database userdb5 ;
create database userdb6 ;
create database userdb7 ;
create database userdb8 ;
create database userdb9 ;
create database userdb10 ;
create database userdb11 ;
create database userdb12 ;
create database userdb13 ;

The modified schema.xml looks like this:

<?xml version="1.0"?>
<!DOCTYPE dble:schema SYSTEM "schema.dtd">

<dble:schema xmlns:dble="http://dble.cloud/" version="2.19.03.0">

    <schema name="testdb">
        <!-- shard by month -->
        <table name="user" dataNode="user_dn$1-13" rule="sharding-by-month-user"/>
    </schema>
    
    <dataNode name="user_dn1" dataHost="192.168.20.10" database="userdb1"/>
    <dataNode name="user_dn2" dataHost="192.168.20.10" database="userdb2"/>
    <dataNode name="user_dn3" dataHost="192.168.20.10" database="userdb3"/>
    <dataNode name="user_dn4" dataHost="192.168.20.10" database="userdb4"/>
    <dataNode name="user_dn5" dataHost="192.168.20.10" database="userdb5"/>
    <dataNode name="user_dn6" dataHost="192.168.20.10" database="userdb6"/>
    <dataNode name="user_dn7" dataHost="192.168.20.10" database="userdb7"/>
    <dataNode name="user_dn8" dataHost="192.168.20.10" database="userdb8"/>
    <dataNode name="user_dn9" dataHost="192.168.20.10" database="userdb9"/>
    <dataNode name="user_dn10" dataHost="192.168.20.10" database="userdb10"/>
    <dataNode name="user_dn11" dataHost="192.168.20.10" database="userdb11"/>
    <dataNode name="user_dn12" dataHost="192.168.20.10" database="userdb12"/>
    <dataNode name="user_dn13" dataHost="192.168.20.10" database="userdb13"/>
    
    <dataHost name="192.168.20.10" maxCon="500" minCon="10" balance="0" switchType="-1" slaveThreshold="100">
        <heartbeat>select user()</heartbeat>
        <writeHost host="hostM" url="192.168.20.10:3306" user="rw" password="rw123456">
            <readHost host="hostS" url="192.168.20.10:3306" user="rd" password="rd123456"/>
        </writeHost>
    </dataHost>
    
</dble:schema>

Then add the rule to rule.xml:

<tableRule name="sharding-by-month-user">
    <rule>
        <columns>addData</columns>
        <algorithm>partbymonth-user</algorithm>
    </rule>
</tableRule>

<!-- month-based sharding rule; note that inserts fail once dates run past the configured shards -->
    <function name="partbymonth-user" class="Date">
        <property name="dateFormat">yyyy-MM-dd</property>
        <property name="sBeginDate">2018-01-01</property>
        <!-- <property name="sEndDate">2019-02-31</property> -->
        <property name="sPartionDay">30</property>  <!-- default is one shard per 10 days; changed here to one per 30 days. Note these are 30-day windows, not calendar months -->
        <property name="defaultNode">0</property>  <!-- data earlier than 2018-01-01 goes to the first shard (user_dn1) -->
    </function>
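The shard index for a given date is predictable: days elapsed since sBeginDate, divided by sPartionDay. A shell sketch (assuming GNU date; `shard_for_date` is a hypothetical helper, not part of DBLE):

```shell
# Predict the user_dn shard index for a date under the rule above:
# sBeginDate=2018-01-01, sPartionDay=30, defaultNode=0 for earlier dates.
shard_for_date() {
  begin=$(date -u -d '2018-01-01' +%s)
  t=$(date -u -d "$1" +%s)
  days=$(( (t - begin) / 86400 ))
  if [ "$days" -lt 0 ]; then
    echo 0               # defaultNode: earlier than sBeginDate
  else
    echo $(( days / 30 ))
  fi
}

shard_for_date 2018-04-01   # prints 3  -> user_dn4
shard_for_date 2019-01-01   # prints 12 -> user_dn13
shard_for_date 2015-01-01   # prints 0  -> user_dn1 (defaultNode)
```

With 13 dataNodes the last valid index is 12, so any date 390 or more days past 2018-01-01 (such as 2019-06-01) has no shard and the insert errors out, as shown below.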
(testdb) > create table if not exists user (addData date, dbname varchar(32),username varchar(32),province varchar(16),age int(3));

(testdb) > insert into user (addData,dbname,username,age) values ('2015-01-01',database(),'user1',12);

(testdb) > insert into user (addData,dbname,username,age) values ('2016-02-01',database(),'user1',12);

(testdb) > explain  insert into user (addData,dbname,username,age) values ('2017-03-01',database(),'user1',12);
+-----------+----------+--------------------------------------------------------------------------------------------------+
| DATA_NODE | TYPE     | SQL/REF                                                                                          |
+-----------+----------+--------------------------------------------------------------------------------------------------+
| user_dn1  | BASE SQL | INSERT INTO user (addData, dbname, username, age) VALUES ('2017-03-01', DATABASE(), 'user1', 12) |
+-----------+----------+--------------------------------------------------------------------------------------------------+
(testdb) > insert into user (addData,dbname,username,age) values ('2017-03-01',database(),'user1',12);


(testdb) > insert into user (addData,dbname,username,age) values ('2018-04-01',database(),'user1',12);
(testdb) > insert into user (addData,dbname,username,age) values ('2018-04-11',database(),'user1',12);
(testdb) > insert into user (addData,dbname,username,age) values ('2018-04-21',database(),'user1',12);
(testdb) > insert into user (addData,dbname,username,age) values ('2018-04-25',database(),'user1',12);
(testdb) > insert into user (addData,dbname,username,age) values ('2018-04-30',database(),'user1',12);
(testdb) > insert into user (addData,dbname,username,age) values ('2018-05-01',database(),'user1',12);
(testdb) > insert into user (addData,dbname,username,age) values ('2018-05-03',database(),'user1',12);
(testdb) > insert into user (addData,dbname,username,age) values ('2018-05-05',database(),'user1',12);
(testdb) > insert into user (addData,dbname,username,age) values ('2018-06-21',database(),'user1',12);
(testdb) > insert into user (addData,dbname,username,age) values ('2018-07-30',database(),'user1',12);
(testdb) > insert into user (addData,dbname,username,age) values ('2019-01-01',database(),'user1',12);

(testdb) > insert into user (addData,dbname,username,age) values ('2019-06-01',database(),'user1',12);
ERROR 1064 (HY000): can't find any valid data node :user -> ADDDATA -> 2019-06-01


Therefore, shards must be added ahead of time and shard availability must be monitored; otherwise writes will start failing.


(testdb) > select * from user order by addData asc ;
+------------+----------+----------+----------+------+
| addData    | dbname   | username | province | age  |
+------------+----------+----------+----------+------+
| 2015-01-01 | userdb1  | user1    | NULL     |   12 |
| 2016-02-01 | userdb1  | user1    | NULL     |   12 |
| 2017-03-01 | userdb1  | user1    | NULL     |   12 |
| 2018-04-01 | userdb4  | user1    | NULL     |   12 |
| 2018-04-11 | userdb4  | user1    | NULL     |   12 |
| 2018-04-21 | userdb4  | user1    | NULL     |   12 |
| 2018-04-25 | userdb4  | user1    | NULL     |   12 |
| 2018-04-30 | userdb4  | user1    | NULL     |   12 |
| 2018-05-01 | userdb5  | user1    | NULL     |   12 |
| 2018-05-03 | userdb5  | user1    | NULL     |   12 |
| 2018-05-05 | userdb5  | user1    | NULL     |   12 |
| 2018-06-21 | userdb6  | user1    | NULL     |   12 |
| 2018-07-30 | userdb8  | user1    | NULL     |   12 |
| 2019-01-01 | userdb13 | user1    | NULL     |   12 |
+------------+----------+----------+----------+------+
14 rows in set (0.02 sec)


Query tests:
(testdb) > explain select * from user where addData between '2018-04-01' and '2018-04-30' ;
+-----------+----------+------------------------------------------------------------------------+
| DATA_NODE | TYPE     | SQL/REF                                                                |
+-----------+----------+------------------------------------------------------------------------+
| user_dn4  | BASE SQL | select * from user where addData between '2018-04-01' and '2018-04-30' |
+-----------+----------+------------------------------------------------------------------------+
1 row in set (0.00 sec)


(testdb) > select * from user where addData between '2018-04-01' and '2018-04-30' ;
+------------+---------+----------+----------+------+
| addData    | dbname  | username | province | age  |
+------------+---------+----------+----------+------+
| 2018-04-01 | userdb4 | user1    | NULL     |   12 |
| 2018-04-11 | userdb4 | user1    | NULL     |   12 |
| 2018-04-21 | userdb4 | user1    | NULL     |   12 |
| 2018-04-25 | userdb4 | user1    | NULL     |   12 |
| 2018-04-30 | userdb4 | user1    | NULL     |   12 |
+------------+---------+----------+----------+------+
5 rows in set (0.01 sec)


(testdb) > explain select * from user where addData between '2018-04-01' and '2018-05-30' order by addData asc ;
+-----------------+---------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| DATA_NODE       | TYPE          | SQL/REF                                                                                                                                                                                 |
+-----------------+---------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| user_dn4_0      | BASE SQL      | select `user`.`addData`,`user`.`dbname`,`user`.`username`,`user`.`province`,`user`.`age` from  `user` where addData BETWEEN '2018-04-01' AND '2018-05-30' ORDER BY `user`.`addData` ASC |
| user_dn5_0      | BASE SQL      | select `user`.`addData`,`user`.`dbname`,`user`.`username`,`user`.`province`,`user`.`age` from  `user` where addData BETWEEN '2018-04-01' AND '2018-05-30' ORDER BY `user`.`addData` ASC |
| merge_1         | MERGE         | user_dn4_0; user_dn5_0                                                                                                                                                                  |
| shuffle_field_1 | SHUFFLE_FIELD | merge_1                                                                                                                                                                                 |
+-----------------+---------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
4 rows in set (0.00 sec)


(testdb) > select * from user where addData between '2018-04-01' and '2018-05-30' order by addData asc ;
+------------+---------+----------+----------+------+
| addData    | dbname  | username | province | age  |
+------------+---------+----------+----------+------+
| 2018-04-01 | userdb4 | user1    | NULL     |   12 |
| 2018-04-11 | userdb4 | user1    | NULL     |   12 |
| 2018-04-21 | userdb4 | user1    | NULL     |   12 |
| 2018-04-25 | userdb4 | user1    | NULL     |   12 |
| 2018-04-30 | userdb4 | user1    | NULL     |   12 |
| 2018-05-01 | userdb5 | user1    | NULL     |   12 |
| 2018-05-03 | userdb5 | user1    | NULL     |   12 |
| 2018-05-05 | userdb5 | user1    | NULL     |   12 |
+------------+---------+----------+----------+------+
8 rows in set (0.01 sec)

Monitoring available date shards (the same approach works for other sharding types):

A simple approach is to periodically run an EXPLAIN of a test INSERT; if it fails, raise an alert.

A simple script (warns 60 days in advance):

DAYS=$(date -d '60 days' +%F)
echo "$DAYS"

if mysql -urw -prw123456 --port 8066 -h 192.168.20.10 testdb -e "explain insert into user (addData,dbname,username,age) values ('$DAYS',database(),'user1',12);" 2>/dev/null; then
    echo "Enough shards available; status OK"
else
    echo "New shards need to be added"
fi

How to add new date shards:

1. Edit schema.xml to add the new shard configuration; afterwards it looks roughly like this:

<?xml version="1.0"?>
<!DOCTYPE dble:schema SYSTEM "schema.dtd">
<dble:schema xmlns:dble="http://dble.cloud/" version="2.19.03.0">

    <schema name="testdb">	
        <!-- 按月分片 -->
        <table name="user" dataNode="user_dn$1-23" rule="sharding-by-month-user"/>
    </schema>

    <dataNode name="user_dn1" dataHost="192.168.20.10" database="userdb1"/>
    <dataNode name="user_dn2" dataHost="192.168.20.10" database="userdb2"/>
    <dataNode name="user_dn3" dataHost="192.168.20.10" database="userdb3"/>
    <dataNode name="user_dn4" dataHost="192.168.20.10" database="userdb4"/>
    <dataNode name="user_dn5" dataHost="192.168.20.10" database="userdb5"/>
    <dataNode name="user_dn6" dataHost="192.168.20.10" database="userdb6"/>
    <dataNode name="user_dn7" dataHost="192.168.20.10" database="userdb7"/>
    <dataNode name="user_dn8" dataHost="192.168.20.10" database="userdb8"/>
    <dataNode name="user_dn9" dataHost="192.168.20.10" database="userdb9"/>
    <dataNode name="user_dn10" dataHost="192.168.20.10" database="userdb10"/>
    <dataNode name="user_dn11" dataHost="192.168.20.10" database="userdb11"/>
    <dataNode name="user_dn12" dataHost="192.168.20.10" database="userdb12"/>
    <dataNode name="user_dn13" dataHost="192.168.20.10" database="userdb13"/>
    <dataNode name="user_dn14" dataHost="192.168.20.10" database="userdb14"/>
    <dataNode name="user_dn15" dataHost="192.168.20.10" database="userdb15"/>
    <dataNode name="user_dn16" dataHost="192.168.20.10" database="userdb16"/>
    <dataNode name="user_dn17" dataHost="192.168.20.10" database="userdb17"/>
    <dataNode name="user_dn18" dataHost="192.168.20.10" database="userdb18"/>
    <dataNode name="user_dn19" dataHost="192.168.20.10" database="userdb19"/>
    <dataNode name="user_dn20" dataHost="192.168.20.10" database="userdb20"/>
    <dataNode name="user_dn21" dataHost="192.168.20.10" database="userdb21"/>
    <dataNode name="user_dn22" dataHost="192.168.20.10" database="userdb22"/>
    <dataNode name="user_dn23" dataHost="192.168.20.10" database="userdb23"/>

    <dataHost name="192.168.20.10" maxCon="500" minCon="10" balance="0" switchType="-1" slaveThreshold="100">
        <heartbeat>select user()</heartbeat>
        <writeHost host="hostM" url="192.168.20.10:3306" user="rw" password="rw123456">
            <readHost host="hostS" url="192.168.20.10:3306" user="rd" password="rd123456"/>
        </writeHost>
    </dataHost>
</dble:schema>


2. Reload the configuration (on the 9066 management port):
reload @@config_all ;


3. Create the corresponding physical databases on the backend:
create database userdb14;
..... other CREATE DATABASE statements omitted .....
create database userdb23;
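The statements above can also be generated in one line; `gen_create_dbs` is a hypothetical helper, and the mysql pipe shown is illustrative:

```shell
# Hypothetical helper: emit CREATE DATABASE statements for a range of
# userdb shards (here 14 through 23, matching the new dataNodes).
gen_create_dbs() {
  for i in $(seq "$1" "$2"); do echo "create database userdb$i;"; done
}

gen_create_dbs 14 23
# e.g. pipe into the backend MySQL:
#   gen_create_dbs 14 23 | mysql -uroot -p
```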

4. Re-issue the CREATE TABLE through DBLE so the table exists on the new shards:
create table if not exists user (addData date, dbname varchar(32),username varchar(32),province varchar(16),age int(3));


5. Test inserts:
(testdb) > explain insert into user (addData,dbname,username,age) values ('2019-11-01',database(),'user1',12);
+-----------+----------+--------------------------------------------------------------------------------------------------+
| DATA_NODE | TYPE     | SQL/REF                                                                                          |
+-----------+----------+--------------------------------------------------------------------------------------------------+
| user_dn23 | BASE SQL | INSERT INTO user (addData, dbname, username, age) VALUES ('2019-11-01', DATABASE(), 'user1', 12) |
+-----------+----------+--------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

(testdb) > explain insert into user (addData,dbname,username,age) values ('2019-12-01',database(),'user1',12);
ERROR 1064 (HY000): can't find any valid data node :user -> ADDDATA -> 2019-12-01

######################################################################################################

ER tables (multi-table JOINs are rare in internet-scale workloads, so ER sharding is not often used, but it is worth a basic understanding):

Much of the following is adapted from: https://blog.csdn.net/zhanglei_16/article/details/50779929

1. Introduction to ER sharding

Some businesses, such as an order table (ORDER) and an order detail table (ORDER_DETAIL), have a parent-child relationship: the detail table depends on the order table. Workloads like this can usually be abstracted into a suitable sharding rule, for example sharding by user ID with all related tables also keyed on user ID, or sharding by order ID. In short, some businesses can always be abstracted into parent-child table relationships, and these suit ER sharding: child rows are stored on the same data shard as the parent rows they reference, so joins never cross shards. Taking order and order_detail as the example, schema.xml defines the sharding so that order and order_detail split on order id and rows with the same order_id land on the same shard; on insert, the middleware (Mycat, and likewise DBLE) looks up the shard holding the parent order row and writes the order_detail row to that same shard.
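Under the configuration used below (order1 sharded by id mod 3), a child row's shard can be predicted from its order_id alone, since the parent row with that id lives at `order_id % 3`. A hypothetical sketch:

```shell
# ER sharding sketch: an order_detail row lands on the shard of its parent
# order1 row, i.e. (under the mod-3 rule below) shard index order_id % 3,
# which maps to data nodes dn1..dn3.
detail_shard() { echo "dn$(( $1 % 3 + 1 ))"; }

detail_shard 1   # dn2: the BJ0001 order (id=1) routes to dn2
detail_shard 3   # dn1: the SHH001 order (id=3) routes to dn1
detail_shard 4   # dn2: a detail row with order_id=4 follows order id=4 to dn2
```

This matches the EXPLAIN outputs in the walkthrough below, including the deliberately mismatched Wuhan detail row (id=6, order_id=4), which follows its parent to dn2.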

2. The parent table is sharded by its primary key ID, the child table's sharding column references the parent ID, and the pair is configured as ER sharded.

2.1 Add the following to schema.xml:

<!-- ER sharding -->
<table name="order1" dataNode="dn1,dn2,dn3" rule="id-sharding-by-mod3">
    <childTable name="order_detail" primaryKey="id" joinKey="order_id" parentKey="id" />
</table>

Define the sharding rule in rule.xml (same as earlier):

    <tableRule name="id-sharding-by-mod3">
        <rule>
            <columns>id</columns>
            <algorithm>hashmod3</algorithm>
        </rule>
    </tableRule>

    <!-- mod 3 -->
    <function name="hashmod3" class="Hash">
        <property name="partitionCount">3</property>
        <property name="partitionLength">1</property>
    </function>

Then reload DBLE.

2.2 Create the order1 and order_detail tables, which have a primary/foreign key relationship:

mysql> explain CREATE TABLE order1 (id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,sn VARCHAR(64),create_time DATETIME) ENGINE=InnoDB DEFAULT CHARSET=utf8;
+-----------+---------------------------------------------------------------------------------------------------------------+
| DATA_NODE | SQL                                                                                                           |
+-----------+---------------------------------------------------------------------------------------------------------------+
| dn1       | CREATE TABLE order1(id int unsigned NOT NULL AUTO_INCREMENT PRIMARY KEY,sn VARCHAR(64),create_time DATETIME) |
| dn2       | CREATE TABLE order1(id int unsigned NOT NULL AUTO_INCREMENT PRIMARY KEY,sn VARCHAR(64),create_time DATETIME) |
| dn3       | CREATE TABLE order1(id int unsigned NOT NULL AUTO_INCREMENT PRIMARY KEY,sn VARCHAR(64),create_time DATETIME) |
+-----------+---------------------------------------------------------------------------------------------------------------+
3 rows in set (0.02 sec)

mysql> CREATE TABLE order1(id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,sn VARCHAR(64),create_time DATETIME) ENGINE=InnoDB DEFAULT CHARSET=utf8;
Query OK, 0 rows affected (0.35 sec)

mysql> CREATE TABLE order_detail(id INT AUTO_INCREMENT PRIMARY KEY, order_id INT,ord_status CHAR(1),address VARCHAR(128),create_time DATETIME,CONSTRAINT FK_ORDid FOREIGN KEY (order_id) REFERENCES order1 (id)) ENGINE=InnoDB DEFAULT CHARSET=utf8;
Query OK, 0 rows affected (0.44 sec)

2.3 Insert data:

mysql> explain INSERT INTO order1(id,sn,create_time) VALUES(1,'BJ0001',NOW());
+-----------+----------------------------------------------------------------+
| DATA_NODE | SQL                                                            |
+-----------+----------------------------------------------------------------+
| dn2       | INSERT INTO order1(id,sn,create_time) VALUES(1,'BJ0001',NOW()) |
+-----------+----------------------------------------------------------------+
1 row in set (0.03 sec)

Insert the data group by group; the foreign key relationship matters here.

Group 1, a Beijing order:

mysql> INSERT INTO order1(id,sn,create_time) VALUES(1,'BJ0001',NOW());
Query OK, 1 row affected (0.05 sec)

mysql> INSERT INTO ORDER_DETAIL(id,order_id,ord_status,address,create_time) VALUES (1,1,'1','test data  of order1(id=1,BJ001) ',NOW());

Group 2, a Shanghai order:

mysql> explain INSERT INTO order1(id,sn,create_time) VALUES(3,'SHH001',NOW());
+-----------+----------------------------------------------------------------+
| DATA_NODE | SQL                                                            |
+-----------+----------------------------------------------------------------+
| dn1       | INSERT INTO order1(id,sn,create_time) VALUES(3,'SHH001',NOW()) |
+-----------+----------------------------------------------------------------+
1 row in set (0.02 sec)

mysql> INSERT INTO order1(id,sn,create_time) VALUES(3,'SHH001',NOW());
Query OK, 1 row affected (0.04 sec)

mysql> INSERT INTO ORDER_DETAIL(id,order_id,ord_status,address,create_time) VALUES (3,3,'1','test data of order1(id=3,SHH001)',NOW());
Query OK, 1 row affected (0.06 sec)

Group 3, a Guangzhou order:

mysql> explain INSERT INTO order1(id,sn,create_time) VALUES(4,'GZH004',NOW());
+-----------+----------------------------------------------------------------+
| DATA_NODE | SQL                                                            |
+-----------+----------------------------------------------------------------+
| dn2       | INSERT INTO order1(id,sn,create_time) VALUES(4,'GZH004',NOW()) |
+-----------+----------------------------------------------------------------+
1 row in set (0.00 sec)

mysql> INSERT INTO order1(id,sn,create_time) VALUES(4,'GZH004',NOW());
Query OK, 1 row affected (0.06 sec)

mysql> INSERT INTO ORDER_DETAIL(id,order_id,ord_status,address,create_time) VALUES (4,4,'1','test data  of order1(id=4,GZH004) ',NOW());
Query OK, 1 row affected (0.05 sec)

第四组 武汉的订单,这里故意将order_id设置成4,看看效果,是否随id为4的广州的那组分片:

mysql> explain INSERT INTO order1(id,sn,create_time) VALUES(5,'WUHAN005',NOW());

+-----------+------------------------------------------------------------------+

| DATA_NODE | SQL                                                              |

+-----------+------------------------------------------------------------------+

| dn3       | INSERT INTO order1(id,sn,create_time) VALUES(5,'WUHAN005',NOW()) |

+-----------+------------------------------------------------------------------+

1 row in set (0.01 sec)

mysql> INSERT INTO order1(id,sn,create_time) VALUES(6,'WUHAN006',NOW());

Query OK, 1 row affected (0.03 sec)

mysql> INSERT INTO ORDER_DETAIL(id,order_id,ord_status,address,create_time) VALUES (6,4,'1','test data  of order1(id=6,WUHAN006) ',NOW());

Query OK, 1 row affected (0.05 sec)
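The routing shown in the explain output above can be reproduced with a tiny sketch of the assumed mod-hash mapping behind the `id-sharding-by-mod3` rule (an illustration only, not DBLE's actual implementation):

```python
# Assumed id-sharding-by-mod3 mapping:
# shard index = id % 3, with index 0 -> dn1, 1 -> dn2, 2 -> dn3.
DATA_NODES = ["dn1", "dn2", "dn3"]

def route(shard_key: int) -> str:
    """Return the dataNode an order1 row lands on under mod-3 sharding."""
    return DATA_NODES[shard_key % 3]

for order_id in (3, 4, 5, 6):
    print(order_id, "->", route(order_id))
# 3 -> dn1, 4 -> dn2, 5 -> dn3, 6 -> dn1
```

This matches the explain results above: id=3 routes to dn1, id=4 to dn2, id=5 to dn3, and id=6 wraps back to dn1.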

Query through DBLE to see how the data was written:

(testdb) > select * from order1;

+----+--------+---------------------+

| id | sn     | create_time         |

+----+--------+---------------------+

|  1 | BJ0001 | 2019-08-31 23:05:36 |

|  4 | GZH004 | 2019-08-31 23:06:57 |

|  3 | SHH001 | 2019-08-31 23:06:43 |

+----+--------+---------------------+

3 rows in set (0.01 sec)

(testdb) > select * from order_detail ;

+----+----------+------------+--------------------------------------+---------------------+

| id | order_id | ord_status | address                              | create_time         |

+----+----------+------------+--------------------------------------+---------------------+

|  1 |        1 | 1          | test data  of ORDER1(ID=1,BJ001)     | 2019-08-31 23:06:17 |

|  4 |        4 | 1          | test data  of ORDER1(ID=4,GZH004)    | 2019-08-31 23:07:01 |

|  6 |        4 | 1          | test data  of ORDER1(ID=6,WUHAN006)  | 2019-08-31 23:07:23 |

|  3 |        3 | 1          | test data of ORDER1(ID=3,SHH001)     | 2019-08-31 23:06:47 |

+----+----------+------------+--------------------------------------+---------------------+

4 rows in set (0.01 sec)

Connect directly to the backend db1 to inspect its data (use the same method to check the data on db2 and db3):

((none)) > select * from db1.order1;

+----+--------+---------------------+

| id | sn     | create_time         |

+----+--------+---------------------+

|  3 | SHH001 | 2019-08-31 23:06:43 |

+----+--------+---------------------+

1 row in set (0.00 sec)

((none)) > select * from db1.order_detail;

+----+----------+------------+----------------------------------+---------------------+

| id | order_id | ord_status | address                          | create_time         |

+----+----------+------------+----------------------------------+---------------------+

|  3 |        3 | 1          | test data of ORDER1(ID=3,SHH001) | 2019-08-31 23:06:47 |

+----+----------+------------+----------------------------------+---------------------+

1 row in set (0.00 sec)

2.6 Through DBLE, simulate a typical business query:

(testdb) > explain select t1.*,t2.* from order1 t1,order_detail t2 where t2.ord_status='1' and t2.id=1 and t1.id=t2.order_id;

+-----------------+---------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

| DATA_NODE       | TYPE          | SQL/REF                                                                                                                                                                                                                                                      |

+-----------------+---------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

| dn1_0           | BASE SQL      | select `t2`.`id`,`t2`.`order_id`,`t2`.`ord_status`,`t2`.`address`,`t2`.`create_time`,`t1`.`id`,`t1`.`sn`,`t1`.`create_time` from  `order1` `t1` join  `order_detail` `t2` on `t1`.`id` = `t2`.`order_id` where (`t2`.`ord_status` = '1') AND (`t2`.`id` = 1) |

| dn2_0           | BASE SQL      | select `t2`.`id`,`t2`.`order_id`,`t2`.`ord_status`,`t2`.`address`,`t2`.`create_time`,`t1`.`id`,`t1`.`sn`,`t1`.`create_time` from  `order1` `t1` join  `order_detail` `t2` on `t1`.`id` = `t2`.`order_id` where (`t2`.`ord_status` = '1') AND (`t2`.`id` = 1) |

| dn3_0           | BASE SQL      | select `t2`.`id`,`t2`.`order_id`,`t2`.`ord_status`,`t2`.`address`,`t2`.`create_time`,`t1`.`id`,`t1`.`sn`,`t1`.`create_time` from  `order1` `t1` join  `order_detail` `t2` on `t1`.`id` = `t2`.`order_id` where (`t2`.`ord_status` = '1') AND (`t2`.`id` = 1) |

| merge_1         | MERGE         | dn1_0; dn2_0; dn3_0                                                                                                                                                                                                                                          |

| shuffle_field_1 | SHUFFLE_FIELD | merge_1                                                                                                                                                                                                                                                      |

+-----------------+---------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

5 rows in set (0.00 sec)
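The plan above shows the join pushed down as the same BASE SQL to every dataNode, followed by a MERGE of the three result sets and a SHUFFLE_FIELD step that reorders columns for the client. A minimal sketch of that scatter-gather pattern (pure illustration, not DBLE code; the row layout below is assumed):

```python
def scatter_gather(per_node_rows):
    """MERGE step: concatenate the result sets from dn1_0, dn2_0, dn3_0."""
    merged = []
    for rows in per_node_rows:
        merged.extend(rows)
    return merged

def shuffle_field(rows, order):
    """SHUFFLE_FIELD step: reorder each row's columns for the client."""
    return [tuple(row[i] for i in order) for row in rows]

# Assume each node returns rows as (t2.id, t2.order_id, t2.ord_status,
# t2.address, t1.id, t1.sn); only dn1_0 matched t2.id = 1 here.
node_results = [
    [(1, 1, "1", "test data of order1(id=1,BJ001)", 1, "BJ0001")],  # dn1_0
    [],  # dn2_0
    [],  # dn3_0
]

merged = scatter_gather(node_results)
# Put t1's columns first, matching the client's SELECT t1.*, t2.*
result = shuffle_field(merged, order=[4, 5, 0, 1, 2, 3])
```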

(testdb) > SELECT
  t1.*,
  t2.*
FROM
  order1 t1,
  order_detail t2
WHERE t2.ord_status = '1'
  AND t2.id = 1
  AND t1.id = t2.order_id ;

+----+--------+---------------------+----+----------+------------+-----------------------------------+---------------------+

| id | sn     | create_time         | id | order_id | ord_status | address                           | create_time         |

+----+--------+---------------------+----+----------+------------+-----------------------------------+---------------------+

|  1 | BJ0001 | 2019-08-31 23:05:36 |  1 |        1 | 1          | test data  of ORDER1(ID=1,BJ001)  | 2019-08-31 23:06:17 |

+----+--------+---------------------+----+----------+------------+-----------------------------------+---------------------+

1 row in set (0.00 sec)

2.7 Summary: when the child table's join column is exactly the parent table's sharding column, the child table is sharded by the parent's rule. On insert, each child row is placed directly on its parent's shard, and a join between the two tables follows the parent's route.

[Important] Additional notes:

When the child table's join column is NOT the parent table's sharding column, DBLE must look up the matching parent row to determine which shard the child row belongs on; if no parent row is found, an error is thrown. And in that case, a join query is routed to ALL shard nodes!
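The two routing cases can be summarized in a small sketch (assumed behavior inferred from the tests above, not DBLE source code):

```python
DATA_NODES = ["dn1", "dn2", "dn3"]

def route_parent(order_id: int) -> str:
    """order1 shard under the assumed id-sharding-by-mod3 rule."""
    return DATA_NODES[order_id % 3]

def route_child_direct(order_id: int) -> str:
    """Case 1: joinKey (order_id) equals the parent's sharding key (id).
    The child's shard is computed directly from the join value."""
    return route_parent(order_id)

def route_child_by_lookup(join_value, parent_index: dict) -> str:
    """Case 2: joinKey is NOT the parent's sharding key.
    The matching parent row must be found first; a missing parent is an error."""
    if join_value not in parent_index:
        raise LookupError("cannot find parent row to determine child shard")
    return parent_index[join_value]

# The detail row with order_id=4 lands with its parent on dn2:
print(route_child_direct(4))  # -> dn2
```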

