44

pt-tools系统:pt-kill 实战

 5 years ago
source link: http://keithlan.github.io/2018/11/23/pt_kill_practise/?amp%3Butm_medium=referral
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.

列出几种常用场景,并进行分析实战测试

特殊、 打印出执行时间超过3秒的connection,仅仅打印,不kill

  • 每2秒循环一次,超过10秒就退出pt-kill程序
pt-kill --host xx.xxx.126.164 --port 3306 --user pt_kill --ask-pass --print --ignore-self --busy-time=3 --interval 2 --run-time=10

重点注意: 这里的--busy-time=3,指的是Command=Query的连接,其他的并不会被匹配哦 , 所以一般情况下删除的都是比较安全的用户thread
重点注意2: ddl,dml,select,都是属于Command=Query  

+------+---------------+----------------------+--------------+------------------+--------+---------------------------------------------------------------+-------------------------------------------+
| Id   | User          | Host                 | db           | Command          | Time   | State                                                         | Info                                      |
+------+---------------+----------------------+--------------+------------------+--------+---------------------------------------------------------------+-------------------------------------------+
|    3 | repl          | xx.xxx.126.166:60528 | NULL         | Binlog Dump GTID | 492777 | Master has sent all binlog to slave; waiting for more updates | NULL                                      |
|    4 | repl          | xx.xxx.126.165:48604 | NULL         | Binlog Dump GTID | 492765 | Master has sent all binlog to slave; waiting for more updates | NULL                                      |
|  502 | job_heartbeat | xx.xxx.2.217:34626   | heartbeat_db | Sleep            |      0 |                                                               | NULL                                      |
| 1053 | dbadmin       | localhost            | heartbeat_db | Query            |      1 | altering table                                                | alter table heartbeat add column ts2 date|
| 1055 | pt_kill       | xx.xxx.126.166:63167 | NULL         | Sleep            |      1 |                                                               | NULL                                      |
| 1056 | dbadmin       | localhost            | NULL         | Query            |      0 | starting                                                      | show processlist                          |
+------+---------------+----------------------+--------------+------------------+--------+---------------------------------------------------------------+-------------------------------------------+


+------+---------------+----------------------+--------------+------------------+--------+---------------------------------------------------------------+-------------------------------------------------+
| Id   | User          | Host                 | db           | Command          | Time   | State                                                         | Info                                            |
+------+---------------+----------------------+--------------+------------------+--------+---------------------------------------------------------------+-------------------------------------------------+
|    3 | repl          | xx.xxx.126.166:60528 | NULL         | Binlog Dump GTID | 492716 | Master has sent all binlog to slave; waiting for more updates | NULL                                            |
|    4 | repl          | xx.xxx.126.165:48604 | NULL         | Binlog Dump GTID | 492704 | Master has sent all binlog to slave; waiting for more updates | NULL                                            |
|  502 | job_heartbeat | xx.xxx.2.217:34626   | heartbeat_db | Sleep            |      0 |                                                               | NULL                                            |
| 1053 | dbadmin       | localhost            | heartbeat_db | Query            |      3 | updating                                                      | update heartbeat set ts = '2018-09-06 00:07:58'|
| 1055 | pt_kill       | xx.xxx.126.166:63167 | NULL         | Sleep            |      0 |                                                               | NULL                                            |
| 1056 | dbadmin       | localhost            | NULL         | Query            |      0 | starting                                                      | show processlist                                |
+------+---------------+----------------------+--------------+------------------+--------+---------------------------------------------------------------+-------------------------------------------------+

+------+---------------+----------------------+--------------+------------------+--------+---------------------------------------------------------------+-------------------+
| Id   | User          | Host                 | db           | Command          | Time   | State                                                         | Info              |
+------+---------------+----------------------+--------------+------------------+--------+---------------------------------------------------------------+-------------------+
|    3 | repl          | xx.xxx.126.166:60528 | NULL         | Binlog Dump GTID | 492613 | Master has sent all binlog to slave; waiting for more updates | NULL              |
|    4 | repl          | xx.xxx.126.165:48604 | NULL         | Binlog Dump GTID | 492601 | Master has sent all binlog to slave; waiting for more updates | NULL              |
|  502 | job_heartbeat | xx.xxx.2.217:34626   | heartbeat_db | Sleep            |      0 |                                                               | NULL              |
| 1053 | dbadmin       | localhost            | heartbeat_db | Query            |      3 | User sleep                                                    | select 1,sleep(4) |
| 1055 | pt_kill       | xx.xxx.126.166:63167 | NULL         | Sleep            |      1 |                                                               | NULL              |
| 1056 | dbadmin       | localhost            | NULL         | Query            |      0 | starting                                                      | show processlist  |
+------+---------------+----------------------+--------------+------------------+--------+---------------------------------------------------------------+-------------------+

+------+---------------+----------------------+--------------+------------------+--------+---------------------------------------------------------------+------------------+
| Id   | User          | Host                 | db           | Command          | Time   | State                                                         | Info             |
+------+---------------+----------------------+--------------+------------------+--------+---------------------------------------------------------------+------------------+
|    3 | repl          | xx.xxx.126.166:60528 | NULL         | Binlog Dump GTID | 492740 | Master has sent all binlog to slave; waiting for more updates | NULL             |
|    4 | repl          | xx.xxx.126.165:48604 | NULL         | Binlog Dump GTID | 492728 | Master has sent all binlog to slave; waiting for more updates | NULL             |
|  502 | job_heartbeat | xx.xxx.2.217:34626   | heartbeat_db | Sleep            |      0 |                                                               | NULL             |
| 1053 | dbadmin       | localhost            | heartbeat_db | Query            |      4 | starting                                                      | rollback |
| 1055 | pt_kill       | xx.xxx.126.166:63167 | NULL         | Sleep            |      0 |                                                               | NULL             |
| 1056 | dbadmin       | localhost            | NULL         | Query            |      0 | starting                                                      | show processlist |
+------+---------------+----------------------+--------------+------------------+--------+---------------------------------------------------------------+------------------+

一、打印出sleep时间超过3秒的connection,仅仅打印,不kill

  • 每2秒循环一次,无限循环下去
pt-kill  --host xx.xxx.126.164 --port 3306 --user pt_kill --ask-pass  --print --ignore-self --idle-time=3 --interval 2
  • 每2秒循环一次,超过10秒就退出pt-kill程序
pt-kill  --host xx.xxx.126.164 --port 3306 --user pt_kill --ask-pass  --print --ignore-self --idle-time=3 --interval 2 --run-time=10

二、kill掉query语句中带有sleep关键字(不区分大小写)的connection, 且Time超过3秒

pt-kill  --host xx.xxx.126.164 --port 3306 --user pt_kill --ask-pass  --print --ignore-self  --interval 2 --match-info "(?i-xsm:(sleep))" --busy-time=3 --kill --victims all

三、kill掉非系统用户的select开头,且执行时间超过3秒的 connection

pt-kill  --host xx.xxx.126.164 --port 3306 --user pt_kill --ask-pass  --ignore-self --print  --interval 2 --match-info "(?i-xsm:^(select))"  --ignore-user="root|repl"  --busy-time=3 --kill --victims all

四、kill掉非系统用户的select,update,delete开头,且执行时间超过3秒的 connection

pt-kill  --host xx.xxx.126.164 --port 3306 --user pt_kill --ask-pass  --ignore-self --print  --interval 2 --match-info "(?i-xsm:^(select))|(?i-xsm:^(update))|(?i-xsm:^(delete))"  --ignore-user="root|repl"  --busy-time=3 --kill --victims all

五、kill掉指定特征的query语句

  • kill掉非系统用户,且query语句中同时包含heartbeat 和 where ,且heartbeat在前where在后,且执行时间超过3秒的 connection
pt-kill --host xx.xxx.126.164 --port 3306 --user pt_kill --ask-pass --ignore-self --print --interval 2 --match-info ".*heartbeat.*where.*" --ignore-user="root|repl" --busy-time=3 --kill --victims all


# 2018-11-15T10:38:03 KILL 1053 (Query 27 sec) select *,sleep(10) from heartbeat  where id < '1000000000'
# 2018-11-15T10:38:05 KILL 1053 (Query 29 sec) select *,sleep(10) from heartbeat  where id < '1000000000'
# 2018-11-15T10:38:07 KILL 1053 (Query 31 sec) select *,sleep(10) from heartbeat  where id < '1000000000'
# 2018-11-15T10:38:09 KILL 1053 (Query 33 sec) select *,sleep(10) from heartbeat  where id < '1000000000'

六、kill掉非系统库的select开头,且执行时间超过3秒的 connection

pt-kill  --host xx.xxx.126.164 --port 3306 --user pt_kill --ask-pass  --ignore-self --print  --interval 2 --match-info "(?i-xsm:^(select))"  --ignore-db="mysql|information_schema"   --ignore-user="root|repl"  --busy-time=3 --kill --victims all

七、kill掉非系统用户的select,update,delete开头,且执行时间超过3秒,且不是被locked住 的 connection

pt-kill  --host xx.xxx.126.164 --port 3306 --user pt_kill --ask-pass  --ignore-self --print  --interval 2 --match-info "(?i-xsm:^(select))|(?i-xsm:^(update))|(?i-xsm:^(delete))"  --ignore-state="Locked"   --ignore-user="root|repl"  --busy-time=3  --kill  --victims all

八、kill掉非系统用户,指定state(Locked、login、Updating、Sorting for order等状态),且执行时间超过3秒 的 connection

  • 指定Locked的connection删除掉
pt-kill  --host xx.xxx.126.164 --port 3306 --user pt_kill --ask-pass  --ignore-self --print  --interval 2 --match-info "(?i-xsm:^(select))|(?i-xsm:^(update))|(?i-xsm:^(delete))"  --match-state="Locked"   --ignore-user="root|repl"  --busy-time=3 --kill --victims all

九、kill掉非系统用户,指定Command(Query、Sleep、Binlog Dump、Connect等状态),且执行时间超过3秒 的 connection

  • kill掉指定Connect的command connection
pt-kill  --host xx.xxx.126.164 --port 3306 --user pt_kill --ask-pass  --ignore-self --print  --interval 2 --match-info "(?i-xsm:^(select))|(?i-xsm:^(update))|(?i-xsm:^(delete))"  --match-command="Connect"   --ignore-user="root|repl"  --busy-time=3 --kill --victims all

十、kill掉指定来源host ip ,且select开头的,且执行时间超过3s的connection

pt-kill  --host xx.xxx.126.164 --port 3306 --user pt_kill --ask-pass  --ignore-self --print  --interval 2 --match-info "(?i-xsm:^(select))"  --ignore-host="x.x.x.x"   --ignore-user="root|repl"  --busy-time=3 --kill --victims all

十一、kill掉非系统用户,Command=Sleep,且空闲时间为3s的connection

pt-kill  --host xx.xxx.126.164 --port 3306 --user pt_kill --ask-pass  --ignore-self --print  --interval 2  --ignore-user="root|repl"  --idle-time=3 --kill --victims all

十二、kill掉非系统用户,指定特征的query,在后台运行,并打印日志

pt-kill  --host xx.xxx.126.164 --port 3306 --user pt_kill --ask-pass  --ignore-self --print  --interval 2 --match-info ".*heartbeat.*where.*"  --ignore-user="root|repl"  --busy-time=3 --daemonize --log='/root/kill.log' --kill --victims all

十三、—victims的用法

  • 背景
pid:1103 , 是最早开启事务的空闲进程 T1
pid:1095 , 是第二早开启事务的ddl进程 T2
pid:502  , 是最后一个开启事务的dml进程 T3

事务顺序是: T1 锁住了 T2,T3,  T2锁住了T3 , T3被T1,T2锁住  

+------+---------------+----------------------+--------------+------------------+--------+---------------------------------------------------------------+---------------------------------------------------------+
| Id   | User          | Host                 | db           | Command          | Time   | State                                                         | Info                                                    |
+------+---------------+----------------------+--------------+------------------+--------+---------------------------------------------------------------+---------------------------------------------------------+
|    3 | repl          | xx.xxx.126.166:60528 | NULL         | Binlog Dump GTID | 507504 | Master has sent all binlog to slave; waiting for more updates | NULL                                                    |
|    4 | repl          | xx.xxx.126.165:48604 | NULL         | Binlog Dump GTID | 507492 | Master has sent all binlog to slave; waiting for more updates | NULL                                                    |
|  502 | job_heartbeat | xx.xxx.2.217:34626   | heartbeat_db | Query            |    328 | Waiting for table metadata lock | insert into heartbeat(ts) values('2018-11-15 13:43:50') |
| 1095 | dbadxxx       | localhost            | heartbeat_db | Query            |    329 | Waiting for table metadata lock                               | alter table heartbeat add column ts3 date               |
| 1103 | dbadxxx       | localhost            | heartbeat_db | Sleep            |    341 |                                                               | NULL                                                    |
| 1104 | dbadxxx       | localhost            | NULL         | Query            |      0 | starting                                                      | show processlist                                        |
+------+---------------+----------------------+--------------+------------------+--------+---------------------------------------------------------------+---------------------------------------------------------+
6 rows in set (0.00 sec)
  • —match-command=”Query|Sleep” —victims oldest —busy-time=3

只kill最老的command为Query|Sleep的最老的链接

pt-kill --host xx.xxx.126.164 --port 3306 --user pt_kill --ask-pass --ignore-self --print --interval 2 --ignore-user="root|repl"   --match-command="Query|Sleep"      --busy-time=3  --victims oldest

Enter MySQL password:
# 2018-11-15T13:49:07 KILL 1103 (Sleep 330 sec) NULL
# 2018-11-15T13:49:09 KILL 1103 (Sleep 332 sec) NULL
  • —match-command=”Query|Sleep” —victims all —busy-time=3

kill 所有command=”Query|Sleep” 的所有链接

Enter MySQL password:
# 2018-11-15T13:48:41 KILL 1103(Sleep 304sec)NULL
# 2018-11-15T13:48:41 KILL 1095 (Query 292 sec) alter table heartbeat add column ts3 date
# 2018-11-15T13:48:41 KILL 502 (Query 291 sec) insert into heartbeat(ts) values('2018-11-15 13:43:50')
# 2018-11-15T13:48:41 KILL 1104 (Sleep 262 sec) NULL
  • —match-command=”Query” —victims all —busy-time=3

kill 所有 command=”Query”(默认不填也就是Query)的所有链接

Enter MySQL password:
# 2018-11-15T13:46:59 KILL 1095(Query 190sec)alter table heartbeat add column ts3 date
# 2018-11-15T13:46:59 KILL 502 (Query 189 sec) insert into heartbeat(ts) values('2018-11-15 13:43:50')
  • —match-command=”Query” —victims oldest —busy-time=3

kill 所有 command=”Query”(默认不填也就是Query)的最老的链接

Enter MySQL password:
# 2018-11-15T13:59:01 KILL 1095(Query 912sec)alter table heartbeat add column ts3 date

About Joyk


Aggregate valuable and interesting links.
Joyk means Joy of geeK