
FATE Federated Learning Framework: Troubleshooting Notes

 2 years ago
source link: https://www.hollischuang.com/archives/6629

MySQL keeps restarting

Checking container status with docker-compose ps shows that the mysql and python containers are stuck in the Restarting state:

# docker-compose ps
            Name                           Command                 State                   Ports
--------------------------------------------------------------------------------------------------------------
confs-168801_client_1           /bin/sh -c flow init -c /d ...   Up           0.0.0.0:20000->20000/tcp
confs-168801_clustermanager_1   /tini -- bash -c java -Dlo ...   Up           4670/tcp, 8080/tcp
confs-168801_fateboard_1        /bin/sh -c java -Dspring.c ...   Up           0.0.0.0:8080->8080/tcp
confs-168801_mysql_1            docker-entrypoint.sh mysqld      Restarting
confs-168801_nodemanager_1      /tini -- bash -c java -Dlo ...   Up           4671/tcp, 8080/tcp
confs-168801_python_1           container-entrypoint /bin/ ...   Restarting
confs-168801_rollsite_1         /tini -- bash -c java -Dlo ...   Up           8080/tcp, 0.0.0.0:9370->9370/tcp
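
When a deployment has many containers, the restarting ones can be picked out of the table programmatically. The following is a minimal sketch, assuming the column layout shown above (columns separated by two or more spaces, as printed by this version of docker-compose); the function name `restarting_containers` is hypothetical and not part of FATE or Docker.

```python
import re
from typing import List


def restarting_containers(ps_output: str) -> List[str]:
    """Return names of containers whose State column reads 'Restarting'.

    Assumes the table layout shown in the article: a header row, a dashed
    separator, then one row per container with columns separated by runs
    of two or more spaces.
    """
    names = []
    for line in ps_output.splitlines():
        stripped = line.strip()
        # Skip blank lines, the header row, and the dashed separator.
        if not stripped or stripped.startswith("Name") or set(stripped) == {"-"}:
            continue
        columns = re.split(r"\s{2,}", stripped)
        if "Restarting" in columns:
            names.append(columns[0])
    return names


# Sample rows copied from the output above.
SAMPLE = """\
            Name                           Command                 State                   Ports
--------------------------------------------------------------------------------------------------------------
confs-168801_client_1           /bin/sh -c flow init -c /d ...   Up           0.0.0.0:20000->20000/tcp
confs-168801_mysql_1            docker-entrypoint.sh mysqld      Restarting
confs-168801_python_1           container-entrypoint /bin/ ...   Restarting
"""

print(restarting_containers(SAMPLE))
# ['confs-168801_mysql_1', 'confs-168801_python_1']
```

In practice the text could come from running `docker-compose ps` through `subprocess.run(..., capture_output=True, text=True)`; other Compose versions may print a different table, so the parsing would need adjusting.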

First, find the MySQL container's ID and name:

docker ps
CONTAINER ID        IMAGE                                                    COMMAND                  CREATED             STATUS                          PORTS                                                      NAMES
8ddc6e8aeee9        hub.c.163.com/federatedai/mysql:8                        "docker-entrypoint.s…"   8 hours ago         Restarting (1) 53 seconds ago                                                              confs-168801_mysql_1

View the MySQL logs:

docker logs 8ddc6e8aeee9

The logs contain the following errors:

2021-11-08T10:27:26.422440Z 1 [ERROR] [MY-012639] [InnoDB] Write to file ./ibtmp1 failed at offset 0, 1048576 bytes should have been written, only 0 were written. Operating system error number 28. Check that your OS and file system support files of this size. Check also that the disk is not full or a disk quota exceeded.
2021-11-08T10:27:26.422522Z 1 [ERROR] [MY-012640] [InnoDB] Error number 28 means 'No space left on device'
2021-11-08T10:27:26.422685Z 1 [ERROR] [MY-012267] [InnoDB] Could not set the file size of './ibtmp1'. Probably out of disk space
2021-11-08T10:27:26.422766Z 1 [ERROR] [MY-012926] [InnoDB] Unable to create the shared innodb_temporary.
2021-11-08T10:27:26.422857Z 1 [ERROR] [MY-012930] [InnoDB] Plugin initialization aborted with error Generic error.
2021-11-08T10:27:26.818621Z 1 [ERROR] [MY-010334] [Server] Failed to initialize DD Storage Engine
2021-11-08T10:27:26.818852Z 0 [ERROR] [MY-010020] [Server] Data Dictionary initialization failed.
2021-11-08T10:27:26.819105Z 0 [ERROR] [MY-010119] [Server] Aborting
2021-11-08T10:27:26.819512Z 0 [System] [MY-010910] [Server] /usr/sbin/mysqld: Shutdown complete (mysqld 8.0.21)  MySQL Community Server - GPL.
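
In a long log, the disk-full lines are easy to miss among the follow-on errors. Here is a minimal sketch that filters for them, assuming the MySQL 8.0 error-log layout seen above (timestamp, thread ID, then bracketed severity, error code, and subsystem). The helper name `disk_full_errors` and the two substrings it searches for are illustrative choices, not an official list of out-of-space messages.

```python
import re
from typing import List

# One MySQL 8.0 error-log line, as seen in the output above:
# timestamp, thread id, [severity], [error code], [subsystem], message.
LOG_LINE = re.compile(
    r"^(?P<ts>\S+)\s+(?P<thread>\d+)\s+"
    r"\[(?P<severity>\w+)\]\s+\[(?P<code>MY-\d+)\]\s+\[(?P<subsystem>\w+)\]\s+"
    r"(?P<message>.*)$"
)

# Substrings taken from the log lines in the article (OS errno 28 is ENOSPC).
DISK_FULL_MARKERS = ("No space left on device", "error number 28")


def disk_full_errors(log_text: str) -> List[str]:
    """Return the MySQL error codes of ERROR lines that mention a full disk."""
    codes = []
    for line in log_text.splitlines():
        match = LOG_LINE.match(line.strip())
        if not match or match.group("severity") != "ERROR":
            continue
        message = match.group("message")
        if any(marker in message for marker in DISK_FULL_MARKERS):
            codes.append(match.group("code"))
    return codes


# Three lines copied from the log above.
SAMPLE_LOG = """\
2021-11-08T10:27:26.422522Z 1 [ERROR] [MY-012640] [InnoDB] Error number 28 means 'No space left on device'
2021-11-08T10:27:26.422685Z 1 [ERROR] [MY-012267] [InnoDB] Could not set the file size of './ibtmp1'. Probably out of disk space
2021-11-08T10:27:26.819105Z 0 [ERROR] [MY-010119] [Server] Aborting
"""

print(disk_full_errors(SAMPLE_LOG))
# ['MY-012640']
```

Only the first sample line matches, because the sketch looks for the explicit errno text rather than every message that hints at low disk space.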

The error log says there is no space left on the disk. Running df confirms that the root filesystem is completely full:

[root@kubefate001 confs-168801]# df -h
Filesystem      Size  Used Avail  Use% Mounted on
/dev/vda1        40G   39G     0  100% /
devtmpfs        7.5G     0  7.5G    0% /dev
tmpfs           7.6G     0  7.6G    0% /dev/shm
tmpfs           7.6G  860K  7.6G    1% /run
tmpfs           7.6G     0  7.6G    0% /sys/fs/cgroup
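
The same check can be scripted so that a nearly full disk is noticed before containers start crash-looping. This is a minimal sketch using Python's standard `shutil.disk_usage`; the function name `disk_report` and the 1 GiB default threshold are assumptions chosen for illustration, not values from FATE or MySQL.

```python
import shutil


def disk_report(path: str = "/", min_free_bytes: int = 1024 ** 3) -> dict:
    """Summarize disk usage for the filesystem that contains `path`.

    `min_free_bytes` is a hypothetical alert threshold (1 GiB by default).
    """
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return {
        "total": usage.total,
        "used": usage.used,
        "free": usage.free,
        "percent_used": round(100 * usage.used / usage.total, 1),
        "low_space": usage.free < min_free_bytes,
    }


if __name__ == "__main__":
    report = disk_report("/")
    print(f"{report['percent_used']}% used, {report['free']} bytes free")
    if report["low_space"]:
        print("Warning: free space is below the threshold")
```

On a host in the state shown above, the free figure for `/` would be close to zero and `low_space` would be true. Such a check could run from cron or before `docker-compose up`.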
(End of article)