Fixing frequent GitLab Sidekiq queue crashes

 1 year ago
source link: https://chegva.com/3712.html

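At my previous company I upgraded a source-compiled GitLab from 8.x to 11.x, and two major problems followed. First, merging code in some repositories would spin forever; the database showed a row that was stuck locked, and upgrading MySQL from 5.5 to 5.7 resolved it. Second, some CI jobs crashed Sidekiq while matching regular expressions; at the time, the cause I found appeared to be a compatibility problem in the re2 dependency. I left the company and stopped following the issue, but a former colleague recently told me it has been solved, which is great, so here is the general approach.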


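1. Crash symptoms

The logs show the following:

1. Every crash happened while the PostReceive worker was running.
2. The call graph shows that the root cause of every crash was a segmentation fault at untrusted_regexp.rb:25.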





2020-02-24T20:21:34.562Z 30348 TID-grh05pnog UpdateMergeRequestsWorker JID-71d9b9c116cc958d619206d8 INFO: start
/home/gitlab/gitlab/lib/gitlab/untrusted_regexp.rb:25: [BUG] Segmentation fault at 0x0000000000000000
ruby 2.5.3p105 (2018-10-18 revision 65156) [x86_64-linux]

-- Control frame information -----------------------------------------------
c:0119 p:---- s:0627 e:000626 CFUNC  :error
c:0118 p:0068 s:0623 e:000620 METHOD /home/gitlab/gitlab/lib/gitlab/untrusted_regexp.rb:25 [FINISH]
c:0117 p:---- s:0614 e:000613 CFUNC  :new
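
One way to check the first observation mechanically is to track which jobs have logged `start` but not yet `done` at the moment the crash line appears. The following is a minimal sketch, assuming each Sidekiq log entry sits on a single line in the format shown in the excerpts in this article; the `in_flight_at_crash` helper and both regexes are illustrative names, not part of GitLab or Sidekiq.

```ruby
# Matches Sidekiq job lifecycle lines such as:
#   2020-02-25T22:42:34.791Z 144464 TID-ov0hhbaas PostReceive JID-8e04... INFO: start
JOB_LINE = /\A\s*\S+ \d+ (?<tid>TID-\S+) (?<worker>\S+) (?<jid>JID-\S+) INFO: (?<event>start|done|fail)/
CRASH_LINE = /\[BUG\] Segmentation fault/

# Returns a hash of worker class name => number of jobs that had started
# but not finished when the first segmentation fault line was seen.
def in_flight_at_crash(lines)
  running = {} # jid => worker class name
  lines.each do |line|
    if (m = JOB_LINE.match(line))
      if m[:event] == 'start'
        running[m[:jid]] = m[:worker]
      else
        running.delete(m[:jid])
      end
    elsif CRASH_LINE.match?(line)
      counts = Hash.new(0)
      running.each_value { |worker| counts[worker] += 1 }
      return counts
    end
  end
  {} # no crash line found
end

sample = [
  '2020-02-25T22:42:34.787Z 144464 TID-ov0hhbaas UpdateMergeRequestsWorker JID-da8363d5018135858fd6ee16 INFO: done: 0.187 sec',
  '2020-02-25T22:42:34.791Z 144464 TID-ov0hhbaas PostReceive JID-8e046a6ff89a71a5223db273 INFO: start',
  '/home/gitlab/gitlab/lib/gitlab/untrusted_regexp.rb:25: [BUG] Segmentation fault at 0x0000000000000000'
]
p in_flight_at_crash(sample)
```

Because the crash line itself carries no thread or job ID, the result only narrows the candidates to the jobs that were in flight; it does not prove which one triggered the fault.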

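2. Failure analysis and handling

2.1 Try adding log statements just before the crashing code to see which pattern, and which piece of business logic, is being matched.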

# [root@xxx log]# cat /home/gitlab/gitlab/lib/gitlab/untrusted_regexp/ruby_syntax.rb

# frozen_string_literal: true

module Gitlab
  class UntrustedRegexp
    # This class implements support for Ruby syntax of regexps
    # and converts that to RE2 representation:
    # /<regexp>/<flags>
    class RubySyntax
      PATTERN = %r{^/(?<regexp>.*)/(?<flags>[ismU]*)$}.freeze

      # Checks if pattern matches a regexp pattern
      # but does not enforce its validity
      def self.matches_syntax?(pattern)
        pattern.is_a?(String) && pattern.match(PATTERN).present?
      end

      # The regexp can match the pattern `/.../`, but may not be fabricatable:
      # it can be invalid or incomplete: `/match ( string/`
      def self.valid?(pattern, fallback: false)
        # Temporary debug output added for this investigation
        puts "xudy pattern is: \"#{pattern}\""
        !!self.fabricate(pattern, fallback: fallback)
      end

      def self.fabricate(pattern, fallback: false)
        self.fabricate!(pattern, fallback: fallback)
      rescue RegexpError
        nil
      end

      def self.fabricate!(pattern, fallback: false)
        raise RegexpError, 'Pattern is not string!' unless pattern.is_a?(String)

        matches = pattern.match(PATTERN)
        raise RegexpError, 'Invalid regular expression!' if matches.nil?

        begin
          create_untrusted_regexp(matches[:regexp], matches[:flags])
        rescue RegexpError
          raise unless fallback &&
              Feature.enabled?(:allow_unsafe_ruby_regexp, default_enabled: false)

          create_ruby_regexp(matches[:regexp], matches[:flags])
        end
      end

      def self.create_untrusted_regexp(pattern, flags)
        pattern.prepend("(?#{flags})") if flags.present?
        # Temporary debug output added for this investigation
        puts "xudy flaged pattern is: \"#{pattern}\" "
        UntrustedRegexp.new(pattern, multiline: false)
      end
      private_class_method :create_untrusted_regexp

      def self.create_ruby_regexp(pattern, flags)
        options = 0
        options += Regexp::IGNORECASE if flags&.include?('i')
        options += Regexp::MULTILINE if flags&.include?('m')

        Regexp.new(pattern, options)
      end
      private_class_method :create_ruby_regexp
    end
  end
end
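
The `puts` calls in the listing write to standard output. If that stream is buffered (Ruby buffers `$stdout` unless `sync` is set on it), the last lines written before a segmentation fault may never reach the log. A minimal sketch of a more defensive trace helper follows; `debug_trace` is a hypothetical name for illustration and does not exist in GitLab.

```ruby
# Hypothetical helper: writes one debug line and flushes immediately,
# so the output is not left in a userspace buffer if the process aborts.
# $stderr is used by default because Ruby keeps it unbuffered (sync = true).
def debug_trace(label, value, io: $stderr)
  io.puts("#{label}: #{value.inspect}")
  io.flush
end

debug_trace('pattern', '/^fds-cn-.*$/')
```

Using `inspect` also makes empty strings and trailing whitespace visible, which helps when the traced value is a regexp source or a flags string.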

Every crash is preceded by log output like this:
 2020-02-25T22:42:34.787Z 144464 TID-ov0hhbaas UpdateMergeRequestsWorker JID-da8363d5018135858fd6ee16 INFO: done: 0.187 sec
 2020-02-25T22:42:34.791Z 144464 TID-ov0hhbaas PostReceive JID-8e046a6ff89a71a5223db273 INFO: start
 2020-02-25T22:42:34.805Z 144464 TID-ov0hhbaas PostReceive JID-8e046a6ff89a71a5223db273 ERROR: xudy post receive worker: "fos"  "key-11733"  "{}"
 xudy post receive worker: "fos"  "key-11733"  "{}"
 2020-02-25T22:42:35.194Z 144464 TID-ov0hhbakc ProcessCommitWorker JID-e46ac41e6469409194fde0cb INFO: done: 0.538 sec
 2020-02-25T22:42:35.199Z 144464 TID-ov0hhbakc PostReceive JID-818e393f097f60bd15565e32 INFO: start
 2020-02-25T22:42:35.200Z 144464 TID-ov0hhb88g ProcessCommitWorker JID-c32e2c1fd8f0edcc4f10397a INFO: done: 0.588 sec
 2020-02-25T22:42:35.209Z 144464 TID-ov0hhb88g PostReceive JID-6b9b40108350752d3874c283 INFO: start
 2020-02-25T22:42:35.214Z 144464 TID-ov0hhbakc PostReceive JID-818e393f097f60bd15565e32 ERROR: xudy post receive worker: "office-company-api"  "key-16326"  "{}"
 xudy post receive worker: "office-company-api"  "key-16326"  "{}"
 2020-02-25T22:42:35.241Z 144464 TID-ov0hhb88g PostReceive JID-6b9b40108350752d3874c283 ERROR: xudy post receive worker: "fs-workbench-web"  "key-15813"  "{}"
 xudy post receive worker: "fs-workbench-web"  "key-15813"  "{}"
 xudy pattern is: "/^fds-cn-.*$/"
 xudy flaged pattern is: "^fds-cn-.*$"
 /home/gitlab/gitlab/lib/gitlab/untrusted_regexp.rb:25: [BUG] Segmentation fault at 0x0000000000000000
 ruby 2.5.3p105 (2018-10-18 revision 65156) [x86_64-linux]
 -- Control frame information -----------------------------------------------
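
The two `xudy` pattern lines can be reproduced outside GitLab with plain Ruby, which shows how the configured value `/^fds-cn-.*$/` becomes the string handed to `UntrustedRegexp.new`. This is a sketch under stated assumptions: `PATTERN` is copied from the `RubySyntax` listing, `to_re2_source` is a hypothetical helper, and `empty?` stands in for Rails' `present?` so the snippet runs without ActiveSupport.

```ruby
# Copied from Gitlab::UntrustedRegexp::RubySyntax in the listing above.
PATTERN = %r{^/(?<regexp>.*)/(?<flags>[ismU]*)$}.freeze

# Hypothetical helper mirroring fabricate! + create_untrusted_regexp up to
# (but not including) the call into RE2: it strips the surrounding slashes
# and prepends the flags as an inline "(?flags)" group when any are given.
def to_re2_source(pattern)
  matches = pattern.match(PATTERN)
  return nil if matches.nil?

  source = matches[:regexp].dup
  flags = matches[:flags]
  source.prepend("(?#{flags})") unless flags.empty?
  source
end

p to_re2_source('/^fds-cn-.*$/')
p to_re2_source('/foo/i')
```

For the logged pattern there are no flags, so nothing is prepended and the string passed on is simply `^fds-cn-.*$`, matching the second `xudy` line. The pattern itself looks unremarkable, which is at least consistent with the suspicion that the fault lies in the library binding rather than in the expression.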