36

算法 - 找最长回文字符串, 从3s到30ms的解法说明

 5 years ago
source link: https://tomoya92.github.io/2019/04/30/algorithm-3/?amp%3Butm_medium=referral
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.

刷力扣的时候, 碰到了一个找最长回文字符串的题, 解开了, 但只要提交就提示超时, 然后就开始想办法优化, 下面说一下我处理的方法

a2auIvI.png!web

有兴趣的可以先尝试着解一下这个题, 然后再看我下面整理的解法思路

分析: 这题一般能想到的解法是

  1. 找到所有的子字符串
  2. 判断所有的子字符串是否是回文字符串
  3. 统计回文子字符串的长度
  4. 排序找出最长的那个

照着这个思路就有了下面的解法

@Test
public void test() {
  long l = System.currentTimeMillis();
  // 共999个e
  String s =
      "eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee";
  System.out.println("字符串长度: " + s.length() + " 结果: " + longestPalindrome0(s));
  long l1 = System.currentTimeMillis();
  System.out.println("用时: " + (l1 - l) + "ms");
}

public String longestPalindrome0(String s) {
  if (s == null || s.equals("")) return "";
  if (s.length() == 1) return s;
  int maxLength = 0;
  String ns = "";
  // 找出所有的子串
  for (int i = 0; i < s.length(); i++) {
    for (int j = i + 1; j <= s.length(); j++) {
      // 两层循环找出所有的子串
      String s1 = s.substring(i, j);
      // 判断是回文字符串
      int len = s1.length() % 2 == 0 ? s1.length() / 2 : s1.length() / 2 + 1;
      boolean b = false;
      for (int k = 0; k < len; k++) {
        // 通过比较第一个和倒数最后一个, 第二个和倒数第二个...是否相等判断是否是回文字符串
        b = s1.substring(k, k + 1).equals(s1.substring(s1.length() - 1 - k, s1.length() - k));
        // 如果碰到有一对不一样, 直接break, 省一些时间
        if (!b) break;
      }
      if (b) {
        // 如果是回文字符串, 跟先前保存的长度做比较, 如果比先前保存的回文字符串长度长, 则替换
        if (maxLength < s1.length()) {
          maxLength = s1.length();
          ns = s1;
        }
      }
    }
  }
  return ns;
}

原链接文: https://tomoya92.github.io/2019/04/30/algorithm-3/

运行结果:

字符串长度: 999 结果: eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee
用时: 3846ms

一看时间, 3s多, 这怎么能行, 然后我发现 StringBuffer 或者 StringBuilder 有个 reverse() 方法, 可以反转字符串, 比自己for循环的效率要高不少, 果断换了, 于是就有了下面的代码

public String longestPalindrome(String s) {
  if (s == null || s.equals("")) return "";
  if (s.length() == 1) return s;
  int maxLength = 0;
  String ns = "";
  // 找出所有的子串
  for (int i = 0; i < s.length(); i++) {
    for (int j = i + 1; j <= s.length(); j++) {
      // 封装成StringBuilder对象
      StringBuilder s1 = new StringBuilder(s.substring(i, j));
      // 使用reverse()方法判断是否是回文字符串, 下面逻辑就跟上面那种解法一样了
      if (s1.toString().equals(s1.reverse().toString())) {
        if (maxLength < s1.length()) {
          maxLength = s1.length();
          ns = s1.toString();
        }
      }
    }
  }
  return ns;
}

下面是测试结果

字符串长度: 999 结果: eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee
用时: 656ms

这少了将近3s呀, 赶紧提交代码, 然而力扣还是嫌太费时了, 不能通过, 这可怎么办, 能想到的办法也就到这了呀

然后我上网搜了一下怎么快速判断回文字符串, 看到了 Manacher算法 也被翻译过来是马拉车算法, 通过插相同值的方法来处理的

下面来说一下大致思路(如果我这没说明白的, 网上关于这方面的介绍文章很多)

回文字符串处理起来长度要分奇数和偶数两种, 处理方式还不一样, 所以这算法首先把字符串插成了奇数个, 怎么个插法呢?

先在字符串两边各加上一个字符串中不可能出现的字符且两个字符不能相等, 然后字符串中间用同一个字符间隔插入, 举例

假如有一个字符串 ababc , 首先在两边插入两个不同的且不可能出现在字符串中的字符后变成这样 ^ababc$ 然后在新字符串每个字符中间插上一个相同的字符, 比如 # 就变成 ^#a#b#a#b#c#$

这样插值后, 字符串就变成一个长度为奇数的字符串了, 上面这个例子是奇数字符串插值后的结果, 偶数字符串插值后也会变成奇数字符串

这样就好处理了, 对这个新生成的字符串做循环, 从第二位一直循环到倒数第二位, 为啥不循环第一位和最后一位呢? 看下面解释

循环要做的逻辑操作就是找当前循环的字符的前一位(多位)和后一位(多位)相比较, 如果一样,则继续找, 不一样则跳过(也就证明了不是回文字符串), 举个例子

还是上面那个字符串 ^#a#b#a#b#c#$ 循环从第二位开始(下标为1), 假设现在循环到了下标为4的字符了, 那它会做啥呢?

  1. 先找下标为3和下标为5的两个字符是否一样
  2. 如果一样, 则继续往两边找下标为2和6的两个字符是否一样
  3. 以此类推, 直到找到下标为0和下标为8的两位是不一样的, 则跳过

在比较的过程中可以创建一个数组来存放比较了多少次, 当前字符默认为1次, 比较成功一次则+1, 直到失败, 这样比较完后, 可以得到下面这个表格

下标 0 1 2 3 4 5 6 7 8 9 10 11 12 插值后字符串 ^ # a # b # a # b # c # $ 比较相等的次数 0 0 1 0 3 0 3 0 1 0 1 0 0

观察上面插值后的字符串可以得知, 比较的次数就是回文串的长度, 仔细观察一下就可以看出来了

有这规律就好办了, 于是就有了下面的代码

public String longestPalindrome1(String s) {
  if (s == null || s.equals("")) return "";
  char[] chars = s.toCharArray();
  // 插入字符
  String ns = "^#";
  for (int i = 0; i < chars.length; i++) {
    ns = ns + chars[i] + "#";
  }
  ns += "$";
  char[] chars1 = ns.toCharArray();
  int[] p = new int[chars1.length];
  boolean b;
  int j = 1;
  int maxLength = 0;
  String ns1 = "";
  // 只循环第2到倒数第二之间的字符串
  for (int i = 1; i < chars1.length - 1; i++) {
    b = true;
    while (b) {
      // 从当前循环的字符开始,往两边查找字符比较
      char c1 = chars1[i - j];
      char c2 = chars1[i + j];
      if (c1 == c2) {
        // 比较通过, 则+1
        p[i] = p[i] + 1;
        j++;
      } else {
        // 不通过, 开始统计最大数, 当当前比较次数比之前记录的比较次数大, 则更新上面的记录变量
        if (maxLength < p[i]) {
          maxLength = p[i];
          ns1 = ns.substring(i - j + 1, i + j);
        }
        j = 1;
        b = false;
      }
    }
  }
  // 最后别忘了把#去掉
  return ns1.replace("#", "");
}

运行结果

字符串长度: 999 结果: eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee
用时: 26ms

从昨天纠结到今天, 能想到的方法就第一种和第二种, 第三种插值的方法完全没有想到, 今天上网搜相关的解法的时候看到的, 不得不佩服别人

上面三种解法的时间复杂度从 O(n^3) 降到 O(n^2) 到最后的 O(n) 相对的时间也从最开始的3s降到最后的30ms, 感慨算法的伟大

写博客不易,转载请保留原文链接,谢谢!

原文链接:


About Joyk


Aggregate valuable and interesting links.
Joyk means Joy of geeK