61

Augnito Coding Test Problem | Character Set Count

 1 year ago
source link: http://codeforces.com/blog/entry/91344
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.
By g_49, 12 months ago, In English

Hi, everyone. This problem was asked in Augnito Backend Developer Hiring Challenge on HackerEarth. This contest is now over. I've no idea how to solve this problem.

Statement

Given a string of length having lowercase English alphabets.

Let be the set of characters of a string . For example, if then .

Two sets are different if one of the sets has a character that is not present in the other set.

Find the number of different sets that can be obtained from all the non-empty substrings of the given string .

Constraints

Number of test cases

Sample Input

3
4
abbc
3
bcb
5
abbca 

Sample Output

6
3
7

UPD: tiasmondal's approach seems to be correct. I've added the code based on it.

C++ code

12 months ago, # |

I think the main crux here is that duplicate letters don't matter but order matters, for example.. If you have a string something like aabcba and you want to find the number of unique substring sets starting from last letter i.e a, is same as number of unique substrings in this string acba.. (omitting the duplicates) and this modified string size can be at most 26. so just traverse the string, maintain a vector of unique letters and if you have a previous occurrence of the current letter already in the vector, remove it and push_back this letter, and find different sets using bitmask by traversing this 26 size(at most) vector in order and insert in set. Finally return the answer. Hope it helps.

  • 12 months ago, # ^ |

    Surely this isn't correct. For example, consider string , then omitting one of the will cause you to get less sets from substrings. Because, in particular, in the original string, there is a set , but if you remove the last occurence of character , then you won't have that set.

    • 12 months ago, # ^ |

      Rev. 2  

      +1

      I am not telling to modify the original string.. but modify the vector as you traverse and calculate this for every index..., for example in the string you mentioned abcda... for i=0; "a", for i=1 "ab", for i=2 "abc", for i=3 "abcd", for i=4 "bcda", For each index calculate bitmask traversing vector from end.. in this way you will not miss {d,a} as you can see you will incorporate that in the index i=4.

      • 12 months ago, # ^ |

        Hmm okay, your language was a bit confusing. I understand what you're saying.

12 months ago, # |

Rev. 2  

+3

Precompute over all lowercase alphabets where is the position we are at currently and is the last position where alphabet ' occured uptil . Now iterate over all alphabets and those which have push them in a vector. Note that the size of this vector can be atmost .

Also maintain a set of binary masks of length in a set. For all set bits of mask we say that those bits(alphabets here) are valid and have occured.

Now create a mask of bits

Sort this vector of alphabets by their positions and iterate on it from the end as we want our current to be included in all the substrings. Also set the bits in this mask as and when you iterate and check if it is contained in our set "seen". If not insert it and increment answer by 1.

We have solved the problem in time and space. It should pass. You can further reduce the space by or times by storing the masks in a .

  • 12 months ago, # ^ |

    Well explained!

5 hours ago, # |

Java Solution

   static int count_sets(int N, String A){
        List<Character> chs = new ArrayList<>();
        Set<Integer> visited = new HashSet<>();

        for (char ch :  A.toCharArray()) {
            int index = chs.indexOf(ch);
            if (index != -1)
                chs.remove(index);
        
            chs.add(ch);

            for (int i=0; i < chs.size(); i++) {
                int mask = 0;
                for (int j=i; j < chs.size(); j++) {
                    int bit = chs.get(j) - 'a';
                    mask |= (1 << bit);
                }

                visited.add(mask);
            }
        }

        return visited.size();
    
    }

About Joyk


Aggregate valuable and interesting links.
Joyk means Joy of geeK