Home >Web Front-end >HTML Tutorial >Codeforces Round #244 (Div. 2)D (Suffix Automata)_html/css_WEB-ITnose
Codeforces Round #244 (Div. 2)D (Suffix Automata)
(The node marked 0 must be a null node and cannot be used in any case. Remember, in the future Can't be wrong)
If you use suffix automata for this question, you will have a deep enough understanding of many properties of suffix automata. I haven't thought about how to make a suffix array because I'm not happy about it. . . .
Question: Given two strings s1 and s2 whose length does not exceed 5000, find the shortest common substring that appears only once in these two strings.
Problem-solving idea: What you are looking for is a common substring, and then there is a limit on the number of occurrences. The first idea is a suffix automaton. A suffix automaton is perfect for handling the number of occurrences of a substring. The method is as follows: first establish the sam of s1, and use topology dp to find the number of occurrences of the representative string of each node. What is the purpose? In fact, I want to find ok[i][j], which means whether the substring from s1[i] ~ s1[j] only appears once. Now that we have found the number of occurrences of the representative string, how do we find this ok[i][j]? Use s1 to match on the established automaton. The current match is s1[i]. The record temp indicates the longest length of the current match, and now indicates which node the current match is on. There is a property here that is very similar to the AC automaton. If it matches now, it will definitely match fa[now]. Then go up along now and find the first node p whose occurrence number is greater than 1. Then the substring ending with i and having a length of val[p] 1 to temp must only appear once in s1. Record this into the ok array. The second step is to process s2. It is still the same process. Create sam and find the number of occurrences of the representative string of each point, which is the cnt[] array. The third step is to use s1 to match the sam of s2. The matching process is similar to the previous processing of the ok array of s1. Find the longest length of the current matching temp, the matched node now, follow the now upwards, and find The first node p with cnt greater than 1, in s2, ends with the end of the currently matched substring and has a length from val[p] 1 to temp. The string must only appear once in s2. Then enumerate j, from val[p] 1 to temp. If it is in s1, ending with i, the substring of length j only appears once (that is, ok[i-j 1][i] == 1), Then this j may become the answer, just use it to update ans.
Code:
#include<stdio.h>#include<string.h>#include<algorithm>using namespace std ;const int maxn = 5001 ;bool ok[maxn][maxn] ;int ans = 111111 ;struct SAM { int fa[maxn<<1] , val[maxn<<1] , c[26][maxn<<1] ; int cnt[maxn<<1] ; int tot , last ; int ws[maxn<<1] , wv[maxn<<1] ; inline int new_node ( int _val ) { val[++tot] = _val ; for ( int i = 0 ; i < 26 ; i ++ ) c[i][tot] = 0 ; cnt[tot] = fa[tot] = 0 ; return tot ; } void add ( int k ) { int p = last , i ; int np = new_node ( val[p] + 1 ) ; while ( p && !c[k][p] ) c[k][p] = np , p = fa[p] ; if ( !p ) fa[np] = 1 ; else { int q = c[k][p] ; if ( val[q] == val[p] + 1 ) fa[np] = q ; else { int nq = new_node ( val[p] + 1 ) ; for ( i = 0 ; i < 26 ; i ++ ) c[i][nq] = c[i][q] ; fa[nq] = fa[q] ; fa[q] = fa[np] = nq ; while ( p && c[k][p] == q ) c[k][p] = nq , p = fa[p] ; } } last = np ; } void init () { tot = 0 ; last = new_node ( 0 ) ; } void SORT () { for ( int i = 0 ; i < maxn ; i ++ ) wv[i] = 0 ; for ( int i = 1 ; i <= tot ; i ++ ) wv[val[i]] ++ ; for ( int i = 1 ; i < maxn ; i ++ ) wv[i] += wv[i-1] ; for ( int i = 1 ; i <= tot ; i ++ ) ws[wv[val[i]]--] = i ; } void get_cnt ( char *s , int n ) { SORT () ; int now = 1 , i ; memset ( cnt , 0 , sizeof ( cnt ) ) ; for ( i = 1 ; i <= n ; i ++ ) { int k = s[i] - 'a' ; now = c[k][now] ; cnt[now] ++ ; } for ( i = tot ; i >= 1 ; i -- ) { now = ws[i] ; cnt[fa[now]] += cnt[now] ; } } void gao ( char *s , int n ) { get_cnt ( s , n ) ; int now = 1 , i , j ; for ( i = 1 ; i <= n ; i ++ ) { int k = s[i] - 'a' ; now = c[k][now] ; int p = now ; while ( fa[p] && cnt[p] == 1 ) p = fa[p] ; for ( j = 1 ; j <= i - val[p] ; j ++ ) ok[j][i] = 1 ; } } void work ( char *s , int n ) { int temp = 0 , now = 1 , i , j ; for ( i = 1 ; i <= n ; i ++ ) { int k = s[i] - 'a' ; if ( c[k][now] ) { temp ++ ; now = c[k][now] ; int p = now ; while ( fa[p] && cnt[p] == 1 ) p = fa[p] ; for ( j = val[p] + 1 ; j <= temp ; j ++ ) if ( ok[i-j+1][i] ) { ans = min ( ans , j ) ; break ; } } else { while ( now && !c[k][now] ) now = fa[now] ; if ( !now ) now = 1 , temp = 0 ; else { temp = val[now] + 1 ; now = c[k][now] ; int p = now ; while ( fa[p] && cnt[p] == 1 ) p = fa[p] ; for ( j = val[p] + 1 ; j <= temp ; j ++ ) if ( ok[i-j+1][i] ) { ans = min ( ans , j ) ; break ; } } } } }} ac ;char s1[maxn] , s2[maxn] ;int main () { scanf ( "%s" , s1 + 1 ) ; ac.init () ; int n = strlen ( s1 + 1 ) , i , j ; for ( i = 1 ; i <= n ; i ++ ) ac.add ( s1[i] - 'a' ) ; ac.gao ( s1 , n ) ; scanf ( "%s" , s2 + 1 ) ; ac.init () ; int m= strlen ( s2 + 1 ) ; for ( i = 1 ; i <= m ; i ++ ) ac.add ( s2[i] - 'a' ) ; ac.get_cnt ( s2 , m ) ; ac.work ( s1 , n ) ; if ( ans == 111111 ) puts ( "-1" ) ; else printf ( "%d\n" , ans ) ; return 0 ;}