Official

F - SSttrriinngg in StringString Editorial by en_translator


We perform binary search over \(k\). From now on, we fix \(k\) and consider how to determine if \(g(T,k)\) is a substring of \(f(S,N)\).

Let \(s\) be the length of \(S\) and \(t\) of \(T\), and \(X[:i]\) denote the first \(i\) characters of a string \(X\). We assume that every character in \(T\) is contained in \(S\) as well (otherwise, the answer is obviously \(0\)).

Let us find the following value for \(i=1,2,\dots,t\) in order:

  • \(\mathrm{iter}_i\): the minimum \(j\) such that \(g(T[:i],k)\) is a substring of \(f(S,N)[:j]\)

Eventually, the answer is obtained by comparing \(\mathrm{iter}_t\) from \(N\times s\). Find \(\mathrm{iter}_{i+1}\) based on \(\mathrm{iter}_i\) is equivalent to find the position of the \(k\)-th occurrence of the \((i+1)\)-th character of \(T\) in \(f(S,N)\) after its \((\mathrm{iter}_i+1)\)-th character (including itself), so this problem is boiled down to processing the following query \(k\) times:

  • \(\mathrm{query}(a,b,c)\): given positive integers \(a\) and \(b\) and a character \(c\), find the position of the \(b\)-th occurrence of \(c\) in \(f(S,N)\) after its \(a\)-th character (including itself).

Suppose that \(c\) occurs \(\mathrm{cnt}_c\) times in \(S\). If \(a\) is greater than \(s\), it is reduced to the case where \(a\leq s\) using \(\mathrm{query}(a,b,c)=\mathrm{query}(a-s,b,c)+s\). If \(b\) is greater than \(\mathrm{cnt}_c\), it is reduce to the case where \(b\leq \mathrm{cnt}_c\) using \(\mathrm{query}(a,b,c)=\mathrm{query}(a+s,b-\mathrm{cnt}_c,c)\).

Therefore, it is sufficient to process the query with \(a\leq s,\ b\leq \mathrm{cnt}_c\). If these conditions are met, the answer of \(\mathrm{query}(a,b,c)\) is not greater than \(2s\). By precalculating for each character the position of its occurrence in \(f(S,2)\), the query can be processed fast.

Therefore, including the binary search over \(k\) mentioned first, the problem can be solved in a total of \(O(s\sigma+t\log Ns)\) or \(O(s+\sigma+t\log s\log Ns)\) time (where \(\sigma\) is the size of alphabet).

Sample code (C++):

#include <bits/stdc++.h>

using namespace std;

using ll = long long;

const ll inf = 1LL << 60;

int main() {
    ll n;
    string s, t;
    cin >> n >> s >> t;
    int sl = s.size(), tl = t.size();
    vector<vector<int>> pos(26);
    for (int i = 0; i < sl * 2; i++) {
        pos[s[i % sl] - 'a'].push_back(i);
    }
    vector pre(sl + 1, vector<int>(26)); // the number of occurremces of j in the first i characters of S
    for (int i = 0; i < sl; i++) {
        pre[i + 1] = pre[i];
        ++pre[i + 1][s[i] - 'a'];
    }
    vector<int> cnt(26);
    for (int i = 0; i < 26; i++) cnt[i] = pos[i].size() / 2;
    ll ok = 0, ng = inf;
    auto f = [&](ll x) -> bool {
        ll iter = 0;
        for (int i = 0; i < tl; i++) {
            int c = t[i] - 'a';
            if (!cnt[c]) return false;
            ll r = (x - 1) % cnt[c] + 1;
            ll q = (x - r) / cnt[c];
            if (q > inf / sl) return false;
            iter += sl * q;
            int nx = pos[c][pre[iter % sl][c] + r - 1]; // the position of the r-th occurrence of c after the (iter % sl)-th character
            iter += nx + 1 - iter % sl;
            if (iter > n * sl) return false;
        }
        return true;
    };
    while (ng - ok > 1) {
        ll mid = (ok + ng) / 2;
        if (f(mid)) ok = mid;
        else ng = mid;
    }
    cout << ok << endl;
}

posted:
last update: