Longest common subsequence & subarray & substring

LeetCode 718. Maximum Length of Repeated Subarray

问题描述

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
Given two integer arrays A and B, return the maximum length of an subarray that appears in both arrays.

Example 1:

Input:
A: [1,2,3,2,1]
B: [3,2,1,4,7]
Output: 3
Explanation:
The repeated subarray with maximum length is [3, 2, 1].


Note:

1 <= len(A), len(B) <= 1000
0 <= A[i], B[i] < 100

思路

这道题给了我们两个数组A和B,让返回连个数组的最长重复子数组。那么如果将数组换成字符串,实际这道题就是求 Longest Common Substring 的问题了。注意需要跟最长子序列 Longest Common Subsequence 区分开。 既然是子数组,那么重复的地方一定是连续的。对于这种求极值的问题,动态规划 Dynamic Programming 一直都是一个很好的选择,这里使用一个二维的 DP 数组,其中 dp[i][j] 表示数组A的前i个数字和数组B的前j个数字的最长子数组的长度,如果 dp[i][j] 不为0,则A中第i个数组和B中第j个数字必须相等,比对于这两个数组 [1,2,2] 和 [3,1,2],dp 数组为:

1
2
3
4
  3 1 2
1 0 1 0
2 0 0 2
2 0 0 1
  • dp[i][j] := max length of (A[0:i], B[0:j])
  • dp[i][j] = dp[i – 1][j – 1] + 1 if A[i-1] == B[j-1] else 0
  • Time complexity: O(m*n)
  • Space complexity: O(m*n) -> O(n)

code

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
class Solution:
def findLength(self, A: List[int], B: List[int]) -> int:

# A --> j
# B --> i

if not A or not B:
return 0


dp = [[0]*len(A) for i in range(len(B))]
res = 0

for j in range(len(A)):
if A[j] == B[0]:
dp[0][j] = 1
res = max(res,dp[0][j])

for i in range(len(B)):
if B[i] == A[0]:
dp[i][0] = 1
res = max(res,dp[i][0])


for i in range(1,len(A)):
for j in range(1,len(B)):
if B[i] == A[j]:
dp[i][j] = dp[i-1][j-1] + 1
res = max(res,dp[i][j])

return res

Leetoce 1143. Longest Common Subsequence

A subsequence of a string is a new string generated from the original string with some characters(can be none) deleted without changing the relative order of the remaining characters. (eg, “ace” is a subsequence of “abcde” while “aec” is not). A common subsequence of two strings is a subsequence that is common to both strings.

问题描述

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
Given two strings text1 and text2, return the length of their longest common subsequence.

A subsequence of a string is a new string generated from the original string with some characters(can be none) deleted without changing the relative order of the remaining characters. (eg, "ace" is a subsequence of "abcde" while "aec" is not). A common subsequence of two strings is a subsequence that is common to both strings.



If there is no common subsequence, return 0.



Example 1:

Input: text1 = "abcde", text2 = "ace"
Output: 3
Explanation: The longest common subsequence is "ace" and its length is 3.
Example 2:

Input: text1 = "abc", text2 = "abc"
Output: 3
Explanation: The longest common subsequence is "abc" and its length is 3.
Example 3:

Input: text1 = "abc", text2 = "def"
Output: 0
Explanation: There is no such common subsequence, so the result is 0.


Constraints:

1 <= text1.length <= 1000
1 <= text2.length <= 1000
The input strings consist of lowercase English characters only.

思路

  • dp[i][j] stands for length of LCS between text1 up to i and text2 up to j.
  • dp[i][j] = dp[i-1][j-1] + 1 if text1[i] == text2[j]
  • Otherwise, dp[i][j] = max(dp[i][j-1], dp[i-1][j])

code

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
class Solution:
def longestCommonSubsequence(self, text1: str, text2: str) -> int:
if not text1 or not text2:
return 0

dp = [[0] * len(text1) for i in range(len(text2))]
## text1 -->j; text2-->i

flag = False
for i in range(len(text2)):
if text2[i] == text1[0]:
flag = True
if flag:
dp[i][0] = 1

flag = False
for j in range(len(text1)):
if text1[j] == text2[0]:
flag = True
if flag:
dp[0][j] = 1

for i in range(1,len(text2)):
for j in range(1,len(text1)):
if text2[i] == text1[j]:
dp[i][j] = dp[i-1][j-1] + 1
else:
dp[i][j] = max(dp[i-1][j],dp[i][j-1])

return dp[i][j]

Leetoce 583. Delete Operation for Two Strings

问题描述

1
2
3
4
5
6
7
8
9
10
11
Given two words word1 and word2, find the minimum number of steps required to make word1 and word2 the same, where in each step you can delete one character in either string.

Example 1:

Input: "sea", "eat"
Output: 2
Explanation: You need one step to make "sea" to "ea" and another step to make "eat" to "ea".
Note:

The length of given words won't exceed 500.
Characters in given words can only be lower-case letters.

思路

这题实际上是求两个字符串的 Longest Common Subsequence. 然后用两个字符串的长度和减去 2*LCS. DP的思路与上题一样。

code

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
class Solution:
def minDistance(self, word1: str, word2: str) -> int:
if not word1 or not word2:
same = 0
else:
dp = [[0] * len(word1) for i in range(len(word2))]
## text1 -->j; text2-->i

flag = False
for i in range(len(word2)):
if word2[i] == word1[0]:
flag = True
if flag:
dp[i][0] = 1

flag = False
for j in range(len(word1)):
if word1[j] == word2[0]:
flag = True
if flag:
dp[0][j] = 1

for i in range(1,len(word2)):
for j in range(1,len(word1)):
if word2[i] == word1[j]:
dp[i][j] = dp[i-1][j-1] + 1
else:
dp[i][j] = max(dp[i-1][j],dp[i][j-1])

same = dp[i][j]

return len(word1) + len(word2) - same*2

Leetcode 1092. Shortest Common Supersequence

问题描述

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
Given two strings str1 and str2, return the shortest string that has both str1 and str2 as subsequences.  If multiple answers exist, you may return any of them.

(A string S is a subsequence of string T if deleting some number of characters from T (possibly 0, and the characters are chosen anywhere from T) results in the string S.)



Example 1:

Input: str1 = "abac", str2 = "cab"
Output: "cabac"
Explanation:
str1 = "abac" is a subsequence of "cabac" because we can delete the first "c".
str2 = "cab" is a subsequence of "cabac" because we can delete the last "ac".
The answer provided is the shortest such string that satisfies these properties.


Note:

1 <= str1.length, str2.length <= 1000
str1 and str2 consist of lowercase English letters.

思路

code

class Solution:
    def shortestCommonSupersequence(self, str1: str, str2: str) -> str:

        if not str1 or not str2:
            return ""


        dp = [[0] * (len(str2)+1) for i in range(len(str1)+1)]

        # dp[i][j] -->  i: str1, j: str2

        for i in range(1,len(str1)+1):
            for j in range(1,len(str2)+1):
                if str1[i-1] == str2[j-1]:
                    dp[i][j] = dp[i-1][j-1] + 1
                else:
                    dp[i][j] = max(dp[i-1][j],dp[i][j-1])

        l1 = len(str1); l2 = len(str2)

        res = []

        while l1 or l2:
            if not l1:
                char = str2[l2-1]
                l2 -= 1
            elif not l2:
                char = str1[l1-1]
                l1 -= 1
            elif str1[l1-1] == str2[l2-1]:
                char = str1[l1-1]
                l1 -= 1
                l2 -= 1
            elif dp[l1-1][l2] == dp[l1][l2]:
                char = str1[l1-1]
                l1 -= 1
            elif dp[l1][l2-1] == dp[l1][l2]:
                char = str2[l2-1]
                l2 -= 1

            res.append(char)

        return "".join(res[::-1])

reference from https://zxi.mytechroad.com/blog/dynamic-programming/leetcode-1092-shortest-common-supersequence/

Donate article here