jazzpeh

Convert One String to Another Problem

Jazz Peh, May 11, 2020

Given strings s1 and s2, convert s2 into s1 by deleting, inserting or replacing characters. Find out the minimum number of edit operations.

Example 1
-----------
s1: "catch"
s2: "carch"
output: 1
Explanation: Replace 'r' with 't'.
Example 2
-----------
s1: "table"
s2: "tbres"
Explanation: Insert 'a' at second position, replace 'r' with 'l' and delete 's'.

Solution

This problem can be solved by applying Divide and Conquer to break down the problem into smaller subproblems. Thus, by solving these subproblems, we will arrive at our final answer. Let’s look at Example 2 and try to derive the subproblems.

Delete
---------
s1 = "table"  --|-- f(2,3) ---|--- min
s2 = "tgable" --|             |
                              |
Insert                        |
---------                     |
s1 = "table" --|-- f(3,2)   --|
s2 = "tble"  --|              |
                              |
Replace                       |
---------                     |
s1 = "table" --|-- f(3,3)   --|
s2 = "tcble" --|              |

So, what the derived above subproblems, we know that there will be 3 operations, Insert, Replace and Delete. Hence, we explore how each operations will then move on to the next subproblem.

If it is a delete operation, we will stay on the same index for s1 and increment the index for s2 and hence, f(2,3).

For an insert operation, we should increment the index for s1 and stay on the same index for s2 since logically the length of s2 will increase by 1 and hence, f(3,2).

For replace operation, we should increment both the indexes for s1 and s2 since we then assume the current character is now identical after the replacement.

Lastly, getting the minimum of the 3 subproblems will get us to the final answer.

Code Algortihm

function minOps(s1: string, s2: string, i1: number, i2: number): void {
  if (i1 == s1.length) return s2.length - i2;

  if (i2 == s2.length) return s1.length - i1;

  if (s1[i1] === s2[i2]) return minOps(s1, s2, i1 + 1, i2 + 1);

  const insertOp = 1 + minOps(s1, s2, i1 + 1, i2);
  const replaceOp = 1 + minOps(s1, s2, i1 + 1, i2 + 1);
  const deleteOp = 1 + minOps(s1, s2, i1, i2 + 1);

  return Math.min(insertOp, replaceOp, deleteOp);
}

Optimisation

Let’s look at the an example of the recursion tree see if we need to optimise this solution.

                  minOps(0,0)
      /               |                   \
minOps(1,0)        minOps(0,1)        minOps(1,1)
  |                   |                   |
  |-minOps(2,0)      |-minOps(1,1)      |-minOps(2,1)
  |-minOps(1,1)      |-minOps(0,2)      |-minOps(1,2)
  |-minOps(2,1)      |-minOps(1,2)      |-minOps(2,2)

We are computing a new of the subproblems more than once such as minOps(1,1) and minOps(2,1). We can then apply Dynamic Programming to optimise the solution.

Top Down Approach

We will start off by applying Top Down Approach (also known as Memoization).

type HashTable = {
  [x: number]: {
    [y: number]: number;
  };
};

const dp: HashTable = {};

function minOps(s1: string, s2: string, i1: number, i2: number): number {
  if (!(i1 in dp) && !(i2 in dp[i1])) {
    if (!dp[i1]) {
      dp[i1] = {};
    }

    if (i1 === s1.length) {
      dp[i1][i2] = s2.length - i2;
    } else if (i2 === s2.length) {
      dp[i1][i2] = s1.length - i1;
    } else if (s1[i1] === s2[i2]) {
      dp[i1][i2] = minOps(s1, s2, i1 + 1, i2 + 1);
    } else {
      const insertOp = 1 + minOps(s1, s2, i1 + 1, i2);
      const replaceOp = 1 + minOps(s1, s2, i1 + 1, i2 + 1);
      const deleteOp = 1 + minOps(s1, s2, i1, i2 + 1);
      dp[i1][i2] = Math.min(insertOp, replaceOp, deleteOp);
    }
  }

  return dp[i1][i2];
}

Bottom Up Approach

We can now then apply some reverse engineer to use the Bottom Up Approach (also known as Tabulation). Let’s visualise how dp is filled using Top Down Approach.

B1B2B3B4B5B6
TABLE
A1TA2B2????
A2B?1+min(A3B2,A2B3,A3B3)A3B4??
A3R?1+min(A4B2,A3B3,A4B3)1+min(A5B4,A4B5,A5B5)1+min(A4B4,A3B5,A4B5)1+Min(A4B5,A3B6,A4B6)3
A4E?1+min(A5B2,A5B3,A5B3)1+min(A5B3,A4B4,A5B4)1+min(A5B4,A4B5,A5B5)A5B62
A5S?1+min(A6B2,A6B3,A6B3)1+min(A6B3,A5B4,A6B4)1+min(A6B4,A5B5,A6B5)1+min(A6B5,A5B6,A6B6)1
A643210

We then need to find where the first solution of the subproblem starts. And this is actually the A6B2 cell. And then the base solutions start to move to all the A6 and B6 cells. Finally, it slowlys move towards the subproblems that doesn’t have an answer and then to cell A1B1 where we will arrive at our final answer.

type HashTable = {
  [x: number]: {
    [y: number]: number;
  };
};

function minOps(s1: string, s2: string): number {
  const dp: HashTable = {};

  for (let i1 = 1; i1 < s1.length; ++i1) {
    if (!dp[i1]) {
      dp[i1] = {};
    }

    for (let i2 = 1; i2 < s2.length; ++i2) {
      if (s1[i1] === s2[i2]) {
        dp[i1][i2] = dp[i1 - 1][i2 - 1];
      } else {
        dp[i1][i2] = 1 + Math.min(dp[i1 - 1][i2], dp[i1][i2 - 1], dp[i1 - 1][i2 - 1]);
      }
    }
  }

  return dp[s1.length -1][s2.length - 1];
}

Explore more like this

© 2020 Jazz Peh, Built with Gatsby