Algorithm idea: slide a window over the document from left to right.
Use a queue to hold the keyword matches in the order they appear in the document, from index 0 to (doc.length - 1). For each new match, apply this rule: the keyword at the head of the queue must appear only once in the queue. If the new match has the same keyword as the head, remove the head and append the new match to the end of the queue. Once every keyword is present in the queue, compute the window size (index of the last match minus index of the first) and keep track of the minimum found so far. This way a candidate window is checked at every keyword match in the document.
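
The queue rule can be traced on a tiny input. This is only a minimal sketch: `QueueRuleSketch`, `step`, and the keyword names `a`, `b`, `c` are hypothetical and chosen for illustration. It shows the window contents after each match arrives.

```scala
object QueueRuleSketch {
  // Hypothetical helper: apply the head rule to one incoming (keyword, index) match.
  def step(window: Vector[(String, Int)], next: (String, Int)): Vector[(String, Int)] = {
    // Drop the head only if it has the same keyword as the new match.
    val trimmed = if (window.nonEmpty && window.head._1 == next._1) window.tail else window
    trimmed :+ next
  }

  def main(args: Array[String]): Unit = {
    val hits = Vector(("a", 0), ("b", 3), ("a", 5), ("c", 9))
    // scanLeft keeps every intermediate window; .tail skips the initial empty one.
    val windows = hits.scanLeft(Vector.empty[(String, Int)])(step).tail
    windows.foreach(w => println(w.mkString(" ")))
    // The last window is (b,3) (a,5) (c,9): the match (a,0) was dropped when (a,5) arrived.
  }
}
```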

1. Split the keywords on " " (a single space).
2. Find all matches of each keyword in the document and map them to their indexes as (keyword, index) pairs. Combine all pairs into a single list.
3. Sort the resulting list by index position.
4. Iterate over the sorted list from left to right, applying the queue rule described above.
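
Steps 1 to 3 can be sketched on their own as follows. `KeywordHitsSketch` and `keywordHits` are hypothetical names, and the sketch assumes case-insensitive substring matching, so a keyword such as "is" would also match inside a longer word.

```scala
import java.util.regex.Pattern

object KeywordHitsSketch {
  // Hypothetical helper: returns (keyword, startIndex) pairs sorted by index.
  def keywordHits(doc: String, query: String): List[(String, Int)] = {
    // Step 1: split the query on spaces, ignoring empty pieces and duplicates.
    val keywords = query.split(" ").filter(_.nonEmpty).toList.distinct
    keywords
      .flatMap { k =>
        // Step 2: quote the keyword so regex metacharacters are matched literally.
        val pattern = ("(?i)" + Pattern.quote(k)).r
        pattern.findAllMatchIn(doc).map(m => (k, m.start)).toList
      }
      .sortBy(_._2) // Step 3: order all matches by their position in the document.
  }

  def main(args: Array[String]): Unit = {
    println(keywordHits("MS x Is ms", "MS is")) // prints List((MS,0), (is,5), (MS,8))
  }
}
```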

Implementation of the idea using Scala:

import scala.collection.mutable.ListBuffer

object MinKWindowSum {
  def main(args: Array[String]): Unit = {
    val document = "MS(1) Awesome x y Is MS(5) x y MS(8) x Is Awesome x y z Awesome"
    println(getMinWindowSize(document, "MS is awesome"))
  }

  def getMinWindowSize(doc: String, s: String): Int = {
    val keywords = s.split(" ").toSet
    // All case-insensitive matches as (keyword, start index), sorted by index
    val idxs = keywords.toList
      .flatMap(k => ("(?i)\\Q" + k + "\\E").r.findAllMatchIn(doc).map(m => (k, m.start)))
      .sortBy(_._2)
    var min = Int.MaxValue
    var minI = 0
    var minJ = 0
    val currWindow = ListBuffer[(String, Int)]()
    for (tuple <- idxs) {
      // Head rule: drop the head if the new match has the same keyword
      if (currWindow.nonEmpty && currWindow.head._1 == tuple._1) currWindow.remove(0)
      currWindow += tuple
      // Once every keyword is in the window, compare its size with the best so far
      if (keywords.subsetOf(currWindow.map(_._1).toSet)) {
        val currMin = currWindow.last._2 - currWindow.head._2
        if (min > currMin) {
          min = currMin
          minI = currWindow.head._2
          minJ = currWindow.last._2
        }
      }
    }
    println("min = " + min + " ,i = " + minI + " j = " + minJ)
    min
  }
}


input: "MS(1) Awesome x y Is MS(5) x y MS(8) x Is Awesome x y z Awesome"
output: min = 15 ,i = 6 j = 21
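
One caveat worth checking: the rule removes at most one head entry per match, so the new head can still have a keyword that repeats later in the window. The sketch below is a variant and not the code as posted. It assumes the fix is to keep dropping head entries while their keyword occurs again in the window, and `TightWindowSketch` and `minWindow` are hypothetical names. The match list is written out by hand from the sample input.

```scala
import scala.collection.mutable

object TightWindowSketch {
  // Returns Some((size, start, end)) for the narrowest window found,
  // or None if some keyword never occurs.
  def minWindow(hits: List[(String, Int)], keywords: Set[String]): Option[(Int, Int, Int)] = {
    val counts = mutable.Map.empty[String, Int].withDefaultValue(0)
    val window = mutable.Queue.empty[(String, Int)]
    var best: Option[(Int, Int, Int)] = None
    for (hit <- hits) {
      window.enqueue(hit)
      counts(hit._1) += 1
      // Keep dropping head entries whose keyword occurs again later in the window.
      while (counts(window.head._1) > 1) {
        val (k, _) = window.dequeue()
        counts(k) -= 1
      }
      if (keywords.forall(counts(_) > 0)) {
        val size = window.last._2 - window.head._2
        if (best.forall(_._1 > size)) best = Some((size, window.head._2, window.last._2))
      }
    }
    best
  }

  def main(args: Array[String]): Unit = {
    // Sorted (keyword, index) matches for the sample input above
    val hits = List(("MS", 0), ("awesome", 6), ("is", 18), ("MS", 21),
                    ("MS", 31), ("is", 39), ("awesome", 42), ("awesome", 56))
    println(minWindow(hits, Set("MS", "is", "awesome"))) // prints Some((11,31,42))
  }
}
```

On the sample input this variant reports a window from index 31 to 42 (size 11), which is narrower than the 15 in the output above. The single-removal rule may therefore miss the true minimum on some inputs.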

- guilhebl June 18, 2017

