Buffl

Definitionen

CS
by Carina S.

Use of the Burrows-Wheeler transform for searching for patterns in strings (11)

(nochmal anschauen als YT-Video z.B.)

  • Find occurrences of pattern P (aca) within a string S (acaaca$)

  • S’: Burrows-Wheeler Transform of S

  • Any pattern P that appears in S is the prefix of one of the suffixes

  • P is the prefix of suffixes 4 and 5

  • The Borrws-Wheeler Transform provides an index of the suffix array, that obviates the need to search for lines beginning with aca in the sorted suffix array of acaaca$

  • Assign to each alphabetic character in S a rank that specifies the number of times that character occurs previously in S

  • Attach the rank to each alphabetic character in the original string as a superscript

  • Compute the Burrows-Wheeler Matrix: the i-th occurrence of any character in the last column has the same rank as its i-th occurrence in the first column (the Burrows_Wheeler Transform of the original string)

  • Search for aca backwards:

    • starting with the final a: rows 2-5 begin with a

    • the character preceding the final a is c

    • the character in column 1 is preceded in the full string by the character in column 7 in the same row

    • for rows 2 and 3, the a in column 1 corresponds to the c in column 7, with ranks 0 and 1

    • we now know that the first and second occurrences of c are part of a ca substring

    • we also know that it is the first two occurrences of c (ranks 0 and 1) that vegin with ca

    • both have an a in column 7, completing the pattern aca

    • the corresponding ranks are 2 and 3, indicating that the two occurrences of aca begin with the first and third appearances of a^


Author

Carina S.

Information

Last changed