Problem G
Match-Finding in DNA Sequences
The bases that comprise a strand of deoxyribonucleic acid (DNA) are commonly represented by the symbols A, C, G and T, and the strand itself by a string like
TACTGATCACGTGATCGAG
A researcher from the startup GeniFreeze has hired you to write code that will solve a particular problem that the team believes may lead to a cure for hypothermia.
They will provide you with a strand (in the form of a sequence like that shown above) and a list of markers (each of which is a genetic subsequence). Your program should report whether the markers occur in the strand, in the order given by the list, with non-empty substrings between consecutive markers that are all equal.
For example, if they send the strand shown above, and the list of markers
AC ACG GA
then your program should report that yes, the strand contains those markers, emboldened and separated by spaces here:
T **AC** TGATC **ACG** TGATC **GA** G
and the substrings between those emboldened markers all equal TGATC. A particular input may lead to multiple successful outputs, and each success should be reported.
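One reading of the task that is consistent with the three samples is that the markers must appear back to back in the strand, with the same non-empty gap between each consecutive pair. The following is a minimal Python sketch under that assumption. The function name `find_equal_gaps`, the choice to report each distinct gap only once, and the order of discovery (by start position, then by gap length) are illustrative assumptions and are not specified by the problem statement.

```python
def find_equal_gaps(markers, strand):
    """Return the distinct gaps g such that m1 + g + m2 + g + ... + mk
    occurs as a contiguous piece of the strand (assumed interpretation)."""
    if len(markers) < 2:
        return []  # no gap exists between fewer than two markers

    first = markers[0]
    fixed = sum(len(m) for m in markers)  # characters used by the markers
    n_gaps = len(markers) - 1
    found = []

    # Try every occurrence of the first marker as a starting point.
    start = strand.find(first)
    while start != -1:
        gap_begin = start + len(first)
        # Longest gap that still lets the whole pattern fit in the strand.
        max_gap = (len(strand) - start - fixed) // n_gaps
        for gap_len in range(1, max_gap + 1):
            gap = strand[gap_begin:gap_begin + gap_len]
            # gap.join(markers) builds m1 + gap + m2 + gap + ... + mk.
            if strand.startswith(gap.join(markers), start) and gap not in found:
                found.append(gap)
        start = strand.find(first, start + 1)

    return found
```

Under these assumptions, `find_equal_gaps(["AAG", "GGT", "TTG"], "TAAGAGGTATTGGTAGGTATTTTGC")` returns `["A", "AGGTATT"]`, which agrees with Sample Output 3.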
Input
The first line gives the subsequences that act as markers. Every input file has at least three markers. The second line gives the strand, which will have length at least three.
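The input format can be read with a few lines of Python. This is only a sketch: the helper name `parse_input` is hypothetical, and it assumes the markers on the first line are separated by whitespace, as they appear in the samples.

```python
def parse_input(text):
    """Split raw input into (markers, strand).

    Assumes line 1 holds whitespace-separated markers and line 2 holds
    the strand; blank lines are ignored.
    """
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    markers = lines[0].split()
    strand = lines[1]
    return markers, strand
```

For a submission, the text would typically come from standard input, for example `parse_input(sys.stdin.read())` after `import sys`.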
Output
If the program fails to find equal substrings between the markers, then the program should print:
Failed to find equal substrings between the markers.
If the program successfully finds equal substrings between the markers, then, for each successful outcome, the program should print:
Equal substrings found: <substring>
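The two output cases can be produced by a small formatting helper. The sketch below assumes the matching substrings have already been collected into a list; the names `FAILURE_MESSAGE` and `format_report` are hypothetical.

```python
FAILURE_MESSAGE = "Failed to find equal substrings between the markers."


def format_report(substrings):
    """Return the output lines for a list of matching substrings."""
    if not substrings:
        return [FAILURE_MESSAGE]
    # One line per successful outcome, in the order given.
    return [f"Equal substrings found: {s}" for s in substrings]
```

Printing the result with `print("\n".join(format_report(substrings)))` emits one line per successful outcome, or the single failure line when the list is empty.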
Sample Input 1 | Sample Output 1 |
---|---|
AC ACG GA | Equal substrings found: TGATC |
TACTGATCACGTGATCGAG | |

Sample Input 2 | Sample Output 2 |
---|---|
TTT AAA CCC G | Failed to find equal substrings between the markers. |
ACGTACGTACGTACGTACGTACGT | |

Sample Input 3 | Sample Output 3 |
---|---|
AAG GGT TTG | Equal substrings found: A |
TAAGAGGTATTGGTAGGTATTTTGC | Equal substrings found: AGGTATT |