Bioinformatics

Go to the Rosalind website and enter the Bioinformatics Stronghold. The Stronghold introduces bioinformatics through a series of problems. Each problem comes with some explanation of the biology behind it, followed by a task that you should solve with a Scala program.

You will need to log in to the Rosalind website. You can do this with a Google, Twitter, or Facebook account, or you can create a new Rosalind account. You have to start with problem DNA. Once you have solved it by uploading a correct solution to the Rosalind website, the system lets you work on the next problem, RNA. After a few problems, several will be open to you at the same time.

See how far you can get in today's lab!

For each problem, the Rosalind system will give you a data file called rosalind_xxx.txt, where xxx is the name of the problem. Write a Scala script xxx.scala that solves the problem and writes the answer to a file solution_xxx.txt. You can then run your script as follows (here for the problem dna):

> scala dna.scala rosalind_dna.txt

Check the "Writing text files" documentation to see how to write your output to a text file.
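As a rough sketch of what writing the answer to a file could look like, the helper below uses java.io.PrintWriter from the Java standard library. The name writeSolution is made up for this illustration, and the "Writing text files" documentation may recommend a different approach, so treat this as one possibility rather than the required method.

```scala
import java.io.PrintWriter

// Hypothetical helper (not part of the course material):
// writes a single answer line to the file with the given name.
def writeSolution(fileName: String, answer: String): Unit = {
  val out = new PrintWriter(fileName)
  try out.println(answer)   // write the answer followed by a newline
  finally out.close()       // always close the file, even if writing fails
}
```

The file name is passed in as a parameter, so the helper can be reused unchanged for every problem.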

The Rosalind data files consist of one or several lines, each containing a string (such as a DNA, RNA, or protein string). An easy way to read these strings is as follows (here for the HAMM problem, where the input consists of two strings of the same length):

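// Open the data file whose name is given as the first command-line argument
val source = scala.io.Source.fromFile(args(0))
// Read all lines into an array before closing the file
val lines = source.getLines().toArray
source.close()

// The HAMM input consists of exactly two lines
assert(lines.length == 2)

val s = lines(0)
val t = lines(1)

// Both strings must have the same length
assert(s.length == t.length)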
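Once the two strings are read, the remaining work for a problem like HAMM is a short computation on s and t. The sketch below assumes the task is to count the positions at which the two strings differ (their Hamming distance); the function name hammingDistance is made up for this illustration, and you should check the problem statement on Rosalind for the exact task and output format.

```scala
// Hypothetical helper: number of positions at which s and t differ.
// Assumes both strings have the same length.
def hammingDistance(s: String, t: String): Int = {
  require(s.length == t.length, "strings must have the same length")
  // Pair up the characters position by position and count the mismatches
  s.zip(t).count { case (a, b) => a != b }
}
```

For example, hammingDistance("GAGCC", "GATCA") evaluates to 2, because the strings differ in the third and fifth positions.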