What do you think is the best language for Bioinformatics? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.

Closed 6 years ago.

I have done a couple of research jobs in bioinformatics and used Matlab for them. Matlab had a lot of powerful tools and was easy to use. I did things with genome sequencing and predicting metabolic pathways. I am wondering what other people think is best. There might not be one specific language, but rather a few that lend themselves to bioinformatics work that is math-heavy and deals with large amounts of data.


You'll likely be interested in this thread over at BioStar:

  • Which are the best programming languages to study for a bioinformatician?

For most of us bioinformaticians, this includes Python, R, Perl, and bash command line utilities (like sed, awk, cut, sort, etc). There are also people who code in Java, Ruby, C++, and Matlab.

So the bottom line? Whichever language lets you get the work done most easily is the right one for you. Answering this question should include a careful survey of the libraries and other code that you can pull from, as well as your own preferences and experience. If you're doing microarray analysis, it's hard to beat the R/Bioconductor libraries, but R is absolutely the wrong language for someone wrangling most types of large sequencing data sets.


There's no one right language for bioinformatics.

  • The important BLAST sequence-alignment tool is written in C++

  • The MATT tool for aligning protein structures is written in C

  • Some of my colleagues in computational biology use Ruby.

In general, I see a lot of C and C++ for performance-critical code and a lot of scripting languages otherwise.


Python + SciPy are decent (and free).

http://www.vetta.org/2008/05/scipy-the-embarrassing-way-to-code/

http://www.google.com/search?hl=en&source=hp&q=python+bioinformatics&aq=0&aqi=g9g-m1&aql=&oq=python+bio&gs_rfai=CeE1nPpMNTN2IJZ-yMZX6pcIKAAAAqgQFT9DLSgo

You do not really even need to learn new syntax when dropping Matlab for SciPy.


Best or not, SAS is the de facto programming environment in biopharma. If you were to work in bioinformatics for the Pfizers, Mercks and Bayers of the world, you had better have SAS skills. SAS programmers are in great demand.


What's the "best" language is both subjective and potentially different from task to task, but for bioinformatic work, I personally use R, Perl, Delphi and C (quite frequently a combination of several of these).


I work mainly with HMMs and protein sequences. I started out writing in C, but have since switched to Python, which I'm happy with. I find it easier to prototype something quickly in Python, and the resulting code is easier to maintain.
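
To give a feel for what quick prototyping looks like here, below is a rough sketch of Viterbi decoding for a discrete HMM in plain Python. It is only an illustration, not the code I actually use: the function name `viterbi`, the two-state "core"/"surface" model, and every probability in it are made-up assumptions.

```python
import math


def _log(p):
    """Natural log that maps zero probability to negative infinity."""
    return math.log(p) if p > 0 else float("-inf")


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return (most probable hidden-state path, its log-probability)."""
    if not observations:
        return [], 0.0
    # best[s]: log-probability of the best path ending in state s
    # paths[s]: the state sequence of that best path
    best = {s: _log(start_p[s]) + _log(emit_p[s][observations[0]]) for s in states}
    paths = {s: [s] for s in states}
    for obs in observations[1:]:
        new_best, new_paths = {}, {}
        for s in states:
            # Pick the predecessor state that maximizes the path score into s.
            prev = max(states, key=lambda r: best[r] + _log(trans_p[r][s]))
            new_best[s] = best[prev] + _log(trans_p[prev][s]) + _log(emit_p[s][obs])
            new_paths[s] = paths[prev] + [s]
        best, paths = new_best, new_paths
    final = max(states, key=lambda s: best[s])
    return paths[final], best[final]


# Hypothetical toy model: residues reduced to hydrophobic ("h") or polar ("p").
# All numbers below are arbitrary and chosen only for illustration.
toy_states = ("core", "surface")
toy_start = {"core": 0.5, "surface": 0.5}
toy_trans = {
    "core": {"core": 0.8, "surface": 0.2},
    "surface": {"core": 0.3, "surface": 0.7},
}
toy_emit = {
    "core": {"h": 0.7, "p": 0.3},
    "surface": {"h": 0.2, "p": 0.8},
}
toy_path, toy_logp = viterbi(list("hhhppph"), toy_states, toy_start, toy_trans, toy_emit)
print(toy_path, toy_logp)
```

Working in log space avoids underflow on long sequences. The same algorithm in C would need explicit memory management for the score table and path bookkeeping, which is most of what I was glad to leave behind.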


Here's a freely available academic paper on the subject that evaluates several languages across different tasks: http://www.biomedcentral.com/1471-2105/9/82

They grouped 6 commonly used languages into 3 different levels:

  • 2 compiled languages: C, C++
  • 2 semi-compiled languages: C#, Java
  • 2 interpreted languages: Perl, Python

Some general conclusions:

  1. Compiled languages outperformed interpreted languages in global alignments and Neighbour-Joining programs
  2. Interpreted languages generally used more memory
  3. All languages performed roughly the same for BLAST computations, except for Python
  4. Compiled languages require more written lines of code to perform the same tasks
  5. Compiled languages tend to be better for algorithm implementation
  6. Interpreted languages tend to be better for file parsing/manipulation
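
To make the two kinds of task in that list concrete, here is a minimal Python sketch of each: parsing FASTA text (file manipulation) and computing a Needleman-Wunsch global alignment score (algorithm implementation). This is not the paper's benchmark code. The function names `parse_fasta` and `global_alignment_score` and the scoring scheme (match +1, mismatch -1, linear gap -1) are my own assumptions for illustration.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a dict mapping record ID to sequence."""
    records = {}
    current_id = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Header line: use the first whitespace-delimited token as the ID.
            fields = line[1:].split()
            current_id = fields[0] if fields else ""
            records[current_id] = []
        elif current_id is not None:
            records[current_id].append(line)
    return {rid: "".join(parts) for rid, parts in records.items()}


def global_alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch global alignment score with a linear gap penalty.

    Keeps only two rows of the dynamic-programming table, so memory is
    O(len(b)) while time remains O(len(a) * len(b)).
    """
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


# Illustrative input only; the sequences are made up.
example = ">seq1 first record\nGATT\nACA\n>seq2\nGATTACA\n"
seqs = parse_fasta(example)
print(global_alignment_score(seqs["seq1"], seqs["seq2"]))  # identical sequences -> 7
```

The parser is a few lines of string handling, which is where scripting languages are comfortable. The nested loop in the alignment function is the part that gets slow in an interpreter on long sequences, and is the kind of code people typically move to C or C++.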

Here's another good free academic article discussing ways to build bioinformatics skills: http://dx.plos.org/10.1371/journal.pcbi.1000589
