Questions tagged [bioinformatics]
Bioinformatics is the use of software tools to analyse biological data.
153 questions
12 votes, 4 answers, 1k views
Simple mutation simulation for use in science class
I've been giving my grade tens an introduction to computational phylogenomics, based around computing dissimilarity between DNA strands, and a simplified model of mutation. They've been exploring how ...
1 vote, 1 answer, 85 views
Needleman-Wunsch algorithm with affine gap cost
Needleman-Wunsch is a bioinformatics algorithm used to align two sequences. The algorithm outputs the score of the alignment and a Vec containing all operations needed to reconstruct the alignment. I do not ...
3 votes, 1 answer, 108 views
Finding repeats in DNA in R
I recently wrote some R code, which I don't normally do, to find repeats in DNA FASTA files.
Here is an example FASTA file:
...
6 votes, 2 answers, 245 views
Calculating probability of offspring having dominant phenotype given a random mating - Mendel's First Law
I'm a beginner to Python and have been working through the Rosalind problems. If you're unfamiliar with Rosalind, it's a website where you can practice bioinformatics coding through problem solving.
...
2 votes, 1 answer, 186 views
First order hidden Markov model with Viterbi algorithm in Java
Introduction
A first order HMM (hidden Markov model) is a tuple \$(H, \Sigma, T, E, \mathbb{P})\$, where \$H = \{1, \ldots, \vert H \vert\}\$ is the set of hidden states, \$\Sigma\$ is the set of ...
2 votes, 1 answer, 108 views
Semi-dynamic range minimum query (RMQ) tree in Java
Introduction
I have this semi-dynamic range minimum query (RMQ) tree in Java. It is called semi-dynamic because it cannot be modified after it is constructed. However, the values ...
6 votes, 1 answer, 106 views
Optimizing __getitem__ for a bioinformatics script in Python
I'm writing a script for a bioinformatics application in Python that iterates through several sequences looking for amino acids in specific positions to calculate their frequency in relation to a ...
5 votes, 1 answer, 109 views
Random FASTA file generator
This is my code to generate a FASTA file containing multiple records with randomized DNA sequences of distinct lengths. I am looking for feedback on how to write this script better.
...
2 votes, 2 answers, 131 views
Slow Bioinformatics algorithm - Clump finding algorithm in Haskell
I'm working on the famous clump finding problem to learn Haskell. Part of the problem involves breaking nucleotide sequences into subsequences, called k-mers, as follows:
...
6 votes, 1 answer, 297 views
Converting an RNA sequence into the amino acid code and introducing a mutation
I have recently started coding in Python and this is my first proper project.
The user enters an RNA sequence and the program converts the code into the corresponding amino acid sequence. The user can ...
1 vote, 1 answer, 119 views
Show different concentrations on scatter plots with different colors [closed]
I have an app which parses data files in the client's browser and displays a scatter chart. This is an example of a scatter chart with 10,000 data points:
This is an example of a scatter chart with 1....
1 vote, 1 answer, 80 views
Sensibly using fastq crate to modify fastq files
I'm learning Rust, and I have a simple program that I hope to use as a learning exercise. My goal here is to get a better idea of the proper way of doing things.
I'm trying to use the fastq crate to ...
5 votes, 1 answer, 446 views
Rust program to one hot encode genetic sequences from .fa files
I wanted to write some code that reads in a FASTA file and one-hot encodes the sequence, which is subsequently saved to a file. FASTA is a text-based file format commonly used in ...
2 votes, 1 answer, 495 views
Python FASTA parser using dictionaries without using BioPython or other external libraries
I am writing my own parser for the FASTA format. I can't use BioPython or anything else, because it's part of an assignment and our teacher wants us to try to do it manually.
For now, I have done this:
<...
4 votes, 1 answer, 727 views
Function to calculate the GC content variation in a sequence
I came across this BMC Genomics paper:
Analysis of intra-genomic GC content homogeneity within prokaryotes
And I implemented some Python functions to make this available as part of a personal project.
...
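For readers unfamiliar with the term, GC content is the fraction of bases in a sequence that are G or C. Below is a minimal Python sketch of one simple way to look at how it varies along a sequence, using fixed-size windows. It is not the method from the paper or the asker's code; the names `gc_content`, `windowed_gc`, `window`, and `step` are hypothetical, and the example sequence is made up.

```python
from statistics import pvariance


def gc_content(seq: str) -> float:
    """Return the fraction of G and C bases in seq (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def windowed_gc(seq: str, window: int, step: int) -> list[float]:
    """GC content of each full-length window, moving `step` bases at a time."""
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    # Trailing bases that do not fill a whole window are ignored (an assumption).
    return [
        gc_content(seq[start:start + window])
        for start in range(0, len(seq) - window + 1, step)
    ]


if __name__ == "__main__":
    example = "ATGCGCGCATATATGGCC"  # made-up sequence for illustration
    values = windowed_gc(example, window=6, step=6)
    print(values)             # per-window GC fractions
    print(pvariance(values))  # population variance across windows
```

The population variance of the per-window values is just one possible summary of variation; a real analysis would likely need to handle ambiguous bases such as N and choose the window size with care.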