Add "Faulty premise" block
user272752

Faulty premise

Consider an unlikely but plausible edge case.
In the first mutation, one particular base is replaced; say, 'A' -> 'T'. In an unlikely sequence of events (but one that a fair RNG can produce), only that same character position is ever replaced again:
T -> G -> C -> T -> C -> G -> C -> G -> T ...
That run of the simulation would flat-line, indicating only one mutation ('A' -> something other than 'A'), even though the sequence changes in every generation. Less extreme cases would similarly yield misleading values (perhaps marginally less so) and would still corrupt any statistical results.
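
A minimal sketch of that edge case, assuming the simulation measures "mutations" as the number of positions that differ from the ancestor (a Hamming distance). The names `hamming`, `substitute`, `worst_case` and `random_walk` are made up for illustration and are not taken from the code under review:

```python
import random

BASES = "ACGT"


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def substitute(seq, site, new_base):
    """Return a copy of seq with the base at index `site` replaced."""
    return seq[:site] + new_base + seq[site + 1:]


def worst_case(ancestor, site, chain):
    """Apply every base in `chain` to the same site, one per generation.

    Returns (mutation_events_so_far, differences_from_ancestor) pairs.
    """
    seq = ancestor
    rows = []
    for events, base in enumerate(chain, start=1):
        seq = substitute(seq, site, base)
        rows.append((events, hamming(ancestor, seq)))
    return rows


def random_walk(ancestor, generations, rng):
    """One random substitution per generation; same (events, differences) pairs."""
    seq = ancestor
    rows = []
    for events in range(1, generations + 1):
        site = rng.randrange(len(seq))
        new_base = rng.choice([b for b in BASES if b != seq[site]])
        seq = substitute(seq, site, new_base)
        rows.append((events, hamming(ancestor, seq)))
    return rows


if __name__ == "__main__":
    # The chain from the text: the 'A' at site 0 becomes T, G, C, T, ...
    for events, observed in worst_case("ACGTACGT", 0, "TGCTCGCGT"):
        print(events, observed)
```

In the worst case the second column never moves off 1 while the first keeps climbing. In `random_walk` the observed difference can lag behind the true number of events whenever a site is hit more than once, and it can never exceed it.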

It looks as though less time should have been spent dealing with the complexities of Python and its arrays, lists and CSVs, and more thought given to how to produce data that would demonstrate the use for phylogenomics.
Oversimplification can blur important nuances out of existence.



EDIT:

I want to keep the code as simple and intuitive as possible, with few abstractions.

There has been some discussion in various comments about the orientation of the output (i.e. whether generations increment vertically or horizontally). The algorithm populates the data store in "row major" order, and no row depends on the data in any other row. Transposing the matrix and invoking a CSV-writing module for the transfer to Google Sheets makes things unnecessarily complicated.
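
A rough sketch of what keeping the row-major orientation could look like, assuming each row is one simulation run, each column is one generation, and every value is a plain number (so no quoting or escaping is needed). `one_run` is a hypothetical stand-in for the real simulation and `write_rows` is likewise an invented name:

```python
import random
import sys


def one_run(generations, rng):
    """Hypothetical stand-in for one simulation run.

    Produces a running count that grows by 0 or 1 per generation.
    """
    count = 0
    row = []
    for _ in range(generations):
        count += rng.randint(0, 1)
        row.append(count)
    return row


def write_rows(rows, out):
    """Write each row as one comma-separated line, in the order it was built."""
    for row in rows:
        out.write(",".join(str(value) for value in row) + "\n")


if __name__ == "__main__":
    rng = random.Random()
    rows = [one_run(10, rng) for _ in range(5)]  # 5 runs x 10 generations
    write_rows(rows, sys.stdout)
```

Each row is written as soon as it is complete, so nothing has to be transposed. If the columns are needed as rows, the spreadsheet can do that on its side after the import.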


To make this exercise more realistic (i.e. closer to Darwin's natural selection), although it becomes much more complex, consider teaching the kids about "codons". There could be arbitrary 'good', 'bad' and 'neutral' (aka "junk DNA") codons in a longer "string" of DNA. Mutation still happens, but each generation might require, say, more than 3 'good' codons and fewer than 2 'bad' codons ('neutral' codons have no impact) to survive into the next generation (compare John Conway's "Game of Life"). A 'bad' or 'neutral' codon may mutate to become 'good', or vice versa. Too few 'good' codons or too many 'bad' ones means the lineage dies out and the 'molecular clock' stops (aka "extinction").
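
One way the survival rule might be sketched, purely as an illustration. The 'good' and 'bad' codon sets below are arbitrary picks with no biological meaning, and `codons`, `survives` and `run_lineage` are hypothetical names:

```python
import random

BASES = "ACGT"
GOOD = {"ATG", "GCC"}  # arbitrary 'good' codons (assumption)
BAD = {"TAA", "TGA"}   # arbitrary 'bad' codons (assumption)
# Every other codon counts as 'neutral'.


def codons(seq):
    """Split seq into consecutive triplets, ignoring any leftover bases."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def survives(seq):
    """Rule from the text: more than 3 'good' and fewer than 2 'bad' codons."""
    triplets = codons(seq)
    good = sum(c in GOOD for c in triplets)
    bad = sum(c in BAD for c in triplets)
    return good > 3 and bad < 2


def run_lineage(seq, max_generations, rng):
    """Mutate one base per generation until the lineage fails the rule.

    Returns the number of generations survived; the 'clock' stops there.
    """
    for generation in range(max_generations):
        if not survives(seq):
            return generation
        site = rng.randrange(len(seq))
        new_base = rng.choice([b for b in BASES if b != seq[site]])
        seq = seq[:site] + new_base + seq[site + 1:]
    return max_generations


if __name__ == "__main__":
    start = "ATG" * 5 + "CCC" * 5  # 5 'good' codons, 5 'neutral'
    print(run_lineage(start, 100, random.Random()))
```

The thresholds and codon sets are knobs to play with; different choices change how quickly lineages go extinct.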

