I'm trying to compare individual characters in a string in Python and I'm not sure how to do it. In a file of strings, every string belongs to a group. I want to determine whether 75% of the strings in a group have the same character at a given position and, if so, delete all of the strings that were compared against the original (first) string of that group.
I'm thinking of something like the following, comparing the second character of the word (the 'i' in big versus the 'u' in bug):
count=0
group1_big
group1_big
group1_bigs
group1_bugs
group2_bug
for (string in file)
    if (chars 1-7 of string == chars 1-7 of next string and char 9 is the same in both words)
        if (75% are the same at position 9)
            delete all other strings in the same group
In this case, if we compare chars 1-7, all of the group1 strings match, and 75% of them have an 'i' at character position 9, so all but the first one should be deleted. That results in the following file output:
group1_big
group2_bug
group1_big group1_big group1_bigs group1_bugs1 group1_bugs2 group1_bug3 group2_bug
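To make the intended behaviour concrete, here is a minimal sketch of the logic described above. It rests on several assumptions that are not settled in the question: the group is identified by the first 7 characters, the position is 1-based, "75%" means at least 75% of the whole group (including the first string) share the most common character, and a group that does not reach the threshold is left untouched. The names `filter_groups`, `prefix_len`, `pos`, and `threshold` are made up for illustration.

```python
from collections import Counter


def filter_groups(lines, prefix_len=7, pos=9, threshold=0.75):
    """Collapse a group to its first string when at least `threshold` of
    its strings share the same character at 1-based position `pos`."""
    # Group strings by their prefix; dicts keep insertion order (Python 3.7+),
    # so groups come out in the order they first appear in the input.
    groups = {}
    for line in lines:
        line = line.strip()
        if line:
            groups.setdefault(line[:prefix_len], []).append(line)

    result = []
    for members in groups.values():
        # Character at the 1-based position; strings that are too short are
        # skipped here but still count toward the group size below.
        chars = [s[pos - 1] for s in members if len(s) >= pos]
        if chars:
            top_count = Counter(chars).most_common(1)[0][1]
            if top_count / len(members) >= threshold:
                result.append(members[0])  # keep only the first string
                continue
        result.extend(members)  # assumption: below threshold, keep everything
    return result


sample = ["group1_big", "group1_big", "group1_bigs", "group1_bugs", "group2_bug"]
print(filter_groups(sample))  # ['group1_big', 'group2_bug']
```

To run it on a file, the lines could be passed in directly, for example `filter_groups(open("strings.txt"))`, where `strings.txt` is a placeholder file name. Grouping with a dict rather than comparing each string to the next one means the groups do not have to be contiguous in the file; if they always are, `itertools.groupby` would be an alternative.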