I need to search a fairly lengthy string for CPV (Common Procurement Vocabulary) codes.
At the moment I'm doing this with a simple for loop and str.find().
The problem is that if a CPV code is written in a slightly different format, this approach won't find it.
What's the most efficient way of searching for all the different variations of each code within the string? Is it simply a case of reformatting each of the up to 10,000 CPV codes and calling str.find() for each variant?
For example, the same code could appear in any of the following formats:
30124120-1
301241201
30124120 - 1
30124120 1
30124120.1
etc.
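
To make the question concrete, here is a rough sketch of one alternative to calling str.find() once per variant: scan the string a single time with a regular expression that tolerates the separators listed above, normalise each hit to one canonical form, and check it against a set of known codes. It assumes every code is 8 digits followed by a single check digit, and that the only separators are spaces, a hyphen, or a dot. The names `CPV_PATTERN`, `find_cpv_codes`, and `known_codes` are made up for illustration.

```python
import re

# Assumption: a CPV code is 8 digits plus 1 check digit, optionally separated
# by whitespace, a hyphen, or a dot. The lookarounds stop the pattern from
# matching inside a longer run of digits.
CPV_PATTERN = re.compile(r"(?<!\d)(\d{8})\s*[-.]?\s*(\d)(?!\d)")


def find_cpv_codes(text, known_codes):
    """Return the canonical form ('12345678-9') of every known code found in text."""
    found = []
    for match in CPV_PATTERN.finditer(text):
        # Normalise whatever separator was used to the hyphenated form.
        canonical = f"{match.group(1)}-{match.group(2)}"
        if canonical in known_codes:  # set lookup, so 10,000 codes is cheap
            found.append(canonical)
    return found


# Hypothetical sample data covering the formats listed above.
known_codes = {"30124120-1"}
sample = "a 30124120-1 b 301241201 c 30124120 - 1 d 30124120 1 e 30124120.1"
print(find_cpv_codes(sample, known_codes))
```

This makes one pass over the string regardless of how many codes there are, but I'm not sure it's the best approach: an 8-digit number that happens to be followed by a space and an unrelated single digit would also be picked up if the combination is a valid code.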
Thanks :)