I'm parsing a number of text files in different formats: some are CSV, some are XML, and some are plain TXT. To identify a file, I have a case statement that checks whether a certain string is contained in its first 100 bytes. I would like to replace this case statement with a hash-based solution, because in the end I will cover 10 or more file formats and I want to avoid such a long case statement.
first_bytes = '<?xml version="1.0" encoding="UTF-8"?><Document xmlns="urn:iso:std:iso:20022:tech:xsd:camt.053.001.02"'
case first_bytes
when /camt.053.001/
  'camt053'
when /camt.052.001.08/
  'camt052'
when /"Auftragskonto";"Buchungstag";"Valutadatum";/
  'SpK'
end
I started with the following hash, but I'm not sure how to match against it.
file_types = {
  '/camt.053.001/' => 'camt053',
  '/camt.052.001/' => 'camt052',
  '/"Auftragskonto";"Buchungstag";"Valutadatum";/' => 'SpK'
}
file_types.keys.select { |key| first_bytes.match(key) }
This doesn't work. It produces an empty array.
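A likely reason for the empty array: String#match converts a String argument into a Regexp, so the key '/camt.053.001/' becomes a pattern that requires literal slashes, and the sample contains none. Below is a minimal sketch of a hash-based lookup, assuming the keys are real Regexp objects with escaped dots. The helper name detect_file_type is hypothetical and not part of the original code.

```ruby
# Keys are Regexp literals, not strings that merely look like regexes.
# Dots are escaped so they match a literal "." only.
file_types = {
  /camt\.053\.001/ => 'camt053',
  /camt\.052\.001/ => 'camt052',
  /"Auftragskonto";"Buchungstag";"Valutadatum";/ => 'SpK'
}

# Hypothetical helper: returns the label of the first pattern that
# matches, or nil when no pattern matches.
def detect_file_type(first_bytes, file_types)
  # Enumerable#find yields [key, value] pairs and returns the first hit.
  _pattern, label = file_types.find { |pattern, _| pattern.match?(first_bytes) }
  label
end

first_bytes = '<?xml version="1.0" encoding="UTF-8"?><Document xmlns="urn:iso:std:iso:20022:tech:xsd:camt.053.001.02"'
detect_file_type(first_bytes, file_types) # => "camt053"
```

Because Ruby hashes preserve insertion order, more specific patterns can be listed first, which mirrors the top-to-bottom evaluation of the case statement. Regexp#match? requires Ruby 2.4 or later.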
"camt053" whereas your Hash methodology will return an Array of the matching key(s) (Regexp), e.g. [/camt.053.001/]. Additionally, you should probably use String#match?, which will simply return true or false rather than constructing a MatchData object, which you don't need.
file_types[file_types.keys.select { |key| key.match(first_bytes) }.first] @engineersmnky