As always, the context first!
Preliminaries
Recently I needed a way to transport data from one environment to another. The proper way to do this would probably be a database, but I don't know how to work with those, so I chose to write the data to a simple text file, which is then read and parsed when needed. Typical content of such a file (a real example, hence the alignment):
Ticket: 2 Type: ORDER_TYPE_SELL_STOP State: ORDER_STATE_CANCELED Reason: ORDER_REASON_EXPERT Start: 1507032000000 End: 1507140000000
Ticket: 3 Type: ORDER_TYPE_BUY_STOP  State: ORDER_STATE_CANCELED Reason: ORDER_REASON_EXPERT Start: 1507723200000 End: 1507760140000
Ticket: 4 Type: ORDER_TYPE_SELL_STOP State: ORDER_STATE_FILLED   Reason: ORDER_REASON_EXPERT Start: 1508375100000 End: 1508389780000
Ticket: 5 Type: ORDER_TYPE_BUY       State: ORDER_STATE_FILLED   Reason: ORDER_REASON_SL     Start: 1508392000000 End: 1508392000000
Ticket: 6 Type: ORDER_TYPE_SELL_STOP State: ORDER_STATE_CANCELED Reason: ORDER_REASON_EXPERT Start: 1508522400000 End: 1508524540000
Ticket: 7 Type: ORDER_TYPE_BUY_STOP  State: ORDER_STATE_CANCELED Reason: ORDER_REASON_EXPERT Start: 1509624000000 End: 1509638920000
Ticket: 8 Type: ORDER_TYPE_BUY_STOP  State: ORDER_STATE_CANCELED Reason: ORDER_REASON_EXPERT Start: 1509671100000 End: 1509732000000
where each line is (must be) of the form:
     Type     :     ORDER_TYPE_SELL_STOP     State : ....
  ^   ^    ^  ^  ^           ^            ^    ^   ^
 WPs KEY  WPs D WPs        VALUE         WPs  KEY  D
where
WPs - any number of whitespace characters
D - delimiter character, separates KEY and VALUE
KEY, VALUE - WORDs: strings that contain neither a whitespace character nor the delimiter
Note that one line may contain several KEY-VALUE pairs.
The Algorithm
Take a string S
repeat
    Remove leading WP characters, if any, from S (stop at the first non-WP character)
    if MAX_INIT_WORD of S is NOT EMPTY
        KEY := MAX_INIT_WORD
        Remove MAX_INIT_WORD from S
    else
        break
    Remove leading WP characters, if any, from S
    if S starts with D
        Remove D from S
    else
        break
    Remove leading WP characters, if any, from S
    VALUE := MAX_INIT_WORD of S (possibly EMPTY)
    Remove MAX_INIT_WORD from S
    Store the pair (KEY, VALUE)
    if VALUE is EMPTY
        break
where MAX_INIT_WORD is the maximal prefix of S that contains neither a WP character nor D.
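The steps above can also be sketched without erasing anything from the string, by shrinking a `std::string_view` over it. This is only an illustration, assuming C++17; `SkipWhite`, `TakeWord` and `ParsePairs` are hypothetical names and are not the code under review in the next section. Unlike that code, the sketch needs no sentinel character, because `TakeWord` treats the end of the input as the end of a word.

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <string_view>

using dict_t = std::map<std::string, std::string>;

// Drop the leading characters of s that belong to white.
void SkipWhite( std::string_view& s, std::string_view white )
{
    s.remove_prefix( std::min( s.find_first_not_of( white ), s.size() ) );
}

// Cut off and return MAX_INIT_WORD: the longest prefix of s that
// contains neither a character of white nor delim (may be empty).
std::string_view TakeWord( std::string_view& s, char delim, std::string_view white )
{
    std::string stops( white );
    stops += delim;
    // find_first_of returns npos when no stop character exists;
    // substr( 0, npos ) then yields the whole remainder.
    std::string_view word = s.substr( 0, s.find_first_of( stops ) );
    s.remove_prefix( word.size() );
    return word;
}

dict_t ParsePairs( std::string_view s,
                   char delim = ':',
                   std::string_view white = " \n\t\v\r\f" )
{
    dict_t m;
    while( true )
    {
        SkipWhite( s, white );
        std::string_view key = TakeWord( s, delim, white );
        if( key.empty() ) break;                       // no key: stop
        SkipWhite( s, white );
        if( s.empty() || s.front() != delim ) break;   // no delimiter: stop
        s.remove_prefix( 1 );
        SkipWhite( s, white );
        std::string_view value = TakeWord( s, delim, white );
        m[std::string( key )] = std::string( value );  // an empty value is stored too
        if( value.empty() ) break;
    }
    return m;
}
```

Every pass through the loop either breaks or consumes a non-empty key, so the loop terminates for any input.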
The C++ code
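#include <cstddef>
#include <map>
#include <string>

using dict_t = std::map<std::string, std::string>;

// Parses "KEY delim VALUE" pairs out of s; the characters in white separate the tokens.
dict_t GetDict( std::string s,
                char delim = ':',
                const std::string& white = " \n\t\v\r\f" )
{
    dict_t m;
    if( white.empty() ) return m;
    s += white[0]; // sentinel: necessary if s doesn't contain trailing whitespace

    std::size_t pos = 0;

    // Erase leading whitespace; s is left untouched if it consists of whitespace only.
    auto removeLeading = [&]() {
        if( (pos = s.find_first_not_of( white )) != std::string::npos )
            s.erase( 0, pos );
    };

    // Cut off and return the longest prefix of s free of whitespace and delim
    // (empty if s starts with one of them).
    auto maxInitWord = [&]() -> std::string {
        std::string word;
        if( (pos = s.find_first_of( white + delim )) != std::string::npos ) {
            word = s.substr( 0, pos );
            s.erase( 0, pos );
        }
        return word;
    };

    while( true )
    {
        std::string key;
        removeLeading();
        if( (key = maxInitWord()).empty() ) break;     // no key: stop
        removeLeading();
        if( s.empty() or s[0] != delim ) break;        // no delimiter after the key: stop
        s.erase( 0, 1 );
        removeLeading();
        if( (m[key] = maxInitWord()).empty() ) break;  // a missing value is stored as "" before stopping
    }
    return m;
}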
Example (C++ interpreter used)
root [1] GetDict( "....Am-I...:...Inside...A:...Simulation?", ':', "." )
(dict_t) { "A" => "Simulation?", "Am-I" => "Inside" }
Evidently (thanks to @JDługosz), I should show the results on some "illegal" inputs. Though there are infinitely many of them, they can be divided into two groups: those which produce an empty map, and those which produce a map with an empty value:
root [6] GetDict( "....:....an empty map", ':', "." )
(dict_t) { }
and
root [7] GetDict( "an empty value....:....", ':', "." )
(dict_t) { "an empty value" => "" }
Of course, an input can also be "partially" correct:
root [8] GetDict( "partially....:....correct...content...:....", ':', "." )
(dict_t) { "content" => "", "partially" => "correct" }
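To tie this back to the file from the Preliminaries, here is a rough sketch of how the parsed dictionaries might be consumed. `ReadOrders` and `Lifetime` are hypothetical helpers that are not part of the code under review, and `GetDict` is repeated only so that the block compiles on its own. I make no assumption about the unit of `Start` and `End`; the difference is simply returned in whatever unit the file uses.

```cpp
#include <cstddef>
#include <istream>
#include <map>
#include <sstream>
#include <string>
#include <vector>

using dict_t = std::map<std::string, std::string>;

// GetDict as posted above, repeated so that this block is self-contained.
dict_t GetDict( std::string s,
                char delim = ':',
                const std::string& white = " \n\t\v\r\f" )
{
    dict_t m;
    if( white.empty() ) return m;
    s += white[0];
    std::size_t pos = 0;
    auto removeLeading = [&]() {
        if( (pos = s.find_first_not_of( white )) != std::string::npos )
            s.erase( 0, pos );
    };
    auto maxInitWord = [&]() -> std::string {
        std::string word;
        if( (pos = s.find_first_of( white + delim )) != std::string::npos ) {
            word = s.substr( 0, pos );
            s.erase( 0, pos );
        }
        return word;
    };
    while( true )
    {
        std::string key;
        removeLeading();
        if( (key = maxInitWord()).empty() ) break;
        removeLeading();
        if( s.empty() or s[0] != delim ) break;
        s.erase( 0, 1 );
        removeLeading();
        if( (m[key] = maxInitWord()).empty() ) break;
    }
    return m;
}

// Hypothetical helper: parse every line of a stream into its own dictionary.
std::vector<dict_t> ReadOrders( std::istream& in )
{
    std::vector<dict_t> orders;
    for( std::string line; std::getline( in, line ); )
        orders.push_back( GetDict( line ) );
    return orders;
}

// Hypothetical helper: End minus Start, in the unit used by the file.
// Throws std::out_of_range if a key is missing (map::at) and
// std::invalid_argument if a value is not a number (std::stoll).
long long Lifetime( const dict_t& order )
{
    return std::stoll( order.at( "End" ) ) - std::stoll( order.at( "Start" ) );
}
```

Passing a `std::ifstream` opened on the data file to `ReadOrders` would give one dictionary per line. Since every value is stored as a string, any numeric field has to be converted by the caller, as `Lifetime` does.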