My intent is to perform a "mail merge": I write strings like "hi <<name>>" and fill them in from a HashMap.
Specifically, the string contains keys formatted as <<key>>, and the map contains the corresponding values.
I have a main concern that I'd like help with. I think it would be best to perform the parsing in multiple stages:
- first, find the Keys
- second, find the remaining Chunks
- and for more complicated parsing tasks, perhaps more stages
I couldn't figure out how to do that, so instead I use the more expensive lookahead combinator notFollowedBy in a single pass. That obviously wouldn't scale well if my needs were even slightly more complicated.
import Data.Functor.Identity (Identity)
import qualified Data.HashMap.Lazy as HM
import Text.Parsec
import Text.Parsec.String
-- Parsing ----------
data Merge a = Chunk a | Key a deriving (Show)
key :: Parser (Merge String)
key = Key <$> between (string "<<") (string ">>") (many1 letter)
chunk :: Parser (Merge String)
chunk = Chunk <$> many1 (notFollowedBy key >> anyChar)
prose :: ParsecT String () Identity [Merge String]
-- `try` lets a failed key attempt backtrack, so chunk can still consume a stray '<'
prose = many1 $ try key <|> chunk
-- Formatting ----------
format :: HM.HashMap String String -> [Merge String] -> String
format _ [] = ""
format hmap (Chunk x : xs) = x ++ format hmap xs
format hmap (Key k : xs) =
  case HM.lookup k hmap of
    -- I could obviate the `error` by working within a failure monad
    Nothing -> error $ "missing key: " ++ k
    Just v -> v ++ format hmap xs
-- Testing ----------
testString = "Hi <<name>>! Do you like <<thing>>?"
testMap = HM.fromList [("name", "Adam"), ("thing", "Apples")]
main = print $ format testMap <$> parse prose "" testString
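
The comment in `format` mentions working within a failure monad instead of calling `error`. A minimal sketch of that idea, assuming `Either String` as the failure monad, might look like the following. The names `resolve` and `formatEither` are hypothetical and not part of the code above.

```haskell
import qualified Data.HashMap.Lazy as HM

-- Same shape as the Merge type in the question.
data Merge a = Chunk a | Key a deriving (Show)

-- Resolve one piece: chunks pass through unchanged, keys are looked up.
-- A missing key becomes a Left instead of a call to `error`.
resolve :: HM.HashMap String String -> Merge String -> Either String String
resolve _    (Chunk x) = Right x
resolve hmap (Key k)   =
  maybe (Left ("missing key: " ++ k)) Right (HM.lookup k hmap)

-- Hypothetical total variant of `format`: stops at the first missing key.
formatEither :: HM.HashMap String String -> [Merge String] -> Either String String
formatEither hmap = fmap concat . traverse (resolve hmap)
```

With a map containing only `("name", "Adam")`, `formatEither` applied to `[Chunk "Hi ", Key "name", Chunk "!"]` yields `Right "Hi Adam!"`, while `[Key "thing"]` yields `Left "missing key: thing"`.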
<<? If they are not allowed, your language is regular.
<<<<>>? No, there should just be alpha characters inside a Key. Or if you meant Keys within Keys, still no, just <<alpha>>.
breakOn would do the job perfectly well.
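
The `breakOn` suggestion in the comments can be sketched roughly as follows. This is only an illustration under assumptions: it uses `Data.Text` from the `text` package rather than `String`, it treats a key as one or more letters closed by `>>`, and it keeps any other `<<` as literal text. The helper names `lit` and `pieces` are hypothetical.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.Char (isAlpha)
import qualified Data.Text as T

data Merge a = Chunk a | Key a deriving (Show, Eq)

-- Prepend literal text, skipping empty text and merging with a following Chunk.
lit :: T.Text -> [Merge T.Text] -> [Merge T.Text]
lit t rest
  | T.null t = rest
lit t (Chunk u : rest) = Chunk (t <> u) : rest
lit t rest = Chunk t : rest

-- Split the input by repeatedly breaking on "<<" (no lookahead per character).
pieces :: T.Text -> [Merge T.Text]
pieces input =
  let (before, rest) = T.breakOn "<<" input
  in lit before (afterOpen rest)
  where
    afterOpen rest
      | T.null rest = []                        -- no more "<<" in the input
      | otherwise =
          let body = T.drop 2 rest              -- skip the "<<"
              (name, close) = T.span isAlpha body
          in case T.stripPrefix ">>" close of
               Just remaining
                 | not (T.null name) -> Key name : pieces remaining
               _ -> lit "<<" (pieces body)      -- not a key: keep "<<" literally
```

Applied to the test string from the question, `pieces` yields `[Chunk "Hi ", Key "name", Chunk "! Do you like ", Key "thing", Chunk "?"]`. The trade-off is that the key grammar is hand-written here, so a more complicated key syntax would likely be easier to maintain in Parsec.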