Your function name is not very helpful. Why is the function called
mimic_dict()? Perhaps the answer to that question is clear in the context of
your larger project. In any case, the reason is not evident to me, so I suggest
you put some effort into creating a better name. In the revised code below,
the docstring is my attempt to describe the function's general behavior.

When feasible, use specific names rather than generic names. Your code uses
mostly generic names, such as str, lst, elem (and str also shadows the
built-in type of the same name). But more specific names are readily
available: text, words, and curr. Notice in particular how much better curr
is than the bland elem. The latter tells us nearly nothing, but curr
emphasizes the contrast with prev and also ties into the function's
docstring.

def mimic_dict(text):
    '''
    Takes some text, splits it into words, and returns a dict
    where each previous word is the key for the current word (stored
    in a list to accommodate duplicates).
    '''
    words = text.split()
    # The first word has no predecessor, so it is stored under ''.
    prev = ''
    d = {}
    for curr in words:
        d.setdefault(prev, []).append(curr)
        prev = curr
    return d

When testing, don't stringify expected data. It's good that your question
includes some test code. However, your tests are a bit awkward because you
stringify the expected results. That's unnecessary. Just check for equality
against the expected dict.
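
To illustrate why comparing stringified results is fragile, here is a small sketch. The two dicts are made-up sample data, not taken from your tests: they hold the same content but were built in a different key order.

```python
expected = {'': ['Uno'], 'Uno': ['dos']}
# Same content as expected, but with the keys inserted in the opposite order.
got = {'Uno': ['dos'], '': ['Uno']}

# Dict equality compares keys and values and ignores insertion order.
same_as_dicts = got == expected

# The string form follows insertion order, so the same data can compare unequal.
same_as_strings = str(got) == str(expected)

print(same_as_dicts, same_as_strings)  # True False
```

Comparing the dicts directly also means that, on failure, you can still inspect the real data structures rather than their text representation.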

Unify test data, linking input to expected output. As noted in a comment,
there was a mismatch between your number of test cases and expected outputs.
You can avoid such problems by unifying the test data from the outset, as shown
below.

def test_mimic_dict():
    # Each test input is linked directly to expected output.
    checks = [
        (
            "Uno dos tres cuatro cinco",
            {'': ['Uno'], 'Uno': ['dos'], 'dos': ['tres'], 'tres': ['cuatro'], 'cuatro': ['cinco']},
        ),
        (
            "a cat and a dog a fly",
            {'': ['a'], 'a': ['cat', 'dog', 'fly'], 'cat': ['and'], 'and': ['a'], 'dog': ['a']},
        ),
        (
            "Uno dos\tdos\nsinco",
            {'': ['Uno'], 'Uno': ['dos'], 'dos': ['dos', 'sinco']},
        ),
    ]
    for inp, exp in checks:
        got = mimic_dict(inp)
        # Compare the dicts directly; on a mismatch, show both for inspection.
        if got == exp:
            print('ok')
        else:
            print(got)
            print(exp)

An alternative implementation using itertools.pairwise. Your current
approach is fine, but here's a different way to do it. Note that
itertools.pairwise requires Python 3.10 or later. You could also use a
collections.defaultdict rather than dict.setdefault.

from itertools import pairwise, chain

def mimic_dict(text):
    d = {}
    # Prepend '' so that the first word is paired with an empty predecessor.
    for prev, curr in pairwise(chain([''], text.split())):
        d.setdefault(prev, []).append(curr)
    return d
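
For completeness, the collections.defaultdict variant mentioned above might look roughly like this. It is only a sketch: the name mimic_dict_defaultdict is hypothetical, chosen to keep it distinct from the versions above, and the conversion back to a plain dict at the end is my own choice rather than a requirement.

```python
from collections import defaultdict


def mimic_dict_defaultdict(text):
    # Hypothetical name; behaves like mimic_dict() above.
    d = defaultdict(list)  # missing keys start out as an empty list
    prev = ''
    for curr in text.split():
        d[prev].append(curr)
        prev = curr
    # Return a plain dict so that later lookups of missing keys raise
    # KeyError instead of silently inserting empty lists.
    return dict(d)


print(mimic_dict_defaultdict("a cat and a dog a fly"))
```

Whether this reads better than dict.setdefault is largely a matter of taste. The main practical difference is that a defaultdict creates entries on lookup, which is why the sketch converts it before returning.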