2

I am trying to replace multiple strings in a file, but with the following code only the last key/value pair gets replaced. How can I replace every key in the file with its value?

fp1 = open(final,"w")
data = open(initial).read()
for key, value in mydict.items():
    fp1.write(re.sub(key,value, data))
fp1.close()
4
  • 2
    Your code doesn't work because you're writing the results to the file directly, so it outputs the data multiple times. If you replace fp1.write(re.sub(key,value, data)) with data = re.sub(key,value, data) and write data once after the loop, it works. Also, is there a specific reason for using re.sub instead of data.replace(key, value)?
    – Wolph
    Commented Mar 6, 2010 at 13:42
  • 1
    @WoLpH: Why not post this as an answer? Commented Mar 6, 2010 at 13:46
  • I don't think just calling replace on the same set of data constantly is the best solution, although I'm having doubts about what would be the most fitting solution here. Also, since it's an in-place replacement it might be better to do it streaming anyway.
    – Wolph
    Commented Mar 6, 2010 at 14:31
  • Are these only strings, or actual regexes? If they are only strings, you don't need re.sub, only str.replace. Also, it helps greatly if we know the strings (or regexes) are distinct, so they can't produce multiple hits. (Should we split the input at word boundaries? At whitespace?) Then we could simply build a dict of replacements and look each token up in it, with no need to iterate over it.
    – smci
    Commented Jan 19, 2022 at 22:59

3 Answers

5

This is one task for which regular expressions can really help:

import re

def replacemany(adict, astring):
  # one alternation of all keys, each escaped so it matches literally
  pat = '|'.join(re.escape(s) for s in adict)
  there = re.compile(pat)
  # called once per match; returns the replacement for the matched key
  def onerepl(mo):
    return adict[mo.group()]
  return there.sub(onerepl, astring)

if __name__ == '__main__':
  d = {'k1': 'zap', 'k2': 'flup'}
  print(replacemany(d, 'a k1, a k2 and one more k1'))

Run as the main script, this prints a zap, a flup and one more zap as desired.

This focuses on strings, not files, of course -- the replacement, per se, occurs in a string-to-string transformation. The advantage of the RE-based approach is that looping is reduced: all strings to be replaced are matched in a single pass, thanks to the regular expression engine. The re.escape calls ensure that strings containing special characters are treated just as literals (no weird meanings;-), the vertical bars mean "or" in the RE pattern language, and the sub method calls the nested onerepl function for each match, passing the match-object so the .group() call easily retrieves the specific string that was just matched and needs to be replaced.
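One caveat is worth a small illustration. Python's alternation tries the alternatives left to right and takes the first one that matches, so when one key is a prefix of another (say 'k1' and 'k10') the shorter key can win. The following is a minimal sketch, assuming longest-key-first is the behaviour you want; replacemany_longest is a hypothetical name, not part of the function above.

```python
import re

def replacemany_longest(adict, astring):
    # Sort keys longest first so 'k10' is tried before its prefix 'k1'.
    keys = sorted(adict, key=len, reverse=True)
    pat = re.compile('|'.join(re.escape(k) for k in keys))
    # Each match is looked up in the dict exactly once.
    return pat.sub(lambda mo: adict[mo.group()], astring)

d = {'k1': 'zap', 'k10': 'flup'}
result = replacemany_longest(d, 'a k1 and a k10')
print(result)  # a zap and a flup
```

Without the sort, the pattern built from this dict would be k1|k10, and 'k10' in the input would come out as 'zap0'.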

To work at file level:

with open(final, 'w') as fin:
  with open(initial, 'r') as ini:
    fin.write(replacemany(mydict, ini.read()))

The with statement is recommended, to ensure proper closure of the files; if you're stuck with Python 2.5, use from __future__ import with_statement at the start of your module or script to gain use of the with statement.
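If the file is too large to read in one go, the same idea can probably be applied line by line. This is only a sketch, assuming no key spans a line break; replace_in_file and its parameter names are made up for illustration.

```python
import re

def replace_in_file(adict, src_path, dst_path):
    # Compile the alternation once and reuse it for every line.
    pat = re.compile('|'.join(re.escape(k) for k in adict))
    with open(src_path, 'r') as src, open(dst_path, 'w') as dst:
        for line in src:
            # Each line is transformed independently and written out at once.
            dst.write(pat.sub(lambda mo: adict[mo.group()], line))
```

Calling replace_in_file(mydict, initial, final) would then stand in for the read-everything version, holding only one line in memory at a time.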

1
  • 1
    Thanks, my vote! This is the only answer I found that handles overlapping replacement text, i.e. { 'k1':'k2', 'k2':'k1' }
    – hopia
    Commented Jan 25, 2011 at 23:11
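The swap case mentioned in the comment above can be checked with a small side-by-side sketch. The two helper names are hypothetical; the first mimics a loop of str.replace calls, the second the single-pass regex approach.

```python
import re

def replace_sequential(adict, astring):
    # One str.replace per key: later replacements see earlier output.
    for key, value in adict.items():
        astring = astring.replace(key, value)
    return astring

def replace_single_pass(adict, astring):
    # One regex pass: text that was just substituted is never rescanned.
    pat = re.compile('|'.join(re.escape(k) for k in adict))
    return pat.sub(lambda mo: adict[mo.group()], astring)

swap = {'k1': 'k2', 'k2': 'k1'}
text = 'k1 then k2'
# On Python 3.7+, where dicts keep insertion order:
print(replace_sequential(swap, text))   # k1 then k1
print(replace_single_pass(swap, text))  # k2 then k1
```

The sequential version undoes its own first replacement, whichever key happens to be processed first, while the single pass swaps the two keys as intended.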
0
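# d is the dict of replacements, e.g. d = {'k1': 'zap', 'k2': 'flup'}
with open("file", 'r') as fp2, open("final", "w") as fp1:
    for line in fp2:
        # split on whitespace, so only whole tokens are replaced
        sline = line.rstrip().split()
        for n, item in enumerate(sline):
            if item in d:
                sline[n] = d[item]
        # joining with single spaces normalizes the original whitespace
        fp1.write(' '.join(sline) + "\n")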
0

This should be better.

fp1 = open(final,"w")
fp2 = open(initial, 'r')
data = fp2.read()
fp2.close()
for key, value in mydict.items():
    data = data.replace(key, value)
fp1.write(data)
fp1.close()
3
  • 1
    This method will change "blog" to "bLOG" when the dictionary is {"log":"LOG"}. Of course, this assumes the OP wants to change every occurrence, regardless of whether it sits on a word boundary or not.
    – ghostdog74
    Commented Mar 6, 2010 at 14:56
  • 1
    wrong answer: data.replace doesn't alter data in the least (strings are immutable!!!). Commented Mar 6, 2010 at 17:08
  • Also, the whole point of having a dict of replacements is to use it as a dict, no need to iterate over it. (Assuming it's guaranteed no overlaps or multiple hits. If there were, then we iterate on each line and keep replacing, until we get no change in the output.)
    – smci
    Commented Jan 19, 2022 at 23:01
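The comments above point at a whole-word variant that uses the dict as a lookup table rather than iterating over it. A rough sketch, assuming every key is a single word made of \w characters; replace_words is a hypothetical name.

```python
import re

def replace_words(adict, astring):
    # \w+ matches each whole word; words not in the dict are returned unchanged.
    return re.sub(r'\w+', lambda mo: adict.get(mo.group(), mo.group()), astring)

print(replace_words({'log': 'LOG'}, 'blog log logs'))  # blog LOG logs
```

Because each word is looked up as a whole, "blog" is left alone, and because replaced text is never rescanned, there is no need to repeat the loop until the output stops changing.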
