0

I need to write a function that converts every differently cased variant of a word to lowercase.

For example, a paragraph contains the word 'something' in different forms such as 'Something', 'SomeThing', 'SOMETHING', and 'SomeTHing'. I need to convert all of these to the lowercase 'something'.

How can I write a function that does this replacement?

2
  • Hi Prasanna. Could you please post some sample code to show what you have tried so far? Commented Dec 20, 2017 at 6:03
  • I am using the replace method, which is not efficient, e.g. output.replace("SomeThing", "something").replace("SomeTHing", "something") Commented Dec 20, 2017 at 6:44

3 Answers

2

You can split your paragraph into different words, then use the slugify module to generate a slug of each word, compare it with "something", and if there is a match, replace the word with "something".

In [1]: text = "This paragraph contains Something, SOMETHING, AND SomeTHing"

In [2]: from slugify import slugify

In [3]: for word in text.split(" "): # Split the text on spaces and iterate through the words
   ...:     if slugify(word) == "something": # The slug drops punctuation and case, so "SOMETHING," matches
   ...:         text = text.replace(word, word.lower()) # Lowercase the word, keeping its punctuation

In [4]: text
Out[4]: 'This paragraph contains something, something, AND something'
2
  • Now you don't have any commas, dots etc in the output text. That was not desired.
    – Psytho
    Commented Dec 20, 2017 at 7:38
  • @Psytho Nice catch. Rectified it now. Commented Dec 20, 2017 at 8:08
1

Split the text into single words and check whether each word, converted to lower case, equals "something". If it does, replace that word with the lowercase form:

for word in text.split():  # text is the paragraph to fix
    if word.lower() == "something":
        text = text.replace(word, "something")

To know how to split a text into words, see this question.
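A plain split leaves punctuation attached to the words, so "SomeThing," would not compare equal to "something". The sketch below is one possible way around that, not the only one: it strips leading and trailing punctuation before comparing. The function name lowercase_word and its target parameter are made up for illustration, and it assumes words are separated by single spaces.

```python
import string


def lowercase_word(text, target="something"):
    """Lowercase every space-separated token that matches target, ignoring case."""
    target = target.lower()
    fixed = []
    for token in text.split(" "):
        # Compare without surrounding punctuation, e.g. "SomeThing," -> "SomeThing"
        core = token.strip(string.punctuation)
        if core and core.lower() == target:
            # Swap only the word itself so the punctuation stays in place
            token = token.replace(core, target)
        fixed.append(token)
    return " ".join(fixed)


print(lowercase_word("Is it Something, SOMETHING, or SomeTHing?"))
# Is it something, something, or something?
```

Words separated by newlines or tabs are not handled here; splitting with a regular expression would be needed for that.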

Another way is to move through the text one character at a time and check whether the nine characters starting at that position spell "something" in any mix of cases:

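text = "Many words: SoMeThInG, SOMEthING, someTHing"
target = "something"
size = len(target)
for n in range(len(text) - size + 1):
    chunk = text[n:n + size]
    if chunk.lower() == target:  # a variant of "something" starts at position n
        text = text.replace(chunk, target)  # same length, so later positions stay valid

print(text)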
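The loop above calls replace, which rescans the whole text for every match it finds. If that becomes a concern on long texts, a single pass is possible. The following is only a sketch: the function name lowercase_occurrences is invented here, and it assumes that lower() does not change the length of the text, which holds for ASCII but not for every Unicode character. Like the loop above, it also matches "something" inside longer words.

```python
def lowercase_occurrences(text, target="something"):
    """Lowercase every case-insensitive occurrence of target in one pass."""
    target = target.lower()
    lowered = text.lower()  # assumption: same length as text (true for ASCII)
    size = len(target)
    pieces = []
    start = 0
    while True:
        hit = lowered.find(target, start)
        if hit == -1:
            break
        pieces.append(text[start:hit])  # untouched text before the match
        pieces.append(target)           # the match, lowercased
        start = hit + size
    pieces.append(text[start:])         # whatever follows the last match
    return "".join(pieces)


print(lowercase_occurrences("Many words: SoMeThInG, SOMEthING, someTHing"))
# Many words: something, something, something
```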
2
  • The second way is simple and elegant. Didn't think about that! Commented Dec 20, 2017 at 8:14
  • It could be slow if the text is very long, I think.
    – Psytho
    Commented Dec 20, 2017 at 8:16
1

You can also use re.findall to search and split the paragraph into words and punctuation, and replace all the different cases of "Something" with the lowercase version:

import re

text = "Something, Is: SoMeThInG, SOMEthING, someTHing."

to_replace = "something"

words_punct = re.findall(r"[\w']+|[.,!?;: ]", text)

new_text = "".join(to_replace if x.lower() == to_replace else x for x in words_punct)

print(new_text)

Which outputs:

something, Is: something, something, something.

Note: the regular expression passed to re.findall is hardcoded. Any character it does not match (for example double quotes, parentheses, or newlines) is silently dropped from the output, so extend the punctuation class to cover the characters your actual text contains.
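If listing every punctuation character is a problem, a different technique avoids tokenizing altogether: re.sub with the re.IGNORECASE flag replaces the matches in place and leaves every other character untouched. This is a sketch rather than a drop-in answer. The function name lowercase_word_re is made up, and the \b word boundaries assume the target starts and ends with a letter, digit, or underscore.

```python
import re


def lowercase_word_re(text, target="something"):
    """Lowercase whole-word, case-insensitive matches of target."""
    # re.escape guards against regex metacharacters in target
    pattern = re.compile(r"\b" + re.escape(target) + r"\b", flags=re.IGNORECASE)
    # A function replacement avoids backslash processing in the replacement text
    return pattern.sub(lambda match: match.group(0).lower(), text)


print(lowercase_word_re("Something, Is: SoMeThInG, SOMEthING, someTHing. (@ ~)"))
# something, Is: something, something, something. (@ ~)
```

Because of the word boundaries, "Somethings" is left alone; drop the \b parts if matches inside longer words should be lowercased too.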
