
Hi, I am trying to replace every expression containing 'www...' or 'http://...' with just 'URL'. I tried the code below, but I get this error:

TypeError: expected string or buffer

My code is:

df['text_1'] = re.sub('((www\.[^\s]+)|(https?://[^\s]+))','URL',df['text'])

df['text'] contains tweets, and I want to keep only the text in them. I am using Python 2. Thanks.

4
  • Is df[text] a list of tweets, i.e. a list of strings, or a single string? Have you tried ... = [re.sub('<regex>','URL',s) for s in df['text']]?
    – tobias_k
    Commented May 15, 2017 at 22:05
  • Each value of df['text'] holds one tweet. Is that what you are asking? Commented May 15, 2017 at 22:09
  • Please clarify what data type df actually is. We know it's not a string and not a buffer, I'm assuming it's a pandas DataFrame.
    – acidtobi
    Commented May 15, 2017 at 22:21
  • Yes, it is a DataFrame. Could you recommend where I can read more about the differences between strings, DataFrames and buffers? I am a bit confused about this now. Thanks. Commented May 15, 2017 at 22:33

2 Answers

2

Assuming df is a pandas DataFrame, don't use re.sub, which accepts only a single string. df['text'] is a pandas Series, so use pandas.Series.replace with regex=True instead:

df['text_1'] = df['text'].replace(r'((www\.[^\s]+)|(https?://[^\s]+))',
                                  'URL',
                                  regex=True)

This creates a new column text_1 in which every match of the regular expression in text is replaced with 'URL'.
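As a minimal sketch of the approach above, the snippet below runs the same replacement on a small made-up DataFrame. The sample tweets and the url_pattern name are assumptions added for illustration; only the pattern and the replace call come from the answer.

```python
import pandas as pd

# Hypothetical sample data standing in for the asker's tweets.
df = pd.DataFrame({'text': ['see www.example.com now',
                            'read https://example.org/page today',
                            'no link here']})

# Same pattern as in the question, written as a raw string.
url_pattern = r'((www\.[^\s]+)|(https?://[^\s]+))'

# With regex=True, Series.replace substitutes each match inside every string
# and leaves the rest of the text untouched.
df['text_1'] = df['text'].replace(url_pattern, 'URL', regex=True)

print(df['text_1'].tolist())
```

Rows without a URL pass through unchanged, so the new column always has the same length as the original one.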

1

It sounds like you're getting that error because you're not supplying a string or buffer as the third argument to re.sub.

>>> re.sub('\W', 'REPLACED', 'this is my text')
'thisREPLACEDisREPLACEDmyREPLACEDtext'
>>> re.sub('\W', 'REPLACED', None)
Traceback (most recent call last):
...
TypeError: expected string or buffer

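Ensure that the third argument is a proper string before passing it to re.sub. If df['text'] is a whole column rather than a single string, re.sub has to be applied to each value separately.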
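If you want to stay with re.sub, one possible sketch is to call it once per value, in the spirit of the list comprehension suggested in the comments. The helper name replace_urls and the sample list are hypothetical, and the code is written for Python 3; in Python 2 the isinstance check would need basestring instead of str.

```python
import re

# Pattern from the question, compiled once and reused for every value.
url_pattern = re.compile(r'((www\.[^\s]+)|(https?://[^\s]+))')

def replace_urls(value):
    """Hypothetical helper: replace URLs in one string, skip non-strings."""
    if not isinstance(value, str):
        # re.sub would raise TypeError here, so return the value unchanged.
        return value
    return url_pattern.sub('URL', value)

# Made-up values, including a None like the one that triggers the error.
tweets = ['visit www.example.com', None, 'docs at http://example.org']
cleaned = [replace_urls(t) for t in tweets]

print(cleaned)
```

Passing non-strings through unchanged is only one design choice; depending on the data, dropping them or converting them to empty strings may be more appropriate.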

1
  • That worked. Yes, I am using DataFrames. Thanks, everybody. Commented May 15, 2017 at 22:17
