I need to process small amounts of text (i.e. strings in Python).
I want to remove certain punctuation (like '.', ',', ':', ';') but keep punctuation indicative of emotion (like '...', '?', '??', '???', '!', '!!', '!!!').
Also, I want to remove non-informative words such as 'a', 'an', 'the'.
The biggest challenge so far is how to parse contractions such as "I've" or "we've" so that I end up with "I have" and "we have". The apostrophe makes this difficult for me.
What is the best/simplest way to do this in Python?
For example:
"I've got an A mark!!! Such a relief... I should've partied more."
The result I want to get:
['I', 'have', 'got', 'A', 'mark', '!!!', 'Such', 'relief', '...',
'I', 'should', 'have', 'partied', 'more']
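
To make the requirements concrete, here is a minimal sketch of the kind of approach I have in mind, using only the standard `re` module. The names `STOPWORDS`, `CONTRACTIONS`, `TOKEN_RE` and `tokenize` are my own placeholders, not part of any library. It assumes the stop-word check is case-sensitive (so the grade 'A' is kept while 'a' is dropped) and that only the "'ve" contraction needs expanding; other contractions would need their own entries.

```python
import re

# Words to drop; matching is case-sensitive so that the grade 'A' survives.
STOPWORDS = {"a", "an", "the"}

# Assumed contraction suffixes and their expansions; extend as needed.
CONTRACTIONS = {"'ve": "have"}

# A token is an ellipsis, a run of '?', a run of '!', or a word that may
# contain one apostrophe. Anything else ('.', ',', ':', ';') is skipped.
TOKEN_RE = re.compile(r"\.{3}|\?+|!+|[A-Za-z]+(?:'[A-Za-z]+)?")

def tokenize(text):
    tokens = []
    for match in TOKEN_RE.finditer(text):
        tok = match.group()
        for suffix, expansion in CONTRACTIONS.items():
            if tok.endswith(suffix):
                # Split "should've" into "should" and "have".
                tokens.extend([tok[:-len(suffix)], expansion])
                break
        else:
            # Not a known contraction: keep it unless it is a stop word.
            if tok not in STOPWORDS:
                tokens.append(tok)
    return tokens

print(tokenize("I've got an A mark!!! Such a relief... I should've partied more."))
```

Note that this keeps any run of '?' or '!' as a single token, including runs longer than three, and it would also keep an 'A' that merely starts a sentence.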