I'm trying to find URLs in HTML. This is the regex I'm using, followed by the example I'm trying to match:

href="http://(.+)"(?:.+)

<a href="http://www.etf.rs/" target="_top">

This matches: www.etf.rs/" target=

And it should match: www.etf.rs

It's not important if it also matches some rubbish, but it's important that all URLs are matched. Thanks!

1 Answer

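You can use re.search. Match up to the first closing quote with [^"]+ rather than a greedy .*, which can run on to a later quote when the tag has more attributes:

import re

s = '<a href="http://www.etf.rs/" target="_top">'
# [^"]+ captures everything after http:// up to the closing quote of href
print(re.search(r'href="http://([^"]+)"', s).group(1))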

Output:

www.etf.rs/
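
Since the question asks for every URL rather than just the first, here is a minimal sketch using re.finditer. The sample page is hypothetical (only the first link comes from the question), and the pattern assumes quoted href values that start with http:// or https://. For messy real-world HTML, a proper parser such as the standard library's html.parser is likely to be more robust than a regex.

```python
import re

# Hypothetical sample page; only the first link comes from the question.
html = '''
<a href="http://www.etf.rs/" target="_top">ETF</a>
<a href="https://example.com/page?id=1" class="ext">Example</a>
<a href='http://example.org/'>Single quotes</a>
'''

# (["']) captures the opening quote and \1 requires the same closing quote.
# [^"'\s>]+ cannot run past the end of the attribute value, so the match
# stops at the URL even when more attributes follow.
HREF_RE = re.compile(r'''href\s*=\s*(["'])(https?://[^"'\s>]+)\1''', re.IGNORECASE)

urls = [m.group(2) for m in HREF_RE.finditer(html)]
print(urls)
```

Here group(2) holds the URL including the scheme. To drop the http:// prefix as in the question, move the capturing parenthesis to after the scheme.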
  • leave a comment if you don't like the answer, otherwise it's useless
    – midori
    Commented Jan 26, 2016 at 2:17
  • you are welcome, my comment was for someone who downvoted without comment, it's hard to understand what's not right in the answer without a comment
    – midori
    Commented Jan 26, 2016 at 2:48
