All Questions

-1 votes
1 answer
62 views

Extracting a string from HTML using HTMLParser & Python 2.7

I am developing an Alexa skill, so I only have the standard Python (2.7) libraries available and cannot use BeautifulSoup4. I'm trying to identify the below ...
thefragileomen
1 vote
1 answer
113 views

Replace all <img> tags with one word in XML file

I have an XML file that consists of many tweets with HTML tags in them. Among other tasks, I need to replace all <img> tags with the word @emoji. I have written the following code: for word in re.findall(...
Elina Schadrin
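A minimal sketch of one way to do the replacement described above, using only the standard `re` module. The function name `replace_img_tags` and the sample string are hypothetical, and the pattern assumes that no attribute value contains a literal `>`.

```python
import re

# Matches an opening <img ...> tag, self-closing or not, in any letter case.
# Assumption: attribute values never contain a literal '>'.
IMG_TAG = re.compile(r'<img\b[^>]*>', re.IGNORECASE)

def replace_img_tags(text, replacement='@emoji'):
    """Return text with every <img> tag replaced by the given word."""
    return IMG_TAG.sub(replacement, text)

# Hypothetical sample input, not taken from the question.
sample = 'hi <img src="x.png" alt=":)"> there <IMG src="y"/>'
print(replace_img_tags(sample))
```

A single `re.sub` call avoids looping over `re.findall` results and replacing each match by hand.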
-1 votes
2 answers
43 views

Python - Why isn't this specific text being found by findall regex?

EDIT: PLEASE DO NOT DOWNVOTE WITHOUT COMMENTING ON WHY YOU ARE DOWNVOTING. I AM TRYING MY BEST TO WRITE THIS PROPERLY! I am trying to print all of the URL links of watches on a website. I have all ...
user88720
  • 342
0 votes
1 answer
123 views

Python - How to use finditer regex?

I would like to find every instance of img src="([^"]+)" that is preceded by div class="grid" and followed by div class="orderplacebut" in some HTML code, i.e. I want to find all the images in the ...
user88720
  • 342
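One possible approach to the question above, sketched under assumptions: first isolate the text between the two markers, then run `finditer` inside that region only. The names `REGION`, `IMG_SRC`, and `images_in_grid`, and the sample HTML, are hypothetical; only the three patterns come from the question.

```python
import re

# Lazily capture everything between the two markers; DOTALL lets '.' span newlines.
REGION = re.compile(r'div class="grid"(.*?)div class="orderplacebut"', re.DOTALL)
IMG_SRC = re.compile(r'img src="([^"]+)"')

def images_in_grid(html):
    """Collect img src values that sit between the two marker divs."""
    urls = []
    for region in REGION.finditer(html):
        for img in IMG_SRC.finditer(region.group(1)):
            urls.append(img.group(1))
    return urls

# Hypothetical sample input: one image before, two inside, one after the region.
sample = ('<img src="skip.png"><div class="grid"><img src="a.png">\n'
          '<img src="b.png"></div><div class="orderplacebut"></div>'
          '<img src="late.png">')
print(images_in_grid(sample))
```

Splitting the work into two patterns keeps each regex simple; a single pattern with lookarounds would be harder to read and to debug.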
0 votes
0 answers
20 views

Python regex only returning one result when DOTALL used [duplicate]

I am trying to return a heap of image URLs and want to include every character, such as new lines, in my findall function. However, when I use the DOTALL flag with .* in my regex, I go from having ...
user88720
  • 342
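The excerpt is cut off, so the actual cause is not known. One common reason for this symptom, shown here as an assumption rather than a diagnosis, is that a greedy `.*` combined with `re.DOTALL` runs across newlines to the last possible closing delimiter and swallows every intermediate match. The sample string and variable names are hypothetical.

```python
import re

# Hypothetical two-line input with one image per line.
html = '<img src="a.png">\n<img src="b.png">'

# Greedy: with DOTALL, '.*' crosses the newline and stops at the LAST '">'.
greedy = re.findall(r'<img src="(.*)">', html, re.DOTALL)

# Lazy: '.*?' stops at the FIRST '">' after each opening, giving one match per tag.
lazy = re.findall(r'<img src="(.*?)">', html, re.DOTALL)

print(len(greedy))  # a single, over-long match
print(lazy)
```

Without `re.DOTALL`, the greedy version happens to work on this input only because `.` cannot cross the newline between the two tags.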
0 votes
2 answers
864 views

How to change original match in re.sub

I want to split text in my HTML using <br> tags. If the text is longer than 50 characters, I want to replace the last space before 10 characters with <br>. The text is in <span class="value"&...
Milano
  • 18.8k
1 vote
1 answer
142 views

Replace all html tag attributes with regex

I'm trying to figure out how I can add the attribute id=ID_<number> to all tags in an HTML snippet and remove all other attributes. For example: <div class="...">...</div> to: <div id="...
Milano
  • 18.8k
2 votes
2 answers
1k views

How to remove any html tags within a specific pattern in beautifulsoup

<p> A <span>die</span> is thrown \(x = {-b \pm <span>\sqrt</span> {b^2-4ac} \over 2a}\) twice. What is the probability of getting a sum 7 from both the ...
waranlogesh
  • 1,024
0 votes
2 answers
89 views

Parsing javascript using re.findall

So I have several problems that I am trying to tackle. First I am trying to parse this javascript I got from html. $(document).ready(function() { $('#commodity-show-thumbnails').bxSlider({...
b0baboi
  • 27
2 votes
1 answer
96 views

Scrape data from an ill-formed pdf table

I am trying to scrape data from a poorly laid out pdf (URL in the following code). I will need to use information about the position of the lines/borders of the table to make meaningful data records. ...
Astrophe
  • 574
0 votes
3 answers
80 views

Python How to get a specific code in website using re

I'm trying to solve the Python Challenge: http://www.pythonchallenge.com/pc/def/ocr.html OK, I know I can just copy and paste the code from the source into a txt file and work from there, but I want to take it ...
Dr. UK
  • 73
0 votes
1 answer
67 views

Find the start and the end of Programming Code in a whole Text [closed]

I have HTML containing text and also generic programming code, without any distinction or markup between them. Is there a way to mark the start and the end of the code that is suitable for any programming ...
RedVelvet
  • 1,923
1 vote
1 answer
363 views

Unable to select only first occurrence of href in anchor tag?

Here is my HTML code: <ul class="asidemenu_h1"> <li class="top"> <h3>Mobiles</h3> </li> <li> <a href="http://www.mega.pk/mobiles-...
Mansoor Akram
-1 votes
1 answer
265 views

Regex to capitalize paragraphs in HTML python

I want to take everything in an HTML document and capitalize the sentences (within paragraph tags). The input file has everything in all caps. My attempt has two flaws - first, it removes the ...
Xodarap777
  • 1,376
2 votes
1 answer
94 views

Python Regex matching string between abcd="_blank"> and </a>

How can I match strings between abcd="_blank"> and </a> using regex in Python 2.7? For example, for abcd="_blank">ABBA</a> the result should be ABBA.
TJ1
  • 8,560
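A minimal sketch of one way to get the result described above, using a lazy capture group between the two literal delimiters. The names `BETWEEN` and `extract_between` are hypothetical. The code runs unchanged on Python 2.7 and Python 3.

```python
import re

# Capture as little as possible between the opening marker and the closing tag.
# DOTALL is an assumption so that link text spanning several lines still matches.
BETWEEN = re.compile(r'abcd="_blank">(.*?)</a>', re.DOTALL)

def extract_between(html):
    """Return every string found between abcd="_blank"> and </a>."""
    return BETWEEN.findall(html)

print(extract_between('abcd="_blank">ABBA</a>'))  # ['ABBA']
```

The lazy `.*?` matters when a line holds several links: a greedy `.*` would run to the last `</a>` and merge them into one result.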
