I'm relatively new to Python, which you will likely see in my code. Is there any way to iterate through a list within a regex?

Basically, I'm looping through each filename within a folder and getting a code (2-6 digits) from the filename. I want to compare it with a list of codes in a text file, each of which has a name attached, in the format "1234_Name" (without the quotation marks). If the code exists in both lists, I want to print out the list entry, i.e. 1234_Name. Currently my code only seems to look at one entry in the text file's list, and I'm not sure how to make it look through them all to find matches.

import os, re

sitesfile = open('C:/Users/me/My Documents/WORK_PYTHON/Renaming/testnames.txt', 'r')
filefolder = r'C:/Users/me/My Documents/WORK_PYTHON/Renaming/files/'

sites = sitesfile.read()
site_split = re.split('\n', sites)


old = []
newname = []

for site in site_split:
    newname.append(site)


for root, dirs, filenames in os.walk(filefolder):
    for filename in filenames:
        fullpath = os.path.join(root, filename)
        filename_split = os.path.splitext(fullpath) 
        filename_zero, fileext = filename_split
        filename_zs = re.split("/", filename_zero)
        filenm = re.search(r"[\w]+", str(filename_zs[-1:]))#get only filename, not path
        filenmgrp = filenm.group()

        pacode = re.search('\d\d+', filenmgrp)
        if pacode:
            pacodegrp = pacode.group()
            match = re.match(pacodegrp, site)
            if match:
                 print site

Hope this makes sense - thanks a lot in advance!

1 Answer

So, use this code instead:

import os
import re

def locate(pattern=r'\d{2,6}', root=os.curdir):
    # Walk root and yield the first regex match found in each filename.
    for path, dirs, files in os.walk(os.path.abspath(root)):
        for filename in files:
            found = re.search(pattern, filename)
            if found:
                yield found.group()

...this yields only the code matched in each filename and skips files that do not match the given regex pattern.

with open('list_file.txt', 'r') as f:
    # Map each code to its full entry, e.g. '1234' -> '1234_Name'.
    lines = {x.split('_')[0]: x.strip() for x in f if x.strip()}

print_out = []

for code in locate(<your code regex>, <your directory>):
    if code in lines:
        print_out.append(lines[code])

print(print_out)

...read the valid codes from your list_file first, then compare them with the codes that come back from the filenames with your given regex.
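For reference, here is a minimal end-to-end sketch of the same idea that runs on its own. It is only an illustration under some assumptions: the function name `match_codes`, the sample filenames, and the sample list entries are all made up, and the pattern `\d{2,6}` is a guess based on the "2-6 digits" description in the question. It also shows why the question's code compares against only one entry: there, `site` is left at whatever value the earlier `for site in site_split` loop finished on, so it is never iterated again inside the file loop. Looking each code up in a dict avoids needing an inner loop at all.

```python
import os
import re
import tempfile

def match_codes(folder, entries, pattern=r'\d{2,6}'):
    """Return the list entries whose code appears in a filename under folder."""
    # Map each code to its full entry, e.g. '1234' -> '1234_Name'.
    by_code = {}
    for entry in entries:
        entry = entry.strip()
        if entry:
            by_code[entry.split('_')[0]] = entry

    matched = []
    for root, dirs, filenames in os.walk(folder):
        for filename in sorted(filenames):
            # Drop the extension, then take the first run of digits as the code.
            stem = os.path.splitext(filename)[0]
            found = re.search(pattern, stem)
            if found and found.group() in by_code:
                matched.append(by_code[found.group()])
    return matched

# Hypothetical sample data: empty files in a temporary folder.
demo_folder = tempfile.mkdtemp()
for name in ('report_1234.txt', 'scan_56.pdf', 'notes.txt', 'old_999999.doc'):
    open(os.path.join(demo_folder, name), 'w').close()

# Hypothetical list-file lines in the "1234_Name" format.
demo_entries = ['1234_Alpha\n', '56_Beta\n', '777_Gamma\n']

result = match_codes(demo_folder, demo_entries)
print(sorted(result))
```

Note that `\d{2,6}` takes at most six digits from a longer run, so a filename containing seven or more consecutive digits would produce a truncated code. If that can happen in your folder, the pattern would need tightening.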

  • Does fnmatch.filter accept regex? I thought it only accepted unix-style globs. Commented Aug 29, 2013 at 0:52
  • Ahh, you're right. Dug that out of my funcs.py :) I still think it'll work for what he wants to do, just in a slightly different format. See here for acceptable pattern matching, docs.python.org/2/library/fnmatch.html
    – blakev
    Commented Aug 29, 2013 at 0:55
  • I can't seem to get anything to add to the list of strings. As I'm looking for numbers, shouldn't '[0123456789]' work?
    – hansolo
    Commented Aug 29, 2013 at 8:14
  • I'm sorry, but I'm still getting nothing appended to print_out. I feel like I've tried every combination of everything in <your code regex>. Would you mind suggesting what it should look like, specifically what string the regex should be searching? Apologies for being stupid.
    – hansolo
    Commented Aug 29, 2013 at 21:03
  • It's no problem man :) take a screenshot of a folder or something with some files you want matched with others in there you don't want and we can come up with a regex. Maybe include a sample from the file with accepted numbers in it too. Then we can clean up your "is it there, is it not" logic and make it work. You can use pastebin or shoot me an e-mail at [email protected]
    – blakev
    Commented Aug 29, 2013 at 21:25
