I have been writing a few regex patterns to search a file. I need to search each line of a text file as a single string of values. The problem is that the patterns I have created work when used against a list of values, but the same pattern finds nothing when I search a string. I'm not sure what I am missing. My test code is below. The regex works against list_primary, but when I change it to string2, it does not find the date value I'm looking for.

import re

list_primary = ["Wi-Fi", "goat", "Access Point", "(683A1E320680)", "detected", "Access Point detected",  "2/5/2021", "10:44:45 PM",  "Local",  "41.289227",  "-72.958748"]
string1 = "Wi-Fi Access Point (683A1E320680) detected puppy Access Point detected 2/5/2021 10:44:45 PM Local 41.289227 -72.958748"
#Lattitude = re.findall("[0-9][0-9][.][0-9][0-9][0-9][0-9][0-9][0-9]")
#Longitude = re.findall("[-][0-9][0-9][.][0-9][0-9][0-9][0-9][0-9][0-9]")
string2 = string1.split('"')
# print(string2)

list1 = []

for item in string2:

    data_dict = {}

    date_field = re.search(r"(\d{1})[/.-](\d{1})[/.-](\d{4})$",item)
    print(date_field)

    if date_field is not None:
        date = date_field.group()
    else:
        date = None
  • for item in string2: means you iterate over each char in string1. You need to re.search against string1 Commented Mar 23, 2021 at 14:47
  • I gave that a shot. It just prints a none value for each object. However, if I run that against the list that has the date "2/5/2021", the regex finds the value. Commented Mar 23, 2021 at 15:50
  • See ideone.com/itV7XS. To use it with a list you need something like rx = re.compile(r"(?<!\d)\d{1,2}[/.-]\d{1,2}[/.-]\d{4}(?!\d)") and then print(list(filter(rx.search, list_primary))) Commented Mar 23, 2021 at 15:53

1 Answer


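For your current expression to work on the string, remove the dollar sign from the end. The $ anchors the match to the end of the string, so the date matches only when nothing follows it. That is true for the list item "2/5/2021" but not for the full line, where the time and coordinates come after the date. Also, to find dates with a two-digit month or day (such as 11/20/2018), change \d{1} to \d{1,2}, since your regex can only match single-digit values like 2/5/2011: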

import re

list_primary = ["Wi-Fi", "goat", "Access Point", "(683A1E320680)", "detected", "Access Point detected",  "2/5/2021", "10:44:45 PM",  "Local",  "41.289227",  "-72.958748"]
string1 = "Wi-Fi Access Point (683A1E320680) detected puppy Access Point detected 2/5/2021 10:44:45 PM Local 41.289227 -72.958748"
#Lattitude = re.findall("[0-9][0-9][.][0-9][0-9][0-9][0-9][0-9][0-9]")
#Longitude = re.findall("[-][0-9][0-9][.][0-9][0-9][0-9][0-9][0-9][0-9]")
string2 = string1.split('"')
# print(string2)

list1 = []

for item in string2:

    data_dict = {}

    date_field = re.search(r"(\d{1,2})[/.-](\d{1,2})[/.-](\d{4})",item)
    print(date_field)

    if date_field is not None:
        date = date_field.group()
    else:
        date = None

Output:

<re.Match object; span=(71, 79), match='2/5/2021'>
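
As a minimal sketch of why the anchor makes the difference, the snippet below runs the same date pattern with and without $ against a single list item and against a whole line. The sample values and variable names (item, line, anchored, unanchored, pieces) are illustrative assumptions, shortened from the data in the question. It also shows that split('"') on a string with no double quotes returns a one-element list, so the loop in the question visits the whole line once.

```python
import re

# Illustrative sample data, shortened from the question (assumed values).
item = "2/5/2021"
line = "Wi-Fi Access Point detected 2/5/2021 10:44:45 PM Local"

# Same date pattern, with and without the end-of-string anchor.
anchored = r"\d{1,2}[/.-]\d{1,2}[/.-]\d{4}$"
unanchored = r"\d{1,2}[/.-]\d{1,2}[/.-]\d{4}"

# The list item ends right after the year, so the anchored pattern matches.
anchored_on_item = re.search(anchored, item)

# In the full line, text follows the date, so the anchored pattern finds nothing.
anchored_on_line = re.search(anchored, line)

# Without the anchor, the date is found anywhere in the line.
unanchored_on_line = re.search(unanchored, line)

# The line contains no double quotes, so split('"') returns [line] unchanged.
pieces = line.split('"')

print(anchored_on_item, anchored_on_line, unanchored_on_line, len(pieces))
```

If the date really must be the last field on a line, keeping the $ is reasonable; otherwise dropping it, as above, is the simpler choice.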

If you want to extract the date from your string (rather than just check whether it exists), use re.findall with a single capturing group around the whole expression. With three separate groups, findall returns a tuple of three numbers for each match; with one group around everything, it returns each date as one string:

date_field = re.findall(r"(\d{1,2}[/.-]\d{1,2}[/.-]\d{4})",string1)
print(date_field)

Output:

['2/5/2021']
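
The following sketch compares how re.findall behaves with three groups, with no groups, and with the digit lookarounds suggested in the comments on the question. The sample line and the variable names are made up for illustration; the trailing "ref 123/45/67890" is an assumed non-date token added to show the effect of the lookarounds. None of these patterns checks that the month and day are valid calendar values.

```python
import re

# Made-up line: two dates plus a non-date token that resembles one.
line = "detected 2/5/2021 10:44:45 PM and again on 11/20/2018, ref 123/45/67890"

# Three capturing groups: findall returns one tuple of strings per match.
three_groups = re.findall(r"(\d{1,2})[/.-](\d{1,2})[/.-](\d{4})", line)

# No groups: findall returns each whole match as a single string.
no_groups = re.findall(r"\d{1,2}[/.-]\d{1,2}[/.-]\d{4}", line)

# Lookarounds reject matches that start or end in the middle of a longer digit run.
guarded = re.findall(r"(?<!\d)\d{1,2}[/.-]\d{1,2}[/.-]\d{4}(?!\d)", line)

print(three_groups)
print(no_groups)
print(guarded)
```

Here the unguarded patterns also pick up "23/45/6789" from inside the reference token, while the guarded pattern returns only the two dates.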

1 Comment

Removing the $ worked perfectly. Thank you.
