How do I find and replace characters in a nested list using Python?

Question

I have a nested list with time values. I want to find and fix the times that are not in the "HH:MM" format. As a first step, I want to add ":00" to the numbers that have no ":". My list looks like mylist below, and res is the result I want.

mylist = [['x', '6 - 9:30 AM - 10:30 AM - 2 PM - 5 PM - 9 PM], ['y',  7:30 AM - 2:30 PM, 7:30 AM - 2:30 PM, 7:30 AM - 1:30 PM']]

res = [['x', '6:00 - 9:30 AM - 10:30 AM - 2:00 PM - 5:00 PM - 9:00 PM], ['y',  7:30 AM - 2:30 PM, 7:30 AM - 2:30 PM, 7:30 AM - 1:30 PM]]

I have tried this code:

for idx, (id,name) in enumerate(mylist):

    for n2,j in  enumerate(name.split(' - ')) :
        if ':' not in j and id not in j:
            print(name)
            if ":" not in name.split('-')[0] and ":" not in name.split('-')[1]:
                list1[idx][n2] = name.split('-')[0].split(' ')[0] + ':00' + ' AM' + ' - ' + \
                                name.split('-')[1].split(' ')[1].strip() + ':00' + ' PM'
                # print(name)
            elif ":" not in name.split('-')[0]:
                list1[idx][n2] = name.split('-')[0].split(' ')[0] + ':00' + ' AM' + ' - ' + \
                                name.split('-')[1].split(' ')[1].strip() + ' PM'

            elif ":" not in name.split('-')[1]:
                list1[idx][n2] = name.split('-')[0].split(' ')[0] + ' AM' + ' - ' + name.split('-')[1].split(' ')[
                    1].strip() + ':00' + ' PM'
            else:
                list1[idx][n2] = name.split('-')[0].split(' ')[0] + ' AM' + ' - ' + name.split('-')[1].split(' ')[
                    1].strip() + ' PM'

but it raised the following error:

name.split('-')[1].split(' ')[1].strip() + ' PM' IndexError: list assignment index out of range

How can I solve this issue?

The code you provided contains some basic syntax errors. Can you fix these first? That would make it easier to look at your question. For example, the quotes in the definition of mylist are not consistent, and list1 is not defined.
– Christiaan Herrewijn, Commented Aug 1, 2020 at 13:38

sortas · Accepted Answer · 2020-08-01 13:49:58Z

The overall logic you use is correct, but you should replace the splits with a regex. For example, to make sure that all the time values in x include :00, you can apply something like this:

import re

test_text = "6 - 9:30 AM - 10:30 AM - 2 PM - 5 PM - 9 PM"
print(re.sub(r'(\s|^)(\d+)(\s)', r'\1\2:00\3', test_text))

6:00 - 9:30 AM - 10:30 AM - 2:00 PM - 5:00 PM - 9:00 PM

The task here was to insert :00, so:

First, we check that we are at the hours (either the start of the string or the first number after whitespace): (\s|^)
Then we check that it is a number (one or more digits): (\d+)
Then we check that it has no minutes (whitespace follows directly): (\s)
Finally, we reference all the groups (\1, \2, \3) in the replacement so re.sub keeps them unchanged, and insert :00 between the number and the trailing whitespace.

You can apply the same logic to the other tasks you have here.
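
To connect this back to the nested list from the question, here is a minimal sketch that applies the same substitution to the second element of every inner list. The names HOUR_ONLY and add_minutes are made up for this illustration, and the list is assumed to have the shape [label, times_string] with the quotes fixed.

```python
import re

# Assumed input shape: [label, times_string], with the quotes repaired.
mylist = [
    ['x', '6 - 9:30 AM - 10:30 AM - 2 PM - 5 PM - 9 PM'],
    ['y', '7:30 AM - 2:30 PM, 7:30 AM - 2:30 PM, 7:30 AM - 1:30 PM'],
]

# Same pattern as above: start/whitespace, digits, whitespace.
HOUR_ONLY = re.compile(r'(\s|^)(\d+)(\s)')


def add_minutes(text):
    """Insert ':00' after every number that is directly followed by whitespace."""
    return HOUR_ONLY.sub(r'\1\2:00\3', text)


# Build a new nested list instead of assigning by index into the old one.
res = [[label, add_minutes(times)] for label, times in mylist]

for row in res:
    print(row)
```

Building a new list with a comprehension avoids the index assignment that raised the IndexError in the question. One limitation of this pattern: it requires whitespace after the number, so an hour-only value at the very end of a string (for example "... - 9") would be left unchanged.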


3 Comments

BBG Over a year ago

Thanks @sortas. I'll try your solution. Great.

BBG Over a year ago

Thanks @sortas. I solved my issue based on your solution. I have another question: using "re", how can I add a '0' before one-digit numbers (6, 2, 5, 9:30)? For example: 06, 02, 05.

sortas Over a year ago

You can check for a one-digit number by using \d. Because there is no +, it matches exactly one digit. So (\s|^)(\d)(\s|:) means "whitespace or the start of the string, then a single digit, then whitespace or a colon". Update the second argument, the replacement, and you're good to go.
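
A minimal sketch of the zero-padding idea from this comment, assuming the third group is meant to be "whitespace or a colon". The function name pad_hours is made up for this illustration. Note that the replacement uses \g<1>0 rather than \10, because \10 would be read as a reference to group 10.

```python
import re


def pad_hours(text):
    """Prefix a '0' to single-digit hours such as '6' or '9:30'."""
    # (\s|^)  start of the string or whitespace before the hour
    # (\d)    exactly one digit
    # (\s|:)  whitespace or a colon right after it
    return re.sub(r'(\s|^)(\d)(\s|:)', r'\g<1>0\2\3', text)


print(pad_hours('6:00 - 9:30 AM - 10:30 AM - 2:00 PM'))
print(pad_hours('6 - 9:30 AM - 2 PM'))
```

Two-digit hours such as 10:30 are left alone because the single \d must be followed immediately by whitespace or a colon.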

Dharman · Accepted Answer · 2020-08-01 14:33:21Z

Another way is to write a function that hides the complexity of the task, and to apply it to each time value in your input list. Here is a solution:

Your input list, to which I have added the missing single quotes:

mylist = [['x', '6 - 9:30 AM - 10:30 AM - 2 PM - 5 PM - 9 PM'], ['y', '7:30 AM - 2:30 PM, 7:30 AM - 2:30 PM, 7:30 AM - 1:30 PM']]

Define a function f() that parses each input value into HH:MM (assuming the values are separated by either a comma or a dash):

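import re

def f(time):
    # Collect the digit groups, e.g. ['9', '30'] for '9:30 AM' or ['6'] for '6'
    t = re.findall(r'\d+', time)
    # Keep the AM/PM marker if the value has one (with a leading space)
    suffix = ""
    if "AM" in time:
        suffix = " AM"
    elif "PM" in time:
        suffix = " PM"
    # Hours and minutes are both present
    if len(t) > 1:
        return ':'.join(t) + suffix
    # Hours only: append zero minutes
    return t[0] + ":00" + suffix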

It extracts the digits from each input value with a regular expression, joins them as hours and minutes, and finally appends the matching suffix (empty, AM, or PM, according to the requirements).

As an example, this prints your values:

for ls in mylist:
    ls = re.split('-|,', ls[1])
    print([f(x) for x in ls])
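
If you want to keep the results rather than print them, the same idea can be sketched as a self-contained script that builds a new nested list. The names to_hhmm and normalize are made up for this illustration, and the output shape (a list of time strings per row) is an assumption: the original dash and comma separators are not preserved.

```python
import re

mylist = [
    ['x', '6 - 9:30 AM - 10:30 AM - 2 PM - 5 PM - 9 PM'],
    ['y', '7:30 AM - 2:30 PM, 7:30 AM - 2:30 PM, 7:30 AM - 1:30 PM'],
]


def to_hhmm(value):
    """Format one value such as '6' or '2 PM' as 'H:MM' plus an optional AM/PM."""
    digits = re.findall(r'\d+', value)
    if not digits:
        # Nothing that looks like a time: return the text unchanged.
        return value.strip()
    hours = digits[0]
    minutes = digits[1] if len(digits) > 1 else '00'
    marker = re.search(r'\b(AM|PM)\b', value)
    suffix = ' ' + marker.group(1) if marker else ''
    return f'{hours}:{minutes}{suffix}'


def normalize(times):
    """Split a string on dashes or commas and format every part."""
    parts = re.split(r'\s*[-,]\s*', times)
    return [to_hhmm(part) for part in parts]


res = [[label, normalize(times)] for label, times in mylist]

for row in res:
    print(row)
```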
