13

I have the following code:

def extract_table_date(bucket_path: str) -> str:
    event_date = re.search(r"date=([^/]+)", bucket_path)
    return event_date.group(1)[0:10].replace("-", "")

mypy throws error on the last line:

Item "None" of "Optional[Match[str]]" has no attribute "group"

I think I can solve that by assigning a type to event_date, and I can:

from typing import Match

def extract_table_date(bucket_path: str) -> str:
    event_date: Match = re.search(r"date=([^/]+)", bucket_path)
    return event_date.group(1)[0:10].replace("-", "")

but mypy now throws another error on the first line of the function:

Incompatible types in assignment (expression has type "Optional[Match[Any]]", variable has type "Match[Any]")

I don't really know how to inform mypy that the result won't be optional but nonetheless I followed the advice at Optional types and the None type by adding an assert:

from typing import Match

def extract_table_date(bucket_path: str) -> str:
    assert bucket_path is not None
    event_date: Match = re.search(r"date=([^/]+)", bucket_path)
    return event_date.group(1)[0:10].replace("-", "")

but mypy still raises the same error.

I try to fix by changing the type defined for event_date:

from typing import Match, optional, Any

def extract_table_date(bucket_path: str) -> str:
    assert bucket_path is not None
    event_date: Optional[Match[Any]] = re.search(r"date=([^/]+)", bucket_path)
    return event_date.group(1)[0:10].replace("-", "")

but (as expected) I'm now back to almost the same original error:

Item "None" of "Optional[Match[Any]]" has no attribute "group"

Any advice on how to fix this?

2 Answers 2

21

The thing that's Optional is event_date, because re.search is not guaranteed to return a match. mypy is warning you that this will raise an AttributeError if that's the case. You can tell it "no, I'm very confident that will not be the case" by doing an assert to that effect:

def extract_table_date(bucket_path: str) -> str:
    event_date = re.search(r"date=([^/]+)", bucket_path)
    assert event_date is not None
    return event_date.group(1)[0:10].replace("-", "")

If you're wrong, this code will still raise an exception (AssertionError, because your assert will fail), but mypy will no longer error because there is now no way for event_date to be None when you access its group attribute.

Note that there is no need to assert on bucket_path because it's already explicitly typed as str.

Sign up to request clarification or add additional context in comments.

9 Comments

Then you'd have to type the return as Optional[str], which may not be what is desired. Returning None when a str is expected is likely going to lead to another exception somewhere higher up the stack -- if you have a programming error, it's better to fail fast and loud than slowly and/or silently since it gives you a more useful stack trace to work with when you're debugging. :)
Another option may be to return an empty string. That way the result is consistent and document the meaning of the empty str. Though I prefer Optional[str] if this is a simple code.
The "correct" thing to do here for an API that's being used by other people is likely to have it raise a specific exception (I'd probably use ValueError) if the string doesn't have the expected content, and document the interface as "returns str, raises ValueError if there was no date in bucket_path". (This is similar to how int("foo") raises ValueError because the value of its parameter isn't a valid number.) But for a private function where you're less worried about having a super clean API, assert is quick and easy because it accomplishes mostly the same thing with one line of code.
We actually do raise an error if the value doesn't have the expected content but I left that out of the question for reasons of brevity
@jamiet If you'd left that in I could have given you a better answer! :) I assume your code is doing a whole separate regex to catch that case -- instead I'd do that checking by way of if event_date is not None: raise (whatever) after your main re.search call. That should satisfy mypy without needing the extra assert and it provides a pretty strong guarantee that your error checking is catching all the cases.
|
-2

Another possibility would be to use isinstance.

def extract_table_date(bucket_path: str) -> str:
    event_date = re.search(r"date=([^/]+)", bucket_path)
    if isinstance(event_date, str):
        return event_date.group(1)[0:10].replace("-", "")
    return ""

Here you would return "" instead of None.

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.