\$\begingroup\$

How could I improve the following code? This part was easy to implement, but there is a lot of redundancy, and I use pandas only to return a list of dicts, which seems quite odd.

```python
import pandas as pd


def pipeline_place_details(place_id, fields, lang):
    """Return the reviews of a place as a list of dicts (one per review)

    Args:
        place_id ([string]): Id retrieved from google api
        fields ([type]): Field retrieved from json output
        lang ([string]): Language code passed to the google api
    """
    # NOTE: these two assignments override the arguments passed by the caller
    fields = ['name', 'formatted_address', 'international_phone_number', 'website', 'rating', 'review']
    lang = 'fr'
    # details will give us a dict which is the result of serialized json returned from google api
    # (get_place_details is defined elsewhere in the project)
    details = get_place_details(place_id, fields, lang)
    try:
        website = details['result']['website']
    except KeyError:
        website = ""

    try:
        address = details['result']['formatted_address']
    except KeyError:
        address = ""

    try:
        phone_number = details['result']['international_phone_number']
    except KeyError:
        phone_number = ""

    try:
        reviews = details['result']['reviews']
    except KeyError:
        reviews = []
    rev_temp = []
    for review in reviews:
        author_name = review['author_name']
        user_rating = review['rating']
        text = review['text']
        time = review['relative_time_description']
        rev_temp.append((author_name, user_rating, text, time))
    rev_temp_2 = pd.DataFrame(rev_temp, columns=['author_name', 'rating', 'text', 'relative_time'])
    # Attach the place-level fields to every review row
    rev_temp_2['place_id'] = place_id
    rev_temp_2['address'] = address
    rev_temp_2['phone_number'] = phone_number
    rev_temp_2['website'] = website

    return rev_temp_2.to_dict('records')
```
\$\endgroup\$

2 Answers

\$\begingroup\$

1. Getting data from dict

When details is a dict, instead of writing:

try:
    address = details['result']['formatted_address']
except KeyError:
    address = ""

you can do:

address = details.get('result', {}).get('formatted_address', '')

The second parameter of .get is the default value, which is returned when the dictionary doesn't contain the requested key.
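As a quick illustration of the chained .get pattern, here is a minimal sketch. The sample dictionaries are made up for the example and only imitate the shape of the details dict from the question.

```python
# Hypothetical response, shaped like the `details` dict in the question.
details = {"result": {"formatted_address": "1 Rue de Rivoli, Paris"}}

# Key present: the stored value is returned.
address = details.get("result", {}).get("formatted_address", "")

# Inner key absent: the default of the second .get is returned.
website = details.get("result", {}).get("website", "")

# Outer key absent: the first .get returns {}, so the second .get still works.
empty_details = {}
missing = empty_details.get("result", {}).get("formatted_address", "")

print(repr(address), repr(website), repr(missing))
```

Note that the default only applies when the key is absent. If a key is present with the value None, .get returns None, and a chained .get on it would raise AttributeError.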

2. Modularization

The function pipeline_place_details is not short, so it might be a good idea to break it up into smaller functions. Whenever you write a loop, it's worth considering whether moving the loop body into a separate function would make the code more readable.

For example, this part:

    author_name = review['author_name']
    user_rating = review['rating']
    text = review['text']
    time = review['relative_time_description']
    rev_temp.append((author_name, user_rating, text, time))

can easily be extracted into a new function:

def process_review(review):
    return (
        review['author_name'],
        review['rating'],
        review['text'],
        review['relative_time_description']
    )

which you can then use like this:

rev_temp = [process_review(review) for review in reviews]
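Putting the two suggestions together, the whole result could probably be built without pandas. The following is only a sketch under assumptions: build_review_records and REVIEW_COLUMNS are hypothetical names, the sample dict is invented, and the output keys simply mirror the columns used in the question.

```python
def process_review(review):
    return (
        review['author_name'],
        review['rating'],
        review['text'],
        review['relative_time_description'],
    )


# Output keys, in the same order as the tuple returned by process_review.
REVIEW_COLUMNS = ('author_name', 'rating', 'text', 'relative_time')


def build_review_records(details, place_id):
    """Hypothetical helper: one dict per review, plus the place-level fields."""
    result = details.get('result', {})
    place_info = {
        'place_id': place_id,
        'address': result.get('formatted_address', ''),
        'phone_number': result.get('international_phone_number', ''),
        'website': result.get('website', ''),
    }
    records = []
    for review in result.get('reviews', []):
        record = dict(zip(REVIEW_COLUMNS, process_review(review)))
        record.update(place_info)
        records.append(record)
    return records


# Invented sample data, shaped like the API response described in the question.
sample_details = {
    'result': {
        'formatted_address': '1 Rue de Rivoli, Paris',
        'reviews': [
            {
                'author_name': 'Ann',
                'rating': 5,
                'text': 'Great place',
                'relative_time_description': 'a week ago',
            },
        ],
    },
}
records = build_review_records(sample_details, 'abc123')
print(records)
```

The return value is already a list of dicts, which is the same shape that to_dict('records') produces, so the DataFrame round trip is not needed for this step.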

3. Variables naming

Good code should not only work and be efficient, it should also be easy to read and understand. So it's a good rule to use meaningful variable names. I believe you can find better names than, for example, rev_temp or rev_temp_2. ;)

\$\endgroup\$
\$\begingroup\$

I worked on the code a bit and suggest the following (I could still encapsulate some parts, as trivel suggests). What do you think?

I now work directly with dictionaries and avoid the redundancy by moving the job of the try/except blocks into a get_from_dict function. It looks better (but perhaps runs more slowly...).

```python
from functools import reduce


def get_from_dict(dic, keys, default=""):
    """Follow a path of keys through nested dicts; return default if a key is missing"""
    # The isinstance guard avoids calling .get on the default value
    # when an intermediate key is missing.
    return reduce(lambda d, k: d.get(k, default) if isinstance(d, dict) else default, keys, dic)


def parse_place_details(place_id):
    fields = ['name', 'formatted_address', 'rating', 'review', 'geometry']
    # get_place_details is defined elsewhere in the project
    details = get_place_details(place_id, fields, 'fr')
    core_keys = {'address': ('result', 'formatted_address'),
                 'name': ('result', 'name'),
                 'longitude': ('result', 'geometry', 'location', 'lng'),
                 'latitude': ('result', 'geometry', 'location', 'lat')}
    review_keys = {'author_name': ('author_name',),
                   'user_rating': ('rating',),
                   'text': ('text',),
                   'time': ('relative_time_description',)}
    results = []
    core = {'place_id': place_id}
    for k, path in core_keys.items():
        core[k] = get_from_dict(details, path)
    reviews = get_from_dict(details, ('result', 'reviews'), default=[])
    for review in reviews:
        # One record per review, completed with the place-level fields
        descr = {k: get_from_dict(review, rkey) for k, rkey in review_keys.items()}
        descr.update(core)
        results.append(descr)
    return results
```
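One thing to watch with a reduce-based lookup is what happens when an intermediate key is missing: the next step receives the default value rather than a dict, so it has to cope with that. Here is a minimal, self-contained sketch of a lookup that returns the default at any depth. The name get_nested and the sample data are hypothetical, chosen only for this illustration.

```python
from functools import reduce


def get_nested(dic, keys, default=""):
    """Hypothetical lookup: return default when any key on the path is missing."""
    def step(d, k):
        # Only dicts can be descended into; anything else means the path is broken.
        return d.get(k, default) if isinstance(d, dict) else default
    return reduce(step, keys, dic)


# Invented sample data, shaped like the API response described in the question.
sample = {
    'result': {
        'name': 'Cafe X',
        'geometry': {'location': {'lat': 48.85, 'lng': 2.35}},
    },
}

lat = get_nested(sample, ('result', 'geometry', 'location', 'lat'))
# 'geometry' is missing here, so the remaining keys fall back to the default.
no_geometry = get_nested({'result': {'name': 'Cafe X'}},
                         ('result', 'geometry', 'location', 'lat'))
# A list default keeps a later "for review in reviews" loop working.
no_reviews = get_nested(sample, ('result', 'reviews'), default=[])

print(lat, repr(no_geometry), no_reviews)
```

Without the isinstance check, a plain d.get(k, "") inside reduce would call .get on the empty string as soon as an intermediate key is missing, which raises AttributeError.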
\$\endgroup\$
