I have two lists like this,

a=[['a', 'b', 'c'], ['b', 'c'], ['a', 'd'], ['x']]
b=[[1, 2, 3], [4,5], [6,7], [8]]

(a and b always have the same size)

Now I want to create two lists: the unique elements of a, and the sum of the corresponding values in b for each unique element. The final lists should look like this:

 a=['a', 'b', 'c', 'd', 'x']
 b=[7, 6, 8, 7, 8] (sum of all a, b, c, d and x)

I could do this with a for loop, but I am looking for a more efficient way to reduce execution time.

  • The obvious way, using for loops with a temporary dictionary, is asymptotically optimal; anything else can only be more efficient by a constant factor, and most likely will be less efficient. I suggest writing it with for loops, and then, if your program is not fast enough, profiling it to see where the bottleneck actually is. Any significantly faster implementation will likely need to use vectorization (e.g. numpy). Commented Feb 22, 2021 at 12:50
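As a rough sketch of that advice, the loop can be wrapped in a function and timed before deciding whether anything needs optimizing. The function name sum_by_key and the repetition count are illustrative assumptions, not part of the question:

```python
import timeit

def sum_by_key(keys, values):
    """Sum values per key with a temporary dict, keeping first-seen key order."""
    totals = {}
    for key_row, value_row in zip(keys, values):
        for key, value in zip(key_row, value_row):
            totals[key] = totals.get(key, 0) + value
    return list(totals), list(totals.values())

a = [['a', 'b', 'c'], ['b', 'c'], ['a', 'd'], ['x']]
b = [[1, 2, 3], [4, 5], [6, 7], [8]]

elts, sums = sum_by_key(a, b)
print(elts)  # ['a', 'b', 'c', 'd', 'x']
print(sums)  # [7, 6, 8, 7, 8]

# Time repeated calls; the figure depends on the machine, so none is shown here.
seconds = timeit.timeit(lambda: sum_by_key(a, b), number=10_000)
print(f"{seconds:.4f} s for 10,000 calls")
```

On realistic input sizes, the same timeit call can be pointed at any alternative implementation to compare them on equal terms.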

5 Answers


Not so pythonic but will do the job:

a=[['a', 'b', 'c'], ['b', 'c'], ['a', 'd'], ['x']]
b=[[1, 2, 3], [4,5], [6,7], [8]]

mapn = dict()
for elt1, elt2 in zip(a, b):
    for e1, e2 in zip(elt1, elt2):
        mapn[e1] = mapn.get(e1, 0) + e2

elts = list(mapn.keys())      # unique elements, in first-seen order
counts = list(mapn.values())  # matching sums

print(mapn)
print(elts)
print(counts)

You can use zip and collections.Counter along the following lines:

from collections import Counter

c = Counter()
for la, lb in zip(a, b):
    for xa, xb in zip(la, lb):
        c[xa] += xb
 
list(c.keys())
# ['a', 'b', 'c', 'd', 'x']
list(c.values())
# [7, 6, 8, 7, 8]


Here are some ideas.

First, here are your lists, which need to be flattened:

a=[['a', 'b', 'c'], ['b', 'c'], ['a', 'd'], ['x']]
b=[[1, 2, 3], [4,5], [6,7], [8]]

To get the unique elements, you can flatten the sublists into a set:

A = set([item for sublist in a for item in sublist])
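Note that a set does not keep the order of first appearance. A minimal sketch, assuming plain Python lists as in the question, that flattens with itertools.chain.from_iterable and deduplicates with dict.fromkeys so that order is preserved (the names flat_a, flat_b and unique_a are illustrative):

```python
from itertools import chain

a = [['a', 'b', 'c'], ['b', 'c'], ['a', 'd'], ['x']]
b = [[1, 2, 3], [4, 5], [6, 7], [8]]

# Flatten one level of nesting.
flat_a = list(chain.from_iterable(a))
flat_b = list(chain.from_iterable(b))

# dict keys are unique and keep insertion order (Python 3.7+).
unique_a = list(dict.fromkeys(flat_a))

print(flat_a)    # ['a', 'b', 'c', 'b', 'c', 'a', 'd', 'x']
print(flat_b)    # [1, 2, 3, 4, 5, 6, 7, 8]
print(unique_a)  # ['a', 'b', 'c', 'd', 'x']
```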

But what I would do first (perhaps not the most efficient way) is:

import pandas as pd
import numpy as np
LIST1 = [item for sublist in a for item in sublist]
LIST2 = [item for sublist in b for item in sublist]
df = pd.DataFrame({'a':LIST1,'b':LIST2})
df.groupby(df.a).sum()

OUTPUT:

   b
a
a  7
b  6
c  8
d  7
x  8


At the end of the day, you're going to have to use two for loops. I have a one-liner solution using zip and Counter.

The first solution works only in this specific case, where all the strings are a single character (and all the numbers are non-negative integers), because it creates a string with the right number of each letter and then gets the frequency of each letter.

from collections import Counter

a = [['a', 'b', 'c'], ['b', 'c'], ['a', 'd'], ['x']]
b = [[1, 2, 3], [4,5], [6,7], [8]]

a, b = zip(*Counter(''.join(x*y for al, bl in zip(a, b) for x, y in zip(al, bl))).items())
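To see why the single-character restriction matters, here is a small sketch with a hypothetical input (not from the question) containing a two-character key. The joined string is counted character by character, so the key 'ab' is split into 'a' and 'b':

```python
from collections import Counter

# Hypothetical input with a multi-character key, for illustration only.
a = [['ab', 'c'], ['ab']]
b = [[2, 1], [3]]

# Same construction as the one-liner above, split into steps.
joined = ''.join(x * y for al, bl in zip(a, b) for x, y in zip(al, bl))
per_char = Counter(joined)

print(joined)         # ababcababab
print(per_char['a'])  # 5
print(per_char['b'])  # 5
print(per_char['c'])  # 1
print('ab' in per_char)  # False: the intended key 'ab' (total 5) is lost
```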

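For the more general case, starting again from the original a and b (the line above rebinds them), you can do:

a, b = zip(*sum((Counter({x: y}) for al, bl in zip(a, b) for x, y in zip(al, bl)), Counter()).items())

This builds one single-entry Counter per letter-number pair and adds them up, so repeated letters accumulate instead of overwriting each other. Counter addition keeps only positive totals, so it assumes the sums are positive.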


You can combine the lists and their internal lists using zip(), then feed the list of tuples to a dictionary constructor to get a list of dictionaries with values for each letter. Then convert those dictionaries to Counter and add them up.

a = [['a', 'b', 'c'], ['b', 'c'], ['a', 'd'], ['x']]
b = [[ 1,   2,   3 ], [ 4,   5 ], [ 6,   7 ], [ 8 ]]

from collections import Counter
from itertools import starmap

mapn        = sum(map(Counter,map(dict,starmap(zip,zip(a,b)))),Counter())
elts,counts = map(list,zip(*mapn.items()))

print(mapn)   # Counter({'c': 8, 'x': 8, 'a': 7, 'd': 7, 'b': 6})    
print(elts)   # ['a', 'b', 'c', 'd', 'x']    
print(counts) # [ 7,   6,   8,   7,   8]

Detailed explanation:

  • zip(a,b) combines the lists into pairs of sublists, e.g. (['a','b','c'],[1,2,3]), ...
  • starmap(zip,...) takes these list pairs and merges them together into sublists of letter-number pairs: [('a',1),('b',2),('c',3)], ...
  • Each of these lists of pairs is converted to a dictionary by map(dict,...) and then into a Counter object by map(Counter,...)
  • We end up with a sequence of Counter objects, one for each pair of sublists. Applying sum(...,Counter()) computes the totals for each letter into a single Counter object.
  • Apart from being a Counter object, mapn is exactly the same as the dictionary that you produced.
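The steps above can be materialized one at a time to inspect the intermediate values. This is only an illustrative unrolling of the one-liner; the names pairs, zipped, dicts, counters and total are assumptions introduced here:

```python
from collections import Counter
from itertools import starmap

a = [['a', 'b', 'c'], ['b', 'c'], ['a', 'd'], ['x']]
b = [[1, 2, 3], [4, 5], [6, 7], [8]]

# Step 1: pair each sublist of a with the matching sublist of b.
pairs = list(zip(a, b))
print(pairs[0])    # (['a', 'b', 'c'], [1, 2, 3])

# Step 2: zip each pair of sublists into letter-number tuples.
zipped = [list(z) for z in starmap(zip, pairs)]
print(zipped[0])   # [('a', 1), ('b', 2), ('c', 3)]

# Step 3: one dict per sublist, then one Counter per dict.
dicts = list(map(dict, zipped))
counters = list(map(Counter, dicts))
print(dicts[1])    # {'b': 4, 'c': 5}

# Step 4: add all the Counters together, starting from an empty one.
total = sum(counters, Counter())
print(dict(total)) # {'a': 7, 'b': 6, 'c': 8, 'd': 7, 'x': 8}
```

Two caveats follow from how these building blocks behave: dict() keeps only the last value if a letter repeats within a single sublist, and Counter addition discards keys whose total is zero or negative.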
