Problem
Given the following data:
[
  {
    "users": [
      {
        "id": "07bde76f-aff0-407d-9241-a12b323d4af8",
        "transactions": [
          { "category": "purchase" },
          { "category": "unknown" }
        ]
      },
      {
        "id": "40aa040f-7961-4e06-a31b-32052be67fcc",
        "transactions": [
          { "category": "sale" },
          { "category": "unknown" }
        ]
      }
    ],
    "groups": [
      {
        "id": "00c61181-b133-4be9-9d44-dc3c224b3beb",
        "transactions": [
          { "category": "atm" },
          { "category": "cash" }
        ]
      },
      {
        "id": "eb959ff1-da1d-41e5-b5b7-45fef3dbc2df",
        "transactions": [
          { "category": "atm" },
          { "category": "cash" }
        ]
      }
    ]
  },
  {
    "users": [
      {
        "id": "af095f1c-fe43-43fb-9571-dabe2dd56bcf",
        "transactions": [
          { "category": "bill" }
        ]
      }
    ],
    "groups": [
      {
        "id": "c5bafe16-c5ec-428e-8c7c-30cbd9963750",
        "transactions": [
          { "category": "fee" },
          { "category": "cash" }
        ]
      }
    ]
  }
]
... I want to produce the following output:
{
  "groups_atm": 2,
  "groups_fee": 1,
  "groups_cash": 3,
  "users_purchase": 1,
  "users_unknown": 2,
  "users_bill": 1,
  "users_sale": 1
}
Implementation
I've approached this by first mapping over transactions and summing their occurrences:
// Adds each transaction's category to the running tally.
const sum = (tally, transactions) =>
  transactions
    .map((transaction) => transaction.category)
    .reduce((acc, transactionCategory) => {
      return {
        ...acc,
        [transactionCategory]: (acc[transactionCategory] || 0) + 1,
      };
    }, tally);
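As a quick check of this step, the sketch below applies `sum` to the first user's transactions from the sample data. It assumes `sum` takes the running tally as its first argument, which is how the next step calls it; `firstUserTransactions` and `counts` are names introduced only for illustration.

```javascript
// Assumed signature: (tally, transactions) -> new tally with updated counts.
const sum = (tally, transactions) =>
  transactions.reduce(
    (acc, { category }) => ({ ...acc, [category]: (acc[category] || 0) + 1 }),
    tally,
  );

// Illustrative input: the first user's transactions from the sample data.
const firstUserTransactions = [{ category: 'purchase' }, { category: 'unknown' }];

// Start from a tally that already holds one "unknown".
const counts = sum({ unknown: 1 }, firstUserTransactions);
console.log(counts); // { unknown: 2, purchase: 1 }
```

Because every step spreads the accumulator into a new object, the input tally is never mutated, at the cost of one copy per transaction.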
... then aggregating by scope ("users", "groups") per element of the data list and merging the counts by category:
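const aggregate = (datum, tally) =>
  // Plural scope names match both the data keys and the expected output keys.
  ['users', 'groups'].reduce((acc, scope) => {
    // Count categories across every entity in this scope of the current element.
    const aggregates = datum[scope].reduce(
      (agg, data) => sum(agg, data.transactions),
      {},
    );
    return {
      ...acc,
      [scope]: acc[scope] ? merge(acc[scope], aggregates) : aggregates,
    };
  }, tally);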
const difference = (arrA, arrB) => arrA.filter((x) => !arrB.includes(x));
const intersection = (arrA, arrB) => arrA.filter((x) => arrB.includes(x));
const merge = (objA, objB) => {
  const acc = {};
  const aKeys = Object.keys(objA);
  const bKeys = Object.keys(objB);
  intersection(aKeys, bKeys).forEach((key) => (acc[key] = objA[key] + objB[key]));
  difference(aKeys, bKeys).forEach((key) => (acc[key] = objA[key]));
  difference(bKeys, aKeys).forEach((key) => (acc[key] = objB[key]));
  return acc;
};
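For comparison, the same key-wise addition can be sketched without the set helpers. This is only an illustration, not a drop-in recommendation: `mergeCounts` is a hypothetical name, and the inputs are the group counts one would expect from the two elements of the sample data.

```javascript
// Hypothetical alternative to `merge`: add objB's counts onto a copy of objA.
const mergeCounts = (objA, objB) =>
  Object.entries(objB).reduce(
    (acc, [key, count]) => ({ ...acc, [key]: (acc[key] || 0) + count }),
    { ...objA },
  );

const firstGroups = { atm: 2, cash: 2 }; // group counts from the first element
const secondGroups = { fee: 1, cash: 1 }; // group counts from the second element

const mergedGroups = mergeCounts(firstGroups, secondGroups);
console.log(mergedGroups); // { atm: 2, cash: 3, fee: 1 }
```

Keys present in only one object are carried over unchanged, and shared keys are summed, which is the behaviour the intersection/difference version spells out in three passes.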
... then re-reducing over the whole dataset:
const aggregates = data.reduce((acc, datum) => aggregate(datum, acc), {});
... and finally reformatting the aggregates to match the expected output:
const format = (aggregates) =>
  Object.keys(aggregates).reduce((acc, scope) => {
    Object.keys(aggregates[scope]).forEach((category) => {
      acc[`${scope}_${category}`] = aggregates[scope][category];
    });
    return acc;
  }, {});
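To make the intermediate shape concrete, the sketch below flattens a small nested tally into `scope_category` keys. It assumes the scopes are keyed by their plural names ("users", "groups"), as in the expected output; `formatEntries` and `nested` are hypothetical names, and `Object.fromEntries` and `flatMap` require ES2019 rather than ES6.

```javascript
// Hypothetical variant of `format`: flatten { scope: { category: count } }
// into { scope_category: count } using entries instead of nested loops.
const formatEntries = (aggregates) =>
  Object.fromEntries(
    Object.entries(aggregates).flatMap(([scope, counts]) =>
      Object.entries(counts).map(([category, count]) => [`${scope}_${category}`, count]),
    ),
  );

// Illustrative nested tally, matching the second element of the sample data.
const nested = { users: { bill: 1 }, groups: { fee: 1, cash: 1 } };

const flat = formatEntries(nested);
console.log(flat); // { users_bill: 1, groups_fee: 1, groups_cash: 1 }
```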
Questions
- what are alternative ways of breaking down the problem?
- what is its Big O complexity? Can it be reduced?
- can `merge` be avoided?
- are there language features (JS/ES6) that can make this more idiomatic?
- is "aggregation" the correct terminology?