Questions tagged [aggregation]
47 questions
6
votes
2
answers
84
views
How to handle irrelevant categorical variables in aggregated data?
I’m working with ad server data where I can’t get user-level data — only aggregated reports. The data is aggregated on multiple categorical dimensions (e.g., day × product × medium × source × campaign ...
7
votes
1
answer
70
views
Suitable method to disaggregate time-series from national level to regional level?
I have data for five variables (in the form of time-series) that are reported at both national and regional levels. The response variable (also a time-series) is only reported at the national level- ...
2
votes
0
answers
25
views
Combining Pearson/Spearman coefficients from different experiments
In my research setup I have multiple experiments where I calculate Pearson/Spearman coefficients between model predictions and ground truth, and I want a way to aggregate these values. I have tried to ...
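One commonly used way to pool Pearson coefficients across experiments is to average them on the Fisher z scale rather than averaging the raw values. The sketch below is an illustration only, not the asker's method: the function name `combine_correlations` and the choice to weight each experiment by `n - 3` are assumptions, and for Spearman coefficients the same transform is at best an approximation.

```python
import math


def combine_correlations(rs, ns):
    """Pool correlation coefficients via the Fisher z-transform.

    rs: correlation coefficients, each strictly between -1 and 1.
    ns: sample size of each experiment (each must be > 3).
    Returns the pooled coefficient, weighting experiment i by n_i - 3.
    """
    if len(rs) != len(ns) or not rs:
        raise ValueError("rs and ns must be non-empty and of equal length")
    weights = [n - 3 for n in ns]
    if any(w <= 0 for w in weights):
        raise ValueError("every sample size must be greater than 3")
    # atanh is the Fisher z-transform; it raises ValueError for |r| >= 1.
    zs = [math.atanh(r) for r in rs]
    z_bar = sum(w * z for w, z in zip(weights, zs)) / sum(weights)
    # tanh maps the averaged z value back to the correlation scale.
    return math.tanh(z_bar)


pooled = combine_correlations([0.2, 0.8], [30, 50])
```

Larger experiments pull the pooled value toward their own coefficient, which is usually the intended behaviour when sample sizes differ.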
1
vote
1
answer
92
views
How to aggregate classes for higher overall accuracy?
I have trained a classifier on a dataset that comprises a large number of classes. Some classes are easy to predict, whereas others are frequently misclassified.
I would like to aggregate the classes ...
1
vote
1
answer
81
views
How to impute and aggregate data with ID variant variables for predictive modeling?
I have a dataset that looks like so
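| ID | var_id_invariant_1 | ... | var_id_invariant_p | var_id_variant_1 | ... | var_id_variant_k | target |
|---|---|---|---|---|---|---|---|
| 315 | 25 | ... | a | 2.4 | ... | A | 1 |
| 246 | 31 | ... | nan | 5.7 | ... | B | 0 |
| 315 | 25 | ... | a | 9.4 | ... | ... |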
1
vote
2
answers
2k
views
Learning from aggregated data
Online and in the literature there seems to be a general consensus that training a machine learning model using aggregated data is harder and/or fundamentally different from training on raw event data....
0
votes
1
answer
90
views
How to aggregate the metrics from two different regression problems?
I'm about to conduct some tests to compare two solutions to regression problems. And to make the results more robust, I want to apply both on a few different datasets (all problems will be a ...
1
vote
0
answers
49
views
Optimizing Rank Aggregation of Two Different Methods in Information Retrieval
As the title suggests, I would like to train a rank-aggregating model. My target problem is to rank text2s from a database as best as possible to a given query, <...
0
votes
1
answer
62
views
How to aggregate qualitative results from a simulation
I am looking for methods to aggregate and average qualitative data that are output by a simulation.
There are 20 qualitative measures, each divided unevenly into 4 cycles ...
2
votes
0
answers
23
views
Learning the Average of a 0/1 Dependent Variable
Suppose I have a matrix 𝑋 and a dependent vector 𝑦 whose entries are each in {0,1}, dependent on the corresponding row of 𝑋.
Given this dataset, I'd like to learn a model, so that given some other ...
0
votes
1
answer
276
views
How can we predict a value after several rows of data?
I have a regression problem in which each week has several rows, and the number of rows varies by week (e.g., one week might have 1800 rows and another might have 5000).
My target is to predict a value at end of ...
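When each week contributes a different number of rows, one possible starting point is to collapse every week into a fixed-length summary so that each week becomes a single training example. The sketch below only illustrates that idea: the `(week, value)` row layout, the function name `summarize_by_week`, and the chosen statistics are all assumptions, since the excerpt does not describe the actual columns.

```python
from collections import defaultdict
from statistics import mean


def summarize_by_week(rows):
    """Collapse (week, value) rows into one summary dict per week.

    The number of rows per week may vary; every week still yields the
    same four summary features.
    """
    groups = defaultdict(list)
    for week, value in rows:
        groups[week].append(value)
    return {
        week: {
            "count": len(values),
            "mean": mean(values),
            "min": min(values),
            "max": max(values),
        }
        for week, values in sorted(groups.items())
    }


# Hypothetical rows: week 1 has two observations, week 2 has one.
summary = summarize_by_week([(1, 2.0), (1, 4.0), (2, 10.0)])
```

Keeping the row count as a feature preserves some of the information that is lost when weeks of very different sizes are reduced to the same shape.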
0
votes
1
answer
365
views
Tableau: keeping results independent of view / filter
I am using Tableau Desktop 2021.1.4.
Suppose that my source sales data consists of 4 columns: Region (dimension with values: N, E, W, S), Type (dimension with values: Furniture, Electronics, Appliances), ...
2
votes
1
answer
96
views
Labeling and aggregating features issue
I am trying to build a simple binary classifier (some tree-based algorithm for now), and my training data will have features aggregated at the user level, so I'll have a unique record for each user. These ...
0
votes
1
answer
39
views
Concatenating Data in two years
I have to use a machine learning model to predict electricity consumption and carbon emissions based on some buildings' features (area, year of construction, ...). Here is the link to the data.
The ...
-1
votes
1
answer
101
views
How to aggregate data inserted by users to avoid outliers?
I'm developing a new application based on machine learning. In this application users can insert new data to improve the prediction system.
As you may guess, users could insert data that doesn't make ...
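A simple way to keep implausible user submissions from dominating an aggregate is to filter them with a robust rule before averaging. The sketch below uses the modified z-score based on the median absolute deviation (MAD). It is one possible approach, not something taken from the question: the function name `robust_aggregate`, the 3.5 cutoff, and the example values are assumptions.

```python
from statistics import median


def robust_aggregate(values, threshold=3.5):
    """Average the values after dropping MAD-based outliers.

    A value is dropped when its modified z-score,
    0.6745 * |v - median| / MAD, exceeds `threshold`.
    """
    values = list(values)
    if not values:
        raise ValueError("values must not be empty")
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # At least half of the values equal the median, so the
        # modified z-score is undefined; fall back to the median.
        return med
    kept = [v for v in values if 0.6745 * abs(v - med) / mad <= threshold]
    return sum(kept) / len(kept)


# Hypothetical submissions where one user entered 1000 by mistake.
result = robust_aggregate([10, 11, 9, 10, 12, 1000])
```

Because both the median and the MAD ignore the magnitude of extreme values, a single bad entry cannot shift the cutoff the way it would with a mean and standard deviation.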