Questions tagged [dataset]

A dataset is a collection of data, often in tabular or matrix form. This tag is NOT intended for data requests ("where can I find a dataset about ..."); for those, see OpenData.

0 votes
1 answer
93 views

I am working on a research project aimed at classifying babies' cries based on their needs. However, I have encountered difficulties in obtaining a suitable cry dataset. The only dataset I was able to ...
0 votes
1 answer
692 views

I have a data set that boils down to three columns: 1. Supplier name 2. Number of transactions with supplier 3. Total value of those transactions. I'm trying to find the best way to rank all suppliers ...
1 vote
1 answer
99 views

I have a bunch of projects for my job that are largely unrelated except they use the same data, which is pretty big on disk in csv format. I want these to exist separately from each other and I ...
0 votes
1 answer
386 views

I have tried a simple algorithm to anonymize my data using the de-identification technique. But the code doesn't work for me. I want to anonymize the data by slightly changing the values. The data ...
2 votes
1 answer
338 views

I'm trying to generate fake data for a coffee shop. I have two features: age and menu. The menu includes various types of drinks, such as coffee [latte, espresso, mocha, etc.], tea [milktea, lemontea], milk [freshmilk,...
1 vote
1 answer
58 views

I am a Tableau developer, but I know Python and stats, and, in short, I think you all will be best able to solve my problem. There is a universal filter on Facility. This means that any dataset/sheet ...
0 votes
1 answer
109 views

I'm new to data analysis and I need to do a data analysis project using clustering methods for a course in R. I have no idea how to start and choose my data set. I'm looking for some resources. Is ...
0 votes
1 answer
167 views

I'm making a data transformation pipeline on a dataset, and I am getting an error: all the input array dimensions except for concatenation axis must match exactly, but along dimension 0, the array at ...
2 votes
1 answer
376 views

I have a black-box system that has two correct outputs for a single input sample. Now I want to train a neural network to generate at least one of the correct outputs for that input sample. What ...
1 vote
1 answer
187 views

I am looking for a public dataset of images that differ from each other only slightly, so that after applying PCA they can be reconstructed with a small error from ...
5 votes
3 answers
127 views

I'm in a debate with someone about a problem where there are duplicates over features (i.e. $ X_1 = X_2 $ but $ Y_1 \ne Y_2 $). My point of view is that we should keep those data points, as they can be ...
0 votes
1 answer
1k views

I have a dataset that contains missing values in some columns. I would like to know what is the best approach to deal with this missing data. Should I remove rows with missing data or fill in missing ...
0 votes
1 answer
259 views

I am working on neural network regression code. The dataset includes 14 features with values in the range -1 to 1, while the target variable ranges from 0.000759 to 1100. The target ...
2 votes
2 answers
173 views

I have trained my classifier on pictures with a mixture of several classes on each picture, e.g. A-F. The classifier is able to (nearly) correctly segment those classes on the images. Now I got more ...
1 vote
1 answer
767 views

I am working on two text datasets: one has 68k text samples and the other has 100k text samples. I have encoded the text datasets into BERT embeddings. ...
