Questions tagged [dataset]

Requests for datasets are off-topic on this site. Use this tag for questions concerning creating, processing, or maintaining datasets.

3 votes
1 answer
89 views

How should I handle a mass-point in the dependent variable when running OLS regression in R? I’m working with a household expenditure dataset (Living Costs 2019) where the dependent variable is the ...
Jim
  • 31
5 votes
3 answers
533 views

I’m working on a project where I need to build a predictive model for wine quality based on its chemical properties. The goal is to find which features best explain or predict the quality score. I’ve ...
QualityX
4 votes
5 answers
704 views

I find it hard to understand how quartile values can be used to give insight into a dataset. Please assist with examples. I struggle to interpret the values in the context of providing ...
Buchi
  • 41
1 vote
0 answers
47 views

Neural network beginner here. I am currently implementing a CNN in PyTorch to recognize handwritten Japanese characters, with 46 output classes. I found a dataset on Kaggle https://www.kaggle....
Krish Thyagarajan
1 vote
0 answers
99 views

Out of curiosity, I am looking for an example of an authentic variable (which one would find in a data set) with an exceptionally small coefficient of variation: $\text{CV} = \frac{s}{\bar{x}}$. To ...
Gregg H
  • 7,077
1 vote
1 answer
136 views

This question is inspired by a blog post at https://www.argmin.net/p/in-defense-of-typing-monkeys and several rumors I've heard from other people who work in machine learning. The gist of it is that ...
Your neighbor Todorovich
1 vote
2 answers
123 views

If I understand correctly, the standard error is a statistic that measures the variability of a sample’s data and how accurately a statistic represents the corresponding parameter. Please suggest any ...
okman
  • 315
0 votes
1 answer
50 views

I have a question that relates to the use of IAT scores across timepoints. As part of a large health-based intervention my colleagues and I have obtained IAT scores at different timepoints, from which ...
Jonathan Kim
2 votes
1 answer
102 views

If we have a high-dimensional dataset (7-10 columns) of continuous variables like Time, Temperature, etc., recorded from experiments (not performed by us), are there established methods to quantitatively ...
Sunera Wijeratne
1 vote
1 answer
94 views

I am looking to apply a calibration/correction approach to a set of sensors, and I want to know whether the approach I am going to use is statistically correct and acceptable. I am using a set of ...
Milad
  • 157
2 votes
1 answer
83 views

I have a task: a store where customers pay for their items at registers staffed by cashiers has added self-service checkouts. I have 4 months of transaction data of customers who make their ...
remon
  • 21
0 votes
0 answers
39 views

Suppose one must share a data file – could be a simple CSV file – where each datapoint has several variates, let's say a nominal one, an ordinal one, and a continuous-real one. Are there any standard ...
pglpm
  • 1,356
0 votes
0 answers
49 views

In a longitudinal hospitalization survey dataset, where patients are asked to fill out a survey each time they are admitted into the hospital, one of the questions is no longer asked. This question ...
Kevin
  • 353
1 vote
1 answer
95 views

The MNIST dataset can be obtained directly using Keras by running the following lines of Python code. ...
user3728501
0 votes
0 answers
49 views

I'm using this dataset for a regression project, and the goal is to predict the beneficiary risk score (Bene_Avg_Risk_Scre). Now, to protect beneficiary identities and safeguard this information, CMS ...
Anirudh
