Questions tagged [data-formats]
37 questions
0 votes, 0 answers, 40 views
What to include in fact and dimension table from election database
I am working with an election dataset for India for the year 2014, along with data for 2019. I also have a table of party names and descriptions, and finally a table of state names with codes. I do not understand how to create a ...
0 votes, 1 answer, 279 views
How to drop the previous rows of a database based on a matching value in a column?
So I am currently trying to sort through a data frame containing attribute classes and values of teams. However, my data has multiple rows of different classes and values of the same Team ID/Attribute ...
1 vote, 1 answer, 4k views
Date time conversion in a CSV column [closed]
I am new to data science. I am attempting to write a program using regression techniques, and all of my values are numerical, except for the date and time (UTC), which are written in this format: HH:...
1 vote, 1 answer, 28 views
Is there any way to analyze the format of text strings? [closed]
I have a lot of data which basically consists of alphanumeric text on individual lines which can vary in length and contain delimiters.
Since there are many thousands of lines of text, I'm looking to ...
3 votes, 2 answers, 1k views
Storing Large dataset for processing and analysis of data
I am new to data engineering and want to know the best way to store more than 3000 GB of data for further processing and analysis. I am specifically looking for open-source resources. I ...
0 votes, 1 answer, 230 views
Python: convert variables into correct format for DataFrame
I have 3 variables that I would like to use to build my dataset, but since they are in a weird shape/format, I have had no success so far. I'm quite new to this and would really appreciate any help!
The 3 ...
2 votes, 1 answer, 4k views
How to store efficiently very large sparse 3D matrices
To train a CNN, I have stacked arrays of images over observations [observations x width x length]. The dataset is very sparse ($95\%$). What would be an efficient ...
0 votes, 1 answer, 590 views
Running a query in R after establishing dbconnect
I cannot seem to figure out what is wrong with the following statement. The connection to the DWH is established, but the query statement in R does not seem to work and gives the following error:
...
1 vote, 2 answers, 91 views
Converting data format
I'm trying to use the recent COVID-19 data from the site of Italian Civil Protection, but they use a rather complicated time format that I'm finding troublesome as a novice to plot as data in a graph.
...
1 vote, 1 answer, 122 views
Best file format for transfer of EHR data
I am working on a clinical trial where we have several sites sending us EHR data. The sites are currently sending the data in Excel files. I have a feeling someone's opening the files because 3 of ...
1 vote, 1 answer, 54 views
Containing multicomponent data in rows or columns
I have been working with DNA sequences and compiled a table with features from those sequences. I have a column called Trimer, which contains strings. For some DNA sequences there is one trimer of ...
3 votes, 1 answer, 8k views
Getting stock data in a disciplined manner from Yahoo Finance
I used the code below to download stock data from Yahoo Finance:
...
0 votes, 1 answer, 823 views
NCHW vs NHWC in Machine Learning
As I've been introducing myself to the various deep learning frameworks, I've noticed a difference in the default placement of channels for images. Is there a substantial difference between NCHW vs ...
1 vote, 1 answer, 3k views
.h5 file format does not close properly
import h5py #added
hf = h5py.File('../images.h5', 'w') #added
hf.close() #added
h5_file = tables.open_file("images.h5", mode="w")
I also tried:
...
1 vote, 0 answers, 39 views
Labeling data as having an error?
I am curating a large quantity of data from different sensors. If I know that a particular sensor was broken or poorly calibrated for a particular time range, what would be a useful way of annotating ...