Timeline for Create 1% Sample Using Multiprocessing in Python
Current License: CC BY-SA 4.0
12 events
| when | what | | by | license | comment |
|---|---|---|---|---|---|
| Feb 10, 2020 at 17:25 | vote | accept | giacomo1488 | ||
| Feb 6, 2020 at 18:53 | answer | added | Ben A | timeline score: 2 | |
| Jan 2, 2020 at 17:46 | comment | added | Zchpyvr | @giacomo1488 Could you share some more context about the problem you're trying to solve? It sounds like you want to create a 1% sample based on one variable in each line? Does that mean you only care about 2 fields in every line: one for the ID and the other for the variable you're measuring against? It really sounds like a Python script is not the best tool for this... | |
| Dec 30, 2019 at 2:26 | comment | added | AMC | I somehow forgot about this question, but I will return to it... | |
| Dec 26, 2019 at 14:50 | comment | added | giacomo1488 | Yes, a 1% sample of the 4 million IDs. So I'm extracting the claims for 40,000 people. | |
| Dec 24, 2019 at 16:00 | comment | added | AMC | The 4 million individual IDs are used to determine which claims to extract? | |
| Dec 24, 2019 at 15:20 | comment | added | giacomo1488 | It is insurance claims data, which is privacy protected so I don't know of any sample data that's out there. There are ~300 million lines in the file. Each line represents a claim line and has 171 variables that are delimited with *. I make the 1% sample at the person level, using a list of 4 million person ids represented by integers and contained in idunique_ids_final. Let me know if there's any other useful information I can share. | |
| Dec 23, 2019 at 1:54 | comment | added | AMC | Can you share some information about the data itself? Ideally we would have enough to run the program, since matters of performance are so dependent on benchmarking and profiling. | |
| Dec 18, 2019 at 15:00 | history | tweeted | twitter.com/StackCodeReview/status/1207314750960472064 | ||
| Dec 18, 2019 at 0:43 | history | edited | greybeard | CC BY-SA 4.0 | decorate code snippets as such, include title from hyperlink |
| Dec 17, 2019 at 19:45 | review | First posts | | | Dec 18, 2019 at 0:43 |
| Dec 17, 2019 at 19:41 | history | asked | giacomo1488 | CC BY-SA 4.0 | |
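
The comments above describe the task as drawing a 1% sample at the person level (40,000 of about 4 million integer IDs) and extracting the matching lines from a `*`-delimited claims file. A minimal single-process sketch of that idea might look like the following. It is not the asker's code or the accepted answer: the function names `sample_ids` and `filter_claims`, the `seed` parameter, and above all the position of the person ID within each line (`id_field=0`) are assumptions made for illustration.

```python
import random


def sample_ids(ids, fraction=0.01, seed=0):
    """Return a reproducible random subset of person IDs as a set.

    Hypothetical helper: `fraction` and `seed` are illustrative parameters.
    """
    ordered = sorted(ids)  # fixed order so the same seed gives the same sample
    k = round(len(ordered) * fraction)
    return set(random.Random(seed).sample(ordered, k))


def filter_claims(lines, keep_ids, id_field=0, delimiter="*"):
    """Yield the claim lines whose person ID is in `keep_ids`.

    ASSUMPTION: the person ID is an integer stored in column `id_field`;
    the timeline does not say which of the 171 fields holds it.
    """
    for line in lines:
        fields = line.rstrip("\n").split(delimiter)
        try:
            person_id = int(fields[id_field])
        except (ValueError, IndexError):
            continue  # skip malformed or empty lines
        if person_id in keep_ids:  # set lookup is O(1) on average
            yield line


if __name__ == "__main__":
    # Tiny synthetic stand-in for the real (privacy-protected) data.
    demo_ids = range(1, 1001)
    keep = sample_ids(demo_ids)
    demo_lines = [f"{i}*field-a*field-b\n" for i in demo_ids]
    kept = list(filter_claims(demo_lines, keep))
    print(len(keep), len(kept))
```

Because `filter_claims` is a generator, it can be fed an open file object and will stream through a large file one line at a time instead of loading it into memory. Keeping the sampled IDs in a `set` rather than a list is the main design choice here, since membership is tested once per claim line.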