All Questions
Tagged with python-multithreading pandas
39 questions
0
votes
1
answer
40
views
Python threads 'starved' by pandas operations
I am creating a UI application with Qt in Python. It performs operations on pandas DataFrames in a separate threading.Thread to keep the UI responsive; no individual pandas instruction takes noticeable ...
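The starvation described here typically comes from pandas/NumPy holding the GIL for long stretches inside a single call, leaving the UI thread no chance to run. One common workaround is to slice the work and yield explicitly between slices. A minimal sketch (process_in_chunks and chunk_size are illustrative names, not from the question):

```python
import time

import numpy as np
import pandas as pd

def process_in_chunks(df, func, chunk_size=10_000):
    """Apply func to df in slices, sleeping 0 s between slices so the
    interpreter hands the GIL to other threads (e.g. a UI event loop)."""
    parts = []
    for start in range(0, len(df), chunk_size):
        parts.append(func(df.iloc[start:start + chunk_size]))
        time.sleep(0)  # explicit yield point between pandas calls
    return pd.concat(parts)

# usage
df = pd.DataFrame({"x": np.arange(25_000)})
result = process_in_chunks(df, lambda part: part * 2, chunk_size=10_000)
```

For truly CPU-heavy pandas work, moving it to a separate process (multiprocessing) avoids GIL contention with the UI thread entirely.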
0
votes
0
answers
49
views
Python concurrent.futures ThreadPoolExecutor blocking
I have the code below. read_sql is a method on my DBReader class, and it uses pd.read_sql.
I'm parallelizing SQL SELECT queries against a Postgres table.
import pandas as pd
def read_sql(self, sql, params =...
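A frequent cause of apparent "blocking" in this pattern is sharing one database connection across the pool's threads, which serializes the queries. A sketch of the connection-per-call alternative, with SQLite standing in for Postgres so it runs standalone (DBReader here is a hypothetical stand-in for the question's class):

```python
import os
import sqlite3
import tempfile
from concurrent.futures import ThreadPoolExecutor

import pandas as pd

class DBReader:
    """Hypothetical stand-in for the question's DBReader."""
    def __init__(self, dsn):
        self.dsn = dsn

    def read_sql(self, sql, params=None):
        # one connection per call: sharing a single connection across
        # threads serializes the queries and looks like "blocking"
        conn = sqlite3.connect(self.dsn)
        try:
            return pd.read_sql(sql, conn, params=params)
        finally:
            conn.close()

# build a small demo table
db_path = os.path.join(tempfile.mkdtemp(), "demo.db")
with sqlite3.connect(db_path) as conn:
    conn.execute("CREATE TABLE t (id INTEGER, val TEXT)")
    conn.executemany("INSERT INTO t VALUES (?, ?)",
                     [(i, f"v{i}") for i in range(10)])

reader = DBReader(db_path)
limits = [3, 5, 7]
with ThreadPoolExecutor(max_workers=3) as pool:
    frames = list(pool.map(
        lambda n: reader.read_sql("SELECT * FROM t WHERE id < ?", params=(n,)),
        limits))
```

With Postgres, the same idea is usually expressed as a connection pool (one checkout per worker) rather than a fresh connection per call.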
0
votes
0
answers
162
views
Multithreading with Python and pandas
I need advice on how to use multithreading with Python. I'm currently using the concurrent.futures library, but the results I got seem slower than expected.
Here is an example:
I need to ...
0
votes
1
answer
44
views
Unexpected behavior of a for loop after using map() and partial
I'm using functools.partial to pass 2 parameters that are not iterables, so I shouldn't pass them through the map() function. I'm also using ThreadPoolExecutor for the I/O-bound task that I have here.
the ...
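The usual shape of this pattern: partial freezes the fixed, non-iterable arguments, and executor.map supplies only the varying one. A sketch (fetch, base_url, and timeout are hypothetical names for illustration):

```python
from concurrent.futures import ThreadPoolExecutor
from functools import partial

def fetch(base_url, timeout, item):
    # hypothetical I/O-bound task; base_url and timeout are the two
    # fixed, non-iterable parameters
    return f"{base_url}/{item} (timeout={timeout})"

items = ["a", "b", "c"]
task = partial(fetch, "https://example.com", 5)  # freeze the fixed args
with ThreadPoolExecutor(max_workers=3) as pool:
    # map supplies only the varying argument; results come back in
    # input order, and worker exceptions surface when iterated
    results = list(pool.map(task, items))
```

Note that executor.map is lazy about exceptions: an error raised inside a worker appears only when that result is consumed, which can make a following for loop behave unexpectedly.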
0
votes
0
answers
44
views
Pandas concat not adding rows after the first loop
I have a functions.py file that is executed every time my main.py file is run:
import subprocess
import pandas as pd
from move import move_to_obsidian
pd.options.mode.copy_on_write = False
...
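A classic cause of "concat not adding rows" is that pd.concat returns a new DataFrame and never mutates in place, so the result must be reassigned; the idiomatic fix is to collect pieces in a list and concatenate once. A minimal sketch:

```python
import pandas as pd

parts = []
for i in range(3):
    # collect pieces; concatenating inside the loop without
    # reassigning the result silently discards the new rows
    parts.append(pd.DataFrame({"run": [i], "value": [i * 10]}))

# one concat at the end: returns a NEW frame, assigned explicitly
combined = pd.concat(parts, ignore_index=True)
```

This is also much faster than repeated in-loop concats, which copy the accumulated data on every iteration.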
0
votes
1
answer
50
views
Aggregate the output of several threads' Dataframes into a single Pandas Dataframe
My use case appears to be different from the suggested answers to similar questions. I need to iterate over a list of Git repos using the GitPython module, do a shallow clone, iterate over each branch,...
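One pattern that fits this use case: each worker returns its own DataFrame, and the main thread concatenates once at the end, so no frame is ever mutated concurrently. A sketch with a stand-in for the GitPython work (scan_repo is hypothetical; the real function would clone and walk branches):

```python
from concurrent.futures import ThreadPoolExecutor

import pandas as pd

def scan_repo(repo_name):
    """Stand-in for the per-repo work (shallow clone, walk branches);
    it returns a DataFrame rather than mutating a shared one."""
    return pd.DataFrame({"repo": [repo_name, repo_name],
                         "branch": ["main", "dev"]})

repos = ["repo1", "repo2", "repo3"]
with ThreadPoolExecutor(max_workers=4) as pool:
    frames = list(pool.map(scan_repo, repos))

# aggregate once in the main thread: DataFrames are not safe to
# mutate concurrently, but returning and concatenating is fine
all_branches = pd.concat(frames, ignore_index=True)
```

Threads suit this workload because cloning and network I/O release the GIL while waiting.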
1
vote
2
answers
361
views
Parallelize dummy data generation in pandas
I would like to generate a dummy dataset composed of a fake first name and a last name for 40 million records, using n processor cores.
Below is a single task loop that generates a first name ...
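Since name generation is CPU-bound, processes (not threads) are the right tool. A sketch of splitting the total into per-worker chunks with a multiprocessing Pool; the name pools and generate_chunk are illustrative, and the per-chunk seed keeps each worker's random stream independent:

```python
import random
from multiprocessing import Pool

FIRST = ["Ana", "Ben", "Chloe", "Dev"]   # placeholder name pools
LAST = ["Ito", "Khan", "Lopez", "Novak"]

def generate_chunk(args):
    """Generate n (first, last) pairs with an independent seeded RNG."""
    n, seed = args
    rng = random.Random(seed)
    return [(rng.choice(FIRST), rng.choice(LAST)) for _ in range(n)]

if __name__ == "__main__":
    total, workers = 40_000, 4       # scale total toward 40 million
    per_worker = total // workers
    with Pool(workers) as pool:
        parts = pool.map(generate_chunk,
                         [(per_worker, seed) for seed in range(workers)])
    records = [pair for part in parts for pair in part]
```

The `if __name__ == "__main__":` guard is required on platforms that spawn rather than fork worker processes.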
1
vote
1
answer
860
views
How to use python-multiprocessing to concat many files/dataframes?
I'm relatively new to Python and programming and just use it for the analysis of simulation data.
I have a directory "result_1/" with over 150000 CSV files with simulation data I want to ...
0
votes
0
answers
519
views
How can I efficiently iterate through a pandas dataset that has millions of rows and pass a function to every row?
I have a pandas DataFrame with 7 million instances of flight data. The flight data comes with the location and the time, which I am using to pull the weather for that time. Right now, for 1000 instances, my ...
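For per-row lookups like this, two things usually help far more than raw iteration speed: fetch each unique (location, time) key only once, and overlap the network calls with a thread pool. A sketch (get_weather is a hypothetical stand-in for the real weather call):

```python
from concurrent.futures import ThreadPoolExecutor

import pandas as pd

def get_weather(key):
    """Stand-in for the real weather lookup (one network call)."""
    location, hour = key
    return f"wx:{location}@{hour}"

flights = pd.DataFrame({
    "location": ["JFK", "LAX", "JFK", "JFK"],
    "hour": [1, 1, 1, 2],
})

# fetch each unique (location, hour) once instead of once per row,
# then map results back -- avoids millions of duplicate calls
keys = list(flights[["location", "hour"]].itertuples(index=False, name=None))
unique_keys = sorted(set(keys))
with ThreadPoolExecutor(max_workers=16) as pool:
    weather = dict(zip(unique_keys, pool.map(get_weather, unique_keys)))
flights["weather"] = [weather[k] for k in keys]
```

With 7 million rows, the deduplication alone can cut the number of requests by orders of magnitude when flights share locations and hours.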
1
vote
1
answer
1k
views
Python threads with a pandas DataFrame do not improve performance
I have a DataFrame of 200k rows; I want to split it into parts and call my function S_Function on each partition.
def S_Function(df):
    # my code here
    return new_df
Main program
N_Threads = 10
...
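The split-and-dispatch part of this question can be sketched as follows (the S_Function body here is a placeholder). The key caveat explaining the lack of speedup: threads only help if the per-partition work releases the GIL (I/O, some NumPy operations); for pure-Python row work, a ProcessPoolExecutor is the usual fix.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np
import pandas as pd

def S_Function(df):
    # placeholder for the real per-partition work
    return df.assign(double=df["x"] * 2)

df = pd.DataFrame({"x": range(100)})
N_Threads = 10

# np.array_split handles lengths that don't divide evenly
parts = np.array_split(df, N_Threads)
with ThreadPoolExecutor(max_workers=N_Threads) as pool:
    results = list(pool.map(S_Function, parts))  # order preserved
new_df = pd.concat(results)
```

Swapping ThreadPoolExecutor for ProcessPoolExecutor keeps the same structure while sidestepping the GIL, at the cost of pickling each partition to the workers.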
1
vote
1
answer
1k
views
Parallel web requests with GPU on Google Colab
I need to obtain properties from a web service for a large list of products (~25,000), and this is a very time-sensitive operation (ideally I need this to execute in just a few seconds). I coded this ...
0
votes
1
answer
91
views
Python: implementing a multi-thread flag option in a script
I'm writing a simple script that takes as input two TSV files (file_a.tsv and file_b.tsv) and parses all the values of file_a.tsv in order to check if they're included within a range specified in ...
0
votes
0
answers
44
views
What is the best Pythonic approach to read a large set of small .txt files from disk, apply data extraction logic, and store the results in a database?
I have 60,000 .txt files, each 15-20 KB (30 GB of data in total). I want to apply some data extraction logic to each file and store the result in the database.
I tried running the script sequentially ...
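A common shape for this workload: a thread pool overlaps the file I/O, and the database writes go through one connection in the main thread, batched with executemany. A self-contained sketch with SQLite and a few temporary files standing in for the 60,000 inputs (the extract logic is a placeholder):

```python
import os
import sqlite3
import tempfile
from concurrent.futures import ThreadPoolExecutor

def extract(path):
    """Stand-in for the real extraction logic: filename + line count."""
    with open(path, encoding="utf-8") as fh:
        return os.path.basename(path), sum(1 for _ in fh)

# a few sample files standing in for the 60 000 .txt inputs
indir = tempfile.mkdtemp()
for i in range(5):
    with open(os.path.join(indir, f"doc_{i}.txt"), "w") as fh:
        fh.write("line\n" * (i + 1))
paths = [os.path.join(indir, f) for f in sorted(os.listdir(indir))]

# threads overlap the disk reads (I/O releases the GIL)
with ThreadPoolExecutor(max_workers=8) as pool:
    rows = list(pool.map(extract, paths))

# one connection, one batched insert: avoids per-file transaction cost
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE results (name TEXT, lines INTEGER)")
conn.executemany("INSERT INTO results VALUES (?, ?)", rows)
count = conn.execute("SELECT COUNT(*) FROM results").fetchone()[0]
```

If the extraction itself is CPU-heavy rather than I/O-bound, swapping in a process pool for the extract step keeps the same structure.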
0
votes
0
answers
427
views
Why is this code not multithreaded? (Trying to read a file with pandas read_csv in chunks and process each chunk in a separate thread)
As per my title; this is part of the code:
def read_chunk(_chunk):
    return str(_chunk.dtypes[0])

def read_file_in_chunks(_input_file_path, _col_index):
    _dtypes = []
    _df_chunked = pd....
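One working arrangement of this idea: the chunk iterator returned by read_csv must be consumed in a single thread, so each materialized chunk is handed to the pool rather than sharing the iterator. A sketch (the CSV text is inline so it runs standalone; the per-chunk work mirrors the question's dtype check):

```python
import io
from concurrent.futures import ThreadPoolExecutor

import pandas as pd

def read_chunk(chunk):
    # the per-chunk work from the question: dtype of the first column
    return str(chunk.dtypes.iloc[0])

csv_text = "a,b\n" + "\n".join(f"{i},{i * 2}" for i in range(10))

dtypes = []
with ThreadPoolExecutor(max_workers=4) as pool:
    # consume the chunk iterator here, in one thread, submitting each
    # chunk as it is materialized
    futures = [pool.submit(read_chunk, chunk)
               for chunk in pd.read_csv(io.StringIO(csv_text), chunksize=3)]
    dtypes = [f.result() for f in futures]
```

Note the frequent pitfall the title hints at: if the main loop calls a worker function directly (or waits on each result before submitting the next), the chunks are processed serially even though a pool exists.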
0
votes
0
answers
106
views
Python: add on/off functionality to a function with a button
I have written a small multithreaded program that appends a list to a pandas DataFrame in an elementwise fashion. This append function runs on a separate thread and appends an element from the list ...
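The standard primitive for this kind of button-controlled pause is threading.Event: the worker checks the flag before each append, and the button handler simply sets or clears it. A minimal sketch (the list stands in for the DataFrame appends; the button wiring is omitted):

```python
import threading
import time

run_flag = threading.Event()    # set/cleared by the (hypothetical) button
stop_flag = threading.Event()   # lets the worker exit cleanly
appended = []

def worker(items):
    """Append one item per tick, but only while run_flag is set."""
    for item in items:
        while not run_flag.is_set():
            if stop_flag.is_set():
                return          # shutdown requested while paused
            time.sleep(0.01)    # paused: poll until the button resumes
        appended.append(item)

t = threading.Thread(target=worker, args=([1, 2, 3],))
t.start()
run_flag.set()       # "on" button pressed
t.join(timeout=2)
stop_flag.set()      # ensure the worker exits even if still paused
```

In a GUI, the append target should also be handed back to the main thread (e.g. via a queue) rather than mutated from the worker, since pandas DataFrames are not thread-safe to modify concurrently.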