
All Questions

0 votes
1 answer
40 views

Python threads 'starved' by pandas operations

I am creating a UI application with Qt in Python. It performs operations on pandas DataFrames in a separate threading.Thread to keep the UI responsive; no individual pandas instruction takes noticeable ...
AirToTec
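
A minimal sketch of the pattern described above, under the assumption that the workload is a long pandas call (heavy_pandas_work, "data.csv", and the "key" column are placeholders): heavy pandas operations can hold the GIL for long stretches, so even a background threading.Thread may starve the Qt event loop, while a separate process does not share the GIL with the UI thread.

    from concurrent.futures import ProcessPoolExecutor
    import pandas as pd

    def heavy_pandas_work(path):                       # placeholder workload
        df = pd.read_csv(path)
        return df.groupby("key").sum()

    if __name__ == "__main__":
        with ProcessPoolExecutor(max_workers=1) as pool:
            future = pool.submit(heavy_pandas_work, "data.csv")   # placeholder path
            # In a Qt app, poll future.done() from a QTimer or use
            # future.add_done_callback(...) and hop back to the GUI thread via a signal.
            result = future.result()
        print(result)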
0 votes
0 answers
49 views

python concurrent.futures.ThreadPool blocking

I have the code below. read_sql is a method on my DBReader class and it uses pd.read_sql. I'm parallelizing SQL selects against a Postgres table. import pandas as pd def read_sql(self, sql, params =...
mike01010
  • 6,098
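
A hedged sketch of parallel SQL reads with a thread pool (the connection URL, queries, and the one-engine-per-call choice are placeholders, not the asker's DBReader): pd.read_sql spends most of its time waiting on the network, where threads do help, but each worker should use its own connection rather than share one across threads.

    from concurrent.futures import ThreadPoolExecutor
    import pandas as pd
    from sqlalchemy import create_engine

    ENGINE_URL = "postgresql+psycopg2://user:pass@localhost/db"   # placeholder

    def read_sql(sql, params=None):
        # a fresh connection per task avoids sharing one connection across threads
        engine = create_engine(ENGINE_URL)
        with engine.connect() as conn:
            return pd.read_sql(sql, conn, params=params)

    queries = ["SELECT * FROM t1", "SELECT * FROM t2"]            # placeholders
    with ThreadPoolExecutor(max_workers=4) as pool:
        frames = list(pool.map(read_sql, queries))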
0 votes
0 answers
162 views

Multithreading with Python and pandas

I would need advice on how to use multithreading with Python. I'm currently using the concurrent.futures library, but the results I get seem slower than expected. Here is an example: I need to ...
Magic Mushroom
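
A minimal sketch of why threads can come out slower than expected on CPU-bound pandas work, assuming the workload looks like the dummy cpu_bound function below (all names and sizes are made up): the GIL serializes most pure-Python and many pandas operations across threads, so a process pool is the usual comparison point, at the cost of pickling the data.

    from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor
    import numpy as np
    import pandas as pd

    def cpu_bound(df):                          # placeholder CPU-heavy transform
        return (df ** 2).sum().sum()

    if __name__ == "__main__":
        parts = [pd.DataFrame(np.random.rand(200_000, 10)) for _ in range(8)]
        with ThreadPoolExecutor(4) as pool:         # limited by the GIL
            thread_results = list(pool.map(cpu_bound, parts))
        with ProcessPoolExecutor(4) as pool:        # true parallelism, but pays
            process_results = list(pool.map(cpu_bound, parts))  # a pickling cost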
0 votes
1 answer
44 views

Unexpected behavior of a for loop after using map() and partial()

I'm using functools.partial to pass 2 parameters which are not iterables, so I shouldn't pass them to map(). I'm also using ThreadPoolExecutor for the I/O-bound task that I have here. The ...
Mostafa Bouzari
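
A minimal sketch of the partial + map pattern mentioned above (the download function, URLs, and fixed arguments are illustrative): bind the non-iterable arguments with functools.partial and let executor.map iterate only over the argument that varies per task.

    from concurrent.futures import ThreadPoolExecutor
    from functools import partial

    def download(url, timeout, retries):        # only `url` varies per task
        return f"{url} (timeout={timeout}, retries={retries})"

    urls = ["https://example.com/a", "https://example.com/b"]
    worker = partial(download, timeout=5, retries=3)    # bind the fixed arguments

    with ThreadPoolExecutor(max_workers=4) as pool:
        results = list(pool.map(worker, urls))          # map over urls only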
0 votes
0 answers
44 views

Pandas concat not adding rows after the first loop

I have a functions.py file that is executed every time my main.py file is run: import subprocess import pandas as pd from move import move_to_obsidian pd.options.mode.copy_on_write = False ...
Sarp Yy
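
A hedged sketch of the usual fix when pd.concat appears to stop adding rows (the loop body and column name are stand-ins, not the asker's code): pd.concat returns a new DataFrame rather than mutating anything in place, so the result must be kept, ideally by collecting frames in a list and concatenating once.

    import pandas as pd

    frames = []
    for i in range(3):                                   # stand-in for the real loop
        frames.append(pd.DataFrame({"note": [f"row {i}"]}))

    # concatenate once at the end; calling pd.concat(...) without keeping the
    # returned DataFrame silently drops every iteration after the first
    result = pd.concat(frames, ignore_index=True)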
0 votes
1 answer
50 views

Aggregate the output of several threads' Dataframes into a single Pandas Dataframe

My use case appears to be different from the suggested answers to similar questions. I need to iterate over a list of Git repos using the GitPython module, do a shallow clone, iterate over each branch,...
iwonder
  • 97
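
A minimal sketch of one way to aggregate per-thread DataFrames (the repo list and per-repo analysis are placeholders for the GitPython work described above): let each worker return its own frame and concatenate once in the main thread, rather than appending to a shared DataFrame from several threads.

    from concurrent.futures import ThreadPoolExecutor
    import pandas as pd

    def analyse_repo(repo_url):                  # placeholder for clone + per-branch stats
        return pd.DataFrame({"repo": [repo_url], "branches": [0]})

    repos = ["https://example.org/a.git", "https://example.org/b.git"]
    with ThreadPoolExecutor(max_workers=4) as pool:
        per_repo = list(pool.map(analyse_repo, repos))

    combined = pd.concat(per_repo, ignore_index=True)    # single aggregation step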
1 vote
2 answers
361 views

Parallelize dummy data generation in pandas

I would like to generate a dummy dataset composed of a fake first name and last name for 40 million records using n processor cores. Below is a single-task loop that generates a first name ...
Mohamed Mostafa El-Sayyad
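
A hedged sketch of splitting the generation across processes (the tiny name pools stand in for a real fake-data source, and the sizes are scaled down): give each worker a chunk size and its own seed, build each piece independently, and concatenate at the end.

    from concurrent.futures import ProcessPoolExecutor
    import random
    import pandas as pd

    FIRST = ["Ana", "Omar", "Li", "Sara"]           # stand-in name pools
    LAST = ["Haddad", "Nguyen", "Silva", "Khan"]

    def make_chunk(args):
        n, seed = args
        rng = random.Random(seed)                   # independent RNG per worker
        return pd.DataFrame({
            "first_name": rng.choices(FIRST, k=n),
            "last_name": rng.choices(LAST, k=n),
        })

    if __name__ == "__main__":
        n_total, n_workers = 1_000_000, 4           # scale up to 40M as needed
        jobs = [(n_total // n_workers, seed) for seed in range(n_workers)]
        with ProcessPoolExecutor(n_workers) as pool:
            df = pd.concat(pool.map(make_chunk, jobs), ignore_index=True)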
1 vote
1 answer
860 views

How to use python-multiprocessing to concat many files/dataframes?

I'm relatively new to Python and programming and just use it for the analysis of simulation data. I have a directory "result_1/" with over 150,000 CSV files of simulation data that I want to ...
Johannes Klee
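
A minimal sketch of this pattern (the directory name and glob are placeholders): read the files in a process pool and concatenate once in the parent; the reads parallelize well, while the final pd.concat stays a single step.

    from concurrent.futures import ProcessPoolExecutor
    from pathlib import Path
    import pandas as pd

    def read_one(path):
        return pd.read_csv(path)

    if __name__ == "__main__":
        files = sorted(Path("result_1").glob("*.csv"))      # placeholder directory
        with ProcessPoolExecutor() as pool:
            frames = list(pool.map(read_one, files, chunksize=100))
        combined = pd.concat(frames, ignore_index=True)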
0 votes
0 answers
519 views

How can I efficiently iterate through a pandas dataset that has millions of rows and pass a function to every row?

I have a pandas dataframe with 7 million instances of flight data. The flight data comes with a location and a time, which I am using to pull the weather for that time. Right now, for 1000 instances, my ...
Erick Agudelo
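
A hedged sketch for the I/O-bound case described above (the fetch_weather stub, column names, and tiny DataFrame are invented): iterate with itertuples and hand the per-row network calls to a thread pool instead of calling the weather API one row at a time.

    from concurrent.futures import ThreadPoolExecutor
    import pandas as pd

    def fetch_weather(row):                      # placeholder for the real API call
        return f"weather at ({row.lat}, {row.lon}) @ {row.time}"

    df = pd.DataFrame({"lat": [40.6, 51.5], "lon": [-73.8, -0.1],
                       "time": ["2023-01-01", "2023-01-02"]})

    with ThreadPoolExecutor(max_workers=32) as pool:
        df["weather"] = list(pool.map(fetch_weather, df.itertuples(index=False)))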
1 vote
1 answer
1k views

Python threads with a pandas DataFrame do not improve performance

I have a DataFrame of 200k rows and I want to split it into parts and call my function S_Function on each partition. def S_Function(df): # my code here return new_df Main program: N_Threads = 10 ...
Moun
  • 315
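
A hedged sketch of why a thread pool may not speed this up and what the process-based variant looks like (S_Function below is a dummy, not the asker's code): if the per-partition work is CPU-bound pandas code, the GIL keeps threads serialized, so np.array_split plus a process pool is the usual alternative.

    from concurrent.futures import ProcessPoolExecutor
    import numpy as np
    import pandas as pd

    def S_Function(df):                           # dummy stand-in for the real work
        return df.assign(total=df.sum(axis=1))

    if __name__ == "__main__":
        big = pd.DataFrame(np.random.rand(200_000, 4))
        parts = np.array_split(big, 10)           # one chunk per worker
        with ProcessPoolExecutor(max_workers=10) as pool:
            new_df = pd.concat(pool.map(S_Function, parts), ignore_index=True)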
1 vote
1 answer
1k views

Parallel web requests with GPU on Google Colab

I need to obtain properties from a web service for a large list of products (~25,000) and this is a very time-sensitive operation (ideally I need this to execute in just a few seconds). I coded this ...
neldev
  • 13
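
A minimal sketch of the request fan-out itself (the endpoint URL and product IDs are placeholders): a GPU does not help with network I/O, but a thread pool, or asyncio, over the ~25,000 requests does.

    from concurrent.futures import ThreadPoolExecutor
    import requests

    BASE = "https://example.com/api/products/"        # placeholder endpoint

    def fetch(product_id):
        resp = requests.get(f"{BASE}{product_id}", timeout=10)
        resp.raise_for_status()
        return resp.json()

    product_ids = range(100)                          # scale to ~25,000 in practice
    with ThreadPoolExecutor(max_workers=50) as pool:
        properties = list(pool.map(fetch, product_ids))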
0 votes
1 answer
91 views

Python: implementing a multi-thread flag option in a script

I'm writing a simple script that takes as input two TSV files (file_a.tsv and file_b.tsv) and parses all the values of file_a.tsv in order to check if they're included within a range specified in ...
Iacopo Passeri
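
A hedged sketch of wiring a --threads flag into such a script (the file layout and check_value stub are illustrative, and check_value assumes numeric TSV values): argparse supplies the worker count and the same code path runs single- or multi-threaded depending on the flag.

    import argparse
    import csv
    from concurrent.futures import ThreadPoolExecutor

    def check_value(value):                       # placeholder range check
        return value, 0 <= float(value) <= 100

    parser = argparse.ArgumentParser()
    parser.add_argument("file_a")
    parser.add_argument("--threads", type=int, default=1)
    args = parser.parse_args()

    with open(args.file_a, newline="") as fh:
        values = [row[0] for row in csv.reader(fh, delimiter="\t") if row]

    with ThreadPoolExecutor(max_workers=args.threads) as pool:
        results = list(pool.map(check_value, values))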
0 votes
0 answers
44 views

What is the best Pythonic approach to read a large set of small .txt files from disk, apply data extraction logic, and store the results in a database?

I have 60,000 .txt files, each 15-20 KB (total data is 30 GB). I want to apply some data extraction logic to each file and store the result in a database. I tried running the script sequentially ...
Samarth Singh Thakur
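
A hedged sketch of one common split for this kind of job (the directory, the word-count extraction rule, and the sqlite schema are all placeholders): do the extraction in a process pool, since parsing is CPU-bound, and write to the database from the parent with a single bulk insert.

    from concurrent.futures import ProcessPoolExecutor
    from pathlib import Path
    import sqlite3

    def extract(path):                            # placeholder extraction logic
        text = Path(path).read_text(errors="ignore")
        return str(path), len(text.split())

    if __name__ == "__main__":
        files = sorted(Path("txt_files").glob("*.txt"))     # placeholder directory
        with ProcessPoolExecutor() as pool:
            rows = list(pool.map(extract, files, chunksize=200))

        con = sqlite3.connect("results.db")
        con.execute("CREATE TABLE IF NOT EXISTS results (path TEXT, words INT)")
        con.executemany("INSERT INTO results VALUES (?, ?)", rows)
        con.commit()
        con.close()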
0 votes
0 answers
427 views

Why is this code not multithreaded (trying to read a file with pandas read_csv in chunks and process each chunk in a separate thread)

As in my title; this is part of the code: def read_chunk(_chunk): return str(_chunk.dtypes[0]) def read_file_in_chunks(_input_file_path, _col_index): _dtypes = [] _df_chunked = pd....
user1261558
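
A hedged sketch of the chunked pattern (the input file and per-chunk work are placeholders, not the asker's read_file_in_chunks), with the caveat that pd.read_csv(chunksize=...) yields chunks sequentially, so only the per-chunk processing, not the reading itself, runs in parallel.

    from concurrent.futures import ThreadPoolExecutor
    import pandas as pd

    def read_chunk(chunk, col_index=0):
        # placeholder per-chunk work: report the dtype of one column
        return str(chunk.dtypes.iloc[col_index])

    futures = []
    with ThreadPoolExecutor(max_workers=4) as pool:
        for chunk in pd.read_csv("input.csv", chunksize=100_000):   # placeholder file
            futures.append(pool.submit(read_chunk, chunk))

    dtypes = [f.result() for f in futures]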
0 votes
0 answers
106 views

Python: add on/off functionality to a function with a button

I have written a small multithreaded program which appends a list to a pandas dataframe in an elementwise fashion; this append function runs on a separate thread and appends an element from the list ...
abhishake
  • 141
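
A hedged sketch of an on/off toggle for a background append loop using threading.Event (the values, pacing, and list-then-concat structure are illustrative); a GUI button handler would simply call run_flag.set() and run_flag.clear().

    import threading
    import time
    import pandas as pd

    run_flag = threading.Event()               # set = running, cleared = paused
    run_flag.set()
    values = list(range(10))
    frames = []                                 # per-element frames collected here

    def append_loop():
        for value in values:
            run_flag.wait()                     # blocks while the button says "off"
            frames.append(pd.DataFrame({"value": [value]}))
            time.sleep(0.1)                     # stand-in for the real pacing

    worker = threading.Thread(target=append_loop, daemon=True)
    worker.start()
    # A button handler would toggle: run_flag.clear() to pause, run_flag.set() to resume.
    worker.join()
    df = pd.concat(frames, ignore_index=True)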
