I am using a DataTable and the MySqlAdapater to insert around 100 000 rows with around 40 columns to a MySQL database.
I chose this way, because my program needs run on mono (unix / mac) where I could not establish a connection to my database via the System.Data.Sql package, but only via the .NET connector Mysql.Data.MySqlClient which has no SqlBulkCopy-method.
Additionally the amount of columns in my table is flexible, so I was looking for an easy way to iterate the columns. Thats why I used the way:
- Parsing my data
- Loading it into a
DataTableObject - Create a MySQL table dynamically based on my
DataTableObject - Use a MySqlDataAdapter to bulk insert the DataTable to the Database via its
.Update()-method
Right now, it takes around 3-4 minutes to insert 100 000 rows this way - which is subjectively slow (mac mini 2012, i7, 512 GB SSD, 16 GB ram)
Is there any way to speed this up with the Mysql.Data.MySqlClient ?
public static void Import(DataTable dt, string tableName, string conString)
{
using (MySqlConnection con = new MySqlConnection(conString))
{
con.Open();
string createTableString = GetCreateTableSql(dt, tableName);
MySqlCommand cmd = new MySqlCommand(createTableString, con);
cmd.ExecuteNonQuery();
using (MySqlCommand selectcmd = new MySqlCommand("SELECT * FROM " + dt.TableName, con))
{
using (MySqlDataAdapter sda = new MySqlDataAdapter())
{
MySqlCommandBuilder builder = new MySqlCommandBuilder(sda);
sda.SelectCommand = selectcmd;
builder.GetUpdateCommand();
using (DataTable dttemp = new DataTable())
{
sda.Fill(dttemp);
// prevent duplicating data
if (dttemp.Rows.Count == 0)
{
dttemp.Merge(dt);
sda.Update(dttemp);
}
else
{
Console.WriteLine("Data already present. Delete manually");
}
}
}
}
}
}
UpdateBatchSizeon theMySqlDataAdapterto something like 500 or 1000? \$\endgroup\$Mergeis that it is sloooow. We threw away one that took 8+ hours. As a stored procedure it now runs in seconds. This was the absolute worst culprit, but in general C# pretending to be a RDBMS is a poor performer. An RDBMS engine is optimal for doing relational set operations. Use it. The "oh look, Mo! Working disconnected!" Microsoft propaganda is a pig in a poke. P.S. Ours is an Oracle DB. \$\endgroup\$sda.Update()without indexes. @RobH thanks for the hint, but I tried (1000) and it took twice as much time as before \$\endgroup\$