Very slow Recipient Import
Moderator: moderator
-
- Posts: 8
- Joined: Thu Nov 02, 2006 10:08 pm
- Location: Toronto
Very slow Recipient Import
Hi:
The import of recipient lists is very very slow. Even on moderately sized lists of around 5K. I tried a list of 185K and I got a heap error. Is this something that you are aware of and are looking into? Is there a quick and dirty way I can bulk load a .csv file into the database? Maybe a simple SQL query from the command-line? Other than that, killer system folks, well done!!
Cheers,
Dave Cook
The import of recipient lists is very very slow. Even on moderately sized lists of around 5K. I tried a list of 185K and I got a heap error. Is this something that you are aware of and are looking into? Is there a quick and dirty way I can bulk load a .csv file into the database? Maybe a simple SQL query from the command-line? Other than that, killer system folks, well done!!
Cheers,
Dave Cook
Aside from logging in through SSH, starting top, and renice-ing the mysqld process, I don't have any fast solution.
To optimize MySQL settings using mysqladmin, I had to look up the man page and play around with max settings.
Even when optimized, I still cannot reliably upload more than 6000+ email addresses (minimum of 5 fields only, CSV). It takes me about an hour to upload a list of this size, and you have to babysit it.
To optimize MySQL settings using mysqladmin, I had to look up the man page and play around with max settings.
Even when optimized, I still cannot reliably upload more than 6000+ email addresses (minimum of 5 fields only, CSV). It takes me about an hour to upload a list of this size, and you have to babysit it.
-
- Site Admin
- Posts: 2628
- Joined: Thu Aug 03, 2006 10:20 am
- Location: Munich, Germany
- Contact:
import performance
Could you please post the OpenEMM and the MySQL version you are using?
OpenEMM Maintainer
I have seen this problem and the single issue is that none of the tables have more than one index on them (primary key) which causes everything to be very slow.
Rule of thumb: If you are looking up a field (column) in WHERE or ORDER BY that field needs to have an index. Yes you can end up with a lot of extra space used but not having any keys is absolutely stupid. Whoever came up with the schema for these tables needs to get a book on SQL and read the sections on Indexing and Optimizations.
I will hopefully post some sql for you guys in a few days with proper indexes created on them.
Example: customer_1_tbl has no keys other than primary.
Add indexes on the following: email, mailtype, datasourrce_id, and any custom defined fields. When you run an import and verify email address against all records it will take 1 day to complete if you have several thousand records.
Does any one have a list of most of the sql queries used in the import and sending sections? I am not good with Java so it will take me a lot longer to find the queries than for someone just to list them all out.
Whoever is in charge of the SQL please, please go read a manual then go back an optimize the table Schemas to add the proper indexes.
-Austin
Rule of thumb: If you are looking up a field (column) in WHERE or ORDER BY that field needs to have an index. Yes you can end up with a lot of extra space used but not having any keys is absolutely stupid. Whoever came up with the schema for these tables needs to get a book on SQL and read the sections on Indexing and Optimizations.
I will hopefully post some sql for you guys in a few days with proper indexes created on them.
Example: customer_1_tbl has no keys other than primary.
Add indexes on the following: email, mailtype, datasourrce_id, and any custom defined fields. When you run an import and verify email address against all records it will take 1 day to complete if you have several thousand records.
Does any one have a list of most of the sql queries used in the import and sending sections? I am not good with Java so it will take me a lot longer to find the queries than for someone just to list them all out.
Whoever is in charge of the SQL please, please go read a manual then go back an optimize the table Schemas to add the proper indexes.
-Austin
-
- Site Admin
- Posts: 2628
- Joined: Thu Aug 03, 2006 10:20 am
- Location: Munich, Germany
- Contact:
Field email is indexed. If you plan to use a different field (like customer ID) for check of duplicates or synchronization you should put an index on this field.
If a MySQL expert can provide an improvement for our code we are happy to accept it since we are more familiar with Oracle.
If a MySQL expert can provide an improvement for our code we are happy to accept it since we are more familiar with Oracle.
OpenEMM Maintainer
I have same problem. I try to monitor mysql server and find the following sql query could cause this problem:
select count(*) from ( SELECT distinct cust.customer_id, cust.gender, cust.firstname, cust.lastname, cust.email FROM customer_1_tbl cust WHERE 1 ) as tmp_tbl
I don't know why using this query to count the total number of email. when I run this query, it needs more 4 minutes to get the result (I have 462464 emails), but if I use "select count(*) from customer_1_tbl", it will get same result and just spend less than 1 second.
select count(*) from ( SELECT distinct cust.customer_id, cust.gender, cust.firstname, cust.lastname, cust.email FROM customer_1_tbl cust WHERE 1 ) as tmp_tbl
I don't know why using this query to count the total number of email. when I run this query, it needs more 4 minutes to get the result (I have 462464 emails), but if I use "select count(*) from customer_1_tbl", it will get same result and just spend less than 1 second.
I've had the same problem. I tried to import ~50k user and it was awfully slow. After 1/2 an hour I stopped the process and looked deeper into the database. Here is my solution:
- connect to the mysql-server an change to the openemm-db
- add indexes to 'email', 'gender', 'firstname' and 'lastname' within the customer_1_tbl-table:
After that I was able to import the ~50k users within ~30 seconds.
I added some more indexes to customer_1_tbl and customer_1_binding_tbl which will hopefully speed up mail creation, too (well, I actually can't compare ... but indexes are usefull most of the time
)
- connect to the mysql-server an change to the openemm-db
- add indexes to 'email', 'gender', 'firstname' and 'lastname' within the customer_1_tbl-table:
Code: Select all
mysql> ALTER TABLE customer_1_tbl ADD INDEX(email);
mysql> ALTER TABLE customer_1_tbl ADD INDEX(gender);
mysql> ALTER TABLE customer_1_tbl ADD INDEX(firstname);
mysql> ALTER TABLE customer_1_tbl ADD INDEX(lastname);
I added some more indexes to customer_1_tbl and customer_1_binding_tbl which will hopefully speed up mail creation, too (well, I actually can't compare ... but indexes are usefull most of the time

Just for the record:
with growing addresses the import slows down again. I needed ~1 minute for the last 9k users of my ~340k list.
The recipient overview slows down, too. I think the problem is the creation of temp-tables, I don't know why they are used (this needs a lot of diskspace, too ... ~600MB in my case).
with growing addresses the import slows down again. I needed ~1 minute for the last 9k users of my ~340k list.
The recipient overview slows down, too. I think the problem is the creation of temp-tables, I don't know why they are used (this needs a lot of diskspace, too ... ~600MB in my case).