MySQL databases are often populated by loading text files directly into tables. MySQL makes this very easy to do with the LOAD DATA INFILE
statements. For example:
LOAD DATA LOCAL INFILE 'sample_data_1.txt' INTO TABLE data.sample_data_1
LOAD DATA INFILE Statements
LOAD DATA INFILE
statements can read data into MySQL tables at very high speeds. This will be much faster than running many single insert statements. In order to use the LOAD DATA INFILE
statement you must provide:
1) The location of the file being loaded. This can be a absolute or relative path. If the LOCAL
keyword is used (i.e. LOAD DATA LOCAL INFILE
) MySQL will expect the file is located on the same machine as the MySQL Client, otherwise the file is expected to be on the same machine as the MySQL Server.
2) The database table to insert the file into.
3) Optionally specifying characters that indicate things such as field delimiters and line endings.
Statement Defaults
As mentioned above, it is possible to use LINES
and FIELDS
clauses to specify how to parse fields and lines for insertion into the database. If these clauses are not provided, MySQL will use defaults. Using defaults is equivalent to writing:
FIELDS TERMINATED BY '\t' ENCLOSED BY '' ESCAPED BY '\\'
LINES TERMINATED BY '\n' STARTING BY ''
Line Endings on Mac and Windows Files
Files created in Mac and Windows environments often have different line endings than those expected by default. For Windows files you will often need to use LINES TERMINATED BY '\n\r'
. For files created on Mac, you will often have to use LINES TERMINATED BY '\r'
.
Handling Duplicates
If you have a unique key on your database table and inserting data will introduce duplicates, you can use the REPLACE
or IGNORE
keywords to specify how to handle these records. For example:
LOAD DATA LOCAL INFILE 'sample_data_1.txt' REPLACE INTO TABLE data.sample_data_1
Using the REPLACE
keyword will replace existing records with the same (unique) key with the new record being inserted. Using the IGNORE
keyword will “ignore” new records if the key already exists in the database table.
Loading Nulls
To load values into the database as a NULL, the value must exist in the text file as '\N'
.
Example Loading Text File into MySQL
The text file we would like to load is named sample_data_1.txt, and can be found here. The fields in this file are delimited by tabs (“\t”) and have Unix/Linux line endings (“\n”). Here is a sample of what our data looks like:
id | first_name | last_name | country | ip_address | |
---|---|---|---|---|---|
1 | Barbara | Day | [email protected] | \N | 252.148.11.9 |
2 | Jennifer | Roberts | [email protected] | China | 130.208.191.99 |
3 | Helen | Gray | [email protected] | China | \N |
First we need a table to load data into. We will create a MySQL table to hold this data with this following command:
CREATE TABLE sample_data_1 (
id INT,
first_name VARCHAR(100),
last_name VARCHAR(100),
email VARCHAR(100),
country VARCHAR(100),
ip_address VARCHAR(100)
)
Next, from the MySQL shell, we execute the following command to load the data in our file into the database.
LOAD DATA LOCAL INFILE 'sample_data_1.txt'
INTO TABLE data.sample_data_1
FIELDS TERMINATED BY '\t'
LINES TERMINATED BY '\n';
At this point data from our file should now be in our database table and ready to query.
mysql> select * from data.sample_data_1 LIMIT 10;
+------+------------+------------+--------------------------+------------+-----------------+
| id | first_name | last_name | email | country | ip_address |
+------+------------+------------+--------------------------+------------+-----------------+
| 0 | first_name | last_name | email | country | ip_address |
| 1 | Barbara | Day | [email protected] | NULL | 252.148.11.9 |
| 2 | Jennifer | Roberts | [email protected] | China | NULL |
| 3 | Helen | Gray | [email protected] | China | 167.188.111.177 |
| 4 | Maria | Day | [email protected] | Uzbekistan | 196.109.242.208 |
| 5 | Bonnie | Diaz | [email protected] | Malaysia | 36.75.163.246 |
| 6 | Jessica | Gonzalez | [email protected] | China | 108.67.149.140 |
| 7 | Amanda | Stone | [email protected] | Slovenia | 164.20.18.161 |
| 8 | Roger | Sims | [email protected] | Argentina | 247.169.248.113 |
| 9 | Angela | Williamson | [email protected] | China | 249.120.41.104 |
+------+------------+------------+--------------------------+------------+-----------------+
10 rows in set (0.00 sec)
More details on LOAD DATA INFILE statements in MySQL can be found here.