PDO: Learn how the PDO SQL parser causes bug reports
July 3rd, 2008
The PDO SQL parser causes many bugs. All drivers suffer from it. By default the parser replaces placeholder with '<bound_value>'.'<bound_value>' is a string. This can clash with the SQL syntax. PDO can make an INSERT fail. Please learn why and stop reporting bogus bugs.
Why does PDO parse my SQL commands?
PDO has chosen Prepared Statements as its preferred way of running database queries. The entire API is build around prepared statements. Wez has explained that primarily security considerations have caused this decision. Read PDO_MYSQLND: Prepared Statements, again for a discussion of prepared statements in the context of PDO.
Not every database system supports prepared statements. And those who do use a different syntax for placeholder. Some systems support named parameters (SELECT * FROM test WHERE id = :name), some support positional/ordinal parameter (SELECT * FROM test WHERE id = ?) and some support both. The cause of the differences is the lack of a standard. PDO tries to hide the differences from the user. This is done by a prepared statement emulation. The emulation includes a SQL parser which allows you to use whatever format you prefer for placeholder. If the underlying database system does not support the placeholder format you have chosen, PDO emulates it for you.
In many cases the emulation works flawless. But the bug databases on bugs.php.net and pecl.php.net show plenty of reports that boil down to the PDO SQL parser. Please learn how the SQL parser works. This should help avoiding duplicates in the bug database.
(...)
Read the rest of PDO: Learn how the PDO SQL parser causes bug reports (1,030 words)
PDO: Learn how the PDO SQL parser causes bug reports
July 3rd, 2008
The PDO SQL parser causes many bugs. All drivers suffer from it. By default the parser replaces placeholder with '<bound_value>'.'<bound_value>' is a string. This can clash with the SQL syntax. PDO can make an INSERT fail. Please learn why and stop reporting bogus bugs.
Why does PDO parse my SQL commands?
PDO has chosen Prepared Statements as its preferred way of running database queries. The entire API is build around prepared statements. Wez has explained that primarily security considerations have caused this decision. Read PDO_MYSQLND: Prepared Statements, again for a discussion of prepared statements in the context of PDO.
Not every database system supports prepared statements. And those who do use a different syntax for placeholder. Some systems support named parameters (SELECT * FROM test WHERE id = :name), some support positional/ordinal parameter (SELECT * FROM test WHERE id = ?) and some support both. The cause of the differences is the lack of a standard. PDO tries to hide the differences from the user. This is done by a prepared statement emulation. The emulation includes a SQL parser which allows you to use whatever format you prefer for placeholder. If the underlying database system does not support the placeholder format you have chosen, PDO emulates it for you.
In many cases the emulation works flawless. But the bug databases on bugs.php.net and pecl.php.net show plenty of reports that boil down to the PDO SQL parser. Please learn how the SQL parser works. This should help avoiding duplicates in the bug database.
(...)
Read the rest of PDO: Learn how the PDO SQL parser causes bug reports (1,030 words)
Notable no-shows at SourceForge.net Community Choice Award finals
July 3rd, 2008Notable no-shows at SourceForge.net Community Choice Award finals
July 3rd, 2008MySQL support fun, multiplication
July 3rd, 2008There was a question how to do an aggregate multiplication in MySQL. MySQL does not provide such functionality, so we were looking at various workarounds.
We discussed UDF interface that allows to construct custom aggregates, also did look at @a:=@a*field hack, and how different initializers have results wrapped differently.
Then Scott killed our discussion with this simple query:
select exp(sum(log(c)))
Of course, thats nice mathematical approach to solve the multiplication issue with just SUM() at hands, but while we were still in awe, Scott explained it with this wit:
The Great Flood is over, and as the animals are departing 3×3, Noah is blessing each, saying “Go forth, be fruitful and multiply.” Two snakes come down the ramp and say to Noah, “We can’t. We’re adders.”
Noah groans and says, “That’s the worst pun I’ve heard in 40 days and nights. Go sit in that pile of sticks until I can deal with you!”
After mucking out the ark, Noah returns to the sticks and lifts them up, to find baby snakes everywhere. “What happened?” he asks. “I thought you were having problems!”
The snakes reply, “Even adders can multiply with logs.”
:-)
MySQL support fun, multiplication
July 3rd, 2008There was a question how to do an aggregate multiplication in MySQL. MySQL does not provide such functionality, so we were looking at various workarounds.
We discussed UDF interface that allows to construct custom aggregates, also did look at @a:=@a*field hack, and how different initializers have results wrapped differently.
Then Scott killed our discussion with this simple query:
select exp(sum(log(c)))
Of course, thats nice mathematical approach to solve the multiplication issue with just SUM() at hands, but while we were still in awe, Scott explained it with this wit:
The Great Flood is over, and as the animals are departing 3×3, Noah is blessing each, saying “Go forth, be fruitful and multiply.” Two snakes come down the ramp and say to Noah, “We can’t. We’re adders.”
Noah groans and says, “That’s the worst pun I’ve heard in 40 days and nights. Go sit in that pile of sticks until I can deal with you!”
After mucking out the ark, Noah returns to the sticks and lifts them up, to find baby snakes everywhere. “What happened?” he asks. “I thought you were having problems!”
The snakes reply, “Even adders can multiply with logs.”
:-)
How to load large files safely into InnoDB with LOAD DATA INFILE
July 3rd, 2008Recently I had a customer ask me about loading two huge files into InnoDB with LOAD DATA INFILE. The goal was to load this data on many servers without putting it into the binary log. While this is generally a fast way to load data (especially if you disable unique key checks and foreign key checks), I recommended against this. There are several problems with the very large transaction caused by the single statement. We didn't want to split the file into pieces for the load for various reasons. However, I found a way to load the single file in chunks as though it were many small files, which avoided splitting the file and let us load with many transactions instead of one huge transaction.
The smaller file is 4.1GB and has 260M lines in it; each row is just two bigints. The bigger file was about 20GB and had wider rows with textual data and about 60M lines (as I recall).
When InnoDB loads the file, it creates one big transaction with a lot of undo log entries. This has a lot of costs. To name a few:
- the big LOAD DATA INFILE clogs the binary log and slows replication down. If the load takes 4 hours on the master, it will cause the slave to fall 4 hours behind.
- lots of undo log entries collect in the tablespace. Not only from the load -- but from other transactions' changes too; the purge thread cannot purge them, so everything gets bloated and slow. Even simple SELECT queries might have to scan through lots of obsolete, but not-yet-purged, row versions. Later, the purge thread will have to clean these up. This is how you make InnoDB behave like PostgreSQL
- If the undo log space grows really big, it won't fit in the buffer pool and InnoDB essentially starts swapping between its buffer pool and the tablespace on disk.
Most seriously, if something should happen and the load needs to roll back, it will take a Very Long Time to do -- I hate to think how long. I'm sure it would be faster to just shut everything down and re-clone the machine from another, which takes about 10 or 12 hours. InnoDB is not optimized for rollbacks, it's optimized for transactions that succeed and commit. Rollback can take an order of magnitude longer to do.
For that reason, we decided to load the file in chunks of a million rows each. (InnoDB internally does operations such as ALTER TABLE in 10k row chunks, by the way; I chose 1M because the rows were small). But how to do this without splitting the file? The answer lies in the Unix fifo. I created a script that reads lines out of the huge file and prints them to a fifo. Then we could use LOAD DATA INFILE on the fifo. Every million lines, the script prints an EOF character to the fifo, closes it and removes it, then re-creates it and keeps printing more lines. If you 'cat' the fifo file, you get a million lines at a time from it. The code is pretty simple and I've included it in Maatkit just for fun. (It's unreleased as of yet, but you can get it with the following command: "wget http://www.maatkit.org/trunk/fifo").
So how did it work? Did it speed up the load?
Not appreciably. There actually was a tiny speedup, but it's statistically insignificant IMO. I tested this first on an otherwise idle machine with the same hardware as the production machines. First, I did it in one big 4.1GB transaction, then I did it 1 million rows at a time. Here's the CREATE TABLE:
-
CREATE TABLE load_test (
-
col1 bigint(20) NOT NULL,
-
col2 bigint(20) DEFAULT NULL,
-
KEY(col1),
-
KEY(col2)
-
) ENGINE=InnoDB DEFAULT CHARSET=utf8
Here's the result of loading the entire 4GB file in one chunk:
-
time mysql -e "set foreign_key_checks=0; set sql_log_bin=0; set unique_checks=0; load data local infile 'infile.txt' into table load_test fields terminated by '\t' lines terminated by '\n' (col1, col2);"
-
-
real 234m53.228s
-
user 0m1.098s
-
sys 0m5.959s
While this ran, I captured vmstat output every 5 seconds and logged it to a file; I also captured the output of "mysqladmin ext -ri5 | grep Handler_write" and logged that to a file.
To load the file in chunks, I split my screen session in two and then ran (approximately -- edited for clarity) the following in one terminal:
-
perl mk-fifo-split infile.txt --fifo /tmp/my-fifo --lines 1000000
And this in the other terminal:
-
while [ -e /tmp/my-fifo ]; do
-
mysql -e "..... same as above.... ";
-
sleep 1;
-
done
After I was done, I ran a quick Perl script on the vmstat and mysqladmin log files to grab out the disk activity and rows-per-second to see what the progress was. Here are some graphs. This one is the rows per second from mysqladmin, and the blocks written out per second from vmstat.

And this one is the bytes/sec from Cacti running against this machine. This is only the bytes out per second; for some reason Cacti didn't seem to be capturing the bytes in per second.

You can see how the curves are roughly logarithmic, which is what you should expect for B-Tree indexes. The two curves on the Cacti graph actually show both files being loaded. It might seem counter-intuitive, but the second (smaller) curve is actually the larger file. It has fewer rows and that's why it causes less I/O overall.
I also used 'time' to run the Perl fifo script, and it used a few minutes of CPU time during the loads. So not very much at all.
Some interesting things to note: the load was probably mostly CPU-bound. vmstat showed from 1% to 3% I/O wait during this time. (I didn't think to use iostat to see how much the device was actually used, so this isn't a scientific measurement of how much the load was really waiting for I/O). The single-file load showed about 1 or 2 percent higher I/O wait, and you can see the single-file load uses more blocks per row; I can only speculate that this is the undo log entries being written to disk. (Peter arrived at the same guess independently.)
Unfortunately I didn't think to log the "cool-down period" after the load ended. It would be fun to see that. Cacti seemed to show no cool-down period -- as soon as the load was done it looked like things went back to normal. I suspect that's not completely true, since the buffer pool must have been overly full with this table's data.
Next time I do something like this I'll try smaller chunks, such as 10k rows; and I'll try to collect more stats. It would also be interesting to try this on an I/O-bound server and see what the performance impact is, especially on other transactions running at the same time.
Entry posted by Baron Schwartz | One comment
How to load large files safely into InnoDB with LOAD DATA INFILE
July 3rd, 2008Recently I had a customer ask me about loading two huge files into InnoDB with LOAD DATA INFILE. The goal was to load this data on many servers without putting it into the binary log. While this is generally a fast way to load data (especially if you disable unique key checks and foreign key checks), I recommended against this. There are several problems with the very large transaction caused by the single statement. We didn't want to split the file into pieces for the load for various reasons. However, I found a way to load the single file in chunks as though it were many small files, which avoided splitting the file and let us load with many transactions instead of one huge transaction.
The smaller file is 4.1GB and has 260M lines in it; each row is just two bigints. The bigger file was about 20GB and had wider rows with textual data and about 60M lines (as I recall).
When InnoDB loads the file, it creates one big transaction with a lot of undo log entries. This has a lot of costs. To name a few:
- the big LOAD DATA INFILE clogs the binary log and slows replication down. If the load takes 4 hours on the master, it will cause the slave to fall 4 hours behind.
- lots of undo log entries collect in the tablespace. Not only from the load -- but from other transactions' changes too; the purge thread cannot purge them, so everything gets bloated and slow. Even simple SELECT queries might have to scan through lots of obsolete, but not-yet-purged, row versions. Later, the purge thread will have to clean these up. This is how you make InnoDB behave like PostgreSQL
- If the undo log space grows really big, it won't fit in the buffer pool and InnoDB essentially starts swapping between its buffer pool and the tablespace on disk.
Most seriously, if something should happen and the load needs to roll back, it will take a Very Long Time to do -- I hate to think how long. I'm sure it would be faster to just shut everything down and re-clone the machine from another, which takes about 10 or 12 hours. InnoDB is not optimized for rollbacks, it's optimized for transactions that succeed and commit. Rollback can take an order of magnitude longer to do.
For that reason, we decided to load the file in chunks of a million rows each. (InnoDB internally does operations such as ALTER TABLE in 10k row chunks, by the way; I chose 1M because the rows were small). But how to do this without splitting the file? The answer lies in the Unix fifo. I created a script that reads lines out of the huge file and prints them to a fifo. Then we could use LOAD DATA INFILE on the fifo. Every million lines, the script prints an EOF character to the fifo, closes it and removes it, then re-creates it and keeps printing more lines. If you 'cat' the fifo file, you get a million lines at a time from it. The code is pretty simple and I've included it in Maatkit just for fun. (It's unreleased as of yet, but you can get it with the following command: "wget http://www.maatkit.org/trunk/fifo").
So how did it work? Did it speed up the load?
Not appreciably. There actually was a tiny speedup, but it's statistically insignificant IMO. I tested this first on an otherwise idle machine with the same hardware as the production machines. First, I did it in one big 4.1GB transaction, then I did it 1 million rows at a time. Here's the CREATE TABLE:
-
CREATE TABLE load_test (
-
col1 bigint(20) NOT NULL,
-
col2 bigint(20) DEFAULT NULL,
-
KEY(col1),
-
KEY(col2)
-
) ENGINE=InnoDB DEFAULT CHARSET=utf8
Here's the result of loading the entire 4GB file in one chunk:
-
time mysql -e "set foreign_key_checks=0; set sql_log_bin=0; set unique_checks=0; load data local infile 'infile.txt' into table load_test fields terminated by '\t' lines terminated by '\n' (col1, col2);"
-
-
real 234m53.228s
-
user 0m1.098s
-
sys 0m5.959s
While this ran, I captured vmstat output every 5 seconds and logged it to a file; I also captured the output of "mysqladmin ext -ri5 | grep Handler_write" and logged that to a file.
To load the file in chunks, I split my screen session in two and then ran (approximately -- edited for clarity) the following in one terminal:
-
perl mk-fifo-split infile.txt --fifo /tmp/my-fifo --lines 1000000
And this in the other terminal:
-
while [ -e /tmp/my-fifo ]; do
-
mysql -e "..... same as above.... ";
-
sleep 1;
-
done
After I was done, I ran a quick Perl script on the vmstat and mysqladmin log files to grab out the disk activity and rows-per-second to see what the progress was. Here are some graphs. This one is the rows per second from mysqladmin, and the blocks written out per second from vmstat.

And this one is the bytes/sec from Cacti running against this machine. This is only the bytes out per second; for some reason Cacti didn't seem to be capturing the bytes in per second.

You can see how the curves are roughly logarithmic, which is what you should expect for B-Tree indexes. The two curves on the Cacti graph actually show both files being loaded. It might seem counter-intuitive, but the second (smaller) curve is actually the larger file. It has fewer rows and that's why it causes less I/O overall.
I also used 'time' to run the Perl fifo script, and it used a few minutes of CPU time during the loads. So not very much at all.
Some interesting things to note: the load was probably mostly CPU-bound. vmstat showed from 1% to 3% I/O wait during this time. (I didn't think to use iostat to see how much the device was actually used, so this isn't a scientific measurement of how much the load was really waiting for I/O). The single-file load showed about 1 or 2 percent higher I/O wait, and you can see the single-file load uses more blocks per row; I can only speculate that this is the undo log entries being written to disk. (Peter arrived at the same guess independently.)
Unfortunately I didn't think to log the "cool-down period" after the load ended. It would be fun to see that. Cacti seemed to show no cool-down period -- as soon as the load was done it looked like things went back to normal. I suspect that's not completely true, since the buffer pool must have been overly full with this table's data.
Next time I do something like this I'll try smaller chunks, such as 10k rows; and I'll try to collect more stats. It would also be interesting to try this on an I/O-bound server and see what the performance impact is, especially on other transactions running at the same time.
Entry posted by Baron Schwartz | One comment







