PostgreSQL scaling on 6.2 and 7.0

Posted by Kevin Way Wed, 11 Apr 2007 22:40:00 GMT

My previous posts have documented the utterly insane lack of scalability of PostgreSQL (at least according to that particular metric) on FreeBSD 6.2.

I ran the same test, on the same machine, using 7.0 with the 4BSD and the ULE schedulers. Postgres was tuned as in the previous posts, and WITNESS, INVARIANTS and all malloc debugging options were off.

Results:

Hopefully nobody will extrapolate this to mean more than it really does (it’s just one test, but it’s one that happens to matter to me.)

More is better?

Posted by Kevin Way Wed, 11 Apr 2007 07:55:00 GMT

Not always. I did some more tests, and confirmed that this particular workload actually gets slower as you add more than 2 cores.

sysbench --num-threads=${i} --test=oltp --pgsql-user=bench --pgsql-db=bench --db-driver=pgsql --max-time=60 --max-requests=0 --oltp-read-only=on run

Here’s the results, with the tests run on absolutely identical hardware. (HP DL360, 2.00 GHz E5335 Xeon, 10K RPM SAS drives, 10 GB of RAM)

I know 6.2 isn’t the greatest for SMP, but I was still surprised to learn that we could get significantly lower overall performance by adding more processors.

I tried to test 7.0, but the 7.0 installer didn’t seem to recognize the iLO virtual keyboard tonight. Maybe tomorrow will go better.

What did I do wrong? 2

Posted by Kevin Way Tue, 10 Apr 2007 01:09:00 GMT

I’m hoping that somebody will look at this and say “hey, dummy, you did this all wrong.” I’m hoping that, because the results are shocking.

There have been a buzz about FreeBSD’s SMP performance compared to Linux’s, particularly a particular 8-core SMP test, using FreeBSD 7.0, sysbench, and mysql. A number of people mentioned that there were known problems with this test and PostgreSQL, so I decided to see how bad the problems were.

The results are so shocking, that my first thought was that I must have done something wrong.

The hardware:

Machine 1:
HP DL360 G5
1 5140 (dual core) 2.33 GHz (1333 FSB) Xeon
4 GB RAM
2 15K RPM SAS hard drives, running as mirrors through an e400i drive array

Machine 2:
HP DL360 G5
2 E5335 (quad core) 2.33 GHz (1333 FSB) Xeons
10 GB RAM
2 10K RPM SAS hard drives, running as mirrors through an e400i drive array

Operating System: Both machines are running FreeBSD 6.2-p3. After all, I wanted to see what performance levels to expect with production-quality software, not the 7.0 sweetness.

Kernel: Both are running identically configured SMP-enabled amd64 kernels.

sysctl.conf: both have the following
kern.ipc.shmmax=2147483647
kern.ipc.shmall=524288
kern.ipc.semmsl=512
kern.ipc.semmap=256
kern.ipc.somaxconn=2048
kern.maxfiles=65536
vfs.read_max=32
loader.conf: both have the following
kern.ipc.semmni=256
kern.ipc.semmns=2048
postgresql.conf: both have the following settings changed
shared_buffers = 1GB                 
work_mem = 64MB
maintenance_work_mem = 32MB
max_fsm_pages = 204800   
random_page_cost = 3.0
effective_cache_size = 512MB
update_process_title = off

Both drive arrays are partitioned identically, and postgres was installed as the only running application in each case.

The test was run as follows:
sysbench --num-threads=${i} --test=oltp --pgsql-user=bench --pgsql-db=bench --db-driver=pgsql --max-time=120 --max-requests=0 --oltp-read-only=on run

And yet here are the results:

Yes, the 8-core, 10GB machine is massively slower than the 2-core 4GB machine.

This can’t be right, can it?

Followup: I ran this single-cpu test:
openssl speed rsa
1×2-Core 2.33GHz 5140 Xeon
                  sign    verify    sign/s verify/s
rsa  512 bits   0.0003s   0.0000s   3905.9  45028.8
rsa 1024 bits   0.0009s   0.0001s   1164.6  18787.2
rsa 2048 bits   0.0049s   0.0002s    205.2   6332.2
rsa 4096 bits   0.0320s   0.0005s     31.2   1917.9

2×4-Core 2.33 GHz E5335 Xeon
rsa  512 bits   0.0003s   0.0000s   3381.1  38972.0
rsa 1024 bits   0.0010s   0.0001s   1003.5  15991.7
rsa 2048 bits   0.0057s   0.0002s    176.7   5481.7
rsa 4096 bits   0.0373s   0.0006s     26.8   1638.6

I wanted to see if this was simply one algorithm, or if other algorithms showed the same sort of slowdown, so I ran the following:

openssl speed md5 sha1 aes-256-cbc
The 2-core machine:
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
md5              14172.75k    45816.42k   114118.73k   182293.03k   220175.36k
sha1             14901.45k    42795.69k    97387.27k   143547.69k   166264.67k
aes-256 cbc      91518.54k    94159.65k    93817.92k    92696.25k    93907.24k
The 8-core machine:
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
md5              12305.58k    39149.96k    97745.88k   155771.63k   188252.02k
sha1             12740.06k    36648.90k    83318.76k   122733.05k   142119.67k
aes-256 cbc      78109.56k    80589.95k    81222.08k    81406.47k    81401.94k

The 8-core is a little slower than the 2-core again, but not by the enormous levels we see on the pgsql sysbench.

When running RSA across multiple threads, the results are closer to what one would expect. The command I used to do this:


openssl speed rsa -multi N

And now I’m left with the question: Did I screw something up, or is this what Kris Kennaway was talking about when he wrote this message?

I must admit, that while I expected the performance to fall off, it never occurred to me that an 8-core system could be slower than a 2-core system right out of the gate.

DELETE FROM important WHERE overly broad condition;

Posted by Kevin Way Wed, 14 Mar 2007 03:20:00 GMT

A client had an interesting problem. They had a critical PostgreSQL database, and they accidentally ran an overly broad delete on a table.

And their newest backup was two weeks old.

Could we help? Of course we could help.

The procedure used was as follows:

1) Stop the database.

2) Backup the database files to a secure location.

3) Make a second copy of the database files, to a new, empty postgresql instance.

4) Now if you have a dump prior to the error, and xlogs for the entire time period, you can do a Point-In-Time Recovery, with a properly configured recovery.conf. Unfortunately we didn’t have that. We had a two-week old dump and 6 hours of xlogs.

5) Try to determine an approximate transaction ID of the mistake. We did this by looking at timestamps of files on the newest pg_clog file, finding one that happened before the mistake, and then multiplying it’s name by 1048576.

6) Then we used that xid with pg_resetxlog, on the secondary database that we had created. It worked, and we had a view of the data. We then played a bit, to find an xid that was acceptably close to the point of failure.

7) From there, we just did a straight SQL COPY, to get the data from the db server we had just setup onto the original database, and to get it loaded.

It wasn’t the most elegant solution, but it worked beautifully, and the client was pleased.

Here’s to hoping you never need this information!