
A postgres and MySQL benchmark, heh.


Question: A postgres and MySQL benchmark, heh.
Description:

Comparing PostgreSQL 9.1 vs. MySQL 5.6 using Drupal 7.x

It's tough to come across much information about running Drupal on PostgreSQL beyond the basics of getting it installed. In particular, I'm interested in comparisons of running Drupal on PostgreSQL versus MySQL. Previous posts, such as this article from 2bits comparing the performance of MySQL and PostgreSQL on Drupal 5.x, seem a bit outdated. This post from the High Performance Drupal group is also pretty dated and has some information with similar comparisons.

In this post, I wanted to run tests similar to those in the 2bits article, but on a more recent version of Drupal: 7.x. I also wanted to test a few of the more complex queries that the Views module can generate and see how they perform in MySQL versus PostgreSQL.

For this post, I used the latest GA version of PostgreSQL and, for kicks, went with an alpha release of MySQL: 5.6. I would expect to see similar results for 5.5 in tests like this. I didn't use the default configurations after installation since I didn't see much benefit in testing those. The configurations I used for both systems are documented below.

Environment Setup

All results were gathered on EC2 instances. The base AMI used for these results is an official AMI of Ubuntu 10.04 provided by Canonical. The particular AMI used as the base image for the results gathered in this post was ami-0baf7662.

Images used were all launched in the US-EAST-1A availability zone and were large instance types. After launching this base image, I installed MySQL 5.6 and Drupal 7.12. The steps I took to install these components, along with the my.cnf file I used for MySQL, are outlined in this gist.

The PostgreSQL 9.1 setup was performed on a separate instance; the steps, along with the postgresql.conf settings I used, are outlined in this gist.
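Both gists pin down the exact settings. Purely to illustrate the kind of tuning involved, the relevant knobs look something like the following sketch; these values are hypothetical, not the ones from the gists (work_mem = 4MB is the one setting confirmed later in this post):

# my.cnf excerpt (illustrative values only)
[mysqld]
innodb_buffer_pool_size = 4G
innodb_log_file_size    = 256M

# postgresql.conf excerpt (illustrative values only)
shared_buffers       = 2GB
effective_cache_size = 6GB
work_mem             = 4MB
checkpoint_segments  = 16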

APC was installed and its default configuration was used on both servers.

Data Generation

I used drush and the devel module to generate data (a sketch of the corresponding drush commands follows the list). I generated the following data:

users 50000
tags 1000
vocabularies 5000
menus 5000
nodes 100000
max comments per node 10
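A minimal sketch of that generation step, assuming the devel_generate drush commands shipped with the devel module (the exact invocations and the vocabulary machine name are hypothetical):

# hypothetical devel_generate invocations matching the counts above
drush generate-users 50000
drush generate-vocabs 5000
drush generate-terms tags 1000
drush generate-menus 5000
drush generate-content 100000 10   # 100000 nodes, up to 10 comments per node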

I generated this data in the MySQL installation first. The data was then migrated to the PostgreSQL instance using the dbtng_migrator module. This ensures the same data is used for all tests against MySQL and PostgreSQL. I covered how to perform this migration in a previous post.

pgbouncer

One additional setup item I performed for PostgreSQL was to install pgbouncer and configure Drupal to connect through pgbouncer instead of directly to PostgreSQL.

Installation and configuration on Ubuntu 10.04 are straightforward. The steps to install pgbouncer and the configuration I used are outlined in this gist.
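As a rough sketch of what that setup involves (illustrative values, not the gist's exact configuration):

sudo apt-get install pgbouncer

# write an illustrative pgbouncer.ini; values here are examples only
sudo tee /etc/pgbouncer/pgbouncer.ini <<'EOF'
[databases]
drupal = host=127.0.0.1 port=5432 dbname=drupal

[pgbouncer]
listen_addr = 127.0.0.1
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = session
default_pool_size = 20
max_client_conn = 200
EOF

# on Ubuntu 10.04 the init script must also be enabled (START=1 in /etc/default/pgbouncer)
sudo service pgbouncer restart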

The main reason for this change is that an ApacheBench-based test unfairly penalizes PostgreSQL because of the two systems' differing connection models: with MySQL, each new connection results in a new thread being spawned, whereas with PostgreSQL, each new connection results in a new process being forked. The overhead of forking a process is much larger than that of spawning a thread. I did collect numbers for PostgreSQL without pgbouncer, and I report them in the ApacheBench test section below.

pgbouncer maintains a connection pool that Drupal connects to, so in the settings.php file for my Drupal PostgreSQL instance, I modified my database settings to be:

$databases = array(
  'default' => array(
    'default' => array(
      'database' => 'drupal',
      'username' => 'drupal',
      'password' => 'drupal',
      'host'     => 'localhost',
      'port'     => '6432',
      'driver'   => 'pgsql',
      'prefix'   => '',
    ),
  ),
);

I performed this configuration step after I generated data in MySQL and migrated it to PostgreSQL.
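To sanity-check that traffic really flows through the pooler, pgbouncer's admin console can be queried; this sketch assumes the drupal user is listed in stats_users in pgbouncer.ini:

# connect to pgbouncer's admin pseudo-database and inspect connection pools
psql -h localhost -p 6432 -U drupal pgbouncer -c 'SHOW POOLS;'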

Anonymous Users Testing with ApacheBench

First, I loaded the front page of each Drupal site with the devel module enabled and reporting query execution times. The following was reported:

Database                      Query execution time
MySQL                         65 queries in 31.69 ms
PostgreSQL (with pgbouncer)   66 queries in 49.84 ms
PostgreSQL (direct)           66 queries in 95 ms

Straight out of the gate, we can see there is not much difference here: 31 ms versus 50 ms is not going to be felt by many end users. If pgbouncer is not used, though, query execution time is roughly three times slower.

Next, I ran some simple benchmarks using ApacheBench. The command used to run ab was the following (the number of concurrent connections, X, was the only parameter varied):

ab -c X -n 100 http://drupal.url.com/
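A small wrapper like the following (drupal.url.com is the same placeholder host as above) sweeps those concurrency levels and pulls out ab's requests-per-second figure:

# run ab at each concurrency level and extract the mean requests/sec
for c in 1 5 10; do
  ab -c "$c" -n 100 http://drupal.url.com/ 2>/dev/null \
    | awk -v c="$c" '/^Requests per second/ {print "c=" c ": " $4 " req/s"}'
done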

The ab command was always run from a separate EC2 instance in the same availability zone, never on the instance on which Drupal was running.

Results obtained with the default Drupal configuration (page cache disabled, all other caching enabled) are shown in the figure below. The raw numbers (requests per second, as reported by ab; higher is better) are presented in the table after the figure.

[Figure: requests per second, page cache disabled]

Database                       c = 1    c = 5    c = 10
MySQL                          11.71    16.53    16.28
PostgreSQL (using pgbouncer)    8.44    11.03    11.10
PostgreSQL (direct)             4.81     7.32     7.22

The next test was run after all caches were cleared using drush. The command issued was:

drush cc

Option 1 was then chosen to clear all caches. This was done before each ab run. Results are shown in the figure below, with raw numbers in the table after the figure.
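For scripted runs, the interactive prompt can be skipped; a sketch, assuming the drush cc all variant of the command:

# clear all caches non-interactively, then benchmark
drush cc all && ab -c 5 -n 100 http://drupal.url.com/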

[Figure: requests per second, after clearing all caches]

Database                       c = 1    c = 5    c = 10
MySQL                          10.50    14.08     6.28
PostgreSQL (using pgbouncer)    7.92     9.23     7.32
PostgreSQL (direct)             5.00     7.04     6.79

Finally, the same test was run with Drupal’s page cache enabled. Results are shown in the figure below with raw numbers presented in the table after the figure.

[Figure: requests per second, page cache enabled]

Database                       c = 1    c = 5    c = 10
MySQL                            144      282      267
PostgreSQL (using pgbouncer)     120      205      202
PostgreSQL (direct)               35       45       46

Views Queries

The Views module is known to sometimes generate queries that can cause performance problems for MySQL.

Image Gallery View

The first SQL query I want to look at is generated by one of the sample templates that comes with the Views module. If you click 'Add view from template' in the Views module, by default you will have only one template to choose from: the Image Gallery template. After creating a view from this template without modifying anything about it, I see two problematic queries being generated.

The first query counts the number of rows in the result set for this view, since this is a paginated view. The second query actually retrieves the results, with a LIMIT clause and the appropriate OFFSET depending on which page of the results the user is currently on. For this post, we'll just look at the second query, the one that retrieves results. That query is:

SELECT taxonomy_index.tid AS taxonomy_index_tid,
       taxonomy_term_data.name AS taxonomy_term_data_name,
       COUNT(node.nid) AS num_records
FROM node node
LEFT JOIN users users_node ON node.uid = users_node.uid
LEFT JOIN field_data_field_image field_data_field_image
  ON node.nid = field_data_field_image.entity_id
  AND (field_data_field_image.entity_type = 'node' AND field_data_field_image.deleted = '0')
LEFT JOIN taxonomy_index taxonomy_index ON node.nid = taxonomy_index.nid
LEFT JOIN taxonomy_term_data taxonomy_term_data ON taxonomy_index.tid = taxonomy_term_data.tid
WHERE (((field_data_field_image.field_image_fid IS NOT NULL) AND (node.status = '1')))
GROUP BY taxonomy_term_data_name, taxonomy_index_tid
ORDER BY num_records ASC
LIMIT 24 OFFSET 0

The response time of the query in MySQL versus PostgreSQL is shown in the figure below.

[Figure: image gallery view query response time, MySQL vs. PostgreSQL]

As seen in the figure above, PostgreSQL can execute the query in question in 300 ms or less, whereas MySQL consistently takes 2800 ms.
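A sketch of how timings like these can be reproduced, assuming the query above has been saved (with a terminating semicolon) to a file named gallery.sql:

# PostgreSQL: \timing prints the elapsed time for each statement
( printf '%s\n' '\timing'; cat gallery.sql ) | psql -U drupal drupal

# MySQL: -vvv echoes each statement along with its execution time
mysql -u drupal -p drupal -vvv < gallery.sql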

The MySQL execution plan looks like:

*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: field_data_field_image
         type: ref
possible_keys: PRIMARY,entity_type,deleted,entity_id,field_image_fid
          key: PRIMARY
      key_len: 386
          ref: const
         rows: 19165
        Extra: Using where; Using temporary; Using filesort
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: node
         type: eq_ref
possible_keys: PRIMARY,node_status_type
          key: PRIMARY
      key_len: 4
          ref: drupal.field_data_field_image.entity_id
         rows: 1
        Extra: Using where
*************************** 3. row ***************************
           id: 1
  select_type: SIMPLE
        table: users_node
         type: eq_ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 4
          ref: drupal.node.uid
         rows: 1
        Extra: Using where; Using index
*************************** 4. row ***************************
           id: 1
  select_type: SIMPLE
        table: taxonomy_index
         type: ref
possible_keys: nid
          key: nid
      key_len: 4
          ref: drupal.field_data_field_image.entity_id
         rows: 1
        Extra: NULL
*************************** 5. row ***************************
           id: 1
  select_type: SIMPLE
        table: taxonomy_term_data
         type: eq_ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 4
          ref: drupal.taxonomy_index.tid
         rows: 1
        Extra: NULL

MySQL starts from the field_data_field_image table and, since there are no selective predicates in the query, chooses to scan the table using its PRIMARY key. It then filters the scanned rows using the field_image_fid IS NOT NULL predicate. Since MySQL has only one join algorithm, nested loops, that is what is used to perform the remainder of the joins. A temporary table is created in memory to store the results of these joins; this is then sorted and the result set limited to the 24 rows requested.

The PostgreSQL execution plan looks drastically different.

Limit  (cost=11712.83..11712.89 rows=24 width=20)
  ->  Sort  (cost=11712.83..11829.24 rows=46564 width=20)
        Sort Key: (count(node.nid))
        ->  HashAggregate  (cost=9946.90..10412.54 rows=46564 width=20)
              ->  Hash Left Join  (cost=6174.69..9597.67 rows=46564 width=20)
                    Hash Cond: (taxonomy_index.tid = taxonomy_term_data.tid)
                    ->  Hash Right Join  (cost=6140.19..8922.92 rows=46564 width=12)
                          Hash Cond: (taxonomy_index.nid = node.nid)
                          ->  Seq Scan on taxonomy_index  (cost=0.00..1510.18 rows=92218 width=16)
                          ->  Hash  (cost=5657.14..5657.14 rows=38644 width=4)
                                ->  Hash Join  (cost=2030.71..5657.14 rows=38644 width=4)
                                      Hash Cond: (node.nid = field_data_field_image.entity_id)
                                      ->  Seq Scan on node  (cost=0.00..2187.66 rows=76533 width=8)
                                            Filter: (status = 1)
                                      ->  Hash  (cost=1547.66..1547.66 rows=38644 width=8)
                                            ->  Seq Scan on field_data_field_image  (cost=0.00..1547.66 rows=38644 width=8)
                                                  Filter: ((field_image_fid IS NOT NULL) AND ((entity_type)::text = 'node'::text) AND (deleted = 0::smallint))
                    ->  Hash  (cost=22.00..22.00 rows=1000 width=12)
                          ->  Seq Scan on taxonomy_term_data  (cost=0.00..22.00 rows=1000 width=12)

PostgreSQL has a number of other join algorithms available for use. In particular, for this query, the optimizer has decided that a hash join is the optimal choice.

PostgreSQL starts by scanning the tiny (1000 rows) taxonomy_term_data table and constructing an in-memory hash table (the build phase of a hash join). It then probes this hash table for matches of taxonomy_index.tid = taxonomy_term_data.tid for each row that results from a hash join of taxonomy_index and node. That hash join in turn comes from joining field_data_field_image and node, with field_data_field_image used to build a hash table and a sequential scan of node used to probe it. Aggregation is then performed, and the result set is sorted by the aggregated value (in this case a count of node ids). Finally, the result set is limited to 24 rows.

One neat thing about PostgreSQL is that planner node types can be disabled. So, to make PostgreSQL execute the query in a manner similar to MySQL, I did:

drupal=> set enable_hashjoin=off;
SET
drupal=> set enable_hashagg=off;
SET
drupal=> set enable_mergejoin=off;
SET
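These overrides are session-local; to return to the default plan without reconnecting, they can be reset in the same session:

drupal=> reset enable_hashjoin;
RESET
drupal=> reset enable_hashagg;
RESET
drupal=> reset enable_mergejoin;
RESET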

And the execution plan PostgreSQL chose then was:

Limit  (cost=52438.04..52438.10 rows=24 width=20)
  ->  Sort  (cost=52438.04..52552.82 rows=45913 width=20)
        Sort Key: (count(node.nid))
        ->  GroupAggregate  (cost=50237.67..51155.93 rows=45913 width=20)
              ->  Sort  (cost=50237.67..50352.45 rows=45913 width=20)
                    Sort Key: taxonomy_term_data.name, taxonomy_index.tid
                    ->  Nested Loop Left Join  (cost=0.00..46682.48 rows=45913 width=20)
                          ->  Nested Loop Left Join  (cost=0.00..33783.81 rows=45913 width=12)
                                ->  Nested Loop  (cost=0.00..18575.38 rows=38644 width=4)
                                      ->  Seq Scan on field_data_field_image  (cost=0.00..1547.66 rows=38644 width=8)
                                            Filter: ((field_image_fid IS NOT NULL) AND ((entity_type)::text = 'node'::text) AND (deleted = 0::smallint))
                                      ->  Index Scan using node_pkey on node  (cost=0.00..0.43 rows=1 width=8)
                                            Index Cond: (nid = field_data_field_image.entity_id)
                                            Filter: (status = 1)
                                ->  Index Scan using taxonomy_index_nid_idx on taxonomy_index  (cost=0.00..0.36 rows=3 width=16)
                                      Index Cond: (node.nid = nid)
                          ->  Index Scan using taxonomy_term_data_pkey on taxonomy_term_data  (cost=0.00..0.27 rows=1 width=12)
                                Index Cond: (taxonomy_index.tid = tid)

The above plan takes 2 seconds to execute against PostgreSQL. You can see it is very similar to the MySQL plan: it starts with the field_data_field_image table and performs nested loop joins to join the remainder of the tables. In this case, an expensive sort must be performed before the aggregation. Using the HashAggregate operator instead is what lets PostgreSQL avoid that cost.

So you can see that, out of the box, PostgreSQL performs much better on this query.

Simple View

I created a simple view that filters and sorts on content criteria. A screenshot of my view construction page can be seen here.

The resulting SQL query that gets executed by this view is:

SELECT DISTINCT node.title AS node_title,
       node.nid AS nid,
       node_comment_statistics.comment_count AS node_comment_statistics_comment_count,
       node.created AS node_created
FROM node node
INNER JOIN node_comment_statistics node_comment_statistics
  ON node.nid = node_comment_statistics.nid
WHERE (((node.status = '1')
  AND (node.comment IN ('2'))
  AND (node.nid >= '111')
  AND (node_comment_statistics.comment_count >= '2')))
ORDER BY node_created ASC
LIMIT 50 OFFSET 0

The response time of the query in MySQL versus PostgreSQL is shown in the figure below.

[Figure: simple view query response time, MySQL vs. PostgreSQL]

As seen in the figure above, PostgreSQL can execute the query in question in 200 ms or less, whereas MySQL can take up to 1000 ms.

The MySQL execution plan looks like:

*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: node
         type: index
possible_keys: PRIMARY,node_status_type
          key: node_created
      key_len: 4
          ref: NULL
         rows: 100
        Extra: Using where; Using temporary
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: node_comment_statistics
         type: eq_ref
possible_keys: PRIMARY,comment_count
          key: PRIMARY
      key_len: 4
          ref: drupal.node.nid
         rows: 1
        Extra: Using where

MySQL chooses to start from the node table and scans an index on the created column. A temporary table is then created in memory to store the results of this index scan. The items stored in the temporary table are then processed to eliminate duplicates (for the DISTINCT). For each distinct row in the temporary table, MySQL then performs a join to the node_comment_statistics table by performing an index lookup using its primary key.

The PostgreSQL execution plan for this query looks like:

Limit  (cost=6207.15..6207.27 rows=50 width=42)
  ->  Sort  (cost=6207.15..6250.75 rows=17441 width=42)
        Sort Key: node.created
        ->  HashAggregate  (cost=5453.36..5627.77 rows=17441 width=42)
              ->  Hash Join  (cost=1985.31..5278.95 rows=17441 width=42)
                    Hash Cond: (node.nid = node_comment_statistics.nid)
                    ->  Seq Scan on node  (cost=0.00..2589.32 rows=38539 width=34)
                          Filter: ((nid >= 111) AND (status = 1) AND (comment = 2))
                    ->  Hash  (cost=1546.22..1546.22 rows=35127 width=16)
                          ->  Seq Scan on node_comment_statistics  (cost=0.00..1546.22 rows=35127 width=16)
                                Filter: (comment_count >= 2::bigint)

PostgreSQL chooses to start by scanning the node_comment_statistics table and building an in-memory hash table. This hash table is then probed for matches of node.nid = node_comment_statistics.nid for each row that results from a sequential scan of the node table. The result of this hash join is then aggregated (for the DISTINCT) before being sorted and limited to 50 rows.

It's worth noting that with out-of-the-box settings, the above query would do a disk-based sort (the sort method is viewable using EXPLAIN ANALYZE in PostgreSQL). When doing a disk-based sort, the query takes about 450 ms to execute. I was running all my tests with work_mem set to 4MB, though, which results in a top-N heapsort being used.
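A sketch of how to observe this, assuming the simple-view query above has been saved to simple_view.sql (PostgreSQL 9.1's default work_mem is 1MB):

# compare the sort method chosen at two work_mem settings
# (the extra ';' guards against a file without a trailing semicolon)
for mem in 1MB 4MB; do
  ( echo "SET work_mem = '$mem';"
    echo 'EXPLAIN ANALYZE'
    cat simple_view.sql
    echo ';' ) | psql -U drupal drupal | grep 'Sort Method'
done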

Conclusion

In my opinion, the only issue with using PostgreSQL as your Drupal database is that some contributed modules will not work out of the box with that configuration.

Certainly, from a performance point of view, I see no issues with using PostgreSQL with Drupal. In fact, for Drupal sites using the Views module (probably the majority), I would say PostgreSQL is probably even a better option than MySQL due to its more advanced optimizer and execution engine. This does assume pgbouncer is being used and Drupal is not connecting directly to PostgreSQL. Users who do not use pgbouncer and perform simple benchmarks like the ones I did with ab are likely to see poor performance against PostgreSQL.

I’m working a lot with Drupal on PostgreSQL these days. I’ll be sure to share any interesting experiences I have here.

http://posulliv.github.io/2012/06/29/mysql-postgres-bench/

Heh.


Solution 1:

Quoting the answer from "kenyon_君羊":

The configuration parameters weren't posted, but judging from the results, postgres completely crushed mysql in this scenario.

Solution 2:

pgsql is a sports car: race it on city streets and, however fast it goes, it only gains a few red lights' worth of distance.

mysql is fine for cruising around town, but put it on the highway or the track and its true colors show.

Solution 3:

Could some OSCer translate this?

Solution 4:

The configuration parameters weren't posted, but judging from the results, postgres completely crushed mysql in this scenario.

Solution 5:

Well said, reply #1.

Solution 6:

Just sitting back and waiting for the two camps to flame each other.

Solution 7:

A post destined to get flamed by certain people. Waiting.

Solution 8:

What's the point of copying someone else's article over here?

Solution 9:

The MyISAM engine?

Solution 10:

Why not translate it? With the English I just looked at the charts. What is it actually saying? And the gap in the last two charts is way too big, isn't it?

