使用 TPoX 测试 XML 数据库的性能(2)

来源:developerWorks 中国 作者:张巍
  

2.2 生成测试数据

如果下载的测试数据不能满足你的需求,你可以用 TPoX 自带的数据生成工具来生成满足你要求的数据。运行之前,请先下载下列库文件:

1. ToxGene(http://www.cs.toronto.edu/tox/toxgene/downloads.html),解压后放到 TPoX/toxgene 目录下;

2. PoBoy.jar(http://www.cs.toronto.edu/tox/toxgene/downloads.html),保存到 toxgene/lib 目录下;

3. Xerces 解析器(http://xml.apache.org/dist/xerces-j/Xerces-J-bin.2.5.0.zip)解压后保存到 toxgene/lib 目录下;

最后将所有 jar 包加入到 CLASSPATH 变量中,将 toxgene 加入到 PATH 中。


清单 5. export CLASSPATH 示例
			
export CLASSPATH=$CLASSPATH:...PoBoy.jar:...xercesImpl.jar:...xmlParserAPIs.jar 
 export PATH=$PATH:$HOME/TPoX/toxgene/bin


清单 6. 运行 generateXML.ksh
			
cd TPoX/datagen 

 ./generateXML.ksh 

 Usage: generateXML.ksh 

 [-o OUTPUTPATH]  output directory relative to the datagen directory (default: ../gene 
 [-t DOCUMENT TYPE]  Possible values: c, custacc, o, order, s, security, a, account, all. 
 [-s SCALE]     defines the scale factor (integer), skips input prompt. (default is 
 [-p PARALLEL_GEN]  max number of parallel toxgene sessions  (default: 4) 
 [-d PARALLEL_DEL]  max parallelism to delete existing data  (default: 30) 
 [-e EXTRA_DOCS]    generate additional XML docs for insert transactions on top of 
          the populated database [0%, 15%, 33%, 50%, 66%, 100%]  (default: 15% 

 --> Scale factor must be an integer. Decimal points not allowed. 

 --> the output path will be : /home/zhangw/TPoX/datagen/../generatedXML 
 \n 
 ****************************************** 
 ****    TPoX XML DATA GENERATION      **** 
 ******************************************\n 
 PLEASE ENTER SCALE FACTOR FOR DATA GENERATION (DocType: * all *) 
 =======================================================================\n 
 XXXS (100MB:  6K custacc,  30K order)  --> 1 
 XXS  (1GB:   60K custacc, 300K order)  --> 10 
 XS   (10GB: 600K custacc,   3M order)  --> 100 
 S    (100GB:  6M custacc,  30M order)  --> 1000 
 M    (1TB:   60M custacc, 300M order)  --> 10000 
 L    (10TB: 600M custacc,   3B order)  --> 100000 
 XL   (100TB:  6B custacc,  30B order)  --> 1000000 
 XXL  (1PB:   60B custacc, 300B order)  --> 10000000 
 CUSTOM SCALE  --> [Integer]\n 
 ENTER SCALE:
 

在提示处,你可以输入自己想要的测试数据大小,如 1,10,100 … .10000000 等。数据会生成在 generatedXML 目录下。

3. 创建数据库

本文使用 DB2 v9.5 作为目标数据库(免费版本下载 http://www.ibm.com/db2/express )。


清单 7. 执行脚本
			
db2start 
 SQL1063N  DB2START processing was successful. 

 db2 create db XMLDB using codeset utf-8 territory CN pagesize 16 k 
 DB20000I  The CREATE DATABASE command completed successfully. 

 db2 connect to xmldb 

   Database Connection Information 

 Database server        = DB2/LINUX 9.5 
 SQL authorization ID   = ZHANGW 
 Local database alias   = XMLDB 

 db2 "create table custacc (cadoc xml)" 
 DB20000I  The SQL command completed successfully. 

 db2 "create table security (sdoc xml)" 
 DB20000I  The SQL command completed successfully. 

 db2 "create table order (odoc xml)" 
 DB20000I  The SQL command completed successfully.
 
 

4. 性能测试

接下来我们以 TPoX 自带的一些负载描述文件来进行 Insert 和 Query 性能测试,首先确保 CLASSPATH 中包含下列 CLASS 或 Jar 包。

  • db2jcc.jar
  • db2jcc_license_cisuz.jar (or any other db2_jcc_license*.jar file)
  • TPoX/WorkloadDriver/plugins/commons-cli-1.0.jar
  • TPoX/WorkloadDriver/classes
  • TPoX/DB2/classes

在 TPoX/WorkloadDriver 目录下进行性能测试。

insert 测试:

现在我们进行无 Schema 验证的 XML 插入性能测试,插入 250 个 security 文档,模拟 5 个并发用户,每个用户执行 50 次事务:


清单 8. insert 测试
			
java -classpath $CLASSPATH WorkloadDriver -d XMLDB -w 
 properties/insNoValidsecurity.properties -u 5 -tr 50 

 The WorkloadDriver program is running... 

 The following arguments are used (user id/password omitted): 
 -d XMLDB -u 5 -w properties/insNoValidsecurity.properties -tr 50 



 Longest connection time:                1 seconds 
 Workload execution starting date/time:  Mon Jun 29 11:14:04 CST 2009 
 Workload execution finishing date/time: Mon Jun 29 11:14:06 CST 2009 
 Workload execution elapsed time:        1 seconds 


 STATISTICS OVER THE COMPLETE RUN: 

 *** SYSTEM WORKLOAD STATISTICS *** 

 Tr. #     Name               Type     Count       %-age       Total Time (s)     
 Min Time (s)     Max Time (s)     Avg Time (s) 
 1         insNoValidsecurity I        250         100.00      9.81               
 0.01             1.05             0.04 

 *** SYSTEM THROUGHPUT *** 

 The throughput is 15000 transactions per minute (250.00 per second). 


 The output/output2009_06_29_1114 directory contains the files output.txt 
 and stats.txt (as well as stats_per_user.txt, if the verbosity level 
 is 1 or 2, and user1.txt, etc., if the verbosity level is 2). 
 Additionally, it contains comment.txt if -c option was used. 

 The last security document inserted was ../generatedXML/XXS/security/security250.xml
 
 

从中我们可以看到,WorkloadDriver 已经为我们做好了统计,让我们来看一下统计结果:

1. 执行的 transaction 是 insNoValidsecurity,类型为插入 I 操作;

2. 5 个并发用户,每个用户 50 次事务,所以总数 Count 是 250 次;

3. 因为只有一个 transaction,所以其所占比重是 100% ;

4. 250 次事务总共用时 9.81s,最小一次 transaction 0.01s,最大 1.04s,平均 0.04s ;

5. 吞吐率即 DB2 server 每分钟可以处理的事务数为 15000 次;

还可以对需要 schema 验证的插入进行性能测试,如下所示,插入 custacc 文档,需要 schema 验证,模拟 10 个并发用户,执行总时间为 60s。


清单 9. 带验证的 insert 测试
 
 java -classpath $CLASSPATH WorkloadDriver -d XMLDB -w 
 properties/insValidcustacc.properties -u 10 – ti 60
 
 

query 测试:

使用 5 个并发用户,每个用户 50 次事务。


清单 10. query 测试
			
java -classpath $CLASSPATH WorkloadDriver -d XMLDB -w 
 properties/queries.properties -u 5 -tr 50 

 The WorkloadDriver program is running... 

 The following arguments are used (user id/password omitted): 
 -d XMLDB -u 5 -w properties/queries.properties -tr 50 



 Longest connection time:                0 seconds 
 Workload execution starting date/time:  Mon Jun 29 12:43:20 CST 2009 
 Workload execution finishing date/time: Mon Jun 29 12:43:24 CST 2009 
 Workload execution elapsed time:        4 seconds 


 STATISTICS OVER THE COMPLETE RUN: 

 *** SYSTEM WORKLOAD STATISTICS *** 

 Tr. #     Name                      Type     Count       %-age       Total Time (s)     
 Min Time (s)     Max Time (s)     Avg Time (s) 
 1         get_order_sqlxml          Q        45          18.00       3.77               
 0.06             0.44             0.08 
 2         get_security_sqlxml       Q        28          11.20       1.93               
 0.05             0.26             0.07 
 3         customer_profile_sqlxml   Q        32          12.80       2.49               
 0.06             0.28             0.08 
 4         search_securities_sqlxml  Q        33          13.20       4.25               
 0.10             0.30             0.13 
 5         account_summary_sqlxml    Q        39          15.60       2.86               
 0.06             0.22             0.07 
 6         get_security_price_sqlxml Q        37          14.80       2.80               
 0.05             0.31             0.08 
 7         customer_max_order_sqlxml Q        36          14.40       2.40               
 0.06             0.11             0.07 

 *** SYSTEM THROUGHPUT *** 

 The throughput is 3750 transactions per minute (62.50 per second). 


 The output/output2009_06_29_1243 directory contains the files output.txt 
 and stats.txt (as well as stats_per_user.txt, if the verbosity level 
 is 1 or 2, and user1.txt, etc., if the verbosity level is 2). 
 Additionally, it contains comment.txt if -c option was used.
 

使用 4 个并发用户,运行时间 300 秒。


清单 11. query 测试
 
 java -classpath $CLASSPATH WorkloadDriver -d XMLDB – w 
 properties/queries.properties -u 4 -ti 300 

 The WorkloadDriver program is running... 

 The following arguments are used (user id/password omitted): 
 -d XMLDB -u 5 -w properties/queries.properties -ti 300 



 Longest connection time:                0 seconds 
 Workload execution starting date/time:  Mon Jun 29 12:45:05 CST 2009 
 Workload execution finishing date/time: Mon Jun 29 12:50:05 CST 2009 
 Workload execution elapsed time:        300 seconds 


 STATISTICS OVER THE COMPLETE RUN: 

 *** SYSTEM WORKLOAD STATISTICS *** 

 Tr. #     Name                      Type     Count       %-age       Total Time (s)     
 Min Time (s)     Max Time (s)     Avg Time (s) 
 1         get_order_sqlxml          Q        2667        14.20       219.91             
 0.06             1.24             0.08 
 2         get_security_sqlxml       Q        2631        14.01       177.64             
 0.05             1.53             0.07 
 3         customer_profile_sqlxml   Q        2727        14.52       195.19             
 0.06             1.57             0.07 
 4         search_securities_sqlxml  Q        2628        14.00       308.82             
 0.09             2.24             0.12 
 5         account_summary_sqlxml    Q        2720        14.49       204.83             
 0.06             1.89             0.08 
 6         get_security_price_sqlxml Q        2681        14.28       183.57             
 0.05             1.22             0.07 
 7         customer_max_order_sqlxml Q        2723        14.50       207.08             
 0.06             1.40             0.08 

 *** SYSTEM THROUGHPUT *** 

 The throughput is 3755 transactions per minute (62.59 per second). 


 The output/output2009_06_29_1245 directory contains the files output.txt 
 and stats.txt (as well as stats_per_user.txt, if the verbosity level 
 is 1 or 2, and user1.txt, etc., if the verbosity level is 2). 
 Additionally, it contains comment.txt if -c option was used. 

 Since the -ti or -fto option was used, some users may still be 
 finishing up their last transactions and closing their connections...
 
 

时间:2009-07-31 13:23 来源:developerWorks 中国 作者:张巍 原文链接

好文,顶一下
(1)
100%
文章真差,踩一下
(0)
0%
------分隔线----------------------------


把开源带在你的身边-精美linux小纪念品
无觅相关文章插件,快速提升流量