2.2 生成测试数据
如果下载的测试数据不能满足你的需求,你可以用 TPoX 自带的数据生成工具来生成满足你要求的数据。运行之前,请先下载下列库文件:
1. ToxGene(http://www.cs.toronto.edu/tox/toxgene/downloads.html),解压后放到 TPoX/toxgene 目录下;
2. PoBoy.jar(http://www.cs.toronto.edu/tox/toxgene/downloads.html),保存到 toxgene/lib 目录下;
3. Xerces 解析器(http://xml.apache.org/dist/xerces-j/Xerces-J-bin.2.5.0.zip)解压后保存到 toxgene/lib 目录下;
最后将所有 jar 包加入到 CLASSPATH 变量中,将 toxgene 加入到 PATH 中。
清单 5. export CLASSPATH 示例
export CLASSPATH=$CLASSPATH:...PoBoy.jar:...xercesImpl.jar:...xmlParserAPIs.jar export PATH=$PATH:$HOME/TPoX/toxgene/bin |
清单 6. 运行 generateXML.ksh
cd TPoX/datagen ./generateXML.ksh Usage: generateXML.ksh [-o OUTPUTPATH] output directory relative to the datagen directory (default: ../gene [-t DOCUMENT TYPE] Possible values: c, custacc, o, order, s, security, a, account, all. [-s SCALE] defines the scale factor (integer), skips input prompt. (default is [-p PARALLEL_GEN] max number of parallel toxgene sessions (default: 4) [-d PARALLEL_DEL] max parallelism to delete existing data (default: 30) [-e EXTRA_DOCS] generate additional XML docs for insert transactions on top of the populated database [0%, 15%, 33%, 50%, 66%, 100%] (default: 15% --> Scale factor must be an integer. Decimal points not allowed. --> the output path will be : /home/zhangw/TPoX/datagen/../generatedXML \n ****************************************** **** TPoX XML DATA GENERATION **** ******************************************\n PLEASE ENTER SCALE FACTOR FOR DATA GENERATION (DocType: * all *) =======================================================================\n XXXS (100MB: 6K custacc, 30K order) --> 1 XXS (1GB: 60K custacc, 300K order) --> 10 XS (10GB: 600K custacc, 3M order) --> 100 S (100GB: 6M custacc, 30M order) --> 1000 M (1TB: 60M custacc, 300M order) --> 10000 L (10TB: 600M custacc, 3B order) --> 100000 XL (100TB: 6B custacc, 30B order) --> 1000000 XXL (1PB: 60B custacc, 300B order) --> 10000000 CUSTOM SCALE --> [Integer]\n ENTER SCALE: |
在提示处,你可以输入自己想要的测试数据大小,如 1,10,100 … .10000000 等。数据会生成在 generatedXML 目录下。
3. 创建数据库
本文使用 DB2 v9.5 作为目标数据库(免费版本下载 http://www.ibm.com/db2/express )。
清单 7. 执行脚本
db2start SQL1063N DB2START processing was successful. db2 create db XMLDB using codeset utf-8 territory CN pagesize 16 k DB20000I The CREATE DATABASE command completed successfully. db2 connect to xmldb Database Connection Information Database server = DB2/LINUX 9.5 SQL authorization ID = ZHANGW Local database alias = XMLDB db2 "create table custacc (cadoc xml)" DB20000I The SQL command completed successfully. db2 "create table security (sdoc xml)" DB20000I The SQL command completed successfully. db2 "create table order (odoc xml)" DB20000I The SQL command completed successfully. |
4. 性能测试
接下来我们以 TPoX 自带的一些负载描述文件来进行 Insert 和 Query 性能测试,首先确保 CLASSPATH 中包含下列 CLASS 或 Jar 包。
- db2jcc.jar
- db2jcc_license_cisuz.jar (or any other db2_jcc_license*.jar file)
- TPoX/WorkloadDriver/plugins/commons-cli-1.0.jar
- TPoX/WorkloadDriver/classes
- TPoX/DB2/classes
在 TPoX/WorkloadDriver 目录下进行性能测试。
insert 测试:
现在我们进行无 Schema 验证的 XML 插入性能测试,插入 250 个 security 文档,模拟 5 个并发用户,每个用户执行 50 次事务:
清单 8. insert 测试
java -classpath $CLASSPATH WorkloadDriver -d XMLDB -w properties/insNoValidsecurity.properties -u 5 -tr 50 The WorkloadDriver program is running... The following arguments are used (user id/password omitted): -d XMLDB -u 5 -w properties/insNoValidsecurity.properties -tr 50 Longest connection time: 1 seconds Workload execution starting date/time: Mon Jun 29 11:14:04 CST 2009 Workload execution finishing date/time: Mon Jun 29 11:14:06 CST 2009 Workload execution elapsed time: 1 seconds STATISTICS OVER THE COMPLETE RUN: *** SYSTEM WORKLOAD STATISTICS *** Tr. # Name Type Count %-age Total Time (s) Min Time (s) Max Time (s) Avg Time (s) 1 insNoValidsecurity I 250 100.00 9.81 0.01 1.05 0.04 *** SYSTEM THROUGHPUT *** The throughput is 15000 transactions per minute (250.00 per second). The output/output2009_06_29_1114 directory contains the files output.txt and stats.txt (as well as stats_per_user.txt, if the verbosity level is 1 or 2, and user1.txt, etc., if the verbosity level is 2). Additionally, it contains comment.txt if -c option was used. The last security document inserted was ../generatedXML/XXS/security/security250.xml |
从中我们可以看到,WorkloadDriver 已经为我们做好了统计,让我们来看一下统计结果:
1. 执行的 transaction 是 insNoValidsecurity,类型为插入 I 操作;
2. 5 个并发用户,每个用户 50 次事务,所以总数 Count 是 250 次;
3. 因为只有一个 transaction,所以其所占比重是 100% ;
4. 250 次事务总共用时 9.81s,最小一次 transaction 0.01s,最大 1.04s,平均 0.04s ;
5. 吞吐率即 DB2 server 每分钟可以处理的事务数为 15000 次;
还可以对需要 schema 验证的插入进行性能测试,如下所示,插入 custacc 文档,需要 schema 验证,模拟 10 个并发用户,执行总时间为 60s。
清单 9. 带验证的 insert 测试
java -classpath $CLASSPATH WorkloadDriver -d XMLDB -w properties/insValidcustacc.properties -u 10 – ti 60 |
query 测试:
使用 5 个并发用户,每个用户 50 次事务。
清单 10. query 测试
java -classpath $CLASSPATH WorkloadDriver -d XMLDB -w properties/queries.properties -u 5 -tr 50 The WorkloadDriver program is running... The following arguments are used (user id/password omitted): -d XMLDB -u 5 -w properties/queries.properties -tr 50 Longest connection time: 0 seconds Workload execution starting date/time: Mon Jun 29 12:43:20 CST 2009 Workload execution finishing date/time: Mon Jun 29 12:43:24 CST 2009 Workload execution elapsed time: 4 seconds STATISTICS OVER THE COMPLETE RUN: *** SYSTEM WORKLOAD STATISTICS *** Tr. # Name Type Count %-age Total Time (s) Min Time (s) Max Time (s) Avg Time (s) 1 get_order_sqlxml Q 45 18.00 3.77 0.06 0.44 0.08 2 get_security_sqlxml Q 28 11.20 1.93 0.05 0.26 0.07 3 customer_profile_sqlxml Q 32 12.80 2.49 0.06 0.28 0.08 4 search_securities_sqlxml Q 33 13.20 4.25 0.10 0.30 0.13 5 account_summary_sqlxml Q 39 15.60 2.86 0.06 0.22 0.07 6 get_security_price_sqlxml Q 37 14.80 2.80 0.05 0.31 0.08 7 customer_max_order_sqlxml Q 36 14.40 2.40 0.06 0.11 0.07 *** SYSTEM THROUGHPUT *** The throughput is 3750 transactions per minute (62.50 per second). The output/output2009_06_29_1243 directory contains the files output.txt and stats.txt (as well as stats_per_user.txt, if the verbosity level is 1 or 2, and user1.txt, etc., if the verbosity level is 2). Additionally, it contains comment.txt if -c option was used. |
使用 4 个并发用户,运行时间 300 秒。
清单 11. query 测试
java -classpath $CLASSPATH WorkloadDriver -d XMLDB – w properties/queries.properties -u 4 -ti 300 The WorkloadDriver program is running... The following arguments are used (user id/password omitted): -d XMLDB -u 5 -w properties/queries.properties -ti 300 Longest connection time: 0 seconds Workload execution starting date/time: Mon Jun 29 12:45:05 CST 2009 Workload execution finishing date/time: Mon Jun 29 12:50:05 CST 2009 Workload execution elapsed time: 300 seconds STATISTICS OVER THE COMPLETE RUN: *** SYSTEM WORKLOAD STATISTICS *** Tr. # Name Type Count %-age Total Time (s) Min Time (s) Max Time (s) Avg Time (s) 1 get_order_sqlxml Q 2667 14.20 219.91 0.06 1.24 0.08 2 get_security_sqlxml Q 2631 14.01 177.64 0.05 1.53 0.07 3 customer_profile_sqlxml Q 2727 14.52 195.19 0.06 1.57 0.07 4 search_securities_sqlxml Q 2628 14.00 308.82 0.09 2.24 0.12 5 account_summary_sqlxml Q 2720 14.49 204.83 0.06 1.89 0.08 6 get_security_price_sqlxml Q 2681 14.28 183.57 0.05 1.22 0.07 7 customer_max_order_sqlxml Q 2723 14.50 207.08 0.06 1.40 0.08 *** SYSTEM THROUGHPUT *** The throughput is 3755 transactions per minute (62.59 per second). The output/output2009_06_29_1245 directory contains the files output.txt and stats.txt (as well as stats_per_user.txt, if the verbosity level is 1 or 2, and user1.txt, etc., if the verbosity level is 2). Additionally, it contains comment.txt if -c option was used. Since the -ti or -fto option was used, some users may still be finishing up their last transactions and closing their connections... |