2008
Datagenerator Fix
19/12/08 11:20 Filed in: Swingbench
I’ve updated datagenerator to fix a few bugs,
improve the perfromance and give a little more
feedback when running from the commandline. You can
find it in the usual place.
|
Broken clusteroverview in later builds
17/10/08 17:43 Filed in: Swingbench
Flash version of the screencast now available
05/10/08 15:49 Filed in: Swingbench
New builds of swingbench and datagenerator
01/10/08 16:38 Filed in: Swingbench
I’ve uploaded new builds of swingbench and
datagenerator to fix a few niggly bugs. You can
download them from here
New Swingbench Screencast
01/10/08 16:36 Filed in: Swingbench
Ive added a new swingbench screencast which is a
complete walkthrough from the installation of
swingbench to the running of a benchmark. I’ve
also provided a commentary. So if you ever wondered
what I sound like try it out here
Minor Change to the look and feel of the web site
01/10/08 16:05 Filed in: General
Large scale data sets for "SOE" and "SH"
02/09/08 22:16
I’ve added some new pages describing how to
build large scale “SOE” and
“SH” schemas. I’ve tested them both
to 500GB in size and will create larger scale
versions as soon as I can borrow hardware to test
them at multi terabyte levels. If you’re
interested in giving it ago let me know via the
comments page and I can try
and assist. You can find the the instructions on
how to do create 100GB+ schemas for
“SOE” here
and the how to create 100GB+ “SH”
schemas here
Update : I noticed that the scripts for SH use compression by default. This can slow down a load. and make the expected upload much smaller. I’ll upload a new build shortly that fixes this and afew other issues.
Update : I noticed that the scripts for SH use compression by default. This can slow down a load. and make the expected upload much smaller. I’ll upload a new build shortly that fixes this and afew other issues.
Datagenerator 0.4
02/09/08 20:09 Filed in: Swingbench
I've uploaded a new build of datagenerator. New
features include
The new build can be downloaded here
I’ve also updated the install, and added some additional walk throughs (in the swingbench section)
Lets go through some of the new features in a little more detail.
You can now include indexes and sequences inside of a datagenerator definition
This makes it easy to build an entire schema for a benchmark run removing the need to run additional scripts afterwards. Currently I don't support their reverse engineering but that will come.
Previously it was possible to specify multiple threads for a datageneration run but each table was allocated a single thread. In this version a user can soft partition a table and hence break the build into smaller units which can each have a thread allocated to them.
This means that if you have a 32 CPU server you'll be able to build a 10 billion row table much faster if you soft partition the table into 32 units and allocate 32 threads for the build. The partition key can be either a date or number. This is also useful to avoid resource contention when inserting data into a partitioned table.
Its now possible to run the entire data generation to file or database from the command line. These include
[dgiles@macbook bin]$ ./datagenerator -h
usage: parameters:
-async perform async commits
-bs batch size of inserts (defaults to 50)
-c specify config file
-cl use command line interface
-commit number of inserts between commits
-cs connectring for database insertion
-d output directory (defaults to "data")
-db write data direct to database
-ddl just generate the ddl to be used
-debug turn on debug information
-dt driver type (oci|thin)
-f write data to file
-g use graphical user interface
-h,--help print this message
-ni don't create any indexes after data creation
-nodrop don't drop tables if they exist
-p password for database insertion
-s run silent
-scale mulitiplier for default config
-tc number of generation threads (defaults to 2)
-u username for database insertion
-z compress the results file
The config files for the soe and sh schema are now by default configured for a 1GB build. These can be scaled up by using the -scale option. To build a 100GB sh schema the following command can be used.
./datagenerator -c sh.xml -cl -scale 100
This functionality is supplemented by a new flag on a table definition.
Only tables with this flag enabled will be scaled up.
It is now possible to use the row count of a table as the maximum value of a number generator. This is useful when scaling up/down a datageneration and maintaining data coverage and referential integrity.
As the number of rows in the referenced table increase so does the the maximum value of the data generator.
This build supports the use of asynchronous commits. This results in performance increases of about 10-30% when this option is enabled. I’ve also undergone several database
It is sometimes useful to only create the DDL that will used to create tables and indexes.
The files that are created can be edited and modified to include additional information such as storage definitions.
- Support for indexes and sequences
- New command line options
- Better multi threading support
- New scaleable data builds
- Number generators can reference row counts from other tables
- Better database performance
- Ability to generate only the DDL of a schema
- Numerous bug fixes
The new build can be downloaded here
I’ve also updated the install, and added some additional walk throughs (in the swingbench section)
Lets go through some of the new features in a little more detail.
Indexes and Sequences
You can now include indexes and sequences inside of a datagenerator definition
This makes it easy to build an entire schema for a benchmark run removing the need to run additional scripts afterwards. Currently I don't support their reverse engineering but that will come.
Better multithreading support
Previously it was possible to specify multiple threads for a datageneration run but each table was allocated a single thread. In this version a user can soft partition a table and hence break the build into smaller units which can each have a thread allocated to them.
This means that if you have a 32 CPU server you'll be able to build a 10 billion row table much faster if you soft partition the table into 32 units and allocate 32 threads for the build. The partition key can be either a date or number. This is also useful to avoid resource contention when inserting data into a partitioned table.
New command line options
Its now possible to run the entire data generation to file or database from the command line. These include
[dgiles@macbook bin]$ ./datagenerator -h
usage: parameters:
-async perform async commits
-bs batch size of inserts (defaults to 50)
-c specify config file
-cl use command line interface
-commit number of inserts between commits
-cs connectring for database insertion
-d output directory (defaults to "data")
-db write data direct to database
-ddl just generate the ddl to be used
-debug turn on debug information
-dt driver type (oci|thin)
-f write data to file
-g use graphical user interface
-h,--help print this message
-ni don't create any indexes after data creation
-nodrop don't drop tables if they exist
-p password for database insertion
-s run silent
-scale mulitiplier for default config
-tc number of generation threads (defaults to 2)
-u username for database insertion
-z compress the results file
Scaleable data builds
The config files for the soe and sh schema are now by default configured for a 1GB build. These can be scaled up by using the -scale option. To build a 100GB sh schema the following command can be used.
./datagenerator -c sh.xml -cl -scale 100
This functionality is supplemented by a new flag on a table definition.
Only tables with this flag enabled will be scaled up.
Referenceable row counts
It is now possible to use the row count of a table as the maximum value of a number generator. This is useful when scaling up/down a datageneration and maintaining data coverage and referential integrity.
As the number of rows in the referenced table increase so does the the maximum value of the data generator.
Better database performance
This build supports the use of asynchronous commits. This results in performance increases of about 10-30% when this option is enabled. I’ve also undergone several database
Generate only DDL
It is sometimes useful to only create the DDL that will used to create tables and indexes.
The files that are created can be edited and modified to include additional information such as storage definitions.
Timing groups of SQL operations
02/09/08 10:47 Filed in: Oracle
Some times I feel like I’ve missed out on a
whole chunk on functionality in Oracle products. One
little nugget is the “timing” function in
SQL*Plus. This allows you to time groups of
operations.
Obviously turning on is achieved with the “set timing on” operation. i.e
SQL > set timing on
SQL > select count(1) from all_objects;
COUNT(1)
----------
68653
Elapsed: 00:00:03.95
SQL>
Which is great but what if want to time mulitiple operations. Use the timing function and simply give the timer a name, in this case statement timer.
SQL> timing start statement_timer
SQL> select count(1) from all_objects;
COUNT(1)
----------
68653
SYS@orcl > /
COUNT(1)
----------
68653
SQL> timing show statement_timer;
timing for: statement_timer
Elapsed: 00:00:30.85
SQL>
Which times anything that went on in between the timer starting and finishing. In this case also my typing of the commands. Its a fantastic utility for timing stages in a batch job including call outs to os operations.
Obviously turning on is achieved with the “set timing on” operation. i.e
SQL > set timing on
SQL > select count(1) from all_objects;
COUNT(1)
----------
68653
Elapsed: 00:00:03.95
SQL>
Which is great but what if want to time mulitiple operations. Use the timing function and simply give the timer a name, in this case statement timer.
SQL> timing start statement_timer
SQL> select count(1) from all_objects;
COUNT(1)
----------
68653
SYS@orcl > /
COUNT(1)
----------
68653
SQL> timing show statement_timer;
timing for: statement_timer
Elapsed: 00:00:30.85
SQL>
Which times anything that went on in between the timer starting and finishing. In this case also my typing of the commands. Its a fantastic utility for timing stages in a batch job including call outs to os operations.
New build and fixes in swingbench
22/07/08 12:29 Filed in: Swingbench
I've uploaded a new build of swingbench which
includes a number of fixes (one that broke
charbench). Apologies for this. I thought I released
the fixes if the previous build. You can find it
here.
As usual let me know if you find obvious bugs.
As usual let me know if you find obvious bugs.
On the subject of ISCSI and udev
09/06/08 08:33 Filed in: Linux
Ok so I admit I haven't been updating the web site
recently. Im pretty much overrun with work (and I
took a weeks holiday) so updates to swingbench have
had to be put on hold. That said I haven't been idle.
I've been setting up a RAC cluster under Oracle
Virtual Server (NOTE : This is not currently
supported).
The environrment is build using 3 Dell Lattitude notebooks and a 1Gb Netgear switch. I've installed one notebook with OpenFiler as the storage server and two additional notebooks with Oracle Virtual Server. (NOTE : this really isn't advised as a production implementation
). I've used the Oracle Enterprise Template to
create the Linux Servers running Oracle11g. It
was pretty straight sailing with a few notable
exceptions
The environrment is build using 3 Dell Lattitude notebooks and a 1Gb Netgear switch. I've installed one notebook with OpenFiler as the storage server and two additional notebooks with Oracle Virtual Server. (NOTE : this really isn't advised as a production implementation
- Many of the shipping templates benefit (and some require) the activation of hardware acceleration. This can be done via the BIOS and will make a big difference to performance in most circumstances.
- UDev and ISCSI. This sadly caused me real problems and Im still not sure why. I was under the impression that the devices presented from the Openfiler (the ISCSI Target) would have consistent ID's. Now I've no reason to make this claim it just seemed like a sensible thing to do. I originally used the ENV{ID} to uniquely identify the devices and create persistent links to the disks. However subsequent reboots showed these to change. We eventually ended up using a rule similar to the folowing
- We also encountered some library issues (rpm), Some of the openfiler devices couldn't be seen however upgrading to the very latest initiator libraries solved the problem (At the time of writing this iscsi-initiator-utils-6.2.0.868-0.7.el5).
Minor update to datagenerator
29/03/08 11:01 Filed in: Swingbench
Swingbench with AWR support
25/03/08 21:25 Filed in: Swingbench
Im uploading a new build of swingbench that includes
support for performing AWR snaps at the start and end
of of a benchmark run.
This is a common request and automates a task that a lot of people do. I've also included a new section in the stats that are generated which lists the top 10 (11 to be exact if you include CPU) wait events for a run if you choose to collect database statistics.
I've had a lot of requests about 9i support in swingbench which I appear to have broken. Can you let me know if you still think this is important (I guess its implied by the requests I've had). I need to know whether to focus on this of finish the 2.3 doc.
I'd also appreciate if you could let me know if the wait events I report are in line with the stats you get from ARW... if they're not its down to me not AWR.
Thanks...
This is a common request and automates a task that a lot of people do. I've also included a new section in the stats that are generated which lists the top 10 (11 to be exact if you include CPU) wait events for a run if you choose to collect database statistics.
I've had a lot of requests about 9i support in swingbench which I appear to have broken. Can you let me know if you still think this is important (I guess its implied by the requests I've had). I need to know whether to focus on this of finish the 2.3 doc.
I'd also appreciate if you could let me know if the wait events I report are in line with the stats you get from ARW... if they're not its down to me not AWR.
Thanks...
New build of swingbench 2.3 (Build 370)
06/03/08 14:57
Just a quick build to fix cursors not being closed in
ccwizard. Still looking into the "-be" and "-bs"
issue.
You can get it here
You can get it here
New build of swingbench 2.3
12/02/08 12:22 Filed in: Swingbench
I know things have been a little slow of late in
terms of the releases of code but I've just had too
much on both at work and at home. The good news is
that I've found time to squeeze in a new build of
swingbench. This has a number of bug fixes in and the
following new functionality.
I've also included a walk through of how to set up clusteroverview in 2.3 with the new GroupID functionality, you can find it here.
Any problems report them through the usual channels...
- Clusteroverview now uses the groupID attribute to determine members of a load generation group. This provides a lot more flexibility for clusteroverview to be used to test technologies such as Oracle Dataguard
- The output from the command line option -v can now be directed to a named file with the -vo option
I've also included a walk through of how to set up clusteroverview in 2.3 with the new GroupID functionality, you can find it here.
Any problems report them through the usual channels...