Thursday, March 3, 2011

Using baseline (bedding) data to make SOA performance tests more realistic

Authors: Chen Xin, Chen Fang
Using an actual performance test as a sample, this article compares and analyzes the test results obtained with and without baseline data in the database. We found that testing with baseline data not only helps testers identify problems in the application more effectively, but also brings the test results closer to what a production environment would show.
This article assumes we are performance testing a large-scale SOA system that must let all online users enter files online at the same time. The test scenarios include creating member information, entering income and expenditure information, and submitting applications for relief cases.
In this system, the performance test has to simulate many people online at once. A performance testing tool can simulate any number of simultaneous users. The test must also confirm that the system performs well and remains stable: response times should stay steady no matter when a request arrives, and the database should not slow down as its data grows. To address these concerns, we need an effective way to test the performance of the system. The scenario above requires an environment that stays stable over a long period, a database preloaded with a large amount of data, and a load-balanced application server environment.
To sum up, we load baseline data into the database and set up a server cluster on the server side.
Preparing baseline data
What is baseline data, why do we need it, and how do we produce it? This section answers these questions in detail.
Baseline data (also called bedding data or initial working data) is the large volume of data that we load into the database tables, other than the dictionary tables, according to the business logic before running a performance test. It can be regarded as junk data because it has no real effect on the system's business logic, but it has a great impact on the system's performance.
We need to generate baseline data that reflects the real situation, and the generated data must preserve the real row-count ratios between tables. For example, if one table holds five times as many rows as a related table (5:1), then however much data we prepare, the row counts of the two tables must stay at 5:1.
Although baseline data is junk data, it must still respect the dependencies in the database, such as one-to-one and one-to-many relationships.
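As a minimal sketch of the ratio rule, the following Python generates rows for two hypothetical tables, a parent table `applicant` and a child table `member`, keeping a fixed 5:1 child-to-parent ratio while every child row references an existing parent key. The table names, the columns, and the `generate_rows` helper are assumptions made for illustration; they are not taken from the system described in this article.

```python
def generate_rows(parent_count, children_per_parent=5):
    """Return (parents, children) with a fixed children-per-parent ratio.

    Every child row carries the key of an existing parent row, so the
    one-to-many dependency between the two tables is preserved.
    """
    # Hypothetical parent rows: (applicant_id, name)
    parents = [(pid, f"applicant-{pid}") for pid in range(1, parent_count + 1)]

    # Hypothetical child rows: (member_id, applicant_id, name)
    children = []
    child_id = 1
    for pid, _name in parents:
        for _ in range(children_per_parent):
            children.append((child_id, pid, f"member-{child_id}"))
            child_id += 1
    return parents, children


parents, children = generate_rows(parent_count=1000)
print(len(children) / len(parents))  # ratio of child rows to parent rows
```

Whatever value is passed as `parent_count`, the ratio stays the same, which is the property the baseline data has to keep as it grows.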
Why should we load baseline data into the DB2 tables before a performance test? This data is not real production data, so why spend time preparing so much of it? The answer is that a system performs very differently with baseline data than without it. Why does this happen?
First, even if a table has an index, when the table holds little data the database may choose a full table scan instead of an index scan. So without baseline data, deadlocks may occur in the database.
When the data volume is low, the database optimizer sometimes skips the index and uses a full table scan, which can lock the entire table and lead to deadlock. Once the table holds enough data, the database uses an index scan and no longer locks the whole table.
For this reason, in some cases a system must go live with junk data already in its tables so that the database does not fall back to full table scans. Changing the locking strategy can sometimes solve this problem too, but if that carries any risk, it is avoided in a live system.
Second, with very little data we cannot know which execution path a SQL statement will actually take when a query runs. The database automatically calculates what it considers the optimal path for each SQL statement; in DB2 this is the access path. The DB2 access path changes as the data volume changes. In a relatively large system, the data volume grows over time, so we prepare a certain amount of data in advance to keep the access path relatively stable.
Finally, baseline data makes the system's performance more realistic and closer to that of the production environment. With baseline data in the database, the system runs in a relatively stable environment from the moment it goes live.
Without baseline data, the system may face instability, such as sudden changes in performance, database exceptions, or a sudden degradation in response time. Preparing baseline data therefore matters not only for performance testing but also for the production environment that is about to go live. In a banking system, for example, if no baseline data has been prepared and a problem occurs when the system goes live, the bank stands to lose many customers.
Baseline data is prepared according to the following principles:
1. Once the amount of data in the database is several times larger than memory, further increases make almost no difference to the results.
2. When preparing data, maintain the constraints of the original tables.
3. The amount of data in each table should match the real situation.

Generating baseline data efficiently
The sections above explained why baseline data matters. Bear in mind that each table may need on the order of a hundred million rows. So how do we prepare realistic baseline data quickly?
Inserting baseline data with a simple JDBC program performs poorly. A program that inserts rows into the database through JDBC is very slow: about 20 minutes for 100,000 rows in one table.
Suppose we need 100 million rows in one table. That takes (100,000,000 / 100,000) × 20 / 60 ≈ 333 hours. If the business logic requires 20 such tables, we need 333 × 20 = 6,660 hours = 277.5 days. This is far too slow, so preparing baseline data through JDBC is not feasible.
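The estimate above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic; the function name `jdbc_load_hours` and its parameter names are hypothetical, and the only measured input is the article's figure of about 20 minutes per 100,000 rows.

```python
def jdbc_load_hours(rows_per_table, tables=1, batch_rows=100_000, minutes_per_batch=20):
    """Estimate load time in hours at a fixed insert rate.

    batch_rows and minutes_per_batch encode the observed rate:
    about 20 minutes for every 100,000 rows in one table.
    """
    batches = rows_per_table / batch_rows
    return batches * minutes_per_batch / 60 * tables


one_table = jdbc_load_hours(100_000_000)
all_tables = jdbc_load_hours(100_000_000, tables=20)
print(f"{one_table:.0f} hours for one table")
print(f"{all_tables:.0f} hours, or {all_tables / 24:.1f} days, for 20 tables")
```

The text rounds to 333 hours before multiplying by 20, which gives 6,660 hours and 277.5 days. Without the intermediate rounding the result is slightly higher, about 6,667 hours or 277.8 days; the conclusion is the same either way.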
Clearly, we need a more efficient way to produce baseline data. Our team chose the following method: work out the relationships between the database tables, multiply the data according to those relationships, use the CPU's computing power to generate the data efficiently, and then import it into the database to produce the required baseline data.
To prepare baseline data, we must first find the relationships between the tables. In other words, we need to identify the main tables and their detail tables in the database, and whether each relationship is one-to-many or many-to-many. We also need to know, from real usage, roughly how many detail rows correspond to one main-table row. An approximate rule is enough; otherwise take a middle value. Rational Data Architect can discover the relationships between tables from the generated table structure. The first step is to produce an original set of data (about 1,000 rows per table).
To produce the original data set, we first record a script with Rational Performance Tester 7.0 (RPT). The script covers the main use cases to be tested. For example, the script we recorded contains 10 requests in total, all recorded into RPT in the order they are entered.
Next, we re-create the database so that it contains no data. Then we run the recorded Rational Performance Tester script for 1,000 iterations, which leaves about 1,000 rows in each of the affected tables. Finally, we check which tables in the database have grown and export the data in those tables to text files.
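Once the seed rows are exported, multiplying them is a job for the CPU rather than the database. The sketch below shows one way this could be done, assuming the export is a comma-delimited text file and that keys are integers; the column layout, the `multiply_seed` helper, and the `key_stride` parameter are hypothetical and not taken from the original project. Each copy of the seed shifts the primary-key and foreign-key columns by the same offset, so keys stay unique and detail rows still point at the matching main-table rows.

```python
import csv
import io


def multiply_seed(seed_rows, copies, key_columns, key_stride):
    """Yield `copies` shifted copies of the seed rows.

    Each copy adds a multiple of `key_stride` to every column listed in
    `key_columns` (primary and foreign keys). `key_stride` must be larger
    than the largest key in the seed so that copies never collide.
    """
    for copy_index in range(copies):
        offset = copy_index * key_stride
        for row in seed_rows:
            new_row = list(row)
            for col in key_columns:
                new_row[col] = str(int(row[col]) + offset)
            yield new_row


# Hypothetical seed export with columns: member_id, applicant_id, name
seed_text = "1,1,Ann\n2,1,Bob\n3,2,Cy\n"
seed_rows = list(csv.reader(io.StringIO(seed_text)))

# An in-memory buffer stands in for the output file in this sketch.
out = io.StringIO()
writer = csv.writer(out, lineterminator="\n")
writer.writerows(
    multiply_seed(seed_rows, copies=3, key_columns=[0, 1], key_stride=1000)
)
print(out.getvalue())
```

In practice the output would go to a file that is then loaded with the database's bulk import facility, and the same `key_stride` would be used for every related table so that the relationships found earlier remain intact.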
Performance testing environment
The test environment in this article uses an IBM WebSphere Process Server cluster as the application server, IBM DB2 as the database, Rational Performance Tester 7.0 as the performance testing tool, and IBM HTTP Server as the HTTP server (Figure 1).
A cluster is a group of independent servers that appears on the network as a single system and is managed as a single system. This single system provides highly reliable service to client workstations. In most modes, all computers in the cluster share a common name, and a service running on any system in the cluster can be used by all network clients. The cluster must coordinate and manage the errors and failures of its separate components, and it must be able to add components to the cluster transparently.
A cluster contains at least two servers with shared data storage. When any server runs an application, the application data is stored in the shared data space, while each server's operating system and application files are stored in its own local storage. The servers in the cluster nodes communicate with each other over an internal LAN. When a node server fails, the applications running on it are automatically taken over by another node server. When an application service fails, it is restarted or taken over by another server. When one or more of these failures occur, clients can quickly connect to the new application service.
Using a cluster solves the problem of insufficient JVM resources, shares I/O and load, and effectively reduces the system's failure rate. If a single server is 99% stable, the probability that it is down is 0.01. With two such servers in a cluster, and assuming they fail independently, the probability that both are down at once falls to 0.0001, so stability rises to 0.9999. In this way we can reduce the risk to the system to whatever level the actual production environment requires.
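The same calculation extends to any number of cluster members. The sketch below assumes that members fail independently and that the system is up as long as at least one member is up; the function name `cluster_availability` is hypothetical.

```python
def cluster_availability(server_availability, servers):
    """Probability that at least one server is up.

    Assumes failures are independent, so the chance that every server is
    down at once is the single-server downtime raised to the server count.
    """
    downtime = 1.0 - server_availability
    return 1.0 - downtime ** servers


for n in (1, 2, 3):
    print(n, round(cluster_availability(0.99, n), 6))
```

The independence assumption matters. Members of a vertical cluster share one physical machine, so for them the formula covers only failures of the individual server processes, not failures of the machine itself.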
The test environment uses a WebSphere Process Server (WPS) cluster. A WPS cluster is a group of WPS instances; it manages all of its WPS members and distributes the load among them. WPS 6.0.1 and later support building clusters that provide workload balancing and high availability, making WPS more stable and its performance better. Moreover, WPS clusters are commonly used in real production environments. The test environment topology is shown in Figure 2.
There are two types of cluster: horizontal and vertical. The members of a horizontal cluster run on different physical machines; the members of a vertical cluster run on the same physical machine.
Because of constraints on our test environment, we use a vertical cluster with three members. The environment includes a Deployment Manager profile, which can manage all the WPS instances in the cell. The Deployment Manager manages each node by exchanging information with its Node Agent, and the Node Agent manages the three cluster members. We also place an HTTP Server in front of the WPS cluster, so that requests reaching the application over HTTP are distributed at random among the cluster members.
Ideally, the application would be deployed in a performance test environment that fully simulates the real one, so that the test yields the performance results of a real production environment. However, real deployment environments are relatively large, so this is not practical. Therefore, to get more realistic results and to expose the application's existing performance problems more effectively, we make the test environment simulate the real environment as closely as we can within what we control, including preparing historical data. In our test environment, WPS and the database are installed on different physical machines so that we can better analyze the performance problems of the database and of the application separately, such as I/O problems.
A request first goes over HTTP to the HTTP Server and then into the UI layer. The UI calls a Web service, and the Web service, through the HTTP Server, invokes the BPEL processes and WSDL files, which then connect to the database via JDBC. A number of files (such as PDF and Word documents) are stored in the database through Content Manager.
We use IBM HTTP Server as the HTTP server, build the UI layer on WebSphere Application Server, deploy the BPEL and WSDL to WebSphere Process Server, and use DB2 as the database. At this point the test environment is essentially complete.
Analysis of test results
We ran the same performance test with and without baseline data and found very different results, mainly in the average page response time.
Without baseline data, the average page response time was 58.552 ms; with baseline data, it was 608.344 ms, roughly ten times longer. The test report generated by Rational Performance Tester shows the average response time of each page clearly.
Comparing the average response times page by page, we can see that the test with baseline data has longer average response times than the test without it. Looking for the root cause, we traced it to two pages that create views in the database and run a large number of SELECT statements, which results in longer response times.
The comparison therefore shows that running performance tests with baseline data helps testers identify problems in the application under test more effectively.
In addition, an application running in a real production system usually holds a lot of historical data. Adding baseline data to the test therefore not only makes the test results more realistic, but also exposes hidden problems in the application early, avoiding the considerable trouble of discovering them only after the system is officially in operation.
Using an actual performance test of a WebSphere Process Server-based application as a sample, we compared and analyzed the test results with and without baseline data. We found that using baseline data in performance tests not only helps testers identify problems in the application more effectively, but also brings the test results closer to the real results of the production environment.
Figure 1 WPS Cluster topology map
Figure 2 Performance test environment topology
