
CQoS Database Usage in Adaptive Linear Solvers

Work note, September 24, 2007
Uses of the performance database for adaptive linear solver selection fall into two categories:

1. Look up matrix metadata that precisely matches the attributes of the linear system and, on a hit, retrieve the optimal solver and parameter settings associated with that metadata to apply to the new problem.

2. If no data in the database exactly matches the properties of the problem at hand (a miss), select a linear solver based on historical performance, which can still yield near-optimal performance. Intelligent methods such as statistical modeling and machine learning are particularly helpful in this scenario.

Making an adaptive decision often requires both types of PerfDB query: if a question cannot be answered exactly from historical data, we fall back to predicting a solution that is good enough.
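A minimal sketch of this two-stage decision flow is given below. The structure and function names (SolverChoice, lookupExactMatch, predictSolver, chooseSolver) are hypothetical placeholders, not part of the actual CQoS interfaces; a real implementation would call the query APIs described in the rest of this note.

    #include <map>
    #include <string>

    struct SolverChoice {
        std::string solver;                        // e.g. "gmres"
        std::map<std::string, double> parameters;  // e.g. restart = 30
    };

    // Category 1: exact metadata match against the history database. A real
    // implementation would call the determineMethod*/determineParameter* queries
    // listed below; this stub simply reports a miss.
    bool lookupExactMatch(const std::map<std::string, double>& properties, SolverChoice* choice) {
        (void)properties;
        (void)choice;
        return false;
    }

    // Category 2: statistical/machine-learning prediction from historical data,
    // stubbed here with a conservative default.
    SolverChoice predictSolver(const std::map<std::string, double>& properties) {
        (void)properties;
        SolverChoice guess;
        guess.solver = "gmres";
        guess.parameters["restart"] = 30.0;
        return guess;
    }

    // Two-stage decision: try an exact match first, fall back to prediction on a miss.
    SolverChoice chooseSolver(const std::map<std::string, double>& properties) {
        SolverChoice choice;
        if (lookupExactMatch(properties, &choice)) {
            return choice;
        }
        return predictSolver(properties);
    }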

Performance database query APIs are accordingly broken into two groups.

1. Retrieve the optimal solver and parameters; a usage sketch follows this list. Examples are:

  • int determineMethodMetric(map<string, double> property, string metric_name, string *function_names, bool inclusive, int *result_size, double ***results, bool minimum, int *optimal_index): matches matrix properties (names and values) and functions to the linear solver that performed best, in terms of the specified performance metric, in the history database.

  • int determineMethodEvent(map<string, double> property, string *event_names, int number_of_events, int *result_size, double ***results, bool minimum, int *optimal_index): matches matrix properties (names and values) and user events to the solver that performed best in the history database.

  • int determineParameterMetric(string solver, string param_name, string metric_name, string *function_names, bool inclusive, int *result_size, double ***results, bool minimum, int *optimal_index): given the selected linear solver, matches functions to the parameter setting that performed best, in terms of the specified performance evaluation metric (e.g., convergence rate, run time), in the history database.

  • int determineParameterEvent(string solver, string param_name, string *event_names, int number_of_events, int *result_size, double ***results, bool minimum, int *optimal_index): given the selected linear solver, matches user events to the parameter setting that performed best in the history database.
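The sketch below shows how the first of these queries might be called. The property names, the metric name "WALL_CLOCK_TIME", the function name "KSPSolve", and the assumption that a zero return code means success are all illustrative, not part of the documented interface.

    #include <map>
    #include <string>
    using std::map;
    using std::string;

    // Signature as listed above; the implementation is provided by the CQoS
    // performance database component, so only the declaration appears here.
    int determineMethodMetric(map<string, double> property, string metric_name,
                              string *function_names, bool inclusive,
                              int *result_size, double ***results,
                              bool minimum, int *optimal_index);

    void selectSolverForMatrix() {
        // Matrix metadata, e.g. as computed by AnaMod (property names are illustrative).
        map<string, double> property;
        property["nrows"] = 10000.0;
        property["symmetry"] = 1.0;
        property["diagonal-dominance"] = 0.0;

        // Restrict the query to the linear-solve phase and minimize wall-clock time
        // (the function and metric names here are assumptions).
        string functions[] = {"KSPSolve"};
        int result_size = 0;
        double **results = 0;
        int optimal_index = -1;

        int status = determineMethodMetric(property, "WALL_CLOCK_TIME", functions,
                                           /*inclusive=*/true, &result_size, &results,
                                           /*minimum=*/true, &optimal_index);
        if (status == 0 && result_size > 0) {
            // Assuming a zero return code means success, optimal_index identifies the
            // best-performing solver among the returned results.
        }
    }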

2. Apply machine learning and statistical modeling to predict optimal parameters/solvers from historical data.

Statistical modeling [1] bases algorithm decisions on the nature of the user data. This approach can help identify correlations between characteristics of the matrix and the choice and parameterization of the numerical methods. Generally speaking, using statistical techniques to make an adaptive decision proceeds in four steps (a classification sketch follows the list):

  • Feature extraction: estimate performance-critical properties of the matrix. For linear/nonlinear system solving, we use AnaMod to compute the matrix metadata.

  • Training stage: a large number of problems is solved with the available methods, which, for linear/nonlinear system solving, can be combinations of iterative method, preconditioner, and problem- or system-specific parameter settings. Performance data and matrix features are recorded in the database.

  • Each problem is assigned to the method that gave the fastest solution. As a result, we find for each method the set of problems that are best solved by it. Based on a statistical modeling technique, e.g., Bayesian analysis, we compute a probability density function for each method from the feature vectors of these problems.

  • Given a new problem and its feature vector, the method that maximizes the probability density function is selected to solve the problem.
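The following sketch illustrates the last two steps under a simplifying assumption of a Gaussian, independent-feature (naive-Bayes-style) density per method; the statistical model actually used may differ.

    #include <cmath>
    #include <limits>
    #include <string>
    #include <vector>

    // Per-method model fitted during the training stage: for each feature we store
    // the mean and variance observed over the problems that this method solved fastest.
    struct MethodModel {
        std::string name;
        std::vector<double> mean;
        std::vector<double> variance;
    };

    // Log probability density of a feature vector under one method's model.
    double logDensity(const MethodModel& m, const std::vector<double>& features) {
        const double pi = 3.141592653589793;
        double logp = 0.0;
        for (std::size_t i = 0; i < features.size(); ++i) {
            double var = (m.variance[i] > 1e-12) ? m.variance[i] : 1e-12;  // guard zero variance
            double diff = features[i] - m.mean[i];
            logp += -0.5 * (std::log(2.0 * pi * var) + diff * diff / var);
        }
        return logp;
    }

    // Select the method whose density is maximized by the new problem's feature vector.
    std::string selectMethod(const std::vector<MethodModel>& models,
                             const std::vector<double>& features) {
        std::string best;
        double bestLogp = -std::numeric_limits<double>::infinity();
        for (std::size_t i = 0; i < models.size(); ++i) {
            double lp = logDensity(models[i], features);
            if (lp > bestLogp) { bestLogp = lp; best = models[i].name; }
        }
        return best;
    }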

Machine learning [2] methods likewise use a training set and matrix features to make solver predictions. Given a performance criterion, machine learning algorithms such as AdaBoost classify the set of solvers into two groups: solvers that perform better than some reference method and solvers that perform worse than the reference. The first group forms the set of suitable solvers for a linear system. When multiple criteria (e.g., the norm of the residual vector, execution time, iteration count) are employed, the set of suitable candidate solvers is narrowed by intersecting the individual solver sets associated with each criterion.
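A small sketch of the final intersection step follows; the per-criterion classification itself (e.g., by AdaBoost) is assumed to have already produced one solver set per criterion.

    #include <algorithm>
    #include <cstddef>
    #include <iterator>
    #include <set>
    #include <string>
    #include <vector>

    // For each performance criterion, a binary classifier has already labeled every
    // solver as better or worse than the reference method. Intersecting the
    // per-criterion sets keeps only solvers acceptable under all criteria.
    std::set<std::string> suitableSolvers(const std::vector<std::set<std::string> >& perCriterion) {
        if (perCriterion.empty()) return std::set<std::string>();
        std::set<std::string> result = perCriterion.front();
        for (std::size_t i = 1; i < perCriterion.size(); ++i) {
            std::set<std::string> tmp;
            std::set_intersection(result.begin(), result.end(),
                                  perCriterion[i].begin(), perCriterion[i].end(),
                                  std::inserter(tmp, tmp.begin()));
            result.swap(tmp);
        }
        return result;
    }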

Applying the above intelligent methods to predict optimal solvers entails more complicated database query requirements. Investigation of specific database-use scenarios for the statistical analysis and AdaBoost algorithms is underway.

Ongoing work and experiments:


There are many important issues in implementing linear solver prediction and selection algorithms. Among them, ongoing work includes:

  • Matrix feature selection. There are 75 matrix properties available in AnaMod in total. Not all properties are relevant to a specific problem, and some features may be correlated. Applying principal component analysis (PCA) to the matrix properties, and using the principal components (i.e., the most performance-critical combinations of properties) to classify linear solvers, should significantly improve the efficiency and accuracy of solver selection (a minimal PCA sketch follows this list).

  • Impact of matrix property calculation and the adaptive algorithm on overall performance. The cost of running these algorithms may degrade performance or offset the benefit of adaptive solvers. We are currently using the Driven Cavity Flow computational fluid dynamics example from the PETSc toolkit as a testbed for measuring the overhead of the AnaMod computations.
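As a minimal illustration of the PCA step mentioned above, the sketch below extracts only the leading principal component of a centered feature matrix via power iteration. A real implementation would use a proper eigensolver (e.g., from LAPACK) and retain several components.

    #include <cmath>
    #include <cstddef>
    #include <vector>

    // Power iteration on the covariance of the centered feature matrix (rows = problems,
    // columns = matrix properties) to obtain the leading principal component.
    std::vector<double> leadingPrincipalComponent(std::vector<std::vector<double> > X,
                                                  int iterations) {
        if (X.empty()) return std::vector<double>();
        const std::size_t n = X.size(), d = X.front().size();

        // Center each feature (column).
        for (std::size_t j = 0; j < d; ++j) {
            double mean = 0.0;
            for (std::size_t i = 0; i < n; ++i) mean += X[i][j];
            mean /= n;
            for (std::size_t i = 0; i < n; ++i) X[i][j] -= mean;
        }

        // Covariance matrix C = X^T X / (n - 1).
        std::vector<std::vector<double> > C(d, std::vector<double>(d, 0.0));
        for (std::size_t j = 0; j < d; ++j)
            for (std::size_t k = 0; k < d; ++k) {
                for (std::size_t i = 0; i < n; ++i) C[j][k] += X[i][j] * X[i][k];
                C[j][k] /= (n > 1 ? n - 1 : 1);
            }

        // Power iteration: v <- C v / ||C v||.
        std::vector<double> v(d, 1.0 / std::sqrt(static_cast<double>(d)));
        for (int it = 0; it < iterations; ++it) {
            std::vector<double> w(d, 0.0);
            for (std::size_t j = 0; j < d; ++j)
                for (std::size_t k = 0; k < d; ++k) w[j] += C[j][k] * v[k];
            double norm = 0.0;
            for (std::size_t j = 0; j < d; ++j) norm += w[j] * w[j];
            norm = std::sqrt(norm);
            if (norm == 0.0) break;
            for (std::size_t j = 0; j < d; ++j) v[j] = w[j] / norm;
        }
        // Properties with the largest absolute weights in v are the most informative.
        return v;
    }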

References:

[1] Demmel, J., Dongarra, J., Eijkhout, V., Fuentes, E., Petitet, A., Vuduc, R., Whaley, R.C., Yelick, K., "Self-Adapting Linear Algebra Algorithms and Software," Proceedings of the IEEE, 2005.

[2] Bhowmick, S., Eijkhout, V., Freund, Y., Fuentes, E., Keyes, D., "Application of Machine Learning to the Selection of Sparse Linear Solvers," International Journal of High Performance Computing Applications, 2006.



