TY - GEN
T1 - Quantifying regional error in surrogates by modeling its relationship with sample density
AU - Mehmani, Ali
AU - Chowdhury, Souma
AU - Zhang, Jie
AU - Tong, Weiyang
AU - Messac, Achille
PY - 2013
Y1 - 2013
N2 - Approximation models (or surrogate models) provide an efficient substitute to expensive physical simulations and an efficient solution to the lack of physical models of system behavior. However, it is challenging to quantify the accuracy and reliability of such ap- proximation models in a region of interest or the overall domain without additional system evaluations. Standard error measures, such as the mean squared error, the cross-validation error, and the Akaikes information criterion, provide limited (often inadequate) information regarding the accuracy of the final surrogate. This paper introduces a novel and model independent concept to quantify the level of errors in the function value estimated by the final surrogate in any given region of the design domain. This method is called the Re- gional Error Estimation of Surrogate (REES). Assuming the full set of available sample points to be fixed, intermediate surrogates are iteratively constructed over a sample set comprising all samples outside the region of interest and heuristic subsets of samples inside the region of interest (i.e., intermediate training points). The intermediate surrogate is tested over the remaining sample points inside the region of interest (i.e., intermediate test points). The fraction of sample points inside region of interest, which are used as interme- diate training points, is fixed at each iteration, with the total number of iterations being pre-specified. The estimated median and maximum relative errors within the region of in- terest for the heuristic subsets at each iteration are used to fit a distribution of the median and maximum error, respectively. The estimated statistical mode of the median and the maximum error, and the absolute maximum error are then represented as functions of the density of intermediate training points, using regression models. The regression models are then used to predict the expected median and maximum regional errors when all the sample points are used as training points. Standard test functions and a wind farm power generation problem are used to illustrate the effectiveness and the utility of such a regional error quantification method.
AB - Approximation models (or surrogate models) provide an efficient substitute to expensive physical simulations and an efficient solution to the lack of physical models of system behavior. However, it is challenging to quantify the accuracy and reliability of such ap- proximation models in a region of interest or the overall domain without additional system evaluations. Standard error measures, such as the mean squared error, the cross-validation error, and the Akaikes information criterion, provide limited (often inadequate) information regarding the accuracy of the final surrogate. This paper introduces a novel and model independent concept to quantify the level of errors in the function value estimated by the final surrogate in any given region of the design domain. This method is called the Re- gional Error Estimation of Surrogate (REES). Assuming the full set of available sample points to be fixed, intermediate surrogates are iteratively constructed over a sample set comprising all samples outside the region of interest and heuristic subsets of samples inside the region of interest (i.e., intermediate training points). The intermediate surrogate is tested over the remaining sample points inside the region of interest (i.e., intermediate test points). The fraction of sample points inside region of interest, which are used as interme- diate training points, is fixed at each iteration, with the total number of iterations being pre-specified. The estimated median and maximum relative errors within the region of in- terest for the heuristic subsets at each iteration are used to fit a distribution of the median and maximum error, respectively. The estimated statistical mode of the median and the maximum error, and the absolute maximum error are then represented as functions of the density of intermediate training points, using regression models. The regression models are then used to predict the expected median and maximum regional errors when all the sample points are used as training points. Standard test functions and a wind farm power generation problem are used to illustrate the effectiveness and the utility of such a regional error quantification method.
KW - Error quantification
KW - Model selection
KW - Regional accuracy
KW - Sampling
KW - Surrogate model
UR - https://www.scopus.com/pages/publications/84881338827
UR - https://www.scopus.com/pages/publications/84880832147
U2 - 10.2514/6.2013-1751
DO - 10.2514/6.2013-1751
M3 - Conference contribution
SN - 9781624102233
T3 - Collection of Technical Papers - AIAA/ASME/ASCE/AHS/ASC Structures, Structural Dynamics and Materials Conference
BT - 54th AIAA/ASME/ASCE/AHS/ASC Structures, Structural Dynamics and Materials Conference
T2 - 54th AIAA/ASME/ASCE/AHS/ASC Structures, Structural Dynamics and Materials Conference
Y2 - 8 April 2013 through 11 April 2013
ER -