Add the aggregated Sobol indices

dumas · October 8, 2020, 7:26am

Hello,
We noticed that the aggregated Sobol indices are not provided in the interface. It would be useful to have them. The question is how to implement this:

- add the aggregated output to the list of all outputs;
- add another tab, visible only when several outputs are defined, dedicated to the aggregated indices.

What do you think?

braydi · October 8, 2020, 8:32am

Hello,
The aggregated Sobol indices are very useful when we have a multi-output model. I think the second implementation option is better.

mbaudin47 · October 9, 2020, 3:25pm

Hi Antoine,
If the function G has a single output, the aggregated Sobol’ indices are equal to the classical Sobol’ indices. Is this true? I suppose this is the main motivation for not showing the aggregated indices when there is only one output: they would duplicate what the GUI already presents.
For the validation of a multivariate function G, is there an “aggregated Q2” predictivity coefficient defined in the bibliography? I ask because the current GUI provides a separate Q2 for each marginal output, not an aggregated one. An aggregated coefficient would greatly simplify the comparison between different metamodels.
Regards,
Michaël

dumas · October 12, 2020, 7:55am

For your first question: yes, the aggregated Sobol indices are equal to the classical ones if there is only one output. The aggregated indices are also interesting when the output is a field: we might want to know the influence of each input globally as well as along the mesh axis. This could be a functionality to add to Persalys.
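To make the single-output case concrete, here is a minimal sketch in plain Python. It assumes the usual definition of the aggregated index as the sum of the partial variances over all outputs divided by the sum of the output variances, which amounts to a variance-weighted average of the per-output indices. The function name and the numbers are hypothetical, chosen only for illustration. As far as I know, OpenTURNS exposes the same quantity through `getAggregatedFirstOrderIndices()` on its Sobol' algorithms.

```python
def aggregated_indices(per_output_indices, output_variances):
    """Variance-weighted aggregation of per-output Sobol' indices.

    per_output_indices[k][i] is the index of input i for output k.
    output_variances[k] is the variance of output k.
    """
    total_variance = sum(output_variances)
    n_inputs = len(per_output_indices[0])
    return [
        sum(v * s[i] for v, s in zip(output_variances, per_output_indices))
        / total_variance
        for i in range(n_inputs)
    ]


# Hypothetical first-order indices for 2 outputs and 3 inputs.
first_order = [[0.5, 0.25, 0.25], [0.0, 0.5, 0.5]]
variances = [1.0, 3.0]

# Several outputs: the output with the larger variance weighs more.
print(aggregated_indices(first_order, variances))

# Single output: the aggregated indices reduce to the classical ones.
print(aggregated_indices([first_order[0]], [variances[0]]))
```

With a single output the weight cancels out, so showing the aggregated indices in that case would only repeat the classical ones.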

For the aggregated Q2, I have not read anything like this in the bibliography, but I assume we could sum the residuals over all outputs. However, I still prefer having one Q2 per output, in order to choose the best metamodel for each output.

mbaudin47 · October 12, 2020, 9:07pm

If the multi-output function is a field, adding the residuals might make sense. If the outputs have different orders of magnitude (e.g. a Young modulus and a strain), the sum would be dominated by the largest output and might not be meaningful. Would an averaged Q2 (i.e. the sum of the Q2 for all outputs divided by the number of outputs) do the trick in a statistically consistent manner?

dumas · October 13, 2020, 7:18am

I agree that for outputs with different orders of magnitude it might not perform well, especially if the output variances differ. I tried to find some papers on it and finally ended up in the scikit-learn documentation, where they return either all the Q2 values, their mean, or their variance-weighted mean.
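The three options can be sketched in plain Python as follows. This is only an illustration, not the scikit-learn or Persalys implementation: the function names and the data are hypothetical, and Q2 is computed here as 1 - SSE/SST on a validation sample. In scikit-learn, the corresponding choices are the `multioutput` values `'raw_values'`, `'uniform_average'` and `'variance_weighted'` of `r2_score`.

```python
def q2_per_output(observed, predicted):
    """Return a (Q2, SST) pair per output; rows are samples, columns are outputs."""
    results = []
    for k in range(len(observed[0])):
        y = [row[k] for row in observed]
        y_hat = [row[k] for row in predicted]
        mean = sum(y) / len(y)
        sse = sum((a - b) ** 2 for a, b in zip(y, y_hat))  # residual sum of squares
        sst = sum((a - mean) ** 2 for a in y)  # total sum of squares
        results.append((1.0 - sse / sst, sst))
    return results


def aggregate_q2(observed, predicted):
    """Return (list of Q2, uniform mean, variance-weighted mean)."""
    per_output = q2_per_output(observed, predicted)
    q2 = [q for q, _ in per_output]
    uniform = sum(q2) / len(q2)
    weighted = sum(q * s for q, s in per_output) / sum(s for _, s in per_output)
    return q2, uniform, weighted


# Hypothetical validation data: two outputs with very different scales.
observed = [[1.0, 100.0], [2.0, 200.0], [3.0, 300.0], [4.0, 400.0]]
predicted = [[1.5, 110.0], [2.0, 190.0], [3.0, 310.0], [4.0, 390.0]]

q2, uniform, weighted = aggregate_q2(observed, predicted)
print("Q2 per output:", q2)
print("uniform mean:", uniform)
print("variance-weighted mean:", weighted)
```

Note that the variance-weighted mean is algebraically the same as pooling the residuals, 1 - (sum of SSE over outputs) / (sum of SST over outputs), so it is dominated by the output with the largest variance. The uniform mean treats every output equally regardless of its scale.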
