We had a discussion this morning with @JPelamatti and a PERSALYS user who uses the tool to analyze a CSV dataset of chemical properties of water, including water conductivity (denoted Lambda), calcium carbonate (CaCO3) concentration and other properties (approximately 4 to 6 variables in total). The dataset has approximately 2200 observations.
The outlying observations were analyzed separately, by hand: the lines of the CSV file that were not consistent with the known properties of the physical variables were removed. A graphical tool would be very convenient for this task. For example, the user wants to use the cobweb plot to select the observations whose conductivity exceeds a given threshold, then remove these observations from the dataset. This is currently not possible.
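To make the request concrete, here is a minimal sketch of the manual filtering step in plain Python, outside PERSALYS. The column names `Lambda` and `CaCO3`, the threshold and the sample values are made up for illustration, and `drop_rows_above` is a hypothetical helper, not an existing PERSALYS or OpenTURNS function.

```python
import csv
import io


def drop_rows_above(csv_text, column, threshold):
    """Return the CSV text without the rows where `column` exceeds `threshold`."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Keep only the observations at or below the threshold.
    kept = [row for row in reader if float(row[column]) <= threshold]
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(kept)
    return out.getvalue()


# Illustrative data only: the second row plays the role of an outlier.
sample = "Lambda,CaCO3\n350.0,120.0\n9999.0,130.0\n410.0,150.0\n"
cleaned = drop_rows_above(sample, "Lambda", 1000.0)
print(cleaned)
```

The graphical feature would do the same thing, except that the threshold would be chosen interactively on the cobweb plot instead of being hard-coded.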
The user is interested in predicting the conductivity from the calcium carbonate concentration. A linear regression model would be very useful here, because it would give the user the regression coefficients, which could then be reused in other software as a very basic metamodel. More advanced metamodels are available in the tool (kriging and polynomial chaos), but linear regression is not available yet.
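As an illustration of why the coefficients alone are enough, here is a minimal sketch of a one-variable least squares fit in plain Python. This is not how PERSALYS or OpenTURNS implements it; `fit_line` and `predict` are hypothetical helpers and the data are synthetic, built to lie exactly on a line.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(coefficients, x_new):
    """Evaluate the fitted line; this is all another software needs to do."""
    intercept, slope = coefficients
    return intercept + slope * x_new


# Synthetic, illustrative values: conductivity = 50 + 2 * CaCO3 concentration.
caco3 = [100.0, 150.0, 200.0, 250.0]
conductivity = [250.0, 350.0, 450.0, 550.0]

coefficients = fit_line(caco3, conductivity)
print(coefficients)
print(predict(coefficients, 180.0))
```

Once the two coefficients are exported, the metamodel is a single line of arithmetic in whatever tool the user works with.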
I was wondering whether these features could be handy for other users. If so, please share your experience so that we can take this kind of need into account.
We were also thinking about adding a simpler kind of metamodel, easily accessible through the metamodel wizard. I am planning to plug the LinearModelAnalysis class from OpenTURNS into PERSALYS. It comes with LinearModelResult, which provides a number of control plots and variables. Do you have any feedback on which input parameters and control outputs are most relevant for this “new” type of analysis?