INCLINE VILLAGE, Nev. (PRWEB)
September 21, 2022
Frontline Systems is shipping RASON® V2023, a new version of its cloud platform for advanced analytics that enables both business analysts and developers to easily create and run models using mathematical optimization, Monte Carlo simulation and risk analysis, data mining and machine learning, and business rules and calculations.
RASON is not new – since 2015 it has been available “around the clock” as an Azure-based SaaS platform, one of the first to offer a REST API for both predictive and prescriptive analytics. RASON has been enriched each year with new analytics and model management features – including support for business rules using the DMN open standard in 2019. But now it’s the first and only tool with fully automated methods for risk analysis of previously trained and validated machine learning (ML) models.
Risk analysis changes the focus from how accurately an ML model will predict a single new case, to how it will perform in aggregate over thousands or millions of new cases, what the business consequences will be, and the (quantified) risk that this will differ from what was expected based on the ML model’s training and validation.
“With a patent application now on file to preserve invention rights, our RASON users are the first to benefit from these innovative methods,” said Daniel Fylstra, Frontline’s President and CEO. Frontline Systems is concurrently releasing new versions of Solver SDK®, its object library for developers, and Analytic Solver®, its tool for business analysts using Excel for the Web, Windows and Macintosh, with support for the same innovative methods.
How and Why Machine Learning has Lacked Risk Analysis
For a decade, data science and machine learning (DSML) tools – including RASON – have offered facilities for ‘training’ a model on one set of data, ‘validating’ its performance on another set of data, and ‘testing’ it versus other ML models on a third set of data. But this is not risk analysis: because it is based on known data, it doesn’t assess the risk that the ML model will perform differently on new data when put into production use. While it’s common to assess an ML model’s performance in use, and move to re-train the model if its performance is unexpectedly poor, by that time those risks have occurred, often with adverse financial consequences. Quantification of such risks “ahead of time” has been missing in practice.
There are many reasons for this state of affairs: Data scientists with expertise in ML methods often are not trained in risk analysis; they think of “features” and even predicted output values as data, not as “random variables” with sampled instances. Even when they are known, conventional risk analysis methods are expensive and time-consuming to apply to machine learning: ML data sets include many (sometimes hundreds of) features, with limited “provenance” of the data’s origins. There are hundreds of classical probability distributions that could be ‘candidates’ to fit each feature. Only some of the features are typically found, after ML model training, to have predictive value; many are found to be correlated with other features and hence ‘redundant’. And in typical projects, a great many ML models are built.
How RASON Performs Automated Risk Analysis
Unlike most cloud DSML platforms, RASON also includes powerful algorithms for risk analysis: probability distribution fitting, correlation fitting, stratified sample generation and Monte Carlo analysis. But asking users to “quickly master risk analysis” is asking too much. So Frontline Systems has invented ways to automate the entire risk analysis process. Using the new risk analysis capability is as simple as adding text such as “simulation”: { } to the RASON definition of a machine learning model – and the risk analysis typically adds just seconds to a minute to the existing process of training, validating and testing an ML model.
Internally, for each feature in a dataset, RASON tests an entire family of probability distributions – drawing on its first-mover support for the new Metalog family of distributions, created by Dr. Tom Keelin; optimizes all the parameters of each distribution; detects and models correlations among features, using rank order and copula methods; performs synthetic data generation, using Monte Carlo methods for stratified sampling and correlation; computes the ML model’s predictions, as well as user-specified financial consequences, for each simulated case; and importantly, assesses and quantifies the differences in performance of the ML model on this simulated data versus the training, validation and test data.
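The general shape of such a pipeline can be illustrated with a minimal Python sketch. This is not RASON's implementation: it swaps in empirical quantiles where RASON fits Metalog and other distributions, uses a plain Gaussian copula driven by rank-order correlation, and omits stratified sampling. The data, the `predict` function and its coefficients, and all function names are hypothetical, chosen only for illustration.

```python
import math
import numpy as np

def normal_cdf(z):
    """Standard normal CDF, applied element-wise."""
    return 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))

def rank_correlation(data):
    """Spearman rank-order correlation matrix of the columns of `data`."""
    ranks = np.argsort(np.argsort(data, axis=0), axis=0).astype(float)
    return np.corrcoef(ranks, rowvar=False)

def synthesize(data, n_cases, rng):
    """Draw synthetic rows from a Gaussian copula with empirical marginals."""
    # Convert Spearman correlation to the Pearson correlation of the copula.
    rho = 2.0 * np.sin(np.pi * rank_correlation(data) / 6.0)
    np.fill_diagonal(rho, 1.0)
    # Assumes rho is positive definite, which holds for this small example.
    chol = np.linalg.cholesky(rho)
    z = rng.standard_normal((n_cases, data.shape[1])) @ chol.T
    u = normal_cdf(z)
    # Map correlated uniforms through each feature's empirical quantiles.
    columns = [np.quantile(data[:, j], u[:, j]) for j in range(data.shape[1])]
    return np.column_stack(columns)

def predict(features):
    """Stand-in for a previously trained ML model (hypothetical coefficients)."""
    return 1.0 + 3.0 * features[:, 0] - 2.0 * features[:, 1]

# Hypothetical training data: two correlated continuous features.
rng = np.random.default_rng(0)
x0 = rng.normal(50.0, 10.0, 500)
x1 = 0.6 * x0 + rng.normal(0.0, 8.0, 500)
train = np.column_stack([x0, x1])

# Simulate many new cases and compare predictions with the training data.
synthetic = synthesize(train, 10_000, rng)
train_pred = predict(train)
sim_pred = predict(synthetic)
report = {
    "train_mean": float(train_pred.mean()),
    "sim_mean": float(sim_pred.mean()),
    "train_p05": float(np.percentile(train_pred, 5)),
    "sim_p05": float(np.percentile(sim_pred, 5)),
}
print(report)
```

Comparing summary statistics and percentiles of the predictions on simulated cases against those on the training data is the simplest form of the final step described above; a user-specified financial consequence could be computed per simulated case in the same way.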
Results of the risk analysis, including key summary statistics, percentiles and risk measures, are available in JSON and OData form. Users can easily request and obtain the data needed to create charts on their own Web pages or in tools like Power BI or Tableau, or perform further analysis of their own.
Synthetic Data Generation as a Side Benefit
Synthetic Data Generation (SDG) has become topical in machine learning in recent years, with a number of companies founded just to supply software and services around this technology. SDG is used when there isn’t enough original data, or when use of the original data is restricted by law or regulation. But until now, according to a patent and literature search, SDG has been used only to better train ML models.
RASON V2023 includes a powerful, general-purpose, easy-to-use Synthetic Data Generation facility, invoked by simply writing “algorithm”: “syntheticDataGenerator” within a data “transformer” step. Unlike some special-purpose SDG offerings, this facility can accurately model the behavior of nearly any combination of features with continuous values. But RASON also uses synthetic data in an entirely new way, to analyze the risk that an ML model will yield unexpected results “large enough to matter” when deployed for production use.
Works with Already-Available ‘Augmented Machine Learning’
RASON’s V2022 release introduced “augmented machine learning” capabilities otherwise found only in sophisticated machine learning tools. The user simply supplies data, and in a RASON “estimator” clause, adds “algorithm”: “findBestModel” and provides a list of “Learners” of different types – classification and regression trees, neural networks, linear and logistic regression, discriminant analysis, naïve Bayes, k-nearest neighbors and more. When the model is run, RASON automatically tests and fits parameters for all of the Learners (ML algorithms) to the training data, validates and compares them according to user-chosen criteria, and delivers the trained ML model that best fits the data. Again by adding a command as simple as “simulation”: { }, the user can perform a risk analysis on the “best model” found by RASON.
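The idea of fitting several learners and keeping the one that validates best can be sketched in a few lines of Python. This is only an illustration of the concept, not RASON's “findBestModel” algorithm: it assumes two simple learners, validation RMSE as the sole comparison criterion, and made-up data. Every function name here is hypothetical.

```python
import numpy as np

def fit_linear(x, y):
    """Ordinary least-squares regression with an intercept."""
    design = np.column_stack([np.ones(len(x)), x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return lambda q: np.column_stack([np.ones(len(q)), q]) @ coef

def fit_knn(x, y, k=5):
    """k-nearest-neighbors regression using Euclidean distance."""
    def predict(q):
        dist = np.linalg.norm(q[:, None, :] - x[None, :, :], axis=2)
        nearest = np.argsort(dist, axis=1)[:, :k]
        return y[nearest].mean(axis=1)
    return predict

def find_best_model(learners, x_train, y_train, x_valid, y_valid):
    """Fit every learner on training data; pick the lowest validation RMSE."""
    models, scores = {}, {}
    for name, fit in learners.items():
        models[name] = fit(x_train, y_train)
        err = models[name](x_valid) - y_valid
        scores[name] = float(np.sqrt(np.mean(err ** 2)))
    best = min(scores, key=scores.get)
    return best, models[best], scores

# Hypothetical data with a linear relationship plus noise.
rng = np.random.default_rng(1)
x = rng.standard_normal((300, 2))
y = 2.0 * x[:, 0] - x[:, 1] + rng.normal(0.0, 0.1, 300)

learners = {"linear regression": fit_linear, "k-nearest neighbors": fit_knn}
best_name, best_model, scores = find_best_model(
    learners, x[:200], y[:200], x[200:], y[200:]
)
print(best_name, scores)
```

The returned `best_model` is an ordinary prediction function, so it could be passed to a simulation step to examine how the selected model behaves on new cases.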
Free Trials, Learning and Coaching Resources
Analysts and developers can sign up for free trial accounts to evaluate RASON at https://rason.com, exercise the REST API, try out dozens of example models using optimization and simulation, forecasting and machine learning, business rules and calculations, and download the RASON User Guide and Reference Guide in PDF form. For more information please contact sales@solver.com.
Frontline Systems Inc. (http://www.solver.com) is the alternative to analytics complexity, helping business analysts and managers gain insights and make better decisions for an uncertain future, without the cost, delays and risk of ‘big vendor’ tools. Its products integrate forecasting and data mining for “predictive analytics,” Monte Carlo simulation for risk analysis, conventional and stochastic optimization for “prescriptive analytics,” and business rules and Excel calculations to make the best business decisions. Founded in 1987, Frontline is based in Incline Village, Nevada (775-831-0300).
Microsoft Excel, Office 365, Azure and Power BI are trademarks of Microsoft Corp. Tableau is a trademark of Salesforce Inc. Analytic Solver®, RASON® and Solver SDK® are registered trademarks of Frontline Systems Inc.