ML.NET gives you the ability to add machine learning to .NET applications, in either online or offline scenarios. With this capability, you can make automatic predictions using the data available to your application. Machine learning applications make use of patterns in the data to make predictions rather than needing to be explicitly programmed.

ML.NET runs on Windows, Linux, and macOS using .NET Core, or on Windows using .NET Framework. 64-bit is supported on all platforms. 32-bit is supported on Windows, except for TensorFlow, LightGBM, and ONNX-related functionality.

In this article we will focus specifically on AutoML Model Builder. If writing code isn’t your thing, you can use the ML.NET Model Builder to train and generate a model using a wizard-based user interface built into Visual Studio. This way you can have an ML experiment up and running in no time.

Prerequisites 🚛

A basic understanding of classification using ML.
Download the dataset from: https://www.kaggle.com/datasets/muhammadshahidazeem/customer-churn-dataset or https://github.com/nishanc/CustomerChurnMLDemo/tree/main/CustomerChurnMLDemo/Data
Download Visual Studio 2022

During installation, select the .NET desktop development workload along with the optional ML.NET Model Builder component. Using the link above should preselect all the prerequisites correctly, as shown in the following image:

Once you've enabled ML.NET Model Builder in Visual Studio, download and install the latest version.

Download the latest version of Model Builder

Open Visual Studio and create a new .NET console app:

Select Create a new project from the Visual Studio 2022 start window.
Select the C# Console App project template.
Change the project name to CustomerChurnMLDemo.
Make sure Place solution and project in the same directory is unchecked.
Select the Next button.
Select .NET 7.0 (Standard Term Support) as the Framework.
Select the Create button. Visual Studio creates your project and loads the Program.cs file.

QuickStart 🚀

Right-click on the CustomerChurnMLDemo project in Solution Explorer and select Add > Machine Learning Model.

By the way, this entire project can be downloaded from here:

In the Add New Item dialog, make sure Machine Learning Model (ML.NET) is selected. Change the Name field to CustomerChurn.mbconfig and select the Add button.

This opens a graphical wizard that you configure to build your model, so let's go through it to see what we can do. First, create a folder named "Data" in the project and copy the data files into it.

If we look at the data, the column "Churn" is the value that we need to predict: 1 for Yes and 0 for No.

You can also see that this data has a CustomerId field as well. We will remove that when we retrain our model later. For now let's continue with this column. (If you want you can remove that column from the file before importing as well, but we need to learn how to do it using code.)

So this ML problem is a data classification task. Under the "Tabular" section, select "Local" under "Data classification".

Now select the training environment. For this use case only CPU is available; the Azure and GPU options are not offered.

In the next step, browse to the data file we downloaded and select the column we want to predict, which is "Churn".

Go to "Advanced data options..." to ignore the CustomerId column, because it is an identifier and not a feature. While you are there, check that all the string features are marked as categorical.
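To make those two settings concrete, here is a rough, language-neutral sketch in Python of what "ignore a column" and "treat a column as categorical" amount to. It is not what Model Builder generates. The Gender and Tenure columns are hypothetical stand-ins for the real dataset columns, and one-hot encoding is just one common way to handle a categorical feature.

```python
def one_hot_rows(rows, ignored, categorical):
    """Drop ignored columns and one-hot encode categorical ones."""
    # Collect the sorted set of values seen in each categorical column.
    vocab = {c: sorted({r[c] for r in rows}) for c in categorical}
    encoded = []
    for r in rows:
        features = {}
        for column, value in r.items():
            if column in ignored:
                continue  # identifiers carry no predictive signal
            if column in categorical:
                # One 0/1 indicator per distinct value of the column.
                for option in vocab[column]:
                    features[f"{column}={option}"] = int(value == option)
            else:
                features[column] = value
        encoded.append(features)
    return encoded

# Hypothetical rows: only CustomerId is a column name taken from the article.
rows = [
    {"CustomerId": 1, "Gender": "Female", "Tenure": 12},
    {"CustomerId": 2, "Gender": "Male", "Tenure": 3},
]
encoded = one_hot_rows(rows, ignored={"CustomerId"}, categorical={"Gender"})
print(encoded[0])
```

The identifier disappears from the feature set, and the string column becomes a set of numeric indicators that a trainer can work with.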

Check "Data formatting" and "Validation data" as well, in case you need to change anything, such as whether the data has a header row. If everything looks good, click Save and go to the next step, which is Train.

With the release of ML.NET 2.0, Microsoft introduced advanced training options which enable you to:

Choose the evaluation metric you want to optimize
Choose trainers

Click on the Advanced training options... to access these.

Configure these options if you want; I will keep them as they are. It's great to see these new options coming to ML.NET. Read more about ML.NET 2.0

Now click Start training. Notice that I have set the training time to 600 seconds, because our data is around 20 MB and this is the recommended duration for that size according to How long should I train for?.

So what is it doing now? The answer is...

AutoML 🤖

In general, the workflow to train machine learning models is as follows:

Define a problem
Collect data
Preprocess data
Train a model
Evaluate the model

Preprocessing, training, and evaluation are an experimental and iterative process that requires multiple trials until you achieve satisfactory results. Because these tasks tend to be repetitive, AutoML can help automate these steps. In addition to automation, optimization techniques are used during the training and evaluation process to find and select algorithms and hyperparameters.

With AutoML in ML.NET, users can provide their dataset and specify certain parameters, and the system will automatically explore and try various algorithms, preprocessing techniques, and hyperparameter configurations to find the best-performing model. This automation significantly reduces the manual effort required to fine-tune and optimize models, making it easier for individuals without extensive machine learning expertise to create effective models.

AutoML in ML.NET aims to simplify the machine learning process and make it more accessible, enabling a broader range of developers to leverage the power of machine learning in their applications.
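The core idea of that search can be sketched in a few lines. The following is a toy, language-neutral illustration in Python, not ML.NET's implementation: the two "trainers", the data, and the function names are all made up for the example. Real AutoML also tunes hyperparameters and preprocessing and works within a time budget.

```python
from statistics import mean

def fit_majority(rows):
    """Baseline trainer: always predict the most common training label."""
    ones = sum(label for _, label in rows)
    majority = int(ones * 2 >= len(rows))
    return lambda x: majority

def fit_midpoint(rows):
    """Trainer: split at the midpoint between the two class means."""
    mean0 = mean(x for x, label in rows if label == 0)
    mean1 = mean(x for x, label in rows if label == 1)
    cut = (mean0 + mean1) / 2
    # Predict 1 on whichever side of the cut the class-1 mean lies.
    return lambda x: int((x > cut) == (mean1 > mean0))

def accuracy(model, rows):
    return sum(model(x) == label for x, label in rows) / len(rows)

def auto_select(trainers, train_rows, valid_rows):
    """Train every candidate and keep the one with the best validation score."""
    results = []
    for name, fit in trainers.items():
        model = fit(train_rows)
        results.append((accuracy(model, valid_rows), name))
    return max(results)

# Toy data: (single feature value, label).
train_rows = [(1, 0), (2, 0), (3, 0), (7, 1), (8, 1), (9, 1), (10, 1)]
valid_rows = [(0, 0), (4, 0), (6, 1), (11, 1)]
trainers = {"majority": fit_majority, "midpoint": fit_midpoint}
best_score, best_name = auto_select(trainers, train_rows, valid_rows)
print(best_name, best_score)  # midpoint 1.0
```

Each candidate is trained on the training rows and scored on rows it has not seen, which is the same train-then-evaluate loop that AutoML repeats for many more candidates.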

Oh look, training is finished. 💡

We have achieved an accuracy of 0.9991, and AutoML has decided that "Fast Tree OVA (One-versus-all)" is the best model. Let's go to the next step, which is Evaluate. To try the model out, I'm using the first row of the testing data set (customer_churn_dataset-testing-master.csv).

So we can assume that this customer will terminate their subscription. (For label 1, the confidence is 86%; for label 0, it is less than 14%.)

Consuming Models 🦾

It's also very simple.

Click on Add to solution under Web API, and give a name in the next dialog.

When the project is created you may set the CustomerChurnWebApi as the startup project.

Now open the Program.cs file. We need to add a line to our Web API so that it starts at the Swagger endpoint. [commit]

Now when you run the application you will get this browser window.

Let's try to send the same testing record to this endpoint and see the result.
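If you would rather call the endpoint from a script than from the Swagger page, a request could be put together roughly like this. This is a sketch under assumptions: the /predict route, the localhost:5000 address, and the record's field names are placeholders I have not taken from the project, so check the Swagger page for the actual route and input schema.

```python
import json
import urllib.request

def build_prediction_request(base_url, record):
    """Build (but do not send) a JSON POST request for one customer record."""
    body = json.dumps(record).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/predict",  # assumed route; confirm it on the Swagger page
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Hypothetical record: replace the field names with the real model input columns.
record = {"Gender": "Female", "Tenure": 12}
request = build_prediction_request("http://localhost:5000", record)
print(request.get_method(), request.full_url)

# To actually send it while the Web API is running:
# with urllib.request.urlopen(request) as response:
#     print(json.loads(response.read()))
```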

We can see that we are getting the same prediction we got before. Now the only thing that remains is to create a frontend application with your favourite JavaScript framework to consume this API. 💫

Beyond Model Builder 🔧

Feature Importance

Feature importance in machine learning refers to the degree to which individual input features (also known as variables, attributes, or columns) contribute to the prediction or output of a machine learning model. It helps in understanding which features have a stronger influence on the model's predictions and which ones have a weaker influence. Feature importance can provide insights into the relationships between input features and the target variable and can guide decisions about feature selection, engineering, and model refinement.

There are different techniques and metrics for assessing feature importance. In AutoML we can get feature importance using Permutation Feature Importance (PFI). This method randomly shuffles the values of a single feature while keeping the other features unchanged. The resulting decrease in model performance (e.g., lower accuracy or higher loss) indicates the importance of that feature, because it shows how sensitive the model's predictions are to changes in that feature.
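The shuffling idea is easy to demonstrate on a toy example. This Python sketch is only an illustration of the technique, not the code Model Builder generates: the model, the data, and the function names are invented, and accuracy is used as the performance metric.

```python
import random

def model(row):
    """Toy classifier: predicts churn (1) from feature 0 only."""
    return int(row[0] > 0.5)

def accuracy(rows, labels):
    return sum(model(r) == y for r, y in zip(rows, labels)) / len(rows)

def permutation_importance(rows, labels, column, repeats=5, seed=0):
    """Mean drop in accuracy after shuffling the values of one column."""
    rng = random.Random(seed)
    baseline = accuracy(rows, labels)
    drops = []
    for _ in range(repeats):
        shuffled = [r[column] for r in rows]
        rng.shuffle(shuffled)
        # Rebuild each row with the shuffled value in place of the original.
        permuted = [r[:column] + (v,) + r[column + 1:]
                    for r, v in zip(rows, shuffled)]
        drops.append(baseline - accuracy(permuted, labels))
    return sum(drops) / repeats

# 20 rows: feature 0 decides the label, feature 1 is irrelevant noise.
rows = [(i / 19, (i * 7 % 20) / 19) for i in range(20)]
labels = [int(r[0] > 0.5) for r in rows]
importances = [permutation_importance(rows, labels, c) for c in range(2)]
print(importances)
```

Shuffling the feature the model relies on hurts accuracy, while shuffling the feature it ignores changes nothing, so the second importance is exactly zero.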

If you navigate to CustomerChurn.evaluate.cs you will see a method called CalculatePFI(). Let's call it from Program.cs and see the result [commit]. You can also refer to this and this for more information.

Now if you run the console application (not the Web API), you will get the following output. Note that calculating PFI can be a time-consuming operation. The time it takes is proportional to the number of feature columns you have: the more features, the longer PFI will take to run.

Confusion Matrix

A confusion matrix is a tool used in machine learning to assess the performance of a classification model. It is particularly useful when working with supervised learning algorithms that predict categorical outcomes (classes or labels). The confusion matrix provides a comprehensive summary of how well a model's predictions match the actual class labels in the dataset.
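As a small, language-neutral illustration, a confusion matrix can be built by counting (actual, predicted) pairs. The labels and predictions below are made up, and this sketch puts actual classes on the rows and predicted classes on the columns.

```python
def confusion_matrix(actual, predicted, labels):
    """Count predictions: rows are actual classes, columns are predicted."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix

# Made-up churn labels (1 = churned, 0 = stayed) and model predictions.
actual    = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 0, 1, 0, 0, 1, 0, 1]
matrix = confusion_matrix(actual, predicted, labels=[0, 1])
print(matrix)  # [[3, 1], [1, 3]]
```

The diagonal holds the correct predictions. Here one customer who stayed was flagged as churning, and one who churned was missed.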

To create a confusion matrix using ML.NET, you need to run the AutoML experiment yourself, without using the Model Builder wizard. If you were to write the code to do this entire thing, it would look something like this.

First of all, you need to install the Microsoft.ML.AutoML package.

I think the code is pretty much self-explanatory. It first initializes an MLContext, which is the central object for all ML.NET operations. It then loads the training and evaluation datasets from CSV files using the LoadFromTextFile method, specifying the class CustomerChurn.ModelInput to define the data schema.

Then, the AutoML experiment settings are defined, including the optimizing metric (MacroAccuracy), the maximum experiment time, and caching options. The Auto() method is called to create a multiclass classification experiment, and then the experiment is executed with Execute(), passing the training data, evaluation data, the label column name ("Churn"), and a custom progress reporter to track the running progress.

The best model from the AutoML experiment is obtained, and predictions are made on the evaluation data using this model. The model's performance is evaluated using metrics such as the confusion matrix, macro accuracy, micro accuracy, and log loss, which are printed to the console. Additionally, a custom progress reporter class (MulticlassProgressReporter) is defined to report the progress and metrics of each run during the AutoML process. This code is structured to showcase how to use AutoML to automate the selection and evaluation of models for multiclass classification tasks while providing insights into their performance.
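To see how these metrics relate to each other, here is a short Python sketch that follows their usual definitions: micro accuracy is the overall fraction of correct predictions, macro accuracy is the average of the per-class accuracies, and log loss is the average negative log of the probability assigned to the true class. The matrix and the probabilities are invented, and the function names are mine rather than ML.NET's.

```python
import math

def micro_accuracy(matrix):
    """Fraction of all rows predicted correctly (rows = actual classes)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total

def macro_accuracy(matrix):
    """Average of the per-class accuracies, so each class counts equally."""
    per_class = [row[i] / sum(row) for i, row in enumerate(matrix)]
    return sum(per_class) / len(per_class)

def log_loss(true_label_probabilities):
    """Mean negative log of the probability given to each row's true class."""
    total = sum(math.log(p) for p in true_label_probabilities)
    return -total / len(true_label_probabilities)

# Imbalanced toy matrix: 100 rows of class 0, only 10 rows of class 1.
matrix = [[90, 10], [5, 5]]
micro = micro_accuracy(matrix)  # 95 / 110
macro = macro_accuracy(matrix)  # (0.9 + 0.5) / 2
loss = log_loss([0.5, 0.5])     # ln(2) for two coin-flip predictions
print(round(micro, 4), round(macro, 4), round(loss, 4))
```

On imbalanced data, micro accuracy can look good while the smaller class is predicted poorly. Macro accuracy exposes that, which is one reason to consider it as the optimizing metric for a churn problem.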

You can read more about ML.NET in the official Microsoft documentation. I'm gonna wrap things up here. Leave a comment if you have any problems running this demo. Thank you for reading this far. I will see you in the next one.

Nishan's Dev Blog

ML.NET AutoML Model Builder (Step-by-Step Walkthrough)