-
ML uses historical data to make predictions.
-
It looks at patterns within that data to create a mathematical model of the data, and then it processes similar data to make predictions.
-
Diagram illustrates this sequence, ML — A Conceptual Representation
-
- This process works because the new data is not memorized.
- Instead, based on the initial dataset supplied for creating the model, the ML process can find the same or similar patterns in the new data, and as such, make predictions.
- The initial/histrorical dataset supplied must be good—in and good quality—curated data. So it will lead to a model producing good predictions.
- Otherwise low-quality data will lead to a model that will likely make poor or inaccurate predictions, leading to erroneous results.
- ML uses specific algorithms to find patterns and retrieve insights from data.
-
3 common categories of prediction
- image classification (is the image a dog or a cat?)
- tabular data (what is the creditworthiness of the loan applicant?)
- Natural language(NL) processing (is this restaurant review positive or negative?).
-
2 mainstream approaches (more, but basic) - 1. supervised and 2. unsupervised.
-
Other types of ML approaches (but not limited to) - 1. semi-supervised, 2. reinforcement, 3. meta-learning, 4. topic modeling, 5.deep learning based on the extensive use of artificial neural networks.
-
Supervised ML - your dataset will contain a column (label) whose values you’ll want to predict.
- For example, Analyst at Interpol and have a list of fugitives and a column that indicates the probability of finding the offender, use this column to produce a prediction.
- Example inputs and desired outputs (resultant columns or labels) are presented, allowing the computer to find patterns and rules that map inputs to the desired outcomes.
- Algorithm types
- Regression: Used to predict values, such as an employee's salary. An increase is often used for time-series problems.
- Classification: Used for predicting classes, such as classifying pictures into different class types: houses, cars, airplanes, toys, etc. 1. Classification is usually divided into binary classification (prediction with precisely two possible values) or multiclass classification (three or more possible values).
-
Unsupervised - the dataset would not have the column or label containing the probability of finding the fugitive for the algorithm to predict.
- In this case, the algorithm is left on its own to find structure within the input provided—thus, it is called unsupervised.
- No result columns or labels are given to the algorithm with unsupervised learning, so it must find patterns and structure within the data provided.
- Algorithm types
- Clustering: Used for predicting groups with similar patterns, such as grouping online banking users based on their app usage habits.
- Anomaly detection: Used to predict elements that don’t align with the pattern found in the rest of the data, such as financial fraud.
- Use the historical data, go through data preparation or data cleaning — this includes performing activities such as adding, updating, or removing missing values.
- Other data preparation activities include converting text values into numerical values, given that most algorithms work best with numerical values.
- After the data preparation step, create the model, including deciding which algorithm type to use.
- Once the model has been created, the next step is to evaluate it to check if it executes well and gives good results for new data it hasn’t processed.
- To avoid providing the same data used to create the model (because that data is well-known). Instead, new data is used to check whether the model can generalize well.
- It's an iterative process.
- The model may not perform well when you evaluate it with the new data.
- In that case, you might have to try with a different algorithm, recreate the model, then try the new model with the new data as often as required to achieve good results.
- Once you have a good-enough model, the next step is to deploy it in production where it will be used.
- It is possible once the model is in production that the data will change over time, invalidating the previously created and deployed model and degrading the system's performance and results.
- In that case, you’ll have to go over the initial data-gathering step and repeat the process.

- ML.NET is a .NET ML Nuget library package
- Model Builder, which uses AutoML (automated ML), provides an approachable and easy user interface to create, train, and deploy ML models.
- With Model Builder, it’s possible to employ different algorithms, metrics, and configuration options to create the best possible model for your data.
- It offers several built-in scenarios with different machine-learning use cases.
- Beyond producing a trained model, Model Builder can work with CSV files and SQL databases and generate the source code required to load your model, make predictions, and connect to cloud resources.
- Using the latest model builder - if VS is not showing Model Builder UI, download and install the latest version of Model Builder.
- Create the Console App project (POCML)
- Installing the Microsoft.ML NuGet Package
