Load a List of Objects as Dataset in ML.NET

In ML.NET, you can load an in-memory list of objects as a dataset through the IDataView interface, the tabular data representation that ML.NET trainers and transforms consume. The process has three steps:

  1. Define the class for your data objects: Create a class that represents the structure of your data. Each property of the class corresponds to a feature in your dataset.
  2. Create a list of data objects: Instantiate a list of objects with your data. Each object in the list represents one data point.
  3. Convert the list to an IDataView: Call LoadFromEnumerable on the Data property of an MLContext instance to create an IDataView from the list of objects.

Here’s a step-by-step implementation:

Step 1: Define the class for your data objects

For example, a class DataObject with two feature properties, Feature1 and Feature2, and a Label property looks like this:

public class DataObject
{
    public float Feature1 { get; set; }
    public float Feature2 { get; set; }
    public float Label { get; set; }
}

Step 2: Create a list of data objects

Create a list of DataObject instances containing your data points:

var dataList = new List<DataObject>
{
    new DataObject { Feature1 = 1.2f, Feature2 = 5.4f, Label = 0.8f },
    new DataObject { Feature1 = 2.1f, Feature2 = 3.7f, Label = 0.5f },
    // Add more data points here
};

Step 3: Convert the list to a DataView

Create an MLContext and pass the list to mlContext.Data.LoadFromEnumerable, which returns an IDataView:

using System;
using System.Collections.Generic;
using Microsoft.ML;

// ...

var mlContext = new MLContext();

// Convert the list to a DataView
var dataView = mlContext.Data.LoadFromEnumerable(dataList);

Now you have dataView, an IDataView that you can use to train and evaluate your machine learning model in ML.NET. It can be consumed directly by ML.NET’s trainers or pre-processed first using data transformations.
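As one example of such pre-processing, the following sketch combines the two feature properties into a single vector column with the Concatenate transform. The helper name AddFeaturesColumn and the output column name "Features" are choices made for this illustration; they are not part of the steps above.

```csharp
using System.Collections.Generic;
using Microsoft.ML;
using Microsoft.ML.Data;

public class DataObject
{
    public float Feature1 { get; set; }
    public float Feature2 { get; set; }
    public float Label { get; set; }
}

public static class FeaturePreparationSketch
{
    // Hypothetical helper: returns a new IDataView that contains an extra
    // "Features" vector column built from Feature1 and Feature2.
    public static IDataView AddFeaturesColumn(MLContext mlContext, IDataView dataView)
    {
        // Concatenate takes the output column name followed by the input column names.
        var pipeline = mlContext.Transforms.Concatenate(
            "Features",
            nameof(DataObject.Feature1),
            nameof(DataObject.Feature2));

        // Fit learns the input schema; Transform applies the concatenation.
        return pipeline.Fit(dataView).Transform(dataView);
    }

    // Hypothetical helper: reports how many values each "Features" vector holds.
    public static int FeatureVectorSize(IDataView transformed)
    {
        var vectorType = (VectorDataViewType)transformed.Schema["Features"].Type;
        return vectorType.Size;
    }
}
```

Calling AddFeaturesColumn(mlContext, dataView) leaves the original columns in place and appends the new one, so the Label column is still available to a trainer afterwards.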

Remember to replace DataObject with your actual class and modify the properties accordingly based on your dataset.
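One way to confirm that a list was loaded as intended is to inspect the schema of the resulting IDataView and read the rows back into objects. The sketch below does both; the class name DataViewInspectionSketch and its two helper methods are hypothetical names used only for this illustration.

```csharp
using System.Collections.Generic;
using System.Linq;
using Microsoft.ML;

public class DataObject
{
    public float Feature1 { get; set; }
    public float Feature2 { get; set; }
    public float Label { get; set; }
}

public static class DataViewInspectionSketch
{
    // Returns the column names that ML.NET derived from the class properties.
    public static string[] ColumnNames(IDataView dataView) =>
        dataView.Schema.Select(column => column.Name).ToArray();

    // Materializes the rows of the IDataView back into DataObject instances.
    // reuseRowObject: false allocates a new object per row, which is what
    // you want when the rows are kept in a list.
    public static List<DataObject> ReadBack(MLContext mlContext, IDataView dataView) =>
        mlContext.Data
            .CreateEnumerable<DataObject>(dataView, reuseRowObject: false)
            .ToList();
}
```

For the two-row list from Step 2, ColumnNames should report Feature1, Feature2, and Label, and ReadBack should return two objects with the original values.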

Load a Text File Dataset in ML.NET

Introduction

Machine learning has revolutionized the way we process and analyze data, making it easier to derive valuable insights and predictions. ML.NET, developed by Microsoft, is a powerful and user-friendly framework that allows developers to integrate machine learning into their .NET applications. One of the fundamental tasks in machine learning is loading datasets for model training or analysis. In this blog post, we’ll explore how to load a text file dataset using ML.NET and prepare it for further processing.

The Dataset

Let’s start with a simple dataset stored in a text file named data.txt. The dataset contains two columns: “City” and “Temperature”. Each row corresponds to a city’s name and its respective temperature. Here’s how the data.txt file looks:

City,Temperature
Rasht,24
Tehran,28
Tabriz,8
Ardabil,4

The Data Transfer Object (DTO)

In ML.NET, we need to create a Data Transfer Object (DTO) that represents the structure of the data we want to load. The DTO is essentially a C# class that matches the schema of our dataset. In our case, we’ll define a DataDto class to represent each row in the data.txt file. Here’s the DataDto.cs file:

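using Microsoft.ML.Data; // Provides the LoadColumn and ColumnName attributes

public class DataDto
{
    // Read from the first column (index 0) of the file
    [LoadColumn(0), ColumnName("City")]
    public string City { get; set; }

    // Read from the second column (index 1) of the file
    [LoadColumn(1), ColumnName("Temperature")]
    public float Temperature { get; set; }
}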

The DataDto class has two properties, City and Temperature, which correspond to the columns in the dataset. The properties are decorated with attributes: LoadColumn and ColumnName. The LoadColumn attribute specifies the index of the column from which the property should load its data (0-based index), and the ColumnName attribute assigns the name for the corresponding column in the loaded data.
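To make the role of the two attributes more concrete, here is a small sketch in which the property names deliberately differ from the column names. RenamedDto and ColumnAttributeSketch are hypothetical names introduced for this illustration; the file is assumed to have the same layout as data.txt.

```csharp
using System.Linq;
using Microsoft.ML;
using Microsoft.ML.Data;

// Hypothetical variant of DataDto: the C# property names differ from the
// column names that should appear in the loaded data.
public class RenamedDto
{
    [LoadColumn(0), ColumnName("City")]
    public string CityName { get; set; }

    [LoadColumn(1), ColumnName("Temperature")]
    public float DegreesCelsius { get; set; }
}

public static class ColumnAttributeSketch
{
    // Loads the file and returns the column names found in the schema.
    public static string[] LoadedColumnNames(string path)
    {
        var mlContext = new MLContext();
        IDataView dataView = mlContext.Data.LoadFromTextFile<RenamedDto>(
            path, separatorChar: ',', hasHeader: true);

        // The names come from the ColumnName attributes, not from the
        // property names CityName and DegreesCelsius.
        return dataView.Schema.Select(column => column.Name).ToArray();
    }
}
```

When the property name already matches the desired column name, as in DataDto, the ColumnName attribute is redundant but harmless; it mainly documents the intended schema.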

Loading the Dataset

With the DTO in place, we can now proceed to load the dataset using ML.NET. The entry point for ML.NET operations is the MLContext class. In our Program.cs, we’ll create an instance of MLContext, specify the path to the text file, and load the data into an IDataView.

using System;
using Microsoft.ML;

public class Program
{
    static void Main()
    {
        // Create an MLContext
        var mlContext = new MLContext();
        
        // Specify the path to the text file dataset
        string dataPath = "data.txt";
        
        // Load the text file into an IDataView, using the DataDto class as the schema
        var dataView = mlContext.Data.LoadFromTextFile<DataDto>(dataPath, separatorChar: ',', hasHeader: true);
        
        // Now you can use the dataView for further processing, like training a model, data analysis, etc.
        // ...
    }
}

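The LoadFromTextFile method takes the path to the dataset file (dataPath), the separator character (a comma in our case), and a boolean indicating whether the file starts with a header row (hasHeader: true). The separator defaults to a tab and hasHeader defaults to false, so both must be set explicitly for this comma-separated file with a header.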
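A quick way to check that the file was parsed as expected is to read the rows back into DataDto objects. The following is a minimal sketch under the assumption that the file has the layout of data.txt; the class name TextFileLoadingSketch and the method LoadRows are hypothetical names used for this illustration.

```csharp
using System.Collections.Generic;
using System.Linq;
using Microsoft.ML;
using Microsoft.ML.Data;

public class DataDto
{
    [LoadColumn(0), ColumnName("City")]
    public string City { get; set; }

    [LoadColumn(1), ColumnName("Temperature")]
    public float Temperature { get; set; }
}

public static class TextFileLoadingSketch
{
    // Loads a comma-separated file with a header row and returns its rows
    // as a list of DataDto objects.
    public static List<DataDto> LoadRows(string path)
    {
        var mlContext = new MLContext();

        IDataView dataView = mlContext.Data.LoadFromTextFile<DataDto>(
            path, separatorChar: ',', hasHeader: true);

        // Enumerating the IDataView is what actually reads the file contents;
        // ToList materializes every row in memory.
        return mlContext.Data
            .CreateEnumerable<DataDto>(dataView, reuseRowObject: false)
            .ToList();
    }
}
```

Materializing every row is reasonable for a small file like this one. For large datasets it is usually better to pass the IDataView straight to transforms and trainers, which stream the rows instead.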

Conclusion

In this blog post, we’ve learned how to load a text file dataset in ML.NET using a Data Transfer Object (DTO) to define the structure of the data. With the LoadFromTextFile method, we can read the dataset into an IDataView and use it for further processing, such as training a machine learning model or conducting data analysis. Because the whole workflow stays in ordinary C# code, ML.NET makes machine learning accessible to a broad range of .NET developers.

Install the ML.NET Command-Line Interface (CLI) tool

Windows

dotnet tool install --global mlnet-win-x64

Linux

dotnet tool install --global mlnet-linux-x64

Install a specific release version

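If you’re trying to install a pre-release version or a specific version of the tool, you can specify the OS, processor architecture, and framework using the following format. The first command is the general template; the second is a concrete example for Linux x64 targeting net7.0: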

dotnet tool install -g mlnet-<OS>-<ARCH> --framework <FRAMEWORK>
dotnet tool install -g mlnet-linux-x64 --framework net7.0

Update the CLI package

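# List the globally installed tools to confirm the package name
dotnet tool list --global
# Update the ML.NET CLI package (Linux x64 shown)
dotnet tool update --global mlnet-linux-x64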

References
https://learn.microsoft.com/en-us/dotnet/machine-learning/how-to-guides/install-ml-net-cli