Training a simple classifier
In this example we will see how to build a simple support vector classifier using Netsaur.
The dataset
We will be classifying two classes from the Iris dataset. The dataset consists of 150 samples of data on three different species of the iris flower. We will be using the first two classes (Iris versicolor, Iris setosa) for our example.
The two-class dataset can be downloaded here.
Loading dependencies
We first need to import the necessary modules from Netsaur:
Model
We will be creating a sequential neural network in this example.
import { Sequential } from "jsr:@denosaurs/netsaur@0.4.0/core";
Layers
Let's import a DenseLayer and a SigmoidLayer for activation.
import {
  DenseLayer,
  SigmoidLayer,
} from "jsr:@denosaurs/netsaur@0.4.0/core/layers";
Utilities
From netsaur/utilities, we need useSplit to split the dataset and ClassificationReport to get model metrics.
import {
  useSplit,
  ClassificationReport,
} from "jsr:@denosaurs/netsaur@0.4.0/utilities";
Misc
Cost: to select a cost function for the network
CPU: the backend we will be training on
setupBackend: to set up the CPU backend
tensor2D: to create a dataset for the model
import {
  Cost,
  CPU,
  setupBackend,
  tensor2D,
} from "jsr:@denosaurs/netsaur@0.4.0";
We need the parse function to load CSV data.
import { parse } from "jsr:@std/csv@1.0.3/parse";
Loading the dataset
First, read the dataset file using Deno.readTextFileSync and then parse the text content using the parse function we imported.
const _data = Deno.readTextFileSync("binary_iris.csv");
const data = parse(_data);
Now we can get the predictors (x) and targets (y). The first four columns are the predictors and the fifth column contains the class.
Since we are training a support vector classifier, our outputs are encoded as 1 and -1.
const x = data.map((fl) => fl.slice(0, 4).map(Number));
const y = data.map((fl) => (fl[4] === "Setosa" ? 1 : -1));
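The mapping above assumes every row is a data row. As a minimal sketch, the snippet below shows the same encoding on a few hypothetical rows, with a guard that drops any row whose first four fields are not numeric (for example a header line). The column names and the "Versicolor" label are assumptions made for illustration; check them against your copy of the file.

```typescript
// Hypothetical rows, shaped like the string[][] that parse() returns.
// Assumption: the column names and the "Versicolor" label are illustrative only.
const sampleRows: string[][] = [
  ["sepal.length", "sepal.width", "petal.length", "petal.width", "variety"],
  ["5.1", "3.5", "1.4", "0.2", "Setosa"],
  ["7.0", "3.2", "4.7", "1.4", "Versicolor"],
];

// Keep only rows with five fields whose first four fields are finite numbers.
// A header row fails this check because Number("sepal.length") is NaN.
const cleanRows = sampleRows.filter(
  (row) =>
    row.length >= 5 &&
    row
      .slice(0, 4)
      .every((v) => v.trim() !== "" && Number.isFinite(Number(v))),
);

// Predictors: the first four columns converted to numbers.
const sampleX: number[][] = cleanRows.map((row) => row.slice(0, 4).map(Number));

// Targets: 1 for "Setosa", -1 for anything else.
const sampleY: number[] = cleanRows.map((row) => (row[4] === "Setosa" ? 1 : -1));

console.log(sampleX);
console.log(sampleY);
```

If your file has no header row, the filter simply keeps every row and the result matches the two-line version above.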
Next, we split the dataset for training and testing. A common train:test ratio is 7:3.
const [[trainX, trainY], [testX, testY]] = useSplit(
  { ratio: [7, 3], shuffle: true },
  x,
  y
);
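To make the idea of a shuffled 7:3 split concrete, here is a rough sketch of what such a split does. The helper splitByRatio is hypothetical and is not Netsaur's implementation of useSplit; it only illustrates shuffling the row indices once and cutting them at the ratio boundary, so that each predictor row stays paired with its target.

```typescript
// Hypothetical helper for illustration; NOT Netsaur's useSplit implementation.
function splitByRatio<T, U>(
  xs: T[],
  ys: U[],
  ratio: [number, number],
  random: () => number = Math.random,
): [[T[], U[]], [T[], U[]]] {
  if (xs.length !== ys.length) {
    throw new Error("x and y must have the same length");
  }
  // Shuffle indices (Fisher-Yates) so x and y stay aligned.
  const idx = xs.map((_, i) => i);
  for (let i = idx.length - 1; i > 0; i--) {
    const j = Math.floor(random() * (i + 1));
    [idx[i], idx[j]] = [idx[j], idx[i]];
  }
  // Cut the shuffled indices at the train share of the ratio.
  const cut = Math.round((idx.length * ratio[0]) / (ratio[0] + ratio[1]));
  const trainIdx = idx.slice(0, cut);
  const testIdx = idx.slice(cut);
  return [
    [trainIdx.map((i) => xs[i]), trainIdx.map((i) => ys[i])],
    [testIdx.map((i) => xs[i]), testIdx.map((i) => ys[i])],
  ];
}

// Ten made-up samples: the label is derived from the first feature.
const demoX = Array.from({ length: 10 }, (_, i) => [i, i * 2]);
const demoY = demoX.map(([a]) => (a % 2 === 0 ? 1 : -1));

const [[demoTrainX, demoTrainY], [demoTestX, demoTestY]] = splitByRatio(
  demoX,
  demoY,
  [7, 3],
);
console.log(demoTrainX.length, demoTestX.length);
```

With the 100 samples of the two-class dataset, a 7:3 ratio leaves 70 samples for training and 30 for testing.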
Preparing the model
Let's set up our CPU backend first. This allows Netsaur to load the correct binaries for the desired backend.
await setupBackend(CPU);
Now comes our model. The size parameter defines the input size of your data. For this example, the first number, 4, means your data will be split into 4 minibatches. The numbers after the first number define the input shape without the number of samples, which here is the number of predictors per sample.
We are setting the silent parameter to false so that the network prints the training log to stdout.
Our layer configuration consists of a dense (fully connected) layer with 4 neurons, a sigmoid activation layer, and a dense output layer with 1 neuron. The output layer only has a single neuron because our output is a single binary value.
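As a rough illustration of how data flows through this configuration, the sketch below runs one sample through a dense layer with 4 neurons, a sigmoid, and a dense layer with 1 neuron. The helpers dense and sigmoid and all weights are hypothetical values chosen so the arithmetic is easy to follow; this is not how Netsaur implements its layers, and a trained network would have learned weights instead.

```typescript
type Vec = number[];
type Mat = number[][]; // one row of weights per output neuron

// Dense layer: each output is a weighted sum of the inputs plus a bias.
function dense(weights: Mat, bias: Vec, input: Vec): Vec {
  return weights.map((row, i) =>
    row.reduce((sum, w, j) => sum + w * input[j], bias[i])
  );
}

// Sigmoid activation applied element-wise.
function sigmoid(v: Vec): Vec {
  return v.map((z) => 1 / (1 + Math.exp(-z)));
}

// Hypothetical weights: a 4x4 hidden layer and a 1x4 output layer.
const hiddenWeights: Mat = Array.from({ length: 4 }, () => [0, 0, 0, 0]);
const hiddenBias: Vec = [0, 0, 0, 0];
const outputWeights: Mat = [[1, 1, 1, 1]];
const outputBias: Vec = [-2];

const sample: Vec = [5.1, 3.5, 1.4, 0.2]; // four predictors
const hidden = sigmoid(dense(hiddenWeights, hiddenBias, sample)); // 4 values
const score = dense(outputWeights, outputBias, hidden); // 1 value

console.log(hidden.length, score.length);
```

The single output value is the raw score that is later turned into 1 or -1.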
Finally, we are using the hinge cost function, which is the standard cost function for SVMs.
const net = new Sequential({
  size: [4, trainX[0].length],
  silent: false,
  layers: [
    DenseLayer({ size: [4] }),
    SigmoidLayer(),
    DenseLayer({ size: [1] }),
  ],
  cost: Cost.Hinge,
});
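For intuition about the hinge cost, here is a small sketch of the textbook definition, max(0, 1 - y * ŷ), where y is the target (1 or -1) and ŷ is the raw network output. The functions hingeLoss and meanHingeLoss are hypothetical names for illustration; how Cost.Hinge reduces the loss over a batch inside Netsaur is not shown here and may differ.

```typescript
// Textbook hinge loss for a single sample with a target of 1 or -1.
// Hypothetical helper; not taken from Netsaur's source.
function hingeLoss(yTrue: number, yPred: number): number {
  return Math.max(0, 1 - yTrue * yPred);
}

// Average hinge loss over a set of samples (one possible reduction).
function meanHingeLoss(yTrue: number[], yPred: number[]): number {
  if (yTrue.length !== yPred.length || yTrue.length === 0) {
    throw new Error("inputs must be non-empty and of equal length");
  }
  let total = 0;
  for (let i = 0; i < yTrue.length; i++) {
    total += hingeLoss(yTrue[i], yPred[i]);
  }
  return total / yTrue.length;
}

console.log(hingeLoss(1, 2)); // correct side, outside the margin: 0
console.log(hingeLoss(1, 0.5)); // correct side, inside the margin: 0.5
console.log(hingeLoss(-1, 0.5)); // wrong side: 1.5
console.log(meanHingeLoss([1, 1, -1], [2, 0.5, 0.5]));
```

The loss is zero only when a prediction has the right sign and a magnitude of at least 1, which is why the targets are encoded as 1 and -1 rather than 1 and 0.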
Training the model
Now we train our network for 150 epochs, in 1 batch, with a learning rate of 0.02.
net.train(
  [
    {
      inputs: tensor2D(trainX),
      outputs: tensor2D(trainY.map((x) => [x])),
    },
  ],
  150,
  1,
  0.02
);
Evaluating the model
To evaluate our model, we can use the trained model on the test data.
const res = await net.predict(tensor2D(testX));
Our result will be a 2-dimensional tensor. Tensor.prototype.data is a Float32Array, which we can iterate through.
Now we use a sign function to convert the predicted values into 1 or -1.
const y1 = res.data.map((x) => (x < 0 ? -1 : 1));
Finally we can generate a classification report for our evaluation.
const cMatrix = new ClassificationReport(testY, y1);
console.log(cMatrix);
You should get an output like this:
Classification Report
Number of classes: 2
Class Preci F1 Rec Sup
1 1.0000 1.0000 1.0000 17
-1 1.0000 1.0000 1.0000 13
Accuracy 1 30
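To see where the numbers in such a report come from, the sketch below computes precision, recall, F1, and accuracy by hand for one class. The function binaryMetrics and the sample scores are hypothetical and only illustrate the standard definitions; ClassificationReport may compute or format its values differently.

```typescript
// Hypothetical helper: standard metrics for one class treated as "positive".
function binaryMetrics(yTrue: number[], yPred: number[], positive = 1) {
  let tp = 0, fp = 0, fn = 0, tn = 0;
  for (let i = 0; i < yTrue.length; i++) {
    const actualPos = yTrue[i] === positive;
    const predictedPos = yPred[i] === positive;
    if (actualPos && predictedPos) tp++;
    else if (!actualPos && predictedPos) fp++;
    else if (actualPos && !predictedPos) fn++;
    else tn++;
  }
  const precision = tp + fp === 0 ? 0 : tp / (tp + fp);
  const recall = tp + fn === 0 ? 0 : tp / (tp + fn);
  const f1 = precision + recall === 0
    ? 0
    : (2 * precision * recall) / (precision + recall);
  const accuracy = yTrue.length === 0 ? 0 : (tp + tn) / yTrue.length;
  // Support is the number of true samples of the positive class.
  return { precision, recall, f1, accuracy, support: tp + fn };
}

// Made-up raw scores, thresholded with the same sign rule as above.
const rawScores = [0.8, -1.2, 0.1, -0.4];
const predicted = rawScores.map((s) => (s < 0 ? -1 : 1));
const actual = [1, -1, -1, -1];

const metrics = binaryMetrics(actual, predicted);
console.log(metrics);
```

In this made-up example one negative sample is misclassified, so precision for class 1 drops to 0.5 while recall stays at 1. A report of all 1.0000 values, as in the output above, means no test sample was misclassified.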