Custom Pipeline Example
Build a reusable Pipeline with transformers and a final estimator.
<?php
require 'vendor/autoload.php';

use Pml\Dataset;
use Pml\Pipeline;
use Pml\Transformers\WordCountVectorizer;
use Pml\Estimators\Classifiers\GBDTClassifier;

// Load the labeled training data from CSV.
$dataset = Dataset::fromCSV('datasets/sentiment/train.csv', labelColumn: 'label', hasHeader: true)
    ->dropNans()
    ->materialize(labelCol: 'label');

// One transformer (word counts) feeding a final estimator.
$vectorizer = new WordCountVectorizer(minDf: 2, maxFeatures: 2000);
$model = new GBDTClassifier(nEstimators: 80, maxDepth: 5);

$pipeline = new Pipeline([
    $vectorizer,
], $model);

// Fits the transformers, then trains the estimator.
$pipeline->train($dataset);

// Small in-memory dataset to run inference on.
$inference = Dataset::fromArray(
    [
        ['I love this product', 'positive'],
        ['Terrible support experience', 'negative'],
    ],
    [1, 0]
)->materialize(labelCol: 1);

// Applies the fitted transformers before predicting.
$predictions = $pipeline->predict($inference);
print_r($predictions->toFlatArray());
Notes
- Pipeline::train() fits transformers and then trains the estimator.
- Pipeline::predict() applies fitted transformers before inference.
- Save and reload pipelines with Pipeline::save($dir) and Pipeline::load($dir).
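To make the train/predict ordering in the notes concrete, here is a minimal, self-contained sketch of the general pattern. It does not use Pml: ToyCountVectorizer, ToyNearestClassifier, and ToyPipeline are hypothetical stand-ins invented for illustration, and nothing here should be read as how Pml implements Pipeline internally.

```php
<?php
declare(strict_types=1);

// Hypothetical transformer: learns a vocabulary in fit(), maps text to word counts in transform().
final class ToyCountVectorizer
{
    private array $vocab = []; // word => column index

    public function fit(array $texts): void
    {
        $this->vocab = [];
        foreach ($texts as $text) {
            foreach ($this->tokenize($text) as $word) {
                if (!isset($this->vocab[$word])) {
                    $index = count($this->vocab);
                    $this->vocab[$word] = $index;
                }
            }
        }
    }

    public function transform(array $texts): array
    {
        $rows = [];
        foreach ($texts as $text) {
            $row = array_fill(0, count($this->vocab), 0);
            foreach ($this->tokenize($text) as $word) {
                // Words unseen during fit() are ignored.
                if (isset($this->vocab[$word])) {
                    $row[$this->vocab[$word]]++;
                }
            }
            $rows[] = $row;
        }
        return $rows;
    }

    private function tokenize(string $text): array
    {
        return preg_split('/\W+/', strtolower($text), -1, PREG_SPLIT_NO_EMPTY);
    }
}

// Hypothetical estimator: returns the label of the training row with the largest dot product.
final class ToyNearestClassifier
{
    private array $rows = [];
    private array $labels = [];

    public function train(array $rows, array $labels): void
    {
        $this->rows = $rows;
        $this->labels = $labels;
    }

    public function predict(array $rows): array
    {
        $out = [];
        foreach ($rows as $row) {
            $best = 0;
            $bestScore = -1;
            foreach ($this->rows as $i => $trainRow) {
                $score = 0;
                foreach ($row as $j => $value) {
                    $score += $value * $trainRow[$j];
                }
                if ($score > $bestScore) {
                    $bestScore = $score;
                    $best = $i;
                }
            }
            $out[] = $this->labels[$best];
        }
        return $out;
    }
}

// Hypothetical pipeline: fit + transform each step during training, transform only during prediction.
final class ToyPipeline
{
    public function __construct(private array $transformers, private object $estimator) {}

    public function train(array $samples, array $labels): void
    {
        foreach ($this->transformers as $transformer) {
            $transformer->fit($samples);
            $samples = $transformer->transform($samples);
        }
        $this->estimator->train($samples, $labels);
    }

    public function predict(array $samples): array
    {
        foreach ($this->transformers as $transformer) {
            $samples = $transformer->transform($samples); // reuse fitted state, no refit
        }
        return $this->estimator->predict($samples);
    }
}

$pipeline = new ToyPipeline([new ToyCountVectorizer()], new ToyNearestClassifier());
$pipeline->train(
    ['I love this product', 'Terrible support experience'],
    ['positive', 'negative']
);
$predictions = $pipeline->predict(['love the product', 'support was terrible']);
print_r($predictions);
```

The point of the sketch is that predict() never calls fit(): the vocabulary learned during training is reused as-is, so inference inputs are encoded with the same columns the estimator was trained on.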