Training API
This section documents the training orchestration used for neural networks and general training loops.
Pml\Training\Trainer
Trainer is a high-level training driver for TorchBackend and Sequential models.
Signature
public function __construct(MLBackend $backend, TrainingArguments $args, ?LoggerInterface $logger = null)
public function addCallback(TrainerCallback $callback): void
public function train(Dataset $dataset, ?Dataset $validation = null): TrainingResult
What it does
- Creates a DataLoader for mini-batching (see the conceptual sketch after this list).
- Optionally shuffles the dataset.
- Runs epoch/batch loops for deep learning backends.
- Computes train and validation loss.
- Applies LR scheduling and checkpointing.
- Fires callback hooks at lifecycle events.
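The following is a simplified, conceptual sketch of the per-epoch loop, not the actual implementation. It assumes Dataset lives under Pml\Data, and the $trainStep closure is a hypothetical stand-in for the backend-specific forward/backward step.
use Pml\Data\DataLoader;
use Pml\Data\Dataset;

// Conceptual sketch only: the real Trainer also applies LR scheduling,
// checkpointing, early stopping, and callback dispatch around this loop.
function sketchEpochLoop(Dataset $dataset, int $epochs, int $batchSize): void
{
    $loader = new DataLoader($dataset, batchSize: $batchSize, shuffle: true);

    // Hypothetical stand-in for the backend's forward/backward step;
    // it would return the loss for one mini-batch.
    $trainStep = fn($inputs, $labels): float => 0.0;

    for ($epoch = 1; $epoch <= $epochs; $epoch++) {
        $epochLoss = 0.0;
        foreach ($loader->batches() as $batch) {
            $epochLoss += $trainStep($batch->inputs(), $batch->labels());
        }
        $trainLoss = $epochLoss / max(1, $loader->steps());
        printf("epoch %d/%d  train loss %.4f\n", $epoch, $epochs, $trainLoss);
    }
}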
Pml\Training\TrainingArguments
A typed container for training hyperparameters.
Constructor
public function __construct(
int $epochs = 10,
int $batchSize = 32,
int $patience = 0,
float $minDelta = 1e-4,
float $learningRate = 0.001,
string $lrSchedule = 'none',
float $lrDecay = 0.1,
int $lrStepSize = 5,
int $warmupEpochs = 0,
bool $mixedPrecision = false,
?string $outputDir = null,
int $saveEvery = 0,
bool $saveBest = true,
int $logEvery = 1
)
Methods
public function toTrainOptions(?Dataset $validation = null): array
Returns a named options array suitable for Sequential::train().
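As a hedged example, TrainingArguments can be built with named arguments and then flattened into the options array expected by Sequential::train(). The 'step' schedule name and the $validation dataset are assumptions, not confirmed values.
use Pml\Training\TrainingArguments;

// Decay the learning rate by 10x every 5 epochs after a 2-epoch warmup,
// stop early after 10 epochs without improvement, and keep the best checkpoint.
$args = new TrainingArguments(
    epochs: 50,
    batchSize: 64,
    patience: 10,
    learningRate: 0.001,
    lrSchedule: 'step',      // assumed schedule name; 'none' disables scheduling
    lrDecay: 0.1,
    lrStepSize: 5,
    warmupEpochs: 2,
    outputDir: './checkpoints',
    saveBest: true,
);

// $validation is an optional held-out Dataset; pass null to skip validation options.
$options = $args->toTrainOptions($validation);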
Data loading
Pml\Data\DataLoader
public function __construct(Dataset $dataset, int $batchSize = 32, bool $shuffle = false, bool $dropLast = false, ?DataCollator $collator = null)
public function batches(): \Generator
public function steps(): int
public function batchSize(): int
public function dataset(): Dataset
DataLoader::batches() yields zero-copy DataBatch objects.
Pml\Data\DataBatch
public function __construct(Tensor $inputs, ?Tensor $labels = null, array $meta = [])
public function inputs(): Tensor
public function labels(): ?Tensor
public function meta(): array
public function hasLabels(): bool
public function size(): int
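A minimal sketch of consuming batches, assuming $dataset is a labeled, materialized Dataset:
use Pml\Data\DataLoader;

// Shuffled mini-batches of 16; drop the final incomplete batch.
$loader = new DataLoader($dataset, batchSize: 16, shuffle: true, dropLast: true);

echo $loader->steps() . " steps per epoch\n";

foreach ($loader->batches() as $batch) {
    $inputs = $batch->inputs();                       // zero-copy Tensor view
    $labels = $batch->hasLabels() ? $batch->labels() : null;
    echo $batch->size() . " samples in this batch\n";
}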
Callbacks
Pml\Training\TrainerCallback
Implement this interface to receive hooks during training.
public function onTrainBegin(TrainingArguments $args, int $steps): void
public function onEpochBegin(int $epoch, int $epochs): void
public function onBatchEnd(int $step, float $batchLoss): void
public function onEpochEnd(int $epoch, float $trainLoss, ?float $valLoss): void
public function onTrainEnd(TrainingResult $result): void
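A minimal callback that logs per-epoch losses might look like the sketch below; the output format is illustrative.
use Pml\Training\TrainerCallback;
use Pml\Training\TrainingArguments;
use Pml\Training\TrainingResult;

final class ConsoleLossCallback implements TrainerCallback
{
    public function onTrainBegin(TrainingArguments $args, int $steps): void
    {
        echo "training started: {$steps} steps per epoch\n";
    }

    public function onEpochBegin(int $epoch, int $epochs): void {}

    public function onBatchEnd(int $step, float $batchLoss): void {}

    public function onEpochEnd(int $epoch, float $trainLoss, ?float $valLoss): void
    {
        printf("epoch %d  train %.4f  val %s\n",
            $epoch, $trainLoss, $valLoss === null ? 'n/a' : sprintf('%.4f', $valLoss));
    }

    public function onTrainEnd(TrainingResult $result): void
    {
        echo "training finished\n";
    }
}

// Register before training:
// $trainer->addCallback(new ConsoleLossCallback());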
Pml\Training\TrainingResult
Stores summary information after training.
Fields
- epochsRun
- trainLossHistory
- valLossHistory
- bestValLoss
- earlyStopped
- elapsedSeconds
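After train() returns, the summary can be read as sketched below. This assumes the fields are exposed as public properties; use the corresponding accessors instead if your version provides getters.
// $result is the TrainingResult returned by Trainer::train().
echo "epochs run:    {$result->epochsRun}\n";
echo "best val loss: " . ($result->bestValLoss ?? 'n/a') . "\n";
echo "early stopped: " . ($result->earlyStopped ? 'yes' : 'no') . "\n";
echo "elapsed (s):   {$result->elapsedSeconds}\n";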
Example Usage
use Pml\Training\Trainer;
use Pml\Training\TrainingArguments;
use Pml\NeuralNetwork\Sequential;
use Pml\NeuralNetwork\Layers\Dense;
use Pml\Losses\MeanSquaredError;
use Pml\NeuralNetwork\Optimizers\Adam;
use Pml\Data\Dataset;   // assumes Dataset lives under Pml\Data

// Toy regression dataset: y = 2x
$dataset = Dataset::fromArray(
    [[0.0], [1.0], [2.0]],
    [0.0, 2.0, 4.0]
);

$model = new Sequential([
    new Dense(1, 10),
    new Dense(10, 1),
], new MeanSquaredError(), new Adam(0.01));

$args = new TrainingArguments(epochs: 20, batchSize: 1, patience: 5);

// $backend is an already constructed MLBackend (e.g. a TorchBackend instance).
$trainer = new Trainer($backend, $args);
$result = $trainer->train($dataset);
Performance notes
- DataLoader shuffles in-place using Dataset::randomize().
- The trainer uses zero-copy batch views to avoid repeated copies.
- Checkpointing and validation can be enabled without changing the model code.
Common mistakes
- Passing an unlabeled dataset to a learner that requires labels.
- Using DataLoader on a dataset before calling materialize().
- Ignoring the callback hooks when you need metrics or custom logging.