Introduction
Machine learning is everywhere—recommending movies, tagging images, and now even classifying news articles. Imagine if you could do that within PHP! With Rubix ML, you can bring the power of machine learning to PHP in a way that’s straightforward and accessible. This guide will walk you through building a simple news classifier that sorts articles into categories like “Sports” or “Technology.” By the end, you’ll have a working classifier that can predict categories for new articles based on their content.
This project is perfect for beginners who want to dip their toes into machine learning using PHP, and you can follow along with the complete code on GitHub.
Table of Contents
- What is Rubix ML?
- Setting Up the Project
- Creating the News Classification Class
- Training the Model
- Predicting New Samples
- Final Thoughts
What is Rubix ML?
Rubix ML is a machine learning library for PHP that brings ML tools and algorithms into a PHP-friendly environment. Whether you’re working on classification, regression, clustering, or even natural language processing, Rubix ML has you covered. It allows you to load and preprocess data, train models, and evaluate performance—all in PHP.
Rubix ML supports a wide range of machine learning tasks, such as:
- Classification: Categorizing data, like labeling emails as spam or not spam.
- Regression: Predicting continuous values, like housing prices.
- Clustering: Grouping data without labels, like finding customer segments.
- Natural Language Processing (NLP): Working with text data, such as tokenizing and transforming it into usable formats for ML.
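To make the NLP item concrete, here is a minimal plain-PHP sketch of tokenizing a text and counting its words. The function names tokenize and wordCounts are hypothetical and are not part of Rubix ML, whose own tokenizers and vectorizers are more capable; this only illustrates the idea of turning text into numbers.

```php
<?php
// Plain-PHP illustration only; these helpers are assumptions, not Rubix ML APIs.

/** Lowercase a text and split it into alphanumeric word tokens. */
function tokenize(string $text): array
{
    preg_match_all('/[a-z0-9]+/', strtolower($text), $matches);

    return $matches[0];
}

/** Count how often each token occurs in the text. */
function wordCounts(string $text): array
{
    return array_count_values(tokenize($text));
}

// "the" appears twice, every other word once.
print_r(wordCounts('The team played the game'));
```

A classifier never sees the raw sentence, only a numeric representation like these counts.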
Let’s dive into how you can use Rubix ML to build a simple news classifier in PHP!
Setting Up the Project
We’ll start by setting up a new PHP project with Rubix ML and configuring autoloading.
Step 1: Initialize the Project Directory
Create a new project directory and navigate into it:
mkdir NewsClassifier
cd NewsClassifier
Step 2: Install Rubix ML with Composer
Make sure you have Composer installed, then add Rubix ML to your project by running:
composer require rubix/ml
Step 3: Configure Autoloading in composer.json
To autoload classes from our project’s src directory, open or create a composer.json file and add the following configuration:
{
    "autoload": {
        "psr-4": {
            "NewsClassifier\\": "src/"
        }
    },
    "require": {
        "rubix/ml": "^2.5"
    }
}
This tells Composer to autoload any classes within the src folder under the NewsClassifier namespace.
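The mapping can be pictured with a small plain-PHP sketch. The helper resolvePsr4Path is hypothetical and much simpler than Composer's real autoloader; it only shows how a class name under the NewsClassifier namespace would translate to a file in src/.

```php
<?php
/**
 * Hypothetical helper: map a fully qualified class name to a file path
 * using a single PSR-4 prefix. Composer's actual autoloader does more.
 */
function resolvePsr4Path(string $class, string $prefix, string $baseDir): ?string
{
    // The class must live under the configured namespace prefix.
    if (strncmp($class, $prefix, strlen($prefix)) !== 0) {
        return null;
    }

    // Drop the prefix and turn namespace separators into directory separators.
    $relative = substr($class, strlen($prefix));

    return rtrim($baseDir, '/') . '/' . str_replace('\\', '/', $relative) . '.php';
}

echo resolvePsr4Path('NewsClassifier\\Classification', 'NewsClassifier\\', 'src/'), PHP_EOL;
// prints src/Classification.php
```

This is why the class created later in this guide has to be saved as src/Classification.php and declare the NewsClassifier namespace.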
Step 4: Run Composer Autoload Dump
After adding the autoload configuration, run the following command to regenerate Composer’s autoloader:
composer dump-autoload
Step 5: Directory Structure
Your project directory should look like this:
NewsClassifier/
├── src/
│   ├── Classification.php
│   └── train.php
├── storage/
├── vendor/
├── composer.json
└── composer.lock
- src/: Contains your PHP scripts.
- storage/: Where the trained model will be saved. Create this directory yourself (for example with mkdir storage); Composer does not create it.
- vendor/: Contains dependencies installed by Composer.
Creating the News Classification Class
In src/, create a file called Classification.php. This file will contain the methods for training the model and predicting news categories.
<?php

namespace NewsClassifier;

use Rubix\ML\Classifiers\KNearestNeighbors;
use Rubix\ML\Datasets\Labeled;
use Rubix\ML\Datasets\Unlabeled;
use Rubix\ML\PersistentModel;
use Rubix\ML\Persisters\Filesystem;
use Rubix\ML\Pipeline;
use Rubix\ML\Tokenizers\Word;
use Rubix\ML\Transformers\TfIdfTransformer;
use Rubix\ML\Transformers\WordCountVectorizer;

class Classification
{
    // Path of the file the trained model is written to and read from
    private $modelPath;

    public function __construct($modelPath)
    {
        $this->modelPath = $modelPath;
    }

    public function train()
    {
        // Sample data and corresponding labels
        $samples = [
            ['The team played an amazing game of soccer'],
            ['The new programming language has been released'],
            ['The match between the two teams was incredible'],
            ['The new tech gadget has been launched'],
        ];

        $labels = [
            'sports',
            'technology',
            'sports',
            'technology',
        ];

        // Create a labeled dataset
        $dataset = new Labeled($samples, $labels);

        // Set up the pipeline with a text transformer and K-Nearest Neighbors classifier
        // (k = 4 equals the number of training samples, so all four vote)
        $estimator = new Pipeline([
            new WordCountVectorizer(10000, 1, 1, new Word()),
            new TfIdfTransformer(),
        ], new KNearestNeighbors(4));

        // Train the model
        $estimator->train($dataset);

        // Save the model
        $this->saveModel($estimator);

        echo "Training completed and model saved.\n";
    }

    private function saveModel($estimator)
    {
        // Wrap the trained estimator so it can be written to $this->modelPath
        $persister = new Filesystem($this->modelPath);
        $model = new PersistentModel($estimator, $persister);
        $model->save();
    }

    public function predict(array $samples)
    {
        // Load the saved model
        $persister = new Filesystem($this->modelPath);
        $model = PersistentModel::load($persister);

        // Predict categories for new samples
        $dataset = new Unlabeled($samples);

        return $model->predict($dataset);
    }
}
This Classification class contains methods to:
- Train: Create and train a pipeline-based model.
- Save the Model: Save the trained model to the specified path.
- Predict: Load the saved model and predict the category for new samples.
Training the Model
Create a script called train.php in src/ to train the model.
<?php

require __DIR__ . '/../vendor/autoload.php';

use NewsClassifier\Classification;

// Define the model path
$modelPath = __DIR__ . '/../storage/model.rbx';

// Initialize the Classification object
$classifier = new Classification($modelPath);

// Train the model and save it
$classifier->train();
Run this script to train the model:
php src/train.php
If successful, you’ll see:
Training completed and model saved.
Predicting New Samples
Create another script, predict.php, in src/ to classify new articles based on the trained model.
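A minimal sketch of what predict.php could look like, assuming the Classification class from this guide and a model already saved to storage/model.rbx by train.php. The helper formatPredictions and the two sample headlines are made up for illustration; treat this as one possible shape for the script rather than a definitive listing.

```php
<?php
// src/predict.php (sketch): assumes Classification.php from this guide and a
// model saved by train.php. The sample headlines below are invented.

/** Hypothetical helper: pair each sample text with its predicted label. */
function formatPredictions(array $samples, array $predictions): array
{
    $lines = [];
    foreach ($samples as $i => $sample) {
        $lines[] = sprintf('"%s" => %s', $sample[0], $predictions[$i]);
    }

    return $lines;
}

$autoload  = __DIR__ . '/../vendor/autoload.php';
$modelPath = __DIR__ . '/../storage/model.rbx';

if (!is_file($autoload) || !is_file($modelPath)) {
    echo "Install dependencies and run train.php before predicting.\n";
} else {
    require $autoload;

    $classifier = new \NewsClassifier\Classification($modelPath);

    // Each sample is a one-element array, matching the shape of the training data.
    $samples = [
        ['The team won the championship game'],
        ['A new programming language has been released'],
    ];

    $predictions = $classifier->predict($samples);

    echo implode(PHP_EOL, formatPredictions($samples, $predictions)), PHP_EOL;
}
```

The guard at the top is a convenience so the script reports a missing model instead of failing inside the library; it can be dropped once the project is set up.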
Run the prediction script to classify the samples:
php src/predict.php
The output should show each sample text with its predicted category.
Final Thoughts
With this guide, you’ve successfully built a simple news classifier in PHP using Rubix ML! This demonstrates how PHP can be more versatile than you might think, bringing in machine learning capabilities for tasks like text classification, recommendation systems, and more. The full code for this project is available on GitHub.
Experiment with different algorithms or data to expand the classifier. Who knew PHP could do machine learning? Now you do.
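As a starting point for such experiments, it can help to see how a nearest-neighbour vote behaves on the four training sentences. The sketch below is plain PHP with hypothetical helpers (counts, distance, knnPredict) and raw word counts instead of TF-IDF, so it is not how Rubix ML implements KNearestNeighbors; it only illustrates the effect of choosing k.

```php
<?php
// Plain-PHP illustration of a k-nearest-neighbours vote; not Rubix ML's implementation.

/** Bag-of-words counts for a lowercased text. */
function counts(string $text): array
{
    preg_match_all('/[a-z0-9]+/', strtolower($text), $m);

    return array_count_values($m[0]);
}

/** Euclidean distance between two sparse count vectors. */
function distance(array $a, array $b): float
{
    $sum = 0.0;
    foreach (array_keys($a + $b) as $word) {
        $diff = ($a[$word] ?? 0) - ($b[$word] ?? 0);
        $sum += $diff * $diff;
    }

    return sqrt($sum);
}

/** Predict a label by majority vote among the $k closest training texts. */
function knnPredict(array $training, string $text, int $k): string
{
    $query = counts($text);
    $scored = [];
    foreach ($training as [$sample, $label]) {
        $scored[] = [distance(counts($sample), $query), $label];
    }

    // Closest samples first, then count the labels of the first $k.
    usort($scored, fn (array $x, array $y) => $x[0] <=> $y[0]);
    $votes = array_count_values(array_column(array_slice($scored, 0, $k), 1));
    arsort($votes);

    return array_key_first($votes);
}

$training = [
    ['The team played an amazing game of soccer', 'sports'],
    ['The new programming language has been released', 'technology'],
    ['The match between the two teams was incredible', 'sports'],
    ['The new tech gadget has been launched', 'technology'],
];

echo knnPredict($training, 'A new gadget has been released', 3), PHP_EOL;
echo knnPredict($training, 'The teams played a great match', 3), PHP_EOL;
```

With only four training samples, k = 4 makes every sample vote and produces a 2-2 split, so trying k = 1 or k = 3, or adding more labeled articles, is a natural first experiment.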
Happy coding!
The above is the detailed content of Machine Learning in PHP: Build a News Classifier Using Rubix ML. For more information, please follow other related articles on the PHP Chinese website!
