Combine rule-based and machine learning approaches to build powerful hybrid systems-AI-php.cn

Home

Technology peripherals

Combine rule-based and machine learning approaches to build powerful hybrid systems

WBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWB

May 25, 2023 am 09:04 AM

machine learningml

After all these years, we are all convinced that ML can, if not perform better, at least match pre-ML solutions almost everywhere. For example, for some rule constraints, we will all think about whether they can be replaced by a tree-based ML model. But the world isn't always black and white, and while machine learning certainly has its place in solving problems, it's not always the best solution. Rule-based systems can even outperform machine learning, especially in areas where explainability, robustness, and transparency are critical.

In this article, I will introduce some practical cases and how combining manual rules and ML can make our solutions better.

Rule-based system

A rule-based system provides support for decision-making through predefined rules. The system evaluates data based on stored rules and performs specific operations based on mappings.

Here are a few examples:

Fraud Detection: In fraud detection, a rules-based system can be used to quickly flag and investigate suspicious transactions based on predefined rules.

For example, for chess cheaters, their basic approach is to install a computer chess application in another window and use the program to play chess. No matter how complex the program is, each step requires 4- 5 seconds to complete. Therefore, a "threshold" is added to calculate the time of each step of the player. If the fluctuation is not large, it may be judged as a cheater, as shown in the following figure:

Combine rule-based and machine learning approaches to build powerful hybrid systems

Health care industry: Rules-based systems can be used to manage prescriptions and prevent medication errors. They can also be very useful in helping doctors prescribe additional analyzes to patients based on previous results.

Supply Chain Management: In supply chain management, rules-based systems can be used to generate low inventory alerts, help manage expiration dates, or new product launches.

Machine Learning-Based Systems

Machine learning (ML) systems use algorithms to learn from data and make predictions or take actions without being explicitly programmed. Machine learning systems use knowledge gained through training on large amounts of data to make predictions and decisions about new data. ML algorithms can improve their performance as more data is used for training. Machine learning systems include natural language processing, image and speech recognition, predictive analytics, and more.

Fraud Detection: Banks may use machine learning systems to learn from past fraudulent transactions and identify potential fraudulent activity in real time. Or, it might reverse engineer the system and look for transactions that look very "abnormal."

Healthcare: Hospitals may use ML systems to analyze patient data and predict a patient's likelihood of developing a certain disease based on certain X-rays.

Combine rule-based and machine learning approaches to build powerful hybrid systems

Comparison

Rule-based systems and ML systems have their own advantages and disadvantages

Rule-based The advantages of the system are obvious:

Easy to understand and explain
Quick implementation
Easy to modify
Robust

Disadvantages:

Problems involving a large number of variables
Problems with many constraints
Limited to existing rules

Based on The advantages of ml's system are also obvious

Autonomous learning system
The ability to solve more complex problems
Reduces human intervention compared with rule-based systems , improve efficiency
Flexibly adapt to changes in data and environment through continuous learning

Disadvantages:

Required data, sometimes a lot
Limited to the data ML seen before
Limited cognitive ability

Through comparison, we found that the advantages and disadvantages of the two systems do not conflict and are complementary. , so is there a way to combine their advantages?

Hybrid systems

Combine rule-based and machine learning approaches to build powerful hybrid systems

Hybrid systems, which combine rule-based systems and machine learning algorithms, have become increasingly popular recently Popularity. They can provide more robust, accurate and efficient results, especially when dealing with complex problems.

Let’s take a look at a hybrid system that can be implemented using the rental dataset:

Combine rule-based and machine learning approaches to build powerful hybrid systems

Feature Engineering: Convert Floors to Three One of several categories: high, medium or low, depending on the number of floors in the building. This can improve the efficiency of ML models

Hard-coded rules can be used as part of the feature engineering process to identify and extract important features in the input data. For example, if the problem domain is clear and unambiguous, the rules can be easily and accurately defined, and hard-coded rules can be used to create new features or modify existing features to improve the performance of the machine learning model. Although hardcoding rules and feature engineering are two different techniques, they can be used together to improve the performance of machine learning models. Hard-coded rules can be used to create new features or modify existing features, while feature engineering can be used to extract features that are not easily captured by hard-coded rules.

Post-processing: round or normalize the final result.

Hard-coded rules can be used as part of the post-processing stage to modify the output of the machine learning model. For example, if a machine learning model outputs a set of predictions that are inconsistent with some known rules or constraints, hard-coded rules can be used to modify the predictions so that they comply with the rules or constraints. Post-processing techniques such as filtering or smoothing can refine the output of a machine learning model by removing noise or errors, or improving the overall accuracy of predictions. These techniques are particularly effective when there is uncertainty in the machine learning model's output probabilistic predictions or in the input data. In some cases, post-processing techniques can also be used to enhance the input data with additional information. For example, if a machine learning model is trained on a limited data set, post-processing techniques can be used to extract additional features from external sources (such as social media or news feeds) to improve the accuracy of predictions.

Case

Healthcare

Let’s look at the data on heart disease:

Combine rule-based and machine learning approaches to build powerful hybrid systems

If we use random forest to predict the target class:

clf = RandomForestClassifier(n_estimators=100, random_state=random_seed
 X_train, X_test, y_train, y_test = train_test_split(
 df.iloc[:, :-1], df.iloc[:, -1], test_size=0.30, random_state=random_seed
 )
 clf.fit(X_train, y_train))

One of the reasons for choosing random forest here is its ability to build feature importance. Below you can see the importance of the features used for training:

Combine rule-based and machine learning approaches to build powerful hybrid systems

Look at the results:

y_pred = pd.Series(clf.predict(X_test), index=y_test.index
 cm = confusion_matrix(y_test, y_pred, labels=clf.classes_)
 conf_matrix = ConfusionMatrixDisplay(confusion_matrix=cm, display_labels=clf.classes_)
 conf_matrix.plot())

Combine rule-based and machine learning approaches to build powerful hybrid systems

f1_score(y_test, y_pred): 0.74
 recall_score(y_test, y_pred): 0.747

That’s when a cardiologist sees your model. Based on his experience and domain knowledge, he believes that the thalassemia characteristic (thal) is much more important than shown above. So we decided to build a histogram and see the results.

Combine rule-based and machine learning approaches to build powerful hybrid systems

Then specify a mandatory rule

y_pred[X_test[X_test["thal"] == 2].index] = 1

The resulting confusion matrix becomes like this:

Combine rule-based and machine learning approaches to build powerful hybrid systems

f1_score(y_test, y_pred): 0.818
 recall_score(y_test, y_pred): 0.9

The results have been greatly improved. This is where domain knowledge plays an important role in assessing patient scores.

Fraudulent Transactions

The following data set is bank fraudulent transactions.

Combine rule-based and machine learning approaches to build powerful hybrid systems

The data set is highly imbalanced:

df["Class"].value_counts()
 0 28431
 1 4925

To create the rules, we look at the box plot of the distribution of the features:

Combine rule-based and machine learning approaches to build powerful hybrid systems

We are going to write our own HybridEstimator class, which will serve as an estimator for our manual rules:

from hulearn.classification import FunctionClassifier
 rules = {
 "V3": ("<=", -2),
 "V12": ("<=", -3),
 "V17": ("<=", -2),
 }
 def create_rules(data: pd.DataFrame, rules):
 filtered_data = data.copy()
 for col in rules:
 filtered_data[col] = eval(f"filtered_data[col] {rules[col][0]} {rules[col][1]}")
 result = np.array(filtered_data[list(rules.keys())].min(axis=1)).astype(int)
 return result
 hybrid_classifier = FunctionClassifier(create_rules, rules=rules)

We can compare pure Results of rule-based system and kNN method. The reason kNN is used here is that it can handle imbalanced data:

Combine rule-based and machine learning approaches to build powerful hybrid systems

As we can see, we With only 3 rules written, it performs better than the KNN model

Summary

Our example here may not be very accurate, but it is enough to illustrate that the hybrid model provides practical benefits , such as fast implementation, robustness to outliers and increased transparency. They are beneficial when combining business logic with machine learning. For example, hybrid rule-ML systems in healthcare can diagnose diseases by combining clinical rules with machine learning algorithms that analyze patient data. Machine learning can achieve excellent results on many tasks, but it also requires supplementary domain knowledge. Domain knowledge can help machine learning models better understand data and predict and classify more accurately.

Hybrid models can help us combine domain knowledge and machine learning models. Hybrid models are usually composed of multiple sub-models, each of which is optimized for specific domain knowledge. These sub-models can be models based on hard-coded rules, models based on statistical methods, or even models based on deep learning.

Hybrid models can use domain knowledge to guide the learning process of machine learning models, thereby improving the accuracy and reliability of the model. For example, in the medical field, hybrid models can combine a doctor’s expertise with the power of a machine learning model to diagnose a patient’s disease. In the field of natural language processing, hybrid models can combine linguistic knowledge and the capabilities of machine learning models to better understand and generate natural language.

In short, hybrid models can help us combine domain knowledge and machine learning models, thereby improving the accuracy and reliability of the model, and have a wide range of applications in various tasks.

The above is the detailed content of Combine rule-based and machine learning approaches to build powerful hybrid systems. For more information, please follow other related articles on the PHP Chinese website!

Statement

This article is reproduced at:51CTO.COM. If there is any infringement, please contact admin@php.cn delete

Tool Calling in LLMsApr 14, 2025 am 11:28 AM

Large language models (LLMs) have surged in popularity, with the tool-calling feature dramatically expanding their capabilities beyond simple text generation. Now, LLMs can handle complex automation tasks such as dynamic UI creation and autonomous a

How ADHD Games, Health Tools & AI Chatbots Are Transforming Global HealthApr 14, 2025 am 11:27 AM

Can a video game ease anxiety, build focus, or support a child with ADHD? As healthcare challenges surge globally — especially among youth — innovators are turning to an unlikely tool: video games. Now one of the world’s largest entertainment indus

UN Input On AI: Winners, Losers, And OpportunitiesApr 14, 2025 am 11:25 AM

“History has shown that while technological progress drives economic growth, it does not on its own ensure equitable income distribution or promote inclusive human development,” writes Rebeca Grynspan, Secretary-General of UNCTAD, in the preamble.

Learning Negotiation Skills Via Generative AIApr 14, 2025 am 11:23 AM

Easy-peasy, use generative AI as your negotiation tutor and sparring partner. Let’s talk about it. This analysis of an innovative AI breakthrough is part of my ongoing Forbes column coverage on the latest in AI, including identifying and explaining

TED Reveals From OpenAI, Google, Meta Heads To Court, Selfie With MyselfApr 14, 2025 am 11:22 AM

The TED2025 Conference, held in Vancouver, wrapped its 36th edition yesterday, April 11. It featured 80 speakers from more than 60 countries, including Sam Altman, Eric Schmidt, and Palmer Luckey. TED’s theme, “humanity reimagined,” was tailor made

Joseph Stiglitz Warns Of The Looming Inequality Amid AI Monopoly PowerApr 14, 2025 am 11:21 AM

Joseph Stiglitz is renowned economist and recipient of the Nobel Prize in Economics in 2001. Stiglitz posits that AI can worsen existing inequalities and consolidated power in the hands of a few dominant corporations, ultimately undermining economic

What is Graph Database?Apr 14, 2025 am 11:19 AM

Graph Databases: Revolutionizing Data Management Through Relationships As data expands and its characteristics evolve across various fields, graph databases are emerging as transformative solutions for managing interconnected data. Unlike traditional

LLM Routing: Strategies, Techniques, and Python ImplementationApr 14, 2025 am 11:14 AM

Large Language Model (LLM) Routing: Optimizing Performance Through Intelligent Task Distribution The rapidly evolving landscape of LLMs presents a diverse range of models, each with unique strengths and weaknesses. Some excel at creative content gen

See all articles