Recommended resources for Mahout video tutorials

Sep 01, 2017 am 10:01 AM
Mahout provides scalable implementations of classic machine learning algorithms, aiming to help developers build intelligent applications more quickly and conveniently. Its implementations include clustering, classification, recommendation (collaborative filtering), and frequent itemset mining. Mahout can also scale out to the cloud by using the Apache Hadoop library.

The teacher’s teaching style:

The teacher's lectures are easy to follow, clearly structured, and rigorously argued. Each step builds on the previous one, so the logic of the argument holds students' attention and drives the lesson. By listening to the lectures, students not only learn the material but also get training in reasoning, and are influenced by the teacher's rigorous academic attitude.

The more difficult topic in this video series is the lesson "Logistic Regression Classifier_Bayes Classifier_1":

1. Background

To begin, here are a few questions. If you can answer all of them, you do not need to read this article, or you are reading it purely to find fault with it. That is welcome too: please send an email with the subject "Naive Bayesian of Faults" to 297314262@qq.com, and I will read it carefully.

If you still cannot answer the following questions after reading this article, please let me know by email and I will do my best to answer them.

Which characteristic of the naive Bayes classifier does the word "naive" refer to?

What is the relationship between the naive Bayes classifier, maximum likelihood estimation (MLE), and maximum a posteriori (MAP) estimation?

What is the relationship between naive Bayes classification, logistic regression classification, generative models, and discriminative models?

What is the relationship between supervised learning and Bayesian estimation?

2. Conventions

Before starting, here are the notational conventions used in this article.

Capital letters, such as X, represent random variables. If X is a multi-dimensional variable, the subscript i denotes its i-th dimension, that is, Xi.

Lowercase letters, such as xij, represent one value of a variable (the j-th value of Xi).

3. Bayesian estimation and supervised learning

Let us start with the fourth question: how can Bayesian estimation be used to solve supervised learning problems?

In supervised learning, the goal is to estimate a target function f: X->Y, or equivalently a target distribution P(Y|X), where X is the feature variable of a sample and Y is its actual classification result. Suppose a sample takes the value xk. By Bayes' theorem,

P(Y=yi|X=xk) = P(X=xk|Y=yi) P(Y=yi) / Σj P(X=xk|Y=yj) P(Y=yj)

So to estimate P(Y=yi|X=xk), it is enough to estimate every P(X=xk|Y=yi) and every P(Y=yi) from the samples. Classification then consists of finding the yi that maximizes P(Y=yi|X=xk). In this way, Bayesian estimation solves the supervised learning problem.
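The procedure in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not code from the video or from Mahout: the function names `fit_joint`, `posterior`, and `predict` and the toy samples are made up, and the probabilities are estimated by simple relative frequency.

```python
from collections import Counter

def fit_joint(samples):
    """Estimate P(Y=y) and P(X=x|Y=y) by relative frequency.

    samples: list of (x, y) pairs, where x is a hashable feature tuple.
    """
    class_counts = Counter(y for _, y in samples)
    pair_counts = Counter(samples)
    total = len(samples)
    prior = {y: c / total for y, c in class_counts.items()}
    likelihood = {(x, y): c / class_counts[y]
                  for (x, y), c in pair_counts.items()}
    return prior, likelihood

def posterior(x, prior, likelihood):
    """Bayes' theorem: P(y|x) = P(x|y) P(y) / sum_j P(x|y_j) P(y_j)."""
    joint = {y: likelihood.get((x, y), 0.0) * p for y, p in prior.items()}
    evidence = sum(joint.values())
    if evidence == 0:
        raise ValueError("x was never observed; the posterior is undefined")
    return {y: v / evidence for y, v in joint.items()}

def predict(x, prior, likelihood):
    """Return the class with the largest posterior probability."""
    post = posterior(x, prior, likelihood)
    return max(post, key=post.get)

# Hypothetical toy data: two binary features, two classes.
samples = [
    ((1, 0), "spam"), ((1, 0), "spam"),
    ((0, 1), "ham"), ((1, 0), "ham"), ((0, 1), "ham"), ((0, 1), "ham"),
]
prior, likelihood = fit_joint(samples)
post = posterior((1, 0), prior, likelihood)
label = predict((1, 0), prior, likelihood)
```

Note that this sketch estimates P(X=xk|Y=yi) for whole feature vectors, so any vector that never appears in the sample has no usable estimate.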

4. The "naive" property of the classifier

Next, the first question: what does "naive" mean?

From the analysis in Section 3, to obtain P(Y=yi|X=xk) we need estimates of every P(X=xk|Y=yi) and every P(Y=yi). Suppose X is an N-dimensional variable whose components each take two values, and Y also has two categories. Then 2*(2^N - 1) estimates of P(X=xk|Y=yi) are needed. (For a given category of Y, the probabilities of the 2^N possible values of X sum to 1, so only 2^N - 1 values per category actually have to be estimated.) When N is very large, as in text classification where the number of possible terms is very large, the amount of estimation required is enormous. So how can it be reduced to make the Bayesian estimation method feasible? An assumption is introduced here:
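To make the growth concrete, here is a small Python sketch of the count described above. The function name `full_joint_estimates` is made up for illustration, and it assumes binary features as in the example.

```python
def full_joint_estimates(n_features, n_classes=2):
    """Number of values to estimate for P(X|Y) with binary features.

    For each class, X has 2**n_features possible values whose
    probabilities sum to 1, so one of them is fixed by the others.
    """
    return n_classes * (2 ** n_features - 1)

for n in (2, 10, 30):
    print(n, full_joint_estimates(n))
```

With only 30 binary features the count already exceeds two billion, far more than any realistic training set could support.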

Assumption: given Y=yi, the components of X are conditionally independent of one another.

Under this assumption, P(X=xk|Y=yi) = P(X1=x1j1|Y=yi) P(X2=x2j2|Y=yi) ... P(Xn=xnjn|Y=yi). Now only N estimates are required for each category, that is, 2N in total for two categories. The assumption therefore reduces the number of values to estimate from 2*(2^N - 1) to 2N, which makes the classifier practical. This conditional independence assumption is the "naive" property.
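The factorization can be sketched as follows. This is a minimal illustration assuming binary features; the function name `naive_likelihood`, the table `theta`, and all the numbers in it are hypothetical.

```python
import math

def naive_likelihood(x, theta_y):
    """P(X=x|Y=y) under the naive assumption, for binary features.

    theta_y[i] is the estimate of P(X_i = 1 | Y = y); the likelihood of
    the whole vector is the product of the per-feature probabilities.
    """
    return math.prod(t if xi == 1 else 1.0 - t
                     for xi, t in zip(x, theta_y))

# Hypothetical per-feature estimates for two classes and N = 3 features.
theta = {"spam": [0.9, 0.2, 0.7], "ham": [0.1, 0.6, 0.3]}

x = (1, 0, 1)
lik_spam = naive_likelihood(x, theta["spam"])  # 0.9 * (1 - 0.2) * 0.7
lik_ham = naive_likelihood(x, theta["ham"])    # 0.1 * (1 - 0.6) * 0.3

# One stored value per feature per class.
n_params = sum(len(v) for v in theta.values())
```

The table holds one number per feature per class, so its size grows linearly with the number of features instead of exponentially.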

5. Maximum likelihood estimation and maximum a posteriori estimation

Next, the second question. We first apply maximum likelihood estimation to the solution process of the naive Bayes classifier.

As mentioned above, solving for P(X=xk|Y=yi) can be reduced to solving for P(X1=x1j1|Y=yi), P(X2=x2j2|Y=yi), ..., P(Xn=xnjn|Y=yi). How can maximum likelihood estimation be used to find these values?

First, we need to understand what maximum likelihood estimation is. In probability theory textbooks, maximum likelihood estimation is always explained in the context of unsupervised learning problems. After reading this section, you should understand that using maximum likelihood estimation to solve a supervised learning problem under the naive assumption amounts to using it to solve an unsupervised learning problem within each category.
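The last point can be illustrated with a short Python sketch. It assumes binary features, for which the maximum likelihood estimate of P(Xi=1|Y=yi) is simply the fraction of samples in that category with Xi=1. The function name `fit_naive_bayes_mle` and the toy data are made up for illustration.

```python
from collections import defaultdict

def fit_naive_bayes_mle(samples):
    """Maximum likelihood estimates for naive Bayes with binary features.

    samples: list of (x, y) pairs, where x is a tuple of 0/1 values.
    Returns (prior, theta), where theta[y][i] estimates P(X_i=1|Y=y).
    """
    # Split the labelled data by category; each group is then treated
    # as an ordinary unlabelled sample.
    by_class = defaultdict(list)
    for x, y in samples:
        by_class[y].append(x)

    total = len(samples)
    prior = {y: len(xs) / total for y, xs in by_class.items()}
    # Within a category, the MLE of each per-feature probability is the
    # sample mean of that feature.
    theta = {y: [sum(col) / len(xs) for col in zip(*xs)]
             for y, xs in by_class.items()}
    return prior, theta

# Hypothetical toy data: two binary features, two categories.
samples = [
    ((1, 0), "spam"), ((1, 1), "spam"),
    ((0, 1), "ham"), ((0, 0), "ham"), ((0, 1), "ham"), ((1, 1), "ham"),
]
prior, theta = fit_naive_bayes_mle(samples)
```

A plain maximum likelihood estimate can be exactly 0 or 1 when a feature value never varies within a category in the sample. Placing a prior on the parameters, which leads to maximum a posteriori estimation, is a common way to avoid this.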
