
Some Python installation methods

高洛峰 (Original)
2017-03-09 09:53:53

xgboost series

Ubuntu 14.04 installation

pip install xgboost

This reports an error. Try updating the package lists first:

sudo apt-get update

Then re-run the install; the result is the same error.

Solution:

sudo -H pip install --pre xgboost

Successfully installed xgboost
Cleaning up...

Success!
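To confirm the installation actually worked, a quick sanity check from a Python shell helps (assuming pip installed into the interpreter you are running; in very old builds the version attribute may be absent, in which case a clean import is already a good sign):

# Sanity check: a clean import means the package is visible to this interpreter.
import xgboost
print(xgboost.__version__)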

Overfitting
When you observe that training accuracy is high but test accuracy is low, you are most likely encountering an overfitting problem.
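As a concrete illustration, here is a minimal sketch of how that gap shows up with xgboost's scikit-learn wrapper. The synthetic dataset, the train/validation split, and the deliberately deep model are all assumptions made for this example:

# Compare training vs. validation accuracy to spot overfitting.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_va, y_tr, y_va = train_test_split(X, y, test_size=0.25, random_state=0)

# A deep model trained for many rounds is more prone to memorizing the training set.
model = XGBClassifier(max_depth=10, n_estimators=500, learning_rate=0.3)
model.fit(X_tr, y_tr)

print("train accuracy:", model.score(X_tr, y_tr))  # typically close to 1.0
print("valid accuracy:", model.score(X_va, y_va))  # noticeably lower when overfitting

A large gap between the two numbers is the symptom described above; shallower trees, fewer boosting rounds, or early stopping are the usual remedies.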

xgboost is a fast and effective boosting model.
A boosting classifier is an ensemble learning model. The basic idea is to combine hundreds or thousands of tree models, each with low classification accuracy on its own, into one model with high accuracy. The model keeps iterating, generating a new tree on each iteration. Many methods have been proposed for how to generate a reasonable tree at each step; here we briefly introduce the Gradient Boosting Machine proposed by Friedman. It uses the idea of gradient descent when generating each tree: given all previously generated trees, it takes one more step toward minimizing the given objective function. Under reasonable parameter settings, we often need to generate a fair number of trees to reach satisfactory accuracy, and when the data set is large and complex, we may need thousands of iterations. If generating one tree model takes a few seconds, that many iterations leaves you plenty of time to sit back and meditate...
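To make the iterative idea concrete, here is a toy sketch of gradient boosting for squared-error loss, where each new tree is fit to the residuals (the negative gradient) left by all the trees generated so far. This illustrates the principle only; it is not xgboost's actual implementation:

# Toy gradient boosting for squared-error loss: each tree fits the residuals
# (negative gradient) left by the ensemble built so far.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def gradient_boost(X, y, n_trees=100, learning_rate=0.1, max_depth=3):
    pred = np.full(len(y), y.mean())      # start from the mean prediction
    trees = []
    for _ in range(n_trees):
        residual = y - pred               # negative gradient of squared error
        tree = DecisionTreeRegressor(max_depth=max_depth)
        tree.fit(X, residual)
        pred += learning_rate * tree.predict(X)  # take one small step
        trees.append(tree)
    return trees, pred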

Now, we hope to solve this problem better with the xgboost tool. The full name of xgboost is eXtreme Gradient Boosting; as the name suggests, it is a C++ implementation of the Gradient Boosting Machine. Its author is Chen Tianqi, who studies machine learning at the University of Washington. During his research he felt limited by the speed and accuracy of existing libraries, so a year ago he started the xgboost project, which took shape last summer. The biggest feature of xgboost is that it automatically uses CPU multi-threading for parallelization, while also improving the algorithm to increase accuracy. Its debut was Kaggle's Higgs signal identification competition, where its outstanding efficiency and high prediction accuracy attracted wide attention on the competition forum, and it earned a place in a fierce field of more than 1,700 teams. As its popularity in the Kaggle community grew, a team recently won first place in a competition with the help of xgboost. To make it easier to use, Chen Tianqi packaged xgboost as a Python library; I was fortunate to work with him on an R-language interface for xgboost and submit it to CRAN, and some users have also wrapped it as a Julia library. The Python and R interfaces are constantly being updated; you can get a sense of the general functionality below, then pick the language you are most familiar with to learn.

IPython notebook

To use the IPython notebook, enter the following directly at the command line:

ipython notebook
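Once inside a notebook, a first xgboost run might look like the sketch below. The tiny random dataset and the parameter values are illustrative assumptions, not from the original article; the DMatrix-plus-parameter-dict pattern is xgboost's native Python API.

# Minimal sketch of xgboost's native Python API (toy data, assumed parameters).
import numpy as np
import xgboost as xgb

X = np.random.rand(100, 5)                  # 100 samples, 5 features
y = (X[:, 0] + X[:, 1] > 1).astype(int)     # toy binary labels

dtrain = xgb.DMatrix(X, label=y)            # xgboost's internal data format
params = {"objective": "binary:logistic", "max_depth": 3, "eta": 0.3}
bst = xgb.train(params, dtrain, num_boost_round=10)

preds = bst.predict(dtrain)                 # predicted probabilities in [0, 1]
print(preds[:5])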


