Ubuntu 14.04 installation
pip install xgboost
This fails with an error. Updating the package index first:
sudo apt-get update
and then retrying the install still produces the same error.
Solution:
sudo -H pip install --pre xgboost

Successfully installed xgboost
Cleaning up...
Success!
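As a quick sanity check after installing, you can import the library and print its version string; if this runs without an ImportError, the install worked:

```python
import xgboost

# If the import succeeds and a version string prints, the install worked.
print(xgboost.__version__)
```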
Overfitting
When you observe that training accuracy is high but test accuracy is low, you are most likely running into an overfitting problem.
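One common way to watch for this with xgboost is to evaluate on both the training set and a held-out set at every boosting round. The following is a minimal sketch, not a recipe from the original article: the data is synthetic and the parameter values are arbitrary. A training score that keeps improving while the test score stalls or worsens is the overfitting signal described above.

```python
import numpy as np
import xgboost as xgb

# Synthetic data purely for illustration; any real train/test split
# works the same way.
rng = np.random.RandomState(0)
X = rng.rand(1000, 20)
y = (X[:, 0] + 0.1 * rng.randn(1000) > 0.5).astype(int)

dtrain = xgb.DMatrix(X[:800], label=y[:800])
dtest = xgb.DMatrix(X[800:], label=y[800:])

# Evaluate on both sets at each round; xgboost prints the metric for
# every entry in the watchlist as training progresses.
params = {"objective": "binary:logistic", "max_depth": 6, "eta": 0.3}
watchlist = [(dtrain, "train"), (dtest, "test")]
bst = xgb.train(params, dtrain, num_boost_round=100, evals=watchlist)
```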
xgboost is a fast and effective boosting model.
A boosting classifier is an ensemble learning model. The basic idea is to combine hundreds or thousands of tree models, each with low classification accuracy on its own, into a single model with high accuracy. The model iterates continuously, generating a new tree at each iteration. Many methods have been proposed for how to generate a reasonable tree at each step; here we briefly introduce the Gradient Boosting Machine proposed by Friedman. It applies the idea of gradient descent when generating each tree: given all the trees generated so far, it takes one more step towards minimizing the given objective function. With reasonable parameter settings, we often need to generate a certain number of trees to achieve satisfactory accuracy, and when the dataset is large and complex, we may need thousands of iterations. If generating a single tree takes a few seconds, that many iterations leave you plenty of time to sit back and think quietly...
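To make the iteration idea concrete, here is a small sketch using scikit-learn's GradientBoostingClassifier rather than xgboost itself (it is a slower but convenient reference implementation of the same Friedman-style algorithm); the data is synthetic and the parameter values are illustrative:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

# Synthetic data purely for illustration.
X, y = make_classification(n_samples=2000, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# n_estimators is the number of trees, i.e. the number of boosting
# iterations; each new tree takes a gradient step against the loss
# given the trees built so far.
clf = GradientBoostingClassifier(n_estimators=500, learning_rate=0.1)
clf.fit(X_tr, y_tr)
print("test accuracy:", clf.score(X_te, y_te))
```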
Now we hope to address this problem better with the xgboost tool. The full name of xgboost is eXtreme Gradient Boosting; as the name suggests, it is a C++ implementation of the Gradient Boosting Machine. Its author is Chen Tianqi, who studies machine learning at the University of Washington. During his research he felt limited by the speed and accuracy of existing libraries, so he started the xgboost project a year ago, and it gradually took shape last summer. The biggest feature of xgboost is that it can automatically use the CPU's multi-threading for parallelization, while also improving the algorithm to increase accuracy. It made its debut in Kaggle's Higgs signal identification competition, where its outstanding efficiency and high prediction accuracy attracted wide attention on the contest forum, and it held its own in a fierce competition among more than 1,700 teams. As its popularity in the Kaggle community grew, a team recently won first place in a competition with the help of xgboost. To make it easier for everyone to use, Chen Tianqi wrapped xgboost in a Python library, and I was fortunate to work with him to create an R interface for it and submit it to CRAN. Some users have also wrapped it in a Julia library. The Python and R interfaces are updated constantly; you can get a general sense of the features below and then dig in with the language you are most familiar with.
Using the IPython notebook
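A minimal end-to-end example you might run in a notebook cell, using the native Python interface; the data here is random and purely illustrative, standing in for whatever dataset you load yourself:

```python
import numpy as np
import xgboost as xgb

# Random toy data; in a real notebook you would load your own dataset.
X = np.random.rand(100, 5)
y = np.random.randint(2, size=100)

dtrain = xgb.DMatrix(X, label=y)
params = {"objective": "binary:logistic"}
bst = xgb.train(params, dtrain, num_boost_round=10)

preds = bst.predict(dtrain)  # predicted probabilities for the positive class
print(preds[:5])
```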