What does the bootstrap method mean?
The bootstrap method (also called bootstrapping or bootstrap sampling, and sometimes translated literally as the "self-help method") is a statistical technique based on uniform sampling with replacement from a given data set. Because each selected sample is put back before the next draw, it is just as likely as any other sample to be selected again, so the same sample can appear in the resulting training set more than once.
The bootstrap method was published by Bradley Efron in "The Annals of Statistics" in 1979. When a sample comes from a population that can be described by a normal distribution, the sampling distribution of a statistic such as the mean is also normal. When the population cannot be described by a normal distribution, the sampling distribution has to be approximated by other means, such as asymptotic analysis or bootstrapping. Bootstrapping does this by repeatedly drawing random samples with replacement from the observed data. It works well for small data sets.
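As an illustration of how resampling approximates a sampling distribution, here is a minimal sketch that estimates the standard error of the median. The function name `bootstrap_std_error`, the number of resamples, and the example data are all assumptions made for this illustration; they do not come from Efron's paper or from any particular library.

```python
import random
import statistics

def bootstrap_std_error(data, statistic, n_resamples=2000, seed=0):
    """Estimate the standard error of `statistic` by bootstrap resampling.

    Hypothetical helper written for illustration only.
    """
    rng = random.Random(seed)
    n = len(data)
    estimates = []
    for _ in range(n_resamples):
        # Draw n values uniformly with replacement from the observed data.
        resample = rng.choices(data, k=n)
        estimates.append(statistic(resample))
    # The spread of the resampled statistics approximates the standard error.
    return statistics.stdev(estimates)

# A small, skewed sample (made-up values) where a normal approximation is doubtful.
sample = [2.1, 3.4, 1.9, 8.7, 2.5, 3.0, 12.4, 2.2, 4.1, 2.8]
se_median = bootstrap_std_error(sample, statistics.median)
print(f"median = {statistics.median(sample)}, bootstrap SE = {se_median:.3f}")
```

The fixed seed only makes the run reproducible; in practice the estimate varies slightly from run to run.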
The most commonly used variant is the .632 bootstrap. Suppose the given data set contains d samples. The data set is sampled d times with replacement, producing a training set of d samples. As a result, some of the original samples are likely to appear several times in the training set, while others never appear at all. The samples that never enter the training set form the validation set (test set).
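The sampling procedure described above can be sketched as follows. This is a minimal illustration using only the standard library; the function name `bootstrap_split` and the toy data set are assumptions, not part of any established API.

```python
import random

def bootstrap_split(data, seed=None):
    """Split `data` into a bootstrap training set and an out-of-sample test set.

    Hypothetical helper written for illustration only.
    """
    rng = random.Random(seed)
    d = len(data)
    # Draw d indices uniformly with replacement.
    train_idx = [rng.randrange(d) for _ in range(d)]
    chosen = set(train_idx)
    # Training set: d samples, duplicates allowed.
    train = [data[i] for i in train_idx]
    # Test set: every sample that was never drawn.
    test = [data[i] for i in range(d) if i not in chosen]
    return train, test

data = list(range(1000))  # toy data set of d = 1000 distinct samples
train, test = bootstrap_split(data, seed=42)
print(len(train), len(set(train)), len(test))
```

The training set always has d entries, but only the distinct ones count toward coverage of the original data; the rest of the original samples land in the test set.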
In each draw, every sample is selected with probability 1/d, so the probability that it is not selected is 1 - 1/d. A sample is absent from the training set only if it is missed in all d draws, which happens with probability $(1 - 1/d)^d$. As d approaches infinity, this probability approaches $e^{-1} \approx 0.368$. About 36.8% of the original samples therefore end up in the test set, and the distinct samples in the training set cover approximately 63.2% of the original data set, which is where the name ".632" comes from.
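A quick numerical check makes the limit concrete. The sketch below simply evaluates the formula for growing d and compares it with 1/e; the function name `prob_never_selected` is an assumption made for this illustration.

```python
import math

def prob_never_selected(d):
    """Probability that one given sample is missed in all d draws with replacement."""
    return (1 - 1 / d) ** d

limit = math.exp(-1)  # e^(-1), the limiting value
for d in (10, 100, 1000, 100000):
    p = prob_never_selected(d)
    print(f"d = {d:>6}: (1 - 1/d)^d = {p:.5f}, difference from 1/e = {limit - p:.5f}")
```

Even for a data set of modest size, the value is already close to 0.368, so the 63.2% / 36.8% split is a good rule of thumb in practice.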