How to Efficiently Delete Rows from a Pandas DataFrame Based on a Column Value?

Mary-Kate Olsen
2024-12-18 14:06:10

Deleting DataFrame Rows in Pandas Based on a Column Value

Problem:

Consider a Pandas DataFrame with a column named line_race. The task is to remove all rows where the value in the line_race column is equal to 0.

Efficient Solution:

To efficiently remove rows based on a specific column value, use the following steps:

  1. Import the Pandas library:

    import pandas as pd
  2. Create the DataFrame with the given data:

    data = {
        "line_race": [11, 11, 9, 10, 10, 9, 8, 9, 11, 8, 5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
        "rating": [56, 67, 66, 83, 88, 52, 66, 70, 68, 72, 65, 70, 64, 70, 70, -1, -1, -1, -1, -1, 69, -1, -1, -1, -1],
        "rw": [1.000000, 1.000000, 1.000000, 0.880678, 0.793033, 0.636655, 0.581946, 0.518825, 0.486226, 0.446667, 0.164591, 0.142409, 0.134800, 0.117803, 0.113758, 0.109852, 0.098919, 0.093168, 0.083063, 0.075171, 0.048690, 0.045404, 0.039679, 0.034160, 0.030915],
        "wrating": [56.000000, 67.000000, 66.000000, 73.096278, 69.786942, 33.106077, 38.408408, 36.317752, 33.063381, 32.160051, 10.698423, 9.968634, 8.627219, 8.246238, 7.963072, -0.109852, -0.098919, -0.093168, -0.083063, -0.075171, 3.359623, -0.045404, -0.039679, -0.034160, -0.030915],
        "line_date": ["2007-03-31", "2007-03-10", "2007-02-10", "2007-01-13", "2006-12-23", "2006-11-09", "2006-10-22", "2006-09-29", "2006-09-16", "2006-08-30", "2006-02-11", "2006-01-13", "2006-01-02", "2005-12-06", "2005-11-29", "2005-11-22", "2005-11-01", "2005-10-20", "2005-09-27", "2005-09-07", "2005-06-12", "2005-05-29", "2005-05-02", "2005-04-02", "2005-03-13"]
    }
    
    df = pd.DataFrame(data)
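  3. Filter the DataFrame with the query() method. It returns a new DataFrame and is equivalent to the boolean-indexing form df[df["line_race"] != 0]. query() tends to be faster only on large frames (roughly 100,000 rows or more) with numexpr installed; on a small frame like this one, boolean indexing is usually at least as fast: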

    df_filtered = df.query("line_race != 0")
  4. Alternatively, drop the matching rows by index label with the drop() method. With the inplace parameter set to True, the call modifies df directly and returns None:

    df.drop(df.index[df['line_race'] == 0], inplace=True)
  5. query() leaves df unchanged, so assign its result back to the original variable or to a new one. drop() with inplace=True has already modified df, so no further assignment is needed.
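
As a minimal sketch of how these approaches relate, the example below applies boolean indexing, query(), and drop() to a small frame and keeps each result separately. The sample values and the variable names by_mask, by_query, and by_drop are assumptions for illustration; only the line_race column name comes from the article.

```python
import pandas as pd

# Small hypothetical frame; only the column name line_race comes from the article.
df = pd.DataFrame({
    "line_race": [11, 9, 0, 8, 0],
    "rating": [56, 66, 70, 66, -1],
})

# 1. Boolean indexing: keep rows where the mask is True; returns a new DataFrame.
by_mask = df[df["line_race"] != 0]

# 2. query(): the same filter written as an expression string; also returns a new DataFrame.
by_query = df.query("line_race != 0")

# 3. drop(): remove rows by index label; run on a copy so df itself stays intact here.
by_drop = df.copy()
by_drop.drop(by_drop.index[by_drop["line_race"] == 0], inplace=True)

print(by_mask)
```

All three results hold the same rows and keep the original index labels of the surviving rows. If a consecutive index is wanted afterwards, reset_index(drop=True) renumbers it.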

The updated DataFrame will no longer contain rows where the line_race column is equal to 0.
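
The relative speed of query() and boolean indexing depends on the frame size, the pandas version, and whether numexpr is installed, so a quick measurement on representative data is more reliable than a rule of thumb. The sketch below times both on a synthetic frame; the row count, the value range, and the helper names with_mask and with_query are arbitrary assumptions, not part of the original example.

```python
import timeit

import numpy as np
import pandas as pd

# Synthetic data: the row count and value range are arbitrary assumptions.
rng = np.random.default_rng(0)
big = pd.DataFrame({"line_race": rng.integers(0, 12, size=200_000)})


def with_mask():
    # Boolean indexing on the line_race column.
    return big[big["line_race"] != 0]


def with_query():
    # The same filter expressed through query().
    return big.query("line_race != 0")


# Both approaches must select exactly the same rows.
same_rows = with_mask().equals(with_query())

# Time a few repetitions of each; results vary by machine and environment.
mask_seconds = timeit.timeit(with_mask, number=5)
query_seconds = timeit.timeit(with_query, number=5)

print(f"boolean mask: {mask_seconds:.4f}s, query(): {query_seconds:.4f}s")
```

No expected timings are shown because they differ between environments; the point is to run the comparison on data of the size you actually work with.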
