Extracting a string from an HTML page and converting it to a DataFrame in Python

I have an HTML page that contains a single string, and I want to convert it to a pandas DataFrame. Rows in the string are separated by spaces that fall outside quotes (column values inside quotes can themselves contain spaces).

Page link: https://gladys.geog.ucl.ac.uk/bikesapi/load.php?scheme=saopaulo

I know this is a common question, so if there is an exact duplicate with the same problem and solution, please send me the link. I have tried several solutions, but none of them matched my problem.
P粉691461301, 469 days ago

Replies (1)

  • P粉775788723

    2023-08-17 10:20:21

    Try pd.read_csv, which can read directly from a URL:

    import pandas as pd

    url = "https://gladys.geog.ucl.ac.uk/bikesapi/load.php?scheme=saopaulo"

    # read_csv accepts a URL and downloads the text before parsing it
    df = pd.read_csv(url)
    print(df.head())
    

    Output:

       #id timestamp|gmt_local_diff_sec|gmt_servertime_diff_sec                   name        lat        lon  bikes  spaces  installed  locked  temporary  total_docks  givesbonus_acceptspedelecs_fbbattlevel  pedelecs
    0    1                               1692123219|10800|-3600    1 - Largo da Batata -23.566831 -46.693741     43      37       True   False      False           83                                     NaN        10
    1    3                               1692123219|10800|-3600     3 - CPTM Pinheiros -23.566478 -46.701258      6       7       True   False      False           15                                     NaN         3
    2    4                               1692123219|10800|-3600  4 - Rua Diogo Moreira -23.569145 -46.692003      2      20       True   False      False           23                                     NaN         2
    3    5                               1692123219|10800|-3600        5 - Chicão Vive -23.569894 -46.697897      4       7       True   False      False           11                                     NaN         1
    4    6                               1692123219|10800|-3600        6 - Rua Manduri -23.572137 -46.690107     10       7       True   False      False           19                                     NaN         0
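
In the output above, the second column packs three fields into one, separated by "|" (timestamp|gmt_local_diff_sec|gmt_servertime_diff_sec). If separate columns are wanted, something like the sketch below might work. It does not hit the network: the `sample` string is a hypothetical stand-in for the raw response, built from values in the output above, and its comma-separated layout with quoted names is an assumption rather than something confirmed by the page. The variable names `sample`, `combined`, and `parts` are only illustrative.

```python
import io

import pandas as pd

# Hypothetical sample of the raw text (assumed comma-separated with quoted names).
# Values are taken from the printed output; only a few columns are kept.
sample = '''#id,timestamp|gmt_local_diff_sec|gmt_servertime_diff_sec,name,bikes,spaces
1,1692123219|10800|-3600,"1 - Largo da Batata",43,37
3,1692123219|10800|-3600,"3 - CPTM Pinheiros",6,7
'''

# read_csv also accepts file-like objects, so a string can be wrapped in StringIO.
df = pd.read_csv(io.StringIO(sample))

combined = "timestamp|gmt_local_diff_sec|gmt_servertime_diff_sec"

# Split the packed column on "|" into three integer columns.
parts = df[combined].str.split("|", expand=True).astype("int64")
parts.columns = combined.split("|")

# Swap the packed column for the three separate ones.
df = pd.concat([df.drop(columns=combined), parts], axis=1)

# Interpret the timestamp as Unix seconds.
df["timestamp"] = pd.to_datetime(df["timestamp"], unit="s")

print(df)
```

With the real data, the same split should apply to the frame returned by pd.read_csv(url), provided the column name matches the one shown in the output.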
    
