Linux Bash: Easily delete HTML table data blocks

I have an HTML file that is processed by a bash script, and I want to delete the empty tables from it. The file is generated from SQL statements, and each table's caption and header row are written even when the query returns no records. I want to delete every table for which no record was found.

<table border="1">
  <caption>Table with data</caption>
  <tr>
    <th align="center">type</th>
    <th align="center">column1</th>
    <th align="center">column2</th>
    <th align="center">column3</th>
    <th align="center">column4</th>
   </tr>
   
   Data rows exists here
   
  </table>

<table border="1">
  <caption>Empty Table To Remove</caption>
  <tr>
    <th align="center">type</th>
    <th align="center">column1</th>
    <th align="center">column2</th>
    <th align="center">column3</th>
    <th align="center">column4</th>
    <th align="center">column5</th>
    <th align="center">column6</th>
    <th align="center">column7</th>
  </tr>
</table>

<table border="1">
  <caption>Table with data</caption>
  <tr>
   <th align="center">type</th>
    <th align="center">column1</th>
    <th align="center">column2</th>
    <th align="center">column3</th>
    <th align="center">column4</th>
   </tr>
     Data rows exists here
  </table>

I tried using a combination of grep and sed to delete the empty table. This works when all the tables have the same number of columns, but my tables now have different numbers of columns.

With a fixed number of columns, I can loop over the headers, count them, and then delete. With a varying number of columns, this no longer works.

P粉787806024, 267 days ago

  • P粉242741921, 2024-04-03 00:19:04

    Like this, using xmlstarlet and sponge (from moreutils):

    $ xmlstarlet format -H file.html | sponge file.html
    $ xmlstarlet ed -d '//table[./caption/text()="Empty Table To Remove"]' file.html 

    In the output, the table captioned "Empty Table To Remove" is gone and the two tables captioned "Table with data" remain, with their header rows (type, column1 to column4) and data rows intact.
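
The command above selects the table by its caption text. If the caption is not known in advance, one option is to match on structure instead. The following is only a sketch: it assumes data rows are written as <td> cells (the question does not show them), and the sample file and temporary directory are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: delete every table that has no <td> cells, whatever its
# caption or column count. Assumes data rows use <td>.

# Hypothetical sample standing in for the generated report.
workdir=$(mktemp -d)
cat > "$workdir/sample.html" <<'EOF'
<table border="1">
  <caption>Table with data</caption>
  <tr><th>type</th><th>column1</th></tr>
  <tr><td>a</td><td>1</td></tr>
</table>
<table border="1">
  <caption>Empty Table To Remove</caption>
  <tr><th>type</th><th>column1</th><th>column2</th></tr>
</table>
EOF

# Step 1: parse the input as HTML and emit well-formed XML.
# Step 2: delete each <table> that has no <td> descendant.
xmlstarlet format -H "$workdir/sample.html" |
  xmlstarlet ed -d '//table[not(.//td)]' > "$workdir/cleaned.html"
```

Because the XPath tests for the absence of data cells rather than counting header lines, the number of columns in each table does not matter.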

    To edit the file in place, as with sed -i, use:

    xmlstarlet edit -L ...
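
A possible in-place variant, again only a sketch: strip_empty_tables is a hypothetical helper name, the .bak suffix is an arbitrary choice, and the XPath assumes data rows are written as <td> cells. It avoids sponge by reading from the backup copy.

```shell
#!/usr/bin/env bash
# Sketch: in-place cleanup that keeps a backup of the original file.
# strip_empty_tables is a hypothetical helper, not part of xmlstarlet.
strip_empty_tables() {
  local file=$1

  # Keep the original so a failed run can be undone by hand.
  cp -- "$file" "$file.bak" || return 1

  # Convert the HTML to well-formed XML. Reading from the backup avoids
  # reading and truncating the same file in one redirection.
  xmlstarlet format -H "$file.bak" > "$file" || return 1

  # -L edits the file in place, like sed -i; -d deletes matching nodes.
  xmlstarlet ed -L -d '//table[not(.//td)]' "$file"
}
```

It would be called as strip_empty_tables file.html; the untouched original stays next to it as file.html.bak.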

    Do not use sed or regular expressions to parse HTML/XML; use a parser-based tool such as xmlstarlet instead.
