Python library to extract HTML strings by CSS font-family?

Is there a Python library that extracts strings from HTML based on the CSS font-family property? I need it for font subsetting.

扔个三星炸死你 · 2749 days ago · 1087

Replies (2)

  • 我想大声告诉你 (2017-06-12 09:29:55)

    Your question is a bit vague. If you want to use CSS selectors to pull content out of HTML, you can use lxml.cssselect. There is Chinese-language documentation for this, and lxml is not the only option.

  • 巴扎黑 (2017-06-12 09:29:55)

    font-family only specifies which font to use.

    Is your goal to work out which Chinese characters appear in an HTML article, and then dynamically or semi-statically generate a smaller Chinese font containing only those characters for remote download and use?

    If you only need to count the Chinese characters, Python's built-in set is actually the simplest tool.
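
A rough sketch of the set-based counting described above. The helper name `cjk_chars` and the sample string are invented here, and the range check assumes that only the basic CJK Unified Ideographs block (U+4E00 to U+9FFF) matters; extension blocks and punctuation are ignored.

```python
def cjk_chars(text):
    """Collect the distinct Chinese characters that occur in text.

    Assumption: a character counts as Chinese if it lies in the basic
    CJK Unified Ideographs block, U+4E00 through U+9FFF.
    """
    return {ch for ch in text if "\u4e00" <= ch <= "\u9fff"}


if __name__ == "__main__":
    article = "你好,世界!你好"  # made-up sample text
    needed = cjk_chars(article)
    # The set removes duplicates, so its size is the number of distinct
    # characters the subset font would have to contain.
    print(len(needed), "".join(sorted(needed)))
```

The text passed in would be whatever was extracted from the HTML; the resulting set is the character list a subsetting step would need.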

    Generating the corresponding font file, however, is a big pitfall. Founder currently offers a similar service, which I believe is called Yunziku. I asked about the price once, and they candidly admitted that it has many problems.
