
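Java string search and matching problem?

I have more than 30,000 words and their parts of speech (verb, noun, adjective, adverb, and so on). I want to write a function that takes a word as a parameter, analyzes it, and returns its part of speech.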

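public int analyze(String word){
    // What should I use here to hold the 30,000 words?
    // What data structure or algorithm should decide whether word is among those 30,000 words?
    // How can I efficiently determine the part of speech of word?
    return wordType; // placeholder for the part of speech that was found
}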

The word-to-part-of-speech table is currently a plain txt file with one entry per line:

word1 t
word2 n
word3 a

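That's all. How should I do this? What should I use to store my 30,000 lines of data: txt, json, xml, or an array hard-coded in the source? Which one is fastest to loop over? Any good suggestions?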

阿神 2743 days ago 507

Replies (2)

  • 怪我咯 2017-04-18 10:53:54

    What is your specific use case? Will you query it frequently and care about lookup speed? If so, you can keep the data in memory in a map.

    If you don't query it often, you can split the words into several files by first letter, so that each query only has to read one file, which is faster. To be honest, 30,000 entries is quite small and shouldn't be slow to read. If the data set grows much larger, you can put it in a database.

    I'll reply again if I think of a better idea.

  • PHPz 2017-04-18 10:53:54

    You can try Hadoop's MapReduce.

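The in-memory map suggested in the first reply could be sketched roughly as below. This is only an illustration, not the repliers' code: the class name `WordTypeLookup`, the helper methods `fromLines` and `fromFile`, and the choice to return the tag as a `String` (with `null` for unknown words) rather than the `int` in the question are all assumptions. It expects the "word tag" line format shown in the question.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class; its name and the String return type are assumptions.
final class WordTypeLookup {
    // word -> part-of-speech tag, e.g. "word2" -> "n"
    private final Map<String, String> typeByWord = new HashMap<>();

    // Build the table from lines of the form "word tag".
    static WordTypeLookup fromLines(List<String> lines) {
        WordTypeLookup lookup = new WordTypeLookup();
        for (String line : lines) {
            String[] parts = line.trim().split("\\s+");
            if (parts.length == 2) { // skip blank or malformed lines
                lookup.typeByWord.put(parts[0], parts[1]);
            }
        }
        return lookup;
    }

    // Read the whole file once at startup and keep the table in memory.
    static WordTypeLookup fromFile(Path file) throws IOException {
        return fromLines(Files.readAllLines(file, StandardCharsets.UTF_8));
    }

    // Hash lookup instead of a loop; returns null when the word is not in the table.
    String analyze(String word) {
        return typeByWord.get(word);
    }

    public static void main(String[] args) {
        WordTypeLookup lookup = fromLines(List.of("word1 t", "word2 n", "word3 a"));
        System.out.println(lookup.analyze("word2"));   // prints "n"
        System.out.println(lookup.analyze("unknown")); // prints "null"
    }
}
```

With a table like this, the storage format matters mostly for load time at startup. Each later call to `analyze` is a single hash lookup, so nothing loops over the 30,000 entries per query.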