高洛峰2017-04-18 09:24:21
I changed your code so it should be simpler:
(Changed it based on @evian’s suggestion)
import os
import sys


def getinfo(filename):
    # Build an ID -> name mapping from a file with one "ID name" pair per line.
    info = {}
    with open(filename, 'r') as f:
        for line in f:
            ID, name = line.strip().split()
            info[ID] = name
    return info


def matchname(info, input_file, output_file):
    # Each input line holds three whitespace-separated fields: n1, n2, content.
    with open(input_file, 'r') as reader, open(output_file, 'w') as writer:
        for line in reader:
            n1, n2, content = line.strip().split()
            # Write one tab-separated row for every name found inside content.
            for ID, name in info.items():
                if name in content:
                    print(n1, n2, name, ID, sep='\t', file=writer)


if __name__ == '__main__':
    info_filename = 'aa.txt'
    content_filename = 'bb.txt'
    result_filename = 'final_output2.txt'
    info = getinfo(info_filename)
    matchname(info, content_filename, result_filename)
    print('done')
(Come back later to add explanation...)
Questions I answered: Python-QA
大家讲道理2017-04-18 09:24:21
The in operator is of course faster than the find method, because it avoids the attribute lookup and the method call that find requires and compiles to a single comparison operation, as the bytecode shows:
>>> def t():
...     return "abctestdef".find("testx")
...
>>> import dis
>>> dis.dis(t)
  2           0 LOAD_CONST               1 ('abctestdef')
              3 LOAD_ATTR                0 (find)
              6 LOAD_CONST               2 ('testx')
              9 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
             12 RETURN_VALUE
>>> def t():
...     return "test" in "abctestdef"
...
>>> dis.dis(t)
  2           0 LOAD_CONST               1 ('test')
              3 LOAD_CONST               2 ('abctestdef')
              6 COMPARE_OP               6 (in)
              9 RETURN_VALUE
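To check the difference on your own machine, a rough timing sketch with timeit could look like the following. The helper names contains_in, contains_find and measure are made up for this illustration, and the numbers will vary with the Python version and hardware; wrapping each test in a function adds the same call overhead to both sides.

```python
import timeit

HAYSTACK = "abctestdef"  # same sample string as in the disassembly above
NEEDLE = "test"


def contains_in(haystack, needle):
    """Substring test using the in operator."""
    return needle in haystack


def contains_find(haystack, needle):
    """Substring test using str.find, which returns -1 when absent."""
    return haystack.find(needle) != -1


def measure(func, number=200_000):
    """Return the total seconds taken by `number` calls of func."""
    return timeit.timeit(lambda: func(HAYSTACK, NEEDLE), number=number)


if __name__ == "__main__":
    print("in  :", measure(contains_in))
    print("find:", measure(contains_find))
```

Both helpers return the same answer for any input; only the time they take should differ.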
If you want to go faster, consider using Rust :-)
Also, your code could be written better: for file operations, use a with statement instead of closing the file manually.
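As a small sketch of that advice, the two functions below read the same file, once with a manual close and once with a with statement. The function names and the throwaway sample file are assumptions made for this example, not taken from the original question.

```python
import os
import tempfile


def read_lines_manual(path):
    """Manual style: close() has to be called explicitly, even on error."""
    f = open(path, "r")
    try:
        return [line.strip() for line in f]
    finally:
        f.close()


def read_lines_with(path):
    """with style: the file is closed automatically when the block exits."""
    with open(path, "r") as f:
        return [line.strip() for line in f]


if __name__ == "__main__":
    # Create a small throwaway file so the sketch runs on its own.
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
        tmp.write("1 alice\n2 bob\n")
    try:
        print(read_lines_manual(tmp.name))
        print(read_lines_with(tmp.name))
    finally:
        os.remove(tmp.name)
```

The with version is shorter and cannot forget the close, which is why it is usually preferred.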
PHPz2017-04-18 09:24:21
For a set, a membership test with in is O(1) on average.
For a list, the same test is O(n).
Try building a set when you assemble the data.
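A rough way to see the difference is to time the same lookup against a list and a set. The data and helper names below are invented for this sketch. Note that this only helps when you look up whole values; a substring test such as name in content on a string is a different operation and does not get this speedup.

```python
import timeit

# Hypothetical sample data: 10,000 distinct names.
names_list = [f"name{i}" for i in range(10_000)]
names_set = set(names_list)


def in_list(target):
    """Membership test on a list: scans elements one by one, O(n)."""
    return target in names_list


def in_set(target):
    """Membership test on a set: hash lookup, O(1) on average."""
    return target in names_set


if __name__ == "__main__":
    target = "name9999"  # last element, the worst case for the list scan
    print("list:", timeit.timeit(lambda: in_list(target), number=1_000))
    print("set :", timeit.timeit(lambda: in_set(target), number=1_000))
```

On typical hardware the set lookup should come out far faster, and the gap grows with the number of names.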