Python program to search for specific strings in hash values (coding help)

2022-12-30 16:43 问答作者：

Trying to write a code that searches hash values for specific string's (input by user) and returns the hash if searchquery is present in that line.

Doing this to kind of just learn python a bit more, but it could be a real world application used by an HR department to search a .csv resume database for specific words in each resume.

I'd like this program to look through a .csv file that has three entries per line (id#;applicant name;resume text)

I set it up so that it creates a hash, then created a string for the resume text hash entry, and am trying to use the .find() function to return the entire hash for each instance.

What i'd like is if the word "gpa" is used as a search query and it is found in s['resumetext'] for three applicants(rows in .csv file), it prints the id, name, and res开发者_如何学运维ume for every row that has it.(All three applicants)

As it is right now, my program prints the first row in the .csv file(print resume['id'], resume['name'], resume['resumetext']) no matter what the searchquery is, whether it's in the resumetext or not.

lastly, are there better ways to doing this, by searching word documents, pdf's and .txt files in a folder for specific words using python (i've just started reading about the re module and am wondering if this may be the route, rather than putting everything in a .csv file.)

def find_details(id2find):
    resumes_f=open("resume_data.csv")
    for each_line in resumes_f:
        s={}
        (s['id'], s['name'], s['resumetext']) = each_line.split(";")
        resumetext = str(s['resumetext'])
        if resumetext.find(id2find):
            return(s)
        else:
            print "No data matches your search query. Please try again"

searchquery = raw_input("please enter your search term")
resume = find_details(searchquery)
if resume:
    print resume['id'], resume['name'], resume['resumetext']

The line

resumetext = str(s['resumetext'])

is redundant, because s['resumetext'] is already a string (since it comes as one of the results from a .split call). So, you can merge this line and the next into

if id2find in s['resumetext']: ...

Your following else is misaligned -- with it placed like that, you'll print the message over and over again. You want to place it after the for loop (and the else isn't needed, though it would work), so I'd suggest:

for each_line in resumes_f:
    s = dict(zip('id name resumetext'.split(), each_line.split(";"))
    if id2find in s['resumetext']:
        return(s)
print "No data matches your search query. Please try again"

I've also shown an alternative way to build dict s, although yours is fine too.

What @Justin Peel said. Also to be more pythonic I would say change

if resumetext.find(id2find) != -1: to if id2find in resumetext:

A few more changes: you might want to lower case the comparison and user input so it matches GPA, gpa, Gpa, etc. You can do this by doing searchquery = raw_input("please enter your search term").lower() and resumetext = s['resumetext'].lower(). You'll note I removed the explicit cast around s['resumetext'] as it's not needed.

One change that I recommend for your code is changing

if resumetext.find(id2find):

if resumetext.find(id2find) != -1:

because find() returns -1 if id2find wasn't in resumetext. Otherwise, it returns the index where id2find is first found in resumetext, which could be 0. As @Personman commented, this would give you the false positive because -1 is interpreted as True in Python.

I think that problem has something to do with the fact that find_details() only returns the first entry for which the search string is found in resumetext. It might be good to make find_details() into a generator instead and then you could iterate over it and print the found records out one by one.

继续阅读：full-text-search hash parsing python regex

Python program to search for specific strings in hash values (coding help)

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？