How to print the line numbers of incorrect words located in a txt file?
I have this piece of code, which only prints the position of each incorrect word within my list of incorrect words. I want it to print the line numbers where the incorrect words occur in the txt file. Can I modify this code to do that?
# text1 is my incorrect words
# words is my text file where my incorrect word are in
from collections import defaultdict
d = defaultdict(list)
for lineno, word in enumerate(text1):
    d[word].append(lineno)
print(d)
I've now done this, but it prints the position of the word in the text rather than the line it is on. This is the code:
import sys
import string
text = []
infile = open(sys.argv[1], 'r').read()
for punct in string.punctuation:
    infile = infile.replace(punct, "")
text = infile.split()
dict = open(sys.argv[2], 'r').read()
dictset = []
dictset = dict.split()
words = []
words = list(set(text) - set(dictset))
words = [text.lower() for text in words]
words.sort()
def allwords(line):
    return line.split()
def iswrong(word):
    return word in words
for i, line in enumerate(text):
    for word in allwords(line):
        if iswrong(word):
            print(word, i)
The output of that code is:
millwal 342
This is the position of the word in the text, not the line it is on.
I want it to print the line number, so what do I change in my code?
You could completely rewrite this code to do what you mention -- this code's structure has no relation whatsoever to what you require.
Since you need "line numbers from a text file", you'll need an object representing the text file (either as a list of lines in memory, or as an open file object). You say you have one called words (it's not clear if that's a filename or a Python variable identifier): having the text in a file called (say, as a variable) words and the (incorrect) words in a (collection of some kind) named text1 is a truly horrible choice of names, possibly the worst I've seen in many decades, and positively misleading. Use variable names that are a better match for the variables' meaning, unless you're trying to confuse yourself and everybody else.
Given a sensibly named variable for the input text, e.g. text = open('thefile.txt'), and a decent way to determine whether a word is incorrect, say a function def iswrong(word): ..., the way to code what you require becomes clear:
for i, line in enumerate(text):
    for word in allwords(line):
        if iswrong(word):
            print(word, i)
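To see why the code in the question reports a word position rather than a line number, here is a minimal sketch. The sample text and the misspelling "teh" are made up for illustration, and io.StringIO stands in for an open file. Passing start=1 to enumerate is an optional tweak that gives 1-based line numbers; the loop above counts from 0.

```python
import io

# Hypothetical sample standing in for the contents of the text file.
sample = "the quick brown fox\njumps over teh lazy dog\n"

# Reading everything and splitting gives a flat list of words,
# so enumerate() yields word positions, not line numbers.
flat_words = sample.split()
word_positions = [(w, i) for i, w in enumerate(flat_words) if w == "teh"]

# Iterating over a file-like object yields one line at a time,
# so enumerate() yields line numbers (start=1 makes them 1-based).
line_numbers = []
for lineno, line in enumerate(io.StringIO(sample), start=1):
    for w in line.split():
        if w == "teh":
            line_numbers.append((w, lineno))

print(word_positions)  # position of "teh" in the flat word list
print(line_numbers)    # line on which "teh" occurs
```

The only structural difference is what gets enumerated: a list of words in the first case, the lines of the file in the second.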
The allwords function could be just:
def allwords(line):
    return line.split()
if you have no punctuation (words just separated by whitespace), or
import re
def allwords(line):
    return re.findall(r'\w+', line)
using regular expressions.
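As a rough comparison of the two variants on a made-up line with punctuation (the sample sentence is an assumption, not from the question):

```python
import re

# Hypothetical input line containing punctuation.
line = "Hello, world! It's fine."

# Whitespace splitting keeps punctuation attached to the words.
by_split = line.split()

# \w+ keeps only runs of letters, digits and underscores,
# so punctuation is dropped, but "It's" is cut at the apostrophe.
by_regex = re.findall(r"\w+", line)

print(by_split)
print(by_regex)
```

Note that \w+ splits contractions such as "It's" into two pieces, so a pattern that allows apostrophes may be a better fit if the text contains them.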
If e.g. badwords is a set of incorrect words,
def iswrong(word):
    return word in badwords
or vice versa, if goodwords is the set of all correct words,
def iswrong(word):
    return word not in goodwords
The details of iswrong and allwords are secondary, as is the choice of whether to keep them as functions or just embed their code inline in the main stream of control.
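Putting the pieces together, one possible sketch looks like the following. The function name find_wrong_words, the sample lines and the small goodwords set are all hypothetical. Lowercasing before the lookup and 1-based line numbers are choices made here, not something the question specifies.

```python
import re

def allwords(line):
    # Words are runs of letters, digits and underscores.
    return re.findall(r"\w+", line)

def find_wrong_words(lines, goodwords):
    """Return (word, line_number) pairs for words not in goodwords."""
    results = []
    for lineno, line in enumerate(lines, start=1):
        for word in allwords(line):
            # Assumption: the dictionary words are stored in lowercase.
            if word.lower() not in goodwords:
                results.append((word, lineno))
    return results

# Hypothetical data standing in for the dictionary file and the text file.
goodwords = {"the", "team", "won", "at", "home"}
lines = ["The team won", "at millwal", "home"]

found = find_wrong_words(lines, goodwords)
for word, lineno in found:
    print(word, lineno)
```

With real files, lines could be an open file object (for example from open(sys.argv[1])) and goodwords could be built with set(open(sys.argv[2]).read().split()), following the command-line arguments used in the question.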