How can I pass a variable into MeCab for Python?
The code is:
import MeCab
m = MeCab.Tagger("-O wakati")
text = raw_input("Enter Japanese here: ")
print m.parse(text)
The problem is that after I enter the string at the raw_input prompt, IDLE gives an error:
Traceback (most recent call last):
File "C:\Users\---\Desktop\---\Python\japanesetest.py", line 5, in <module>
print m.parse(text)
File "C:\Users\---\Desktop\---\Python\lib\site-packages\MeCab.py", line 220...
def parse(self, *args): return _MeCab.Tagger_parse(self, *args)
TypeError: in method 'Tagger_parse', argument 2 of type 'char const *'
If I do this however:
import MeCab
m = MeCab.Tagger("-O wakati")
print m.parse('なるほど、マルコフ辞書のキーはタプルにしたほうがスッキリしますね。')
I get the proper result:
なるほど 、 マルコフ 辞書 の キー は タプル に し た ほう が スッキリ し ます ね 。
Things I have tried include unicode tags at the beginning, writing to a text file in unicode and parsing the text, and countless other things. I'm running Python 2.7 and MeCab 0.98. If this can't be answered, even a little light shed on the error would be appreciated.
I am able to run your snippet successfully using Python 2.7 and MeCab 0.98 in both IDLE and the IPython command line.
import MeCab
m = MeCab.Tagger("-O wakati")
text = raw_input("Enter Japanese here: ")
Enter Japanese here: 私の車はとても高いです。
print m.parse(text)
私 の 車 は とても 高い です 。
However, when reading from a UTF-encoded file I get errors when trying to parse the text. For those cases I explicitly encode the text to Shift-JIS. You might try this technique. Below is an example.
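import MeCab
# Read the raw bytes and decode them to unicode (assumes the file is UTF-8)
rawtext = open("UTF.file", "rb").read().decode('utf-8')
tagger = MeCab.Tagger()
# Re-encode as Shift-JIS before parsing, dropping characters it cannot represent
encoded_text = rawtext.encode('shift-jis', errors='ignore')
print tagger.parse(encoded_text).decode('shift-jis', errors='ignore')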
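One thing to keep in mind with this technique: errors='ignore' silently drops any character that Shift-JIS cannot represent, so the text handed to MeCab may be shorter than the original. Below is a rough, standard-library-only sketch of the transcoding step on its own. The `transcode` name and the sample string are made up for illustration, and UTF-8 as the source encoding is an assumption.

```python
# -*- coding: utf-8 -*-

def transcode(raw_bytes, source="utf-8", target="shift_jis"):
    """Decode bytes from `source`, then re-encode them as `target`.

    Characters with no representation in `target` are dropped,
    mirroring errors='ignore' in the answer above.
    """
    text = raw_bytes.decode(source)
    return text.encode(target, "ignore")


# Sample input: a kanji followed by a Hangul syllable, as UTF-8 bytes.
sample = u"\u8eca\ud55c".encode("utf-8")
converted = transcode(sample)

# The Hangul syllable is not in Shift-JIS, so only the kanji survives.
print(repr(converted.decode("shift_jis")))
```

If losing characters silently is a concern, passing "strict" instead of "ignore" makes the encode step raise an error rather than drop text.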
This is my current workaround, and should help people coming across the same issue:
import MeCab
import codecs
write_to = codecs.open("pholder.txt", "w", "utf-8")
text = raw_input("Please insert Japanese text here: ")
write_to.write(text)
write_to.close()
read_from = open('pholder.txt').read()
mecab = MeCab.Tagger("-Owakati")
print mecab.parse(read_from)
The key step here is adding .read() to the open() call. Why? Maybe you can tell me. :/
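A possible explanation, offered as a guess rather than a confirmed diagnosis: the TypeError says parse wants a 'char const *', meaning a byte string, and under IDLE on Python 2 raw_input may hand back a unicode object when the typed text cannot be encoded in the locale's encoding. Writing through codecs and reading back with a plain open(...).read() produces a byte string, which would explain why the workaround succeeds. The sketch below does the same conversion without the temporary file. `to_byte_string` is a hypothetical helper name, and UTF-8 is an assumption about the dictionary's encoding (a Shift-JIS dictionary would need "shift_jis" instead).

```python
# -*- coding: utf-8 -*-
from __future__ import print_function


def to_byte_string(text, encoding="utf-8"):
    """Return `text` as an encoded byte string (hypothetical helper).

    Byte strings pass through unchanged; unicode text is encoded.
    The default encoding is an assumption and should match the
    encoding of the installed MeCab dictionary.
    """
    if isinstance(text, bytes):  # `bytes` is an alias of `str` on Python 2
        return text
    return text.encode(encoding)


# Intended use on Python 2.7 with the MeCab binding installed:
#   import MeCab
#   m = MeCab.Tagger("-O wakati")
#   text = raw_input("Enter Japanese here: ")
#   print(m.parse(to_byte_string(text)))
```

Checking type(text) right after the raw_input call would show whether this guess applies: if it reports unicode rather than str, the encode step is what MeCab was missing.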