How can I escape latex code received through user input?

2023-04-02 03:48

I read in a string from a GUI textbox entered by the user and process it through pandoc. The string contains latex directives for math which have backslash characters. I want to send in the string as a raw string to pandoc for processing. But something like "\theta" becomes a tab and "heta".

How can I convert a string literal that contains backslash characters to a raw string...?

Edit:

Thanks develerx, flying sheep and unutbu. But none of the solutions seem to help me. The reason is that there are other backslashed-characters which do not have any effect in python but do have a meaning in latex.

For example '\lambda'. All the methods suggested produce

\\lambda

which does not go through in latex processing -- it should remain as \lambda.

Another edit:

If I can get this to work, I think I should be through. @Mark: All three methods give answers that I don't want.

a='\nu + \lambda + \theta'; 
b=a.replace(r"\\",r"\\\\"); 
c='%r' %a; 
d=a.encode('string_escape');
print a

u + \lambda +   heta
print b

u + \lambda +   heta
print c
'\nu + \\lambda + \theta'
print d
\nu + \\lambda + \theta

Python’s raw strings are just a way to tell the Python interpreter that it should treat backslashes in a string literal as literal backslashes instead of escape characters. If you read strings entered by the user, they are already past the point where they could have been raw. Also, user input is most likely read in literally, i.e. “raw”.

This means the interpreting happens somewhere else. But if you know that it happens, why not escape the backslashes for whatever is interpreting it?

s = s.replace("\\", "\\\\")

(Note that you can't do r"\" as “a raw string cannot end in a single backslash”, but I could have used r"\\" as well for the second argument.)

If that doesn’t work, your user input is for some arcane reason interpreting the backslashes, so you’ll need a way to tell it to stop that.
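To make the point above concrete, here is a minimal Python 3 sketch. It assumes `io.StringIO` as a stand-in for the GUI textbox (that substitution is mine, not part of the question) and checks that text arriving at runtime already matches the raw literal, so there is nothing left to "convert".

```python
import io

# Stand-in for text typed into a GUI textbox. io.StringIO is only an
# assumption so that the sketch runs without a GUI.
fake_textbox = io.StringIO("\\nu + \\lambda + \\theta")
user_text = fake_textbox.read()

# A raw literal spelling the same characters.
raw_literal = r"\nu + \lambda + \theta"

print(user_text)                  # single backslashes, as typed
print(repr(user_text))            # repr doubles them for display only
print(user_text == raw_literal)   # True
print(len(user_text))             # 22
```

If the same comparison fails for the string your GUI returns, printing `repr()` of that string should show which characters were altered and where.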

If you want the escaped (repr-style) form of an existing string, you can build it with %r formatting:

s1 = "welcome\tto\tPython"
raw_s1 = "%r"%s1
print(raw_s1)

This prints the following (note that %r also adds the surrounding quotes):

'welcome\tto\tPython'

a='\nu + \lambda + \theta'
d=a.encode('string_escape').replace('\\\\','\\')
print(d)
# \nu + \lambda + \theta

This shows that there is a single backslash before the n, l and t:

print(list(d))
# ['\\', 'n', 'u', ' ', '+', ' ', '\\', 'l', 'a', 'm', 'b', 'd', 'a', ' ', '+', ' ', '\\', 't', 'h', 'e', 't', 'a']
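The `string_escape` codec above exists only in Python 2. A rough Python 3 counterpart, assuming `unicode_escape` is an acceptable substitute, might look like the sketch below. The helper name `undo_literal_escapes` is hypothetical. Be aware that this only undoes escapes whose escaped form matches what was typed (`\n`, `\t`, `\r`); something like `'\alpha'` has already become a bell character and comes back as `\x07lpha`.

```python
def undo_literal_escapes(text):
    """Turn interpreted escapes (newline, tab) back into backslash sequences.

    Hypothetical helper; a sketch, not a general-purpose fix.
    """
    escaped = text.encode("unicode_escape").decode("ascii")
    # unicode_escape doubles real backslashes; collapse them again.
    return escaped.replace("\\\\", "\\")


# What '\nu + \lambda + \theta' becomes when written as a normal literal:
# a newline, a real backslash before "lambda", and a tab.
a = "\nu + \\lambda + \theta"
d = undo_literal_escapes(a)
print(d)  # \nu + \lambda + \theta

# Limitation: '\a' was already interpreted as the bell character.
print(undo_literal_escapes("\alpha"))  # \x07lpha
```

Because of that limitation, this is best treated as a repair for strings that were mistakenly written as normal literals in source code, not as something to apply to user input.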

There is something funky going on with your GUI. Here is a simple example of grabbing some user input through a Tkinter.Entry. Notice that the text retrieved only has a single backslash before the n, l, and t. Thus no extra processing should be necessary:

import Tkinter as tk

def callback():
    print(list(text.get()))

root = tk.Tk()
root.config()

b = tk.Button(root, text="get", width=10, command=callback)

text=tk.StringVar()

entry = tk.Entry(root,textvariable=text)
b.pack(padx=5, pady=5)
entry.pack(padx=5, pady=5)
root.mainloop()

If you type \nu + \lambda + \theta into the Entry box, the console will (correctly) print:

['\\', 'n', 'u', ' ', '+', ' ', '\\', 'l', 'a', 'm', 'b', 'd', 'a', ' ', '+', ' ', '\\', 't', 'h', 'e', 't', 'a']

If your GUI is not returning similar results (as your post seems to suggest), then I'd recommend looking into fixing the GUI problem, rather than mucking around with string_escape and string replace.

When you read the string from the GUI control, it is already a "raw" string. If you print out the string you might see the backslashes doubled up, but that's an artifact of how Python displays strings; internally there's still only a single backslash.

>>> a='\nu + \lambda + \theta'
>>> a
'\nu + \\lambda + \theta'
>>> len(a)
20
>>> b=r'\nu + \lambda + \theta'
>>> b
'\\nu + \\lambda + \\theta'
>>> len(b)
22
>>> b[0]
'\\'
>>> print b
\nu + \lambda + \theta

I spent a lot of time trying different answers from all around the internet, and I suspect the reason one thing works for some people and not for others comes down to small differences in how it is applied. For context, I needed to read in file names from a csv file that had strange and/or unmappable unicode characters and write them to a new csv file. For what it's worth, here's what worked for me:

s = '\u00e7\u00a3\u0085\u00e5\u008d\u0095' # csv freaks if you try to write this
s = repr(s.encode('utf-8', 'ignore'))[2:-1]
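For reference, here is a small sketch of what that one-liner yields, assuming Python 3 (where `repr()` of a bytes object starts with `b'`, which is what the `[2:-1]` slice strips). The variable name `escaped` is mine.

```python
s = '\u00e7\u00a3\u0085\u00e5\u008d\u0095'

# Encode to UTF-8 bytes, take the bytes repr, and strip the leading b'
# and the trailing quote. What remains is plain ASCII text.
escaped = repr(s.encode('utf-8', 'ignore'))[2:-1]

print(escaped)  # \xc3\xa7\xc2\xa3\xc2\x85\xc3\xa5\xc2\x8d\xc2\x95
print(all(ord(ch) < 128 for ch in escaped))  # True
```

The result is a textual description of the UTF-8 bytes, not the original characters, so it is safe to write out but needs to be decoded again by anyone who wants the original file name back.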
