What is an example of a lexical error and is it possible that a language has no lexical errors?

2023-01-11 08:39 问答作者：

for our compiler 开发者_C百科theory class, we are tasked with creating a simple interpreter for our own designed programming language. I am using jflex and cup as my generators but i'm a bit stuck with what a lexical error is. Also, is it recommended that i use the state feature of jflex? it feels wrong as it seems like the parser is better suited to handling that aspect. and do you recommend any other tools to create the language. I'm sorry if i'm impatient but it's due on tuesday.

A lexical error is any input that can be rejected by the lexer. This generally results from token recognition falling off the end of the rules you've defined. For example (in no particular syntax):

[0-9]+   ===> NUMBER token
[a-zA-Z] ===> LETTERS token
anything else ===> error!

If you think about a lexer as a finite state machine that accepts valid input strings, then errors are going to be any input strings that do not result in that finite state machine reaching an accepting state.

The rest of your question was rather unclear to me. If you already have some tools you are using, then perhaps you're best to learn how to achieve what you want to achieve using those tools (I have no experience with either of the tools you mentioned).

EDIT: Having re-read your question, there's a second part I can answer. It is possible that a language could have no lexical errors - it's the language in which any input string at all is valid input.

A lexical error could be an invalid or unacceptable character by the language, like '@' which is rejected as a lexical error for identifiers in Java (it's reserved).

Lexical errors are the errors thrown by your lexer when unable to continue. Which means that there's no way to recognise a lexeme as a valid token for you lexer. Syntax errors, on the other side, will be thrown by your scanner when a given set of already recognised valid tokens don't match any of the right sides of your grammar rules.

it feels wrong as it seems like the parser is better suited to handling that aspect

No. It seems because context-free languages include regular languages (meaning than a parser can do the work of a lexer). But consider than a parser is a stack automata, and you will be employing extra computer resources (the stack) to recognise something that doesn't require a stack to be recognised (a regular expression). That would be a suboptimal solution.

NOTE: by regular expression, I mean... regular expression in the Chomsky Hierarchy sense, not a java.util.regex.* class.

lexical error is when the input doesn't belong to any of these lists: key words: "if", "else", "main"... symbols: '=','+',';'... double symbols: ">=", "<=", "!=", "++" variables: [a-z/A-Z]+[0-9]*
numbers: [0-9]*

examples: 9var: error, number before characters, not a variable and not a key word either. $: error

what I don't know is whether something like more than one symbol after each other is accepted, like "+-"

Compiler can catch an error when it has the grammar in it! It will depend on the compiler itself whether it has the capacity (scope) of catching the lexical errors or not. If is decided during the development of compiler what types of lexical error and how (according to the grammar) they are going to be handled. Usually all famous and mostly used compiler has this capabilities.

继续阅读：compiler-theory jflex

What is an example of a lexical error and is it possible that a language has no lexical errors?

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？