Is .( ever legal in C# or VB.Net?

Can the sequence .( ever appear in C# or VB.Net code?

(Not in a string, comment, or XML literal, EDIT: or preprocessor directive)

I'm reasonably certain that the answer is no, but I'd like to make sure.


The only places that . appears in the grammar are:

real-literal:
    decimal-digits   .   decimal-digits ...
    .   decimal-digits ...

namespace-or-type-name:
    namespace-or-type-name   .   identifier ...


member-access:
    primary-expression   .   identifier ...
    predefined-type   .   identifier ...
    qualified-alias-member   .   identifier ...

base-access:
    base   .   identifier

unbound-type-name:
    unbound-type-name   .   identifier

qualified-identifier: 
    qualified-identifier   .   identifier

member-name:
    interface-type   .   identifier

indexer-declarator:
    type   interface-type   .   this   

(The ... means I have elided the remainder of the production rule.) In none of these cases is .( valid, because . is always followed by digits, an identifier, or the keyword this.


In C#, a #region segment allows any characters to follow it:

#region foo.(
// this is perfectly legal C#
#endregion

Note that in VB.Net this is not a concern because the region label has to be a valid string literal, so it has quotes:

#Region "foo.("
' quotes required
#End Region

It's also legal after #error and #warning, which have no VB equivalent.

The biggest concern, though, is that you can have any arbitrary code inside of an #if block. In C#:

#if false
    foo.( perfectly legal
#endif

In VB.Net:

#If False Then
    foo.( perfectly legal
#End If

It's actually worse than that, because the VB #If condition can be an arbitrary constant expression, so you can't know whether a block is active VB code unless you evaluate the condition. In other words, parsing alone is not sufficient; you have to evaluate too.
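For a tool that searches C# source for .( the directive cases above suggest dropping directive lines and inactive regions before scanning. Here is a minimal Python sketch under heavy assumptions: find_dot_paren is a hypothetical helper, it only understands // comments and regular "..." strings, and the only conditional it recognises is a literal #if false ... #endif. It does not evaluate conditions at all, which a real tool would have to do.

```python
import re


def find_dot_paren(source):
    """Return 1-based line numbers where '.(' appears in active C#-like code.

    Simplifying assumptions (this is not a real C# lexer): only '//'
    comments and regular "..." string literals are recognised, and the only
    conditional handled is a literal '#if false' ... '#endif'.
    """
    hits = []
    skip_depth = 0  # > 0 while inside an '#if false' region
    for lineno, line in enumerate(source.splitlines(), start=1):
        stripped = line.strip()
        if stripped.startswith("#"):
            directive = stripped[1:].strip()
            if skip_depth and directive.startswith("if"):
                skip_depth += 1  # nested #if inside a skipped region
            elif re.fullmatch(r"if\s+false", directive):
                skip_depth = 1
            elif skip_depth and directive.startswith("endif"):
                skip_depth -= 1
            continue  # directive lines (#region, #error, ...) are never scanned
        if skip_depth:
            continue
        # Blank out string literals first, then cut any trailing '//' comment.
        code = re.sub(r'"(?:\\.|[^"\\])*"', '""', line)
        code = code.split("//", 1)[0]
        if ".(" in code:
            hits.append(lineno)
    return hits
```

Anything this reports in real code would be a syntax error anyway, if the grammar argument holds; the point is only that the #region and #if cases have to be filtered out first.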

That said, analyzing the grammar in the C# Language Specification Version 4.0, Appendix B, the . character appears in the following lines:

real-literal:
    decimal-digits   .   decimal-digits   exponent-part_opt   real-type-suffix_opt
    .   decimal-digits   exponent-part_opt   real-type-suffix_opt

operator-or-punctuator:  one of
    {     }     [     ]     (     )     .     ,     :     ;

namespace-or-type-name:
    namespace-or-type-name   .   identifier   type-argument-list_opt

member-access:
    primary-expression   .   identifier   type-argument-list_opt
    predefined-type   .   identifier   type-argument-list_opt
    qualified-alias-member   .   identifier

base-access:
    base   .   identifier

unbound-type-name:
    unbound-type-name   .   identifier   generic-dimension-specifier_opt

qualified-identifier:
    qualified-identifier   .   identifier

member-name:
    interface-type   .   identifier

indexer-declarator:
    type   interface-type   .   this   [   formal-parameter-list   ]

Since a . is always followed by a decimal digit, an identifier, or a this token, the only way to have a .( sequence is to allow multiple operator-or-punctuator symbols next to each other. Looking up operator-or-punctuator, we see:

token:
    operator-or-punctuator

Since token is used only in the lexical grammar, nothing suggests that a . followed by a ( is legal in regular code.

Of course that still leaves comments, literals, etc. which I leave out because you already know about those.


No reference to the grammar and completely unscientific, but here's my guess:

.( is not legal in C# (can't speak for VB.NET).

Outside of comments and string literals, I think . can only appear as:

  1. The member access operator, which must be followed by an identifier. Since identifiers may not begin with (, this is a no go.
  2. As a decimal point in real literals, which must be followed by a digit. ( is not a digit.

Finally, the . operator is not overloadable, so foo.(bar) won't work either.


Having perused the VB reference, I’m now confident that the answer for VB is no.

VB uses the character . for only three things: in floating-point literals, for member access, and for nested name access.

Leaving aside XML literals, the only thing that may ever appear after a member access is an IdentifierOrKeyword (§1.105.6). Identifiers are very well-defined and they may only start with letters, underscores or, in the case of an escaped identifier, the character [.

The same goes for nested name access (and, for completeness’ sake, also in With blocks and field initialisers).

As for floating point literals, the point there must be followed by at least one more digit (§1.6.3).


On this page http://blogs.msdn.com/b/lucian/archive/2010/04/19/grammar.aspx I put a copy of the complete grammar for C#4 and VB10 in machine-readable format (EBNF & ANTLR) and human-readable (HTML). The HTML version includes the computed "may-follows" set for each token.

According to this, the "may-follows" set of PERIOD does not include LPAREN in either C#4 or VB10.
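To give an idea of how a "may-follows" set can be computed, here is a minimal Python sketch. Everything in it is an assumption for illustration: the TOY grammar is a tiny hand-written fragment, not the real C# or VB grammar, the function names are hypothetical, and the code assumes no production is empty.

```python
def first_sets(grammar):
    """FIRST sets for a grammar with no empty productions."""
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for nt, alternatives in grammar.items():
            for alt in alternatives:
                head = alt[0]
                new = first[head] if head in grammar else {head}
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first


def may_follow(grammar, target):
    """Terminals that can appear directly after `target` in some sentence."""
    first = first_sets(grammar)
    follow = {}  # symbol -> set of terminals that may come next
    changed = True
    while changed:  # iterate until no set grows any more
        changed = False
        for lhs, alternatives in grammar.items():
            for alt in alternatives:
                for i, sym in enumerate(alt):
                    if i + 1 < len(alt):
                        nxt = alt[i + 1]
                        new = first[nxt] if nxt in grammar else {nxt}
                    else:
                        # Last symbol: inherits whatever may follow the rule.
                        new = follow.get(lhs, set())
                    cur = follow.setdefault(sym, set())
                    if not new <= cur:
                        cur |= new
                        changed = True
    return follow.get(target, set())


# Toy fragment (an assumption): keys are nonterminals, everything else is a token.
TOY = {
    "expr": [["primary"], ["expr", ".", "IDENT"], ["expr", "(", "expr", ")"]],
    "primary": [["IDENT"], ["real"], ["base", ".", "IDENT"], ["(", "expr", ")"]],
    "real": [["DIGITS", ".", "DIGITS"], [".", "DIGITS"]],
}
```

In this toy fragment, may_follow(TOY, ".") contains only IDENT and DIGITS, while "(" does show up in may_follow(TOY, "IDENT"), which mirrors the shape of the argument for the full grammars.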

Unfortunately the grammars aren't quite complete. Nevertheless, within the VB/C# teams, these grammars are what we start with for a lot of our analysis. For instance...

  • VB10 introduced "single-line statement lambdas" of the form "Sub() STMT". A lambda itself is an expression, and may appear in a list e.g. "Dim array = {Sub() STMT1, Sub() STMT2}". We had to be aware of ambiguities about what came after an expression and what came after a statement. For instance, "Dim x = Sub() Dim y = 5, z = 3" is ambiguous because the "z=3" might be part of the first OR the second Dim.

  • VB10 introduced the "implicit line continuation" feature, which is more or less analogous to allowing C# to include SEMICOLON anywhere in the source code. We had to figure out whether this would introduce any ambiguities. That's equivalent to asking whether a prefix of any sentence in the language is also itself a valid sentence. That's equivalent to figuring out whether the intersection of two context-free languages is empty. It's an undecidable problem in general, but not in the case of the VB grammar, which we were able to decide with additional "human insight" into the algorithm.


I don't think they'll add anything like this to C#; it would just look plain wrong.

I am not at all sure about VB.Net, though. Just by looking at how they did generics, it seems that the VB.Net team doesn't have this "not looking weird" attitude.

So, if you build any kind of tool that should work with future versions of those languages, better watch out for VB.Net...
