regular expression for anything but an empty string

Is it possible to use a regular expression to detect anything that is NOT an "empty string" like this:

string s1 = "";
string s2 = " ";
string s3 = "  ";
string s4 = "   ";

etc.

I know I could use trim etc. but I would like to use a regular expression.


^(?!\s*$).+

will match any string that contains at least one non-whitespace character.

So

if (Regex.IsMatch(subjectString, @"^(?!\s*$).+")) {
    // Successful match
} else {
    // Match attempt failed
}

should do this for you.

^ anchors the search at the start of the string.

(?!\s*$), a so-called negative lookahead, asserts that it's impossible to match only whitespace characters until the end of the string.

.+ will then actually do the match. It will match anything (except newline) up to the end of the string. If you want to allow newlines, you'll have to set the RegexOptions.Singleline option.
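
As a rough illustration of the points above, here is a minimal sketch that runs the pattern against a few inputs, with RegexOptions.Singleline set so that a string starting with a newline is still accepted. The names NonBlankDemo and IsNonBlank are made up for this example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NonBlankDemo
{
    // Hypothetical helper: true when the input contains at least one
    // non-whitespace character. Singleline lets '.' match '\n' too.
    public static bool IsNonBlank(string input)
    {
        return Regex.IsMatch(input, @"^(?!\s*$).+", RegexOptions.Singleline);
    }

    public static void Main()
    {
        string[] samples = { "", " ", "   ", "\t\n", "a", "  a  ", "\nabc" };
        foreach (string s in samples)
        {
            // Escape control characters so the output stays on one line.
            string shown = s.Replace("\n", "\\n").Replace("\t", "\\t");
            Console.WriteLine("[{0}] -> {1}", shown, IsNonBlank(s));
        }
    }
}
```

Without the Singleline option, the last sample ("\nabc") would be rejected, because .+ cannot consume the leading newline.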


Left over from the previous version of your question:

^\s*$

matches strings that contain only whitespace (or are empty).

The exact opposite:

^\S+$

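matches only strings that consist entirely of non-whitespace characters, one character minimum. Note that a string such as "a b" matches neither pattern, so the two are not logical complements.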
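
A small sketch of how the two patterns classify some inputs. The class and method names (WhitespaceClasses, IsBlank, IsAllNonWhitespace) are invented for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class WhitespaceClasses
{
    // True for "" and for strings made up of whitespace only.
    public static bool IsBlank(string s)
    {
        return Regex.IsMatch(s, @"^\s*$");
    }

    // True only when every character is non-whitespace (at least one character).
    public static bool IsAllNonWhitespace(string s)
    {
        return Regex.IsMatch(s, @"^\S+$");
    }

    public static void Main()
    {
        foreach (string s in new[] { "", "  ", "abc", "a b" })
        {
            Console.WriteLine("'{0}': blank={1}, allNonWhitespace={2}",
                s, IsBlank(s), IsAllNonWhitespace(s));
        }
    }
}
```

The "a b" sample is the interesting one: it is neither blank nor free of whitespace, so both checks report false.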


In .NET 4.0 and later, you can also call String.IsNullOrWhiteSpace.
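
For comparison, a minimal sketch of the non-regex route. The wrapper name HasContent is hypothetical; String.IsNullOrWhiteSpace itself is the framework method.

```csharp
using System;

public static class BuiltInCheck
{
    // Hypothetical wrapper: true when the string is non-null and
    // contains at least one non-whitespace character.
    public static bool HasContent(string s)
    {
        return !string.IsNullOrWhiteSpace(s);
    }

    public static void Main()
    {
        foreach (string s in new[] { null, "", "   ", "\t\n", " x " })
        {
            Console.WriteLine("{0} -> {1}", s == null ? "(null)" : "'" + s.Trim('\n', '\t') + "'", HasContent(s));
        }
    }
}
```

Unlike the regex versions, this also handles null without throwing.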


Assertions are not necessary for this. \S should work by itself as it matches any non-whitespace.


What about?

/.*\S.*/

This means

/ = delimiter
.* = zero or more of anything but newline
\S = anything except a whitespace (newline, tab, space)

so you get
match anything but newline + something not whitespace + anything but newline
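
A short sketch in C#, where the / delimiters are not used. It suggests that a bare \S and .*\S.* give the same yes/no answer and differ only in how much text the match covers. The helper name MatchedText is made up for this example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BareVersusPadded
{
    // Returns the matched text, or null when the pattern does not match.
    public static string MatchedText(string input, string pattern)
    {
        Match m = Regex.Match(input, pattern);
        return m.Success ? m.Value : null;
    }

    public static void Main()
    {
        string input = "  a  ";
        Console.WriteLine("'{0}'", MatchedText(input, @"\S"));      // 'a'
        Console.WriteLine("'{0}'", MatchedText(input, @".*\S.*"));  // '  a  '
        Console.WriteLine(MatchedText("   ", @".*\S.*") == null);   // True
    }
}
```

If only a boolean is needed, Regex.IsMatch with \S is enough; the surrounding .* matters only when the matched text itself is used.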


You can do one of two things:

  • match against ^\s*$; a match means the string is "empty"
    • ^, $ are the beginning and end of string anchors respectively
    • \s is a whitespace character
    • * is zero-or-more repetition of
  • find a \S; an occurrence means the string is NOT "empty"
    • \S is the negated version of \s (note the case difference)
    • \S therefore matches any non-whitespace character
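
The two options in the list above can be sketched side by side as follows. The class and method names are invented for illustration; the two checks are expected to be exact negations of each other.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TwoApproaches
{
    // Option 1: the string is "empty" when it is whitespace from start to end.
    public static bool IsEmptyByAnchors(string s)
    {
        return Regex.IsMatch(s, @"^\s*$");
    }

    // Option 2: the string is NOT "empty" when any non-whitespace character occurs.
    public static bool IsNotEmptyBySearch(string s)
    {
        return Regex.IsMatch(s, @"\S");
    }

    public static void Main()
    {
        foreach (string s in new[] { "", " ", "   ", "x", " x y " })
        {
            Console.WriteLine("'{0}': empty={1}, notEmpty={2}",
                s, IsEmptyByAnchors(s), IsNotEmptyBySearch(s));
        }
    }
}
```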

References

  • regular-expressions.info, Anchors, Repetition
  • MSDN - Character classes - Whitespace character \s
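    • Note that unless you're using RegexOptions.ECMAScript, \s also matches Unicode whitespace beyond ASCII, such as U+0085 (NEXT LINE, which shows up as an ellipsis in Windows-1252) and the separator characters in \p{Z}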
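
A minimal sketch of that difference, using a no-break space (U+00A0) and U+0085 as test inputs. The class and method names are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class WhitespaceFlavor
{
    // Checks whether the input is whitespace-only under the given options.
    public static bool IsBlank(string s, RegexOptions options)
    {
        return Regex.IsMatch(s, @"^\s*$", options);
    }

    public static void Main()
    {
        string nbsp = "\u00A0";      // NO-BREAK SPACE, Unicode category Zs
        string nextLine = "\u0085";  // NEXT LINE (NEL)

        Console.WriteLine(IsBlank(nbsp, RegexOptions.None));
        Console.WriteLine(IsBlank(nbsp, RegexOptions.ECMAScript));
        Console.WriteLine(IsBlank(nextLine, RegexOptions.None));
        Console.WriteLine(IsBlank(nextLine, RegexOptions.ECMAScript));
    }
}
```

With the default options both inputs should count as blank; with RegexOptions.ECMAScript, \s is limited to the ASCII whitespace set, so neither should.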

Related questions

  • .Net regex: what is the word character \w?


You could also use:

public static bool IsWhiteSpace(string s) 
{
    return s.Trim().Length == 0;
}


We can also use space in a char class, in an expression similar to one of these:

(?!^[ ]*$)^\S+$
(?!^[ ]*$)^\S{1,}$
(?!^[ ]{0,}$)^\S{1,}$
(?!^[ ]{0,1}$)^\S{1,}$

depending on the language/flavor that we might use.

Test

using System;
using System.Text.RegularExpressions;

public class Example
{
    public static void Main()
    {
        string pattern = @"(?!^[ ]*$)^\S+$";
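        // Join the candidates with '\n' so that source-code indentation
        // does not leak into the input as leading whitespace.
        string input = string.Join("\n", new[] {
            "", "abcd", "ABCD1234", "#$%^&*()_+={}", "abc def", "ABC 123", "   " });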
        RegexOptions options = RegexOptions.Multiline;

        foreach (Match m in Regex.Matches(input, pattern, options))
        {
            Console.WriteLine("'{0}' found at index {1}.", m.Value, m.Index);
        }
    }
}


If you wish to simplify, modify, or explore the expression, regex101.com explains it in its top-right panel. You can also watch there how it matches against some sample inputs.



I think [ ]{4} might work in the example where you need to detect 4 spaces. Same with the rest: [ ]{1}, [ ]{2} and [ ]{3}. If you want to detect an empty string in general, ^[ ]*$ will do.


Create a regular expression that detects an empty string, and then invert it. The complement of a regular language is itself a regular language. The regex library or tool you are using will probably let you negate a match; if not, you can always write your own.

grep --invert-match
