How to remove punctuation from a String in C
I'm looking to remove all punctuation from a string and make all uppercase letters lowercase in C. Any suggestions?
Just a sketch of an algorithm using functions provided by ctype.h:
#include <ctype.h>

void remove_punct_and_make_lower_case(char *p)
{
    char *src = p, *dst = p;

    while (*src)
    {
        if (ispunct((unsigned char)*src))
        {
            /* Skip this character */
            src++;
        }
        else if (isupper((unsigned char)*src))
        {
            /* Make it lowercase */
            *dst++ = tolower((unsigned char)*src);
            src++;
        }
        else if (src == dst)
        {
            /* Increment both pointers without copying */
            src++;
            dst++;
        }
        else
        {
            /* Copy character */
            *dst++ = *src++;
        }
    }

    *dst = 0;
}
Standard caveats apply: Completely untested; refinements and optimizations left as exercise to the reader.
Loop over the characters of the string. Whenever you meet a punctuation character (ispunct), don't copy it to the output string. Whenever you meet an "alpha char" (isalpha), use tolower to convert it to lowercase.
All the mentioned functions are declared in <ctype.h>.
You can either do it in-place (by keeping separate write pointers and read pointers to the string), or create a new string from it. But this entirely depends on your application.
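For the "create a new string" variant, here is a minimal sketch. The function name strip_punct_copy is hypothetical, and the sketch assumes the caller is happy to own a heap-allocated result and free() it afterwards. It relies on the fact that the output can never be longer than the input, so one allocation of strlen(s) + 1 bytes is enough.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (the name is illustrative): returns a newly
   allocated copy of s with punctuation removed and letters lowercased.
   Returns NULL if allocation fails; the caller must free() the result. */
char *strip_punct_copy(const char *s)
{
    /* The result can never be longer than the input. */
    char *out = malloc(strlen(s) + 1);
    char *dst = out;

    if (out == NULL)
        return NULL;

    for (; *s; ++s) {
        /* <ctype.h> functions need a value representable as unsigned char */
        unsigned char c = (unsigned char)*s;
        if (!ispunct(c))
            *dst++ = (char)tolower(c);
    }
    *dst = '\0';
    return out;
}
```

Because the input is taken as const char *, this version also works on string literals, which the in-place approach cannot modify.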
The idiomatic way to do this in C is to have two pointers, a source and a destination, and to process each character individually, e.g.:
#include <ctype.h>

void reformat_string(char *src, char *dst) {
    for (; *src; ++src)
        if (!ispunct((unsigned char) *src))
            *dst++ = tolower((unsigned char) *src);
    *dst = 0;
}
src and dst can be the same string since the destination will never be larger than the source.
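As a rough illustration of the in-place case, the sketch below repeats reformat_string so that it compiles on its own and adds a small wrapper. The wrapper name reformat_in_place is hypothetical, not part of the answer above.

```c
#include <ctype.h>

/* Same function as above, repeated so this sketch compiles on its own. */
void reformat_string(char *src, char *dst)
{
    for (; *src; ++src)
        if (!ispunct((unsigned char)*src))
            *dst++ = (char)tolower((unsigned char)*src);
    *dst = 0;
}

/* Hypothetical convenience wrapper: cleans a writable buffer in place by
   passing the same pointer as both source and destination. */
void reformat_in_place(char *s)
{
    reformat_string(s, s);
}
```

The buffer has to be writable, for example an array declared as char buf[] = "Don't Panic!"; a pointer to a string literal must not be passed, since modifying a literal is undefined behavior.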
Although it's tempting, avoid calling tolower(*src++), since some pre-standard or non-conforming implementations provide tolower as a macro that evaluates its argument more than once.
Avoid solutions that search for characters to replace (using strchr or similar); they turn a linear algorithm into a quadratic one.
Here's a rough cut of an answer for you:
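#include <ctype.h>
#include <string.h>

void strip_punct(char *str) {
    size_t i;
    size_t p = 0;                 /* write index */
    size_t len = strlen(str);
    for (i = 0; i < len; i++) {
        /* Cast to unsigned char: a negative char is undefined for <ctype.h> functions */
        if (!ispunct((unsigned char)str[i])) {
            str[p] = (char)tolower((unsigned char)str[i]);
            p++;
        }
    }
    str[p] = '\0';                /* terminate the shortened string */
}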