Remove white spaces in character strings

I have a data frame imported from a text file:

                      dateTime contract bidPrice askPrice bidSize askSize
1 Tue Dec 14 05:12:20 PST 2010     CLF1        0        0       0       0
2 Tue Dec 14 05:12:20 PST 2010     CLG1        0        0       0       0
3 Tue Dec 14 05:12:20 PST 2010     CLH1        0        0       0       0
4 Tue Dec 14 05:12:20 PST 2010     CLJ1        0        0       0       0
5 Tue Dec 14 05:12:20 PST 2010     NGF1        0        0       0       0
6 Tue Dec 14 05:12:20 PST 2010     NGG1        0        0       0       0
  lastPrice lastSize volume
1         0        0      0
2         0        0      0
3         0        0      0
4         0        0      0
5         0        0      0
6         0        0      0

I tried to create a subset of all rows where contract is CLF1, but got the following error:

> clf1 <- data.frame(subset(train2, as.character(contract="CLF1")))
Error in as.character(contract = "CLF1") : 
  supplied argument name 'contract' does not match 'x'

I then checked how many characters are in the cell:

> f <-as.character(train2[1,2])
> nchar(f)
[1] 5

I assumed this was due to a leading or trailing space, so I tried the following:

> clf1 <- data.frame(subset(train2, as.character(contract=" CLF1")))
Error in as.character(contract = " CLF1") : 
  supplied argument name 'contract' does not match 'x'
> clf1 <- data.frame(subset(train2, as.character(contract="CLF1 ")))
Error in as.character(contract = "CLF1 ") : 
  supplied argument name 'contract' does not match 'x'

Again no luck, so here I am. Any suggestions would be great. Thank you.

EDIT:

> clf1 <- subset(train2, contract == "CLF1")
> head(clf1)
[1] dateTime  contract  bidPrice  askPrice  bidSize   askSize   lastPrice lastSize 
[9] volume   
<0 rows> (or 0-length row.names)


I believe your problem is that you are using the incorrect syntax in subset. Try this instead:

subset(train2, contract == "CLF1")

You shouldn't wrap the subset expression in as.character(), and you need the == equality operator, not =. Inside a function call, contract = "CLF1" is parsed as a named argument rather than a comparison, which is why R complains that 'contract' does not match 'x'. Read ?subset and compare the examples there with your code.
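To make the difference between == and = concrete, here is a minimal sketch. The data frame `toy` and its values are made up for illustration and only stand in for train2.

```r
# Hypothetical toy data standing in for train2
toy <- data.frame(
  contract = c("CLF1", "CLG1", "CLF1"),
  volume   = c(10, 20, 30),
  stringsAsFactors = FALSE
)

# == compares element-wise and yields the logical vector subset() needs
clf1 <- subset(toy, contract == "CLF1")
nrow(clf1)

# A single = inside a call names an argument instead of comparing values,
# so as.character() is handed an argument called 'contract' and fails
msg <- tryCatch(as.character(contract = "CLF1"),
                error = function(e) conditionMessage(e))
msg
```

Note that subset() already returns a data frame, so the extra data.frame() wrapper in the question is not needed.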

That said, your field may indeed contain a leading space, in which case try:

subset(train2, contract == " CLF1")

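Alternatively, when you read the file in with read.table, you can set strip.white = TRUE to strip leading and trailing whitespace from unquoted character fields. Note that strip.white is only used when sep has been specified.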
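Both approaches can be sketched as follows. The data here is invented and assumes the padding is a single leading space, and the comma separator in the read.table call is likewise an assumption, since the real file's layout isn't shown. trimws() is available in base R from version 3.2.0; on older versions gsub("^\\s+|\\s+$", "", x) does the same job.

```r
# Hypothetical data with a leading space in the contract column
padded <- data.frame(
  contract = c(" CLF1", " CLG1", " CLF1"),
  volume   = c(10, 20, 30),
  stringsAsFactors = FALSE
)

# No rows match while the padding is still there
nrow(subset(padded, contract == "CLF1"))

# Option 1: trim the column after import
padded$contract <- trimws(padded$contract)
clf1 <- subset(padded, contract == "CLF1")
nrow(clf1)

# Option 2: strip whitespace at read time (requires sep to be given)
txt <- "contract,volume\n CLF1,10\n CLG1,20"
fromFile <- read.table(text = txt, header = TRUE, sep = ",",
                       strip.white = TRUE, stringsAsFactors = FALSE)
nchar(fromFile$contract)
```

Trimming at read time keeps the rest of the script free of whitespace handling, while trimws() is handy when the data frame is already loaded.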
