bind_rows() error: Error in `bind_rows()`: ! Can't combine `..1$comment_id` <character> and `..2$comment_id` <integer>

I am running a fairly long function that processes Reddit comment data retrieved with RedditExtractoR:

# Create an empty data frame to store the thread information
threads_df = data.frame(date = character(), title = character(), url = character(), subreddit = character())

# Bind threads

threads_df = bind_rows(
  data.frame(threads1, subreddit = "SSBM"),
  data.frame(threads2, subreddit = "funny"),
  data.frame(threads3, subreddit = "meltyblood"),
  data.frame(threads4, subreddit = "bloomington")
)

# Get the comments from each thread
comments_df = data.frame()
for (i in 1:nrow(threads_df)) {
  result = get_thread_content(threads_df$url[i])[[2]]
  result$subreddit = threads_df$subreddit[i]
  comments_df = bind_rows(comments_df, result)
  print(paste("Completed thread", i, "of", nrow(threads_df)))
  if (nrow(result) == 0) {
    stop("Failed to retrieve comments for thread ", i)
  }
  if (i %% 100 == 0) {
    print("Checking for timeouts")
    Sys.sleep(10)
  }
}

After getting to thread 115/977, I am greeted with the following error:

Error in `bind_rows()`:
! Can't combine `..1$comment_id` <character> and `..2$comment_id` <integer>.
---
Backtrace:
 1. dplyr::bind_rows(comments_df, result)
 4. vctrs::vec_rbind(!!!dots, .names_to = .id)
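
The message says that the two data frames being combined disagree on the type of `comment_id`. A minimal sketch that reproduces the same class of error with made-up data is shown here. The frames `a` and `b` are hypothetical stand-ins, and it is only an assumption that a thread returns an integer `comment_id` in this way:

```r
library(dplyr)

# Hypothetical stand-ins for the comments of two threads:
# in `a` comment_id is character, in `b` it is integer.
a <- data.frame(comment_id = c("1", "1_1"), comment = c("x", "y"))
b <- data.frame(comment_id = 2L, comment = "z")

# Inspect the column types that bind_rows() will try to reconcile
class_a <- class(a$comment_id)
class_b <- class(b$comment_id)

# Capture the error message instead of stopping the script
res <- tryCatch(
  bind_rows(a, b),
  error = function(e) conditionMessage(e)
)
print(res)
```

Checking `class(result$comment_id)` for the thread that fails would confirm or rule out this explanation.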

I have tried using tryCatch to skip the error, but that only produced more errors that I don't understand. Ideally, I would just skip any thread that triggers this error. I tried the following, but it complicated things beyond what I can follow:

# Get the comments from each thread
comments_df = data.frame()
for (i in 1:nrow(threads_df)) {
  result = tryCatch(
    expr = {
      get_thread_content(threads_df$url[i])[[2]]
    },
    error = function(e) {
      NULL
    }
  )
  if (is.null(result)) {
    print(paste("Skipped thread", i, "due to bind_rows error"))
    next
  }
  result$subreddit = threads_df$subreddit[i]
  comments_df = bind_rows(comments_df, result)
  print(paste("Completed thread", i, "of", nrow(threads_df)))
  if (nrow(result) == 0) {
    stop("Failed to retrieve comments for thread ", i)
  }
  if (i %% 100 == 0) {
    print("Checking for timeouts")
    Sys.sleep(10)
  }
}

Does anyone have any insight into why this original error might be happening and what I could try to fix it? If not, is there maybe another way to bypass the error?
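
One thing worth noting about the attempt above: the tryCatch wraps only `get_thread_content()`, so an error raised later by `bind_rows()` is not caught. A rough sketch of one possible workaround follows, assuming the mismatch really is confined to `comment_id`. The helper `append_comments()` is hypothetical (it is not part of dplyr or RedditExtractoR), and the toy data frames stand in for `get_thread_content(url)[[2]]`:

```r
library(dplyr)

# Hypothetical helper: append one thread's comments to the accumulated
# data frame, returning the accumulator unchanged if anything goes wrong.
append_comments <- function(acc, result) {
  # Nothing to add for a failed or empty fetch
  if (is.null(result) || nrow(result) == 0) return(acc)
  # Force a single type so bind_rows() never sees character vs. integer
  if ("comment_id" %in% names(result)) {
    result$comment_id <- as.character(result$comment_id)
  }
  tryCatch(
    bind_rows(acc, result),
    error = function(e) {
      message("Skipping thread: ", conditionMessage(e))
      acc
    }
  )
}

# Toy data standing in for the comments of two threads
comments_df <- data.frame()
comments_df <- append_comments(
  comments_df,
  data.frame(comment_id = c("1", "1_1"), score = c(3, 5))
)
comments_df <- append_comments(
  comments_df,
  data.frame(comment_id = 2L, score = 1)
)
```

Inside the original loop, the call would replace the bare `bind_rows(comments_df, result)` line. Coercing to character keeps every thread, while the tryCatch around `bind_rows()` only skips threads that fail for some other reason.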
