
Structure of an R course for beginners

I realize that this is a question that will probably not have a single best answer, and that it might be closed as such, but I think that this might get some very useful answers so maybe it can be turned into CW instead.

Suppose you have to give a course on R to complete beginners, and that you have limited time to do so, so you need to make choices about what to emphasize. This is great: young, innocent minds to bend to our will! But how do we do that?

How can we best set up an R course for absolute beginners so that they become efficient users of R? We want them to do everything right and efficiently, but of course it matters even more that they can get things done in the first place. Some issues that come to my mind here are:

  • Indenting and using proper coding styles is very important. Should this be the first thing to come up? Even before looking at how to assign objects?
  • Loops vs applies vs vectorization: what do you emphasize first? I think loops are so easy to learn and straightforward that they are nice to emphasize first; they might not produce very efficient code, but students will be able to get things working! Then again, immediately stressing vectorization might make them more efficient in the long run.
  • Let them use RStudio from the beginning?
  • What would be a good order to introduce things?
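
To make the loop/apply/vectorization question concrete, here is a minimal sketch of the same task written three ways. The data and variable names are made up purely for illustration.

```r
# Hypothetical example data: ten numbers to square
x <- 1:10

# 1. Explicit for loop: pre-allocate the result, then fill it element by element
squares_loop <- numeric(length(x))
for (i in seq_along(x)) {
  squares_loop[i] <- x[i]^2
}

# 2. Apply-style: sapply() calls the function on each element
#    and simplifies the result to a vector
squares_apply <- sapply(x, function(v) v^2)

# 3. Vectorized: arithmetic operators already work on whole vectors
squares_vec <- x^2

# All three approaches give the same values
all.equal(squares_loop, squares_apply)
all.equal(squares_loop, squares_vec)
```

The loop is the most explicit, the vectorized form is the shortest; which one a beginner should see first is exactly the question.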


The number one thing you want to do in any short course is get students interested and motivated - you can convey very little information in 3-4 hours, but you can motivate your students to learn more. I'd recommend picking one topic of interest to your community and showing them how R can help them kick butt in that area. Cut ruthlessly - you want to figure out the absolute minimum path from knowing nothing about R to being able to do something useful, something that makes your students say "wow, that's cool". For me, I use graphics - in 3 hours you can teach the basics of ggplot2 (scatterplots, histograms, aesthetics and facetting) giving students a powerful toolkit for data exploration.
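
As a rough sketch of what that toolkit might look like in a session (assuming ggplot2 is installed, and using its bundled `mpg` data set as a stand-in for whatever data your community cares about):

```r
library(ggplot2)  # assumes ggplot2 is installed: install.packages("ggplot2")

# Scatterplot: map engine displacement and highway mileage to x and y,
# and car class to colour (an aesthetic)
p_scatter <- ggplot(mpg, aes(x = displ, y = hwy, colour = class)) +
  geom_point()

# Faceting: the same scatterplot split into one panel per drive type
p_facet <- p_scatter + facet_wrap(~ drv)

# Histogram of highway mileage, with an arbitrarily chosen bin width
p_hist <- ggplot(mpg, aes(x = hwy)) +
  geom_histogram(binwidth = 2)

print(p_facet)
print(p_hist)
```

The variable choices and the bin width are illustrative only; the point is that a few lines cover scatterplots, histograms, aesthetics and faceting.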

I would recommend using RStudio. I wouldn't recommend talking about code style, vectorisation, or probably even for loops.


To reiterate points others touched on:

1) Don't teach R. Teach "solving some problem" and help them use R to do that.

2) Don't try to wow them with what you or someone else can do with R. Wow them with what THEY can do with a little bit of R.

3) Channel a little bit of Kathy Sierra. The end goal is not for the class to be proficient in 3-4 hours. The end goal is to help the class kick a little ass and feel like R will help them kick more ass in the future. The value they attribute to R will be the net present value of all the ass they can imagine kicking in the future. I'm pretty sure there's an R package for calculating net present value of ass kicking.


+1 for hadley's answer. I totally agree: motivation is key. And it's all you can do in a couple of hours. It's like showing fat kids how to lose weight. There are tons of ways to do it. None of them will probably lose significant weight during a 3 hour session, but you can show them it's fun losing weight and everybody has to continue working on their own from there on. That being said, I think focusing is important but you should show them around:

Show them the sky is the limit: show stockplot or the web ggplot2 for example, show a little database connection stuff, e.g. RMySQL (without going into detail), show them ggplot2. You could also show Sweave briefly which is particularly interesting for students aiming at an empirical master's thesis.

And yes, +1 for using RStudio. It has excellent help and auto-complete, which they even improved recently (e.g. brace matching was added). And it is also a very good example of how R differs from the likes of SPSS or Stata. You should mention that you set up and improve your own working environment. It's not one program, but a package. You can choose the editor, the graphics packages, ways of storing the data and much, much more. It might be obvious to you but might wow beginners.

That being said, pick a topic like Hadley said and go for it. Basically I just wanted to say use a little time to give an overview about the endless possibilities.

Here's a related discussion on Programmers that was on SO before it was migrated. We discuss how to market R at an academic institute, and of course some of the arguments hold for lobbying among students as well.

Or just show Hadley's video on youtube and go for coffee.


This answer is late, but I realized it might be helpful.

I have introduced a number of people to R, especially programmers, but it becomes a mental Wikipedia entry if I just show them linear regression, tabulations, a few plots, etc. They watch, they listen, they don't do anything later - after all, Excel is still available for them.

When I show them iplots and the Titanic data set, they absorb everything. They start copying the example code to their computers. Before long, they've begun poking at load, hist (and ihist), glm, summary and lots of other functions.

It's best to WOW them so that they want to learn on their own.

The iplots website doesn't seem to show the Titanic examples anymore, opting for Cars93 instead: http://rosuda.org/iplots/.

For what it's worth, the epiphany I had that guided better presentations was to teach the audience how to ask questions of the data. A few visual insights later and they're very eager to know more. It's great to see adults who can't sit still because they're bubbling with ideas for what to try. They're putty in your hands at that point.


I just gave a tutorial on R to graduate students in economics, assuming no prior knowledge of programming.

My contents:

  • discussion on tools for data analysis
  • text-editors
  • getting R
  • R language fundamentals: vectors and matrices
  • application: formulate your own OLS estimator
  • lm function and formulas showcase
  • t-test and f-test
  • maximum likelihood: probit
  • installing packages and CRAN views
  • getting help
  • suggested readings
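
For the "formulate your own OLS estimator" item, the exercise could look something like the sketch below. The simulated data and the coefficient values are hypothetical, chosen only so there is something to estimate; this is not the exact code from the tutorial.

```r
set.seed(1)  # simulated data, purely illustrative

n <- 100
x <- rnorm(n)
y <- 2 + 3 * x + rnorm(n)       # hypothetical true intercept 2 and slope 3

X <- cbind(1, x)                # design matrix with an intercept column

# OLS estimator: solve the normal equations (X'X) b = X'y
beta_hat <- solve(t(X) %*% X, t(X) %*% y)

# Compare with the built-in lm() fit
fit <- lm(y ~ x)
cbind(by_hand = drop(beta_hat), lm = coef(fit))
```

Seeing the hand-rolled estimate match `lm()` is a natural bridge to the next item on the list.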

I believed it was quite important to cover some language fundamentals, but I had not reached half of the topics when quite a few people left the presentation, likely thinking "this is too much... I won't use this". Given another opportunity, I would move language fundamentals to an "intermediate" session and format the intro tutorial more as a showcase to sell this technology, and then be clear about what they should read next if they are "in". There is a tradeoff between being rigorously correct and being interesting (unless programming language details are of interest to your audience).

Once you start talking about language details, it's hard to know when to stop. Once you have covered vectors and matrices, you should mention some subscripts and data.frame, which brings you to lists, and to how to convert between matrix and data.frame... That easily takes 2 hours. And it's not a sexy sales pitch for an absolute beginner!
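
To give an idea of how quickly that chain grows, here is a minimal sketch of the path from vectors to data.frame conversions. All object names are made up for illustration.

```r
v <- c(10, 20, 30)              # a numeric vector
v[2]                            # subscript: second element

m <- matrix(1:6, nrow = 2)      # a 2 x 3 matrix, filled column by column
m[1, 3]                         # row 1, column 3

df <- data.frame(id = 1:3, score = v)   # a data.frame: columns may differ in type
df$score                        # one column, as a vector
is.list(df)                     # a data.frame is a list of equal-length columns

m_from_df <- as.matrix(df)      # data.frame -> matrix
df_from_m <- as.data.frame(m)   # matrix -> data.frame (columns named V1, V2, V3)
```

Even this stripped-down tour touches five concepts, each of which invites follow-up questions.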

I did not and would not use RStudio in a presentation. If the "traditional" terminal/text-editor setup is too abstract for them, then R itself is too abstract for them. A fancy windowed environment won't change much. But do mention that there are such interfaces. Also mention that R is cross-platform, and discuss differences/similarities between platforms, even if >90% of your audience uses Windows.


+1 to hadley; I definitely recommend the wow factor with ggplot or wordcloud, but definitely give them something concrete that they can do as well. 4 hours of ggplot without any R background will be very confusing to a beginning student.

Maybe show them how to make a particular type of plot from ggplot. You could teach them the very basics of what a data.frame is and how to use it, then perform a simple analysis and have them make a simple but attractive plot. I would tell them how customizable the plots are, but I would focus on a simple example rather than having them get lost in an overwhelming number of options. The customizability of plotting in R can be very daunting to a beginning user!

Although coding style and efficient code are important, they won't remember those things from a single workshop. Having taken tutorials like this before, I remember very little syntax from the lessons and got lost quickly when there was too much information. Give them a good handout with a list of resources (especially free ones!) and they can continue on their own if you pique their interest.
