Stata programming language without syntax?
I recently got into Stata coming from a procedural/OO/functional background, and am having trouble understanding the basic elements of the language.
For example, I discovered that there is a syntax command which "allows programs to interpret the arguments the user types according to a grammar, such as standard Stata syntax". I infer this is the reason why some commands require a list of variables given as arguments to be separated by whitespace while others require a comma-separated list. But the idea of a program defining its own syntax, instead of the (parameter) syntax being enforced by the language, seems plain weird.
Another quite interesting construct is the syntax for macro definition and expansion (`macro') and the apparent absence of local variables as known in other languages.
Is there something like a "Stata for Java developers" document explaining the basic concepts of the language to people with my background?
PS: Apologies if this question seems unclear. Unfortunately, I can't formulate more concrete/clear questions at this point :(
I'm not exactly sure what you are looking for... but here are a few related points. Stata is kind of like writing a Unix shell script or a Windows batch file. Each line executes a command, and the first word is the command name. By convention, most commands have the following structure:
command [varlist] [=exp] [if expression] [in range] [weight] [using filename] [, options]
Square brackets [ ] mean that an element is optional (or unavailable, depending on the command). Some commands can be prefixed (such as with by:, xi:, or svy:). The syntax of commands written by StataCorp and experienced users is pretty consistent. But, because Stata users also write commands, you occasionally see things that are wacky.
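To make the template above concrete, here is a small sketch using the auto example dataset that ships with Stata. The particular commands and variables are just my picks for illustration:

```stata
sysuse auto, clear

* command varlist if exp, options
summarize price mpg if foreign == 1, detail

* command varlist in range
list make price in 1/5

* The by prefix repeats the command for each group; bysort sorts first
bysort foreign: summarize price
```

Each line follows the same pattern: command name first, then the variable list, then qualifiers, then options after the comma.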
When Stata users write commands, they are saved in .ado files (not .do) and are defined using the program command. (See help program and the "Ado files" section of the manual.) Writing a command is akin to writing a function in other languages (e.g., MATLAB).
The syntax command is used to help you write your own command. When you execute a command, everything following the command's name (command above) is passed to the program in the local macro `0'. The syntax command parses this local macro, so that you can reference `varlist' or `if' and so on. In theory, you could parse `0' yourself, but the syntax command makes it much easier for you and your users (as long as you are following the conventional syntax). I put an example at the bottom.
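For contrast, here is a rough sketch of what looking at `0' by hand might involve. The program name showzero is made up for this illustration, and gettoken is only one of several ways to pick the string apart:

```stata
capture program drop showzero
program define showzero
    * Everything typed after the command name arrives in local macro 0
    display `"Raw input: `0'"'
    * gettoken splits off the first space-delimited token by hand
    gettoken first rest : 0
    display `"First token: `first'"'
    display `"Remainder: `rest'"'
end

showzero price mpg if foreign, detail
```

Nothing here checks that the tokens are real variables or that the if clause is valid; that checking is the work the syntax command does for you.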
I don't know exactly what you mean by "apparent absence of local variables as known in other languages." Macros store a single string or a single number in memory. Here's a comment I wrote about Stata's local/global macros. They are indeed a unique feature of Stata's programming language. As their names imply, "local" macros are only available within a specific program (command) or .do file, while "global" macros are available throughout a Stata session.
I found that, once I got used to macros in Stata, I started to miss them in other languages. They are pretty handy. In addition to (local/global) macros and the main data set, you can also store "things" in memory with the scalar and matrix commands (and one or two other obscure things).
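A minimal sketch of these storage options, meant to be run as one do-file (local macros vanish once the do-file or program ends). The names greeting, datadir, n, pi_approx and A are placeholders I chose for illustration:

```stata
* Local macro: visible only inside the current program or do-file
local greeting "hello"
display "`greeting'"

* Global macro: visible for the whole session, expanded with a dollar sign
global datadir "mydata"
display "$datadir"

* With an equals sign the expression is evaluated before it is stored
local n = 2 + 3
display `n'

* scalar and matrix hold numeric values outside the dataset
scalar pi_approx = 3.14
display pi_approx

matrix A = (1, 2 \ 3, 4)
matrix list A
```

Note that macros are expanded as text before the command runs, which is the main difference from variables in most other languages.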
I hope that helps. Here's a list of resources that might help.
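Example:
    capture program drop myprogram   // allow the example to be re-run
    program define myprogram
        * Parse local macro 0 into varlist, if, and the two options
        syntax varlist [if], [hello(string) yes]
        * Show the local macros that syntax created
        macro list _0 _varlist _if _hello _yes
        summarize `varlist' `if'
        display "Here's the string in my hello option: `hello'"
        * The yes macro is empty unless the option was typed
        if !missing("`yes'") di "Yes is on"
        else di "Yes is off"
    end
    sysuse auto.dta
    myprogram rep78 headroom if price > 5000 , hello("world") yes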
A few books offer an "X for Y users" approach, but generally between statistical software packages. Regarding your question, I would recommend relying on instinct first.
I started reading (programming and markup) code about ten years ago, and even though I cannot code in a large number of languages, I can read a few languages rather easily. I found Stata easy because most of its core commands are straightforward, with recurrent optional statements like over, if or replace (to take a deliberately diverse set of statements) that are easy to understand and then apply.
When I teach Stata, I always have problems getting students to use the help pages as much as I do (and I love the fact they can be accessed so easily, just like in R). I explain the paradox by considering the fact that I can read the syntax indications straightaway. Syntax is very well covered by the previous reply to your question.
The extra mile consists in opening the [R], [U] and especially [P] manuals that come with Stata in the utilities folder. There is a wealth of detail there, which will interest both programmers and training statisticians. This is where I learnt to use macros and loops, beyond the obvious logic of commands like local/global and foreach/while (if I understand the term correctly, Stata is Turing-complete).
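As a rough illustration of how macros and loops combine, here is a short sketch using the auto dataset that ships with Stata. The variables and the counter name i are arbitrary choices:

```stata
sysuse auto, clear

* foreach over a variable list: local macro v takes each name in turn
foreach v of varlist price mpg weight {
    quietly summarize `v'
    display "`v': mean = " r(mean)
}

* while loop driven by a local macro used as a counter
local i = 1
while `i' <= 3 {
    display "iteration `i'"
    local ++i
}
```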
Stata is sometimes a bit of a pain when it comes to using single/double quotes in macro loops, but it's pretty straightforward otherwise. Have fun!
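One common case where the quoting gets fiddly is a macro whose contents include double quotes. A sketch of the usual workaround, compound double quotes, with a made-up list of labels:

```stata
* A macro whose contents themselves contain double quotes
local labels `" "low price" "high price" "'

* Each quoted phrase is treated as one list element by foreach
foreach lab of local labels {
    * Compound double quotes are safe even if lab contained quotes
    display `"`lab'"'
}
```

With plain double quotes around the whole list, the inner quotes would end the string early, which is where most of the pain comes from.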