
AWK program to find the average rainfall of three states

I want to find the average rainfall for any three states, say CA, TX and AX, for a particular month from Jan to Dec. The input file is tab-delimited and has the format: city name, state, then the average rainfall amounts for January through December, and finally an annual average over all months. For example, it may look like this:

AVOCA   PA  30  2.10    2.15    2.55    2.97    3.65    3.98    3.79    3.32     3.31   2.79    3.06    2.51    36.18
BAKERSFIELD CA  30  0.86    1.06    1.04    0.57    0.20    0.10    0.01    0.09    0.17    0.29    0.70    0.63    5.72

What I want to do is get the sum of the average rainfall for a particular month, say Feb, over say n years, and then find its average for the states CA, TX and AX.

I have written the awk script below to do this, but it doesn't give me the expected output:

/^CA$/ {CA++; CA_SUM+= $5} # ^CA$ - Regular Expression to match the word CA only 
/^TX$/ {TX++; TX_SUM+= $5} # ^TX$ - Regular Expression to match the word TX only  
/^AX$/ {AX++; AX_SUM+= $5} # ^AX$ - Regular Expression to match the word AX only 
     CA_avg = CA_SUM/CA;
     TX_avg = TX_SUM/TX;
     AX_avg = AX_SUM/AX; 
     printf("CA Rainfall: %5.2f",CA_avg);
     printf("CA Rainfall: %5.2f",TX_avg);
     printf("CA Rainfall: %5.2f",AX_avg);

I invoke the program with the command awk 'FS="\t"'-f awk1.awk rainfall.txt and see no output.

Question: Where am I slipping? Any suggestions and corrected code will be appreciated.

The pattern /^CA$/ matches only when the characters "C" and "A" are the only characters on the line. You want:

$2 == "CA" {CA++; CA_SUM+= $5}
# etc.
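
To see the difference on a small sample, here is a minimal sketch that runs both forms over two made-up tab-separated records (the `sample` helper and its shortened records are only for illustration, not the real file). It assumes a POSIX shell and any standard awk.

```shell
# Two hypothetical tab-separated records: city, state, years, Jan, Feb
sample() {
    printf 'BAKERSFIELD\tCA\t30\t0.86\t1.06\n'
    printf 'AVOCA\tPA\t30\t2.10\t2.15\n'
}

# Whole-line regex: counts lines that consist solely of "CA"
whole_line=$(sample | awk -F '\t' '/^CA$/ { n++ } END { print n + 0 }')

# Field comparison: counts records whose second field is exactly "CA"
by_field=$(sample | awk -F '\t' '$2 == "CA" { n++ } END { print n + 0 }')

echo "regex matches: $whole_line, field matches: $by_field"
```

The anchored regex is tested against the whole record, so it never matches a full data line, while the field comparison only looks at the state column.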

However, this is DRYer:

{ count[$2]++; sum[$2] += $5 }
END {
    for (state in count) {
        printf("%s Rainfall: %5.2f\n", state, sum[state]/count[state])
    }
}

Also, this looks wrong: awk 'FS="\t"'-f awk1.awk rainfall.txt
try: awk -F '\t' -f awk1.awk rainfall.txt
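
As far as I can tell, the original command prints nothing because the shell glues 'FS="\t"' and -f into a single argument. awk then takes FS="\t"-f as the entire program and treats awk1.awk as an input file. That program is an assignment used as a pattern, and its value is 0, so no line is ever printed. A small sketch with one made-up input line (the `line` helper is hypothetical), assuming a POSIX shell:

```shell
# A hypothetical one-line, tab-separated input
line() { printf 'BAKERSFIELD\tCA\t30\t0.86\t1.06\n'; }

# Broken form: the program text is FS="\t"-f, a pattern that evaluates to 0 (false)
broken=$(line | awk 'FS="\t"'-f)

# Fixed form: -F sets the separator and the program is its own argument
fixed=$(line | awk -F '\t' '{ print $2 }')

echo "broken: [$broken]  fixed: [$fixed]"
```

With -F (or an FS assignment inside a BEGIN block) the separator is set before the first record is read, and -f stays a real option.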

Response to comments:

awk -F '\t' -v month=2 -v states="CA,AZ,TX" '
    BEGIN {
        month_col = month + 3  # assume January is month 1
        split(states, wanted_states, /,/)
    }
    { count[$2]++; sum[$2] += $month_col }
    END {
        # split() stores the state names as array values, so loop over the indices
        for (i in wanted_states) {
            state = wanted_states[i]
            if (state in count)
                printf("%s Rainfall: %5.2f\n", state, sum[state]/count[state])
            else
                print state " Rainfall: no data"
        }
    }
' rainfall.txt

Your regexp should be

/ CA / {CA++; CA_SUM+= $5} # / CA / - matches the word CA with a space on each side
/ TX / {TX++; TX_SUM+= $5} # / TX / - matches the word TX with a space on each side
/ AX / {AX++; AX_SUM+= $5} # / AX / - matches the word AX with a space on each side

/^AX$/ matches only if AX is the only word on the line. If the fields are really separated by tabs rather than spaces, write the patterns as /\tCA\t/ and so on.



/ CA / {CA++; CA_SUM+= $5} # / CA / - matches the word CA with a space on each side
/ TX / {TX++; TX_SUM+= $5} # / TX / - matches the word TX with a space on each side
/ AX / {AX++; AX_SUM+= $5} # / AX / - matches the word AX with a space on each side

END {
 # print each average once, after all input is read; skip states with no rows
 if(CA!=0){CA_avg = CA_SUM/CA;     printf("CA Rainfall: %5.2f\n",CA_avg);}
 if(TX!=0){TX_avg = TX_SUM/TX;     printf("TX Rainfall: %5.2f\n",TX_avg);}
 if(AX!=0){AX_avg = AX_SUM/AX;     printf("AX Rainfall: %5.2f\n",AX_avg);}
}





