AWK program to find the average rainfall of three states
I want to find the average rainfall of any three states say CA, TX and AX for a particular month from Jan to Dec . Given input file delimited by TAB SPACES and has the format 
city name, the state , and then average rainfall amounts from January through December, and then an annual average for all months. EG may look like 
AVOCA   PA  30  2.10    2.15    2.55    2.97    3.65    3.98    3.79    3.32     3.31   2.79    3.06    2.51    36.18
BAKERSFIELD CA  30  0.86    1.06    1.04    0.57    0.20    0.10    0.01    0.09    0.开发者_运维问答17    0.29    0.70    0.63    5.72
What I want to do is "To get the sum of average rainfall for say a particular month feb , over say n years and then find its average for the states CA, TX and AX.
I have written the below script in awk to do the same , but it doesn't give me the expected output
/^CA$/ {CA++; CA_SUM+= $5} # ^CA$ - Regular Expression to match the word CA only 
/^TX$/ {TX++; TX_SUM+= $5} # ^TX$ - Regular Expression to match the word TX only  
/^AX$/ {AX++; AX_SUM+= $5} # ^AX$ - Regular Expression to match the word AX only 
END {
     CA_avg = CA_SUM/CA;
     TX_avg = TX_SUM/TX;
     AX_avg = AX_SUM/AX; 
     printf("CA Rainfall: %5.2f",CA_avg);
     printf("CA Rainfall: %5.2f",TX_avg);
     printf("CA Rainfall: %5.2f",AX_avg);
    }
I invoke the program with the command 
 awk 'FS="\t"'-f awk1.awk rainfall.txt and see no output. 
Question: Where am I slipping? Any suggestions and a changed code will be appreciated
The pattern /^CA$/ means the characters "C" and "A" are the only characters on the line. You want:
$2 == "CA" {CA++; CA_SUM+= $5}
# etc.
However, this is DRYer:
{ count[$2]++; sum[$2] += $5 }
END {
    for (state in count) {
        printf("%s Rainfall: %5.2f\n", state, sum[state]/count[state])
    }
}
Also, this looks wrong: awk 'FS="\t"'-f awk1.awk rainfall.txt
try: awk -F '\t' -f awk1.awk rainfall.txt
Response to comments:
awk -F '\t' -v month=2 -v states="CA,AZ,TX" '
    BEGIN {
        month_col = month + 3  # assume January is month 1
        split(states, wanted_states, /,/)
    }
    { count[$2]++; sum[$2] += $month_col }
    END {
        for (state in wanted_states) {
            if (state in count) {
                printf("%s Rainfall: %5.2f\n", state, sum[state]/count[state])
            else
                print state " Rainfall: no data"
        }
    }
' rainfall.txt
your regexp should be
/ CA / {CA++; cA_SUM+= $5} # ^CA$ - Regular Expression to match the word CA only 
/ TX / {TX++; TX_SUM+= $5} # ^TX$ - Regular Expression to match the word TX only  
/ AX / {AX++; AX_SUM+= $5} # ^AX$ - Regular Expression to match the word AX only 
/^AX$/ match only if it is the only word in the line
HTH!
EDIT
/ CA / {CA++; CA_SUM+= $5} # ^CA$ - Regular Expression to match the word CA only 
/ TX / {TX++; TX_SUM+= $5} # ^TX$ - Regular Expression to match the word TX only  
/ AX / {AX++; AX_SUM+= $5} # ^AX$ - Regular Expression to match the word AX only 
END {
 if(CA!=0){CA_avg = CA_SUM/CA;     printf("CA Rainfall: %5.2f",CA_avg);}
 if(TX!=0){TX_avg = TX_SUM/TX;     printf("TX Rainfall: %5.2f",TX_avg);}
 if(AX!=0){TX_avg = AX_SUM/CA;     printf("AX Rainfall: %5.2f",AX_avg);}
}
 
         加载中,请稍侯......
 加载中,请稍侯......
      
精彩评论