Deleting selected lines from a data file
This question is a continuation of my earlier post titled "selecting digits from regular expression".
Below is the sample data from that post.
DONOR ACCEPTORH ACCEPTOR
atom# res@atom atom# res@atom atom# res@atom %occupied distance angle
| 4726 59@O12 | 1487 19@H12 1486 19@O12 | 85.66 2.819 ( 0.18) 21.85 (12.11)
| 1499 19@O15 | 1730 22@H12 1729 22@O12 | 83.15 3.190 ( 0.31) 22.36 (12.73)
| 1216 16@O22 | 1460 19@H22 1459 19@O22 | 75.74 2.757 ( 0.14) 24.55 (13.66)
| 4232 53@O25 | 4143 52@H24 4142 52@O24 | 74.35 2.916 ( 0.25) 28.27 (13.26)
| 3683 46@O16 | 4163 52@H13 4162 52@O13 | 73.78 2.963 ( 0.29) 23.65 (14.14)
| 4162 52@O13 | 4079 51@H12 4078 51@O12 | 73.68 2.841 ( 0.19) 21.25 (11.87)
| 3764 47@O16 | 3825 48@H26 3824 48@O26 | 70.52 2.973 ( 0.28) 26.88 (13.14)
.
.
The file runs to a few thousand lines.
I tried Fredirk's code and it works fine for selecting the lines. Now I would like to extend this idea to my real problem.
The $3 (3rd field) and $6 (6th field) in my data file carry the molecule number (the digits before the "@"), and the molecules are arranged as below:
1 2 3 4 5 6 7 8
9 10 11 12 13 14 15 16
17 18 19 20 21 22 23 24
25 26 27 28 29 30 31 32
33 34 35 36 37 38 39 40
41 42 43 44 45 46 47 48
49 50 51 52 53 54 55 56
57 58 59 60 61 62 63 64
Any pair made from the numbers above can appear as the pair in the 3rd and 6th fields of a line in the data file.
What I want is to pick out the pairs made only of numbers on the outermost rows and columns of the arrangement above.
In short, any line whose pair consists only of the numbers (1 2 3 4 5 6 7 8 57 58 59 60 61 62 63 64 1 9 17 25 33 41 49 57 8 16 24 32 40 48 56 64) needs to be deleted.
I have no idea how to write a loop in awk that selects those pairs and deletes the lines straight away.
Many thanks in advance.
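Use an array to hold the set of numbers. Define it in the BEGIN block, storing each number as an array index, because awk's "in" operator tests indices, not values:
BEGIN {
# outermost rows
for (n=1; n<=8; n++) set[n] = 1
for (n=57; n<=64; n++) set[n] = 1
# outermost columns: 9, 17, ..., 49 and 16, 24, ..., 56
for (n=9; n<=49; n+=8) {set[n] = 1; set[n+7] = 1}
}
Then check that $3 and $6 are both in (or not in) the set, skip those lines, and print the rest. Adding 0 converts a field such as 59@O12 to its leading number, 59:
(($3+0) in set) && (($6+0) in set) {next}
{print}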
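To see the whole thing run end to end, here is a minimal sketch wrapped in a shell script. It assumes default whitespace field splitting, so that $3 and $6 look like 59@O12 and 19@H12. The array name border and the variable names are my own, and the last data row is invented purely for illustration, since none of the sample rows in the question has both molecules on the border.

```shell
#!/bin/sh
# Three rows copied from the question, plus one INVENTED row (8 and 57),
# made up only to show a line being deleted.
sample='| 4726 59@O12 | 1487 19@H12 1486 19@O12 | 85.66 2.819 ( 0.18) 21.85 (12.11)
| 4232 53@O25 | 4143 52@H24 4142 52@O24 | 74.35 2.916 ( 0.25) 28.27 (13.26)
| 3764 47@O16 | 3825 48@H26 3824 48@O26 | 70.52 2.973 ( 0.28) 26.88 (13.14)
| 0001 8@O12 | 0002 57@H12 0003 57@O12 | 50.00 2.900 ( 0.20) 25.00 (12.00)'

result=$(printf '%s\n' "$sample" | awk '
BEGIN {
    # Store border molecule numbers as array INDICES, since "in" tests indices.
    for (n = 1;  n <= 8;  n++) border[n] = 1                          # top row
    for (n = 57; n <= 64; n++) border[n] = 1                          # bottom row
    for (n = 9;  n <= 49; n += 8) { border[n] = 1; border[n + 7] = 1 } # side columns
}
# $3 + 0 keeps only the leading number of a field such as "59@O12".
# Skip the line when BOTH molecule numbers lie on the border.
(($3 + 0) in border) && (($6 + 0) in border) { next }
# Every other line, including headers, is printed unchanged.
{ print }
')

printf '%s\n' "$result"
```

With this input, only the invented 8/57 row is dropped. The 59/19 and 47/48 rows stay because one molecule of each pair is in the interior. For the real file, replace the printf pipeline with the data file name as the last argument to awk.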