Ubuntu: Merge values of consecutive rows if they have the same values on a different column (AWK)



Question:

I need to combine the first value ($1) of consecutive rows if their fourth value ($4) is the same (I-PER).

I managed to filter the values I need simply using awk:

awk '($4 == "I-PER") { print $1 }'

I also found out how to merge rows with duplicate column values, but not how to restrict the merge to consecutive rows.

Example (Input):

Comandante  comandante  NP00000 I-PER
de          de          SPS00   I-PER
la          el          DA0FS0  I-PER
Guardia     guardia     NP00000 I-PER
Civil       civil       NP00000 I-PER
Pamplona    pamplona    NP00000 I-LOC
Poblador    poblador    NP00000 I-PER

Example (Output):

Comandante de la Guardia Civil
Poblador


Solution:1

Another awk solution, which avoids printing repeated newlines when several lines in a row do not meet the condition:

awk '($4=="I-PER"){ printf "%s%s", SEP, $1; SEP=" "; C=1; next } C==1{ SEP=""; print ""; C=0 } END{ if (C==1) print "" }' infile

example input:

Comandante  comandante  NP00000 I-PER
de          de          SPS00   I-PER
la          el          DA0FS0  I-PER
Guardia     guardia     NP00000 I-PER
Civil       civil       NP00000 I-PER
no I-PER in fourth column
anotherline no I-PER in fourth column
Pamplona    pamplona    NP00000 I-LOC
Poblador    poblador    NP00000 I-PER

The output is:

Comandante de la Guardia Civil
Poblador
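For readability, the same flag-based idea can be spelled out over several lines with comments. This is only a sketch: the function name merge_iper is made up for illustration, and it assumes whitespace-separated rows on stdin with the tag in the fourth field. The END rule is guarded so that no extra empty line is printed when the input does not end with an I-PER row.

```shell
# merge_iper: hypothetical helper name, not part of the original answer.
# Reads whitespace-separated rows on stdin; assumes the tag is in field 4.
merge_iper() {
    awk '
        $4 == "I-PER" {
            printf "%s%s", SEP, $1   # SEP is empty for the first word of a run
            SEP = " "                # later words in the same run get a space
            C = 1                    # remember that a run is open
            next
        }
        C == 1 {                     # first non-matching row after a run
            print ""                 # close the run with a single newline
            SEP = ""
            C = 0
        }
        END { if (C == 1) print "" } # close a run that reaches end of input
    '
}
# Usage: merge_iper < infile
```

Passing the word through a "%s" format rather than using it as the format string itself keeps a stray % in the data from being interpreted by printf.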


Solution:2

A quick and somewhat dirty solution with a ternary operator (condition?true:false). It performs the test you provided and prints either $1 followed by a space, or a newline:

awk '{printf $4=="I-PER"?$1" ":"\n"}'  

Output:

$ <test awk '{printf $4=="I-PER"?$1" ":"\n"}'
Comandante de la Guardia Civil
Poblador

Here's a rather crude alternative using an array. At least it doesn't produce empty lines for multiple successive non-I-PER lines, as the command above does:

awk '{
    if ($4=="I-PER") { a[n++]=$1 }      # collect the words of a run, in order
    else if (n>0) {
      for (j=0; j<n; j++) printf "%s ", a[j]
      print ""
      n=0                               # start a fresh run
      }
    }
  END {
    if (n>0) { for (j=0; j<n; j++) printf "%s ", a[j]; print "" }
    }'

Output:

$ <test awk '{if($4=="I-PER"){a[n++]=$1}else if(n>0){for(j=0;j<n;j++)printf "%s ",a[j];print "";n=0}}END{if(n>0){for(j=0;j<n;j++)printf "%s ",a[j];print ""}}'
Comandante de la Guardia Civil
Poblador
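If an array feels heavier than needed, the same buffering idea can probably be done with a plain string, which also sidesteps any question about the order in which awk walks an array. A minimal sketch, assuming whitespace-separated rows on stdin with the tag in the fourth field; the name merge_iper_buf is hypothetical:

```shell
# merge_iper_buf: hypothetical helper name, shown only as an illustration.
merge_iper_buf() {
    awk '
        # Append the word to the buffer, space-separated after the first one.
        $4 == "I-PER" { buf = (buf == "" ? $1 : buf " " $1); next }
        # Any other row ends the run: flush the buffer if it holds anything.
        buf != ""     { print buf; buf = "" }
        # Flush a run that reaches the end of the input.
        END           { if (buf != "") print buf }
    '
}
# Usage: merge_iper_buf < test
```

Because nothing is printed until a run is complete, this version leaves neither trailing spaces nor empty lines.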
