Ubuntu: Search output of AWK in another file



Question:

I have two files fileA and fileB.

I have to extract column 1 from fileA (as with awk '{print $1}'), search for those values in fileB, and save the matching records to a new file, fileC. In simple words:

fileA:

seg1     rec1
seg2     rec2
seg3     rec3

I need to retrieve column 1 with an awk command and then search for those column 1 values in fileB to retrieve the matching records:

fileB:

seg1     one
seg2     two
seg3     three
seg4     four
seg5     five

The column 1 data extracted from fileA is used to search fileB, and the matched records are saved to a new file. My output should look like this:

fileC:

seg1       one
seg2       two
seg3       three


Solution:1

Can be achieved easily with awk as follows:

awk 'NR==FNR{inFileA[$1]; next} ($1 in inFileA)' fileA fileB > write_to_fileC  

Result:

seg1       one
seg2       two
seg3       three

In the command above, awk first reads fileA and stores every column 1 value as a key in an array named inFileA (NR==FNR is true only while the first file is being read). It then reads fileB and, whenever the first column of a row is one of the saved keys, prints the entire row.
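
To try the two-pass idiom end to end, the following is a minimal sketch that recreates the sample data from the question. The scratch directory created with mktemp and the name fileC for the output are assumptions made for illustration; they are not part of the original answer.

```shell
#!/bin/sh
# Work in a scratch directory so existing files are not overwritten.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1

# Recreate the sample inputs from the question.
printf '%s\n' 'seg1 rec1' 'seg2 rec2' 'seg3 rec3' > fileA
printf '%s\n' 'seg1 one' 'seg2 two' 'seg3 three' 'seg4 four' 'seg5 five' > fileB

# Pass 1 (NR==FNR holds only while fileA is read): remember each first column.
# Pass 2 (fileB): print the rows whose first column was remembered.
awk 'NR==FNR { inFileA[$1]; next } ($1 in inFileA)' fileA fileB > fileC

cat fileC
```

Because the lookup is an exact comparison of the first field, a key such as seg1 does not match a row that starts with seg10.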


Solution:2

If both files are sorted on the column to be compared, you can use join:

join -o 2.1,2.2 fileA fileB

join matches lines from the two input files whose join fields (the first field by default) are equal, and prints them. -o 2.1,2.2 restricts the output to the first and second columns of the second input file.
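
If the inputs are not already sorted, they can be sorted first. The sketch below assumes unsorted copies of the sample data and uses hypothetical intermediate file names (fileA.sorted, fileB.sorted); LC_ALL=C is set so that sort and join agree on the collation order.

```shell
#!/bin/sh
# Use one collation order for both sort and join.
LC_ALL=C
export LC_ALL

tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1

# Sample inputs in deliberately unsorted order.
printf '%s\n' 'seg3 rec3' 'seg1 rec1' 'seg2 rec2' > fileA
printf '%s\n' 'seg5 five' 'seg2 two' 'seg1 one' 'seg4 four' 'seg3 three' > fileB

# join needs both inputs sorted on the join field (field 1 by default).
sort -k1,1 fileA > fileA.sorted
sort -k1,1 fileB > fileB.sorted

# Keep only fields 1 and 2 of the second file for keys present in both.
join -o 2.1,2.2 fileA.sorted fileB.sorted > fileC

cat fileC
```

Note that the output comes out in sorted key order, not in the original order of fileB.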


Solution:3

You can use the following one-liner:

cut -f1 fileA | grep -f - fileB > fileC  
  • the cut command extracts the first column of fileA (assuming tab separation; use -d to specify another delimiter)
  • the grep command reads the output of cut as a list of patterns (-f -) and prints every line of fileB that contains any of them, anywhere in the line
  • the output is written to fileC
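
One thing to watch with this pipeline is that grep matches patterns anywhere in the line, so a key can also match a longer key that contains it. The sketch below illustrates this with a made-up seg10 row (not part of the question's data) and shows how the standard -F and -w options narrow the match.

```shell
#!/bin/sh
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1

# Tab-separated sample data; the seg10 row is added only to show the pitfall.
printf 'seg1\trec1\n' > fileA
printf 'seg1\tone\nseg10\tten\n' > fileB

# Plain pattern matching: "seg1" is also a substring of "seg10".
loose=$(cut -f1 fileA | grep -f - fileB)

# -F treats the patterns as literal strings, -w requires whole-word matches.
strict=$(cut -f1 fileA | grep -wFf - fileB)

printf 'loose:\n%s\nstrict:\n%s\n' "$loose" "$strict"
```

Even with -w, a key is still matched in any column of fileB, not only the first, so the awk solution remains the stricter option when only column 1 should be compared.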


Solution:4

You have already received some excellent answers. Just to add to the mix, here's a Perl approach:

$ perl -ane '$i ? $k{$F[0]} && print : { $k{$F[0]}++ }; $i++ if eof' fileA fileB
seg1     one
seg2     two
seg3     three

And a golfed version of KasiyA's answer:

$ awk 'NR==FNR ? a[$1] : $1 in a' fileA fileB
seg1     one
seg2     two
seg3     three

And here's a somewhat convoluted grep solution:

$ grep -Ff <(grep -oP '^\S+' fileA) fileB
seg1     one
seg2     two
seg3     three


Solution:5

An attempt with a bash script. (Remember to make it executable.)

fileA and fileB should exist in the same folder as the script.

This is a general script that works for any two files passed as arguments and writes the matching lines to <fa>_<fb>_match.txt.

To use it, run ./script_name.sh fileA fileB

#!/bin/bash
fa="$1"  # first file: its first column holds the search keys
fb="$2"  # second file: raw data to be searched
out="${fa}_${fb}_match.txt"  # matches are appended to <fa>_<fb>_match.txt

# NR>1 makes awk skip the first row of the first file (for example a header).
# Use awk '{print $1}' instead if the first row is data, as in the question.
myarr=($(awk 'NR>1 {print $1}' "$fa"))

for index in "${!myarr[@]}"; do
    #echo $index/${#myarr[@]}
    #echo "${myarr[index]}"
    text="${myarr[index]}"
    # -w: match whole words only, -F: treat the key as a fixed string
    grep -w -F "$text" "$fb" >> "$out"
done
