Ubuntu: Print the first column



Question:

I want to print column 1 of this file. I used the command awk '{print $1}', but it printed only the first word of the first column.

DATA

ABC transporters                             ABC transporters
Alanine, aspartate and glutamate metabolism  Alanine, aspartate
alpha-Linolenic acid metabolism              alpha-Linolenic acid metabolism
Aminoacyl-tRNA biosynthesis                  Aminoacyl-tRNA biosynthesis
Amino sugar and nucleotide sugar metabolism  Amino sugar and nucleotide
Arachidonic acid metabolism                  Arachidonic

Output:

ABC
Alanine,
alpha-Linolenic
Aminoacyl-tRNA
Amino
Arachidonic

Desired Output:

ABC transporters
Alanine, aspartate and glutamate metabolism
alpha-Linolenic acid metabolism
Aminoacyl-tRNA biosynthesis
Amino sugar and nucleotide sugar metabolism
Arachidonic acid metabolism


Solution:1

From what I can see, your columns are delimited by at least two spaces.

So, with awk:

awk -F '\\s\\s' '{print $1}'  
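
The \s escape is a GNU awk extension, so the command above may not be understood by other awk implementations. The following is a minimal sketch of a more portable variant, assuming the columns are always separated by two or more spaces and that no value contains two consecutive spaces. The temporary file and the shortened sample data are illustrative only.

```shell
# Recreate three lines of the question's data, first column padded to 45 characters.
sample=$(mktemp)
printf '%-45s%s\n' \
  'ABC transporters' 'ABC transporters' \
  'Alanine, aspartate and glutamate metabolism' 'Alanine, aspartate' \
  'Arachidonic acid metabolism' 'Arachidonic' > "$sample"

# Treat any run of two or more spaces as the field separator,
# then print the first field of every line.
first_col=$(awk -F '  +' '{print $1}' "$sample")
printf '%s\n' "$first_col"

rm -f "$sample"
```

Because single spaces are not separators here, multi-word entries such as ABC transporters stay in one field.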


Solution:2

Since this seems to be a fixed-width column, you can just cut the corresponding characters. The widest entry, Alanine, aspartate and glutamate metabolism, seems to be about 44 characters wide, so:

$ cut -c1-44 foo
ABC transporters
Alanine, aspartate and glutamate metabolism
alpha-Linolenic acid metabolism
Aminoacyl-tRNA biosynthesis
Amino sugar and nucleotide sugar metabolism
Arachidonic acid metabolism


Solution:3

As the second column obviously repeats the beginning of the first column, I use that repetition as the criterion for the cut with sed, so the command does not depend on the column width:

sed 's/^\(.*\)\(.*\) \1$/\1\2/'  

The first group captures the repeated part, which is backreferenced as \1 at the end of the line. You can append ;s/ *$// to the expression to remove the trailing spaces if they bother you.
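
As a rough sketch, both substitutions can be combined into one sed call. The example below runs it on two lines of the question's data; the temporary file is only there to make the snippet self-contained, and it assumes the lines have no trailing whitespace after the second column.

```shell
# Recreate two lines of the question's data, first column padded to 45 characters.
sample=$(mktemp)
printf '%-45s%s\n' \
  'ABC transporters' 'ABC transporters' \
  'Alanine, aspartate and glutamate metabolism' 'Alanine, aspartate' > "$sample"

# 1st substitution: drop the trailing copy of the text captured by \1.
# 2nd substitution: strip the padding spaces left at the end of the line.
trimmed=$(sed 's/^\(.*\)\(.*\) \1$/\1\2/;s/ *$//' "$sample")
printf '%s\n' "$trimmed"

rm -f "$sample"
```

A line whose second column does not repeat the start of the first column would not match the first substitution and would pass through with only its trailing spaces removed.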


Solution:4

Building upon muru's answer (the fixed-width approach in Solution 2), you can use egrep with the -o option to print only the matched (non-empty) part of each matching line, as specified by the search pattern. Without -o, the entire matching line is printed.

$ egrep -o "^.{44}" foo  
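
Recent GNU grep releases warn that egrep is obsolescent in favor of grep -E, so the same idea can be sketched as below. One caveat worth seeing in action: a line shorter than 44 characters does not match the pattern and is dropped entirely. The short line in this sample is hypothetical and not part of the question's data.

```shell
# Two padded lines from the question plus one hypothetical short line.
sample=$(mktemp)
printf '%-45s%s\n' \
  'ABC transporters' 'ABC transporters' \
  'Alanine, aspartate and glutamate metabolism' 'Alanine, aspartate' > "$sample"
printf 'short line\n' >> "$sample"

# -E enables extended regular expressions; -o prints only the matched text,
# here the first 44 characters of each line that is at least 44 characters long.
matched=$(grep -E -o '^.{44}' "$sample")
printf '%s\n' "$matched"

rm -f "$sample"
```

The output keeps the padding spaces up to character 44, just like the cut approach.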
