Look through a file and print text from specific lines


I have a file with data that I save. Now I would like to print my results into a new file.

For instance, let's take this example, randomlog.log:

Link encap:Ethernet HWaddr 08:00:00:00:00:67
inet addr: Bcast: Mask:
inet6 addr: fe80::casf:sdfg:23ra:dg12/64 Scope:Link

How can I take only data from the 12th to the 20th character of the first line and then the 4th to 8th characters of the 3rd line? Output would look something like this:

Ethernet
t6 ad

Is this possible? I want to specify the line number and the start and end character positions to extract.


Here's a sed approach:

$ sed -nE '1s/.{11}(.{8}).*/\1/p; 3s/.{3}(.{4}).*/\1/p' file
Ethernet
t6 a


The -n suppresses normal output (normal is to print every input line) so that it only prints when told to. The -E enables extended regular expressions.

The sed script has two commands, both using the substitution operator (s/original/replacement/). The 1s/.{11}(.{8}).*/\1/p will only run on the 1st line (that's what the 1s does), and will match the 1st 11 characters of the line (.{11}), then it captures the next 8 ((.{8}), the parentheses are a "capture group") and then everything else until the end of the line (.*). All this is replaced with whatever was in the capture group (\1; if there were a second capture group, it would be \2 etc.). Finally, the p at the end (s/foo/bar/p) causes the line to be printed after the substitution has been made. This results in only the target 8 characters being output.

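The second command follows the same general idea, except that it only runs on the 3rd line (3s) and keeps the 4 characters starting from the 4th. Note that these two commands keep characters 12-19 and 4-7; to get the inclusive ranges from the question (12-20 and 4-8), use .{9} and .{5} in the capture groups instead.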
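For readers who find the capture-group idea easier to follow outside of sed, here is a minimal Python sketch of the same logic. It reuses the two patterns from the sed command; the names LINES, PATTERNS and extract_with_regex are made up for illustration, and the sample lines are copied from the question rather than read from a file.

```python
import re

# Sample data copied from randomlog.log in the question (assumed to be 3 lines).
LINES = [
    "Link encap:Ethernet HWaddr 08:00:00:00:00:67",
    "inet addr: Bcast: Mask:",
    "inet6 addr: fe80::casf:sdfg:23ra:dg12/64 Scope:Link",
]

# Line number -> pattern. These are the same patterns used in the sed script.
PATTERNS = {
    1: re.compile(r".{11}(.{8}).*"),
    3: re.compile(r".{3}(.{4}).*"),
}


def extract_with_regex(lines):
    """Return the first capture group for each line that has a pattern."""
    results = []
    for number, line in enumerate(lines, 1):
        pattern = PATTERNS.get(number)
        if pattern is None:
            continue  # like sed -n: other lines produce no output
        match = pattern.match(line)
        if match:  # like the p flag: output only if the pattern matched
            results.append(match.group(1))
    return results


print(extract_with_regex(LINES))
```

As with the sed version, a line shorter than the pattern requires simply produces no output.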

You could also do the same thing with perl:

$ perl -ne 'if($.==1){s/.{11}(.{8}).*/$1/}
            elsif($.==3){s/.{3}(.{4}).*/$1/}
            else{next}; print; ' file
Ethernet
t6 a


The -ne means "read the input file line by line and apply the script given by -e to each line". The script is the same basic idea as before. The $. variable holds the current line number, so we check whether the line number is 1 or 3 and, if so, run the substitution; otherwise we skip the line. The print therefore only runs for those two lines, since all others are skipped.

Of course, this is Perl, so TIMTOWTDI:

$ perl -F"" -lane '$. == 1 && print @F[11..19]; $.==3 && print @F[3..6]' file
Ethernet 
t6 a


Here, the -a means "split each input line on the character given by -F and save the result as the array @F". Since the character given is empty, each character of the input line becomes one element of @F. Then we print elements 11-19 (arrays start counting at 0) for the 1st line and 3-6 for the 3rd.


awk approach:

$ awk 'NR==1{print substr($0,12,8)};NR==3{print substr($0,4,4)}' input.txt
Ethernet
t6 a

This uses NR to determine the line number (a "record" in awk terminology) and prints the matching substring of that line. The substr() function has the format

substr(string, starting position, length)
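To make the argument order concrete, here is a rough Python emulation of that call. It is only a sketch: awk_substr is a made-up helper name, and it only covers the simple case where the start position is 1 or greater.

```python
def awk_substr(s, start, length):
    """Rough emulation of awk's substr(s, start, length) for start >= 1."""
    begin = start - 1  # awk counts characters from 1, Python from 0
    return s[begin:begin + length]


# Lines 1 and 3 of the sample file from the question.
line1 = "Link encap:Ethernet HWaddr 08:00:00:00:00:67"
line3 = "inet6 addr: fe80::casf:sdfg:23ra:dg12/64 Scope:Link"

print(awk_substr(line1, 12, 8))  # same arguments as substr($0,12,8)
print(awk_substr(line3, 4, 4))   # same arguments as substr($0,4,4)
```

The third argument is a character count, not an end position, which is why 12,8 yields the 12th through 19th characters.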


$ python -c 'import sys
for index, line in enumerate(sys.stdin, 1):
    if index == 1:
        print(line[11:19])
    if index == 3:
        print(line[3:7])' < input.txt
Ethernet
t6 a

This uses the shell's < operator to redirect the input file to the Python process's standard input. Note that Python strings are 0-indexed, so the start of each range has to be shifted down by 1. The end of a slice is exclusive, so line[11:19] covers the 12th through 19th characters.
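If you would rather state the ranges the way the question does (line number, first character, last character, all counted from 1 and inclusive), a small helper can do the index shifting for you. This is just a sketch; char_range, extract and the specs mapping are names invented here, and the sample lines are copied from the question.

```python
def char_range(line, first, last):
    """Return characters first..last of line (1-based, both ends inclusive)."""
    return line[first - 1:last]


def extract(lines, specs):
    """specs maps a 1-based line number to a (first, last) character range."""
    out = []
    for number, line in enumerate(lines, 1):
        if number in specs:
            first, last = specs[number]
            out.append(char_range(line.rstrip("\n"), first, last))
    return out


sample = [
    "Link encap:Ethernet HWaddr 08:00:00:00:00:67\n",
    "inet addr: Bcast: Mask:\n",
    "inet6 addr: fe80::casf:sdfg:23ra:dg12/64 Scope:Link\n",
]

# 12th to 20th character of line 1, 4th to 8th character of line 3.
for piece in extract(sample, {1: (12, 20), 3: (4, 8)}):
    print(piece)
```

Because extract() accepts any iterable of lines, you could pass it an open file object instead of the sample list and write the returned pieces to a new file.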

portable shell way

This works in ksh, dash, bash. Relies only on shell utilities, nothing external.

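#!/bin/sh

# Print $1 with its first $2 characters removed.
rsubstr(){
    rmcount=""
    i=0
    while [ "$i" -lt "$2" ]
    do
        rmcount="${rmcount}?"
        i=$(($i+1))
    done
    echo "${1#$rmcount}"
}

# Print the first $2 characters of $1.
lsubstr(){
    printf "%.${2}s\n" "$1"
}

# $1 is the line text, $2 is the line number.
line_handler(){
    case $2 in
        1) lsubstr "$(rsubstr "$1" 11)" 8 ;;
        3) lsubstr "$(rsubstr "$1" 3)" 5 ;;
    esac
}

# Read the file named by $1 line by line, tracking the line number.
readlines(){
    line_count=1
    while IFS= read -r line
    do
        line_handler "$line" "$line_count"
        line_count=$(($line_count+1))
    done < "$1"
}

readlines "$1"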

And it works like so:

$ ./get_line_substrings.sh input.txt
Ethernet
t6 ad

