Ubuntu: How to remove text after '-'?



Question:

I have a list of files (basically they are .deb packages). Let's say:

abc-de-1.2.3-1.deb
fgh-ij-4.5.6-2.deb
klm-no-7.8.9-3.deb
pqrs-10.11.12-4.deb
...

As you can see, in some file names the numbers come right after the first -, while in others there is more text after the first - and the numbers only start after the next -.

Is there a way to remove everything from the numbers onward, including the - in front of them, i.e.,

abc-de
fgh-ij
klm-no
pqrs
...

I want to edit the list, not rename the files.


Solution:1

If the first digit that follows a - always marks the start of what you want to remove, you could use:

$ sed 's/-[0-9].*//' file
abc-de
fgh-ij
klm-no
pqrs

Notes

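  • s/old/new/ replaces old with new (here new is empty, so the match is deleted)
  • -[0-9] matches a - followed by any single digit
  • .* matches zero or more of any character, i.e. the rest of the line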
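
For comparison, the same substitution can be sketched in Python. This is only an illustration of the pattern described in the notes above; the function name strip_version is made up for this example.

```python
import re

def strip_version(name: str) -> str:
    """Delete everything from the first '-<digit>' onward,
    mirroring sed 's/-[0-9].*//'. (strip_version is a hypothetical name.)"""
    return re.sub(r"-[0-9].*", "", name)

names = [
    "abc-de-1.2.3-1.deb",
    "fgh-ij-4.5.6-2.deb",
    "klm-no-7.8.9-3.deb",
    "pqrs-10.11.12-4.deb",
]
for n in names:
    print(strip_version(n))
```

A name that contains no -<digit> sequence is returned unchanged, which matches what sed does with a line that has no match.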


Solution:2

Using grep with Perl-compatible regular expressions (-P), printing only the matched part of each line (-o):

$ grep -Po "^[a-z-]*(?=-[0-9])" filename
abc-de
fgh-ij
klm-no
pqrs
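
The lookahead (?=-[0-9]) requires a - and a digit to follow the match without including them in it. A rough Python sketch of the same idea follows; package_prefix is a hypothetical helper name, and like the grep pattern it only accepts lowercase letters and - in the name.

```python
import re
from typing import Optional

# Same pattern as the grep command: letters and dashes, then a lookahead
# that demands "-<digit>" without consuming it.
PREFIX = re.compile(r"[a-z-]*(?=-[0-9])")

def package_prefix(name: str) -> Optional[str]:
    """Return the part before the first '-<digit>', or None if there is
    no such part (grep -o would print nothing for that line)."""
    m = PREFIX.match(name)  # match() anchors at the start, like ^
    return m.group(0) if m else None

print(package_prefix("abc-de-1.2.3-1.deb"))
print(package_prefix("pqrs-10.11.12-4.deb"))
```

Returning None for non-matching input mirrors the fact that grep silently skips lines that do not match.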


Solution:3

Perl

$ perl -lne 's/([[:digit:]].*)//;s/-$//;print' input.txt
abc-de
fgh-ij
klm-no
pqrs

This performs two substitutions: the first deletes everything from the first digit onward, and the second removes the trailing - that is left behind. Add the -i option to edit the original file in place:

$ perl -i -lne 's/([[:digit:]].*)//;s/-$//;print' input.txt

Alternatively, with greedy non-digit match and grouping:

$ perl -lne 's/^(\D*)-.*/$1/;print' input.txt
abc-de
fgh-ij
klm-no
pqrs

AWK

$ awk -F '-' '{s=$1;for(i=2;i<=NF;i++) if($i~/[0-9].*/){print s;next}else{s=s"-"$i}}' input.txt
abc-de
fgh-ij
klm-no
pqrs

This treats - as the field separator and processes each line in turn. The first field is stored in s, and a for loop walks over the remaining fields. If a field contains no digit, it is appended to s with a - in between. At the first field that does contain a digit, the accumulated value is printed and awk moves on to the next line.

Append > new_file.txt to the command to redirect the output to a new file.
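
The field-by-field logic may be easier to follow when written out as a loop. Below is a minimal Python sketch of the same approach; the function name name_before_version is invented for illustration.

```python
from typing import Optional

def name_before_version(line: str) -> Optional[str]:
    """Split on '-', keep fields until one contains a digit.

    Returns None when no field after the first contains a digit,
    which corresponds to the awk command printing nothing for that line.
    """
    fields = line.split("-")
    kept = [fields[0]]              # like s=$1 in the awk version
    for field in fields[1:]:
        if any(ch.isdigit() for ch in field):
            return "-".join(kept)   # like: print s; next
        kept.append(field)          # like: s = s "-" $i
    return None

print(name_before_version("abc-de-1.2.3-1.deb"))
print(name_before_version("pqrs-10.11.12-4.deb"))
```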

Python

#!/usr/bin/env python
import re
import sys

with open(sys.argv[1]) as f:
    for line in f:
        # Drop the extension, then split on "-" and "."
        tokens = re.split(r"-|\.", line.strip().replace(".deb", ""))
        # Keep only the tokens that are not purely digits
        words_only = [t for t in tokens if not t.isdigit()]
        print("-".join(words_only))

Using re.split() we break every line into a list of tokens and keep only the tokens that are not purely digits.

Alternatively, here's a one-liner. It does not guard against lines that contain no digit (re.search() returns None for them and the command fails), so use it only if you are sure every line contains a number.

$ python -c 'import re,sys;f=open(sys.argv[1]);print("\n".join([ l[:re.search(r"\d",l).start()-1] for l in f]))' input.txt  

Potential numbers in package names

As hvd rightly noted in the comments, package names sometimes contain digits themselves, which makes the input harder to parse. Version strings, however, typically contain dots. With that in mind, the commands can be altered to look for digits followed by a dot instead:

$ perl -lne 's/\d*\..*//;s/-$//;print' input.txt
$ awk '{gsub(/[0-9]*\..*/,"");print substr($0,1,length($0)-1)};' input.txt
$ python -c 'import re,sys;f=open(sys.argv[1]);print("\n".join([ l[:re.search(r"\d*\.",l).start()-1] for l in f]))' input.txt
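
To show the effect on a name that itself contains digits, here is a small Python sketch of the same "digits followed by a dot" rule. The function name strip_dotted_version and the file name lib32gcc1-4.8.2-1.deb are made up for this example, and unlike the one-liners it leaves a line untouched when no match is found.

```python
import re

def strip_dotted_version(name: str) -> str:
    """Cut the name at the first run of digits that is followed by a dot,
    then drop one trailing '-' if present."""
    m = re.search(r"\d*\.", name)
    if m is None:
        return name                      # nothing that looks like a version
    prefix = name[:m.start()]
    return prefix[:-1] if prefix.endswith("-") else prefix

# Hypothetical file name whose package part contains digits.
print(strip_dotted_version("lib32gcc1-4.8.2-1.deb"))
print(strip_dotted_version("abc-de-1.2.3-1.deb"))
```

The digits in lib32gcc1 are skipped because none of them is directly followed by a dot, so the cut happens at 4.8.2.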


Solution:4

With awk:

awk -F'-[0-9]' '{print $1}' file  

awk's -F option also accepts a regular expression as the field separator. Here each line is split wherever -[0-9] matches, so $1 is everything before the first - that is followed by a digit.

Example:

$ echo 'abc-de-1.2.3-1.deb' | awk -F'-[0-9]' '{print $1}'
abc-de
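
Splitting on a regular expression and taking the first piece can be sketched in Python as well. The helper name first_field is hypothetical; maxsplit=1 is used because only the part before the first separator is needed.

```python
import re

def first_field(line: str) -> str:
    """Split on the first '-<digit>' and return what precedes it,
    similar to awk -F'-[0-9]' '{print $1}'."""
    return re.split(r"-[0-9]", line, maxsplit=1)[0]

print(first_field("abc-de-1.2.3-1.deb"))
```

If the separator never occurs, the whole line comes back unchanged, just as $1 would hold the entire line in awk.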


Solution:5

Since you mention that the files are DEB packages, I'll guess that you may actually want something like:

dpkg-query -f '${Package}\n' -W 'gnome*'  

Here you can substitute any pattern for gnome*; the command lists the names of matching packages known to dpkg. I'm not sure what exactly the convention is for naming DEB archives, but if those are DEB archives, it's probably best to rely on dpkg to give you the package name.

And if you have the DEB archive files themselves on your system, you could use:

dpkg-deb --showformat='${Package}\n' -W some-file.deb   
