ABC - Accessible Bioinformatics Cafe
  • Home
  • Calendar
  • Documentation
  • Resources
    • Databases
    • Bash tips
  • People&Help

Useful tips for day-to-day bash

Handling text files

Check if column 2 is different from some text (in this case -) and print it out

awk '$2~/^-$/ { next } { print }' file_input > file_output

Extract M-th column of a .csv file

awk -F "\"*,\"*" '{print $M}' file.csv

Print a file except first two columns

awk '{ $1=""; $2=""; print}' filename

Get from the 10th to the 20th line of a file

sed -n '10,20p' myFile

Get multiple lines of a file (here line 1 and line 2)

sed -n -e 1p -e 2p  myFile

From the L-th line, get each M-th subsequent line

 sed -n '$L~$Mp' myFile

Sort col 1 numerically and count unique occurrences

cut -f1 20141020.combined_mask.whole_genomeV2.bed | sort -n | uniq -c

Substitute “string1” into “string2” in the file

sed -e "s/string1/string2/" myFile 

Sort based on column 3

sort -k 3,3 myFile

remove lines with a specific pattern from a file

grep -v -e "pattern" myFile > newFile
  • same but tells you the lines where pattern does not happen (or remove -v to find where it happens)

    grep -v -n -e "pattern" myFile | cut -f1 -d":"
  • same but looks at one or both of two patterns

    grep -v -e "pattern1\|pattern2" myFile > newFile

Handling variables

echo some values and substitute space with tab delimiter

echo $a $b $c | tr -s ' ' '\t'

portable perl sum (or other operations) between numbers/variables

perl -E "say $a + $b"

Managing files

show files with extension .zip and print them without both path and extension

ls *.zip | xargs -n 1 basename -s .zip

Compress one at the time the files in the current folder (and subfolders) not yet in .gz format

find . -type f ! -name '*.gz' -exec gzip "{}" \;

Transfering files

Download with wget using a list of download links written on each line of file.txt

wget -i file.txt

List folders and print their name removing the last characther “/”

for var in `ls -d */`; do echo ${var::-1}; done