Collectively, the bedtools utilities are a Swiss Army knife of tools for a wide range of genomics analysis tasks. The most widely used tools enable genome arithmetic: that is, set theory on the genome. For example, bedtools allows one to intersect, merge, count, complement, and shuffle genomic intervals from multiple files in widely used genomic file formats such as BAM, BED, GFF/GTF, and VCF.
While each individual tool is designed to do a relatively simple task (e.g., intersect two interval files), quite sophisticated analyses can be conducted by combining multiple bedtools operations on the UNIX command line.
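As a rough illustration of combining operations, the sketch below chains sort, merge, and intersect through UNIX pipes. The toy intervals and file names (exons.bed, reads.bed, exon_block_counts.bed) are hypothetical and only stand in for real data; the script skips the bedtools step if the program is not on the PATH.

```shell
#!/bin/bash
set -euo pipefail

# Scratch directory for the toy files (hypothetical data, not from a real run)
workdir=$(mktemp -d)

# Two toy exons; BED is tab-separated with 0-based, half-open coordinates
printf 'chr1\t100\t200\texonA\nchr1\t300\t400\texonB\n' > "$workdir/exons.bed"

# Three toy read intervals, deliberately unsorted and partly overlapping
printf 'chr1\t350\t450\nchr1\t150\t250\nchr1\t180\t260\n' > "$workdir/reads.bed"

if command -v bedtools >/dev/null 2>&1; then
    # 1. sort the reads by chromosome and start
    # 2. merge overlapping reads into contiguous covered blocks
    # 3. count how many merged blocks overlap each exon (-c)
    bedtools sort -i "$workdir/reads.bed" \
        | bedtools merge -i stdin \
        | bedtools intersect -a "$workdir/exons.bed" -b stdin -c \
        > "$workdir/exon_block_counts.bed"
    cat "$workdir/exon_block_counts.bed"
else
    echo "bedtools not found; try 'module load bedtools' first" >&2
fi
```

With these toy intervals the two overlapping reads collapse into one block, so each exon should be reported with a count of 1 in the last column.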
Documentation
Slurm script
#!/bin/bash
#SBATCH --job-name=bedtools
#SBATCH --time=1:00:00
#SBATCH --partition=norm
#SBATCH --ntasks=1
#SBATCH --mem=64GB

module load bedtools

# bedtools, inputs already sorted by chromosome and start position
bedtools intersect -a ccds.exons.bed -b aln.bam.bed -c -sorted

# bedtools, unsorted inputs
bedtools intersect -a ccds.exons.bed -b aln.bam.bed -c

# bedmap (from BEDOPS) without error checking
bedmap --echo --count --bp-ovr 1 ccds.exons.bed aln.bam.bed

# bedmap with error checking (--ec)
bedmap --ec --echo --count --bp-ovr 1 ccds.exons.bed aln.bam.bed
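The -sorted option expects both input files to be sorted by chromosome and then by start position, with the same chromosome order in each file. A minimal sketch of preparing such a file with standard UNIX sort follows; the toy intervals and the output name aln.bam.sorted.bed are assumptions for illustration.

```shell
#!/bin/bash
set -euo pipefail

# Hypothetical toy file standing in for aln.bam.bed
workdir=$(mktemp -d)
printf 'chr2\t50\t80\nchr1\t300\t400\nchr1\t100\t200\n' > "$workdir/aln.bam.bed"

# Sort by chromosome (column 1, byte order) and then by start (column 2, numeric)
LC_ALL=C sort -k1,1 -k2,2n "$workdir/aln.bam.bed" > "$workdir/aln.bam.sorted.bed"

# sort -c exits non-zero if the file is not in the requested order
if LC_ALL=C sort -c -k1,1 -k2,2n "$workdir/aln.bam.sorted.bed"; then
    echo "sorted: OK"
fi

cat "$workdir/aln.bam.sorted.bed"
```

Applying the same sort command to both the -a and -b files keeps their chromosome order consistent.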
Build instructions for those who are curious
git clone https://github.com/arq5x/bedtools2.git
cd bedtools2
git checkout v2.31.1
make -j 8
mkdir -p /mnt/nasapps/production/bedtools/2.31.1
cp -a bin/ data/ docs/ genomes/ tutorial/ /mnt/nasapps/production/bedtools/2.31.1/