Brockman: Brockman Representation Of Chromatin by K-mers in Mark-Associated Nucleotides
Overview
What is Brockman?
Brockman is a suite of command-line tools and R functions that converts genomics data into DNA k-mer words representing the regions associated with a chromatin mark, and then analyzes these k-mer sets to see how samples differ from one another. The approach is intended primarily for single-cell genomics data and was tested most extensively on single-cell ATAC-seq data. The Bash scripts in particular may need to be adapted for other types of genomics data.
A preprint describing the approach is available here.
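To make the idea of "k-mer words" concrete, here is a toy sketch that counts every overlapping k-mer in a single DNA string. It is only an illustration: Brockman itself uses AMUSED for k-mer counting, and the function name `count_kmers` is hypothetical and not part of the Brockman tools.

```shell
# Toy illustration only: Brockman uses AMUSED for real k-mer counting.
# count_kmers is a hypothetical helper, not a Brockman command.
# Usage: count_kmers SEQUENCE K
# Prints one "KMER<TAB>COUNT" line per distinct k-mer, sorted by k-mer.
count_kmers() {
  local seq="$1" k="$2"
  awk -v s="$seq" -v k="$k" 'BEGIN {
    # Slide a window of width k along the sequence and tally each word.
    for (i = 1; i + k - 1 <= length(s); i++) counts[substr(s, i, k)]++
    for (word in counts) printf "%s\t%d\n", word, counts[word]
  }' | sort
}

# ACGTACG contains the 3-mers ACG, CGT, GTA, TAC, ACG, so this prints:
#   ACG  2
#   CGT  1
#   GTA  1
#   TAC  1
count_kmers ACGTACG 3
```

Comparing tables like this across samples is the general kind of k-mer profile the analysis step works with.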
What are Brockman’s dependencies?
The command line tools rely on the following, and assume the shell is Bash:
- Ruby
- AMUSED: For counting k-mers
- BEDTools: For working with BED files
- Kent Tools - twoBitToFa: For extracting genomic sequence
- SAM Tools: For working with BAM/SAM files
- Trimmomatic: For trimming sequencing reads
The R analysis tools rely on the following packages:
Installation
Command line tools
At present, only installation via Anaconda (conda) is supported. If you haven’t yet learned how to use Anaconda, there’s no time like the present!
Linux/OSX:
conda create -c bioconda -n BrockmanEnv brockman-pipeline
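Once the environment exists, it has to be activated (for example with `conda activate BrockmanEnv`) before its tools are on your PATH. The sketch below is one way to confirm that the dependencies are visible in the active shell. The helper name `missing_tools` is hypothetical, and the executable names `AMUSED` and `trimmomatic` are assumptions about how those packages are installed. Adjust them if your installation differs.

```shell
# Hypothetical helper: print the name of every listed tool that is NOT on PATH.
# Prints nothing when all of the tools are found.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Run this after activating the environment.
# AMUSED and trimmomatic are assumed executable names.
missing_tools ruby AMUSED bedtools twoBitToFa samtools trimmomatic
```

If the last command prints any names, those tools were not found in the current environment.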
R library
If you don’t already have devtools, install it:
install.packages("devtools")
Load devtools and install from the GitHub page:
library(devtools)
install_github("Carldeboer/BrockmanR")
Usage
Command line tools
See Brockman_pipeline Example for example data processing pipelines.
R library
See Brockman Analysis Example for example analysis pipelines.
Citation
Please cite BROCKMAN if you find Brockman useful.
Carl G. de Boer, Aviv Regev. BROCKMAN: deciphering variance in epigenomic regulators by k-mer factorization. BMC Bioinformatics. (2018) 19:253.