split
Split fragment files by cell type.
Usage and options
scatac_fragment_tools split \
-f <PATH_TO_SAMPLE_TO_FRAGMENT_DEFINITION> \
-b <PATH_TO_CELL_TYPE_TO_CELL_BARCODE_DEFINITION> \
-c <CHROM_SIZES_FILENAME> \
-o <PATH_TO_OUTPUT_FOLDER>
Required arguments
-f, --sample_fragments
Path to a text file mapping sample names to fragment files.
-b, --cell_type_barcodes
Path to a text file listing the sample, cell type, and cell barcode of each cell.
-c, --chrom
File with chromosome sizes (*.chrom.sizes or *.fa.fai).
-o, --output
Path to the output folder.
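Both chromosome size formats are tab-separated, with the chromosome name in the first column and its length in the second; a .fa.fai index (as written by samtools faidx) carries three further columns. The sketch below only inspects those two shared columns. The file name genome.fa.fai is hypothetical and the offset values in it are purely illustrative.

```shell
# Hypothetical index file with the five .fa.fai columns:
# name, length, offset, bases per line, bytes per line.
printf 'chr1\t248956422\t112\t70\t71\n' > genome.fa.fai
printf 'chr2\t242193529\t252513167\t70\t71\n' >> genome.fa.fai

# Print only name and length, i.e. the two columns a .chrom.sizes file holds.
cut -f1,2 genome.fa.fai
```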
Optional arguments
-t, --temp
Path to the temporary folder. Default: /tmp
-n, --n_cpu
Number of cores to use. Default: 1
-v, --verbose
Whether to print progress. Default: False
--clear_temp
Whether to clear the temporary folder. Default: False
-s, --sep
Separator for the text files. Default: '\t'
--sample_column
Column name for the sample name. Default: sample
--fragment_column
Column name for the path to the fragment file. Default: path_to_fragment_file
--cell_type_column
Column name for the cell type. Default: cell_type
--cell_barcode_column
Column name for the cell barcode. Default: cell_barcode
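As an illustration of how the separator and column options combine, the sketch below assembles a call for comma-separated input tables with non-default headers. Everything concrete in it is an assumption: the executable name scatac_fragment_tools, the file and folder names, and the headers sample_id, fragments, annotation and barcode. It also assumes the same sample column name is used in both input tables, since there is only one --sample_column option.

```shell
# Assumption: the executable is on PATH under this name; adjust if yours differs.
TOOL=scatac_fragment_tools

# Hypothetical inputs: comma-separated tables with the headers
# sample_id,fragments and sample_id,annotation,barcode.
set -- "$TOOL" split \
  -f samples.csv \
  -b barcodes.csv \
  -c hg38.chrom.sizes \
  -o split_output \
  -t ./tmp_split \
  -n 4 \
  -s , \
  --sample_column sample_id \
  --fragment_column fragments \
  --cell_type_column annotation \
  --cell_barcode_column barcode

# Show the full command for review.
cmd_line="$*"
echo "$cmd_line"

# Run it only if the tool is installed and the input table is present.
if command -v "$TOOL" >/dev/null 2>&1 && [ -f samples.csv ]; then
  mkdir -p split_output ./tmp_split
  "$@"
fi
```

Printing the command first makes it easy to check the column names against the table headers before any files are written.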
Examples of input files
sample_to_fragment.tsv
sample path_to_fragment_file
A a.fragments.tsv.gz
B b.fragments.tsv.gz
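With the default separator, the columns of this file must be separated by tab characters, which are easy to lose when copying the listing above. A minimal sketch that writes the same table with explicit tabs and checks the field count (the variable name bad_lines is only for illustration):

```shell
# Write the two-sample table shown above with real tab characters.
# Headers are the documented defaults: sample, path_to_fragment_file.
printf 'sample\tpath_to_fragment_file\n' > sample_to_fragment.tsv
printf 'A\ta.fragments.tsv.gz\n' >> sample_to_fragment.tsv
printf 'B\tb.fragments.tsv.gz\n' >> sample_to_fragment.tsv

# Sanity check: count lines that do not have exactly two tab-separated fields.
bad_lines=$(awk -F '\t' 'NF != 2 {n++} END {print n+0}' sample_to_fragment.tsv)
echo "lines with wrong field count: $bad_lines"
```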
cell_type_to_cell_barcode.tsv
sample cell_type cell_barcode
A type_1 TTAGCTTAGGAGAACA-1
A type_1 TTAGCTTAGGAGAACA-1
A type_1 ATATTCCTCTTGTACT-1
A type_2 TGTGACAGTACAACGG-1
A type_2 CATGCCTTCTCTGACC-1
A type_2 ATCGAGTAGGTTCGAG-1
A type_3 CTCTCAGGTCCCTTTG-1
A type_3 TTCGGTCTCACGTGTA-1
A type_3 GTGACATCATTGTTCT-1
A type_4 AAGGAGCCATCGACCG-1
A type_4 ACCAAACTCTTAAGCG-1
A type_4 CATTGGATCTCTTCCT-1
A type_5 AGGCGAAAGGTCTTTG-1
A type_5 AACGAGGCATCATGTG-1
A type_5 CTACTTAGTCATGAGG-1
B type_1 ATTACCTGTGTGCTTA-1
B type_1 CATAACGTCGGTTGTA-1
B type_1 ATGTCTTTCGGTCCGA-1
B type_2 CAATCCCGTAGCGTTT-1