
HiCPipeline

    class chr3d.HiCPipeline(
        genome_index: str,
        chrom_sizes: str,
        threads: int = 1,
        assembly: str = 'hg38',
        min_mapq: int = 30,
        min_distance: int = 1000,
        resolutions: Optional[List[int]] = None,
        n_splits: int = 0,
        call_tads: bool = True,
        tad_windows: Optional[List[int]] = None,
        call_loops: bool = True,
        loop_fdr: float = 0.1,
        call_compartments: bool = True,
        compartment_phasing_track: Optional[str] = None,
        fragment_bed: Optional[str] = None,
    )

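Orchestrates the complete Hi-C processing workflow in a single pipeline: alignment, SAM/BAM processing, pairs filtering, and contact matrix generation, optionally followed by TAD, loop, and A/B compartment calling.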

Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| genome_index | str | Path to BWA-indexed genome FASTA |
| chrom_sizes | str | Path to chromosome sizes file |
| threads | int | Number of threads for parallel processing (default: 1) |
| assembly | str | Genome assembly name (default: 'hg38') |
| min_mapq | int | Minimum mapping quality (default: 30) |
| min_distance | int | Minimum pair distance in bp (default: 1000) |
| resolutions | Optional[List[int]] | List of matrix resolutions in bp (default: [1000, 5000, 10000, 25000, 50000, 100000]) |
| n_splits | int | Split FASTQ into N chunks for parallel alignment; 0 = no splitting (default: 0) |
| call_tads | bool | Run TAD/insulation calling after matrix generation (default: True) |
| tad_windows | Optional[List[int]] | Window sizes in bp for insulation scoring (default: library defaults) |
| call_loops | bool | Run loop calling after matrix generation (default: True) |
| loop_fdr | float | FDR threshold for loop significance (default: 0.1) |
| call_compartments | bool | Run A/B compartment calling (default: True) |
| compartment_phasing_track | Optional[str] | Path to BED file for phasing E1 sign (default: None) |
| fragment_bed | Optional[str] | Path to restriction fragment BED (default: None) |
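
Before a long run it can help to sanity-check `chrom_sizes` against the chosen `resolutions`. The sketch below assumes the chromosome sizes file uses the conventional two-column, whitespace-separated layout (chromosome name, length in bp) and that matrices use fixed-width bins that restart on each chromosome; this page specifies neither. `read_chrom_sizes` and `bins_per_resolution` are hypothetical helpers, not part of chr3d.

```python
import math
from typing import Dict, Iterable, List


def read_chrom_sizes(lines: Iterable[str]) -> Dict[str, int]:
    """Parse 'name<TAB>length' lines (assumed format) into a dict."""
    sizes: Dict[str, int] = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, length = line.split()[:2]
        sizes[name] = int(length)
    return sizes


def bins_per_resolution(sizes: Dict[str, int], resolutions: List[int]) -> Dict[int, int]:
    """Count genome-wide bins per resolution, assuming bins never span chromosomes."""
    return {
        res: sum(math.ceil(length / res) for length in sizes.values())
        for res in resolutions
    }


# Toy input; with a real file, pass an open file handle instead of this list.
toy_lines = ["chrA\t2500", "chrB\t1000"]
toy_sizes = read_chrom_sizes(toy_lines)
print(bins_per_resolution(toy_sizes, [1000, 5000]))
```

The bin counts give a rough feel for how large the finest-resolution matrix will be, which is usually what dominates memory and disk use.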

Methods

run

    def run(
        self,
        fastq1: Optional[str] = None,
        fastq2: Optional[str] = None,
        output_dir: str = './results',
        sample_id: str = 'sample',
        cleanup: bool = False,
        start_from: int = 1,
    ) -> Dict[str, Any]

Run the complete Hi-C pipeline, or resume from a later step.

Parameters:

| Parameter | Type | Description |
| --- | --- | --- |
| fastq1 | Optional[str] | Path to R1 FASTQ file |
| fastq2 | Optional[str] | Path to R2 FASTQ file |
| output_dir | str | Output directory (default: './results') |
| sample_id | str | Sample identifier (default: 'sample') |
| cleanup | bool | Remove intermediate files (default: False) |
| start_from | int | Step to resume from: 1=alignment, 2=SAM/BAM, 3=pairs, 4=matrix (default: 1) |
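
Because `start_from` is a bare integer, a thin wrapper can catch mistakes before any work starts. The following is a minimal sketch: `resume_run` is a hypothetical helper, not a chr3d function, and the rule that step 1 needs both FASTQ paths is an assumption inferred from the signature rather than documented behavior. It takes any object exposing the `run()` method described above.

```python
from typing import Any, Dict, Optional

# Step numbers and names as listed for start_from in the table above.
STEPS = {1: "alignment", 2: "SAM/BAM", 3: "pairs", 4: "matrix"}


def resume_run(
    pipeline: Any,
    start_from: int,
    output_dir: str,
    sample_id: str,
    fastq1: Optional[str] = None,
    fastq2: Optional[str] = None,
) -> Dict[str, Any]:
    """Validate start_from, then delegate to pipeline.run()."""
    if start_from not in STEPS:
        raise ValueError(f"start_from must be one of {sorted(STEPS)}, got {start_from}")
    # Assumption: a run that begins at alignment needs both FASTQ files.
    if start_from == 1 and not (fastq1 and fastq2):
        raise ValueError("fastq1 and fastq2 are required when starting from alignment")
    print(f"Starting at step {start_from} ({STEPS[start_from]})")
    return pipeline.run(
        fastq1=fastq1,
        fastq2=fastq2,
        output_dir=output_dir,
        sample_id=sample_id,
        start_from=start_from,
    )
```

With a `HiCPipeline` instance `hic`, `resume_run(hic, 3, "results/", "sample1")` would restart at the pairs step. A resumed run presumably relies on intermediate files left in the same `output_dir` by an earlier run with the same `sample_id`; that reuse is an assumption, so it is worth confirming before setting `cleanup=True`.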

Returns:

Dict[str, Any] containing:

  • 'output_sam': Path to aligned SAM file
  • 'sorted_bam': Path to sorted BAM file
  • 'filtered_pairs': Path to filtered pairs file
  • 'cool_file': Path to contact matrix .cool file
  • 'mcool_file': Path to multi-resolution .mcool file
  • 'timing': Step-by-step timing breakdown
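
The returned dictionary can be turned into a short run report. This sketch uses only the keys listed above; `summarize` is a hypothetical helper, and the shape of `'timing'` (a mapping from step name to seconds) is an assumption, since this page only calls it a step-by-step breakdown.

```python
import os
from typing import Any, Dict, List

# Path-valued keys from the Returns list above ('timing' is handled separately).
PATH_KEYS = ["output_sam", "sorted_bam", "filtered_pairs", "cool_file", "mcool_file"]


def summarize(result: Dict[str, Any]) -> List[str]:
    """Build one report line per output path, then one per timed step."""
    lines = []
    for key in PATH_KEYS:
        path = result.get(key)
        if path is None:
            status = "not reported"
        elif os.path.exists(path):
            status = "present"
        else:
            status = "missing on disk"
        lines.append(f"{key}: {path} ({status})")
    # Assumption: 'timing' maps step names to elapsed seconds.
    timing = result.get("timing") or {}
    for step, seconds in timing.items():
        lines.append(f"time[{step}]: {seconds:.1f} s")
    return lines


# Mock result for illustration only; real values come from HiCPipeline.run().
mock = {"mcool_file": "results/sample1.mcool", "timing": {"alignment": 12.34}}
print("\n".join(summarize(mock)))
```

A path flagged as missing on disk is not necessarily an error: with `cleanup=True`, intermediate files such as the SAM may have been removed on purpose.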

Example:

    import chr3d as c3d

    hic = c3d.HiCPipeline(
        genome_index="/data/genomes/hg38.fa",
        chrom_sizes="/data/genomes/hg38.chrom.sizes",
        threads=24
    )

    stats = hic.run(
        fastq1="sample_R1.fastq.gz",
        fastq2="sample_R2.fastq.gz",
        output_dir="results/",
        sample_id="sample1"
    )

    print(f"Output mcool: {stats['mcool_file']}")