Software for working with next-generation sequence reads that was developed at the Department of General and Molecular Botany
trim_fastq2.pl
Perl program for trimming reads in fastq format from the 5' and 3' ends.
Sequencing quality often deteriorates towards the 3' end, and several low-quality bases at the 5' end are also not unusual. Downstream applications (de novo assembly, mapping, etc.) usually benefit from read trimming.
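The idea of trimming from both ends can be illustrated with a minimal Python sketch. It is not the code of trim_fastq2.pl, and the trimming rule is an assumption: bases are removed from the 5' and 3' ends until the first base with a quality score of at least min_q is reached. The function name trim_read, the default threshold of 20, and the Phred+33 quality encoding are likewise assumptions made for illustration.

```python
def trim_read(seq, qual, min_q=20, offset=33):
    """Trim low-quality bases from both ends of one read.

    seq and qual are the sequence and quality strings of a fastq record.
    min_q and offset (Phred+33) are illustrative assumptions.
    Returns the trimmed (seq, qual) pair; both are empty if no base passes.
    """
    scores = [ord(char) - offset for char in qual]
    start, end = 0, len(scores)
    # Move the 5' boundary right, past leading low-quality bases.
    while start < end and scores[start] < min_q:
        start += 1
    # Move the 3' boundary left, past trailing low-quality bases.
    while end > start and scores[end - 1] < min_q:
        end -= 1
    return seq[start:end], qual[start:end]
```

Calling trim_read("ACGTACGT", "##IIII#!") would keep only the four central bases, because the two leading and two trailing bases have scores below 20 under Phred+33.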
Perl program for calculating an overview of quality scores for reads in a fastq file.
The program takes a fastq file and calculates summary statistics for the quality scores. For each base position,
the mean quality score is calculated and the quality scores are sorted into six bins: 0-9, 10-19, 20-29, 30-39, 40-49, and >=50.
The output gives the mean quality score and, for each bin, the number of bases and the percentage of all bases at that position.
Useful to determine if reads need trimming or if trimming was effective.
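The per-position summary described above can be sketched in Python as follows. This is an illustration of the calculation, not the Perl program itself. The function name quality_overview, the Phred+33 encoding, and the shape of the returned dictionary are assumptions.

```python
from collections import defaultdict

BIN_LABELS = ["0-9", "10-19", "20-29", "30-39", "40-49", ">=50"]

def quality_overview(quality_strings, offset=33):
    """Summarise quality scores per base position (1-based).

    Returns {position: (mean_score, counts_per_bin, percent_per_bin)},
    with bins in the order given by BIN_LABELS. Phred+33 is assumed.
    """
    totals = defaultdict(int)
    bins = defaultdict(lambda: [0] * len(BIN_LABELS))
    for qual in quality_strings:
        for pos, char in enumerate(qual, start=1):
            score = ord(char) - offset
            totals[pos] += score
            # Scores of 50 and above share the last bin; negatives go to the first.
            bins[pos][min(max(score, 0) // 10, len(BIN_LABELS) - 1)] += 1
    overview = {}
    for pos in sorted(bins):
        n_bases = sum(bins[pos])
        percents = [100.0 * count / n_bases for count in bins[pos]]
        overview[pos] = (totals[pos] / n_bases, bins[pos], percents)
    return overview
```

Because reads may differ in length, the percentages at each position are computed relative to the number of reads that actually reach that position.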
Perl program for splitting a single paired-read file into two files.
The program splits a fastq file with consecutive paired reads into two files containing the first and second reads of each pair, respectively.
Useful for downstream applications that require paired reads to be in two separate files.
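A minimal Python sketch of the splitting step is shown below. It is not the Perl program itself, and it assumes that every fastq record occupies exactly four lines (no wrapped sequences) and that the two reads of a pair directly follow each other. The function name split_interleaved is hypothetical.

```python
def split_interleaved(lines):
    """Split interleaved fastq lines into (first_reads, second_reads).

    lines is a list of fastq lines in which each pair occupies eight
    consecutive lines: four for the first read, four for the second.
    """
    if len(lines) % 8 != 0:
        raise ValueError("expected complete pairs of 4-line fastq records")
    first, second = [], []
    for i in range(0, len(lines), 8):
        first.extend(lines[i:i + 4])       # first read of the pair
        second.extend(lines[i + 4:i + 8])  # second read of the pair
    return first, second
```

The two returned lists can then be written to separate files, for example with writelines() on two open file handles.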
Perl program for calculating some basic statistics for multiple nucleic acid sequences in fasta format.
Calculates e.g. length, GC content, number of gaps, and N50 value for the total sequence and some stats for individual sequences.
Useful e.g. to get some basic stats on genome assemblies.
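Two of the statistics mentioned above, the N50 value and the GC content, can be sketched in Python as follows. This is an illustration, not the Perl program itself. The function names are hypothetical, and the definitions are assumptions: N50 is taken as the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total length, and GC content is computed over the called bases A, C, G, and T only, ignoring Ns and gap characters.

```python
def n50(lengths):
    """Return the N50 of a list of sequence lengths (0 for an empty list)."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Stop at the first sequence that brings the sum to half the total.
        if 2 * running >= total:
            return length
    return 0

def gc_content(seq):
    """Return the GC percentage among A, C, G, T bases (assumed definition)."""
    seq = seq.upper()
    called = sum(seq.count(base) for base in "ACGT")
    if called == 0:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / called
```

For an assembly with contig lengths 2, 3, 4, 5, and 6, the two longest contigs already cover 11 of 20 bases, so n50 returns 5.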