GSP incorporates a Bayesian framework and the EM algorithm to predict genome size without de novo assembly. The approach uses an l-mer hash to gather the basic frequency statistics, then feeds the l-mer frequency table into a Bayesian estimation (BE) and EM iteration to estimate the genome size. The software is written in C and accepts FASTQ and FASTA files as input. It supports both simulated data and real data sets, the latter with an extra error model.
Estimate your genome size in minutes with GSP!
Main Features
- Fast calculation of k-mer (l-mer) frequencies from the sequence data
- Bayesian framework for frequency estimation
- EM iterative algorithm for genome size prediction
- Works with simulated data and with real data using an extra error model
- Especially advantageous on small data sets, even when the sequence data size is reduced to 2-4 fold
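The l-mer frequency step from the feature list can be sketched in C as follows. This is a minimal illustration under stated assumptions, not GSP's source code: the function names `count_lmers` and `frequency_histogram` are hypothetical, a direct-addressed table of 4^l counters stands in for GSP's l-mer hash (so l is limited to 12 here), and the Bayesian and EM steps are not shown.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Map a nucleotide to a 2-bit code; return -1 for anything else (e.g. 'N'). */
static int base_code(char c)
{
    switch (c) {
    case 'A': case 'a': return 0;
    case 'C': case 'c': return 1;
    case 'G': case 'g': return 2;
    case 'T': case 't': return 3;
    default:            return -1;
    }
}

/*
 * Count every l-mer of the NUL-terminated sequence `seq` into `table`,
 * which must hold 4^l zero-initialised counters. Each l-mer is packed
 * into an integer key, 2 bits per base. Windows containing a non-ACGT
 * character are skipped. Returns the number of l-mers counted.
 */
size_t count_lmers(const char *seq, int l, uint32_t *table)
{
    assert(l >= 1 && l <= 12);               /* keeps the table at <= 4^12 entries */
    const uint32_t mask = (1u << (2 * l)) - 1u;
    uint32_t key = 0;
    int valid = 0;                            /* valid bases in the current window */
    size_t counted = 0;

    for (const char *p = seq; *p != '\0'; ++p) {
        int code = base_code(*p);
        if (code < 0) {                       /* ambiguous base: restart the window */
            valid = 0;
            key = 0;
            continue;
        }
        key = ((key << 2) | (uint32_t)code) & mask;
        if (valid < l)
            valid++;
        if (valid == l) {
            table[key]++;
            counted++;
        }
    }
    return counted;
}

/*
 * Build the frequency-of-frequencies histogram: hist[f] is the number of
 * distinct l-mers seen exactly f times. Frequencies above max_freq are
 * pooled into hist[max_freq]. `hist` must hold max_freq + 1 entries.
 */
void frequency_histogram(const uint32_t *table, size_t table_size,
                         size_t *hist, size_t max_freq)
{
    memset(hist, 0, (max_freq + 1) * sizeof *hist);
    for (size_t i = 0; i < table_size; ++i) {
        uint32_t f = table[i];
        if (f == 0)
            continue;                         /* l-mer never observed */
        hist[f > max_freq ? max_freq : f]++;
    }
}
```

For example, calling `count_lmers("ACGTACGT", 2, table)` with a zeroed 16-entry table counts 7 overlapping 2-mers (AC, CG and GT twice each, TA once). A histogram of this kind is the sort of frequency table that an estimator would take as input; how GSP's Bayesian and EM stages consume it is not described here.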
(An Efficient l-mer frequency genome size predictor)
Copyright, 2009 - 2010, The Zhejiang University, China. All Rights reserved.
Permission is granted to download and use GSP freely for academic purposes. Use by non-academics requires a license. Contact: Zhejiang University Ph.D. Email: shangood@zju.edu.cn.