GSP combines a Bayesian framework with the EM algorithm to predict genome size without de novo assembly. It uses an l-mer hash to collect the basic frequency statistics, then feeds the l-mer frequency table into a Bayesian estimation (BE) and EM iteration to estimate the genome size. The software is written in C and accepts input files in FASTQ/FASTA format. It supports simulated data as well as real data sets, for which it applies an additional error model.
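The l-mer counting step described above can be illustrated with a minimal sketch. This is not GSP's C implementation; the function names `count_lmers` and `frequency_table` are hypothetical, and a Python `Counter` stands in for the l-mer hash. It shows how reads are turned into per-l-mer counts and then into a frequency table (how many distinct l-mers occur once, twice, and so on).

```python
from collections import Counter


def count_lmers(reads, l):
    """Count every length-l substring (l-mer) across all reads using a hash table."""
    counts = Counter()
    for read in reads:
        read = read.upper()
        for i in range(len(read) - l + 1):
            lmer = read[i:i + l]
            if "N" not in lmer:  # skip l-mers that contain ambiguous bases
                counts[lmer] += 1
    return counts


def frequency_table(counts):
    """Map each occurrence count c to the number of distinct l-mers seen c times."""
    return Counter(counts.values())


# Toy input: two overlapping reads (illustrative data, not from GSP).
reads = ["ACGTAC", "CGTACG"]
counts = count_lmers(reads, 3)
table = frequency_table(counts)
print(dict(table))
```

A real run would parse reads from a FASTQ/FASTA file and would likely also fold each l-mer with its reverse complement; both are left out here to keep the sketch short.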

      Estimate your genome size in minutes with GSP!

Free download now

     Main Features
  • Fast calculation of l-mer (k-mer) frequencies from the sequence data.
  • Bayesian framework for frequency estimation.
  • EM iterative algorithm for genome size prediction.
  • Works with simulated data, and with real data through an extra error model.
  • Particularly advantageous at low coverage, even when the sequence data is reduced to 2-4 fold.
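To give a feel for how an EM iteration can turn an l-mer frequency table into a genome size estimate at low coverage, here is a simplified sketch. It is not GSP's actual algorithm: it assumes that every genomic l-mer is unique, that reads are error-free, and that the number of times each l-mer is sampled follows a Poisson distribution with mean `lam`. The function name `em_genome_size` is hypothetical. Because l-mers sampled zero times are never observed, the EM loop alternates between estimating how many l-mers are missing and re-estimating `lam`.

```python
import math


def em_genome_size(freq_table, iterations=200, tol=1e-9):
    """Fit a zero-truncated Poisson to an l-mer frequency table with EM.

    freq_table maps an occurrence count c (c >= 1) to the number of distinct
    l-mers observed c times. Returns (estimated distinct l-mers, lam).
    Assumes some l-mers occur more than once; if every count is 1, lam
    drifts toward zero and the estimate is not meaningful.
    """
    n_obs = sum(freq_table.values())                    # distinct l-mers seen
    total = sum(c * n for c, n in freq_table.items())   # all l-mer occurrences
    lam = total / n_obs                                 # initial guess
    for _ in range(iterations):
        p0 = math.exp(-lam)                  # probability an l-mer is unseen
        n0 = n_obs * p0 / (1.0 - p0)         # E-step: expected unseen l-mers
        new_lam = total / (n_obs + n0)       # M-step: mean over all l-mers
        converged = abs(new_lam - lam) < tol
        lam = new_lam
        if converged:
            break
    size = n_obs / (1.0 - math.exp(-lam))    # seen plus expected unseen
    return size, lam


# Toy frequency table (illustrative numbers, not GSP output).
example = {1: 150, 2: 225, 3: 225, 4: 170, 5: 100, 6: 50, 7: 20, 8: 10}
size, lam = em_genome_size(example)
print(round(size), round(lam, 2))
```

Under these assumptions the number of distinct l-mers approximates the genome size. Real data would need the extra error model mentioned above, since sequencing errors inflate the number of l-mers seen only once.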
     What new features would you like to see in a future release? - Tell us right now!

If you want to know the genome size before de novo assembly, GSP is a definite must-have. It is beyond simple!

Welcome to GSP HomePage

(An Efficient l-mer frequency genome size predictor)

Copyright, 2009 - 2010, The Zhejiang University, China.  All Rights reserved.

Permission is granted to download and use GSP freely for academic purposes. Use by non-academics requires a license. Contact: Zhejiang University Ph.D. Email:

