Putting together the various planarian transcriptomes

I gave an overview of the 3 planarian transcriptomes in my last blog post. Since we do not have immediate access to all the raw data that went into the transcriptomes, we have to resort to merging the assembled transcriptomes according to their strengths and weaknesses.

Here are some thoughts on the 3 transcriptomes that will need to be addressed when performing the merge:

  • The BIMSB transcriptome contains the most full-length, least redundant set of transcripts, but it also has the fewest transcripts.

  • The AAA and Heidelberg transcriptomes have better coverage of the transcriptome due to high depth of sequencing.

  • The BIMSB and AAA transcriptomes both have a small population of very short sequences. It looks like Heidelberg discarded transcripts below 100 base pairs.

  • The Heidelberg transcriptome contains a lot of Ns (unknown bases).

  • There may be some isoform information contained in all the transcriptomes.

  • The AAA transcriptome may contain elements of both asexual and sexual sequences since the SOLiD reads were reference assembled.

  • Strandedness in the BIMSB and Heidelberg transcripts is mainly based on ORF or homology evidence.
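
Two of the points above, the very short sequences and the Ns, are easy to quantify before attempting a merge. Here is a minimal sketch of how that could be done in plain Python. The function names (`read_fasta`, `profile`) and the toy input are my own illustration rather than part of any of the three pipelines, and the 100 bp cutoff is simply borrowed from what Heidelberg appears to have used.

```python
from io import StringIO


def read_fasta(handle):
    """Yield (header, sequence) pairs from an open FASTA handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts: emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def profile(handle, min_len=100):
    """Summarise transcript count, short sequences and N content.

    min_len is an assumed cutoff (100 bp), not a value taken from
    any of the published assemblies.
    """
    n_seqs = n_short = n_with_n = total_bases = total_n = 0
    for _, seq in read_fasta(handle):
        seq = seq.upper()
        n_count = seq.count("N")
        n_seqs += 1
        total_bases += len(seq)
        total_n += n_count
        if len(seq) < min_len:
            n_short += 1
        if n_count > 0:
            n_with_n += 1
    return {
        "transcripts": n_seqs,
        "shorter_than_min_len": n_short,
        "transcripts_with_N": n_with_n,
        "total_bases": total_bases,
        "fraction_N": total_n / total_bases if total_bases else 0.0,
    }


if __name__ == "__main__":
    # Toy input standing in for a real transcriptome FASTA file.
    toy = StringIO(">t1\nACGTNNAC\n>t2\n" + "A" * 120 + "\n")
    print(profile(toy))
```

For a real file, the `StringIO` object would be replaced by something like `open("transcriptome.fasta")` (a placeholder filename). Running the same profile on all three assemblies would give comparable numbers for deciding on a common length cutoff and on how to treat N-rich transcripts.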

