REPCLASS: A Software Workflow Toolset for Automated Classification of
Transposable Elements
Umeshkumar Keswani,
Nirmal Ranganathan, Cedric Feschotte and David Levine
Whole genomes for new species are being sequenced
at an ever-increasing pace. This creates an urgent need for
software tools that can assist automated (or semi-automated)
analysis for fundamental biological understanding. While
computers become much faster and hold much more data every
year, utilizing these computational resources effectively is quite
challenging. Analyzing the information in a new genome
quickly yet accurately, and creating biologically important
summary overviews without drowning the reader in
overwhelming detail, is an ambitious yet worthwhile goal. The
DNA in genome sequences is very repetitive, and finding and
classifying those repeated segments has been a very tedious yet
valuable effort. In this work, we present REPCLASS, a software
workflow toolset that automatically classifies transposable
elements (TEs) in genomes. REPCLASS provides biologically
valuable reports and views, allowing a quick overview of a
genome. To provide a fast response time, REPCLASS
scales across clusters of computers, dividing large
computational tasks into pieces that run on many
computational nodes concurrently. In addition to running
quickly, the REPCLASS workflow eliminates many artifacts of
automated classification, providing more accurate results to
scientists.
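
The idea of dividing a large task into pieces that run concurrently can be illustrated with a minimal sketch. It is not taken from REPCLASS itself: the function names (`split_into_chunks`, `process_chunk`, `run_concurrently`) are hypothetical, the per-element work is a stand-in that only measures sequence length, and local threads stand in for cluster nodes.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(items, n_chunks):
    """Split items into at most n_chunks contiguous pieces of near-equal size."""
    if n_chunks < 1:
        raise ValueError("n_chunks must be at least 1")
    size, extra = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        # The first `extra` chunks each take one additional item.
        end = start + size + (1 if i < extra else 0)
        if end > start:  # skip empty chunks when there are few items
            chunks.append(items[start:end])
        start = end
    return chunks


def process_chunk(chunk):
    # Hypothetical stand-in for per-element work (assumption: real
    # classification would go here); it only reports sequence length.
    return [(name, len(seq)) for name, seq in chunk]


def run_concurrently(items, n_workers=4):
    """Process chunks in parallel workers and merge results in input order."""
    chunks = split_into_chunks(items, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # Executor.map yields results in the order the chunks were submitted.
        results = list(pool.map(process_chunk, chunks))
    return [record for part in results for record in part]
```

Because each chunk is processed independently and the results are merged in submission order, the output does not depend on how many workers are used; on a real cluster, the thread pool would be replaced by a job scheduler or a distributed executor.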
Index Terms
Software tool, transposable elements, automated classification.