Class Transcript

All Implemented Interfaces:
Serializable, Cloneable, Comparable<Interval>, Iterable<Exon>, TxtSerializable

public class Transcript extends IntervalAndSubIntervals<Exon>
Interval for a transcript, as well as some other information: exons, UTRs, CDSs, etc.
Author:
pcingola
  • Constructor Details

    • Transcript

      public Transcript()
    • Transcript

      public Transcript(Gene gene, int start, int end, boolean strandMinus, String id)
  • Method Details

    • aaNumber2Pos

      public int[] aaNumber2Pos()
      Calculate chromosome position as a function of amino acid number. Note that this returns the chromosomal position of the first base of each amino acid.

      If you need the chromosomal position of each base

    • aaNumber2Pos

      public int aaNumber2Pos(int aaNum)
      Find the genomic position of the first base of amino acid 'aaNum'
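      The mapping from amino acid number to genomic position can be sketched as follows. This is a minimal illustration, not the library's implementation: the class AaNumber2PosSketch and its int[][] interval representation are hypothetical, and it assumes a plus-strand transcript, zero-based inclusive coordinates, and zero-based amino acid numbers.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not part of the Transcript API. */
final class AaNumber2PosSketch {

    /**
     * Genomic position of the first base of every amino acid.
     * Assumes: plus strand, zero-based inclusive coding intervals {start, end}
     * sorted by start, and zero-based amino acid numbers.
     */
    static int[] aaNumber2Pos(int[][] codingIntervals) {
        // Enumerate the genomic position of every coding base, in CDS order
        List<Integer> cds2pos = new ArrayList<>();
        for (int[] interval : codingIntervals) {
            for (int pos = interval[0]; pos <= interval[1]; pos++) {
                cds2pos.add(pos);
            }
        }

        // Amino acid 'aa' starts at CDS base 3 * aa; an incomplete trailing codon is dropped
        int[] aa2pos = new int[cds2pos.size() / 3];
        for (int aa = 0; aa < aa2pos.length; aa++) {
            aa2pos[aa] = cds2pos.get(3 * aa);
        }
        return aa2pos;
    }
}
```

      With coding intervals {10, 13} and {20, 24}, the second amino acid starts at position 13 and the third at position 22, because its codon begins after the intron.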
    • add

      public void add(Cds cdsInt)
      Add a CDS
    • add

      public void add(Intron intron)
      Add an intron
    • add

      public void add(SpliceSite spliceSite)
      Add a SpliceSite
    • add

      public void add(Utr utr)
      Add a UTR
    • adjust

      public boolean adjust()
      Adjust transcript coordinates
    • apply

      public Transcript apply(Variant variant)
      Create a new transcript after applying changes in variant

      Note: If this transcript is unaffected, no new transcript is created (same transcript is returned)

      Overrides:
      apply in class IntervalAndSubIntervals<Exon>
      Returns:
      The marker result after applying variant
    • baseAt

      public String baseAt(int pos)
      Find base at genomic coordinate 'pos'
    • baseNumber2MRnaPos

      public int baseNumber2MRnaPos(int pos)
      Calculate the distance from the transcript start to a position. mRNA is roughly the same as cDNA; strictly speaking, mRNA also has a poly-A tail and a 5' cap.
    • baseNumberCds

      public int baseNumberCds(int pos, boolean usePrevBaseIntron)
      Calculate the CDS base number to which 'pos' maps
      Parameters:
      usePrevBaseIntron - When 'pos' is intronic, this method returns:
        - if (usePrevBaseIntron == false): the first base in the exon after 'pos' (i.e. the first coding base after the intron)
        - if (usePrevBaseIntron == true): the last base in the exon before 'pos' (i.e. the last coding base before the intron)
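      The intronic behaviour of 'usePrevBaseIntron' can be illustrated with a small sketch. The class BaseNumberCdsSketch is hypothetical and is not the library's code; it assumes a plus-strand transcript, zero-based inclusive coding intervals sorted by start, and a return value of -1 for positions outside the coding span (the real method's out-of-range behaviour is not stated here).

```java
/** Hypothetical helper for illustration; not part of the Transcript API. */
final class BaseNumberCdsSketch {

    /**
     * Zero-based CDS base number for genomic position 'pos'.
     * Assumes plus strand and sorted, non-overlapping, zero-based inclusive intervals.
     * Returns -1 (an assumption of this sketch) when 'pos' is outside the coding span.
     */
    static int baseNumberCds(int[][] coding, int pos, boolean usePrevBaseIntron) {
        int basesBefore = 0; // coding bases in intervals that end before 'pos'
        for (int i = 0; i < coding.length; i++) {
            int start = coding[i][0];
            int end = coding[i][1];
            if (pos < start) {
                if (i == 0) return -1; // before the first coding base
                // 'pos' is intronic: pick the neighbouring coding base
                return usePrevBaseIntron ? basesBefore - 1 : basesBefore;
            }
            if (pos <= end) return basesBefore + (pos - start); // inside this interval
            basesBefore += end - start + 1;
        }
        return -1; // after the last coding base
    }
}
```

      For coding intervals {10, 13} and {20, 24}, the intronic position 15 maps to CDS base 4 when usePrevBaseIntron is false and to CDS base 3 when it is true.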
    • baseNumberCds2Codon

      public String baseNumberCds2Codon(int cdsBaseNumber)
      Return a codon that includes 'cdsBaseNumber'
    • baseNumberCds2Pos

      public int[] baseNumberCds2Pos()
      Calculate chromosome position as function of CDS number
    • baseNumberCds2Pos

      public int baseNumberCds2Pos(int cdsBaseNum)
    • cds

      public String cds()
      Retrieve coding sequence
    • cdsMarker

      public Marker cdsMarker()
      Create a marker of the coding region in this transcript
    • cloneShallow

      public Transcript cloneShallow()
      Description copied from class: Marker
      Perform a shallow clone
      Overrides:
      cloneShallow in class IntervalAndSubIntervals<Exon>
    • codonNumber2Pos

      public int[] codonNumber2Pos(int codonNum)
      Return an array of 3 genomic positions where codon number 'codonNum' maps
      Returns:
      aa2pos[0], aa2pos[1], aa2pos[2] are the coordinates (within the chromosome) of the three bases forming codon 'codonNum'. Any aa2pos[i] = -1 means that a base in the codon could not be mapped.

      Bases in the array are sorted by chromosome position, so aa2pos[0] < aa2pos[1] < aa2pos[2]
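      A minimal sketch of this lookup, assuming a precomputed array that maps each zero-based CDS base number to its genomic position (similar in spirit to what baseNumberCds2Pos() returns). The class CodonNumber2PosSketch is hypothetical; leaving partially mapped codons unsorted is a choice of this sketch, not documented behaviour.

```java
import java.util.Arrays;

/** Hypothetical helper for illustration; not part of the Transcript API. */
final class CodonNumber2PosSketch {

    /**
     * Genomic positions of the three bases of codon 'codonNum' (zero-based).
     * 'cds2pos' maps CDS base number to genomic position; on the minus strand
     * it is decreasing, so the result is sorted to restore chromosome order.
     */
    static int[] codonNumber2Pos(int[] cds2pos, int codonNum) {
        int[] codon = {-1, -1, -1}; // -1 marks a base that could not be mapped
        if (codonNum < 0) return codon;

        boolean allMapped = true;
        for (int i = 0; i < 3; i++) {
            int cdsBase = 3 * codonNum + i;
            if (cdsBase < cds2pos.length) codon[i] = cds2pos[cdsBase];
            else allMapped = false;
        }

        // Sort only fully mapped codons, so that -1 markers keep their slots
        if (allMapped) Arrays.sort(codon);
        return codon;
    }
}
```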

    • collapseZeroGap

      public boolean collapseZeroGap()
      Collapses exons having gaps of zero (i.e. exons that are immediately followed by other exons). Does the same for CDSs and UTRs.
      Returns:
      true if any exon in the transcript was 'collapsed'
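      The idea of collapsing zero-gap intervals can be sketched as below. CollapseZeroGapSketch is a hypothetical helper, not the library's implementation; it assumes zero-based inclusive intervals {start, end} that are already sorted by start.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not part of the Transcript API. */
final class CollapseZeroGapSketch {

    /**
     * Merges each interval into the previous one when there is no gap between them
     * (next.start == previous.end + 1). Input must be sorted by start.
     */
    static List<int[]> collapse(List<int[]> sorted) {
        List<int[]> collapsed = new ArrayList<>();
        for (int[] interval : sorted) {
            int[] last = collapsed.isEmpty() ? null : collapsed.get(collapsed.size() - 1);
            if (last != null && interval[0] == last[1] + 1) {
                last[1] = interval[1]; // zero gap: extend the previous interval
            } else {
                collapsed.add(new int[] {interval[0], interval[1]}); // copy, keep input intact
            }
        }
        return collapsed;
    }
}
```

      Comparing the size of the result with the size of the input is one way to derive a boolean like the one this method returns.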
    • cpgExonBias

      public double cpgExonBias()
      Calculate CpG bias: number of CpG / expected[CpG]
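      As an illustration of an observed/expected CpG ratio, the sketch below uses a common definition in which the expected CpG count is (number of C × number of G) / sequence length. That definition is an assumption; the formula actually used by this method is not given here. CpgBiasSketch is a hypothetical name.

```java
/** Hypothetical helper for illustration; not part of the Transcript API. */
final class CpgBiasSketch {

    /**
     * Observed CpG count divided by the expected count, where
     * expected = (#C * #G) / length (an assumed, commonly used definition).
     * Returns 0.0 when the sequence has no C or no G.
     */
    static double cpgObservedOverExpected(String sequence) {
        String seq = sequence.toUpperCase();
        int countC = 0, countG = 0, countCpg = 0;
        for (int i = 0; i < seq.length(); i++) {
            char base = seq.charAt(i);
            if (base == 'C') {
                countC++;
                // A CpG is a C immediately followed by a G
                if (i + 1 < seq.length() && seq.charAt(i + 1) == 'G') countCpg++;
            } else if (base == 'G') {
                countG++;
            }
        }
        if (countC == 0 || countG == 0) return 0.0;
        double expected = (double) countC * countG / seq.length();
        return countCpg / expected;
    }
}
```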
    • cpgExons

      public int cpgExons()
      Count total CpG in this transcript's exons
    • createSpliceSites

      public void createSpliceSites(int spliceSiteSize, int spliceRegionExonSize, int spliceRegionIntronMin, int spliceRegionIntronMax)
      Find all splice sites.
    • createUpDownStream

      public void createUpDownStream(int upDownLength)
      Creates a list of upstream/downstream regions (for each transcript). The upstream (downstream) region is defined as the 'upDownLength' bases before (after) the transcript.
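      The following sketch shows how such regions could be derived from transcript coordinates and strand. UpDownStreamSketch is hypothetical; it assumes zero-based inclusive coordinates and, as a simplification, does not clamp the regions to chromosome bounds.

```java
/** Hypothetical helper for illustration; not part of the Transcript API. */
final class UpDownStreamSketch {

    /**
     * Returns {upstreamStart, upstreamEnd, downstreamStart, downstreamEnd}.
     * "Before" and "after" follow the direction of transcription, so on the
     * minus strand the upstream region lies at higher coordinates.
     */
    static int[] upDownStream(int start, int end, boolean strandMinus, int upDownLength) {
        int[] lower = {start - upDownLength, start - 1}; // region at lower coordinates
        int[] higher = {end + 1, end + upDownLength};    // region at higher coordinates
        int[] upstream = strandMinus ? higher : lower;
        int[] downstream = strandMinus ? lower : higher;
        return new int[] {upstream[0], upstream[1], downstream[0], downstream[1]};
    }
}
```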
    • deleteRedundant

      public boolean deleteRedundant()
      Deletes redundant exons (i.e. exons that are totally included in other exons). Does the same for CDSs. Does the same for UTRs.
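      A minimal sketch of removing intervals that are totally included in others. DeleteRedundantSketch is a hypothetical helper, and keeping the first copy of identical intervals is a choice of this sketch rather than documented behaviour. Intervals are zero-based inclusive {start, end} pairs.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not part of the Transcript API. */
final class DeleteRedundantSketch {

    /** Returns the intervals that are not totally included in another interval. */
    static List<int[]> deleteRedundant(List<int[]> intervals) {
        List<int[]> kept = new ArrayList<>();
        for (int i = 0; i < intervals.size(); i++) {
            int[] a = intervals.get(i);
            boolean redundant = false;
            for (int j = 0; j < intervals.size() && !redundant; j++) {
                if (i == j) continue;
                int[] b = intervals.get(j);
                boolean included = b[0] <= a[0] && a[1] <= b[1];
                boolean identical = a[0] == b[0] && a[1] == b[1];
                // Identical intervals include each other: drop only the later copies
                if (included && (!identical || j < i)) redundant = true;
            }
            if (!redundant) kept.add(a);
        }
        return kept;
    }
}
```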
    • findCds

      public Cds findCds(Exon exon)
      Find a CDS that matches exactly the exon
    • findExon

      public Exon findExon(int pos)
      Return an exon that intersects 'pos'
    • findExon

      public Exon findExon(Marker marker)
      Return an exon intersecting 'marker' (first exon found)
    • findIntron

      public Intron findIntron(int pos)
      Return an intron overlapping position 'pos'
    • findUtr

      public Utr findUtr(int pos)
      Return the UTR that hits position 'pos'
      Returns:
      A UTR intersecting 'pos' (null if not found)
    • findUtrs

      public List<Utr> findUtrs(Marker marker)
      Return the UTRs that intersect 'marker' (null if not found)
    • frameCorrection

      public boolean frameCorrection()
      Correct exons based on frame information.

      E.g. if the frame information (from a genomic database file, such as a GTF) does not match the calculated frame, we correct the exon's boundaries to make them match.

      This is performed in two stages:
        i) The first exon is corrected by adding a fake 5'UTR
        ii) Other exons are corrected by changing the start (or end) coordinates.
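      To illustrate what a "calculated frame" can look like, the sketch below computes the frame of each coding segment from the lengths of the preceding segments, using the GFF/GTF convention (number of bases to skip to reach the first base of the next complete codon). That this matches the library's internal calculation is an assumption, and FrameSketch is a hypothetical name.

```java
/** Hypothetical helper for illustration; not part of the Transcript API. */
final class FrameSketch {

    /**
     * Expected frame (0, 1 or 2) of each coding segment, given segment lengths
     * in transcription order. Uses the GFF/GTF phase convention: the number of
     * bases at the start of the segment that complete the previous codon.
     */
    static int[] expectedFrames(int[] codingLengths) {
        int[] frames = new int[codingLengths.length];
        int basesSoFar = 0; // coding bases in all previous segments
        for (int i = 0; i < codingLengths.length; i++) {
            frames[i] = (3 - basesSoFar % 3) % 3;
            basesSoFar += codingLengths[i];
        }
        return frames;
    }
}
```

      A segment whose annotated frame differs from the value computed this way is the kind of mismatch that the correction described above addresses.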

    • get3primeUtrs

      public List<Utr3prime> get3primeUtrs()
      Create a list of 3 prime UTRs
    • get3primeUtrsSorted

      public List<Utr3prime> get3primeUtrsSorted()
    • get5primeUtrs

      public List<Utr5prime> get5primeUtrs()
      Create a list of 5 prime UTRs
    • get5primeUtrsSorted

      public List<Utr5prime> get5primeUtrsSorted()
    • getBioType

      public BioType getBioType()
    • getCds

      public List<Cds> getCds()
      Get all CDSs
    • getCdsEnd

      public int getCdsEnd()
    • getCdsStart

      public int getCdsStart()
    • getDownstream

      public Downstream getDownstream()
    • getExons

      public Collection<Exon> getExons()
      A more intuitive name for 'subintervals'
    • getFirstCodingExon

      public Exon getFirstCodingExon()
      Get first coding exon
    • getGene

      public Gene getGene()
    • hasProteinId

      public boolean hasProteinId()
    • getProteinId

      public String getProteinId()
    • getTags

      public String[] getTags()
    • getTranscriptSupportLevel

      public TranscriptSupportLevel getTranscriptSupportLevel()
    • getTss

      public Marker getTss()
      Create a TSS marker
    • getUpstream

      public Upstream getUpstream()
    • getUtrs

      public List<Utr> getUtrs()
      Get all UTRs
    • getVersion

      public String getVersion()
    • setVersion

      public void setVersion(String version)
    • hasCds

      public boolean hasCds()
    • hasError

      public boolean hasError()
      Does this transcript have any errors?
    • hasErrorOrWarning

      public boolean hasErrorOrWarning()
      Does this transcript have any errors or warnings?
    • hasTag

      public boolean hasTag(String tag)
      Does this transcript have 'tag'?
    • hasTags

      public boolean hasTags()
    • hasTranscriptSupportLevelInfo

      public boolean hasTranscriptSupportLevelInfo()
    • hasWarning

      public boolean hasWarning()
      Does this transcript have any warnings?
    • introns

      public List<Intron> introns()
      Get all introns (lazy init)
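      The lazy-initialization pattern mentioned here can be sketched as follows: introns are derived from the gaps between sorted exons on first request and cached afterwards. IntronsSketch is hypothetical and uses zero-based inclusive {start, end} pairs instead of the library's Exon and Intron classes.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical helper for illustration; not part of the Transcript API. */
final class IntronsSketch {
    private final List<int[]> exons;
    private List<int[]> introns; // stays null until introns() is first called

    IntronsSketch(List<int[]> exons) {
        this.exons = new ArrayList<>(exons);
    }

    /** Introns are the gaps between consecutive exons; computed once, then cached. */
    List<int[]> introns() {
        if (introns == null) {
            List<int[]> sorted = new ArrayList<>(exons);
            sorted.sort(Comparator.comparingInt((int[] exon) -> exon[0]));
            introns = new ArrayList<>();
            for (int i = 1; i < sorted.size(); i++) {
                int gapStart = sorted.get(i - 1)[1] + 1;
                int gapEnd = sorted.get(i)[0] - 1;
                // Adjacent or overlapping exons leave no intron between them
                if (gapStart <= gapEnd) introns.add(new int[] {gapStart, gapEnd});
            }
        }
        return introns;
    }
}
```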
    • isAaCheck

      public boolean isAaCheck()
    • setAaCheck

      public void setAaCheck(boolean aaCheck)
    • isAdjustIfParentDoesNotInclude

      protected boolean isAdjustIfParentDoesNotInclude(Marker parent)
      Description copied from class: Marker
      Adjust parent if it does not include child?
      Overrides:
      isAdjustIfParentDoesNotInclude in class Marker
    • isCanonical

      public boolean isCanonical()
    • isChecked

      public boolean isChecked()
      Has this transcript been checked against CDS/DNA/AA sequences?
    • isCorrected

      public boolean isCorrected()
    • isDnaCheck

      public boolean isDnaCheck()
    • setDnaCheck

      public void setDnaCheck(boolean dnaCheck)
    • isDownstream

      public boolean isDownstream(int pos)
    • isErrorProteinLength

      public boolean isErrorProteinLength()
      Check whether the coding length is a multiple of 3 in protein-coding transcripts
      Returns:
      true on Error
    • isErrorStartCodon

      public boolean isErrorStartCodon()
      Is the first codon a START codon?
    • isErrorStopCodonsInCds

      public boolean isErrorStopCodonsInCds()
      Check if protein sequence has STOP codons in the middle of the coding sequence
      Returns:
      true on Error
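      The three checks above (isErrorProteinLength, isErrorStartCodon, isErrorStopCodonsInCds) can be sketched on a plain coding-sequence string. This is a simplified illustration under stated assumptions: CdsChecksSketch is hypothetical, it hard-codes the standard genetic code (ATG start; TAA, TAG, TGA stops), and it assumes each method returns true when the problem is present.

```java
/** Hypothetical helper for illustration; not part of the Transcript API. */
final class CdsChecksSketch {

    /** Stop codons of the standard genetic code (an assumption of this sketch). */
    static boolean isStop(String codon) {
        return codon.equals("TAA") || codon.equals("TAG") || codon.equals("TGA");
    }

    /** True (error) if the coding length is not a multiple of 3. */
    static boolean isErrorProteinLength(String cds) {
        return cds.length() % 3 != 0;
    }

    /** True (error) if the first codon is not ATG. */
    static boolean isErrorStartCodon(String cds) {
        return !cds.toUpperCase().startsWith("ATG");
    }

    /** True (error) if a stop codon appears before the last codon. */
    static boolean isErrorStopCodonsInCds(String cds) {
        String seq = cds.toUpperCase();
        int codons = seq.length() / 3;
        for (int i = 0; i < codons - 1; i++) { // the last codon is allowed to be a stop
            if (isStop(seq.substring(3 * i, 3 * i + 3))) return true;
        }
        return false;
    }
}
```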
    • isIntron

      public boolean isIntron(int pos)
    • isProteinCoding

      public boolean isProteinCoding()
    • isRibosomalSlippage

      public boolean isRibosomalSlippage()
    • setRibosomalSlippage

      public void setRibosomalSlippage(boolean ribosomalSlippage)
    • isUpstream

      public boolean isUpstream(int pos)
    • isUtr

      public boolean isUtr(int pos)
    • isUtr

      public boolean isUtr(Marker marker)
    • isUtr3

      public boolean isUtr3(int pos)
    • isUtr5

      public boolean isUtr5(int pos)
    • isWarningStopCodon

      public boolean isWarningStopCodon()
      Is the last codon a STOP codon?
    • markers

      public Markers markers()
      A list of all markers in this transcript
      Overrides:
      markers in class IntervalAndSubIntervals<Exon>
    • mRna

      public String mRna()
      Retrieve the coding sequence and the UTRs (mRNA = 5'UTR + CDS + 3'UTR), i.e. concatenate all exon sequences
    • protein

      public String protein()
      Protein sequence (amino acid sequence produced by this transcript)
    • query

      public Markers query(Marker marker)
      Query all genomic regions that intersect 'marker'
      Overrides:
      query in class IntervalAndSubIntervals<Exon>
    • queryExon

      public Exon queryExon(Marker interval)
      Return the first exon that intersects 'interval' (null if not found)
    • rankExons

      public boolean rankExons()
      Assign ranks to exons
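      One plausible way to rank exons is by transcription order: ascending start on the plus strand, descending start on the minus strand, with ranks starting at 1. Both the ordering rule and the 1-based numbering are assumptions of this sketch, and RankExonsSketch is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical helper for illustration; not part of the Transcript API. */
final class RankExonsSketch {

    /**
     * Returns the exons in transcription order; the exon at index i has rank i + 1.
     * Exons are zero-based inclusive {start, end} pairs.
     */
    static List<int[]> rankOrder(List<int[]> exons, boolean strandMinus) {
        List<int[]> ordered = new ArrayList<>(exons);
        Comparator<int[]> byStart = Comparator.comparingInt((int[] exon) -> exon[0]);
        // On the minus strand transcription runs from high to low coordinates
        ordered.sort(strandMinus ? byStart.reversed() : byStart);
        return ordered;
    }
}
```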
    • reset

      public void reset()
      Description copied from class: IntervalAndSubIntervals
      Remove all intervals
      Overrides:
      reset in class IntervalAndSubIntervals<Exon>
    • resetCache

      public void resetCache()
    • resetExons

      public void resetExons()
    • sanityCheck

      public ErrorWarningType sanityCheck(Variant variant)
      Perform some basic checks; return the error type, if any
    • serializeParse

      public void serializeParse(MarkerSerializer markerSerializer)
      Parse a line from a serialized file
      Specified by:
      serializeParse in interface TxtSerializable
      Overrides:
      serializeParse in class IntervalAndSubIntervals<Exon>
    • serializeSave

      public String serializeSave(MarkerSerializer markerSerializer)
      Create a string to serialize to a file
      Specified by:
      serializeSave in interface TxtSerializable
      Overrides:
      serializeSave in class IntervalAndSubIntervals<Exon>
    • setBioType

      public void setBioType(BioType bioType)
    • setCanonical

      public void setCanonical(boolean canonical)
    • setProteinCoding

      public void setProteinCoding(boolean proteinCoding)
    • setProteinId

      public void setProteinId(String proteinId)
    • setTags

      public void setTags(String tags)
    • setTranscriptSupportLevel

      public void setTranscriptSupportLevel(TranscriptSupportLevel transcriptSupportLevel)
    • sortCds

      public void sortCds()
    • spliceSites

      public List<SpliceSite> spliceSites()
    • toString

      public String toString()
      Overrides:
      toString in class Marker
    • toString

      public String toString(boolean full)
    • toStringAsciiArt

      public String toStringAsciiArt(boolean full)
      Show a transcript as ASCII art
    • utrFromCds

      public boolean utrFromCds(boolean verbose)
      Calculate UTR regions from CDSs
    • variantEffect

      public boolean variantEffect(Variant variant, VariantEffects variantEffects)
      Get some details about the effect on this transcript
      Overrides:
      variantEffect in class Marker