Welcome to PolyA_DB, a web resource for the analysis of pre-mRNA cleavage and polyadenylation sites (polyA sites, pA sites or PAS) and alternative polyadenylation (APA) isoforms.

PolyA site databases

PolyA_DB 1

PolyA_DB version 1 contains human and mouse polyA sites mapped by cDNA/EST sequences.

Reference: Zhang et al. (2005). PolyA_DB: a database for mammalian mRNA polyadenylation. Nucleic Acids Res 33:D116-20.

PolyA_DB 2

PolyA_DB version 2 contains polyA sites in human, mouse, rat, chicken and zebrafish that are mapped by cDNA/EST and Trace sequences. Sequence alignments between orthologous sites are also available.

Reference: Lee et al. (2007). PolyA_DB 2: mRNA polyadenylation sites in vertebrate genes. Nucleic Acids Res 35:D165-8.

PolyA_DB 3

PolyA_DB version 3 contains polyA sites mapped by 3'READS in human, mouse, rat and chicken genomes. In addition to substantially increased polyA site coverage, quantitative information about polyA site expression levels is available. Strand-specific RNA-seq data were used to find polyA sites in extended regions downstream of annotated transcript end sites. Conservation information for polyA sites across mammals sheds light on the evolutionary pressure on alternative polyadenylation (APA).

Reference: Wang et al. (2018). PolyA_DB 3 catalogs cleavage and polyadenylation sites identified by deep sequencing in multiple genomes. Nucleic Acids Res 46(D1):D315-D319.

PolyA_DB 4

PolyA_DB version 4 contains polyA sites in human and mouse genomes mapped by 3' end short-read RNA-seq (3'READS+) data from a wide variety of samples. Long-read RNA-seq data (PacBio and Oxford Nanopore) were used to validate and annotate polyA sites. PolyA site conservation and strength information are also available.

Reference: Yu et al. PolyA_DB v4: systematic polyA site identification and isoform annotation in human and mouse genomes using 3’ end and long-read sequencing data. Nucleic Acids Res (2026) D1: D247-D254.

PolyA site prediction and cis element analysis

PolyA_SVM

PolyA_SVM predicts polyA sites using 15 cis elements identified for human polyA sites.

Reference: Cheng et al. Prediction of mRNA polyadenylation sites by support vector machine. Bioinformatics (2006) 22:2320-5.

Alternative polyadenylation isoform analysis

APAlyzer

APAlyzer uses RNA-seq data and annotated polyA sites in the PolyA_DB database to examine 3'UTR APA, intronic APA and gene expression changes.

Reference: Wang & Tian. APAlyzer: a bioinformatics package for analysis of alternative polyadenylation isoforms. Bioinformatics (2020) 36(12):3907-3909.
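
As a rough illustration of the kind of 3'UTR APA comparison described above, the sketch below computes a log2 ratio of distal to proximal isoform reads per sample and then the difference between two samples. It is not APAlyzer's API (APAlyzer is an R package). The function names, the pseudocount and the read counts are all hypothetical and chosen only for illustration.

```python
import math


def relative_expression(distal_reads, proximal_reads, pseudocount=1.0):
    """Return log2(distal / proximal) read counts for one sample.

    The pseudocount is an assumption of this sketch; it avoids division
    by zero when one isoform has no reads.
    """
    return math.log2((distal_reads + pseudocount) / (proximal_reads + pseudocount))


def relative_expression_difference(sample_a, sample_b):
    """Return the log2 ratio of sample_b minus that of sample_a.

    Each sample is a (distal_reads, proximal_reads) tuple. A negative
    value suggests a shift toward the proximal polyA site in sample_b
    (3'UTR shortening); a positive value suggests lengthening.
    """
    return relative_expression(*sample_b) - relative_expression(*sample_a)


# Hypothetical read counts for one gene: (distal, proximal)
control = (300, 100)
treated = (100, 300)

red = relative_expression_difference(control, treated)
print(f"Relative expression difference: {red:.2f}")
```

A real analysis would also need replicate handling and a statistical test, which this sketch leaves out.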

MAAPER

Co-developed with Vivian Li's group, MAAPER is a probabilistic, model-based method that uses nearSite reads for APA analysis. MAAPER predicts PASs with high accuracy and sensitivity and examines different types of APA events with robust statistics. It can be applied to both bulk and single-cell data.

Reference: Li et al. MAAPER: model-based analysis of alternative polyadenylation using 3′ end-linked reads. Genome Biol (2021) 22:222.