ID CPSF1_ARATH Reviewed; 1442 AA.
AC Q9FGR0; Q8H1T4;
DT 18-OCT-2001, integrated into UniProtKB/Swiss-Prot.
DT 02-MAR-2010, sequence version 2.
DT 25-MAY-2022, entry version 127.
DE RecName: Full=Cleavage and polyadenylation specificity factor subunit 1;
DE AltName: Full=Cleavage and polyadenylation specificity factor 160 kDa subunit;
DE Short=AtCPSF160;
DE Short=CPSF 160 kDa subunit;
GN Name=CPSF160; OrderedLocusNames=At5g51660; ORFNames=K17N15.21;
OS Arabidopsis thaliana (Mouse-ear cress).
OC Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
OC Spermatophyta; Magnoliopsida; eudicotyledons; Gunneridae; Pentapetalae;
OC rosids; malvids; Brassicales; Brassicaceae; Camelineae; Arabidopsis.
OX NCBI_TaxID=3702;
RN [1]
RP NUCLEOTIDE SEQUENCE [MRNA].
RX PubMed=16897494; DOI=10.1007/s11103-006-0051-6;
RA Xu R., Zhao H., Dinkins R.D., Cheng X., Carberry G., Li Q.Q.;
RT "The 73 kD subunit of the cleavage and polyadenylation specificity factor
RT (CPSF) complex affects reproductive development in Arabidopsis.";
RL Plant Mol. Biol. 61:799-815(2006).
RN [2]
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RC STRAIN=cv. Columbia;
RA Kaneko T., Katoh T., Asamizu E., Sato S., Nakamura Y., Kotani H.,
RA Tabata S.;
RT "Structural analysis of Arabidopsis thaliana chromosome 5. XI.";
RL Submitted (APR-1999) to the EMBL/GenBank/DDBJ databases.
RN [3]
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RC STRAIN=cv. Columbia;
RX PubMed=10718197; DOI=10.1093/dnares/7.1.31;
RA Sato S., Nakamura Y., Kaneko T., Katoh T., Asamizu E., Kotani H.,
RA Tabata S.;
RT "Structural analysis of Arabidopsis thaliana chromosome 5. X. Sequence
RT features of the regions of 3,076,755 bp covered by sixty P1 and TAC
RT clones.";
RL DNA Res. 7:31-63(2000).
RN [4]
RP GENOME REANNOTATION.
RC STRAIN=cv. Columbia;
RX PubMed=27862469; DOI=10.1111/tpj.13415;
RA Cheng C.Y., Krishnakumar V., Chan A.P., Thibaud-Nissen F., Schobel S.,
RA Town C.D.;
RT "Araport11: a complete reannotation of the Arabidopsis thaliana reference
RT genome.";
RL Plant J. 89:789-804(2017).
RN [5]
RP INTERACTION WITH CPSF30 AND CPSF100, GENE FAMILY, AND NOMENCLATURE.
RX PubMed=18479511; DOI=10.1186/1471-2164-9-220;
RA Hunt A.G., Xu R., Addepalli B., Rao S., Forbes K.P., Meeks L.R., Xing D.,
RA Mo M., Zhao H., Bandyopadhyay A., Dampanaboina L., Marion A.,
RA Von Lanken C., Li Q.Q.;
RT "Arabidopsis mRNA polyadenylation machinery: comprehensive analysis of
RT protein-protein interactions and gene expression profiling.";
RL BMC Genomics 9:220-220(2008).
RN [6]
RP SUBCELLULAR LOCATION, AND INTERACTION WITH CPSF30.
RX PubMed=19573236; DOI=10.1186/1471-2121-10-51;
RA Rao S., Dinkins R.D., Hunt A.G.;
RT "Distinctive interactions of the Arabidopsis homolog of the 30 kD subunit
RT of the cleavage and polyadenylation specificity factor (AtCPSF30) with
RT other polyadenylation factor subunits.";
RL BMC Cell Biol. 10:51-51(2009).
RN [7]
RP COMPONENT OF CPSF COMPLEX.
RX PubMed=19748916; DOI=10.1104/pp.109.142729;
RA Zhao H., Xing D., Li Q.Q.;
RT "Unique features of plant cleavage and polyadenylation specificity factor
RT revealed by proteomic studies.";
RL Plant Physiol. 151:1546-1556(2009).
RN [8]
RP INTERACTION WITH FY.
RX PubMed=19439664; DOI=10.1073/pnas.0903444106;
RA Manzano D., Marquardt S., Jones A.M., Baurle I., Liu F., Dean C.;
RT "Altered interactions within FY/AtCPSF complexes required for Arabidopsis
RT FCA-mediated chromatin silencing.";
RL Proc. Natl. Acad. Sci. U.S.A. 106:8772-8777(2009).
CC -!- FUNCTION: CPSF plays a key role in pre-mRNA 3'-end formation,
CC recognizing the AAUAAA signal sequence and interacting with
CC poly(A) polymerase and other factors to bring about cleavage and poly(A)
CC addition. This subunit is involved in the RNA recognition step of the
CC polyadenylation reaction. {ECO:0000250|UniProtKB:Q10570}.
CC -!- SUBUNIT: Component of the CPSF complex, at least composed of CPSF160,
CC CPSF100, CPSF73-I, CPSF73-II, CPSF30, FY and FIPS5. Forms a complex
CC with cleavage and polyadenylation specificity factor (CPSF) subunits
CC FY, CPSF30, CPSF73-I, CPSF73-II and CPSF100.
CC {ECO:0000269|PubMed:18479511, ECO:0000269|PubMed:19439664}.
CC -!- INTERACTION:
CC Q9FGR0; Q9LKF9: CPSF100; NbExp=4; IntAct=EBI-1775436, EBI-1775444;
CC Q9FGR0; Q6NLV4: FY; NbExp=2; IntAct=EBI-1775436, EBI-1632908;
CC -!- SUBCELLULAR LOCATION: Nucleus {ECO:0000269|PubMed:19573236}.
CC -!- SIMILARITY: Belongs to the CPSF1 family. {ECO:0000305}.
CC -!- SEQUENCE CAUTION:
CC Sequence=BAB11613.1; Type=Erroneous gene model prediction; Evidence={ECO:0000305};
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR EMBL; AY140902; AAN41460.1; -; mRNA.
DR EMBL; AB025607; BAB11613.1; ALT_SEQ; Genomic_DNA.
DR EMBL; AB018109; BAB11613.1; JOINED; Genomic_DNA.
DR EMBL; CP002688; AED96112.1; -; Genomic_DNA.
DR RefSeq; NP_199979.2; NM_124545.3.
DR AlphaFoldDB; Q9FGR0; -.
DR SMR; Q9FGR0; -.
DR BioGRID; 20485; 14.
DR DIP; DIP-40386N; -.
DR IntAct; Q9FGR0; 5.
DR STRING; 3702.AT5G51660.1; -.
DR iPTMnet; Q9FGR0; -.
DR PaxDb; Q9FGR0; -.
DR PRIDE; Q9FGR0; -.
DR ProteomicsDB; 224489; -.
DR EnsemblPlants; AT5G51660.1; AT5G51660.1; AT5G51660.
DR GeneID; 835240; -.
DR Gramene; AT5G51660.1; AT5G51660.1; AT5G51660.
DR KEGG; ath:AT5G51660; -.
DR Araport; AT5G51660; -.
DR TAIR; locus:2153122; AT5G51660.
DR eggNOG; KOG1896; Eukaryota.
DR HOGENOM; CLU_002414_0_0_1; -.
DR InParanoid; Q9FGR0; -.
DR OMA; PMTKFKL; -.
DR OrthoDB; 360328at2759; -.
DR PhylomeDB; Q9FGR0; -.
DR PRO; PR:Q9FGR0; -.
DR Proteomes; UP000006548; Chromosome 5.
DR ExpressionAtlas; Q9FGR0; baseline and differential.
DR Genevisible; Q9FGR0; AT.
DR GO; GO:0005634; C:nucleus; IDA:UniProtKB.
DR GO; GO:0003723; F:RNA binding; IEA:UniProtKB-KW.
DR GO; GO:0006378; P:mRNA polyadenylation; IBA:GO_Central.
DR Gene3D; 2.130.10.10; -; 2.
DR InterPro; IPR004871; Cleavage/polyA-sp_fac_asu_C.
DR InterPro; IPR018846; Cleavage/polyA-sp_fac_asu_N.
DR InterPro; IPR015943; WD40/YVTN_repeat-like_dom_sf.
DR Pfam; PF03178; CPSF_A; 1.
DR Pfam; PF10433; MMS1_N; 1.
PE 1: Evidence at protein level;
KW mRNA processing; Nucleus; Reference proteome; RNA-binding.
FT CHAIN 1..1442
FT /note="Cleavage and polyadenylation specificity factor
FT subunit 1"
FT /id="PRO_0000074391"
FT CONFLICT 494
FT /note="N -> D (in Ref. 1; AAN41460)"
FT /evidence="ECO:0000305"
FT CONFLICT 893
FT /note="S -> P (in Ref. 1; AAN41460)"
FT /evidence="ECO:0000305"
SQ SEQUENCE 1442 AA; 158075 MW; 14741C95CB4190BF CRC64;
MSFAAYKMMH WPTGVENCAS GYITHSLSDS TLQIPIVSVH DDIEAEWPNP KRGIGPLPNV
VITAANILEV YIVRAQEEGN TQELRNPKLA KRGGVMDGVY GVSLELVCHY RLHGNVESIA
VLPMGGGNSS KGRDSIILTF RDAKISVLEF DDSIHSLRMT SMHCFEGPDW LHLKRGRESF
PRGPLVKVDP QGRCGGVLVY GLQMIILKTS QVGSGLVGDD DAFSSGGTVS ARVESSYIIN
LRDLEMKHVK DFVFLHGYIE PVIVILQEEE HTWAGRVSWK HHTCVLSALS INSTLKQHPV
IWSAINLPHD AYKLLAVPSP IGGVLVLCAN TIHYHSQSAS CALALNNYAS SADSSQELPA
SNFSVELDAA HGTWISNDVA LLSTKSGELL LLTLIYDGRA VQRLDLSKSK ASVLASDITS
VGNSLFFLGS RLGDSLLVQF SCRSGPAASL PGLRDEDEDI EGEGHQAKRL RMTSDTFQDT
IGNEELSLFG STPNNSDSAQ KSFSFAVRDS LVNVGPVKDF AYGLRINADA NATGVSKQSN
YELVCCSGHG KNGALCVLRQ SIRPEMITEV ELPGCKGIWT VYHKSSRGHN ADSSKMAADE
DEYHAYLIIS LEARTMVLET ADLLTEVTES VDYYVQGRTI AAGNLFGRRR VIQVFEHGAR
ILDGSFMNQE LSFGASNSES NSGSESSTVS SVSIADPYVL LRMTDDSIRL LVGDPSTCTV
SISSPSVLEG SKRKISACTL YHDKGPEPWL RKASTDAWLS SGVGEAVDSV DGGPQDQGDI
YCVVCYESGA LEIFDVPSFN CVFSVDKFAS GRRHLSDMPI HELEYELNKN SEDNTSSKEI
KNTRVVELAM QRWSGHHTRP FLFAVLADGT ILCYHAYLFD GVDSTKAENS LSSENPAALN
SSGSSKLRNL KFLRIPLDTS TREGTSDGVA SQRITMFKNI SGHQGFFLSG SRPGWCMLFR
ERLRFHSQLC DGSIAAFTVL HNVNCNHGFI YVTAQGVLKI CQLPSASIYD NYWPVQKIPL
KATPHQVTYY AEKNLYPLIV SYPVSKPLNQ VLSSLVDQEA GQQLDNHNMS SDDLQRTYTV
EEFEIQILEP ERSGGPWETK AKIPMQTSEH ALTVRVVTLL NASTGENETL LAVGTAYVQG
EDVAARGRVL LFSFGKNGDN SQNVVTEVYS RELKGAISAV ASIQGHLLIS SGPKIILHKW
NGTELNGVAF FDAPPLYVVS MNVVKSFILL GDVHKSIYFL SWKEQGSQLS LLAKDFESLD
CFATEFLIDG STLSLAVSDE QKNIQVFYYA PKMIESWKGL KLLSRAEFHV GAHVSKFLRL
QMVSSGADKI NRFALLFGTL DGSFGCIAPL DEVTFRRLQS LQKKLVDAVP HVAGLNPLAF
RQFRSSGKAR RSGPDSIVDC ELLCHYEMLP LEEQLELAHQ IGTTRYSILK DLVDLSVGTS
FL