ID POLR2_ARATH Reviewed; 1456 AA.
AC Q9ZT94; J7MCQ0; Q9SXQ2;
DT 25-OCT-2017, integrated into UniProtKB/Swiss-Prot.
DT 01-MAY-1999, sequence version 1.
DT 03-AUG-2022, entry version 113.
DE RecName: Full=Retrovirus-related Pol polyprotein from transposon RE2;
DE AltName: Full=Retro element 2 {ECO:0000303|PubMed:10689195};
DE Short=AtRE2 {ECO:0000303|PubMed:10689195};
DE Includes:
DE RecName: Full=Protease RE2;
DE EC=3.4.23.-;
DE Includes:
DE RecName: Full=Reverse transcriptase RE2;
DE EC=2.7.7.49;
DE Includes:
DE RecName: Full=Endonuclease RE2;
GN Name=RE2; OrderedLocusNames=At4g02960 {ECO:0000312|EMBL:CAB77781.1};
GN ORFNames=T4I9.16 {ECO:0000312|EMBL:AAC79110.1};
OS Arabidopsis thaliana (Mouse-ear cress).
OC Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
OC Spermatophyta; Magnoliopsida; eudicotyledons; Gunneridae; Pentapetalae;
OC rosids; malvids; Brassicales; Brassicaceae; Camelineae; Arabidopsis.
OX NCBI_TaxID=3702;
RN [1]
RP NUCLEOTIDE SEQUENCE [GENOMIC DNA], GENE FAMILY, AND NOMENCLATURE.
RC STRAIN=cv. Columbia;
RX PubMed=10689195; DOI=10.1016/s0378-1119(99)00565-x;
RA Kuwahara A., Kato A., Komeda Y.;
RT "Isolation and characterization of copia-type retrotransposons in
RT Arabidopsis thaliana.";
RL Gene 244:127-136(2000).
RN [2]
RP NUCLEOTIDE SEQUENCE [GENOMIC DNA], AND
RP VARIANT THR-GLN-LEU-LYS-GLN-TRP-THR-LYS-GLY-ALA-LYS-THR-ILE-ASP-ASP-TYR-
RP MET-GLN-GLY-129 INS.
RC STRAIN=cv. Is-1, cv. No-0, and cv. Ts-1;
RX PubMed=24770782; DOI=10.1007/s00438-014-0855-z;
RA Yamada M., Yamagishi Y., Akaoka M., Ito H., Kato A.;
RT "Genomic localization of AtRE1 and AtRE2, copia-type retrotransposons, in
RT natural variants of Arabidopsis thaliana.";
RL Mol. Genet. Genomics 289:821-835(2014).
RN [3]
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RC STRAIN=cv. Columbia;
RX PubMed=10617198; DOI=10.1038/47134;
RA Mayer K.F.X., Schueller C., Wambutt R., Murphy G., Volckaert G., Pohl T.,
RA Duesterhoeft A., Stiekema W., Entian K.-D., Terryn N., Harris B.,
RA Ansorge W., Brandt P., Grivell L.A., Rieger M., Weichselgartner M.,
RA de Simone V., Obermaier B., Mache R., Mueller M., Kreis M., Delseny M.,
RA Puigdomenech P., Watson M., Schmidtheini T., Reichert B., Portetelle D.,
RA Perez-Alonso M., Boutry M., Bancroft I., Vos P., Hoheisel J.,
RA Zimmermann W., Wedler H., Ridley P., Langham S.-A., McCullagh B.,
RA Bilham L., Robben J., van der Schueren J., Grymonprez B., Chuang Y.-J.,
RA Vandenbussche F., Braeken M., Weltjens I., Voet M., Bastiaens I., Aert R.,
RA Defoor E., Weitzenegger T., Bothe G., Ramsperger U., Hilbert H., Braun M.,
RA Holzer E., Brandt A., Peters S., van Staveren M., Dirkse W., Mooijman P.,
RA Klein Lankhorst R., Rose M., Hauf J., Koetter P., Berneiser S., Hempel S.,
RA Feldpausch M., Lamberth S., Van den Daele H., De Keyser A., Buysshaert C.,
RA Gielen J., Villarroel R., De Clercq R., van Montagu M., Rogers J.,
RA Cronin A., Quail M.A., Bray-Allen S., Clark L., Doggett J., Hall S.,
RA Kay M., Lennard N., McLay K., Mayes R., Pettett A., Rajandream M.A.,
RA Lyne M., Benes V., Rechmann S., Borkova D., Bloecker H., Scharfe M.,
RA Grimm M., Loehnert T.-H., Dose S., de Haan M., Maarse A.C., Schaefer M.,
RA Mueller-Auer S., Gabel C., Fuchs M., Fartmann B., Granderath K., Dauner D.,
RA Herzl A., Neumann S., Argiriou A., Vitale D., Liguori R., Piravandi E.,
RA Massenet O., Quigley F., Clabauld G., Muendlein A., Felber R., Schnabl S.,
RA Hiller R., Schmidt W., Lecharny A., Aubourg S., Chefdor F., Cooke R.,
RA Berger C., Monfort A., Casacuberta E., Gibbons T., Weber N., Vandenbol M.,
RA Bargues M., Terol J., Torres A., Perez-Perez A., Purnelle B., Bent E.,
RA Johnson S., Tacon D., Jesse T., Heijnen L., Schwarz S., Scholler P.,
RA Heber S., Francs P., Bielke C., Frishman D., Haase D., Lemcke K.,
RA Mewes H.-W., Stocker S., Zaccaria P., Bevan M., Wilson R.K.,
RA de la Bastide M., Habermann K., Parnell L., Dedhia N., Gnoj L., Schutz K.,
RA Huang E., Spiegel L., Sekhon M., Murray J., Sheet P., Cordes M.,
RA Abu-Threideh J., Stoneking T., Kalicki J., Graves T., Harmon G.,
RA Edwards J., Latreille P., Courtney L., Cloud J., Abbott A., Scott K.,
RA Johnson D., Minx P., Bentley D., Fulton B., Miller N., Greco T., Kemp K.,
RA Kramer J., Fulton L., Mardis E., Dante M., Pepin K., Hillier L.W.,
RA Nelson J., Spieth J., Ryan E., Andrews S., Geisel C., Layman D., Du H.,
RA Ali J., Berghoff A., Jones K., Drone K., Cotton M., Joshu C., Antonoiu B.,
RA Zidanic M., Strong C., Sun H., Lamar B., Yordan C., Ma P., Zhong J.,
RA Preston R., Vil D., Shekher M., Matero A., Shah R., Swaby I.K.,
RA O'Shaughnessy A., Rodriguez M., Hoffman J., Till S., Granat S., Shohdy N.,
RA Hasegawa A., Hameed A., Lodhi M., Johnson A., Chen E., Marra M.A.,
RA Martienssen R., McCombie W.R.;
RT "Sequence and analysis of chromosome 4 of the plant Arabidopsis thaliana.";
RL Nature 402:769-777(1999).
RN [4]
RP GENOME REANNOTATION.
RC STRAIN=cv. Columbia;
RX PubMed=27862469; DOI=10.1111/tpj.13415;
RA Cheng C.Y., Krishnakumar V., Chan A.P., Thibaud-Nissen F., Schobel S.,
RA Town C.D.;
RT "Araport11: a complete reannotation of the Arabidopsis thaliana reference
RT genome.";
RL Plant J. 89:789-804(2017).
CC -!- CATALYTIC ACTIVITY:
CC Reaction=a 2'-deoxyribonucleoside 5'-triphosphate + DNA(n) =
CC diphosphate + DNA(n+1); Xref=Rhea:RHEA:22508, Rhea:RHEA-COMP:17339,
CC Rhea:RHEA-COMP:17340, ChEBI:CHEBI:33019, ChEBI:CHEBI:61560,
CC ChEBI:CHEBI:173112; EC=2.7.7.49;
CC -!- SEQUENCE CAUTION:
CC Sequence=BAA78424.1; Type=Erroneous gene model prediction; Evidence={ECO:0000305};
CC Sequence=BAM44533.1; Type=Erroneous gene model prediction; Evidence={ECO:0000305};
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR EMBL; AB021264; BAA78424.1; ALT_SEQ; Genomic_DNA.
DR EMBL; AB701744; BAM42647.1; -; Genomic_DNA.
DR EMBL; AB701745; BAM42648.1; -; Genomic_DNA.
DR EMBL; AB703312; BAM44533.1; ALT_SEQ; Genomic_DNA.
DR EMBL; AF069442; AAC79110.1; -; Genomic_DNA.
DR EMBL; AL161495; CAB77781.1; -; Genomic_DNA.
DR EMBL; CP002687; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR PIR; T01397; T01397.
DR AlphaFoldDB; Q9ZT94; -.
DR SMR; Q9ZT94; -.
DR MEROPS; A11.004; -.
DR PeptideAtlas; Q9ZT94; -.
DR PRIDE; Q9ZT94; -.
DR Araport; AT4G02960; -.
DR PRO; PR:Q9ZT94; -.
DR Proteomes; UP000006548; Chromosome 4.
DR ExpressionAtlas; Q9ZT94; baseline and differential.
DR GO; GO:0004190; F:aspartic-type endopeptidase activity; IEA:UniProtKB-KW.
DR GO; GO:0004519; F:endonuclease activity; IEA:UniProtKB-KW.
DR GO; GO:0046872; F:metal ion binding; IEA:UniProtKB-KW.
DR GO; GO:0003676; F:nucleic acid binding; IEA:InterPro.
DR GO; GO:0003964; F:RNA-directed DNA polymerase activity; IEA:UniProtKB-EC.
DR GO; GO:0015074; P:DNA integration; IEA:UniProtKB-KW.
DR GO; GO:0006310; P:DNA recombination; IEA:UniProtKB-KW.
DR GO; GO:0006508; P:proteolysis; IEA:UniProtKB-KW.
DR Gene3D; 3.30.420.10; -; 1.
DR InterPro; IPR043502; DNA/RNA_pol_sf.
DR InterPro; IPR025724; GAG-pre-integrase_dom.
DR InterPro; IPR001584; Integrase_cat-core.
DR InterPro; IPR012337; RNaseH-like_sf.
DR InterPro; IPR036397; RNaseH_sf.
DR InterPro; IPR013103; RVT_2.
DR Pfam; PF13976; gag_pre-integrs; 1.
DR Pfam; PF00665; rve; 1.
DR Pfam; PF07727; RVT_2; 1.
DR SUPFAM; SSF53098; SSF53098; 1.
DR SUPFAM; SSF56672; SSF56672; 1.
DR PROSITE; PS50994; INTEGRASE; 1.
PE 4: Predicted;
KW Aspartyl protease; DNA integration; DNA recombination; Endonuclease;
KW Hydrolase; Magnesium; Metal-binding; Nuclease; Protease;
KW Reference proteome; Transferase; Zinc; Zinc-finger.
FT CHAIN 1..1456
FT /note="Retrovirus-related Pol polyprotein from transposon
FT RE2"
FT /id="PRO_0000441909"
FT DOMAIN 498..661
FT /note="Integrase catalytic"
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00457"
FT DOMAIN 965..1208
FT /note="Reverse transcriptase Ty1/copia-type"
FT /evidence="ECO:0000255"
FT ZN_FING 257..273
FT /note="CCHC-type"
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00047"
FT REGION 205..252
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 276..295
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 738..896
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 205..249
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 738..760
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 761..785
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 786..839
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 840..854
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 855..882
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT ACT_SITE 313
FT /note="For protease activity"
FT /evidence="ECO:0000250"
FT BINDING 509
FT /ligand="Mg(2+)"
FT /ligand_id="ChEBI:CHEBI:18420"
FT /ligand_note="catalytic"
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00457"
FT BINDING 571
FT /ligand="Mg(2+)"
FT /ligand_id="ChEBI:CHEBI:18420"
FT /ligand_note="catalytic"
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00457"
FT VARIANT 129
FT /note="R -> RTQLKQWTKGAKTIDDYMQG (in strain: cv. Ts-1 and
FT cv. No-0)"
FT /evidence="ECO:0000269|PubMed:24770782"
FT CONFLICT 187
FT /note="R -> Q (in Ref. 2; BAM42647/BAM42648)"
FT /evidence="ECO:0000305"
FT CONFLICT 815
FT /note="H -> Y (in Ref. 2; BAM42647/BAM42648)"
FT /evidence="ECO:0000305"
FT CONFLICT 1431
FT /note="R -> L (in Ref. 2; BAM42647/BAM42648)"
FT /evidence="ECO:0000305"
SQ SEQUENCE 1456 AA; 162636 MW; 8E9D02D29C5FBA13 CRC64;
MATHAEEIVL VNTNILNVNM SNVTKLTSTN YLMWSRQVHA LFDGYELAGF LDGSTPMPPA
TIGTDAVPRV NPDYTRWRRQ DKLIYSAILG AISMSVQPAV SRATTAAQIW ETLRKIYANP
SYGHVTQLRF ITRFDQLALL GKPMDHDEQV ERVLENLPDD YKPVIDQIAA KDTPPSLTEI
HERLINRESK LLALNSAEVV PITANVVTHR NTNTNRNQNN RGDNRNYNNN NNRSNSWQPS
SSGSRSDNRQ PKPYLGRCQI CSVQGHSAKR CPQLHQFQST TNQQQSTSPF TPWQPRANLA
VNSPYNANNW LLDSGATHHI TSDFNNLSFH QPYTGGDDVM IADGSTIPIT HTGSASLPTS
SRSLDLNKVL YVPNIHKNLI SVYRLCNTNR VSVEFFPASF QVKDLNTGVP LLQGKTKDEL
YEWPIASSQA VSMFASPCSK ATHSSWHSRL GHPSLAILNS VISNHSLPVL NPSHKLLSCS
DCFINKSHKV PFSNSTITSS KPLEYIYSDV WSSPILSIDN YRYYVIFVDH FTRYTWLYPL
KQKSQVKDTF IIFKSLVENR FQTRIGTLYS DNGGEFVVLR DYLSQHGISH FTSPPHTPEH
NGLSERKHRH IVEMGLTLLS HASVPKTYWP YAFSVAVYLI NRLPTPLLQL QSPFQKLFGQ
PPNYEKLKVF GCACYPWLRP YNRHKLEDKS KQCAFMGYSL TQSAYLCLHI PTGRLYTSRH
VQFDERCFPF STTNFGVSTS QEQRSDSAPN WPSHTTLPTT PLVLPAPPCL GPHLDTSPRP
PSSPSPLCTT QVSSSNLPSS SISSPSSSEP TAPSHNGPQP TAQPHQTQNS NSNSPILNNP
NPNSPSPNSP NQNSPLPQSP ISSPHIPTPS TSISEPNSPS SSSTSTPPLP PVLPAPPIIQ
VNAQAPVNTH SMATRAKDGI RKPNQKYSYA TSLAANSEPR TAIQAMKDDR WRQAMGSEIN
AQIGNHTWDL VPPPPPSVTI VGCRWIFTKK FNSDGSLNRY KARLVAKGYN QRPGLDYAET
FSPVIKSTSI RIVLGVAVDR SWPIRQLDVN NAFLQGTLTD EVYMSQPPGF VDKDRPDYVC
RLRKAIYGLK QAPRAWYVEL RTYLLTVGFV NSISDTSLFV LQRGRSIIYM LVYVDDILIT
GNDTVLLKHT LDALSQRFSV KEHEDLHYFL GIEAKRVPQG LHLSQRRYTL DLLARTNMLT
AKPVATPMAT SPKLTLHSGT KLPDPTEYRG IVGSLQYLAF TRPDLSYAVN RLSQYMHMPT
DDHWNALKRV LRYLAGTPDH GIFLKKGNTL SLHAYSDADW AGDTDDYVST NGYIVYLGHH
PISWSSKKQK GVVRSSTEAE YRSVANTSSE LQWICSLLTE LGIQLSHPPV IYCDNVGATY
LCANPVFHSR MKHIALDYHF IRNQVQSGAL RVVHVSTHDQ LADTLTKPLS RVAFQNFSRK
IGVIKVPPSC GGVLRI
//