POL_SFVCP

ID   POL_SFVCP               Reviewed;        1146 AA.
AC   Q87040;
DT   07-JUL-2009, integrated into UniProtKB/Swiss-Prot.
DT   01-NOV-1996, sequence version 1.
DT   03-AUG-2022, entry version 122.
DE   RecName: Full=Pro-Pol polyprotein;
DE   AltName: Full=Pr125Pol;
DE   Contains:
DE     RecName: Full=Protease/Reverse transcriptase/ribonuclease H;
DE              EC=2.7.7.49;
DE              EC=2.7.7.7;
DE              EC=3.1.26.4;
DE              EC=3.4.23.-;
DE     AltName: Full=p87Pro-RT-RNaseH;
DE   Contains:
DE     RecName: Full=Protease/Reverse transcriptase;
DE              EC=2.7.7.49;
DE              EC=2.7.7.7;
DE              EC=3.4.23.-;
DE     AltName: Full=p65Pro-RT;
DE   Contains:
DE     RecName: Full=Ribonuclease H;
DE              Short=RNase H;
DE              EC=3.1.26.4;
DE   Contains:
DE     RecName: Full=Integrase;
DE              Short=IN;
DE              EC=2.7.7.- {ECO:0000305|PubMed:23872492};
DE              EC=3.1.-.- {ECO:0000305|PubMed:23872492};
DE     AltName: Full=p42In;
GN   Name=pol;
OS   Simian foamy virus (isolate chimpanzee) (SFVcpz).
OC   Viruses; Riboviria; Pararnavirae; Artverviricota; Revtraviricetes;
OC   Ortervirales; Retroviridae; Spumaretrovirinae; Spumavirus.
OX   NCBI_TaxID=298339;
OH   NCBI_TaxID=9598; Pan troglodytes (Chimpanzee).
RN   [1]
RP   NUCLEOTIDE SEQUENCE [GENOMIC DNA].
RX   PubMed=8184531; DOI=10.1006/viro.1994.1285;
RA   Herchenroder O., Renne R., Loncar D., Cobb E.K., Murthy K.K., Schneider J.,
RA   Mergia A., Luciw P.A.;
RT   "Isolation, cloning, and sequencing of simian foamy viruses from
RT   chimpanzees (SFVcpz): high homology to human foamy virus (HFV).";
RL   Virology 201:187-199(1994).
RN   [2]
RP   FUNCTION (INTEGRASE).
RX   PubMed=23872492; DOI=10.3390/v5071850;
RA   Hossain M.A., Ali M.K., Shin C.G.;
RT   "Structural and functional insights into foamy viral integrase.";
RL   Viruses 5:1850-1866(2013).
CC   -!- FUNCTION: The aspartyl protease activity mediates proteolytic cleavages
CC       of Gag and Pol polyproteins. The reverse transcriptase (RT) activity
CC       converts the viral RNA genome into dsDNA in the cytoplasm, shortly
CC       after virus entry into the cell (early reverse transcription) or after
CC       proviral DNA transcription (late reverse transcription). RT consists of
CC       a DNA polymerase activity that can copy either DNA or RNA templates,
CC       and a ribonuclease H (RNase H) activity that cleaves the RNA strand of
CC       RNA-DNA heteroduplexes in a partially processive 3' to 5'
CC       endonucleolytic mode. Conversion of viral genomic RNA into dsDNA
CC       requires many steps. A
CC       tRNA-Lys1,2 binds to the primer-binding site (PBS) situated at the 5'-
CC       end of the viral RNA. RT uses the 3' end of the tRNA primer to perform
CC       a short round of RNA-dependent minus-strand DNA synthesis. The reading
CC       proceeds through the U5 region and ends after the repeated (R) region
CC       which is present at both ends of viral RNA. RNase H digests the RNA
CC       strand of the resulting RNA-DNA heteroduplex, leaving a ssDNA product
CC       attached to the tRNA primer. This ssDNA/tRNA hybridizes with the
CC       identical R region situated at the 3' end of viral RNA. This template
CC       exchange, known as minus-strand DNA strong stop transfer, can be either
CC       intra- or intermolecular. RT uses the 3' end of this newly synthesized
CC       short ssDNA to perform the RNA-dependent minus-strand DNA synthesis of
CC       the whole template. RNase H digests the RNA template except for a
CC       polypurine tract (PPT) situated at the 5'-end and near the center of
CC       the genome. It is not clear if both polymerase and RNase H activities
CC       are simultaneous. RNase H probably can proceed both in a polymerase-
CC       dependent (RNA cut into small fragments by the same RT performing DNA
CC       synthesis) and a polymerase-independent mode (cleavage of remaining RNA
CC       fragments by free RTs). Next, RT performs DNA-directed plus-strand
CC       DNA synthesis using the PPT that has not been removed by RNase H as
CC       primer. PPT and tRNA primers are then removed by RNase H. The 3' and 5'
CC       ssDNA PBS regions hybridize to form a circular dsDNA intermediate.
CC       Strand displacement synthesis by RT to the PBS and PPT ends produces a
CC       blunt ended, linear dsDNA copy of the viral genome that includes long
CC       terminal repeats (LTRs) at both ends (By similarity). {ECO:0000250}.
CC   -!- FUNCTION: Integrase catalyzes viral DNA integration into the host
CC       chromosome, by performing a series of DNA cutting and joining
CC       reactions. This enzyme activity takes place after virion entry into a
CC       cell and reverse transcription of the RNA genome into dsDNA. The first
CC       step in the integration process is 3' processing. This step requires a
CC       complex comprising at least the viral genome, matrix protein, and
CC       integrase. This complex is called the pre-integration complex (PIC).
CC       The integrase protein removes 2 nucleotides from the 3' end of the
CC       viral DNA right (U5) end, leaving the left (U3) end intact. In the
CC       second step, the PIC enters the cell nucleus. Entry is mediated by the
CC       integrase and allows the virus to infect both dividing (nuclear
CC       membrane disassembled) and G1/S-arrested cells (active translocation),
CC       but with no viral gene expression in the latter. In the third step,
CC       termed strand transfer, the integrase protein joins the previously
CC       processed 3' ends to the 5' ends of strands of target cellular DNA at
CC       the site of integration. It is, however, not clear how integration then
CC       proceeds to resolve the asymmetrical cleavage of viral DNA (By
CC       similarity). {ECO:0000250}.
CC   -!- CATALYTIC ACTIVITY:
CC       Reaction=Endonucleolytic cleavage to 5'-phosphomonoester.; EC=3.1.26.4;
CC         Evidence={ECO:0000255|PROSITE-ProRule:PRU00408};
CC   -!- CATALYTIC ACTIVITY:
CC       Reaction=a 2'-deoxyribonucleoside 5'-triphosphate + DNA(n) =
CC         diphosphate + DNA(n+1); Xref=Rhea:RHEA:22508, Rhea:RHEA-COMP:17339,
CC         Rhea:RHEA-COMP:17340, ChEBI:CHEBI:33019, ChEBI:CHEBI:61560,
CC         ChEBI:CHEBI:173112; EC=2.7.7.49; Evidence={ECO:0000255|PROSITE-
CC         ProRule:PRU00405};
CC   -!- CATALYTIC ACTIVITY:
CC       Reaction=a 2'-deoxyribonucleoside 5'-triphosphate + DNA(n) =
CC         diphosphate + DNA(n+1); Xref=Rhea:RHEA:22508, Rhea:RHEA-COMP:17339,
CC         Rhea:RHEA-COMP:17340, ChEBI:CHEBI:33019, ChEBI:CHEBI:61560,
CC         ChEBI:CHEBI:173112; EC=2.7.7.7; Evidence={ECO:0000255|PROSITE-
CC         ProRule:PRU00405};
CC   -!- COFACTOR:
CC       Name=Mg(2+); Xref=ChEBI:CHEBI:18420; Evidence={ECO:0000250};
CC       Note=Binds 2 magnesium ions for reverse transcriptase polymerase
CC       activity. {ECO:0000250};
CC   -!- COFACTOR:
CC       Name=Mg(2+); Xref=ChEBI:CHEBI:18420; Evidence={ECO:0000250};
CC       Note=Binds 2 magnesium ions for ribonuclease H (RNase H) activity.
CC       Substrate-binding is a precondition for magnesium binding.
CC       {ECO:0000250};
CC   -!- COFACTOR:
CC       Name=Mg(2+); Xref=ChEBI:CHEBI:18420; Evidence={ECO:0000250};
CC       Note=Magnesium ions are required for integrase activity. Binds at least
CC       1, maybe 2 magnesium ions. {ECO:0000250};
CC   -!- SUBUNIT: The protease is a homodimer, whose active site consists of two
CC       apposed aspartic acid residues. {ECO:0000255|PROSITE-ProRule:PRU00863}.
CC   -!- SUBCELLULAR LOCATION: [Integrase]: Virion {ECO:0000305}. Host nucleus.
CC       Host cytoplasm {ECO:0000305}. Note=Nuclear at initial phase,
CC       cytoplasmic at assembly. {ECO:0000305}.
CC   -!- SUBCELLULAR LOCATION: [Protease/Reverse transcriptase/ribonuclease H]:
CC       Host nucleus {ECO:0000250}. Host cytoplasm {ECO:0000305}. Note=Nuclear
CC       at initial phase, cytoplasmic at assembly. {ECO:0000305}.
CC   -!- DOMAIN: The reverse transcriptase/ribonuclease H (RT) is structured in
CC       five subdomains: finger, palm, thumb, connection and RNase H. Within
CC       the palm subdomain, the 'primer grip' region is thought to be involved
CC       in the positioning of the primer terminus for accommodating the
CC       incoming nucleotide. The RNase H domain stabilizes the association of
CC       RT with primer-template (By similarity). {ECO:0000250}.
CC   -!- DOMAIN: Integrase core domain contains the D-x(n)-D-x(35)-E motif,
CC       named for the phylogenetically conserved glutamic acid and aspartic
CC       acid residues and the invariant 35 amino acid spacing between the
CC       second and third acidic residues. Each acidic residue of the D,D(35)E
CC       motif is independently essential for the 3'-processing and strand
CC       transfer activities of purified integrase protein (By similarity).
CC       {ECO:0000250}.
CC   -!- PTM: Specific enzymatic cleavages in vivo by viral protease yield
CC       mature proteins. The protease is not cleaved off from Pol. Since
CC       cleavage efficiency is not optimal for all sites, long and active
CC       p65Pro-RT, p87Pro-RT-RNaseH and even some Pr125Pol are detected in
CC       infected cells (By similarity). {ECO:0000250}.
CC   -!- MISCELLANEOUS: The reverse transcriptase is an error-prone enzyme that
CC       lacks a proof-reading function. A high mutation rate is a direct
CC       consequence of this characteristic. RT also displays frequent template
CC       switching, leading to a high recombination rate. Recombination mostly
CC       occurs between homologous regions of the two copackaged RNA genomes. If
CC       these two RNA molecules derive from different viral strains, reverse
CC       transcription will give rise to highly recombinant proviral DNAs.
CC   -!- MISCELLANEOUS: Foamy viruses are distinct from other retroviruses in
CC       many respects. Their protease is active as an uncleaved Pro-Pol
CC       protein. Mature particles do not include the usual processed retroviral
CC       structural proteins (MA, CA and NC), but instead contain two large Gag
CC       proteins. Their functional nucleic acid appears to be either RNA or
CC       dsDNA (up to 20% of extracellular particles), because they probably
CC       proceed either to an early (before integration) or late reverse
CC       transcription (after assembly). Foamy viruses have the ability to
CC       retrotranspose intracellularly with high efficiency. They bud
CC       predominantly into the endoplasmic reticulum (ER) and occasionally at
CC       the plasma membrane. Budding requires the presence of Env proteins.
CC       Most viral particles probably remain within the infected cell.
CC   ---------------------------------------------------------------------------
CC   Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC   Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC   ---------------------------------------------------------------------------
DR   EMBL; U04327; AAA19978.1; -; Genomic_DNA.
DR   RefSeq; NP_056803.1; NC_001364.1.
DR   SMR; Q87040; -.
DR   MEROPS; A09.001; -.
DR   GeneID; 1489965; -.
DR   KEGG; vg:1489965; -.
DR   Proteomes; UP000001063; Genome.
DR   GO; GO:0030430; C:host cell cytoplasm; IEA:UniProtKB-SubCell.
DR   GO; GO:0042025; C:host cell nucleus; IEA:UniProtKB-SubCell.
DR   GO; GO:0004190; F:aspartic-type endopeptidase activity; IEA:UniProtKB-KW.
DR   GO; GO:0003887; F:DNA-directed DNA polymerase activity; IEA:UniProtKB-KW.
DR   GO; GO:0046872; F:metal ion binding; IEA:UniProtKB-KW.
DR   GO; GO:0003723; F:RNA binding; IEA:UniProtKB-KW.
DR   GO; GO:0003964; F:RNA-directed DNA polymerase activity; IEA:UniProtKB-KW.
DR   GO; GO:0004523; F:RNA-DNA hybrid ribonuclease activity; IEA:UniProtKB-EC.
DR   GO; GO:0015074; P:DNA integration; IEA:UniProtKB-KW.
DR   GO; GO:0006310; P:DNA recombination; IEA:UniProtKB-KW.
DR   GO; GO:0075713; P:establishment of integrated proviral latency; IEA:UniProtKB-KW.
DR   GO; GO:0006508; P:proteolysis; IEA:UniProtKB-KW.
DR   GO; GO:0046718; P:viral entry into host cell; IEA:UniProtKB-KW.
DR   GO; GO:0044826; P:viral genome integration into host DNA; IEA:UniProtKB-KW.
DR   GO; GO:0075732; P:viral penetration into host nucleus; IEA:UniProtKB-KW.
DR   Gene3D; 2.40.70.10; -; 1.
DR   Gene3D; 3.30.420.10; -; 2.
DR   Gene3D; 3.30.70.270; -; 2.
DR   InterPro; IPR043502; DNA/RNA_pol_sf.
DR   InterPro; IPR001584; Integrase_cat-core.
DR   InterPro; IPR041588; Integrase_H2C2.
DR   InterPro; IPR021109; Peptidase_aspartic_dom_sf.
DR   InterPro; IPR043128; Rev_trsase/Diguanyl_cyclase.
DR   InterPro; IPR012337; RNaseH-like_sf.
DR   InterPro; IPR002156; RNaseH_domain.
DR   InterPro; IPR036397; RNaseH_sf.
DR   InterPro; IPR000477; RT_dom.
DR   InterPro; IPR041577; RT_RNaseH_2.
DR   InterPro; IPR040903; SH3_11.
DR   InterPro; IPR001641; Spumavirus_A9.
DR   Pfam; PF17921; Integrase_H2C2; 1.
DR   Pfam; PF00075; RNase_H; 1.
DR   Pfam; PF17919; RT_RNaseH_2; 1.
DR   Pfam; PF00665; rve; 1.
DR   Pfam; PF00078; RVT_1; 1.
DR   Pfam; PF18103; SH3_11; 1.
DR   Pfam; PF03539; Spuma_A9PTase; 1.
DR   PRINTS; PR00920; SPUMVIRPTASE.
DR   SUPFAM; SSF53098; SSF53098; 2.
DR   SUPFAM; SSF56672; SSF56672; 1.
DR   PROSITE; PS51531; FV_PR; 1.
DR   PROSITE; PS50994; INTEGRASE; 1.
DR   PROSITE; PS50879; RNASE_H_1; 1.
DR   PROSITE; PS50878; RT_POL; 1.
PE   3: Inferred from homology;
KW   Aspartyl protease; DNA integration; DNA recombination;
KW   DNA-directed DNA polymerase; Endonuclease; Host cytoplasm; Host nucleus;
KW   Hydrolase; Magnesium; Metal-binding; Multifunctional enzyme; Nuclease;
KW   Nucleotidyltransferase; Protease; Reference proteome; RNA-binding;
KW   RNA-directed DNA polymerase; Transferase; Viral genome integration;
KW   Viral penetration into host nucleus; Virion; Virus entry into host cell.
FT   CHAIN           1..1146
FT                   /note="Pro-Pol polyprotein"
FT                   /id="PRO_0000378594"
FT   CHAIN           1..751
FT                   /note="Protease/Reverse transcriptase/ribonuclease H"
FT                   /evidence="ECO:0000250"
FT                   /id="PRO_0000378595"
FT   CHAIN           1..596
FT                   /note="Protease/Reverse transcriptase"
FT                   /evidence="ECO:0000250"
FT                   /id="PRO_0000378596"
FT   CHAIN           597..751
FT                   /note="Ribonuclease H"
FT                   /evidence="ECO:0000250"
FT                   /id="PRO_0000378597"
FT   CHAIN           752..1143
FT                   /note="Integrase"
FT                   /evidence="ECO:0000250"
FT                   /id="PRO_0000378598"
FT   DOMAIN          1..143
FT                   /note="Peptidase A9"
FT                   /evidence="ECO:0000255|PROSITE-ProRule:PRU00863"
FT   DOMAIN          198..363
FT                   /note="Reverse transcriptase"
FT                   /evidence="ECO:0000255|PROSITE-ProRule:PRU00405"
FT   DOMAIN          590..748
FT                   /note="RNase H type-1"
FT                   /evidence="ECO:0000255|PROSITE-ProRule:PRU00408"
FT   DOMAIN          868..1024
FT                   /note="Integrase catalytic"
FT                   /evidence="ECO:0000255|PROSITE-ProRule:PRU00457"
FT   REGION          1114..1146
FT                   /note="Disordered"
FT                   /evidence="ECO:0000256|SAM:MobiDB-lite"
FT   ACT_SITE        24
FT                   /note="For protease activity"
FT                   /evidence="ECO:0000255|PROSITE-ProRule:PRU00863"
FT   BINDING         252
FT                   /ligand="Mg(2+)"
FT                   /ligand_id="ChEBI:CHEBI:18420"
FT                   /ligand_label="1"
FT                   /ligand_note="catalytic; for reverse transcriptase
FT                   activity"
FT                   /evidence="ECO:0000250"
FT   BINDING         314
FT                   /ligand="Mg(2+)"
FT                   /ligand_id="ChEBI:CHEBI:18420"
FT                   /ligand_label="1"
FT                   /ligand_note="catalytic; for reverse transcriptase
FT                   activity"
FT                   /evidence="ECO:0000250"
FT   BINDING         315
FT                   /ligand="Mg(2+)"
FT                   /ligand_id="ChEBI:CHEBI:18420"
FT                   /ligand_label="1"
FT                   /ligand_note="catalytic; for reverse transcriptase
FT                   activity"
FT                   /evidence="ECO:0000250"
FT   BINDING         599
FT                   /ligand="Mg(2+)"
FT                   /ligand_id="ChEBI:CHEBI:18420"
FT                   /ligand_label="2"
FT                   /ligand_note="catalytic; for RNase H activity"
FT                   /evidence="ECO:0000250"
FT   BINDING         646
FT                   /ligand="Mg(2+)"
FT                   /ligand_id="ChEBI:CHEBI:18420"
FT                   /ligand_label="2"
FT                   /ligand_note="catalytic; for RNase H activity"
FT                   /evidence="ECO:0000250"
FT   BINDING         669
FT                   /ligand="Mg(2+)"
FT                   /ligand_id="ChEBI:CHEBI:18420"
FT                   /ligand_label="2"
FT                   /ligand_note="catalytic; for RNase H activity"
FT                   /evidence="ECO:0000250"
FT   BINDING         740
FT                   /ligand="Mg(2+)"
FT                   /ligand_id="ChEBI:CHEBI:18420"
FT                   /ligand_label="2"
FT                   /ligand_note="catalytic; for RNase H activity"
FT                   /evidence="ECO:0000250"
FT   BINDING         874
FT                   /ligand="Mg(2+)"
FT                   /ligand_id="ChEBI:CHEBI:18420"
FT                   /ligand_label="3"
FT                   /ligand_note="catalytic; for integrase activity"
FT                   /evidence="ECO:0000250"
FT   BINDING         936
FT                   /ligand="Mg(2+)"
FT                   /ligand_id="ChEBI:CHEBI:18420"
FT                   /ligand_label="3"
FT                   /ligand_note="catalytic; for integrase activity"
FT                   /evidence="ECO:0000250"
FT   SITE            596..597
FT                   /note="Cleavage; by viral protease; partial"
FT                   /evidence="ECO:0000250"
FT   SITE            751..752
FT                   /note="Cleavage; by viral protease"
FT                   /evidence="ECO:0000250"
SQ   SEQUENCE   1146 AA;  129934 MW;  EDA62B26864FDAA2 CRC64;
     MNPLQLLQPL PAEVKGTKLL AHWDSGATIT CIPESFLEDE QPIKQTLIKT IHGEKQQNVY
     YLTFKVKGRK VEAEVIASPY EYILLSPTDV PWLTQQPLQL TILVPLQEYQ DRILNKTALP
     EEQKQQLKAL FTKYDNLWQH WENQVGHRKI RPHNIATGDY PPRPQKQYPI NPKAKPSIQI
     VIDDLLKQGV LTPQNSTMNT PVYPVPKPDG RWRMVLDYRE VNKTIPLTAA QNQHSAGILA
     TIVRQKYKTT LDLANGFWAH PITPDSYWLT AFTWQGKQYC WTRLPQGFLN SPALFTADAV
     DLLKEVPNVQ VYVDDIYLSH DNPHEHIQQL EKVFQILLQA GYVVSLKKSE IGQRTVEFLG
     FNITKEGRGL TDTFKTKLLN VTPPKDLKQL QSILGLLNFA RNFIPNFAEL VQTLYNLIAS
     SKGKYIEWTE DNTKQLNKVI EALNTASNLE ERLPDQRLVI KVNTSPSAGY VRYYNESGKK
     PIMYLNYVFS KAELKFSMLE KLLTTMHKAL IKAMDLAMGQ EILVYSPIVS MTKIQKTPLP
     ERKALPIRWI TWMTYLEDPR IQFHYDKTLP ELKHIPDVYT SSIPPLKHPS QYEGVFCTDG
     SAIKSPDPTK SNNAGMGIVH AIYNPEYKIL NQWSIPLGHH TAQMAEIAAV EFACKKALKV
     PGPVLVITDS FYVAESANKE LPYWKSNGFV NNKKEPLKHI SKWKSIAECL SIKPDITIQH
     EKGHQPINTS IHTEGNALAD KLATQGSYVV NCNTKKPNLD AELDQLLQGN NVKGYPKQYT
     YYLEDGKVKV SRPEGVKIIP PQSDRQKIVL QAHNLAHTGR EATLLKIANL YWWPNMRKDV
     VKQLGRCKQC LITNASNKTS GPILRPDRPQ KPFDKFFIDY IGPLPPSQGY LYVLVIVDGM
     TGFTWLYPTK APSTSATVKS LNVLTSIAIP KVIHSDQGAA FTSSTFAEWA KERGIHLEFS
     TPYHPQSSGK VERKNSDIKR LLTKLLVGRP TKWYDLLPVV QLALNNTYSP VLKYTPHQLL
     FGIDSNTPFA NQDTLDLTRE EELSLLQEIR ASLYQPSTPP ASSRSWSPVV GQLVQERVAR
     PASLRPRWHK PSTVLEVLNP RTVVILDHLG NNRTVSIDNL KPTSHQNGTT NDTATMDHLE
     QNEQSS
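
The CHAIN and SITE coordinates in the feature table above are 1-based and
inclusive, so they need an offset of one before they can be used as Python
slice indices. The following is a minimal sketch and not part of the UniProt
entry. The names CHAINS, CLEAVAGE_SITES, chain_length, slice_chain and
sites_match_chains are hypothetical helpers introduced here for illustration;
the coordinates themselves are copied from the FT lines.

```python
# Mature chains of POL_SFVCP, copied from the FT CHAIN lines
# (1-based, inclusive coordinates as used by UniProt).
CHAINS = {
    "Protease/Reverse transcriptase": (1, 596),
    "Ribonuclease H": (597, 751),
    "Integrase": (752, 1143),
}

# Protease cleavage sites, copied from the FT SITE lines.
CLEAVAGE_SITES = [(596, 597), (751, 752)]


def chain_length(start, end):
    """Number of residues in a 1-based, inclusive range."""
    return end - start + 1


def slice_chain(sequence, start, end):
    """Return residues start..end (1-based, inclusive) of a sequence string."""
    return sequence[start - 1:end]


def sites_match_chains(chains, sites):
    """Check that each cleavage site (p, p+1) separates two annotated chains.

    A site is consistent when some chain ends at p and another starts at p+1.
    """
    ends = {end for _, end in chains.values()}
    starts = {start for start, _ in chains.values()}
    return all(left in ends and right in starts for left, right in sites)


if __name__ == "__main__":
    for name, (start, end) in CHAINS.items():
        print(f"{name}: {start}..{end} ({chain_length(start, end)} aa)")
    print("cleavage sites consistent:", sites_match_chains(CHAINS, CLEAVAGE_SITES))
```

To extract a mature chain, pass the SQ block with spaces and line breaks
removed to slice_chain together with a coordinate pair from CHAINS.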