GAG_SIVEK

ID   GAG_SIVEK               Reviewed;         511 AA.
AC   Q1A250;
DT   05-SEP-2006, integrated into UniProtKB/Swiss-Prot.
DT   23-JAN-2007, sequence version 3.
DT   23-FEB-2022, entry version 76.
DE   RecName: Full=Gag polyprotein;
DE   AltName: Full=Pr55Gag;
DE   Contains:
DE     RecName: Full=Matrix protein p17;
DE              Short=MA;
DE   Contains:
DE     RecName: Full=Capsid protein p24;
DE              Short=CA;
DE   Contains:
DE     RecName: Full=Spacer peptide p2;
DE   Contains:
DE     RecName: Full=Nucleocapsid protein p7;
DE              Short=NC;
DE   Contains:
DE     RecName: Full=Spacer peptide p1;
DE   Contains:
DE     RecName: Full=p6-gag;
GN   Name=gag;
OS   Simian immunodeficiency virus (isolate EK505) (SIV-cpz) (Chimpanzee
OS   immunodeficiency virus).
OC   Viruses; Riboviria; Pararnavirae; Artverviricota; Revtraviricetes;
OC   Ortervirales; Retroviridae; Orthoretrovirinae; Lentivirus.
OX   NCBI_TaxID=388912;
OH   NCBI_TaxID=9598; Pan troglodytes (Chimpanzee).
RN   [1]
RP   NUCLEOTIDE SEQUENCE [GENOMIC RNA].
RX   PubMed=16728595; DOI=10.1126/science.1126531;
RA   Keele B.F., Van Heuverswyn F., Li Y., Bailes E., Takehisa J.,
RA   Santiago M.L., Bibollet-Ruche F., Chen Y., Wain L.V., Liegeois F., Loul S.,
RA   Ngole E.M., Bienvenue Y., Delaporte E., Brookfield J.F., Sharp P.M.,
RA   Shaw G.M., Peeters M., Hahn B.H.;
RT   "Chimpanzee reservoirs of pandemic and nonpandemic HIV-1.";
RL   Science 313:523-526(2006).
CC   -!- FUNCTION: Matrix protein p17 targets Gag and Gag-Pol polyproteins to
CC       the plasma membrane via a multipartite membrane-binding signal that
CC       includes its myristoylated N-terminus. Also mediates nuclear
CC       localization of the preintegration complex. Implicated in the release
CC       from host cell mediated by Vpu (By similarity). {ECO:0000250}.
CC   -!- FUNCTION: Capsid protein p24 forms the conical core of the virus that
CC       encapsulates the genomic RNA-nucleocapsid complex. {ECO:0000250}.
CC   -!- FUNCTION: Nucleocapsid protein p7 encapsulates and protects viral
CC       dimeric unspliced (genomic) RNA. Binds these RNAs through its zinc
CC       fingers (By similarity). {ECO:0000250}.
CC   -!- FUNCTION: p6-gag plays a role in budding of the assembled particle by
CC       interacting with the host class E VPS proteins TSG101 and PDCD6IP/AIP1.
CC       {ECO:0000250}.
CC   -!- SUBUNIT: [Matrix protein p17]: Homotrimer. Interacts with gp41 (via C-
CC       terminus). {ECO:0000250|UniProtKB:P04591,
CC       ECO:0000250|UniProtKB:P12493}.
CC   -!- SUBUNIT: [p6-gag]: Interacts with host TSG101 (By similarity).
CC       {ECO:0000250|UniProtKB:P12493}.
CC   -!- SUBCELLULAR LOCATION: [Matrix protein p17]: Virion {ECO:0000305}. Host
CC       nucleus {ECO:0000250}. Host cytoplasm {ECO:0000250}. Host cell membrane
CC       {ECO:0000305}; Lipid-anchor {ECO:0000305}. Note=Following virus entry,
CC       the nuclear localization signal (NLS) of the matrix protein
CC       participates with Vpr in the nuclear localization of the viral genome.
CC       During virus production, the nuclear export activity of the matrix
CC       protein counteracts the NLS to maintain the Gag and Gag-Pol
CC       polyproteins in the cytoplasm, thereby directing unspliced RNA to the
CC       plasma membrane (By similarity). {ECO:0000250}.
CC   -!- SUBCELLULAR LOCATION: [Capsid protein p24]: Virion {ECO:0000305}.
CC   -!- SUBCELLULAR LOCATION: [Nucleocapsid protein p7]: Virion {ECO:0000305}.
CC   -!- ALTERNATIVE PRODUCTS:
CC       Event=Ribosomal frameshifting; Named isoforms=2;
CC         Comment=Translation results in the formation of the Gag polyprotein
CC         most of the time. Ribosomal frameshifting at the gag-pol gene
CC         boundary occurs at low frequency and produces the Gag-Pol
CC         polyprotein. This strategy of translation probably allows the virus
CC         to modulate the quantity of each viral protein. Maintenance of a
CC         correct Gag to Gag-Pol ratio is essential for RNA dimerization and
CC         viral infectivity.;
CC       Name=Gag polyprotein;
CC         IsoId=Q1A250-1; Sequence=Displayed;
CC       Name=Gag-Pol polyprotein;
CC         IsoId=Q1A249-1; Sequence=External;
CC   -!- DOMAIN: Late-budding domains (L domains) are short sequence motifs
CC       essential for viral particle budding. They recruit proteins of the host
CC       ESCRT machinery (Endosomal Sorting Complex Required for Transport) or
CC       ESCRT-associated proteins. p6-gag contains two L domains: a PTAP/PSAP
CC       motif, which interacts with the UEV domain of TSG101 and a LYPX(n)L
CC       motif which interacts with PDCD6IP/AIP1 (By similarity). {ECO:0000250}.
CC   -!- PTM: Capsid protein p24 is phosphorylated. {ECO:0000250}.
CC   -!- PTM: Specific enzymatic cleavages by the viral protease yield mature
CC       proteins. The polyprotein is cleaved during and after budding; this
CC       process is termed maturation (By similarity). {ECO:0000250}.
CC   -!- MISCELLANEOUS: [Isoform Gag polyprotein]: Produced by conventional
CC       translation.
CC   -!- SIMILARITY: Belongs to the primate lentivirus group gag polyprotein
CC       family. {ECO:0000305}.
CC   ---------------------------------------------------------------------------
CC   Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC   Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC   ---------------------------------------------------------------------------
DR   EMBL; DQ373065; ABD19492.1; -; Genomic_RNA.
DR   SMR; Q1A250; -.
DR   PRO; PR:Q1A250; -.
DR   Proteomes; UP000008436; Genome.
DR   GO; GO:0030430; C:host cell cytoplasm; IEA:UniProtKB-SubCell.
DR   GO; GO:0042025; C:host cell nucleus; IEA:UniProtKB-SubCell.
DR   GO; GO:0020002; C:host cell plasma membrane; IEA:UniProtKB-SubCell.
DR   GO; GO:0016020; C:membrane; IEA:UniProtKB-KW.
DR   GO; GO:0019013; C:viral nucleocapsid; IEA:UniProtKB-KW.
DR   GO; GO:0003723; F:RNA binding; IEA:UniProtKB-KW.
DR   GO; GO:0005198; F:structural molecule activity; IEA:InterPro.
DR   GO; GO:0008270; F:zinc ion binding; IEA:InterPro.
DR   GO; GO:0039702; P:viral budding via host ESCRT complex; IEA:UniProtKB-KW.
DR   Gene3D; 1.10.1200.30; -; 1.
DR   Gene3D; 1.10.150.90; -; 1.
DR   Gene3D; 1.10.375.10; -; 1.
DR   InterPro; IPR045345; Gag_p24_C.
DR   InterPro; IPR000721; Gag_p24_N.
DR   InterPro; IPR014817; Gag_p6.
DR   InterPro; IPR000071; Lentvrl_matrix_N.
DR   InterPro; IPR012344; Matrix_HIV/RSV_N.
DR   InterPro; IPR008916; Retrov_capsid_C.
DR   InterPro; IPR008919; Retrov_capsid_N.
DR   InterPro; IPR010999; Retrovr_matrix.
DR   InterPro; IPR001878; Znf_CCHC.
DR   InterPro; IPR036875; Znf_CCHC_sf.
DR   Pfam; PF00540; Gag_p17; 1.
DR   Pfam; PF00607; Gag_p24; 1.
DR   Pfam; PF19317; Gag_p24_C; 1.
DR   Pfam; PF08705; Gag_p6; 1.
DR   Pfam; PF00098; zf-CCHC; 2.
DR   PRINTS; PR00234; HIV1MATRIX.
DR   SMART; SM00343; ZnF_C2HC; 2.
DR   SUPFAM; SSF47836; SSF47836; 1.
DR   SUPFAM; SSF47943; SSF47943; 1.
DR   SUPFAM; SSF57756; SSF57756; 1.
DR   PROSITE; PS50158; ZF_CCHC; 2.
PE   3: Inferred from homology;
KW   Capsid protein; Host cell membrane; Host cytoplasm; Host membrane;
KW   Host nucleus; Host-virus interaction; Lipoprotein; Membrane; Metal-binding;
KW   Myristate; Phosphoprotein; Reference proteome; Repeat;
KW   Ribosomal frameshifting; RNA-binding; Viral budding;
KW   Viral budding via the host ESCRT complexes; Viral nucleoprotein;
KW   Viral release from host cell; Virion; Zinc; Zinc-finger.
FT   INIT_MET        1
FT                   /note="Removed; by host"
FT                   /evidence="ECO:0000250"
FT   CHAIN           2..511
FT                   /note="Gag polyprotein"
FT                   /id="PRO_0000261252"
FT   CHAIN           2..135
FT                   /note="Matrix protein p17"
FT                   /evidence="ECO:0000250"
FT                   /id="PRO_0000249357"
FT   CHAIN           136..366
FT                   /note="Capsid protein p24"
FT                   /evidence="ECO:0000250"
FT                   /id="PRO_0000249358"
FT   PEPTIDE         367..380
FT                   /note="Spacer peptide p2"
FT                   /evidence="ECO:0000250"
FT                   /id="PRO_0000249359"
FT   CHAIN           381..437
FT                   /note="Nucleocapsid protein p7"
FT                   /evidence="ECO:0000250"
FT                   /id="PRO_0000249360"
FT   PEPTIDE         438..453
FT                   /note="Spacer peptide p1"
FT                   /evidence="ECO:0000250"
FT                   /id="PRO_0000249361"
FT   CHAIN           454..511
FT                   /note="p6-gag"
FT                   /evidence="ECO:0000250"
FT                   /id="PRO_0000249362"
FT   ZN_FING         393..410
FT                   /note="CCHC-type 1"
FT                   /evidence="ECO:0000255|PROSITE-ProRule:PRU00047"
FT   ZN_FING         414..431
FT                   /note="CCHC-type 2"
FT                   /evidence="ECO:0000255|PROSITE-ProRule:PRU00047"
FT   REGION          107..129
FT                   /note="Disordered"
FT                   /evidence="ECO:0000256|SAM:MobiDB-lite"
FT   REGION          443..511
FT                   /note="Disordered"
FT                   /evidence="ECO:0000256|SAM:MobiDB-lite"
FT   MOTIF           16..22
FT                   /note="Nuclear export signal"
FT                   /evidence="ECO:0000250"
FT   MOTIF           26..32
FT                   /note="Nuclear localization signal"
FT                   /evidence="ECO:0000250"
FT   MOTIF           462..465
FT                   /note="PTAP/PSAP motif"
FT   MOTIF           493..503
FT                   /note="LYPX(n)L motif"
FT   COMPBIAS        477..491
FT                   /note="Basic and acidic residues"
FT                   /evidence="ECO:0000256|SAM:MobiDB-lite"
FT   COMPBIAS        492..511
FT                   /note="Polar residues"
FT                   /evidence="ECO:0000256|SAM:MobiDB-lite"
FT   SITE            135..136
FT                   /note="Cleavage; by viral protease"
FT                   /evidence="ECO:0000250"
FT   SITE            366..367
FT                   /note="Cleavage; by viral protease"
FT                   /evidence="ECO:0000250"
FT   SITE            380..381
FT                   /note="Cleavage; by viral protease"
FT                   /evidence="ECO:0000250"
FT   SITE            437..438
FT                   /note="Cleavage; by viral protease"
FT                   /evidence="ECO:0000250"
FT   SITE            453..454
FT                   /note="Cleavage; by viral protease"
FT                   /evidence="ECO:0000250"
FT   LIPID           2
FT                   /note="N-myristoyl glycine; by host"
FT                   /evidence="ECO:0000250"
SQ   SEQUENCE   511 AA;  56750 MW;  5DEE460AA84CE2F6 CRC64;
     MGARASVLTG GKLDQWEKIY LRPGGKKKYM MKHLVWASRE LERFACNPGL MDTAEGCAQL
     LRQLEPALKT GSEGLRSLFN TLAVLYCVHN NIKVQNTQEA LEKLREKMKA EQKEPEPEQA
     AGAAAAPESS ISRNYPLVQN AQGQMVHQPL SPRTLNAWVK VVEEKAFNPE VIPMFMALSE
     GATPQDLNTM LNTVGGHQAA MQMLKEVINE EAAEWDRGHP VHMGPIPPGQ VREPRGSDIA
     GTTSTLAEQV AWMTANPPVP VGDIYRRWIV LGLNKIVRMY SPASILDIKQ GPKETFRDYV
     DRFYKTLRAE QATQEVKNWM TETLLVQNAN PDCKNILRAL GPGASLEEMM TACQGVGGPA
     HKARVLAEAM TQAQTATSVF MQRGNFKGIR KTIKCFNCGK EGHLARNCKA PRKKGCWKCG
     QEGHQMKDCR SGERQANFLG KVWPLSKGRP GNFPQTTTRK EPTAPPIESY GYQEEKTTQG
     TEREEKEKTE SSLYPPLTSL KSLFGSDPSL Q
//