GAG_AVISN

ID   GAG_AVISN               Reviewed;         313 AA.
AC   P03342;
DT   21-JUL-1986, integrated into UniProtKB/Swiss-Prot.
DT   21-JUL-1986, sequence version 1.
DT   02-JUN-2021, entry version 91.
DE   RecName: Full=Gag polyprotein;
DE   AltName: Full=Core polyprotein;
DE   Contains:
DE     RecName: Full=Matrix protein p15;
DE              Short=MA;
DE   Contains:
DE     RecName: Full=RNA-binding phosphoprotein p12;
DE     AltName: Full=pp12;
DE   Contains:
DE     RecName: Full=Capsid protein p30;
DE              Short=CA;
DE   Flags: Fragment;
GN   Name=gag;
OS   Avian spleen necrosis virus.
OC   Viruses; Riboviria; Pararnavirae; Artverviricota; Revtraviricetes;
OC   Ortervirales; Retroviridae; Orthoretrovirinae; Gammaretrovirus.
OX   NCBI_TaxID=11899;
OH   NCBI_TaxID=8976; Galliformes.
RN   [1]
RP   NUCLEOTIDE SEQUENCE [GENOMIC DNA] (PROVIRUS).
RX   PubMed=6951170; DOI=10.1073/pnas.79.4.1230;
RA   O'Rear J.J., Temin H.M.;
RT   "Spontaneous changes in nucleotide sequence in proviruses of spleen
RT   necrosis virus, an avian retrovirus.";
RL   Proc. Natl. Acad. Sci. U.S.A. 79:1230-1234(1982).
RN   [2]
RP   PROTEIN SEQUENCE OF 200-235.
RX   PubMed=6169843; DOI=10.1128/jvi.39.3.845-854.1981;
RA   Oroszlan S., Barbacid M., Copeland T.D., Aaronson S.A., Gilden R.V.;
RT   "Chemical and Immunological characterization of the major structural
RT   protein (p28) of MMC-1, a rhesus monkey endogenous type C virus: homology
RT   with the major structural protein of avian reticuloendotheliosis virus.";
RL   J. Virol. 39:845-854(1981).
CC   -!- FUNCTION: Gag polyprotein plays a role in budding and is processed by
CC       the viral protease during virion maturation outside the cell. During
CC       budding, it recruits, in a PPXY-dependent or independent manner, Nedd4-
CC       like ubiquitin ligases that conjugate ubiquitin molecules to Gag, or to
CC       Gag-binding host factors. Interaction with HECT ubiquitin ligases
CC       probably links the viral protein to the host ESCRT pathway and
CC       facilitates release (By similarity). {ECO:0000250}.
CC   -!- FUNCTION: Matrix protein p15 targets Gag and gag-pol polyproteins to
CC       the plasma membrane via a multipartite membrane-binding signal that
CC       includes its myristoylated N-terminus. Also mediates nuclear
CC       localization of the preintegration complex (By similarity).
CC       {ECO:0000250}.
CC   -!- FUNCTION: Capsid protein p30 forms the spherical core of the virion
CC       that encapsulates the genomic RNA-nucleocapsid complex. {ECO:0000250}.
CC   -!- FUNCTION: Nucleocapsid protein p10 is involved in the packaging and
CC       encapsidation of two copies of the genome. Binds with high affinity to
CC       conserved elements within the packaging signal, located near the 5'-end
CC       of the genome. This binding is dependent on genome dimerization (By
CC       similarity). {ECO:0000250}.
CC   -!- SUBUNIT: Capsid protein p30 is a homohexamer that further associates
CC       as a homomultimer. The virus core is composed of a lattice formed from
CC       hexagonal rings, each containing six capsid monomers (By similarity).
CC       {ECO:0000250}.
CC   -!- SUBCELLULAR LOCATION: [Gag polyprotein]: Virion {ECO:0000250}. Host
CC       cell membrane {ECO:0000305}; Lipid-anchor {ECO:0000305}.
CC   -!- SUBCELLULAR LOCATION: [Matrix protein p15]: Virion {ECO:0000305}.
CC   -!- SUBCELLULAR LOCATION: [Capsid protein p30]: Virion {ECO:0000305}.
CC   -!- DOMAIN: Late-budding domains (L domains) are short sequence motifs
CC       essential for viral particle budding. They recruit proteins of the host
CC       ESCRT machinery (Endosomal Sorting Complex Required for Transport) or
CC       ESCRT-associated proteins. RNA-binding phosphoprotein p12 contains one
CC       L domain: a PPXY motif which potentially interacts with the WW domain 3
CC       of NEDD4 E3 ubiquitin ligase. Matrix protein p15 contains one L domain:
CC       a PTAP/PSAP motif, which potentially interacts with the UEV domain of
CC       TSG101 (By similarity). {ECO:0000250}.
CC   -!- PTM: Specific enzymatic cleavages by the viral protease yield mature
CC       proteins. The protease is released by autocatalytic cleavage. The
CC       polyprotein is cleaved during and after budding; this process is termed
CC       maturation (By similarity). {ECO:0000250}.
CC   ---------------------------------------------------------------------------
CC   Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC   Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC   ---------------------------------------------------------------------------
DR   EMBL; V01200; CAA24513.1; -; Genomic_DNA.
DR   PIR; A93904; FOVDA.
DR   SMR; P03342; -.
DR   GO; GO:0020002; C:host cell plasma membrane; IEA:UniProtKB-SubCell.
DR   GO; GO:0016020; C:membrane; IEA:UniProtKB-KW.
DR   GO; GO:0019013; C:viral nucleocapsid; IEA:UniProtKB-KW.
DR   GO; GO:0003723; F:RNA binding; IEA:UniProtKB-KW.
DR   GO; GO:0039660; F:structural constituent of virion; IEA:UniProtKB-KW.
DR   GO; GO:0039702; P:viral budding via host ESCRT complex; IEA:UniProtKB-KW.
DR   Gene3D; 1.10.150.180; -; 1.
DR   Gene3D; 1.10.375.10; -; 1.
DR   InterPro; IPR000840; G_retro_matrix.
DR   InterPro; IPR036946; G_retro_matrix_sf.
DR   InterPro; IPR003036; Gag_P30.
DR   InterPro; IPR008919; Retrov_capsid_N.
DR   InterPro; IPR010999; Retrovr_matrix.
DR   Pfam; PF01140; Gag_MA; 1.
DR   Pfam; PF02093; Gag_p30; 1.
DR   SUPFAM; SSF47836; SSF47836; 1.
DR   SUPFAM; SSF47943; SSF47943; 1.
PE   1: Evidence at protein level;
KW   Capsid protein; Direct protein sequencing; Host cell membrane;
KW   Host membrane; Host-virus interaction; Lipoprotein; Membrane; Myristate;
KW   RNA-binding; Viral budding; Viral budding via the host ESCRT complexes;
KW   Viral matrix protein; Viral nucleoprotein; Viral release from host cell;
KW   Virion.
FT   INIT_MET        1
FT                   /note="Removed; by host"
FT                   /evidence="ECO:0000250"
FT   CHAIN           2..>313
FT                   /note="Gag polyprotein"
FT                   /id="PRO_0000390796"
FT   CHAIN           2..?
FT                   /note="Matrix protein p15"
FT                   /evidence="ECO:0000255"
FT                   /id="PRO_0000040821"
FT   CHAIN           ?..199
FT                   /note="RNA-binding phosphoprotein p12"
FT                   /evidence="ECO:0000255"
FT                   /id="PRO_0000040822"
FT   CHAIN           200..>313
FT                   /note="Capsid protein p30"
FT                   /evidence="ECO:0000255"
FT                   /id="PRO_0000040823"
FT   REGION          113..135
FT                   /note="Disordered"
FT                   /evidence="ECO:0000256|SAM:MobiDB-lite"
FT   REGION          167..197
FT                   /note="Disordered"
FT                   /evidence="ECO:0000256|SAM:MobiDB-lite"
FT   REGION          280..313
FT                   /note="Disordered"
FT                   /evidence="ECO:0000256|SAM:MobiDB-lite"
FT   MOTIF           127..130
FT                   /note="PTAP/PSAP motif"
FT   MOTIF           154..157
FT                   /note="PPXY motif"
FT   COMPBIAS        168..183
FT                   /note="Polar residues"
FT                   /evidence="ECO:0000256|SAM:MobiDB-lite"
FT   COMPBIAS        280..303
FT                   /note="Basic and acidic residues"
FT                   /evidence="ECO:0000256|SAM:MobiDB-lite"
FT   LIPID           2
FT                   /note="N-myristoyl glycine; by host"
FT                   /evidence="ECO:0000250"
FT   CONFLICT        210
FT                   /note="T -> G (in Ref. 2; AA sequence)"
FT                   /evidence="ECO:0000305"
FT   CONFLICT        235
FT                   /note="S -> F (in Ref. 2; AA sequence)"
FT                   /evidence="ECO:0000305"
FT   NON_TER         313
SQ   SEQUENCE   313 AA;  35362 MW;  6712FE769E57C45A CRC64;
     MGQAGSKGLL TPLECILKNF SDFKKRAGDY GEDVDSFALR KLCELEWPTF GVGWPKEGTL
     DFKVVAAVRN IVFGNPGHPD QVIYITVWTD ITIERPKYLK SCGCKPHRTS KVLLASQKVN
     PRRPVLPSAP ESPPRIRRAQ FLDERPLSPA PAPPPPYPEV SAIVEDTREG QQPDSTVMTS
     PPHTRSGLEF GAQGPSGMYP LRETGERDMT GRPMRTYVPF TTSDLYNWKN QNPSSFSQAP
     DQVISLLESV FYTHQPTWDD CQQLLRTLFT TEERERVRTE SRREVRNDQG VQVTDEREIE
     AQFPATRPDW VGS
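
The MOTIF, LIPID and CONFLICT coordinates in the feature table can be cross-checked against the SQ block. The following is a minimal Python sketch, not part of the UniProt record: the names SEQ and find_motif are illustrative, and the regular expressions P[ST]AP and PP.Y are assumed readings of the "PTAP/PSAP" and "PPXY" motif notes.

```python
import re

# Sequence copied from the SQ block of this entry, with spaces removed.
SEQ = (
    "MGQAGSKGLLTPLECILKNFSDFKKRAGDYGEDVDSFALRKLCELEWPTFGVGWPKEGTL"
    "DFKVVAAVRNIVFGNPGHPDQVIYITVWTDITIERPKYLKSCGCKPHRTSKVLLASQKVN"
    "PRRPVLPSAPESPPRIRRAQFLDERPLSPAPAPPPPYPEVSAIVEDTREGQQPDSTVMTS"
    "PPHTRSGLEFGAQGPSGMYPLRETGERDMTGRPMRTYVPFTTSDLYNWKNQNPSSFSQAP"
    "DQVISLLESVFYTHQPTWDDCQQLLRTLFTTEERERVRTESRREVRNDQGVQVTDEREIE"
    "AQFPATRPDWVGS"
)


def find_motif(pattern, seq):
    """Return 1-based inclusive (start, end) spans of every match.

    A lookahead is used so that overlapping matches are also reported.
    """
    lookahead = re.compile(f"(?=({pattern}))")
    return [
        (m.start() + 1, m.start() + len(m.group(1)))
        for m in lookahead.finditer(seq)
    ]


print(len(SEQ))                        # 313, as stated on the SQ line
print(find_motif(r"P[ST]AP", SEQ))     # [(127, 130)], the PTAP/PSAP motif
print(find_motif(r"PP.Y", SEQ))        # [(154, 157)], the PPXY motif
print(SEQ[2 - 1])                      # G, the N-myristoyl glycine at position 2
print(SEQ[210 - 1], SEQ[235 - 1])      # T S, the residues listed under CONFLICT
```

Feature coordinates in the flat file are 1-based and inclusive, so each position is shifted by one when indexing the Python string.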