GAG_HTL1M

ID   GAG_HTL1M               Reviewed;         429 AA.
AC   P14077;
DT   01-JAN-1990, integrated into UniProtKB/Swiss-Prot.
DT   23-JAN-2007, sequence version 3.
DT   23-FEB-2022, entry version 121.
DE   RecName: Full=Gag polyprotein;
DE   AltName: Full=Pr53Gag;
DE   Contains:
DE     RecName: Full=Matrix protein p19;
DE              Short=MA;
DE   Contains:
DE     RecName: Full=Capsid protein p24;
DE              Short=CA;
DE   Contains:
DE     RecName: Full=Nucleocapsid protein p15-gag;
DE              Short=NC-gag;
GN   Name=gag;
OS   Human T-cell leukemia virus 1 (strain Japan MT-2 subtype A) (HTLV-1).
OC   Viruses; Riboviria; Pararnavirae; Artverviricota; Revtraviricetes;
OC   Ortervirales; Retroviridae; Orthoretrovirinae; Deltaretrovirus.
OX   NCBI_TaxID=11928;
OH   NCBI_TaxID=9606; Homo sapiens (Human).
RN   [1]
RP   NUCLEOTIDE SEQUENCE [GENOMIC RNA].
RX   PubMed=2678008; DOI=10.1093/nar/17.19.7998;
RA   Gray G.S., Bartman T., White M.;
RT   "Nucleotide sequence of the core (gag) gene from HTLV-1 isolate MT-2.";
RL   Nucleic Acids Res. 17:7998-7998(1989).
RN   [2]
RP   MYRISTOYLATION AT GLY-2.
RX   PubMed=2547372; DOI=10.1016/0006-291x(89)92370-x;
RA   Shoji S., Tashiro A., Furuishi K., Takenaka O., Kida Y., Horiuchi S.,
RA   Funakoshi T., Kubota Y.;
RT   "Antibodies to an NH2-terminal myristoyl glycine moiety can detect NH2-
RT   terminal myristoylated proteins in the retrovirus-infected cells.";
RL   Biochem. Biophys. Res. Commun. 162:724-732(1989).
RN   [3]
RP   STRUCTURE BY NMR OF 131-264.
RX   PubMed=11243788; DOI=10.1006/jmbi.2000.4395;
RA   Cornilescu C.C., Bouamr F., Yao X., Carter C., Tjandra N.;
RT   "Structural analysis of the N-terminal domain of the human T-cell leukemia
RT   virus capsid protein.";
RL   J. Mol. Biol. 306:783-797(2001).
CC   -!- FUNCTION: [Gag polyprotein]: The matrix domain targets Gag, Gag-Pro and
CC       Gag-Pro-Pol polyproteins to the plasma membrane via a multipartite
CC       membrane-binding signal that includes its myristoylated N-terminus.
CC       {ECO:0000250|UniProtKB:P03345}.
CC   -!- FUNCTION: [Matrix protein p19]: Matrix protein.
CC       {ECO:0000250|UniProtKB:P03345}.
CC   -!- FUNCTION: [Capsid protein p24]: Forms the spherical core of the virus
CC       that encapsulates the genomic RNA-nucleocapsid complex.
CC       {ECO:0000250|UniProtKB:P03345}.
CC   -!- FUNCTION: [Nucleocapsid protein p15-gag]: Binds strongly to viral
CC       nucleic acids and promotes their aggregation. Also destabilizes
CC       nucleic acid duplexes via highly structured zinc-binding motifs.
CC       {ECO:0000250|UniProtKB:P03345}.
CC   -!- SUBUNIT: [Gag polyprotein]: Homodimer; the homodimers are part of the
CC       immature particles. Interacts with human TSG101 and NEDD4; these
CC       interactions are essential for budding and release of viral particles.
CC       {ECO:0000250|UniProtKB:P03345}.
CC   -!- SUBUNIT: [Matrix protein p19]: Homodimer; further assembles as
CC       homohexamers. {ECO:0000250|UniProtKB:P03345}.
CC   -!- SUBCELLULAR LOCATION: [Matrix protein p19]: Virion
CC       {ECO:0000250|UniProtKB:P03345}.
CC   -!- SUBCELLULAR LOCATION: [Capsid protein p24]: Virion
CC       {ECO:0000250|UniProtKB:P03345}.
CC   -!- SUBCELLULAR LOCATION: [Nucleocapsid protein p15-gag]: Virion
CC       {ECO:0000250|UniProtKB:P03345}.
CC   -!- ALTERNATIVE PRODUCTS:
CC       Event=Ribosomal frameshifting; Named isoforms=3;
CC         Comment=This strategy of translation probably allows the virus to
CC         modulate the quantity of each viral protein. {ECO:0000305};
CC       Name=Gag polyprotein;
CC         IsoId=P14077-1; Sequence=Displayed;
CC       Name=Gag-Pro polyprotein;
CC         IsoId=P14077-2; Sequence=Not described;
CC       Name=Gag-Pol polyprotein;
CC         IsoId=P14077-3; Sequence=Not described;
CC   -!- DOMAIN: [Gag polyprotein]: Late-budding domains (L domains) are short
CC       sequence motifs essential for viral particle release. They can occur
CC       individually or in close proximity within structural proteins. They
CC       interact with cellular sorting proteins of the multivesicular body
CC       (MVB) pathway. Most of these proteins are class E vacuolar protein
CC       sorting factors belonging to ESCRT-I, ESCRT-II or ESCRT-III complexes.
CC       Matrix protein p19 contains two L domains: a PTAP/PSAP motif which
CC       interacts with the UEV domain of TSG101, and a PPXY motif which binds
CC       to the WW domains of the ubiquitin ligase NEDD4.
CC       {ECO:0000250|UniProtKB:P03345}.
CC   -!- DOMAIN: [Capsid protein p24]: The capsid protein N-terminus seems to be
CC       involved in Gag-Gag interactions. {ECO:0000250|UniProtKB:P03345}.
CC   -!- DOMAIN: [Nucleocapsid protein p15-gag]: The C-terminus is acidic.
CC       {ECO:0000250|UniProtKB:P03345}.
CC   -!- PTM: [Gag polyprotein]: Specific enzymatic cleavages by the viral
CC       protease yield mature proteins. The polyprotein is cleaved during and
CC       after budding; this process is termed maturation.
CC       {ECO:0000250|UniProtKB:P03345}.
CC   -!- PTM: [Matrix protein p19]: Phosphorylation of the matrix protein p19 by
CC       MAPK1 seems to play a role in budding. {ECO:0000250|UniProtKB:P03345}.
CC   -!- PTM: [Gag polyprotein]: Ubiquitinated by host NEDD4.
CC       {ECO:0000250|UniProtKB:P03345}.
CC   -!- PTM: [Gag polyprotein]: Myristoylated (PubMed:2547372). Myristoylation
CC       of the matrix (MA) domain mediates the transport and binding of Gag
CC       polyproteins to the host plasma membrane and is required for the
CC       assembly of viral particles. {ECO:0000250|UniProtKB:P03345,
CC       ECO:0000269|PubMed:2547372}.
CC   -!- MISCELLANEOUS: HTLV-1 lineages are divided into four clades: A
CC       (Cosmopolitan), B (Central African group), C (Melanesian group) and D
CC       (New Central African group). {ECO:0000305}.
CC   -!- MISCELLANEOUS: [Isoform Gag polyprotein]: Produced by conventional
CC       translation. {ECO:0000250|UniProtKB:P03345}.
CC   -!- MISCELLANEOUS: [Isoform Gag-Pro polyprotein]: Produced by -1 ribosomal
CC       frameshifting at the gag-pro genes boundary.
CC       {ECO:0000250|UniProtKB:P03345}.
CC   -!- MISCELLANEOUS: [Isoform Gag-Pol polyprotein]: Produced by -1 ribosomal
CC       frameshifting at the gag-pol genes boundary.
CC       {ECO:0000250|UniProtKB:P03345}.
CC   ---------------------------------------------------------------------------
CC   Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC   Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC   ---------------------------------------------------------------------------
DR   EMBL; X15951; CAA34075.1; -; Genomic_RNA.
DR   PIR; S06073; S06073.
DR   PDB; 1G03; NMR; -; A=131-264.
DR   PDBsum; 1G03; -.
DR   SMR; P14077; -.
DR   iPTMnet; P14077; -.
DR   EvolutionaryTrace; P14077; -.
DR   GO; GO:0019013; C:viral nucleocapsid; IEA:UniProtKB-KW.
DR   GO; GO:0003676; F:nucleic acid binding; IEA:InterPro.
DR   GO; GO:0005198; F:structural molecule activity; IEA:InterPro.
DR   GO; GO:0008270; F:zinc ion binding; IEA:InterPro.
DR   GO; GO:0016032; P:viral process; IEA:InterPro.
DR   Gene3D; 1.10.1200.30; -; 1.
DR   Gene3D; 1.10.375.10; -; 1.
DR   InterPro; IPR003139; D_retro_matrix.
DR   InterPro; IPR045345; Gag_p24_C.
DR   InterPro; IPR000721; Gag_p24_N.
DR   InterPro; IPR008916; Retrov_capsid_C.
DR   InterPro; IPR008919; Retrov_capsid_N.
DR   InterPro; IPR010999; Retrovr_matrix.
DR   InterPro; IPR001878; Znf_CCHC.
DR   InterPro; IPR036875; Znf_CCHC_sf.
DR   Pfam; PF02228; Gag_p19; 1.
DR   Pfam; PF00607; Gag_p24; 1.
DR   Pfam; PF19317; Gag_p24_C; 1.
DR   Pfam; PF00098; zf-CCHC; 1.
DR   SMART; SM00343; ZnF_C2HC; 2.
DR   SUPFAM; SSF47836; SSF47836; 1.
DR   SUPFAM; SSF47943; SSF47943; 1.
DR   SUPFAM; SSF57756; SSF57756; 1.
DR   PROSITE; PS50158; ZF_CCHC; 1.
PE   1: Evidence at protein level;
KW   3D-structure; Capsid protein; Disulfide bond; Host-virus interaction;
KW   Lipoprotein; Metal-binding; Myristate; Phosphoprotein; Repeat;
KW   Ribosomal frameshifting; Ubl conjugation; Viral nucleoprotein; Virion;
KW   Zinc; Zinc-finger.
FT   INIT_MET        1
FT                   /note="Removed; by host"
FT                   /evidence="ECO:0000255"
FT   CHAIN           2..429
FT                   /note="Gag polyprotein"
FT                   /id="PRO_0000259773"
FT   CHAIN           2..130
FT                   /note="Matrix protein p19"
FT                   /id="PRO_0000038817"
FT   CHAIN           131..344
FT                   /note="Capsid protein p24"
FT                   /id="PRO_0000038818"
FT   CHAIN           345..429
FT                   /note="Nucleocapsid protein p15-gag"
FT                   /id="PRO_0000038819"
FT   ZN_FING         355..372
FT                   /note="CCHC-type 1"
FT                   /evidence="ECO:0000255|PROSITE-ProRule:PRU00047"
FT   ZN_FING         378..395
FT                   /note="CCHC-type 2"
FT                   /evidence="ECO:0000255|PROSITE-ProRule:PRU00047"
FT   REGION          93..143
FT                   /note="Disordered"
FT                   /evidence="ECO:0000256|SAM:MobiDB-lite"
FT   MOTIF           118..121
FT                   /note="PPXY motif"
FT                   /evidence="ECO:0000250|UniProtKB:P03345"
FT   MOTIF           124..127
FT                   /note="PTAP/PSAP motif"
FT                   /evidence="ECO:0000250|UniProtKB:P03345"
FT   COMPBIAS        95..126
FT                   /note="Pro residues"
FT                   /evidence="ECO:0000256|SAM:MobiDB-lite"
FT   SITE            130..131
FT                   /note="Cleavage; by viral protease"
FT                   /evidence="ECO:0000250|UniProtKB:P03345"
FT   SITE            344..345
FT                   /note="Cleavage; by viral protease"
FT                   /evidence="ECO:0000250|UniProtKB:P03345"
FT   MOD_RES         105
FT                   /note="Phosphoserine; by host MAPK1"
FT                   /evidence="ECO:0000250|UniProtKB:P03345"
FT   LIPID           2
FT                   /note="N-myristoyl glycine; by host"
FT                   /evidence="ECO:0000255, ECO:0000269|PubMed:2547372"
FT   DISULFID        61
FT                   /note="Interchain"
FT                   /evidence="ECO:0000250|UniProtKB:P03345"
FT   STRAND          136..138
FT                   /evidence="ECO:0007829|PDB:1G03"
FT   HELIX           148..159
FT                   /evidence="ECO:0007829|PDB:1G03"
FT   STRAND          160..163
FT                   /evidence="ECO:0007829|PDB:1G03"
FT   HELIX           166..177
FT                   /evidence="ECO:0007829|PDB:1G03"
FT   HELIX           182..190
FT                   /evidence="ECO:0007829|PDB:1G03"
FT   HELIX           195..216
FT                   /evidence="ECO:0007829|PDB:1G03"
FT   STRAND          217..219
FT                   /evidence="ECO:0007829|PDB:1G03"
FT   HELIX           227..231
FT                   /evidence="ECO:0007829|PDB:1G03"
FT   HELIX           238..252
FT                   /evidence="ECO:0007829|PDB:1G03"
FT   TURN            254..257
FT                   /evidence="ECO:0007829|PDB:1G03"
FT   TURN            261..263
FT                   /evidence="ECO:0007829|PDB:1G03"
SQ   SEQUENCE   429 AA;  47585 MW;  EF5201C934EF0291 CRC64;
     MGQIFSRSAS PIPRPPRGLA AHHWLNFLQA AYRLEPGPSS YDFHQLKKFL KIALETPVWI
     CPINYSLLAS LLPKGYPGRV NEILHILIQT QAQIPSRPAP PPPSSPTHDP PDSDPQIPPP
     YVEPTAPQVL PVMHPHGAPP NHRPWQMKDL QAIKQEVSQA APGSPQFMQT IRLAVQQFDP
     TAKDLQDLLQ YLCSSLVASL HHQQLDSLIS EAETRGITSY NPLAGPLRVQ ANNPQQQGLR
     REYQQLWLAA FAALPGSAKD PSWASILQGL EEPYHAFVER LNIALDNGLP EGTPKDPILR
     SLAYSNANKE CQKLLQARGH TNSPLGDMLR ACQTWTPKDK TKVLVVQPKK PPPNQPCFRC
     GKAGHWSRDC TQPRPPPGPC PLCQDPTHWK RDCPRLKPTI PEPEPEEDAL LLDLPADIPH
     PKNSIGGEV