ENV_SIVVT

ID   ENV_SIVVT               Reviewed;         865 AA.
AC   P05886;
DT   01-NOV-1988, integrated into UniProtKB/Swiss-Prot.
DT   01-NOV-1988, sequence version 1.
DT   25-MAY-2022, entry version 127.
DE   RecName: Full=Envelope glycoprotein gp160;
DE   AltName: Full=Env polyprotein;
DE   Contains:
DE     RecName: Full=Surface protein gp120;
DE              Short=SU;
DE     AltName: Full=Glycoprotein 120;
DE              Short=gp120;
DE   Contains:
DE     RecName: Full=Transmembrane protein gp41;
DE              Short=TM;
DE     AltName: Full=Glycoprotein 32;
DE              Short=gp32;
DE   Flags: Precursor;
GN   Name=env;
OS   Simian immunodeficiency virus agm.vervet (isolate AGM TYO-1) (SIV-agm.ver)
OS   (Simian immunodeficiency virus African green monkey vervet).
OC   Viruses; Riboviria; Pararnavirae; Artverviricota; Revtraviricetes;
OC   Ortervirales; Retroviridae; Orthoretrovirinae; Lentivirus.
OX   NCBI_TaxID=11731;
OH   NCBI_TaxID=9527; Cercopithecidae (Old World monkeys).
RN   [1]
RP   NUCLEOTIDE SEQUENCE [GENOMIC DNA].
RX   PubMed=3374586; DOI=10.1038/333457a0;
RA   Fukasawa M., Miura T., Hasegawa A., Morikawa S., Tsujimoto H., Miki K.,
RA   Kitamura T., Hayami M.;
RT   "Sequence of simian immunodeficiency virus from African green monkey, a new
RT   member of the HIV/SIV group.";
RL   Nature 333:457-461(1988).
CC   -!- FUNCTION: The surface protein gp120 (SU) attaches the virus to the host
CC       lymphoid cell by binding to the primary receptor CD4. This interaction
CC       induces a structural rearrangement creating a high affinity binding
CC       site for a chemokine coreceptor such as CCR5. This peculiar two-stage
CC       receptor-interaction strategy allows gp120 to maintain the highly
CC       conserved coreceptor-binding site in a cryptic conformation, protected
CC       from neutralizing antibodies. These changes are transmitted to the
CC       transmembrane protein gp41 and are thought to activate its fusogenic
CC       potential by unmasking its fusion peptide (By similarity).
CC       {ECO:0000250}.
CC   -!- FUNCTION: Surface protein gp120 (SU) may target the virus to gut-
CC       associated lymphoid tissue (GALT) by binding host ITGA4/ITGB7 (alpha-
CC       4/beta-7 integrins), a complex that mediates T-cell migration to the
CC       GALT. Interaction between gp120 and ITGA4/ITGB7 would allow the virus
CC       to enter GALT early in the infection, infecting and killing most of
CC       GALT's resting CD4+ T-cells. This T-cell depletion is believed to be
CC       the major insult to the host immune system leading to AIDS (By
CC       similarity). {ECO:0000250}.
CC   -!- FUNCTION: The surface protein gp120 is a ligand for CD209/DC-SIGN and
CC       CLEC4M/DC-SIGNR, which are respectively found on dendritic cells (DCs),
CC       and on endothelial cells of liver sinusoids and lymph node sinuses.
CC       These interactions allow capture of viral particles at mucosal surfaces
CC       by these cells and subsequent transmission to permissive cells. DCs are
CC       professional antigen presenting cells, critical for host immunity by
CC       inducing specific immune responses against a broad variety of
CC       pathogens. They act as sentinels in various tissues where they take up
CC       antigen, process it, and present it to T-cells following migration to
CC       lymphoid organs. SIV subverts the migration properties of dendritic
CC       cells to gain access to CD4+ T-cells in lymph nodes. Virus transmission
CC       to permissive T-cells occurs either in trans (without DC infection,
CC       through viral capture and transmission), or in cis (following
CC       productive DC infection, through the usual CD4-gp120 interaction),
CC       thereby inducing a robust infection. In trans infection, bound virions
CC       remain infectious for days; it is proposed that they are not degraded
CC       but are protected in non-lysosomal acidic organelles within the DCs,
CC       close to the cell membrane, thus contributing to the viral infectious
CC       potential during DC migration from the periphery to the lymphoid
CC       tissues. On arrival at lymphoid tissues, intact virions recycle back to
CC       the DC surface, allowing virus transmission to CD4+ T-cells. Virion
CC       capture also seems to lead to MHC-II-restricted viral antigen
CC       presentation, and probably to the activation of SIV-specific CD4+ cells
CC       (By similarity). {ECO:0000250}.
CC   -!- FUNCTION: The transmembrane protein gp41 (TM) acts as a class I viral
CC       fusion protein. Under the current model, the protein has at least 3
CC       conformational states: pre-fusion native state, pre-hairpin
CC       intermediate state, and post-fusion hairpin state. During fusion of
CC       viral and target intracellular membranes, the coiled coil regions
CC       (heptad repeats) assume a trimer-of-hairpins structure, positioning the
CC       fusion peptide in close proximity to the C-terminal region of the
CC       ectodomain. The formation of this structure appears to drive apposition
CC       and subsequent fusion of viral and target cell membranes. Complete
CC       fusion occurs in host cell endosomes. The virus undergoes clathrin-
CC       dependent internalization long before endosomal fusion, thus minimizing
CC       the surface exposure of conserved viral epitopes during fusion and
CC       reducing the efficacy of inhibitors targeting these epitopes. Membrane
CC       fusion leads to delivery of the nucleocapsid into the cytoplasm (By
CC       similarity). {ECO:0000250}.
CC   -!- FUNCTION: The envelope glycoprotein gp160 precursor down-modulates cell
CC       surface CD4 antigen by interacting with it in the endoplasmic reticulum
CC       and blocking its transport to the cell surface. {ECO:0000250}.
CC   -!- FUNCTION: The gp120-gp41 heterodimer allows rapid transcytosis of the
CC       virus through CD4-negative cells such as simple epithelial monolayers
CC       of the intestinal, rectal and endocervical epithelial barriers. Both
CC       gp120 and gp41 specifically recognize glycosphingolipids galactosyl-
CC       ceramide (GalCer) or 3' sulfo-galactosyl-ceramide (GalS) present in the
CC       lipid raft structures of epithelial cells. Binding to these
CC       alternative receptors allows the rapid transcytosis of the virus
CC       through the epithelial cells. This transcytotic vesicle-mediated
CC       transport of virions from the apical side to the basolateral side of
CC       the epithelial cells does not involve infection of the cells themselves
CC       (By similarity). {ECO:0000250}.
CC   -!- SUBUNIT: [Surface protein gp120]: The mature envelope protein (Env)
CC       consists of a homotrimer of non-covalently associated gp120-gp41
CC       heterodimers. The resulting complex protrudes from the virus surface as
CC       a spike. Interacts with host CD4 and CCR5 (By similarity). Gp120 also
CC       interacts with the C-type lectins CD209/DC-SIGN and CLEC4M/DC-SIGNR
CC       (collectively referred to as DC-SIGN(R)). {ECO:0000250}.
CC   -!- SUBUNIT: [Transmembrane protein gp41]: The mature envelope protein
CC       (Env) consists of a homotrimer of non-covalently associated gp120-gp41
CC       heterodimers. The resulting complex protrudes from the virus surface as
CC       a spike. {ECO:0000250}.
CC   -!- SUBCELLULAR LOCATION: [Transmembrane protein gp41]: Virion membrane
CC       {ECO:0000250}; Single-pass type I membrane protein {ECO:0000250}. Host
CC       cell membrane {ECO:0000250}; Single-pass type I membrane protein
CC       {ECO:0000250}. Host endosome membrane {ECO:0000305}; Single-pass type I
CC       membrane protein {ECO:0000305}. Note=It is probably concentrated at the
CC       site of budding and incorporated into the virions possibly by contacts
CC       between the cytoplasmic tail of Env and the N-terminus of Gag.
CC       {ECO:0000250}.
CC   -!- SUBCELLULAR LOCATION: [Surface protein gp120]: Virion membrane
CC       {ECO:0000250}; Peripheral membrane protein {ECO:0000250}. Host cell
CC       membrane {ECO:0000250}; Peripheral membrane protein {ECO:0000250}. Host
CC       endosome membrane {ECO:0000305}; Peripheral membrane protein
CC       {ECO:0000305}. Note=The surface protein is not anchored to the viral
CC       envelope, but associates with the extravirion surface through its
CC       binding to TM. It is probably concentrated at the site of budding and
CC       incorporated into the virions possibly by contacts between the
CC       cytoplasmic tail of Env and the N-terminus of Gag (By similarity).
CC       {ECO:0000250}.
CC   -!- DOMAIN: Some of the most genetically diverse regions of the viral
CC       genome are present in Env. They are called variable regions 1 through 5
CC       (V1 through V5) (By similarity). {ECO:0000250}.
CC   -!- DOMAIN: The YXXL motif is involved in determining the exact site of
CC       viral release at the surface of infected mononuclear cells and promotes
CC       endocytosis. {ECO:0000250}.
CC   -!- DOMAIN: The 17-amino-acid immunosuppressive region is present in
CC       many retroviral envelope proteins. Synthetic peptides derived from this
CC       relatively conserved sequence inhibit immune function in vitro and in
CC       vivo (By similarity). {ECO:0000250}.
CC   -!- PTM: Specific enzymatic cleavages in vivo yield mature proteins.
CC       Envelope glycoproteins are synthesized as an inactive precursor that is
CC       heavily N-glycosylated and likely processed by host cell furin in the
CC       Golgi to yield the mature SU and TM proteins. The cleavage site between
CC       SU and TM requires the minimal sequence [KR]-X-[KR]-R (By similarity).
CC       {ECO:0000250}.
CC   -!- MISCELLANEOUS: This is an African green monkey isolate.
CC   ---------------------------------------------------------------------------
CC   Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC   Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC   ---------------------------------------------------------------------------
DR   EMBL; X07805; CAA30663.2; -; Genomic_DNA.
DR   PIR; G30045; VCLJG4.
DR   GO; GO:0044175; C:host cell endosome membrane; IEA:UniProtKB-SubCell.
DR   GO; GO:0020002; C:host cell plasma membrane; IEA:UniProtKB-SubCell.
DR   GO; GO:0016021; C:integral component of membrane; IEA:UniProtKB-KW.
DR   GO; GO:0019031; C:viral envelope; IEA:UniProtKB-KW.
DR   GO; GO:0055036; C:virion membrane; IEA:UniProtKB-SubCell.
DR   GO; GO:0005198; F:structural molecule activity; IEA:InterPro.
DR   GO; GO:0039663; P:membrane fusion involved in viral entry into host cell; IEA:UniProtKB-KW.
DR   GO; GO:0046718; P:viral entry into host cell; IEA:UniProtKB-KW.
DR   GO; GO:0019062; P:virion attachment to host cell; IEA:UniProtKB-KW.
DR   CDD; cd09909; HIV-1-like_HR1-HR2; 1.
DR   Gene3D; 2.170.40.20; -; 2.
DR   InterPro; IPR036377; Gp120_core_sf.
DR   InterPro; IPR000328; GP41-like.
DR   InterPro; IPR000777; HIV1_Gp120.
DR   Pfam; PF00516; GP120; 1.
DR   Pfam; PF00517; GP41; 1.
DR   SUPFAM; SSF56502; SSF56502; 1.
PE   3: Inferred from homology;
KW   Apoptosis; Cleavage on pair of basic residues; Coiled coil; Disulfide bond;
KW   Fusion of virus membrane with host membrane; Glycoprotein;
KW   Host cell membrane; Host endosome; Host membrane; Host-virus interaction;
KW   Membrane; Signal; Transmembrane; Transmembrane helix;
KW   Viral attachment to host cell; Viral envelope protein;
KW   Viral penetration into host cytoplasm; Virion; Virus entry into host cell.
FT   SIGNAL          1..20
FT                   /evidence="ECO:0000255"
FT   CHAIN           21..865
FT                   /note="Envelope glycoprotein gp160"
FT                   /id="PRO_0000239508"
FT   CHAIN           21..536
FT                   /note="Surface protein gp120"
FT                   /evidence="ECO:0000250"
FT                   /id="PRO_0000038462"
FT   CHAIN           537..865
FT                   /note="Transmembrane protein gp41"
FT                   /evidence="ECO:0000250"
FT                   /id="PRO_0000038463"
FT   TOPO_DOM        21..705
FT                   /note="Extracellular"
FT                   /evidence="ECO:0000255"
FT   TRANSMEM        706..726
FT                   /note="Helical"
FT                   /evidence="ECO:0000255"
FT   TOPO_DOM        727..865
FT                   /note="Cytoplasmic"
FT                   /evidence="ECO:0000255"
FT   REGION          113..165
FT                   /note="V1"
FT   REGION          166..209
FT                   /note="V2"
FT   REGION          311..343
FT                   /note="V3"
FT   REGION          403..444
FT                   /note="V4"
FT   REGION          487..494
FT                   /note="V5"
FT   REGION          537..557
FT                   /note="Fusion peptide"
FT                   /evidence="ECO:0000255"
FT   REGION          600..616
FT                   /note="Immunosuppression"
FT                   /evidence="ECO:0000250"
FT   REGION          682..703
FT                   /note="MPER; binding to GalCer"
FT                   /evidence="ECO:0000250"
FT   REGION          744..763
FT                   /note="Disordered"
FT                   /evidence="ECO:0000256|SAM:MobiDB-lite"
FT   COILED          650..675
FT                   /evidence="ECO:0000255"
FT   MOTIF           732..735
FT                   /note="YXXL motif; contains endocytosis signal"
FT                   /evidence="ECO:0000250"
FT   COMPBIAS        746..763
FT                   /note="Basic and acidic residues"
FT                   /evidence="ECO:0000256|SAM:MobiDB-lite"
FT   SITE            536..537
FT                   /note="Cleavage; by host furin"
FT                   /evidence="ECO:0000255"
FT   SITE            770
FT                   /note="In-frame UAG termination codon"
FT   CARBOHYD        35
FT                   /note="N-linked (GlcNAc...) asparagine; by host"
FT                   /evidence="ECO:0000255"
FT   CARBOHYD        68
FT                   /note="N-linked (GlcNAc...) asparagine; by host"
FT                   /evidence="ECO:0000255"
FT   CARBOHYD        117
FT                   /note="N-linked (GlcNAc...) asparagine; by host"
FT                   /evidence="ECO:0000255"
FT   CARBOHYD        150
FT                   /note="N-linked (GlcNAc...) asparagine; by host"
FT                   /evidence="ECO:0000255"
FT   CARBOHYD        165
FT                   /note="N-linked (GlcNAc...) asparagine; by host"
FT                   /evidence="ECO:0000255"
FT   CARBOHYD        195
FT                   /note="N-linked (GlcNAc...) asparagine; by host"
FT                   /evidence="ECO:0000255"
FT   CARBOHYD        198
FT                   /note="N-linked (GlcNAc...) asparagine; by host"
FT                   /evidence="ECO:0000255"
FT   CARBOHYD        210
FT                   /note="N-linked (GlcNAc...) asparagine; by host"
FT                   /evidence="ECO:0000255"
FT   CARBOHYD        252
FT                   /note="N-linked (GlcNAc...) asparagine; by host"
FT                   /evidence="ECO:0000255"
FT   CARBOHYD        255
FT                   /note="N-linked (GlcNAc...) asparagine; by host"
FT                   /evidence="ECO:0000255"
FT   CARBOHYD        266
FT                   /note="N-linked (GlcNAc...) asparagine; by host"
FT                   /evidence="ECO:0000255"
FT   CARBOHYD        276
FT                   /note="N-linked (GlcNAc...) asparagine; by host"
FT                   /evidence="ECO:0000255"
FT   CARBOHYD        282
FT                   /note="N-linked (GlcNAc...) asparagine; by host"
FT                   /evidence="ECO:0000255"
FT   CARBOHYD        294
FT                   /note="N-linked (GlcNAc...) asparagine; by host"
FT                   /evidence="ECO:0000255"
FT   CARBOHYD        306
FT                   /note="N-linked (GlcNAc...) asparagine; by host"
FT                   /evidence="ECO:0000255"
FT   CARBOHYD        316
FT                   /note="N-linked (GlcNAc...) asparagine; by host"
FT                   /evidence="ECO:0000255"
FT   CARBOHYD        373
FT                   /note="N-linked (GlcNAc...) asparagine; by host"
FT                   /evidence="ECO:0000255"
FT   CARBOHYD        414
FT                   /note="N-linked (GlcNAc...) asparagine; by host"
FT                   /evidence="ECO:0000255"
FT   CARBOHYD        451
FT                   /note="N-linked (GlcNAc...) asparagine; by host"
FT                   /evidence="ECO:0000255"
FT   CARBOHYD        488
FT                   /note="N-linked (GlcNAc...) asparagine; by host"
FT                   /evidence="ECO:0000255"
FT   CARBOHYD        491
FT                   /note="N-linked (GlcNAc...) asparagine; by host"
FT                   /evidence="ECO:0000255"
FT   CARBOHYD        645
FT                   /note="N-linked (GlcNAc...) asparagine; by host"
FT                   /evidence="ECO:0000255"
FT   CARBOHYD        661
FT                   /note="N-linked (GlcNAc...) asparagine; by host"
FT                   /evidence="ECO:0000255"
FT   DISULFID        42..55
FT                   /evidence="ECO:0000250"
FT   DISULFID        101..218
FT                   /evidence="ECO:0000250"
FT   DISULFID        108..209
FT                   /evidence="ECO:0000250"
FT   DISULFID        113..166
FT                   /evidence="ECO:0000250"
FT   DISULFID        231..261
FT                   /evidence="ECO:0000250"
FT   DISULFID        241..253
FT                   /evidence="ECO:0000250"
FT   DISULFID        311..344
FT                   /evidence="ECO:0000250"
FT   DISULFID        396..471
FT                   /evidence="ECO:0000250"
SQ   SEQUENCE   865 AA;  99026 MW;  6CEF0F09001D6D95 CRC64;
     MRYTIITLGI IVIGIGIVLS KQWITVFYGI PVWKNSSVQA FCMTPTTSLW ATTNCIPDDH
     DYTEVPLNIT EPFEAWGDRN PLIAQAASNI HLLFEQTMKP CVKLSPLCIK MNCVELNSTR
     ERATTPTTTP KSTGLPCVGP TSGENLQSCN ASIIEREMED EPASNCTFAM AGYVRDQKKN
     YYSVVWNDAE IYCKNKTNST SKECYMIHCN DSVIKEACDK TYWDQLRLRY CAPAGYALLK
     CNDEDYNGYK QNCSNVSVVH CTGLMNTTVT TGLLLNGSYH ENRTQIWQKH RVNNNTVLIL
     FNKHYNLSVT CRRPGNKTVL PVTIMAGLVF HSQKYNMKLR QAWCHFEGNW RGAWREVKQK
     IVELPKDRYK GTNNTEHIYL QRQWGDPEAS NLWFNCQGEF FYCKMDWFLN YLNNKTWDAY
     HNFCSSKKKG HAPGPCVQRT YVAYHIRSVI NDSYTLSKKT YAPPREGHLQ CRSTVTGMTV
     ELNYNSKNRT NVTLSPQIES IWAAELGRYK LVEITPIGFA PTEVRRYTGG HERQKRVPFV
     LGFLGFLGAA GTAMGAAASS LTVQSRHLLA GILQQQKNLL AAVEAQQQML KLTIWGVKNL
     NARVTALEKY LEDQARLNSW GCAWKQVCHT TVEWPWTNRT PDWQNMTWLE WERQIADLES
     NITGQLVKAR EQEEKNLDAY QKLTSWSDFW SWFDFSKWLN ILKMGFLVIV GIIGLRLLYT
     VYGCIVRVRQ GYVPLSPQIH IHQVGKGRPD NADEPGEGGD NSRIKLESWX KDSKSRCMQL
     TAWLTRLNTW LYNSCLTLLI QLRKAFQYLQ YGLAELKTGA QEILQTLAGV AQNACHQIWL
     ACRSAYRNIV NSPRRVRQGL EEILN
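
The FT coordinates in this entry are 1-based and inclusive, so the SITE 536..537 cleavage annotation and the [KR]-X-[KR]-R consensus quoted in the PTM comment can be cross-checked against the SQ block. The following is a minimal Python sketch under stated assumptions, not part of the UniProt record: the helper names parse_sequence, feature_slice and motif_ends are hypothetical, the parser only handles the SQ block of a single record, and the 20-residue excerpt is residues 521..540 copied from the sequence above.

```python
import re

# Minimal cleavage consensus from the PTM comment: [KR]-X-[KR]-R.
# The lookahead lets overlapping candidate sites be reported as well.
FURIN_MOTIF = re.compile(r"(?=([KR].[KR]R))")


def parse_sequence(flat_text):
    """Join the residue lines that follow the SQ header of one flat-file record."""
    residues = []
    in_seq = False
    for line in flat_text.splitlines():
        if line.startswith("SQ   "):
            in_seq = True
        elif line.startswith("//"):
            break  # record terminator, if present
        elif in_seq:
            residues.append(line.replace(" ", ""))
    return "".join(residues)


def feature_slice(seq, start, end):
    """Return residues start..end using the 1-based inclusive FT coordinates."""
    return seq[start - 1:end]


def motif_ends(seq):
    """1-based position of the last residue (R) of every consensus match."""
    return [m.start(1) + 4 for m in FURIN_MOTIF.finditer(seq)]


# Residues 521..540 of this entry (assumption: copied by hand from the SQ block).
excerpt = "PTEVRRYTGGHERQKRVPFV"
offset = 520  # number of residues preceding the excerpt

print([p + offset for p in motif_ends(excerpt)])            # [536]
print(feature_slice(excerpt, 536 - offset, 537 - offset))   # RV
```

In the excerpt the only consensus match is RQKR at 533..536, so the predicted cut falls between Arg-536 and Val-537, in agreement with the SITE 536..537 line and with the gp41 chain starting at 537. To run the same check on the whole entry, pass the record text to parse_sequence and drop the offset.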