CSP_PLACB

ID   CSP_PLACB               Reviewed;         378 AA.
AC   P08672;
DT   01-JAN-1988, integrated into UniProtKB/Swiss-Prot.
DT   01-JAN-1988, sequence version 1.
DT   03-AUG-2022, entry version 65.
DE   RecName: Full=Circumsporozoite protein {ECO:0000303|PubMed:3802196};
DE            Short=CS {ECO:0000303|PubMed:3802196};
DE   Contains:
DE     RecName: Full=Circumsporozoite protein C-terminus {ECO:0000305};
DE   Flags: Precursor;
GN   Name=CSP {ECO:0000250|UniProtKB:P23093};
OS   Plasmodium cynomolgi (strain Berok).
OC   Eukaryota; Sar; Alveolata; Apicomplexa; Aconoidasida; Haemosporida;
OC   Plasmodiidae; Plasmodium; Plasmodium (Plasmodium).
OX   NCBI_TaxID=5828;
RN   [1]
RP   NUCLEOTIDE SEQUENCE [GENOMIC DNA], POLYMORPHISM, AND REPEATS.
RX   PubMed=3802196; DOI=10.1016/0092-8674(87)90434-x;
RA   Galinski M.R., Arnot D.E., Cochrane A.H., Barnwell J.W., Nussenzweig R.S.,
RA   Enea V.;
RT   "The circumsporozoite gene of the Plasmodium cynomolgi complex.";
RL   Cell 48:311-319(1987).
CC   -!- FUNCTION: Essential sporozoite protein (By similarity). In the mosquito
CC       vector, required for sporozoite development in the oocyst, migration
CC       through the vector hemolymph and entry into the vector salivary glands
CC       (By similarity). In the vertebrate host, required for sporozoite
CC       migration through the host dermis and infection of host hepatocytes (By
CC       similarity). Binds to highly sulfated heparan sulfate proteoglycans
CC       (HSPGs) on the surface of host hepatocytes (By similarity).
CC       {ECO:0000250|UniProtKB:P02893, ECO:0000250|UniProtKB:P23093}.
CC   -!- FUNCTION: [Circumsporozoite protein C-terminus]: In the vertebrate
CC       host, binds to highly sulfated heparan sulfate proteoglycans (HSPGs) on
CC       the surface of host hepatocytes and is required for sporozoite invasion
CC       of the host hepatocytes. {ECO:0000250|UniProtKB:P23093}.
CC   -!- SUBCELLULAR LOCATION: Cell membrane {ECO:0000250|UniProtKB:P19597};
CC       Lipid-anchor, GPI-anchor {ECO:0000255}. Cytoplasm
CC       {ECO:0000250|UniProtKB:P23093}. Note=Localizes to the cytoplasm and the
CC       cell membrane in oocysts at day 6 post infection and then gradually
CC       distributes over the entire cell surface of the sporoblast and the
CC       budding sporozoites. {ECO:0000250|UniProtKB:P23093}.
CC   -!- DOMAIN: The N-terminus is involved in the initial binding to heparan
CC       sulfate proteoglycans (HSPGs) on the surface of host hepatocytes (By
CC       similarity). The N-terminus masks the TSP type-1 (TSR) domain which
CC       maintains the sporozoites in a migratory state, enabling them to
CC       complete their journey to the salivary gland in the mosquito vector and
CC       then to the host liver. The unmasking of the TSP type-1 (TSR) domain
CC       when the sporozoite interacts with the host hepatocyte also protects
CC       sporozoites from host antibodies (By similarity).
CC       {ECO:0000250|UniProtKB:P23093, ECO:0000250|UniProtKB:Q7K740}.
CC   -!- DOMAIN: The TSP type-1 (TSR) domain is required for sporozoite
CC       development and invasion. CSP has two conformational states: an
CC       adhesive conformation in which the TSP type-1 (TSR) domain is exposed
CC       and a nonadhesive conformation in which the TSR domain is masked by
CC       the N-terminus. The TSR-exposed conformation occurs during sporozoite
CC       development in the oocyst in the mosquito vector and during host
CC       hepatocyte invasion. The TSR-masked conformation occurs during
CC       sporozoite migration through the hemolymph to the salivary glands in
CC       the mosquito vector and in the host dermis.
CC       {ECO:0000250|UniProtKB:P23093}.
CC   -!- DOMAIN: The GPI-anchor is essential for cell membrane localization and
CC       for sporozoite formation inside the oocyst.
CC       {ECO:0000250|UniProtKB:P23093}.
CC   -!- PTM: During host cell invasion, proteolytically cleaved at the cell
CC       membrane in region I by a papain-like cysteine protease of parasite
CC       origin (By similarity). Cleavage is triggered by sporozoite contact
CC       with highly sulfated heparan sulfate proteoglycans (HSPGs) present on
CC       the host hepatocyte cell surface (By similarity). Cleavage exposes the
CC       TSP type-1 (TSR) domain and is required for productive invasion of host
CC       hepatocytes but not for adhesion to the host cell membrane (By
CC       similarity). Cleavage is dispensable for sporozoite development in the
CC       oocyst, for motility and for traversal of host and vector cells (By
CC       similarity). {ECO:0000250|UniProtKB:P02893,
CC       ECO:0000250|UniProtKB:P23093}.
CC   -!- PTM: O-glycosylated; possibly by POFUT2.
CC       {ECO:0000250|UniProtKB:P19597}.
CC   -!- POLYMORPHISM: The sequence of the repeats varies across Plasmodium
CC       species and strains. {ECO:0000269|PubMed:3802196}.
CC   -!- SIMILARITY: Belongs to the plasmodium circumsporozoite protein family.
CC       {ECO:0000305}.
CC   ---------------------------------------------------------------------------
CC   Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC   Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC   ---------------------------------------------------------------------------
DR   EMBL; M15104; AAA29532.1; -; Genomic_DNA.
DR   PIR; D26255; OZZQAB.
DR   AlphaFoldDB; P08672; -.
DR   SMR; P08672; -.
DR   GO; GO:0031225; C:anchored component of membrane; IEA:UniProtKB-KW.
DR   GO; GO:0009986; C:cell surface; IEA:InterPro.
DR   GO; GO:0005737; C:cytoplasm; IEA:UniProtKB-SubCell.
DR   GO; GO:0005886; C:plasma membrane; IEA:UniProtKB-SubCell.
DR   Gene3D; 2.20.100.10; -; 1.
DR   InterPro; IPR003067; Crcmsprzoite.
DR   InterPro; IPR000884; TSP1_rpt.
DR   InterPro; IPR036383; TSP1_rpt_sf.
DR   Pfam; PF00090; TSP_1; 1.
DR   PRINTS; PR01303; CRCMSPRZOITE.
DR   SMART; SM00209; TSP1; 1.
DR   SUPFAM; SSF82895; SSF82895; 1.
DR   PROSITE; PS50092; TSP1; 1.
PE   3: Inferred from homology;
KW   Cell membrane; Cytoplasm; Disulfide bond; Glycoprotein; GPI-anchor;
KW   Lipoprotein; Malaria; Membrane; Repeat; Signal; Sporozoite.
FT   SIGNAL          1..22
FT                   /evidence="ECO:0000255"
FT   CHAIN           23..355
FT                   /note="Circumsporozoite protein"
FT                   /evidence="ECO:0000255"
FT                   /id="PRO_0000024521"
FT   CHAIN           ?..355
FT                   /note="Circumsporozoite protein C-terminus"
FT                   /evidence="ECO:0000250|UniProtKB:P23093"
FT                   /id="PRO_0000455474"
FT   PROPEP          356..378
FT                   /note="Removed in mature form"
FT                   /evidence="ECO:0000255"
FT                   /id="PRO_0000455475"
FT   REPEAT          97..102
FT                   /note="1-1; truncated"
FT                   /evidence="ECO:0000305|PubMed:3802196"
FT   REPEAT          103..111
FT                   /note="1-2"
FT                   /evidence="ECO:0000305|PubMed:3802196"
FT   REPEAT          112..120
FT                   /note="1-3"
FT                   /evidence="ECO:0000305|PubMed:3802196"
FT   REPEAT          121..129
FT                   /note="1-4"
FT                   /evidence="ECO:0000305|PubMed:3802196"
FT   REPEAT          130..138
FT                   /note="1-5"
FT                   /evidence="ECO:0000305|PubMed:3802196"
FT   REPEAT          139..147
FT                   /note="1-6"
FT                   /evidence="ECO:0000305|PubMed:3802196"
FT   REPEAT          148..156
FT                   /note="1-7"
FT                   /evidence="ECO:0000305|PubMed:3802196"
FT   REPEAT          157..165
FT                   /note="1-8"
FT                   /evidence="ECO:0000305|PubMed:3802196"
FT   REPEAT          166..174
FT                   /note="1-9"
FT                   /evidence="ECO:0000305|PubMed:3802196"
FT   REPEAT          175..183
FT                   /note="1-10"
FT                   /evidence="ECO:0000305|PubMed:3802196"
FT   REPEAT          184..191
FT                   /note="1-11"
FT                   /evidence="ECO:0000305|PubMed:3802196"
FT   REPEAT          193..208
FT                   /note="2-1"
FT                   /evidence="ECO:0000305|PubMed:3802196"
FT   REPEAT          209..224
FT                   /note="2-2"
FT                   /evidence="ECO:0000305|PubMed:3802196"
FT   REPEAT          225..240
FT                   /note="2-3"
FT                   /evidence="ECO:0000305|PubMed:3802196"
FT   REPEAT          241..251
FT                   /note="2-4; approximate; truncated"
FT                   /evidence="ECO:0000305|PubMed:3802196"
FT   REPEAT          252..260
FT                   /note="2-5; approximate; truncated"
FT                   /evidence="ECO:0000305|PubMed:3802196"
FT   REPEAT          261..268
FT                   /note="2-6; approximate; truncated"
FT                   /evidence="ECO:0000305|PubMed:3802196"
FT   DOMAIN          304..356
FT                   /note="TSP type-1"
FT                   /evidence="ECO:0000255|PROSITE-ProRule:PRU00210"
FT   REGION          50..288
FT                   /note="Disordered"
FT                   /evidence="ECO:0000256|SAM:MobiDB-lite"
FT   REGION          81..89
FT                   /note="Required for the binding to heparan sulfate
FT                   proteoglycans (HSPGs) on the surface of host hepatocytes"
FT                   /evidence="ECO:0000250|UniProtKB:Q7K740"
FT   REGION          92..96
FT                   /note="Region I; contains the proteolytic cleavage site"
FT                   /evidence="ECO:0000250|UniProtKB:P23093"
FT   REGION          97..191
FT                   /note="11 X 9 AA tandem repeats of P-[AE]-G-D-G-A-P-A-[AG]"
FT                   /evidence="ECO:0000305|PubMed:3802196"
FT   REGION          193..268
FT                   /note="6 X 16 AA approximate tandem repeats of N-R-A-G-G-Q-
FT                   P-A-A-G-G-N-Q-A-G-G"
FT                   /evidence="ECO:0000305|PubMed:3802196"
FT   COMPBIAS        62..94
FT                   /note="Basic and acidic residues"
FT                   /evidence="ECO:0000256|SAM:MobiDB-lite"
FT   LIPID           355
FT                   /note="GPI-anchor amidated cysteine"
FT                   /evidence="ECO:0000255"
FT   CARBOHYD        319
FT                   /note="O-linked (Fuc) threonine"
FT                   /evidence="ECO:0000250|UniProtKB:P19597"
FT   DISULFID        316..350
FT                   /evidence="ECO:0000250|UniProtKB:Q7K740"
FT   DISULFID        320..355
FT                   /evidence="ECO:0000250|UniProtKB:Q7K740"
SQ   SEQUENCE   378 AA;  36286 MW;  779BA081C140793F CRC64;
     MKNFNLLVVS SILLVDLFPT NCGHNVHFSR AINLNGVSFN NVDASSLGAA QVRQSASRGR
     GLGENPKDEE GADKPKKKEE KKVEPKKPRE NKLKQPPAGD GAPEGDGAPA APAGDGAPAA
     PAGDGAPAAP AGDGAPAAPA GDGAPAAPAG DGAPAAPAGD GAPAAPAGDG APAAPAGDGA
     PAAPAGDGAP AGNRAGGQPA AGGNQAGGNR AGGQPAAGGN QAGGNRAGGQ PAAGGNQAGG
     QPAAGGNQAG AQAGGNQAGA QAGGANAGNK KAGEAGGNAG AGQGQNNEAA NVPNAKLVKE
     YLDKIRSTLG VEWSPCSVTC GKGVRMRRKV SAANKKPEEL DVNDLETEVC TMDKCAGIFN
     VVSNSLRLVI LLVLALFN
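
The FT coordinates above are 1-based and inclusive, so they can be checked directly against the SQ block. The following is a minimal Python sketch of such a check, not part of the UniProt record. The helper names `feature` and `check_entry` are illustrative assumptions. The repeat patterns are taken from the two REGION notes, and only the full-length repeat units (1-2 to 1-10 and 2-1 to 2-3) are tested, since the truncated and approximate units are not expected to match exactly.

```python
import re

# SQ block copied verbatim from the entry; whitespace is stripped below.
SQ_BLOCK = """
     MKNFNLLVVS SILLVDLFPT NCGHNVHFSR AINLNGVSFN NVDASSLGAA QVRQSASRGR
     GLGENPKDEE GADKPKKKEE KKVEPKKPRE NKLKQPPAGD GAPEGDGAPA APAGDGAPAA
     PAGDGAPAAP AGDGAPAAPA GDGAPAAPAG DGAPAAPAGD GAPAAPAGDG APAAPAGDGA
     PAAPAGDGAP AGNRAGGQPA AGGNQAGGNR AGGQPAAGGN QAGGNRAGGQ PAAGGNQAGG
     QPAAGGNQAG AQAGGNQAGA QAGGANAGNK KAGEAGGNAG AGQGQNNEAA NVPNAKLVKE
     YLDKIRSTLG VEWSPCSVTC GKGVRMRRKV SAANKKPEEL DVNDLETEVC TMDKCAGIFN
     VVSNSLRLVI LLVLALFN
"""

SEQ = "".join(SQ_BLOCK.split())

# Patterns taken from the two FT REGION notes (97..191 and 193..268).
REPEAT_1 = re.compile(r"P[AE]GDGAPA[AG]")
REPEAT_2 = "NRAGGQPAAGGNQAGG"


def feature(seq: str, start: int, end: int) -> str:
    """Return the residues of a 1-based, inclusive FT range."""
    return seq[start - 1:end]


def check_entry(seq: str) -> dict:
    """Collect a few consistency checks between FT lines and the sequence."""
    return {
        # ID and SQ lines both state 378 AA.
        "length_ok": len(seq) == 378,
        # REGION 92..96, "Region I".
        "region_I": feature(seq, 92, 96),
        # REPEAT 1-2 to 1-10: nine full 9-AA units starting at 103.
        "repeat_1_units_ok": all(
            REPEAT_1.fullmatch(feature(seq, s, s + 8)) is not None
            for s in range(103, 184, 9)
        ),
        # REPEAT 2-1 to 2-3: three full 16-AA units.
        "repeat_2_units_ok": all(
            feature(seq, s, s + 15) == REPEAT_2 for s in (193, 209, 225)
        ),
        # DISULFID 316..350 and 320..355; LIPID 355 is also a cysteine.
        "cysteines_ok": all(seq[p - 1] == "C" for p in (316, 320, 350, 355)),
        # CARBOHYD 319 is annotated as an O-linked (Fuc) threonine.
        "fucosylation_site": seq[319 - 1],
    }


if __name__ == "__main__":
    for name, value in check_entry(SEQ).items():
        print(f"{name}: {value}")
```

Keeping the sequence as the verbatim SQ block and stripping whitespace at run time avoids transcription errors when the 10-residue groups are joined by hand.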