CSP_PLAFL

ID   CSP_PLAFL               Reviewed;         315 AA.
AC   P05691;
DT   01-NOV-1988, integrated into UniProtKB/Swiss-Prot.
DT   01-NOV-1988, sequence version 1.
DT   03-AUG-2022, entry version 49.
DE   RecName: Full=Circumsporozoite protein {ECO:0000303|PubMed:2442154};
DE            Short=CS {ECO:0000303|PubMed:2442154};
DE   Contains:
DE     RecName: Full=Circumsporozoite protein C-terminus {ECO:0000305};
DE   Flags: Fragment;
GN   Name=CSP {ECO:0000250|UniProtKB:P02893};
OS   Plasmodium falciparum (isolate le5).
OC   Eukaryota; Sar; Alveolata; Apicomplexa; Aconoidasida; Haemosporida;
OC   Plasmodiidae; Plasmodium; Plasmodium (Laverania).
OX   NCBI_TaxID=5840;
RN   [1]
RP   NUCLEOTIDE SEQUENCE [GENOMIC DNA], POLYMORPHISM, AND REPEATS.
RX   PubMed=2442154; DOI=10.1016/s0021-9258(18)45298-2;
RA   de la Cruz V.F., Lal A.A., McCutchan T.F.;
RT   "Sequence variation in putative functional domains of the circumsporozoite
RT   protein of Plasmodium falciparum. Implications for vaccine development.";
RL   J. Biol. Chem. 262:11935-11939(1987).
CC   -!- FUNCTION: Essential sporozoite protein (By similarity). In the mosquito
CC       vector, required for sporozoite development in the oocyst, migration
CC       through the vector hemolymph and entry into the vector salivary glands
CC       (By similarity). In the vertebrate host, required for sporozoite
CC       migration through the host dermis and infection of host hepatocytes (By
CC       similarity). Binds to highly sulfated heparan sulfate proteoglycans
CC       (HSPGs) on the surface of host hepatocytes (By similarity).
CC       {ECO:0000250|UniProtKB:P02893, ECO:0000250|UniProtKB:P23093}.
CC   -!- FUNCTION: [Circumsporozoite protein C-terminus]: In the vertebrate
CC       host, binds to highly sulfated heparan sulfate proteoglycans (HSPGs) on
CC       the surface of host hepatocytes and is required for sporozoite invasion
CC       of the host hepatocytes. {ECO:0000250|UniProtKB:P23093}.
CC   -!- SUBCELLULAR LOCATION: Cell membrane {ECO:0000250|UniProtKB:P19597};
CC       Lipid-anchor, GPI-anchor {ECO:0000255}. Cytoplasm
CC       {ECO:0000250|UniProtKB:P23093}. Note=Localizes to the cytoplasm and the
CC       cell membrane in oocysts at day 6 post infection and then gradually
CC       distributes over the entire cell surface of the sporoblast and the
CC       budding sporozoites. {ECO:0000250|UniProtKB:P23093}.
CC   -!- DOMAIN: The N-terminus is involved in the initial binding to heparan
CC       sulfate proteoglycans (HSPGs) on the surface of host hepatocytes (By
CC       similarity). The N-terminus masks the TSP type-1 (TSR) domain, which
CC       maintains the sporozoites in a migratory state, enabling them to
CC       complete their journey to the salivary gland in the mosquito vector and
CC       then to the host liver. The unmasking of the TSP type-1 (TSR) domain
CC       when the sporozoite interacts with the host hepatocyte also protects
CC       sporozoites from host antibodies (By similarity).
CC       {ECO:0000250|UniProtKB:P23093, ECO:0000250|UniProtKB:Q7K740}.
CC   -!- DOMAIN: The TSP type-1 (TSR) domain is required for sporozoite
CC       development and invasion. CSP has two conformational states: an
CC       adhesive conformation in which the TSP type-1 (TSR) domain is exposed
CC       and a nonadhesive conformation in which the TSR domain is masked by
CC       the N-terminus. The TSR-exposed conformation occurs during sporozoite
CC       development in the oocyst in the mosquito vector and during host
CC       hepatocyte invasion. The TSR-masked conformation occurs during
CC       sporozoite migration through the hemolymph to the salivary glands in
CC       the mosquito vector and in the host dermis.
CC       {ECO:0000250|UniProtKB:P23093}.
CC   -!- DOMAIN: The GPI-anchor is essential for cell membrane localization and
CC       for sporozoite formation inside the oocyst.
CC       {ECO:0000250|UniProtKB:P23093}.
CC   -!- PTM: During host cell invasion, proteolytically cleaved at the cell
CC       membrane in region I by a papain-like cysteine protease of parasite
CC       origin (By similarity). Cleavage is triggered by sporozoite contact
CC       with highly sulfated heparan sulfate proteoglycans (HSPGs) present on
CC       the host hepatocyte cell surface (By similarity). Cleavage exposes the
CC       TSP type-1 (TSR) domain and is required for productive invasion of host
CC       hepatocytes but not for adhesion to the host cell membrane (By
CC       similarity). Cleavage is dispensable for sporozoite development in the
CC       oocyst, for motility and for traversal of host and vector cells (By
CC       similarity). {ECO:0000250|UniProtKB:P02893,
CC       ECO:0000250|UniProtKB:P23093}.
CC   -!- PTM: O-glycosylated, possibly by POFUT2. {ECO:0000250|UniProtKB:P19597}.
CC   -!- POLYMORPHISM: The sequence of the repeats varies across Plasmodium
CC       species and strains. {ECO:0000269|PubMed:2442154}.
CC   -!- SIMILARITY: Belongs to the Plasmodium circumsporozoite protein family.
CC       {ECO:0000305}.
CC   ---------------------------------------------------------------------------
CC   Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC   Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC   ---------------------------------------------------------------------------
DR   EMBL; M17802; AAA29538.1; -; Genomic_DNA.
DR   AlphaFoldDB; P05691; -.
DR   SMR; P05691; -.
DR   GO; GO:0031225; C:anchored component of membrane; IEA:UniProtKB-KW.
DR   GO; GO:0005737; C:cytoplasm; IEA:UniProtKB-SubCell.
DR   GO; GO:0005886; C:plasma membrane; IEA:UniProtKB-SubCell.
PE   3: Inferred from homology;
KW   Cell membrane; Cytoplasm; Glycoprotein; GPI-anchor; Lipoprotein; Malaria;
KW   Membrane; Repeat; Sporozoite.
FT   CHAIN           <1..>315
FT                   /note="Circumsporozoite protein"
FT                   /id="PRO_0000217181"
FT   CHAIN           ?..>315
FT                   /note="Circumsporozoite protein C-terminus"
FT                   /evidence="ECO:0000250|UniProtKB:P23093"
FT                   /id="PRO_0000455487"
FT   REPEAT          107..110
FT                   /note="1"
FT                   /evidence="ECO:0000305|PubMed:2442154"
FT   REPEAT          111..114
FT                   /note="2"
FT                   /evidence="ECO:0000305|PubMed:2442154"
FT   REPEAT          115..118
FT                   /note="3"
FT                   /evidence="ECO:0000305|PubMed:2442154"
FT   REPEAT          119..122
FT                   /note="4"
FT                   /evidence="ECO:0000305|PubMed:2442154"
FT   REPEAT          123..126
FT                   /note="5"
FT                   /evidence="ECO:0000305|PubMed:2442154"
FT   REPEAT          127..130
FT                   /note="6"
FT                   /evidence="ECO:0000305|PubMed:2442154"
FT   REPEAT          131..134
FT                   /note="7"
FT                   /evidence="ECO:0000305|PubMed:2442154"
FT   REPEAT          135..138
FT                   /note="8"
FT                   /evidence="ECO:0000305|PubMed:2442154"
FT   REPEAT          139..142
FT                   /note="9"
FT                   /evidence="ECO:0000305|PubMed:2442154"
FT   REPEAT          143..146
FT                   /note="10"
FT                   /evidence="ECO:0000305|PubMed:2442154"
FT   REPEAT          147..150
FT                   /note="11"
FT                   /evidence="ECO:0000305|PubMed:2442154"
FT   REPEAT          151..154
FT                   /note="12"
FT                   /evidence="ECO:0000305|PubMed:2442154"
FT   REPEAT          155..158
FT                   /note="13"
FT                   /evidence="ECO:0000305|PubMed:2442154"
FT   REPEAT          159..162
FT                   /note="14"
FT                   /evidence="ECO:0000305|PubMed:2442154"
FT   REPEAT          163..166
FT                   /note="15"
FT                   /evidence="ECO:0000305|PubMed:2442154"
FT   REPEAT          167..170
FT                   /note="16"
FT                   /evidence="ECO:0000305|PubMed:2442154"
FT   REPEAT          171..174
FT                   /note="17"
FT                   /evidence="ECO:0000305|PubMed:2442154"
FT   REPEAT          175..178
FT                   /note="18"
FT                   /evidence="ECO:0000305|PubMed:2442154"
FT   REPEAT          179..182
FT                   /note="19"
FT                   /evidence="ECO:0000305|PubMed:2442154"
FT   REPEAT          183..186
FT                   /note="20"
FT                   /evidence="ECO:0000305|PubMed:2442154"
FT   REPEAT          187..190
FT                   /note="21"
FT                   /evidence="ECO:0000305|PubMed:2442154"
FT   REPEAT          191..194
FT                   /note="22"
FT                   /evidence="ECO:0000305|PubMed:2442154"
FT   REPEAT          195..198
FT                   /note="23"
FT                   /evidence="ECO:0000305|PubMed:2442154"
FT   REPEAT          199..202
FT                   /note="24"
FT                   /evidence="ECO:0000305|PubMed:2442154"
FT   REPEAT          203..206
FT                   /note="25"
FT                   /evidence="ECO:0000305|PubMed:2442154"
FT   REPEAT          207..210
FT                   /note="26"
FT                   /evidence="ECO:0000305|PubMed:2442154"
FT   REPEAT          211..214
FT                   /note="27"
FT                   /evidence="ECO:0000305|PubMed:2442154"
FT   REPEAT          215..218
FT                   /note="28"
FT                   /evidence="ECO:0000305|PubMed:2442154"
FT   REPEAT          219..222
FT                   /note="29"
FT                   /evidence="ECO:0000305|PubMed:2442154"
FT   REPEAT          223..226
FT                   /note="30"
FT                   /evidence="ECO:0000305|PubMed:2442154"
FT   REPEAT          227..230
FT                   /note="31"
FT                   /evidence="ECO:0000305|PubMed:2442154"
FT   REPEAT          231..234
FT                   /note="32"
FT                   /evidence="ECO:0000305|PubMed:2442154"
FT   REPEAT          235..238
FT                   /note="33"
FT                   /evidence="ECO:0000305|PubMed:2442154"
FT   REPEAT          239..242
FT                   /note="34"
FT                   /evidence="ECO:0000305|PubMed:2442154"
FT   REPEAT          243..246
FT                   /note="35"
FT                   /evidence="ECO:0000305|PubMed:2442154"
FT   REPEAT          247..250
FT                   /note="36"
FT                   /evidence="ECO:0000305|PubMed:2442154"
FT   REPEAT          251..254
FT                   /note="37"
FT                   /evidence="ECO:0000305|PubMed:2442154"
FT   REPEAT          255..258
FT                   /note="38"
FT                   /evidence="ECO:0000305|PubMed:2442154"
FT   REPEAT          259..262
FT                   /note="39"
FT                   /evidence="ECO:0000305|PubMed:2442154"
FT   REPEAT          263..266
FT                   /note="40"
FT                   /evidence="ECO:0000305|PubMed:2442154"
FT   REGION          53..315
FT                   /note="Disordered"
FT                   /evidence="ECO:0000256|SAM:MobiDB-lite"
FT   REGION          88..95
FT                   /note="Required for the binding to heparan sulfate
FT                   proteoglycans (HSPGs) on the surface of host hepatocytes"
FT                   /evidence="ECO:0000250|UniProtKB:Q7K740"
FT   REGION          96..100
FT                   /note="Region I; contains the proteolytic cleavage site"
FT                   /evidence="ECO:0000250|UniProtKB:P23093"
FT   REGION          107..266
FT                   /note="40 X 4 AA tandem repeats of P-N-[AV]-[ND]"
FT                   /evidence="ECO:0000305|PubMed:2442154"
FT   COMPBIAS        65..91
FT                   /note="Basic and acidic residues"
FT                   /evidence="ECO:0000256|SAM:MobiDB-lite"
FT   COMPBIAS        105..304
FT                   /note="Polar residues"
FT                   /evidence="ECO:0000256|SAM:MobiDB-lite"
FT   NON_TER         1
FT   NON_TER         315
SQ   SEQUENCE   315 AA;  33649 MW;  A334DB11FA7FD777 CRC64;
     EALFQEYQCY GSSSNTRVLN ELNYDNAGTN LYNELEMNYY GKQENWYSLK KNSRSLGEND
     DGNNNNGDNG REGKDEDKRD GNNEDNEKLR KPKHKKLKQP ADGNPDPNAN PNVDPNANPN
     VDPNANPNVD PNANPNANPN ANPNANPNAN PNANPNANPN ANPNANPNAN PNANPNVDPN
     ANPNANPNAN PNANPNANPN ANPNANPNAN PNANPNANPN ANPNANPNAN PNANPNANPN
     ANPNANPNAN PNANPNANPN ANPNANPNKN NQGNGQGHNM PNDPNRNVDE NANGNNAVKN
     NNNEEPSDQH IEKYL
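
The feature table of this entry can be cross-checked against its sequence. The following is a minimal Python sketch and not part of the UniProt record: it hardcodes the SQ rows and the 1-based coordinates copied from the FT lines (REGION 88..95, REGION 96..100, REGION 107..266), and the helper names `feature` and `split_repeats` are hypothetical, introduced only for illustration.

```python
import re

# Sequence rows copied verbatim from the SQ block of this entry.
RAW_SQ = """
EALFQEYQCY GSSSNTRVLN ELNYDNAGTN LYNELEMNYY GKQENWYSLK KNSRSLGEND
DGNNNNGDNG REGKDEDKRD GNNEDNEKLR KPKHKKLKQP ADGNPDPNAN PNVDPNANPN
VDPNANPNVD PNANPNANPN ANPNANPNAN PNANPNANPN ANPNANPNAN PNANPNVDPN
ANPNANPNAN PNANPNANPN ANPNANPNAN PNANPNANPN ANPNANPNAN PNANPNANPN
ANPNANPNAN PNANPNANPN ANPNANPNKN NQGNGQGHNM PNDPNRNVDE NANGNNAVKN
NNNEEPSDQH IEKYL
"""

# Drop the spaces and line breaks used for display in the flat file.
SEQ = "".join(RAW_SQ.split())


def feature(seq, start, end):
    """Return residues start..end in UniProt's 1-based, inclusive coordinates."""
    return seq[start - 1:end]


def split_repeats(region, unit=4):
    """Cut a tandem-repeat region into consecutive units of `unit` residues."""
    return [region[i:i + unit] for i in range(0, len(region), unit)]


# Motif taken from the REGION 107..266 note: P-N-[AV]-[ND].
REPEAT_MOTIF = re.compile(r"PN[AV][ND]")

hspg_region = feature(SEQ, 88, 95)               # REGION 88..95
region_i = feature(SEQ, 96, 100)                 # REGION 96..100 (region I)
repeats = split_repeats(feature(SEQ, 107, 266))  # REGION 107..266

if __name__ == "__main__":
    print("length:", len(SEQ))
    print("HSPG-binding region:", hspg_region)
    print("region I:", region_i)
    print("repeat units:", len(repeats))
    print("all units match motif:",
          all(REPEAT_MOTIF.fullmatch(r) for r in repeats))
```

Slicing with `start - 1:end` converts the inclusive, 1-based FT coordinates into Python's 0-based, half-open slices, which is the usual source of off-by-one errors when reading feature tables by hand.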