CSG_HALMA
ID CSG_HALMA Reviewed; 875 AA.
AC Q5V7F4;
DT 16-APR-2014, integrated into UniProtKB/Swiss-Prot.
DT 16-APR-2014, sequence version 2.
DT 25-MAY-2022, entry version 67.
DE RecName: Full=Cell surface glycoprotein;
DE AltName: Full=S-layer glycoprotein;
DE Flags: Precursor;
GN Name=csg1; OrderedLocusNames=pNG5138;
OS Haloarcula marismortui (strain ATCC 43049 / DSM 3752 / JCM 8966 / VKM
OS B-1809) (Halobacterium marismortui).
OG Plasmid pNG500.
OC Archaea; Euryarchaeota; Stenosarchaea group; Halobacteria; Halobacteriales;
OC Haloarculaceae; Haloarcula.
OX NCBI_TaxID=272569;
RN [1]
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RC STRAIN=ATCC 43049 / DSM 3752 / JCM 8966 / VKM B-1809; PLASMID=pNG500;
RX PubMed=15520287; DOI=10.1101/gr.2700304;
RA Baliga N.S., Bonneau R., Facciotti M.T., Pan M., Glusman G., Deutsch E.W.,
RA Shannon P., Chiu Y., Weng R.S., Gan R.R., Hung P., Date S.V., Marcotte E.,
RA Hood L., Ng W.V.;
RT "Genome sequence of Haloarcula marismortui: a halophilic archaeon from the
RT Dead Sea.";
RL Genome Res. 14:2221-2234(2004).
RN [2]
RP GLYCOSYLATION AT ASN-455.
RX PubMed=21815949; DOI=10.1111/j.1365-2958.2011.07781.x;
RA Calo D., Guan Z., Naparstek S., Eichler J.;
RT "Different routes to the same ending: comparing the N-glycosylation
RT processes of Haloferax volcanii and Haloarcula marismortui, two halophilic
RT archaea from the Dead Sea.";
RL Mol. Microbiol. 81:1166-1177(2011).
CC -!- FUNCTION: S-layer protein. The S-layer is a paracrystalline mono-
CC layered assembly of proteins which coat the surface of the cell.
CC {ECO:0000250|UniProtKB:P25062}.
CC -!- SUBCELLULAR LOCATION: Secreted, cell wall, S-layer
CC {ECO:0000250|UniProtKB:P25062}. Cell membrane
CC {ECO:0000250|UniProtKB:P25062}.
CC -!- PTM: Asn-455 is glycosylated by a pentasaccharide comprising a hexose,
CC 2 hexuronic acids, a methyl ester of a hexuronic acid and a final
CC hexose. The complete pentasaccharide is first assembled on dolichol
CC phosphate and then transferred to the target Asn.
CC {ECO:0000269|PubMed:21815949}.
CC -!- PTM: Cleaved by the archaeosortase ArtA at the C-terminus, with removal
CC of a short hydrophobic segment. {ECO:0000250|UniProtKB:P25062}.
CC -!- PTM: Lipidation. {ECO:0000250|UniProtKB:P25062}.
CC -!- SIMILARITY: Belongs to the halobacterial S-layer protein family.
CC {ECO:0000305}.
CC -!- SEQUENCE CAUTION:
CC Sequence=AAV44550.1; Type=Erroneous initiation; Note=Extended N-terminus.; Evidence={ECO:0000305};
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR EMBL; AY596294; AAV44550.1; ALT_INIT; Genomic_DNA.
DR RefSeq; WP_049938478.1; NZ_CP039135.1.
DR AlphaFoldDB; Q5V7F4; -.
DR SMR; Q5V7F4; -.
DR iPTMnet; Q5V7F4; -.
DR EnsemblBacteria; AAV44550; AAV44550; pNG5138.
DR GeneID; 40150781; -.
DR KEGG; hma:pNG5138; -.
DR PATRIC; fig|272569.17.peg.290; -.
DR HOGENOM; CLU_015552_0_0_2; -.
DR Proteomes; UP000001169; Plasmid pNG500.
DR GO; GO:0005576; C:extracellular region; IEA:UniProtKB-KW.
DR GO; GO:0016021; C:integral component of membrane; IEA:UniProtKB-KW.
DR GO; GO:0005886; C:plasma membrane; IEA:UniProtKB-SubCell.
DR GO; GO:0030115; C:S-layer; IEA:UniProtKB-SubCell.
DR GO; GO:0071555; P:cell wall organization; IEA:UniProtKB-KW.
DR InterPro; IPR026458; Major_cell_surface_glycoprot.
DR InterPro; IPR026371; PGF_CTERM.
DR InterPro; IPR026452; Surf_glycop_sig_pep.
DR TIGRFAMs; TIGR04207; halo_sig_pep; 1.
DR TIGRFAMs; TIGR04216; halo_surf_glyco; 1.
DR TIGRFAMs; TIGR04126; PGF_CTERM; 1.
PE 1: Evidence at protein level;
KW Cell membrane; Cell wall; Cell wall biogenesis/degradation; Glycoprotein;
KW Membrane; Plasmid; Reference proteome; S-layer; Secreted; Signal;
KW Transmembrane; Transmembrane helix.
FT SIGNAL 1..23
FT /evidence="ECO:0000255"
FT CHAIN 24..?
FT /note="Cell surface glycoprotein"
FT /id="PRO_0000428762"
FT PROPEP ?..875
FT /note="Removed by archaeosortase"
FT /evidence="ECO:0000250|UniProtKB:P25062"
FT /id="PRO_0000444304"
FT TRANSMEM 851..875
FT /note="Helical"
FT /evidence="ECO:0000255"
FT REGION 137..158
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 197..217
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 380..414
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 794..852
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT MOTIF 852..854
FT /note="PGF sorting signal"
FT /evidence="ECO:0000250|UniProtKB:P25062"
FT COMPBIAS 197..211
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 380..404
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 799..844
FT /note="Acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT CARBOHYD 253
FT /note="N-linked (GlcNAc...) asparagine"
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00498"
FT CARBOHYD 455
FT /note="N-linked (GlcNAc...) asparagine"
FT /evidence="ECO:0000269|PubMed:21815949"
FT CARBOHYD 563
FT /note="N-linked (GlcNAc...) asparagine"
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00498"
FT CARBOHYD 715
FT /note="N-linked (GlcNAc...) asparagine"
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00498"
FT CARBOHYD 774
FT /note="N-linked (GlcNAc...) asparagine"
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00498"
SQ SEQUENCE 875 AA; 91013 MW; BDC957FE143EAF24 CRC64;
MTNTKQKINA VFLSALMVMS VFAAAVAFSG AAAAANRGAG FTYSTGPTDS NGGGNGDSVG
QVGPGAVVFQ GEEDLEDGGN FGSNTDIGQL QKVSGDNSGI LLGNPIPQDQ PTGSYTFDGN
SGTDGVTLQT PRVTSVEVQN GGSGDVTGST LQTSSSGPDA FVRADYNFQE AEDLEITVED
ENGLEVTNEI VVQKTGLPTA DRNNDNGASG SNGDFDVGWE LDTTDIDEGQ YTITVEGTED
LTFGDASETV TVNITSDQQA SLNLDNDEVV QGENLQFNVE NSPEGNYHVV LVESSEFRDG
ITADQASRIF RNVGDVQEVG LVDNTGPVSA STVASNVGSD QEVADVTRYA YGVVEIDGGS
GVGSIETQFL DDSSVDVELY PASDSSNDGY ASGGSHASSV TVRDTDGDGT DDSEDAIVTD
LLETDDDQSF DVVEGEITLD SPSGAYITGS QIDVNGTANQ GVDQVALYAR DNNDYELIEI
DGSNTVSVDG DDTFSEEDVV LSQGSKGGNS IVSLPGSYRI GVIDVQDADL DSDGTVDDTL
TTSDFNSGVS GATALRVTDT ALNGTFTTYN GQIASDDGQI DVDGQAPGKD NVIVAFVDSR
GNAAAQVVSV DDDDSFSEED IDITSLSEGT VTAHILSSGR DGEYGDTGTS SDSAFVNTIE
TGYAGGSSTG DQVREQILAN TVDDTASDDL IVNEQFRLTD GLTTIESVSS PVEANGTLEV
QGNTNRVPDD NTITVEILNS EDESVTVEST DEWGSDGQWS VNVDLSDVDI EPGNYTVEAD
DGDNTDRTSV TVVEAGSLEE EQPDTETPEP DTETPEPDTE TPEPDTETPE PDTETPEPDT
ETEEATTEAS GPGFTAAIAL IALVAAALLA VRRDN