ID SLAP1_ACET2 Reviewed; 2313 AA.
AC Q06852; A3DJZ5;
DT 01-JUN-1994, integrated into UniProtKB/Swiss-Prot.
DT 17-APR-2007, sequence version 2.
DT 03-AUG-2022, entry version 119.
DE RecName: Full=Cell surface glycoprotein 1;
DE AltName: Full=Outer layer protein B;
DE AltName: Full=S-layer protein 1;
DE Flags: Precursor;
GN Name=olpB; OrderedLocusNames=Cthe_3078;
OS Acetivibrio thermocellus (strain ATCC 27405 / DSM 1237 / JCM 9322 / NBRC
OS 103400 / NCIMB 10682 / NRRL B-4536 / VPI 7372) (Clostridium thermocellum).
OC Bacteria; Firmicutes; Clostridia; Eubacteriales; Oscillospiraceae;
OC Acetivibrio.
OX NCBI_TaxID=203119;
RN [1]
RP NUCLEOTIDE SEQUENCE [GENOMIC DNA].
RX PubMed=8458832; DOI=10.1128/jb.175.7.1891-1899.1993;
RA Fujino T., Beguin P., Aubert J.-P.;
RT "Organization of a Clostridium thermocellum gene cluster encoding the
RT cellulosomal scaffolding protein CipA and a protein possibly involved in
RT attachment of the cellulosome to the cell surface.";
RL J. Bacteriol. 175:1891-1899(1993).
RN [2]
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RC STRAIN=ATCC 27405 / DSM 1237 / JCM 9322 / NBRC 103400 / NCIMB 10682 / NRRL
RC B-4536 / VPI 7372;
RG US DOE Joint Genome Institute;
RA Copeland A., Lucas S., Lapidus A., Barry K., Detter J.C.,
RA Glavina del Rio T., Hammon N., Israni S., Dalin E., Tice H., Pitluck S.,
RA Chertkov O., Brettin T., Bruce D., Han C., Tapia R., Gilna P., Schmutz J.,
RA Larimer F., Land M., Hauser L., Kyrpides N., Mikhailova N., Wu J.H.D.,
RA Newcomb M., Richardson P.;
RT "Complete sequence of Clostridium thermocellum ATCC 27405.";
RL Submitted (FEB-2007) to the EMBL/GenBank/DDBJ databases.
CC -!- SUBUNIT: Assembled into mono-layered crystalline arrays.
CC -!- SUBCELLULAR LOCATION: Secreted, cell wall, S-layer.
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR EMBL; X67506; CAA47841.1; -; Genomic_DNA.
DR EMBL; CP000568; ABN54274.1; -; Genomic_DNA.
DR PIR; T18262; T18262.
DR AlphaFoldDB; Q06852; -.
DR SMR; Q06852; -.
DR STRING; 203119.Cthe_3078; -.
DR EnsemblBacteria; ABN54274; ABN54274; Cthe_3078.
DR KEGG; cth:Cthe_3078; -.
DR eggNOG; COG1361; Bacteria.
DR eggNOG; COG2911; Bacteria.
DR eggNOG; COG3266; Bacteria.
DR HOGENOM; CLU_230037_0_0_9; -.
DR OMA; NICNYAS; -.
DR BioCyc; MetaCyc:MON-16411; -.
DR Proteomes; UP000002145; Chromosome.
DR GO; GO:0005576; C:extracellular region; IEA:UniProtKB-KW.
DR GO; GO:0030115; C:S-layer; IEA:UniProtKB-SubCell.
DR GO; GO:0030246; F:carbohydrate binding; IEA:InterPro.
DR GO; GO:0000272; P:polysaccharide catabolic process; IEA:InterPro.
DR InterPro; IPR008965; CBM2/CBM3_carb-bd_dom_sf.
DR InterPro; IPR002102; Cohesin_dom.
DR InterPro; IPR001119; SLH_dom.
DR Pfam; PF00963; Cohesin; 7.
DR Pfam; PF00395; SLH; 3.
DR SUPFAM; SSF49384; SSF49384; 7.
DR PROSITE; PS51272; SLH; 3.
PE 3: Inferred from homology;
KW Cell wall; Reference proteome; Repeat; S-layer; Secreted; Signal.
FT SIGNAL 1..28
FT /evidence="ECO:0000255"
FT CHAIN 29..2313
FT /note="Cell surface glycoprotein 1"
FT /id="PRO_0000032634"
FT DOMAIN 34..197
FT /note="Cohesin 1"
FT DOMAIN 205..367
FT /note="Cohesin 2"
FT DOMAIN 407..569
FT /note="Cohesin 3"
FT DOMAIN 609..771
FT /note="Cohesin 4"
FT DOMAIN 811..973
FT /note="Cohesin 5"
FT DOMAIN 1013..1175
FT /note="Cohesin 6"
FT DOMAIN 1211..1375
FT /note="Cohesin 7"
FT DOMAIN 2067..2140
FT /note="SLH 1"
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00777"
FT DOMAIN 2141..2204
FT /note="SLH 2"
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00777"
FT DOMAIN 2211..2274
FT /note="SLH 3"
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00777"
FT REGION 369..400
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 571..602
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 772..805
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 974..1007
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 1177..1203
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 1374..2111
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 1383..2025
FT /note="Approximate tandem repeats of T-P-S-D-E-P"
FT COMPBIAS 1383..2037
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT CONFLICT 367..972
FT /note="Missing (in Ref. 1; CAA47841)"
FT /evidence="ECO:0000305"
FT CONFLICT 1762..1804
FT /note="Missing (in Ref. 1; CAA47841)"
FT /evidence="ECO:0000305"
SQ SEQUENCE 2313 AA; 248168 MW; 961524654302E572 CRC64;
MKRKNKVLSI LLTLLLIIST TSVNMSFAEA TPSIEMVLDK TEVHVGDVIT ATIKVNNIRK
LAGYQLNIKF DPEVLQPVDP ATGEEFTDKS MPVNRVLLTN SKYGPTPVAG NDIKSGIINF
ATGYNNLTAY KSSGIDEHTG IIGEIGFKVL KKQNTSIRFE DTLSMPGAIS GTSLFDWDAE
TITGYEVIQP DLIVVEAEPL KDASVALELD KTKVKVGDII TATIKIENMK NFAGYQLNIK
YDPTMLEAIE LETGSAIAKR TWPVTGGTVL QSDNYGKTTA VANDVGAGII NFAEAYSNLT
KYRETGVAEE TGIIGKIGFR VLKAGSTAIR FEDTTAMPGA IEGTYMFDWY GENIKGYSVV
QPGEIVVEGE EPGEEPTEEP VPTETSVDPT PTVTEEPVPS ELPDSYVIME LDKTKVKVGD
IITATIKIEN MKNFAGYQLN IKYDPTMLEA IELETGSAIA KRTWPVTGGT VLQSDNYGKT
TAVANDVGAG IINFAEAYSN LTKYRETGVA EETGIIGKIG FRVLKAGSTA IRFEDTTAMP
GAIEGTYMFD WYGENIKGYS VVQPGEIVVE GEEPGEEPTE EPVPTETSVD PTPTVTEEPV
PSELPDSYVI MELDKTKVKV GDIITATIKI ENMKNFAGYQ LNIKYDPTML EAIELETGSA
IAKRTWPVTG GTVLQSDNYG KTTAVANDVG AGIINFAEAY SNLTKYRETG VAEETGIIGK
IGFRVLKAGS TAIRFEDTTA MPGAIEGTYM FDWYGENIKG YSVVQPGEIV AEGEEPGEEP
TEEPVPTETS ADPTPTVTEE PVPSELPDSY VIMELDKTKV KVGDIITATI KIENMKNFAG
YQLNIKYDPT MLEAIELETG SAIAKRTWPV TGGTVLQSDN YGKTTAVAND VGAGIINFAE
AYSNLTKYRE TGVAEETGII GKIGFRVLKA GSTAIRFEDT TAMPGAIEGT YMFDWYGENI
KGYSVVQPGE IVAEGEEPGE EPTEEPVPTE TPVDPTPTVT EEPVPSELPD SYVIMELDKT
KVKVGDIITA TIKIENMKNF AGYQLNIKYD PTMLEAIELE TGSAIAKRTW PVTGGTVLQS
DNYGKTTAVA NDVGAGIINF AEAYSNLTKY RETGVAEETG IIGKIGFRVL KAGSTAIRFE
DTTAMPGAIE GTYMFDWYGE NIKGYSVVQP GEIVAEGEEP TEEPVPTETP VDPTPTVTEE
PVPSELPDSY VIMELDKTKV KEGDVIIATI RVNNIKNLAG YQIGIKYDPK VLEAFNIETG
DPIDEGTWPA VGGTILKNRD YLPTGVAINN VSKGILNFAA YYVYFDDYRE EGKSEDTGII
GNIGFRVLKA EDTTIRFEEL ESMPGSIDGT YMLDWYLNRI SGYVVIQPAP IKAASDEPIP
TDTPSDEPTP SDEPTPSDEP TPSDEPTPSD EPTPSETPEE PIPTDTPSDE PTPSDEPTPS
DEPTPSDEPT PSDEPTPSET PEEPIPTDTP SDEPTPSDEP TPSDEPTPSD EPTPSDEPTP
SETPEEPIPT DTPSDEPTPS DEPTPSDEPT PSDEPTPSDE PTPSETPEEP IPTDTPSDEP
TPSDEPTPSD EPTPSDEPTP SDEPTPSDEP TPSDEPTPSE TPEEPIPTDT PSDEPTPSDE
PTPSDEPTPS DEPTPSDEPT PSDEPTPSDE PTPSETPEEP IPTDTPSDEP TPSDEPTPSD
EPTPSDEPTP SDEPTPSETP EEPIPTDTPS DEPTPSDEPT PSDEPTPSDE PTPSDEPTPS
ETPEEPIPTD TPSDEPTPSD EPTPSDEPTP SDEPTPSDEP TPSETPEEPI PTDTPSDEPT
PSDEPTPSDE PTPSDEPTPS DEPTPSETPE EPIPTDTPSD EPTPSDEPTP SDEPTPSDEP
TPSDEPTPSE TPEEPIPTDT PSDEPTPSDE PTPSDEPTPS DEPTPSDEPT PSETPEEPIP
TDTPSDEPTP SDEPTPSDEP TPSDEPTPSD EPTPSDEPTP SDEPTPSETP EEPIPTDTPS
DEPTPSDEPT PSDEPTPSDE PTPSDEPTPS DEPTPSDEPT PSETPEEPTP TTTPTPTPST
TPTSGSGGSG GSGGGGGGGG GTVPTSPTPT PTSKPTSTPA PTEIEEPTPS DVPGAIGGEH
RAYLRGYPDG SFRPERNITR AEAAVIFAKL LGADESYGAQ SASPYSDLAD THWAAWAIKF
ATSQGLFKGY PDGTFKPDQN ITRAEFATVV LHFLTKVKGQ EIMSKLATID ISNPKFDDCV
GHWAQEFIEK LTSLGYISGY PDGTFKPQNY IKRSESVALI NRALERGPLN GAPKLFPDVN
ESYWAFGDIM DGALDHSYII EDEKEKFVKL LED