ID NRX3A_HUMAN Reviewed; 1643 AA.
AC Q9Y4C0; A6NGR4; A7MD34; O95378; Q8IUE3; Q9NS47; Q9P1V3; Q9P1V6; Q9UIE2;
AC Q9UIE3; Q9ULA5; Q9Y486;
DT 16-NOV-2001, integrated into UniProtKB/Swiss-Prot.
DT 03-MAR-2009, sequence version 4.
DT 03-AUG-2022, entry version 202.
DE RecName: Full=Neurexin-3;
DE AltName: Full=Neurexin III-alpha;
DE AltName: Full=Neurexin-3-alpha;
DE Flags: Precursor;
GN Name=NRXN3; Synonyms=C14orf60, KIAA0743;
OS Homo sapiens (Human).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia;
OC Eutheria; Euarchontoglires; Primates; Haplorrhini; Catarrhini; Hominidae;
OC Homo.
OX NCBI_TaxID=9606;
RN [1]
RP NUCLEOTIDE SEQUENCE [MRNA] (ISOFORM 4A), AND TISSUE SPECIFICITY.
RC TISSUE=Heart;
RX PubMed=12379233; DOI=10.1016/s0006-291x(02)02403-8;
RA Occhi G., Rampazzo A., Beffagna G., Antonio Danieli G.;
RT "Identification and characterization of heart-specific splicing of human
RT neurexin 3 mRNA (NRXN3).";
RL Biochem. Biophys. Res. Commun. 298:151-155(2002).
RN [2]
RP NUCLEOTIDE SEQUENCE [GENOMIC DNA], AND ALTERNATIVE SPLICING.
RX PubMed=11944992; DOI=10.1006/geno.2002.6734;
RA Rowen L., Young J., Birditt B., Kaur A., Madan A., Philipps D.L., Qin S.,
RA Minx P., Wilson R.K., Hood L., Graveley B.R.;
RT "Analysis of the human neurexin genes: alternative splicing and the
RT generation of protein diversity.";
RL Genomics 79:587-597(2002).
RN [3]
RP NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA] (ISOFORM 3A).
RC TISSUE=Brain;
RX PubMed=9872452; DOI=10.1093/dnares/5.5.277;
RA Nagase T., Ishikawa K., Suyama M., Kikuno R., Miyajima N., Tanaka A.,
RA Kotani H., Nomura N., Ohara O.;
RT "Prediction of the coding sequences of unidentified human genes. XI. The
RT complete sequences of 100 new cDNA clones from brain which code for large
RT proteins in vitro.";
RL DNA Res. 5:277-286(1998).
RN [4]
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RX PubMed=12508121; DOI=10.1038/nature01348;
RA Heilig R., Eckenberg R., Petit J.-L., Fonknechten N., Da Silva C.,
RA Cattolico L., Levy M., Barbe V., De Berardinis V., Ureta-Vidal A.,
RA Pelletier E., Vico V., Anthouard V., Rowen L., Madan A., Qin S., Sun H.,
RA Du H., Pepin K., Artiguenave F., Robert C., Cruaud C., Bruels T.,
RA Jaillon O., Friedlander L., Samson G., Brottier P., Cure S., Segurens B.,
RA Aniere F., Samain S., Crespeau H., Abbasi N., Aiach N., Boscus D.,
RA Dickhoff R., Dors M., Dubois I., Friedman C., Gouyvenoux M., James R.,
RA Madan A., Mairey-Estrada B., Mangenot S., Martins N., Menard M., Oztas S.,
RA Ratcliffe A., Shaffer T., Trask B., Vacherie B., Bellemere C., Belser C.,
RA Besnard-Gonnet M., Bartol-Mavel D., Boutard M., Briez-Silla S.,
RA Combette S., Dufosse-Laurent V., Ferron C., Lechaplais C., Louesse C.,
RA Muselet D., Magdelenat G., Pateau E., Petit E., Sirvain-Trukniewicz P.,
RA Trybou A., Vega-Czarny N., Bataille E., Bluet E., Bordelais I., Dubois M.,
RA Dumont C., Guerin T., Haffray S., Hammadi R., Muanga J., Pellouin V.,
RA Robert D., Wunderle E., Gauguet G., Roy A., Sainte-Marthe L., Verdier J.,
RA Verdier-Discala C., Hillier L.W., Fulton L., McPherson J., Matsuda F.,
RA Wilson R., Scarpelli C., Gyapay G., Wincker P., Saurin W., Quetier F.,
RA Waterston R., Hood L., Weissenbach J.;
RT "The DNA sequence and analysis of human chromosome 14.";
RL Nature 421:601-607(2003).
RN [5]
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RA Mural R.J., Istrail S., Sutton G.G., Florea L., Halpern A.L., Mobarry C.M.,
RA Lippert R., Walenz B., Shatkay H., Dew I., Miller J.R., Flanigan M.J.,
RA Edwards N.J., Bolanos R., Fasulo D., Halldorsson B.V., Hannenhalli S.,
RA Turner R., Yooseph S., Lu F., Nusskern D.R., Shue B.C., Zheng X.H.,
RA Zhong F., Delcher A.L., Huson D.H., Kravitz S.A., Mouchard L., Reinert K.,
RA Remington K.A., Clark A.G., Waterman M.S., Eichler E.E., Adams M.D.,
RA Hunkapiller M.W., Myers E.W., Venter J.C.;
RL Submitted (JUL-2005) to the EMBL/GenBank/DDBJ databases.
RN [6]
RP NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA] (ISOFORM 3A).
RX PubMed=15489334; DOI=10.1101/gr.2596504;
RG The MGC Project Team;
RT "The status, quality, and expansion of the NIH full-length cDNA project:
RT the Mammalian Gene Collection (MGC).";
RL Genome Res. 14:2121-2127(2004).
RN [7]
RP TISSUE SPECIFICITY.
RX PubMed=19926856; DOI=10.1073/pnas.0809510106;
RA Bottos A., Destro E., Rissone A., Graziano S., Cordara G., Assenzio B.,
RA Cera M.R., Mascia L., Bussolino F., Arese M.;
RT "The synaptic proteins neurexins and neuroligins are widely expressed in
RT the vascular system and contribute to its functions.";
RL Proc. Natl. Acad. Sci. U.S.A. 106:20782-20787(2009).
CC -!- FUNCTION: Neuronal cell surface protein that may be involved in cell
CC recognition and cell adhesion. May mediate intracellular signaling.
CC -!- SUBUNIT: The laminin G-like domain 2 binds to NXPH1. Specific isoforms
CC bind to alpha-dystroglycan. The cytoplasmic C-terminal region binds to
CC CASK (By similarity). {ECO:0000250}.
CC -!- SUBCELLULAR LOCATION: Membrane {ECO:0000305}; Single-pass type I
CC membrane protein {ECO:0000305}.
CC -!- ALTERNATIVE PRODUCTS:
CC Event=Alternative promoter usage, Alternative splicing; Named isoforms=7;
CC Comment=A number of isoforms, alpha-type and beta-type, are produced
CC by alternative promoter usage. Beta-type isoforms differ from
CC alpha-type isoforms in their N-terminus. Additional isoforms produced
CC by alternative splicing seem to exist. {ECO:0000269|PubMed:11944992};
CC Name=1a;
CC IsoId=Q9Y4C0-1; Sequence=Displayed;
CC Name=3a;
CC IsoId=Q9Y4C0-3; Sequence=VSP_036463, VSP_036464;
CC Name=4a;
CC IsoId=Q9Y4C0-4; Sequence=VSP_041699, VSP_041700, VSP_041701,
CC VSP_041702, VSP_041703, VSP_041704;
CC Name=1b;
CC IsoId=Q9HDB5-1; Sequence=External;
CC Name=2b;
CC IsoId=Q9HDB5-2; Sequence=External;
CC Name=3b;
CC IsoId=Q9HDB5-3; Sequence=External;
CC Name=4b;
CC IsoId=Q9HDB5-4; Sequence=External;
CC -!- TISSUE SPECIFICITY: Expressed in blood vessel walls (at protein
CC level). Highly expressed in brain, lung, and pancreas; expressed at
CC lower levels in heart, placenta, liver, and kidney; not detected in
CC skeletal muscle. Isoform 4a is heart-specific.
CC {ECO:0000269|PubMed:12379233, ECO:0000269|PubMed:19926856}.
CC -!- MISCELLANEOUS: [Isoform 3a]: Produced by alternative splicing.
CC {ECO:0000305}.
CC -!- MISCELLANEOUS: [Isoform 4a]: Produced by alternative splicing.
CC {ECO:0000305}.
CC -!- SIMILARITY: Belongs to the neurexin family. {ECO:0000305}.
CC -!- SEQUENCE CAUTION:
CC Sequence=BAA34463.2; Type=Erroneous initiation; Note=Extended N-terminus.; Evidence={ECO:0000305};
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR EMBL; AJ316284; CAC87720.2; -; mRNA.
DR EMBL; AF099810; AAC68909.1; -; Genomic_DNA.
DR EMBL; AF123462; AAD13621.1; -; Genomic_DNA.
DR EMBL; AB018286; BAA34463.2; ALT_INIT; mRNA.
DR EMBL; AC008056; AAF09143.1; -; Genomic_DNA.
DR EMBL; AC012099; AAF15058.1; -; Genomic_DNA.
DR EMBL; AC009396; AAF21147.1; -; Genomic_DNA.
DR EMBL; AC008045; AAF28465.1; -; Genomic_DNA.
DR EMBL; AC011440; AAF61277.1; -; Genomic_DNA.
DR EMBL; AC026888; AAF87841.1; -; Genomic_DNA.
DR EMBL; CH471061; EAW81316.1; -; Genomic_DNA.
DR EMBL; BC152457; AAI52458.1; -; mRNA.
DR CCDS; CCDS9870.1; -. [Q9Y4C0-3]
DR RefSeq; NP_004787.2; NM_004796.5. [Q9Y4C0-3]
DR AlphaFoldDB; Q9Y4C0; -.
DR SMR; Q9Y4C0; -.
DR BioGRID; 114770; 22.
DR ELM; Q9Y4C0; -.
DR IntAct; Q9Y4C0; 2.
DR GlyGen; Q9Y4C0; 6 sites.
DR iPTMnet; Q9Y4C0; -.
DR PhosphoSitePlus; Q9Y4C0; -.
DR BioMuta; NRXN3; -.
DR DMDM; 224471902; -.
DR EPD; Q9Y4C0; -.
DR jPOST; Q9Y4C0; -.
DR MassIVE; Q9Y4C0; -.
DR MaxQB; Q9Y4C0; -.
DR PaxDb; Q9Y4C0; -.
DR PeptideAtlas; Q9Y4C0; -.
DR PRIDE; Q9Y4C0; -.
DR ProteomicsDB; 86154; -. [Q9Y4C0-1]
DR ProteomicsDB; 86155; -. [Q9Y4C0-3]
DR ProteomicsDB; 86156; -. [Q9Y4C0-4]
DR Antibodypedia; 106; 252 antibodies from 31 providers.
DR DNASU; 9369; -.
DR Ensembl; ENST00000554719.5; ENSP00000451648.1; ENSG00000021645.20. [Q9Y4C0-3]
DR Ensembl; ENST00000554738.5; ENSP00000450683.1; ENSG00000021645.20. [Q9Y4C0-4]
DR GeneID; 9369; -.
DR UCSC; uc001xun.5; human. [Q9Y4C0-1]
DR CTD; 9369; -.
DR DisGeNET; 9369; -.
DR GeneCards; NRXN3; -.
DR HGNC; HGNC:8010; NRXN3.
DR HPA; ENSG00000021645; Tissue enhanced (brain, retina).
DR MIM; 600567; gene.
DR neXtProt; NX_Q9Y4C0; -.
DR OpenTargets; ENSG00000021645; -.
DR PharmGKB; PA31788; -.
DR VEuPathDB; HostDB:ENSG00000021645; -.
DR eggNOG; KOG3514; Eukaryota.
DR GeneTree; ENSGT00940000154618; -.
DR HOGENOM; CLU_001710_0_1_1; -.
DR InParanoid; Q9Y4C0; -.
DR PhylomeDB; Q9Y4C0; -.
DR TreeFam; TF321302; -.
DR PathwayCommons; Q9Y4C0; -.
DR Reactome; R-HSA-6794361; Neurexins and neuroligins.
DR SignaLink; Q9Y4C0; -.
DR SIGNOR; Q9Y4C0; -.
DR BioGRID-ORCS; 9369; 10 hits in 1067 CRISPR screens.
DR ChiTaRS; NRXN3; human.
DR GenomeRNAi; 9369; -.
DR Pharos; Q9Y4C0; Tbio.
DR Proteomes; UP000005640; Chromosome 14.
DR RNAct; Q9Y4C0; protein.
DR Bgee; ENSG00000021645; Expressed in cerebellar vermis and 162 other tissues.
DR ExpressionAtlas; Q9Y4C0; baseline and differential.
DR Genevisible; Q9Y4C0; HS.
DR GO; GO:0005887; C:integral component of plasma membrane; TAS:ProtInc.
DR GO; GO:0005886; C:plasma membrane; TAS:Reactome.
DR GO; GO:0050839; F:cell adhesion molecule binding; TAS:BHF-UCL.
DR GO; GO:0046872; F:metal ion binding; IEA:UniProtKB-KW.
DR GO; GO:0097109; F:neuroligin family protein binding; TAS:BHF-UCL.
DR GO; GO:0038023; F:signaling receptor activity; TAS:ProtInc.
DR GO; GO:0030534; P:adult behavior; IGI:BHF-UCL.
DR GO; GO:0007411; P:axon guidance; TAS:ProtInc.
DR GO; GO:0007612; P:learning; IGI:BHF-UCL.
DR GO; GO:0007158; P:neuron cell-cell adhesion; TAS:BHF-UCL.
DR GO; GO:0035176; P:social behavior; IGI:BHF-UCL.
DR GO; GO:0071625; P:vocalization behavior; IGI:BHF-UCL.
DR CDD; cd00110; LamG; 6.
DR InterPro; IPR013320; ConA-like_dom_sf.
DR InterPro; IPR000742; EGF-like_dom.
DR InterPro; IPR000152; EGF-type_Asp/Asn_hydroxyl_site.
DR InterPro; IPR001791; Laminin_G.
DR InterPro; IPR003585; Neurexin-like.
DR InterPro; IPR027789; Syndecan/Neurexin_dom.
DR Pfam; PF00008; EGF; 1.
DR Pfam; PF02210; Laminin_G_2; 6.
DR Pfam; PF01034; Syndecan; 1.
DR SMART; SM00294; 4.1m; 1.
DR SMART; SM00181; EGF; 3.
DR SMART; SM00282; LamG; 6.
DR SUPFAM; SSF49899; SSF49899; 6.
DR PROSITE; PS00010; ASX_HYDROXYL; 1.
DR PROSITE; PS50026; EGF_3; 3.
DR PROSITE; PS50025; LAM_G_DOMAIN; 6.
PE 1: Evidence at protein level;
KW Alternative promoter usage; Alternative splicing; Calcium; Cell adhesion;
KW Disulfide bond; EGF-like domain; Glycoprotein; Membrane; Metal-binding;
KW Reference proteome; Repeat; Signal; Transmembrane; Transmembrane helix.
FT SIGNAL 1..27
FT /evidence="ECO:0000250"
FT CHAIN 28..1643
FT /note="Neurexin-3"
FT /id="PRO_0000019499"
FT TOPO_DOM 28..1568
FT /note="Extracellular"
FT /evidence="ECO:0000255"
FT TRANSMEM 1569..1589
FT /note="Helical"
FT /evidence="ECO:0000255"
FT TOPO_DOM 1590..1643
FT /note="Cytoplasmic"
FT /evidence="ECO:0000255"
FT DOMAIN 28..202
FT /note="Laminin G-like 1"
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00122"
FT DOMAIN 198..235
FT /note="EGF-like 1"
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00076"
FT DOMAIN 258..440
FT /note="Laminin G-like 2"
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00122"
FT DOMAIN 447..639
FT /note="Laminin G-like 3"
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00122"
FT DOMAIN 643..680
FT /note="EGF-like 2"
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00076"
FT DOMAIN 685..857
FT /note="Laminin G-like 4"
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00122"
FT DOMAIN 871..1046
FT /note="Laminin G-like 5"
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00122"
FT DOMAIN 1049..1086
FT /note="EGF-like 3"
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00076"
FT DOMAIN 1090..1260
FT /note="Laminin G-like 6"
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00122"
FT REGION 1294..1318
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 1611..1643
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1294..1317
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1625..1643
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT BINDING 304
FT /ligand="Ca(2+)"
FT /ligand_id="ChEBI:CHEBI:29108"
FT /evidence="ECO:0000250"
FT BINDING 321
FT /ligand="Ca(2+)"
FT /ligand_id="ChEBI:CHEBI:29108"
FT /evidence="ECO:0000250"
FT BINDING 374
FT /ligand="Ca(2+)"
FT /ligand_id="ChEBI:CHEBI:29108"
FT /evidence="ECO:0000250"
FT CARBOHYD 58
FT /note="N-linked (GlcNAc...) asparagine"
FT /evidence="ECO:0000255"
FT CARBOHYD 105
FT /note="N-linked (GlcNAc...) asparagine"
FT /evidence="ECO:0000255"
FT CARBOHYD 757
FT /note="N-linked (GlcNAc...) asparagine"
FT /evidence="ECO:0000255"
FT CARBOHYD 1189
FT /note="N-linked (GlcNAc...) asparagine"
FT /evidence="ECO:0000255"
FT CARBOHYD 1257
FT /note="N-linked (GlcNAc...) asparagine"
FT /evidence="ECO:0000255"
FT CARBOHYD 1301
FT /note="N-linked (GlcNAc...) asparagine"
FT /evidence="ECO:0000255"
FT DISULFID 202..213
FT /evidence="ECO:0000250"
FT DISULFID 207..222
FT /evidence="ECO:0000250"
FT DISULFID 224..234
FT /evidence="ECO:0000250"
FT DISULFID 404..440
FT /evidence="ECO:0000250"
FT DISULFID 610..639
FT /evidence="ECO:0000250"
FT DISULFID 647..658
FT /evidence="ECO:0000250"
FT DISULFID 652..667
FT /evidence="ECO:0000250"
FT DISULFID 669..679
FT /evidence="ECO:0000250"
FT DISULFID 1018..1046
FT /evidence="ECO:0000250"
FT DISULFID 1053..1064
FT /evidence="ECO:0000250"
FT DISULFID 1058..1073
FT /evidence="ECO:0000250"
FT DISULFID 1075..1085
FT /evidence="ECO:0000250"
FT VAR_SEQ 1..373
FT /note="Missing (in isoform 3a)"
FT /evidence="ECO:0000303|PubMed:15489334,
FT ECO:0000303|PubMed:9872452"
FT /id="VSP_036463"
FT VAR_SEQ 237..242
FT /note="Missing (in isoform 4a)"
FT /evidence="ECO:0000303|PubMed:12379233"
FT /id="VSP_041699"
FT VAR_SEQ 252
FT /note="Q -> QGRSK (in isoform 4a)"
FT /evidence="ECO:0000303|PubMed:12379233"
FT /id="VSP_041700"
FT VAR_SEQ 750..759
FT /note="DCIRINCNSS -> G (in isoform 4a)"
FT /evidence="ECO:0000303|PubMed:12379233"
FT /id="VSP_041701"
FT VAR_SEQ 1205
FT /note="T -> TGNTDNERFQMVKQKIPFKYNRPVEEWLQEK (in isoform
FT 4a)"
FT /evidence="ECO:0000303|PubMed:12379233"
FT /id="VSP_041702"
FT VAR_SEQ 1334..1542
FT /note="Missing (in isoform 3a)"
FT /evidence="ECO:0000303|PubMed:15489334,
FT ECO:0000303|PubMed:9872452"
FT /id="VSP_036464"
FT VAR_SEQ 1335..1373
FT /note="TGGELVIPLLVEDPLATPPIATRAPSITLPPTFRPLLTI -> GRSARSSNA
FT ARSLRAALTWTWRLTYTFTPIIFISCVVHS (in isoform 4a)"
FT /evidence="ECO:0000303|PubMed:12379233"
FT /id="VSP_041703"
FT VAR_SEQ 1374..1643
FT /note="Missing (in isoform 4a)"
FT /evidence="ECO:0000303|PubMed:12379233"
FT /id="VSP_041704"
FT CONFLICT 1043
FT /note="E -> K (in Ref. 3; BAA34463 and 6; AAI52458)"
FT /evidence="ECO:0000305"
SQ SEQUENCE 1643 AA; 180599 MW; E7360348E263C6EC CRC64;
MSSTLHSVFF TLKVSILLGS LLGLCLGLEF MGLPNQWARY LRWDASTRSD LSFQFKTNVS
TGLLLYLDDG GVCDFLCLSL VDGRVQLRFS MDCAETAVLS NKQVNDSSWH FLMVSRDRLR
TVLMLDGEGQ SGELQPQRPY MDVVSDLFLG GVPTDIRPSA LTLDGVQAMP GFKGLILDLK
YGNSEPRLLG SRGVQMDAEG PCGERPCENG GICFLLDGHP TCDCSTTGYG GKLCSEDVSQ
DPGLSHLMMS EQAREENVAT FRGSEYLCYD LSQNPIQSSS DEITLSFKTW QRNGLILHTG
KSADYVNLAL KDGAVSLVIN LGSGAFEAIV EPVNGKFNDN AWHDVKVTRN LRQVTISVDG
ILTTTGYTQE DYTMLGSDDF FYVGGSPSTA DLPGSPVSNN FMGCLKEVVY KNNDIRLELS
RLARIADTKM KIYGEVVFKC ENVATLDPIN FETPEAYISL PKWNTKRMGS ISFDFRTTEP
NGLILFTHGK PQERKDARSQ KNTKVDFFAV ELLDGNLYLL LDMGSGTIKV KATQKKANDG
EWYHVDIQRD GRSGTISVNS RRTPFTASGE SEILDLEGDM YLGGLPENRA GLILPTELWT
AMLNYGYVGC IRDLFIDGRS KNIRQLAEMQ NAAGVKSSCS RMSAKQCDSY PCKNNAVCKD
GWNRFICDCT GTGYWGRTCE REASILSYDG SMYMKIIMPM VMHTEAEDVS FRFMSQRAYG
LLVATTSRDS ADTLRLELDG GRVKLMVNLD CIRINCNSSK GPETLYAGQK LNDNEWHTVR
VVRRGKSLKL TVDDDVAEGT MVGDHTRLEF HNIETGIMTE KRYISVVPSS FIGHLQSLMF
NGLLYIDLCK NGDIDYCELK ARFGLRNIIA DPVTFKTKSS YLSLATLQAY TSMHLFFQFK
TTSPDGFILF NSGDGNDFIA VELVKGYIHY VFDLGNGPNV IKGNSDRPLN DNQWHNVVIT
RDNSNTHSLK VDTKVVTQVI NGAKNLDLKG DLYMAGLAQG MYSNLPKLVA SRDGFQGCLA
SVDLNGRLPD LINDALHRSG QIERGCEGPS TTCQEDSCAN QGVCMQQWEG FTCDCSMTSY
SGNQCNDPGA TYIFGKSGGL ILYTWPANDR PSTRSDRLAV GFSTTVKDGI LVRIDSAPGL
GDFLQLHIEQ GKIGVVFNIG TVDISIKEER TPVNDGKYHV VRFTRNGGNA TLQVDNWPVN
EHYPTGRQLT IFNTQAQIAI GGKDKGRLFQ GQLSGLYYDG LKVLNMAAEN NPNIKINGSV
RLVGEVPSIL GTTQTTSMPP EMSTTVMETT TTMATTTTRK NRSTASIQPT SDDLVSSAEC
SSDDEDFVEC EPSTTGGELV IPLLVEDPLA TPPIATRAPS ITLPPTFRPL LTIIETTKDS
LSMTSEAGLP CLSDQGSDGC DDDGLVISGY GSGETFDSNL PPTDDEDFYT TFSLVTDKSL
STSIFEGGYK AHAPKWESKD FRPNKVSETS RTTTTSLSPE LIRFTASSSS GMVPKLPAGK
MNNRDLKPQP DIVLLPLPTA YELDSTKLKS PLITSPMFRN VPTANPTEPG IRRVPGASEV
IRESSSTTGM VVGIVAAAAL CILILLYAMY KYRNRDEGSY QVDETRNYIS NSAQSNGTLM
KEKQQSSKSG HKKQKNKDRE YYV