PRP31_MOUSE
ID PRP31_MOUSE Reviewed; 499 AA.
AC Q8CCF0; E9QPM6; Q6P7X2; Q8BQ91; Q8C8U4; Q8C8V5; Q8CCG6; Q8CF52; Q8VBW3;
DT 21-MAR-2006, integrated into UniProtKB/Swiss-Prot.
DT 27-JUL-2011, sequence version 3.
DT 03-AUG-2022, entry version 150.
DE RecName: Full=U4/U6 small nuclear ribonucleoprotein Prp31;
DE AltName: Full=Pre-mRNA-processing factor 31;
DE AltName: Full=U4/U6 snRNP 61 kDa protein;
DE Short=Protein 61K;
GN Name=Prpf31; Synonyms=Prp31;
OS Mus musculus (Mouse).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia;
OC Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae;
OC Murinae; Mus; Mus.
OX NCBI_TaxID=10090;
RN [1]
RP NUCLEOTIDE SEQUENCE [MRNA] (ISOFORM 1).
RC STRAIN=C57BL/6J X CBA/J; TISSUE=Lung;
RX PubMed=11867543; DOI=10.1093/emboj/21.5.1148;
RA Makarova O.V., Makarov E.M., Liu S., Vornlocher H.-P., Luehrmann R.;
RT "Protein 61K, encoded by a gene (PRPF31) linked to autosomal dominant
RT retinitis pigmentosa, is required for U4/U6.U5 tri-snRNP formation and pre-
RT mRNA splicing.";
RL EMBO J. 21:1148-1157(2002).
RN [2]
RP NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA] (ISOFORMS 1 AND 4).
RC STRAIN=C57BL/6J; TISSUE=Cerebellum, and Testis;
RX PubMed=16141072; DOI=10.1126/science.1112014;
RA Carninci P., Kasukawa T., Katayama S., Gough J., Frith M.C., Maeda N.,
RA Oyama R., Ravasi T., Lenhard B., Wells C., Kodzius R., Shimokawa K.,
RA Bajic V.B., Brenner S.E., Batalov S., Forrest A.R., Zavolan M., Davis M.J.,
RA Wilming L.G., Aidinis V., Allen J.E., Ambesi-Impiombato A., Apweiler R.,
RA Aturaliya R.N., Bailey T.L., Bansal M., Baxter L., Beisel K.W., Bersano T.,
RA Bono H., Chalk A.M., Chiu K.P., Choudhary V., Christoffels A.,
RA Clutterbuck D.R., Crowe M.L., Dalla E., Dalrymple B.P., de Bono B.,
RA Della Gatta G., di Bernardo D., Down T., Engstrom P., Fagiolini M.,
RA Faulkner G., Fletcher C.F., Fukushima T., Furuno M., Futaki S.,
RA Gariboldi M., Georgii-Hemming P., Gingeras T.R., Gojobori T., Green R.E.,
RA Gustincich S., Harbers M., Hayashi Y., Hensch T.K., Hirokawa N., Hill D.,
RA Huminiecki L., Iacono M., Ikeo K., Iwama A., Ishikawa T., Jakt M.,
RA Kanapin A., Katoh M., Kawasawa Y., Kelso J., Kitamura H., Kitano H.,
RA Kollias G., Krishnan S.P., Kruger A., Kummerfeld S.K., Kurochkin I.V.,
RA Lareau L.F., Lazarevic D., Lipovich L., Liu J., Liuni S., McWilliam S.,
RA Madan Babu M., Madera M., Marchionni L., Matsuda H., Matsuzawa S., Miki H.,
RA Mignone F., Miyake S., Morris K., Mottagui-Tabar S., Mulder N., Nakano N.,
RA Nakauchi H., Ng P., Nilsson R., Nishiguchi S., Nishikawa S., Nori F.,
RA Ohara O., Okazaki Y., Orlando V., Pang K.C., Pavan W.J., Pavesi G.,
RA Pesole G., Petrovsky N., Piazza S., Reed J., Reid J.F., Ring B.Z.,
RA Ringwald M., Rost B., Ruan Y., Salzberg S.L., Sandelin A., Schneider C.,
RA Schoenbach C., Sekiguchi K., Semple C.A., Seno S., Sessa L., Sheng Y.,
RA Shibata Y., Shimada H., Shimada K., Silva D., Sinclair B., Sperling S.,
RA Stupka E., Sugiura K., Sultana R., Takenaka Y., Taki K., Tammoja K.,
RA Tan S.L., Tang S., Taylor M.S., Tegner J., Teichmann S.A., Ueda H.R.,
RA van Nimwegen E., Verardo R., Wei C.L., Yagi K., Yamanishi H.,
RA Zabarovsky E., Zhu S., Zimmer A., Hide W., Bult C., Grimmond S.M.,
RA Teasdale R.D., Liu E.T., Brusic V., Quackenbush J., Wahlestedt C.,
RA Mattick J.S., Hume D.A., Kai C., Sasaki D., Tomaru Y., Fukuda S.,
RA Kanamori-Katayama M., Suzuki M., Aoki J., Arakawa T., Iida J., Imamura K.,
RA Itoh M., Kato T., Kawaji H., Kawagashira N., Kawashima T., Kojima M.,
RA Kondo S., Konno H., Nakano K., Ninomiya N., Nishio T., Okada M., Plessy C.,
RA Shibata K., Shiraki T., Suzuki S., Tagami M., Waki K., Watahiki A.,
RA Okamura-Oho Y., Suzuki H., Kawai J., Hayashizaki Y.;
RT "The transcriptional landscape of the mammalian genome.";
RL Science 309:1559-1563(2005).
RN [3]
RP NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA] (ISOFORM 1).
RA Ebert L., Muenstermann E., Schatten R., Henze S., Bohn E., Mollenhauer J.,
RA Wiemann S., Schick M., Korn B.;
RT "Cloning of mouse full open reading frames in Gateway(R) system entry
RT vector (pDONR201).";
RL Submitted (JUL-2005) to the EMBL/GenBank/DDBJ databases.
RN [4]
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RC STRAIN=C57BL/6J;
RX PubMed=19468303; DOI=10.1371/journal.pbio.1000112;
RA Church D.M., Goodstadt L., Hillier L.W., Zody M.C., Goldstein S., She X.,
RA Bult C.J., Agarwala R., Cherry J.L., DiCuccio M., Hlavina W., Kapustin Y.,
RA Meric P., Maglott D., Birtle Z., Marques A.C., Graves T., Zhou S.,
RA Teague B., Potamousis K., Churas C., Place M., Herschleb J., Runnheim R.,
RA Forrest D., Amos-Landgraf J., Schwartz D.C., Cheng Z., Lindblad-Toh K.,
RA Eichler E.E., Ponting C.P.;
RT "Lineage-specific biology revealed by a finished genome assembly of the
RT mouse.";
RL PLoS Biol. 7:E1000112-E1000112(2009).
RN [5]
RP NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA] (ISOFORMS 1 AND 2).
RC STRAIN=FVB/N, and NMRI; TISSUE=Brain, and Mammary tumor;
RX PubMed=15489334; DOI=10.1101/gr.2596504;
RG The MGC Project Team;
RT "The status, quality, and expansion of the NIH full-length cDNA project:
RT the Mammalian Gene Collection (MGC).";
RL Genome Res. 14:2121-2127(2004).
RN [6]
RP PHOSPHORYLATION [LARGE SCALE ANALYSIS] AT THR-455, AND IDENTIFICATION BY
RP MASS SPECTROMETRY [LARGE SCALE ANALYSIS].
RC TISSUE=Brain, Brown adipose tissue, Heart, Lung, Spleen, and Testis;
RX PubMed=21183079; DOI=10.1016/j.cell.2010.12.001;
RA Huttlin E.L., Jedrychowski M.P., Elias J.E., Goswami T., Rad R.,
RA Beausoleil S.A., Villen J., Haas W., Sowa M.E., Gygi S.P.;
RT "A tissue-specific atlas of mouse protein phosphorylation and expression.";
RL Cell 143:1174-1189(2010).
RN [7]
RP ACETYLATION [LARGE SCALE ANALYSIS] AT LYS-438, AND IDENTIFICATION BY MASS
RP SPECTROMETRY [LARGE SCALE ANALYSIS].
RC TISSUE=Embryonic fibroblast;
RX PubMed=23806337; DOI=10.1016/j.molcel.2013.06.001;
RA Park J., Chen Y., Tishkoff D.X., Peng C., Tan M., Dai L., Xie Z., Zhang Y.,
RA Zwaans B.M., Skinner M.E., Lombard D.B., Zhao Y.;
RT "SIRT5-mediated lysine desuccinylation impacts diverse metabolic
RT pathways.";
RL Mol. Cell 50:919-930(2013).
CC -!- FUNCTION: Involved in pre-mRNA splicing as component of the
CC spliceosome. Required for the assembly of the U4/U5/U6 tri-snRNP
CC complex, one of the building blocks of the spliceosome.
CC {ECO:0000250|UniProtKB:Q8WWY3}.
CC -!- SUBUNIT: Identified in the spliceosome B complex. Component of the
CC U4/U6-U5 tri-snRNP complex composed of the U4, U6 and U5 snRNAs and at
CC least PRPF3, PRPF4, PRPF6, PRPF8, PRPF31, SNRNP200, TXNL4A, SNRNP40,
CC DDX23, CD2BP2, PPIH, SNU13, EFTUD2, SART1 and USP39. Interacts with a
CC complex formed by SNU13 and U4 snRNA, but not with SNU13 or U4 snRNA
CC alone. The complex formed by SNU13 and PRPF31 also binds U4atac snRNA,
CC a characteristic component of specific, less abundant spliceosomal
CC complexes. Interacts with PRPF6/U5 snRNP-associated 102 kDa protein.
CC Component of some MLL1/MLL complexes, composed at least of the core
CC components KMT2A/MLL1, ASH2L, HCFC1/HCF1, WDR5 and RBBP5, as well as
CC the facultative components BAP18, CHD8, E2F6, HSP70, INO80C, KANSL1,
CC LAS1L, MAX, MCRS1, MGA, KAT8/MOF, PELP1, PHF20, PRP31, RING2,
CC RUVB1/TIP49A, RUVB2/TIP49B, SENP3, TAF1, TAF4, TAF6, TAF7, TAF9 and
CC TEX10. Interacts (via its NLS) with CTNNBL1. Interacts with USH1G (By
CC similarity). {ECO:0000250|UniProtKB:Q8WWY3}.
CC -!- SUBCELLULAR LOCATION: Nucleus {ECO:0000250|UniProtKB:Q8WWY3}. Nucleus
CC speckle {ECO:0000250|UniProtKB:Q8WWY3}. Nucleus, Cajal body
CC {ECO:0000250|UniProtKB:Q8WWY3}. Note=Predominantly found in speckles
CC and in Cajal bodies. {ECO:0000250|UniProtKB:Q8WWY3}.
CC -!- ALTERNATIVE PRODUCTS:
CC Event=Alternative splicing; Named isoforms=3;
CC Name=1;
CC IsoId=Q8CCF0-1; Sequence=Displayed;
CC Name=2;
CC IsoId=Q8CCF0-2; Sequence=VSP_017591;
CC Name=4;
CC IsoId=Q8CCF0-4; Sequence=VSP_017589, VSP_017590;
CC -!- DOMAIN: Interacts with the snRNP via the Nop domain.
CC {ECO:0000250|UniProtKB:Q8WWY3}.
CC -!- DOMAIN: The coiled coil domain is formed by two non-contiguous helices.
CC {ECO:0000250|UniProtKB:Q8WWY3}.
CC -!- MISCELLANEOUS: [Isoform 4]: May be produced at very low levels due to a
CC premature stop codon in the mRNA, leading to nonsense-mediated mRNA
CC decay. {ECO:0000305}.
CC -!- SIMILARITY: Belongs to the PRP31 family. {ECO:0000305}.
CC -!- SEQUENCE CAUTION:
CC Sequence=BAC25109.1; Type=Frameshift; Evidence={ECO:0000305};
CC Sequence=BAC31903.1; Type=Miscellaneous discrepancy; Note=Contaminating sequence. The C-terminus matches chromosome 19 region.; Evidence={ECO:0000305};
CC Sequence=BAC31931.1; Type=Erroneous translation; Note=Wrong choice of frame.; Evidence={ECO:0000305};
CC Sequence=BAC34578.1; Type=Frameshift; Evidence={ECO:0000305};
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR EMBL; AY040823; AAK77987.1; -; mRNA.
DR EMBL; AK005294; BAC25109.1; ALT_FRAME; mRNA.
DR EMBL; AK033190; BAC28192.1; -; mRNA.
DR EMBL; AK033283; BAC28220.1; -; mRNA.
DR EMBL; AK044398; BAC31903.1; ALT_SEQ; mRNA.
DR EMBL; AK044457; BAC31931.1; ALT_SEQ; mRNA.
DR EMBL; AK051260; BAC34578.1; ALT_FRAME; mRNA.
DR EMBL; CT010189; CAJ18397.1; -; mRNA.
DR EMBL; AC130680; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR EMBL; BC018376; AAH18376.1; -; mRNA.
DR EMBL; BC057877; AAH57877.1; -; mRNA.
DR EMBL; BC061461; AAH61461.1; -; mRNA.
DR CCDS; CCDS39729.1; -. [Q8CCF0-1]
DR CCDS; CCDS51965.1; -. [Q8CCF0-2]
DR RefSeq; NP_001153186.1; NM_001159714.1. [Q8CCF0-2]
DR RefSeq; NP_081604.3; NM_027328.4. [Q8CCF0-1]
DR AlphaFoldDB; Q8CCF0; -.
DR SMR; Q8CCF0; -.
DR BioGRID; 213160; 16.
DR IntAct; Q8CCF0; 2.
DR STRING; 10090.ENSMUSP00000008517; -.
DR iPTMnet; Q8CCF0; -.
DR PhosphoSitePlus; Q8CCF0; -.
DR EPD; Q8CCF0; -.
DR jPOST; Q8CCF0; -.
DR MaxQB; Q8CCF0; -.
DR PaxDb; Q8CCF0; -.
DR PeptideAtlas; Q8CCF0; -.
DR PRIDE; Q8CCF0; -.
DR ProteomicsDB; 291893; -. [Q8CCF0-1]
DR ProteomicsDB; 291894; -. [Q8CCF0-2]
DR Antibodypedia; 32797; 279 antibodies from 31 providers.
DR DNASU; 68988; -.
DR Ensembl; ENSMUST00000008517; ENSMUSP00000008517; ENSMUSG00000008373. [Q8CCF0-1]
DR Ensembl; ENSMUST00000108636; ENSMUSP00000104276; ENSMUSG00000008373. [Q8CCF0-2]
DR Ensembl; ENSMUST00000125782; ENSMUSP00000146017; ENSMUSG00000008373. [Q8CCF0-4]
DR Ensembl; ENSMUST00000179769; ENSMUSP00000136031; ENSMUSG00000008373. [Q8CCF0-2]
DR GeneID; 68988; -.
DR KEGG; mmu:68988; -.
DR UCSC; uc009evi.2; mouse. [Q8CCF0-1]
DR UCSC; uc012ewf.1; mouse. [Q8CCF0-2]
DR CTD; 26121; -.
DR MGI; MGI:1916238; Prpf31.
DR VEuPathDB; HostDB:ENSMUSG00000008373; -.
DR eggNOG; KOG2574; Eukaryota.
DR GeneTree; ENSGT00550000075069; -.
DR HOGENOM; CLU_026337_2_0_1; -.
DR InParanoid; Q8CCF0; -.
DR OMA; IIGNGPM; -.
DR OrthoDB; 791296at2759; -.
DR PhylomeDB; Q8CCF0; -.
DR TreeFam; TF300677; -.
DR Reactome; R-MMU-72163; mRNA Splicing - Major Pathway.
DR BioGRID-ORCS; 68988; 25 hits in 74 CRISPR screens.
DR PRO; PR:Q8CCF0; -.
DR Proteomes; UP000000589; Chromosome 7.
DR RNAct; Q8CCF0; protein.
DR Bgee; ENSMUSG00000008373; Expressed in primitive streak and 262 other tissues.
DR Genevisible; Q8CCF0; MM.
DR GO; GO:0015030; C:Cajal body; ISO:MGI.
DR GO; GO:0071339; C:MLL1 complex; ISS:UniProtKB.
DR GO; GO:0016607; C:nuclear speck; ISO:MGI.
DR GO; GO:0005654; C:nucleoplasm; ISO:MGI.
DR GO; GO:0005634; C:nucleus; ISS:UniProtKB.
DR GO; GO:0071011; C:precatalytic spliceosome; IBA:GO_Central.
DR GO; GO:0097526; C:spliceosomal tri-snRNP complex; IBA:GO_Central.
DR GO; GO:0071005; C:U2-type precatalytic spliceosome; ISS:UniProtKB.
DR GO; GO:0005687; C:U4 snRNP; ISO:MGI.
DR GO; GO:0046540; C:U4/U6 x U5 tri-snRNP complex; ISS:UniProtKB.
DR GO; GO:0005690; C:U4atac snRNP; ISS:UniProtKB.
DR GO; GO:0042802; F:identical protein binding; ISO:MGI.
DR GO; GO:0043021; F:ribonucleoprotein complex binding; ISO:MGI.
DR GO; GO:0070990; F:snRNP binding; ISO:MGI.
DR GO; GO:0030621; F:U4 snRNA binding; ISO:MGI.
DR GO; GO:0030622; F:U4atac snRNA binding; ISS:UniProtKB.
DR GO; GO:0000398; P:mRNA splicing, via spliceosome; ISS:UniProtKB.
DR GO; GO:0071166; P:ribonucleoprotein complex localization; ISO:MGI.
DR GO; GO:0000244; P:spliceosomal tri-snRNP complex assembly; ISO:MGI.
DR Gene3D; 1.10.246.90; -; 1.
DR InterPro; IPR042239; Nop_C.
DR InterPro; IPR002687; Nop_dom.
DR InterPro; IPR036070; Nop_dom_sf.
DR InterPro; IPR012976; NOSIC.
DR InterPro; IPR027105; Prp31.
DR InterPro; IPR019175; Prp31_C.
DR PANTHER; PTHR13904; PTHR13904; 1.
DR Pfam; PF01798; Nop; 1.
DR Pfam; PF09785; Prp31_C; 1.
DR SMART; SM00931; NOSIC; 1.
DR SUPFAM; SSF89124; SSF89124; 1.
DR PROSITE; PS51358; NOP; 1.
PE 1: Evidence at protein level;
KW Acetylation; Alternative splicing; Coiled coil; Isopeptide bond;
KW mRNA processing; mRNA splicing; Nucleus; Phosphoprotein;
KW Reference proteome; Ribonucleoprotein; RNA-binding; Spliceosome;
KW Ubl conjugation.
FT CHAIN 1..499
FT /note="U4/U6 small nuclear ribonucleoprotein Prp31"
FT /id="PRO_0000227800"
FT DOMAIN 215..333
FT /note="Nop"
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00690"
FT REGION 1..43
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 334..357
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COILED 85..120
FT /evidence="ECO:0000250|UniProtKB:Q8WWY3"
FT COILED 181..215
FT /evidence="ECO:0000250|UniProtKB:Q8WWY3"
FT MOTIF 351..364
FT /note="Nuclear localization signal (NLS)"
FT /evidence="ECO:0000250|UniProtKB:Q8WWY3"
FT COMPBIAS 9..40
FT /note="Acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT SITE 247
FT /note="Interaction with U4 snRNA"
FT /evidence="ECO:0000250|UniProtKB:Q8WWY3"
FT SITE 270
FT /note="Interaction with U4 snRNA and U4atac snRNA"
FT /evidence="ECO:0000250|UniProtKB:Q8WWY3"
FT SITE 289
FT /note="Interaction with U4atac snRNA"
FT /evidence="ECO:0000250|UniProtKB:Q8WWY3"
FT SITE 293
FT /note="Interaction with U4 snRNA and U4atac snRNA"
FT /evidence="ECO:0000250|UniProtKB:Q8WWY3"
FT SITE 298
FT /note="Interaction with U4 snRNA and U4atac snRNA"
FT /evidence="ECO:0000250|UniProtKB:Q8WWY3"
FT MOD_RES 379
FT /note="Phosphoserine"
FT /evidence="ECO:0000250|UniProtKB:Q8WWY3"
FT MOD_RES 395
FT /note="Phosphoserine"
FT /evidence="ECO:0000250|UniProtKB:Q8WWY3"
FT MOD_RES 432
FT /note="Phosphoserine"
FT /evidence="ECO:0000250|UniProtKB:Q8WWY3"
FT MOD_RES 438
FT /note="N6-acetyllysine"
FT /evidence="ECO:0007744|PubMed:23806337"
FT MOD_RES 439
FT /note="Phosphoserine"
FT /evidence="ECO:0000250|UniProtKB:Q8WWY3"
FT MOD_RES 440
FT /note="Phosphothreonine"
FT /evidence="ECO:0000250|UniProtKB:Q8WWY3"
FT MOD_RES 450
FT /note="Phosphoserine"
FT /evidence="ECO:0000250|UniProtKB:Q8WWY3"
FT MOD_RES 455
FT /note="Phosphothreonine"
FT /evidence="ECO:0007744|PubMed:21183079"
FT CROSSLNK 471
FT /note="Glycyl lysine isopeptide (Lys-Gly) (interchain with
FT G-Cter in SUMO2)"
FT /evidence="ECO:0000250|UniProtKB:Q8WWY3"
FT CROSSLNK 478
FT /note="Glycyl lysine isopeptide (Lys-Gly) (interchain with
FT G-Cter in SUMO2)"
FT /evidence="ECO:0000250|UniProtKB:Q8WWY3"
FT VAR_SEQ 60..65
FT /note="FAEIMM -> VSLLRS (in isoform 4)"
FT /evidence="ECO:0000303|PubMed:16141072"
FT /id="VSP_017589"
FT VAR_SEQ 66..499
FT /note="Missing (in isoform 4)"
FT /evidence="ECO:0000303|PubMed:16141072"
FT /id="VSP_017590"
FT VAR_SEQ 316..321
FT /note="Missing (in isoform 2)"
FT /evidence="ECO:0000303|PubMed:15489334"
FT /id="VSP_017591"
FT CONFLICT 77
FT /note="V -> A (in Ref. 1; AAK77987, 2; BAC31931, 3;
FT CAJ18397 and 5; AAH18376/AAH57877)"
FT /evidence="ECO:0000305"
FT CONFLICT 104
FT /note="E -> V (in Ref. 2; BAC28192)"
FT /evidence="ECO:0000305"
FT CONFLICT 177
FT /note="Q -> R (in Ref. 2; BAC28220/BAC28192)"
FT /evidence="ECO:0000305"
FT CONFLICT 382
FT /note="E -> G (in Ref. 2; BAC25109)"
FT /evidence="ECO:0000305"
FT CONFLICT 481
FT /note="S -> F (in Ref. 5; AAH61461)"
FT /evidence="ECO:0000305"
SQ SEQUENCE 499 AA; 55430 MW; A8149257A6213D4F CRC64;
MSLADELLAD LEEAAEEEEG GSYGEEEEEP AIEDVQEETQ LDLSGDSVKS IAKLWDSKMF
AEIMMKIEEY ISKQANVSEV MGPVEAAPEY RVIVDANNLT VEIENELNII HKFIRDKYSK
RFPELESLVP NALDYIRTVK ELGNSLDKCK NNENLQQILT NATIMVVSVT ASTTQGQQLS
DEELERLEEA CDMALELNAS KHRIYEYVES RMSFIAPNLS IIIGASTAAK IMGVAGGLTN
LSKMPACNIM LLGAQRKTLS GFSSTSVLPH TGYIYHSDIV QSLPPDLRRK AARLVAAKCT
LAARVDSFHE STEGKVGYEL KDEIERKFDK WQEPPPVKQV KPLPAPLDGQ RKKRGGRRYR
KMKERLGLTE IRKQANRMSF GEIEEDAYQE DLGFSLGHLG KSGSGRVRQT QVNEATKARI
SKTLQRTLQK QSVVYGGKST IRDRSSGTAS SVAFTPLQGL EIVNPQAAEK KVAEANQKYF
SSMAEFLKVK GEKSGTMST
//