NSD1_HUMAN
ID NSD1_HUMAN Reviewed; 2696 AA.
AC Q96L73; Q96PD8; Q96RN7;
DT 03-JUL-2003, integrated into UniProtKB/Swiss-Prot.
DT 01-DEC-2001, sequence version 1.
DT 03-AUG-2022, entry version 198.
DE RecName: Full=Histone-lysine N-methyltransferase, H3 lysine-36 specific;
DE EC=2.1.1.357 {ECO:0000269|PubMed:21196496};
DE AltName: Full=Androgen receptor coactivator 267 kDa protein;
DE AltName: Full=Androgen receptor-associated protein of 267 kDa;
DE AltName: Full=H3-K36-HMTase;
DE AltName: Full=Lysine N-methyltransferase 3B;
DE AltName: Full=Nuclear receptor-binding SET domain-containing protein 1;
DE Short=NR-binding SET domain-containing protein;
GN Name=NSD1; Synonyms=ARA267, KMT3B;
OS Homo sapiens (Human).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia;
OC Eutheria; Euarchontoglires; Primates; Haplorrhini; Catarrhini; Hominidae;
OC Homo.
OX NCBI_TaxID=9606;
RN [1]
RP NUCLEOTIDE SEQUENCE [MRNA] (ISOFORMS 1 AND 2), AND INTERACTION WITH AR.
RX PubMed=11509567; DOI=10.1074/jbc.m104765200;
RA Wang X., Yeh S., Wu G., Hsu C.-L., Wang L., Chang T., Yang Y., Guo Y.,
RA Chang C.;
RT "Identification and characterization of a novel androgen receptor
RT coregulator ARA267-alpha in prostate cancer cells.";
RL J. Biol. Chem. 276:40417-40423(2001).
RN [2]
RP NUCLEOTIDE SEQUENCE [MRNA] (ISOFORM 1).
RX PubMed=11733144; DOI=10.1016/s0378-1119(01)00750-8;
RA Kurotaki N., Harada N., Yoshiura K., Sugano S., Niikawa N., Matsumoto N.;
RT "Molecular characterization of NSD1, a human homologue of the mouse Nsd1
RT gene.";
RL Gene 279:197-204(2001).
RN [3]
RP NUCLEOTIDE SEQUENCE [MRNA] (ISOFORM 3), AND CHROMOSOMAL TRANSLOCATION WITH
RP NUP98.
RX PubMed=11493482; DOI=10.1182/blood.v98.4.1264;
RA Jaju R.J., Fidler C., Haas O.A., Strickson A.J., Watkins F., Clark K.,
RA Cross N.C., Cheng J.F., Aplan P.D., Kearney L., Boultwood J.,
RA Wainscoat J.S.;
RT "A novel gene, NSD1, is fused to NUP98 in the t(5;11)(q35;p15.5) in de novo
RT childhood acute myeloid leukemia.";
RL Blood 98:1264-1267(2001).
RN [4]
RP INVOLVEMENT IN SOTOS.
RX PubMed=11896389; DOI=10.1038/ng863;
RA Kurotaki N., Imaizumi K., Harada N., Masuno M., Kondoh T., Nagai T.,
RA Ohashi H., Naritomi K., Tsukahara M., Makita Y., Sugimoto T., Sonoda T.,
RA Hasegawa T., Chinen Y., Tomita Ha H.A., Kinoshita A., Mizuguchi T.,
RA Yoshiura Ki K., Ohta T., Kishino T., Fukushima Y., Niikawa N.,
RA Matsumoto N.;
RT "Haploinsufficiency of NSD1 causes Sotos syndrome.";
RL Nat. Genet. 30:365-366(2002).
RN [5]
RP INVOLVEMENT IN SOTOS AND BWS.
RX PubMed=14997421; DOI=10.1086/383093;
RA Baujat G., Rio M., Rossignol S., Sanlaville D., Lyonnet S., Le Merrer M.,
RA Munnich A., Gicquel C., Cormier-Daire V., Colleaux L.;
RT "Paradoxical NSD1 mutations in Beckwith-Wiedemann syndrome and 11p15
RT anomalies in Sotos syndrome.";
RL Am. J. Hum. Genet. 74:715-720(2004).
RN [6]
RP INVOLVEMENT IN MYELODYSPLASTIC SYNDROME.
RX PubMed=15382262; DOI=10.1002/gcc.20103;
RA La Starza R., Gorello P., Rosati R., Riezzo A., Veronese A., Ferrazzi E.,
RA Martelli M.F., Negrini M., Mecucci C.;
RT "Cryptic insertion producing two NUP98/NSD1 chimeric transcripts in adult
RT refractory anemia with an excess of blasts.";
RL Genes Chromosomes Cancer 41:395-399(2004).
RN [7]
RP PHOSPHORYLATION [LARGE SCALE ANALYSIS] AT SER-766, AND IDENTIFICATION BY
RP MASS SPECTROMETRY [LARGE SCALE ANALYSIS].
RC TISSUE=Prostate cancer;
RX PubMed=17487921; DOI=10.1002/elps.200600782;
RA Giorgianni F., Zhao Y., Desiderio D.M., Beranova-Giorgianni S.;
RT "Toward a global characterization of the phosphoproteome in prostate cancer
RT cells: identification of phosphoproteins in the LNCaP cell line.";
RL Electrophoresis 28:2027-2034(2007).
RN [8]
RP PHOSPHORYLATION [LARGE SCALE ANALYSIS] AT SER-2471, AND IDENTIFICATION BY
RP MASS SPECTROMETRY [LARGE SCALE ANALYSIS].
RC TISSUE=Cervix carcinoma;
RX PubMed=18669648; DOI=10.1073/pnas.0805139105;
RA Dephoure N., Zhou C., Villen J., Beausoleil S.A., Bakalarski C.E.,
RA Elledge S.J., Gygi S.P.;
RT "A quantitative atlas of mitotic phosphorylation.";
RL Proc. Natl. Acad. Sci. U.S.A. 105:10762-10767(2008).
RN [9]
RP IDENTIFICATION BY MASS SPECTROMETRY [LARGE SCALE ANALYSIS].
RX PubMed=19413330; DOI=10.1021/ac9004309;
RA Gauci S., Helbig A.O., Slijper M., Krijgsveld J., Heck A.J., Mohammed S.;
RT "Lys-N and trypsin cover complementary parts of the phosphoproteome in a
RT refined SCX-based approach.";
RL Anal. Chem. 81:4493-4501(2009).
RN [10]
RP IDENTIFICATION BY MASS SPECTROMETRY [LARGE SCALE ANALYSIS].
RX PubMed=19369195; DOI=10.1074/mcp.m800588-mcp200;
RA Oppermann F.S., Gnad F., Olsen J.V., Hornberger R., Greff Z., Keri G.,
RA Mann M., Daub H.;
RT "Large-scale proteomics analysis of the human kinome.";
RL Mol. Cell. Proteomics 8:1751-1764(2009).
RN [11]
RP PHOSPHORYLATION [LARGE SCALE ANALYSIS] AT THR-2462, AND IDENTIFICATION BY
RP MASS SPECTROMETRY [LARGE SCALE ANALYSIS].
RC TISSUE=Cervix carcinoma;
RX PubMed=20068231; DOI=10.1126/scisignal.2000475;
RA Olsen J.V., Vermeulen M., Santamaria A., Kumar C., Miller M.L.,
RA Jensen L.J., Gnad F., Cox J., Jensen T.S., Nigg E.A., Brunak S., Mann M.;
RT "Quantitative phosphoproteomics reveals widespread full phosphorylation
RT site occupancy during mitosis.";
RL Sci. Signal. 3:RA3-RA3(2010).
RN [12]
RP PHOSPHORYLATION [LARGE SCALE ANALYSIS] AT SER-483 AND SER-486, AND
RP IDENTIFICATION BY MASS SPECTROMETRY [LARGE SCALE ANALYSIS].
RX PubMed=21406692; DOI=10.1126/scisignal.2001570;
RA Rigbolt K.T., Prokhorova T.A., Akimov V., Henningsen J., Johansen P.T.,
RA Kratchmarova I., Kassem M., Mann M., Olsen J.V., Blagoev B.;
RT "System-wide temporal characterization of the proteome and phosphoproteome
RT of human embryonic stem cell differentiation.";
RL Sci. Signal. 4:RS3-RS3(2011).
RN [13]
RP PHOSPHORYLATION [LARGE SCALE ANALYSIS] AT SER-483; SER-486; SER-2369 AND
RP SER-2471, AND IDENTIFICATION BY MASS SPECTROMETRY [LARGE SCALE ANALYSIS].
RC TISSUE=Cervix carcinoma, and Erythroleukemia;
RX PubMed=23186163; DOI=10.1021/pr300630k;
RA Zhou H., Di Palma S., Preisinger C., Peng M., Polat A.N., Heck A.J.,
RA Mohammed S.;
RT "Toward a comprehensive characterization of a human cancer cell
RT phosphoproteome.";
RL J. Proteome Res. 12:260-271(2013).
RN [14]
RP SUMOYLATION [LARGE SCALE ANALYSIS] AT LYS-2616, AND IDENTIFICATION BY MASS
RP SPECTROMETRY [LARGE SCALE ANALYSIS].
RX PubMed=25218447; DOI=10.1038/nsmb.2890;
RA Hendriks I.A., D'Souza R.C., Yang B., Verlaan-de Vries M., Mann M.,
RA Vertegaal A.C.;
RT "Uncovering global SUMOylation signaling networks in a site-specific
RT manner.";
RL Nat. Struct. Mol. Biol. 21:927-936(2014).
RN [15]
RP SUMOYLATION [LARGE SCALE ANALYSIS] AT LYS-906 AND LYS-1339, AND
RP IDENTIFICATION BY MASS SPECTROMETRY [LARGE SCALE ANALYSIS].
RX PubMed=28112733; DOI=10.1038/nsmb.3366;
RA Hendriks I.A., Lyon D., Young C., Jensen L.J., Vertegaal A.C.,
RA Nielsen M.L.;
RT "Site-specific mapping of the human SUMO proteome reveals co-modification
RT with phosphorylation.";
RL Nat. Struct. Mol. Biol. 24:325-336(2017).
RN [16]
RP X-RAY CRYSTALLOGRAPHY (1.75 ANGSTROMS) OF 1852-2082 IN COMPLEX WITH
RP S-ADENOSYL-L-METHIONINE AND ZINC IONS, FUNCTION, CATALYTIC ACTIVITY,
RP MUTAGENESIS OF ARG-1914 AND ARG-1952, AND CHARACTERIZATION OF SOTOS
RP VARIANTS GLN-1984; GLN-2005 AND GLN-2017.
RX PubMed=21196496; DOI=10.1074/jbc.m110.204115;
RA Qiao Q., Li Y., Chen Z., Wang M., Reinberg D., Xu R.M.;
RT "The structure of NSD1 reveals an autoregulatory mechanism underlying
RT histone H3K36 methylation.";
RL J. Biol. Chem. 286:8361-8368(2011).
RN [17]
RP VARIANTS SOTOS LEU-1616; PRO-1637; TRP-1674; VAL-1792; ARG-1925; GLN-2005;
RP GLN-2017; GLN-2143 AND SER-2183, AND VARIANTS LEU-614; THR-691; PRO-726;
RP PRO-1036; ILE-1091; ILE-2250 AND THR-2261.
RX PubMed=12464997; DOI=10.1086/345647;
RA Douglas J., Hanks S., Temple I.K., Davies S., Murray A., Upadhyaya M.,
RA Tomkins S., Hughes H.E., Cole T.R.P., Rahman N.;
RT "NSD1 mutations are the major cause of Sotos syndrome and occur in some
RT cases of Weaver syndrome but are rare in other overgrowth phenotypes.";
RL Am. J. Hum. Genet. 72:132-143(2003).
RN [18]
RP VARIANTS SOTOS ASN-1687; ASP-1955; GLN-1984; CYS-1997 AND TRP-2017.
RX PubMed=12807965; DOI=10.1136/jmg.40.6.436;
RA Rio M., Clech L., Amiel J., Faivre L., Lyonnet S., Le Merrer M., Odent S.,
RA Lacombe D., Edery P., Brauner R., Raoul O., Gosset P., Prieur M.,
RA Vekemans M., Munnich A., Colleaux L., Cormier-Daire V.;
RT "Spectrum of NSD1 mutations in Sotos and Weaver syndromes.";
RL J. Med. Genet. 40:436-440(2003).
RN [19]
RP VARIANT [LARGE SCALE ANALYSIS] PRO-726.
RX PubMed=18987736; DOI=10.1038/nature07485;
RA Ley T.J., Mardis E.R., Ding L., Fulton B., McLellan M.D., Chen K.,
RA Dooling D., Dunford-Shore B.H., McGrath S., Hickenbotham M., Cook L.,
RA Abbott R., Larson D.E., Koboldt D.C., Pohl C., Smith S., Hawkins A.,
RA Abbott S., Locke D., Hillier L.W., Miner T., Fulton L., Magrini V.,
RA Wylie T., Glasscock J., Conyers J., Sander N., Shi X., Osborne J.R.,
RA Minx P., Gordon D., Chinwalla A., Zhao Y., Ries R.E., Payton J.E.,
RA Westervelt P., Tomasson M.H., Watson M., Baty J., Ivanovich J., Heath S.,
RA Shannon W.D., Nagarajan R., Walter M.J., Link D.C., Graubert T.A.,
RA DiPersio J.F., Wilson R.K.;
RT "DNA sequencing of a cytogenetically normal acute myeloid leukaemia
RT genome.";
RL Nature 456:66-72(2008).
CC -!- FUNCTION: Histone methyltransferase that dimethylates Lys-36 of histone
CC H3 (H3K36me2). Transcriptional intermediary factor capable of
CC influencing transcription either negatively or positively, depending on
CC the cellular context. {ECO:0000269|PubMed:21196496}.
CC -!- CATALYTIC ACTIVITY:
CC Reaction=L-lysyl(36)-[histone H3] + 2 S-adenosyl-L-methionine = 2 H(+)
CC + N(6),N(6)-dimethyl-L-lysyl(36)-[histone H3] + 2 S-adenosyl-L-
CC homocysteine; Xref=Rhea:RHEA:60308, Rhea:RHEA-COMP:9785, Rhea:RHEA-
CC COMP:9787, ChEBI:CHEBI:15378, ChEBI:CHEBI:29969, ChEBI:CHEBI:57856,
CC ChEBI:CHEBI:59789, ChEBI:CHEBI:61976; EC=2.1.1.357;
CC Evidence={ECO:0000269|PubMed:21196496};
CC -!- SUBUNIT: Interacts with the ligand-binding domains of RARA and THRA in
CC the absence of ligand; in the presence of ligand the interaction is
CC severely disrupted but some binding still occurs. Interacts with the
CC ligand-binding domains of RXRA and ESRRA only in the presence of
CC ligand. Interacts with ZNF496 (By similarity). Interacts with AR
CC DNA- and ligand-binding domains. {ECO:0000250,
CC ECO:0000269|PubMed:11509567, ECO:0000269|PubMed:21196496}.
CC -!- INTERACTION:
CC Q96L73; Q04206: RELA; NbExp=2; IntAct=EBI-2862434, EBI-73886;
CC Q96L73-2; O95994: AGR2; NbExp=3; IntAct=EBI-11110981, EBI-712648;
CC Q96L73-2; Q86Z20: CCDC125; NbExp=3; IntAct=EBI-11110981, EBI-11977221;
CC Q96L73-2; A8MQ03: CYSRT1; NbExp=3; IntAct=EBI-11110981, EBI-3867333;
CC Q96L73-2; Q3LI66: KRTAP6-2; NbExp=3; IntAct=EBI-11110981, EBI-11962084;
CC Q96L73-2; Q99750: MDFI; NbExp=3; IntAct=EBI-11110981, EBI-724076;
CC -!- SUBCELLULAR LOCATION: Nucleus. Chromosome {ECO:0000305}.
CC -!- ALTERNATIVE PRODUCTS:
CC Event=Alternative splicing; Named isoforms=3;
CC Name=1; Synonyms=ARA267-beta;
CC IsoId=Q96L73-1; Sequence=Displayed;
CC Name=2; Synonyms=ARA267-alpha;
CC IsoId=Q96L73-2; Sequence=VSP_007682, VSP_007683;
CC Name=3;
CC IsoId=Q96L73-3; Sequence=VSP_007684;
CC -!- TISSUE SPECIFICITY: Expressed in the fetal/adult brain, kidney,
CC skeletal muscle, spleen, and the thymus, and faintly in the lung.
CC -!- DISEASE: Sotos syndrome (SOTOS) [MIM:117550]: An autosomal dominant,
CC childhood overgrowth syndrome characterized by pre- and postnatal
CC overgrowth, developmental delay, intellectual disability, advanced bone
CC age, and abnormal craniofacial morphology including macrodolichocephaly
CC with frontal bossing, frontoparietal sparseness of hair, apparent
CC hypertelorism, downslanting palpebral fissures, and facial flushing.
CC Common oral findings include: premature eruption of teeth; high, arched
CC palate; pointed chin and, more rarely, prognathism.
CC {ECO:0000269|PubMed:11896389, ECO:0000269|PubMed:12464997,
CC ECO:0000269|PubMed:12807965, ECO:0000269|PubMed:14997421}. Note=The
CC disease is caused by variants affecting the gene represented in this
CC entry.
CC -!- DISEASE: Beckwith-Wiedemann syndrome (BWS) [MIM:130650]: A disorder
CC characterized by anterior abdominal wall defects including exomphalos
CC (omphalocele), pre- and postnatal overgrowth, and macroglossia.
CC Additional less frequent complications include specific developmental
CC defects and a predisposition to embryonal tumors.
CC {ECO:0000269|PubMed:14997421}. Note=The disease is caused by variants
CC affecting the gene represented in this entry.
CC -!- DISEASE: Note=A chromosomal aberration involving NSD1 is found in
CC childhood acute myeloid leukemia. Translocation t(5;11)(q35;p15.5) with
CC NUP98.
CC -!- DISEASE: Note=A chromosomal aberration involving NSD1 is found in an
CC adult form of myelodysplastic syndrome (MDS). Insertion of NUP98 into
CC NSD1 generates a NUP98-NSD1 fusion product.
CC {ECO:0000269|PubMed:15382262}.
CC -!- SIMILARITY: Belongs to the class V-like SAM-binding methyltransferase
CC superfamily. {ECO:0000255|PROSITE-ProRule:PRU00190}.
CC -!- WEB RESOURCE: Name=Atlas of Genetics and Cytogenetics in Oncology and
CC Haematology;
CC URL="http://atlasgeneticsoncology.org/Genes/NSD1ID356.html";
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR EMBL; AF380302; AAL27991.1; -; mRNA.
DR EMBL; AY049721; AAL06645.1; -; mRNA.
DR EMBL; AF395588; AAL40694.1; -; mRNA.
DR EMBL; AF322907; AAK92049.1; -; mRNA.
DR CCDS; CCDS4412.1; -. [Q96L73-1]
DR CCDS; CCDS4413.1; -. [Q96L73-2]
DR RefSeq; NP_071900.2; NM_022455.4. [Q96L73-1]
DR RefSeq; NP_758859.1; NM_172349.2. [Q96L73-2]
DR PDB; 3OOI; X-ray; 1.75 A; A=1852-2082.
DR PDB; 6KQP; X-ray; 2.40 A; A=1864-2083.
DR PDB; 6KQQ; X-ray; 1.80 A; A/B=1863-2085.
DR PDBsum; 3OOI; -.
DR PDBsum; 6KQP; -.
DR PDBsum; 6KQQ; -.
DR AlphaFoldDB; Q96L73; -.
DR SMR; Q96L73; -.
DR BioGRID; 122135; 94.
DR DIP; DIP-58517N; -.
DR IntAct; Q96L73; 24.
DR STRING; 9606.ENSP00000395929; -.
DR BindingDB; Q96L73; -.
DR ChEMBL; CHEMBL3588738; -.
DR MoonDB; Q96L73; Predicted.
DR iPTMnet; Q96L73; -.
DR PhosphoSitePlus; Q96L73; -.
DR BioMuta; NSD1; -.
DR DMDM; 32469769; -.
DR EPD; Q96L73; -.
DR jPOST; Q96L73; -.
DR MassIVE; Q96L73; -.
DR MaxQB; Q96L73; -.
DR PaxDb; Q96L73; -.
DR PeptideAtlas; Q96L73; -.
DR PRIDE; Q96L73; -.
DR ProteomicsDB; 77153; -. [Q96L73-1]
DR ProteomicsDB; 77154; -. [Q96L73-2]
DR ProteomicsDB; 77155; -. [Q96L73-3]
DR ABCD; Q96L73; 1 sequenced antibody.
DR Antibodypedia; 29208; 187 antibodies from 27 providers.
DR DNASU; 64324; -.
DR Ensembl; ENST00000439151.7; ENSP00000395929.2; ENSG00000165671.22. [Q96L73-1]
DR Ensembl; ENST00000687453.1; ENSP00000508426.1; ENSG00000165671.22. [Q96L73-3]
DR GeneID; 64324; -.
DR KEGG; hsa:64324; -.
DR MANE-Select; ENST00000439151.7; ENSP00000395929.2; NM_022455.5; NP_071900.2.
DR UCSC; uc003mfr.5; human. [Q96L73-1]
DR CTD; 64324; -.
DR DisGeNET; 64324; -.
DR GeneCards; NSD1; -.
DR GeneReviews; NSD1; -.
DR HGNC; HGNC:14234; NSD1.
DR HPA; ENSG00000165671; Low tissue specificity.
DR MalaCards; NSD1; -.
DR MIM; 117550; phenotype.
DR MIM; 130650; phenotype.
DR MIM; 606681; gene.
DR neXtProt; NX_Q96L73; -.
DR OpenTargets; ENSG00000165671; -.
DR Orphanet; 228415; 5q35 microduplication syndrome.
DR Orphanet; 238613; Beckwith-Wiedemann syndrome due to NSD1 mutation.
DR Orphanet; 1627; Deletion 5q35.
DR Orphanet; 821; Sotos syndrome.
DR Orphanet; 3447; Weaver syndrome.
DR PharmGKB; PA31790; -.
DR VEuPathDB; HostDB:ENSG00000165671; -.
DR eggNOG; KOG1081; Eukaryota.
DR GeneTree; ENSGT00940000155027; -.
DR HOGENOM; CLU_000756_0_0_1; -.
DR InParanoid; Q96L73; -.
DR OMA; LTGTCQR; -.
DR OrthoDB; 507784at2759; -.
DR PhylomeDB; Q96L73; -.
DR TreeFam; TF329088; -.
DR BioCyc; MetaCyc:HS09264-MON; -.
DR BRENDA; 2.1.1.357; 2681.
DR BRENDA; 2.1.1.362; 2681.
DR PathwayCommons; Q96L73; -.
DR Reactome; R-HSA-3214841; PKMTs methylate histone lysines.
DR SignaLink; Q96L73; -.
DR SIGNOR; Q96L73; -.
DR BioGRID-ORCS; 64324; 99 hits in 1098 CRISPR screens.
DR ChiTaRS; NSD1; human.
DR GenomeRNAi; 64324; -.
DR Pharos; Q96L73; Tbio.
DR PRO; PR:Q96L73; -.
DR Proteomes; UP000005640; Chromosome 5.
DR RNAct; Q96L73; protein.
DR Bgee; ENSG00000165671; Expressed in sural nerve and 159 other tissues.
DR ExpressionAtlas; Q96L73; baseline and differential.
DR Genevisible; Q96L73; HS.
DR GO; GO:0000785; C:chromatin; IBA:GO_Central.
DR GO; GO:0005654; C:nucleoplasm; TAS:Reactome.
DR GO; GO:0005634; C:nucleus; IBA:GO_Central.
DR GO; GO:0003682; F:chromatin binding; ISS:UniProtKB.
DR GO; GO:0046975; F:histone methyltransferase activity (H3-K36 specific); IDA:UniProtKB.
DR GO; GO:0042799; F:histone methyltransferase activity (H4-K20 specific); ISS:UniProtKB.
DR GO; GO:0018024; F:histone-lysine N-methyltransferase activity; TAS:Reactome.
DR GO; GO:0050681; F:nuclear androgen receptor binding; IDA:UniProtKB.
DR GO; GO:0030331; F:nuclear estrogen receptor binding; ISS:UniProtKB.
DR GO; GO:0042974; F:nuclear retinoic acid receptor binding; ISS:UniProtKB.
DR GO; GO:0046965; F:nuclear retinoid X receptor binding; ISS:UniProtKB.
DR GO; GO:0046966; F:nuclear thyroid hormone receptor binding; ISS:UniProtKB.
DR GO; GO:0000978; F:RNA polymerase II cis-regulatory region sequence-specific DNA binding; IDA:MGI.
DR GO; GO:0003712; F:transcription coregulator activity; IDA:UniProtKB.
DR GO; GO:0003714; F:transcription corepressor activity; ISS:UniProtKB.
DR GO; GO:0008270; F:zinc ion binding; IDA:UniProtKB.
DR GO; GO:0006325; P:chromatin organization; IEA:UniProtKB-KW.
DR GO; GO:0016571; P:histone methylation; ISS:UniProtKB.
DR GO; GO:0000122; P:negative regulation of transcription by RNA polymerase II; ISS:UniProtKB.
DR GO; GO:0045893; P:positive regulation of transcription, DNA-templated; IDA:UniProtKB.
DR GO; GO:0000414; P:regulation of histone H3-K36 methylation; IMP:MGI.
DR GO; GO:0033135; P:regulation of peptidyl-serine phosphorylation; IMP:MGI.
DR GO; GO:1903025; P:regulation of RNA polymerase II regulatory region sequence-specific DNA binding; IMP:MGI.
DR GO; GO:0006355; P:regulation of transcription, DNA-templated; IBA:GO_Central.
DR Gene3D; 2.170.270.10; -; 1.
DR Gene3D; 3.30.40.10; -; 4.
DR InterPro; IPR006560; AWS_dom.
DR InterPro; IPR041306; C5HCH.
DR InterPro; IPR003616; Post-SET_dom.
DR InterPro; IPR000313; PWWP_dom.
DR InterPro; IPR001214; SET_dom.
DR InterPro; IPR046341; SET_dom_sf.
DR InterPro; IPR019786; Zinc_finger_PHD-type_CS.
DR InterPro; IPR011011; Znf_FYVE_PHD.
DR InterPro; IPR001965; Znf_PHD.
DR InterPro; IPR019787; Znf_PHD-finger.
DR InterPro; IPR013083; Znf_RING/FYVE/PHD.
DR Pfam; PF17907; AWS; 1.
DR Pfam; PF17982; C5HCH; 1.
DR Pfam; PF00855; PWWP; 2.
DR Pfam; PF00856; SET; 1.
DR SMART; SM00570; AWS; 1.
DR SMART; SM00249; PHD; 5.
DR SMART; SM00508; PostSET; 1.
DR SMART; SM00293; PWWP; 2.
DR SMART; SM00317; SET; 1.
DR SUPFAM; SSF57903; SSF57903; 3.
DR SUPFAM; SSF82199; SSF82199; 1.
DR PROSITE; PS51215; AWS; 1.
DR PROSITE; PS50868; POST_SET; 1.
DR PROSITE; PS50812; PWWP; 2.
DR PROSITE; PS50280; SET; 1.
DR PROSITE; PS01359; ZF_PHD_1; 2.
DR PROSITE; PS50016; ZF_PHD_2; 2.
PE 1: Evidence at protein level;
KW 3D-structure; Activator; Alternative splicing; Chromatin regulator;
KW Chromosomal rearrangement; Chromosome; Disease variant; Isopeptide bond;
KW Metal-binding; Methyltransferase; Nucleus; Phosphoprotein; Proto-oncogene;
KW Reference proteome; Repeat; Repressor; S-adenosyl-L-methionine;
KW Transcription; Transcription regulation; Transferase; Ubl conjugation;
KW Zinc; Zinc-finger.
FT CHAIN 1..2696
FT /note="Histone-lysine N-methyltransferase, H3 lysine-36
FT specific"
FT /id="PRO_0000186070"
FT DOMAIN 323..388
FT /note="PWWP 1"
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00162"
FT DOMAIN 1756..1818
FT /note="PWWP 2"
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00162"
FT DOMAIN 1890..1940
FT /note="AWS"
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00562"
FT DOMAIN 1942..2059
FT /note="SET"
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00190"
FT DOMAIN 2066..2082
FT /note="Post-SET"
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00155"
FT ZN_FING 1543..1589
FT /note="PHD-type 1"
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00146"
FT ZN_FING 1590..1646
FT /note="PHD-type 2"
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00146"
FT ZN_FING 1707..1751
FT /note="PHD-type 3"
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00146"
FT ZN_FING 2118..2165
FT /note="PHD-type 4; atypical"
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00146"
FT REGION 207..252
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 281..311
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 487..514
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 872..891
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 936..1035
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 1067..1093
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 1112..1134
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 1243..1272
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 1294..1344
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 1382..1428
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 1480..1534
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 2060..2066
FT /note="Inhibits enzyme activity in the absence of bound
FT histone"
FT REGION 2091..2111
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 2213..2422
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 2464..2499
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 2553..2575
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 2595..2616
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 2665..2696
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 874..891
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 936..988
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1076..1090
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1294..1318
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1400..1414
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1482..1500
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1510..1528
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 2217..2232
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 2280..2294
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 2331..2349
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 2395..2409
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 2675..2690
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT BINDING 1952..1954
FT /ligand="S-adenosyl-L-methionine"
FT /ligand_id="ChEBI:CHEBI:59789"
FT BINDING 1994..1997
FT /ligand="S-adenosyl-L-methionine"
FT /ligand_id="ChEBI:CHEBI:59789"
FT BINDING 2020..2021
FT /ligand="S-adenosyl-L-methionine"
FT /ligand_id="ChEBI:CHEBI:59789"
FT BINDING 2065
FT /ligand="S-adenosyl-L-methionine"
FT /ligand_id="ChEBI:CHEBI:59789"
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00190,
FT ECO:0000269|PubMed:21196496"
FT BINDING 2071
FT /ligand="S-adenosyl-L-methionine"
FT /ligand_id="ChEBI:CHEBI:59789"
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00190,
FT ECO:0000269|PubMed:21196496"
FT MOD_RES 117
FT /note="Phosphoserine"
FT /evidence="ECO:0000250|UniProtKB:O88491"
FT MOD_RES 483
FT /note="Phosphoserine"
FT /evidence="ECO:0007744|PubMed:21406692,
FT ECO:0007744|PubMed:23186163"
FT MOD_RES 486
FT /note="Phosphoserine"
FT /evidence="ECO:0007744|PubMed:21406692,
FT ECO:0007744|PubMed:23186163"
FT MOD_RES 766
FT /note="Phosphoserine"
FT /evidence="ECO:0007744|PubMed:17487921"
FT MOD_RES 1510
FT /note="Phosphoserine"
FT /evidence="ECO:0000250|UniProtKB:O88491"
FT MOD_RES 2369
FT /note="Phosphoserine"
FT /evidence="ECO:0007744|PubMed:23186163"
FT MOD_RES 2462
FT /note="Phosphothreonine"
FT /evidence="ECO:0007744|PubMed:20068231"
FT MOD_RES 2471
FT /note="Phosphoserine"
FT /evidence="ECO:0007744|PubMed:18669648,
FT ECO:0007744|PubMed:23186163"
FT CROSSLNK 906
FT /note="Glycyl lysine isopeptide (Lys-Gly) (interchain with
FT G-Cter in SUMO2)"
FT /evidence="ECO:0007744|PubMed:28112733"
FT CROSSLNK 1339
FT /note="Glycyl lysine isopeptide (Lys-Gly) (interchain with
FT G-Cter in SUMO2)"
FT /evidence="ECO:0007744|PubMed:28112733"
FT CROSSLNK 2616
FT /note="Glycyl lysine isopeptide (Lys-Gly) (interchain with
FT G-Cter in SUMO2)"
FT /evidence="ECO:0007744|PubMed:25218447"
FT VAR_SEQ 1..269
FT /note="Missing (in isoform 2)"
FT /evidence="ECO:0000303|PubMed:11509567"
FT /id="VSP_007682"
FT VAR_SEQ 270..279
FT /note="QLNSINLSFQ -> MPLKTRTALS (in isoform 2)"
FT /evidence="ECO:0000303|PubMed:11509567"
FT /id="VSP_007683"
FT VAR_SEQ 310..412
FT /note="Missing (in isoform 3)"
FT /evidence="ECO:0000303|PubMed:11493482"
FT /id="VSP_007684"
FT VARIANT 614
FT /note="V -> L (in dbSNP:rs3733875)"
FT /evidence="ECO:0000269|PubMed:12464997"
FT /id="VAR_015775"
FT VARIANT 691
FT /note="A -> T (in dbSNP:rs28932177)"
FT /evidence="ECO:0000269|PubMed:12464997"
FT /id="VAR_015776"
FT VARIANT 726
FT /note="S -> P (in dbSNP:rs28932178)"
FT /evidence="ECO:0000269|PubMed:12464997,
FT ECO:0000269|PubMed:18987736"
FT /id="VAR_015777"
FT VARIANT 1036
FT /note="A -> P (in dbSNP:rs28932179)"
FT /evidence="ECO:0000269|PubMed:12464997"
FT /id="VAR_015778"
FT VARIANT 1091
FT /note="L -> I (in dbSNP:rs35597015)"
FT /evidence="ECO:0000269|PubMed:12464997"
FT /id="VAR_015779"
FT VARIANT 1616
FT /note="H -> L (in SOTOS)"
FT /evidence="ECO:0000269|PubMed:12464997"
FT /id="VAR_015780"
FT VARIANT 1637
FT /note="L -> P (in SOTOS)"
FT /evidence="ECO:0000269|PubMed:12464997"
FT /id="VAR_015781"
FT VARIANT 1674
FT /note="C -> W (in SOTOS)"
FT /evidence="ECO:0000269|PubMed:12464997"
FT /id="VAR_015782"
FT VARIANT 1687
FT /note="I -> N (in SOTOS)"
FT /evidence="ECO:0000269|PubMed:12807965"
FT /id="VAR_015783"
FT VARIANT 1792
FT /note="G -> V (in SOTOS)"
FT /evidence="ECO:0000269|PubMed:12464997"
FT /id="VAR_015784"
FT VARIANT 1925
FT /note="C -> R (in SOTOS)"
FT /evidence="ECO:0000269|PubMed:12464997"
FT /id="VAR_015785"
FT VARIANT 1955
FT /note="G -> D (in SOTOS)"
FT /evidence="ECO:0000269|PubMed:12807965"
FT /id="VAR_015786"
FT VARIANT 1984
FT /note="R -> Q (in SOTOS; loss of enzyme activity;
FT dbSNP:rs587784169)"
FT /evidence="ECO:0000269|PubMed:12807965,
FT ECO:0000269|PubMed:21196496"
FT /id="VAR_015787"
FT VARIANT 1997
FT /note="Y -> C (in SOTOS; dbSNP:rs797045825)"
FT /evidence="ECO:0000269|PubMed:12807965"
FT /id="VAR_015788"
FT VARIANT 2005
FT /note="R -> Q (in SOTOS; strongly reduced enzyme activity;
FT dbSNP:rs587784174)"
FT /evidence="ECO:0000269|PubMed:12464997,
FT ECO:0000269|PubMed:21196496"
FT /id="VAR_015789"
FT VARIANT 2017
FT /note="R -> Q (in SOTOS; loss of enzyme activity;
FT dbSNP:rs587784177)"
FT /evidence="ECO:0000269|PubMed:12464997,
FT ECO:0000269|PubMed:21196496"
FT /id="VAR_015790"
FT VARIANT 2017
FT /note="R -> W (in SOTOS; dbSNP:rs587784176)"
FT /evidence="ECO:0000269|PubMed:12807965"
FT /id="VAR_015791"
FT VARIANT 2143
FT /note="H -> Q (in SOTOS; dbSNP:rs121908068)"
FT /evidence="ECO:0000269|PubMed:12464997"
FT /id="VAR_015792"
FT VARIANT 2183
FT /note="C -> S (in SOTOS; dbSNP:rs121908069)"
FT /evidence="ECO:0000269|PubMed:12464997"
FT /id="VAR_015793"
FT VARIANT 2250
FT /note="M -> I (in dbSNP:rs35848863)"
FT /evidence="ECO:0000269|PubMed:12464997"
FT /id="VAR_015794"
FT VARIANT 2261
FT /note="M -> T (in dbSNP:rs34165241)"
FT /evidence="ECO:0000269|PubMed:12464997"
FT /id="VAR_015795"
FT MUTAGEN 1914
FT /note="R->C: Reduced enzyme activity."
FT /evidence="ECO:0000269|PubMed:21196496"
FT MUTAGEN 1952
FT /note="R->W: Nearly abolished enzyme activity."
FT /evidence="ECO:0000269|PubMed:21196496"
FT CONFLICT 1306
FT /note="H -> D (in Ref. 3; AAK92049)"
FT /evidence="ECO:0000305"
FT CONFLICT 1397
FT /note="P -> Q (in Ref. 3; AAK92049)"
FT /evidence="ECO:0000305"
FT CONFLICT 1478
FT /note="A -> V (in Ref. 3; AAK92049)"
FT /evidence="ECO:0000305"
FT CONFLICT 1959..1960
FT /note="KT -> QE (in Ref. 3; AAK92049)"
FT /evidence="ECO:0000305"
FT CONFLICT 1963
FT /note="K -> R (in Ref. 3; AAK92049)"
FT /evidence="ECO:0000305"
FT CONFLICT 1982
FT /note="R -> M (in Ref. 3; AAK92049)"
FT /evidence="ECO:0000305"
FT CONFLICT 1986..1991
FT /note="RYAQEH -> KHAHEN (in Ref. 3; AAK92049)"
FT /evidence="ECO:0000305"
FT CONFLICT 1995
FT /note="N -> H (in Ref. 3; AAK92049)"
FT /evidence="ECO:0000305"
FT CONFLICT 2001
FT /note="L -> I (in Ref. 3; AAK92049)"
FT /evidence="ECO:0000305"
FT CONFLICT 2016
FT /note="A -> S (in Ref. 3; AAK92049)"
FT /evidence="ECO:0000305"
FT CONFLICT 2022
FT /note="C -> S (in Ref. 3; AAK92049)"
FT /evidence="ECO:0000305"
FT CONFLICT 2030
FT /note="Q -> L (in Ref. 3; AAK92049)"
FT /evidence="ECO:0000305"
FT CONFLICT 2033
FT /note="S -> T (in Ref. 3; AAK92049)"
FT /evidence="ECO:0000305"
FT CONFLICT 2045..2046
FT /note="LS -> VC (in Ref. 3; AAK92049)"
FT /evidence="ECO:0000305"
FT CONFLICT 2049
FT /note="K -> P (in Ref. 3; AAK92049)"
FT /evidence="ECO:0000305"
FT CONFLICT 2061
FT /note="E -> D (in Ref. 3; AAK92049)"
FT /evidence="ECO:0000305"
FT CONFLICT 2066
FT /note="G -> E (in Ref. 3; AAK92049)"
FT /evidence="ECO:0000305"
FT CONFLICT 2071
FT /note="K -> R (in Ref. 3; AAK92049)"
FT /evidence="ECO:0000305"
FT CONFLICT 2075
FT /note="P -> S (in Ref. 3; AAK92049)"
FT /evidence="ECO:0000305"
FT CONFLICT 2304..2305
FT /note="TK -> AQ (in Ref. 3; AAK92049)"
FT /evidence="ECO:0000305"
FT CONFLICT 2352
FT /note="R -> S (in Ref. 3; AAK92049)"
FT /evidence="ECO:0000305"
FT CONFLICT 2539
FT /note="L -> S (in Ref. 3; AAK92049)"
FT /evidence="ECO:0000305"
FT CONFLICT 2543
FT /note="P -> S (in Ref. 3; AAK92049)"
FT /evidence="ECO:0000305"
FT CONFLICT 2567..2591
FT /note="PGPLSQSPGLVKQAKQMVGGQQLPA -> QGFFTKSPALVENKGKTKWVGRP
FT TNYLH (in Ref. 3; AAK92049)"
FT /evidence="ECO:0000305"
FT CONFLICT 2597
FT /note="G -> W (in Ref. 3; AAK92049)"
FT /evidence="ECO:0000305"
FT CONFLICT 2608..2612
FT /note="ASLPT -> PSSPN (in Ref. 3; AAK92049)"
FT /evidence="ECO:0000305"
FT HELIX 1852..1863
FT /evidence="ECO:0007829|PDB:3OOI"
FT HELIX 1869..1871
FT /evidence="ECO:0007829|PDB:6KQQ"
FT HELIX 1889..1891
FT /evidence="ECO:0007829|PDB:3OOI"
FT STRAND 1901..1903
FT /evidence="ECO:0007829|PDB:3OOI"
FT HELIX 1912..1915
FT /evidence="ECO:0007829|PDB:3OOI"
FT TURN 1922..1924
FT /evidence="ECO:0007829|PDB:3OOI"
FT HELIX 1928..1930
FT /evidence="ECO:0007829|PDB:3OOI"
FT HELIX 1935..1938
FT /evidence="ECO:0007829|PDB:3OOI"
FT STRAND 1944..1948
FT /evidence="ECO:0007829|PDB:3OOI"
FT STRAND 1950..1960
FT /evidence="ECO:0007829|PDB:3OOI"
FT STRAND 1967..1970
FT /evidence="ECO:0007829|PDB:3OOI"
FT STRAND 1973..1976
FT /evidence="ECO:0007829|PDB:3OOI"
FT HELIX 1978..1990
FT /evidence="ECO:0007829|PDB:3OOI"
FT STRAND 1998..2002
FT /evidence="ECO:0007829|PDB:3OOI"
FT STRAND 2005..2013
FT /evidence="ECO:0007829|PDB:3OOI"
FT HELIX 2015..2018
FT /evidence="ECO:0007829|PDB:3OOI"
FT STRAND 2026..2034
FT /evidence="ECO:0007829|PDB:3OOI"
FT STRAND 2037..2046
FT /evidence="ECO:0007829|PDB:3OOI"
FT HELIX 2058..2060
FT /evidence="ECO:0007829|PDB:6KQQ"
FT STRAND 2062..2064
FT /evidence="ECO:0007829|PDB:6KQP"
FT STRAND 2079..2082
FT /evidence="ECO:0007829|PDB:6KQQ"
SQ SEQUENCE 2696 AA; 296652 MW; 4E80E6DCD9A24C81 CRC64;
MDQTCELPRR NCLLPFSNPV NLDAPEDKDS PFGNGQSNFS EPLNGCTMQL STVSGTSQNA
YGQDSPSCYI PLRRLQDLAS MINVEYLNGS ADGSESFQDP EKSDSRAQTP IVCTSLSPGG
PTALAMKQEP SCNNSPELQV KVTKTIKNGF LHFENFTCVD DADVDSEMDP EQPVTEDESI
EEIFEETQTN ATCNYETKSE NGVKVAMGSE QDSTPESRHG AVKSPFLPLA PQTETQKNKQ
RNEVDGSNEK AALLPAPFSL GDTNITIEEQ LNSINLSFQD DPDSSTSTLG NMLELPGTSS
SSTSQELPFC QPKKKSTPLK YEVGDLIWAK FKRRPWWPCR ICSDPLINTH SKMKVSNRRP
YRQYYVEAFG DPSERAWVAG KAIVMFEGRH QFEELPVLRR RGKQKEKGYR HKVPQKILSK
WEASVGLAEQ YDVPKGSKNR KCIPGSIKLD SEEDMPFEDC TNDPESEHDL LLNGCLKSLA
FDSEHSADEK EKPCAKSRAR KSSDNPKRTS VKKGHIQFEA HKDERRGKIP ENLGLNFISG
DISDTQASNE LSRIANSLTG SNTAPGSFLF SSCGKNTAKK EFETSNGDSL LGLPEGALIS
KCSREKNKPQ RSLVCGSKVK LCYIGAGDEE KRSDSISICT TSDDGSSDLD PIEHSSESDN
SVLEIPDAFD RTENMLSMQK NEKIKYSRFA ATNTRVKAKQ KPLISNSHTD HLMGCTKSAE
PGTETSQVNL SDLKASTLVH KPQSDFTNDA LSPKFNLSSS ISSENSLIKG GAANQALLHS
KSKQPKFRSI KCKHKENPVM AEPPVINEEC SLKCCSSDTK GSPLASISKS GKVDGLKLLN
NMHEKTRDSS DIETAVVKHV LSELKELSYR SLGEDVSDSG TSKPSKPLLF SSASSQNHIP
IEPDYKFSTL LMMLKDMHDS KTKEQRLMTA QNLVSYRSPG RGDCSTNSPV GVSKVLVSGG
STHNSEKKGD GTQNSANPSP SGGDSALSGE LSASLPGLLS DKRDLPASGK SRSDCVTRRN
CGRSKPSSKL RDAFSAQMVK NTVNRKALKT ERKRKLNQLP SVTLDAVLQG DRERGGSLRG
GAEDPSKEDP LQIMGHLTSE DGDHFSDVHF DSKVKQSDPG KISEKGLSFE NGKGPELDSV
MNSENDELNG VNQVVPKKRW QRLNQRRTKP RKRMNRFKEK ENSECAFRVL LPSDPVQEGR
DEFPEHRTPS ASILEEPLTE QNHADCLDSA GPRLNVCDKS SASIGDMEKE PGIPSLTPQA
ELPEPAVRSE KKRLRKPSKW LLEYTEEYDQ IFAPKKKQKK VQEQVHKVSS RCEEESLLAR
GRSSAQNKQV DENSLISTKE EPPVLEREAP FLEGPLAQSE LGGGHAELPQ LTLSVPVAPE
VSPRPALESE ELLVKTPGNY ESKRQRKPTK KLLESNDLDP GFMPKKGDLG LSKKCYEAGH
LENGITESCA TSYSKDFGGG TTKIFDKPRK RKRQRHAAAK MQCKKVKNDD SSKEIPGSEG
ELMPHRTATS PKETVEEGVE HDPGMPASKK MQGERGGGAA LKENVCQNCE KLGELLLCEA
QCCGAFHLEC LGLTEMPRGK FICNECRTGI HTCFVCKQSG EDVKRCLLPL CGKFYHEECV
QKYPPTVMQN KGFRCSLHIC ITCHAANPAN VSASKGRLMR CVRCPVAYHA NDFCLAAGSK
ILASNSIICP NHFTPRRGCR NHEHVNVSWC FVCSEGGSLL CCDSCPAAFH RECLNIDIPE
GNWYCNDCKA GKKPHYREIV WVKVGRYRWW PAEICHPRAV PSNIDKMRHD VGEFPVLFFG
SNDYLWTHQA RVFPYMEGDV SSKDKMGKGV DGTYKKALQE AAARFEELKA QKELRQLQED
RKNDKKPPPY KHIKVNRPIG RVQIFTADLS EIPRCNCKAT DENPCGIDSE CINRMLLYEC
HPTVCPAGGR CQNQCFSKRQ YPEVEIFRTL QRGWGLRTKT DIKKGEFVNE YVGELIDEEE
CRARIRYAQE HDITNFYMLT LDKDRIIDAG PKGNYARFMN HCCQPNCETQ KWSVNGDTRV
GLFALSDIKA GTELTFNYNL ECLGNGKTVC KCGAPNCSGF LGVRPKNQPI ATEEKSKKFK
KKQQGKRRTQ GEITKEREDE CFSCGDAGQL VSCKKPGCPK VYHADCLNLT KRPAGKWECP
WHQCDICGKE AASFCEMCPS SFCKQHREGM LFISKLDGRL SCTEHDPCGP NPLEPGEIRE
YVPPPVPLPP GPSTHLAEQS TGMAAQAPKM SDKPPADTNQ MLSLSKKALA GTCQRPLLPE
RPLERTDSRP QPLDKVRDLA GSGTKSQSLV SSQRPLDRPP AVAGPRPQLS DKPSPVTSPS
SSPSVRSQPL ERPLGTADPR LDKSIGAASP RPQSLEKTSV PTGLRLPPPD RLLITSSPKP
QTSDRPTDKP HASLSQRLPP PEKVLSAVVQ TLVAKEKALR PVDQNTQSKN RAALVMDLID
LTPRQKERAA SPHQVTPQAD EKMPVLESSS WPASKGLGHM PRAVEKGCVS DPLQTSGKAA
APSEDPWQAV KSLTQARLLS QPPAKAFLYE PTTQASGRAS AGAEQTPGPL SQSPGLVKQA
KQMVGGQQLP ALAAKSGQSF RSLGKAPASL PTEEKKLVTT EQSPWALGKA SSRAGLWPIV
AGQTLAQSCW SAGSTQTLAQ TCWSLGRGQD PKPEQNTLPA LNQAPSSHKC AESEQK