ID WDR33_MOUSE Reviewed; 1330 AA.
AC Q8K4P0; Q8C7C6; Q8CD02;
DT 25-JAN-2012, integrated into UniProtKB/Swiss-Prot.
DT 01-OCT-2002, sequence version 1.
DT 03-AUG-2022, entry version 141.
DE RecName: Full=pre-mRNA 3' end processing protein WDR33;
DE AltName: Full=WD repeat-containing protein 33;
DE AltName: Full=WD repeat-containing protein of 146 kDa {ECO:0000303|PubMed:11162572};
GN Name=Wdr33; Synonyms=Wdc146 {ECO:0000303|PubMed:11162572};
OS Mus musculus (Mouse).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia;
OC Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae;
OC Murinae; Mus; Mus.
OX NCBI_TaxID=10090;
RN [1]
RP NUCLEOTIDE SEQUENCE [MRNA], SUBCELLULAR LOCATION, AND TISSUE SPECIFICITY.
RX PubMed=11162572; DOI=10.1006/bbrc.2000.4163;
RA Ito S., Sakai A., Nomura T., Miki Y., Ouchida M., Sasaki J., Shimizu K.;
RT "A novel WD40 repeat protein, WDC146, highly expressed during
RT spermatogenesis in a stage-specific manner.";
RL Biochem. Biophys. Res. Commun. 280:656-663(2001).
RN [2]
RP NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA].
RC STRAIN=C57BL/6J; TISSUE=Head, and Thymus;
RX PubMed=16141072; DOI=10.1126/science.1112014;
RA Carninci P., Kasukawa T., Katayama S., Gough J., Frith M.C., Maeda N.,
RA Oyama R., Ravasi T., Lenhard B., Wells C., Kodzius R., Shimokawa K.,
RA Bajic V.B., Brenner S.E., Batalov S., Forrest A.R., Zavolan M., Davis M.J.,
RA Wilming L.G., Aidinis V., Allen J.E., Ambesi-Impiombato A., Apweiler R.,
RA Aturaliya R.N., Bailey T.L., Bansal M., Baxter L., Beisel K.W., Bersano T.,
RA Bono H., Chalk A.M., Chiu K.P., Choudhary V., Christoffels A.,
RA Clutterbuck D.R., Crowe M.L., Dalla E., Dalrymple B.P., de Bono B.,
RA Della Gatta G., di Bernardo D., Down T., Engstrom P., Fagiolini M.,
RA Faulkner G., Fletcher C.F., Fukushima T., Furuno M., Futaki S.,
RA Gariboldi M., Georgii-Hemming P., Gingeras T.R., Gojobori T., Green R.E.,
RA Gustincich S., Harbers M., Hayashi Y., Hensch T.K., Hirokawa N., Hill D.,
RA Huminiecki L., Iacono M., Ikeo K., Iwama A., Ishikawa T., Jakt M.,
RA Kanapin A., Katoh M., Kawasawa Y., Kelso J., Kitamura H., Kitano H.,
RA Kollias G., Krishnan S.P., Kruger A., Kummerfeld S.K., Kurochkin I.V.,
RA Lareau L.F., Lazarevic D., Lipovich L., Liu J., Liuni S., McWilliam S.,
RA Madan Babu M., Madera M., Marchionni L., Matsuda H., Matsuzawa S., Miki H.,
RA Mignone F., Miyake S., Morris K., Mottagui-Tabar S., Mulder N., Nakano N.,
RA Nakauchi H., Ng P., Nilsson R., Nishiguchi S., Nishikawa S., Nori F.,
RA Ohara O., Okazaki Y., Orlando V., Pang K.C., Pavan W.J., Pavesi G.,
RA Pesole G., Petrovsky N., Piazza S., Reed J., Reid J.F., Ring B.Z.,
RA Ringwald M., Rost B., Ruan Y., Salzberg S.L., Sandelin A., Schneider C.,
RA Schoenbach C., Sekiguchi K., Semple C.A., Seno S., Sessa L., Sheng Y.,
RA Shibata Y., Shimada H., Shimada K., Silva D., Sinclair B., Sperling S.,
RA Stupka E., Sugiura K., Sultana R., Takenaka Y., Taki K., Tammoja K.,
RA Tan S.L., Tang S., Taylor M.S., Tegner J., Teichmann S.A., Ueda H.R.,
RA van Nimwegen E., Verardo R., Wei C.L., Yagi K., Yamanishi H.,
RA Zabarovsky E., Zhu S., Zimmer A., Hide W., Bult C., Grimmond S.M.,
RA Teasdale R.D., Liu E.T., Brusic V., Quackenbush J., Wahlestedt C.,
RA Mattick J.S., Hume D.A., Kai C., Sasaki D., Tomaru Y., Fukuda S.,
RA Kanamori-Katayama M., Suzuki M., Aoki J., Arakawa T., Iida J., Imamura K.,
RA Itoh M., Kato T., Kawaji H., Kawagashira N., Kawashima T., Kojima M.,
RA Kondo S., Konno H., Nakano K., Ninomiya N., Nishio T., Okada M., Plessy C.,
RA Shibata K., Shiraki T., Suzuki S., Tagami M., Waki K., Watahiki A.,
RA Okamura-Oho Y., Suzuki H., Kawai J., Hayashizaki Y.;
RT "The transcriptional landscape of the mammalian genome.";
RL Science 309:1559-1563(2005).
RN [3]
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RC STRAIN=C57BL/6J;
RX PubMed=19468303; DOI=10.1371/journal.pbio.1000112;
RA Church D.M., Goodstadt L., Hillier L.W., Zody M.C., Goldstein S., She X.,
RA Bult C.J., Agarwala R., Cherry J.L., DiCuccio M., Hlavina W., Kapustin Y.,
RA Meric P., Maglott D., Birtle Z., Marques A.C., Graves T., Zhou S.,
RA Teague B., Potamousis K., Churas C., Place M., Herschleb J., Runnheim R.,
RA Forrest D., Amos-Landgraf J., Schwartz D.C., Cheng Z., Lindblad-Toh K.,
RA Eichler E.E., Ponting C.P.;
RT "Lineage-specific biology revealed by a finished genome assembly of the
RT mouse.";
RL PLoS Biol. 7:E1000112-E1000112(2009).
RN [4]
RP ACETYLATION [LARGE SCALE ANALYSIS] AT LYS-46, AND IDENTIFICATION BY MASS
RP SPECTROMETRY [LARGE SCALE ANALYSIS].
RC TISSUE=Embryonic fibroblast;
RX PubMed=23806337; DOI=10.1016/j.molcel.2013.06.001;
RA Park J., Chen Y., Tishkoff D.X., Peng C., Tan M., Dai L., Xie Z., Zhang Y.,
RA Zwaans B.M., Skinner M.E., Lombard D.B., Zhao Y.;
RT "SIRT5-mediated lysine desuccinylation impacts diverse metabolic
RT pathways.";
RL Mol. Cell 50:919-930(2013).
RN [5]
RP METHYLATION [LARGE SCALE ANALYSIS] AT ARG-909; ARG-981; ARG-1028; ARG-1256
RP AND ARG-1309, AND IDENTIFICATION BY MASS SPECTROMETRY [LARGE SCALE
RP ANALYSIS].
RC TISSUE=Brain, and Embryo;
RX PubMed=24129315; DOI=10.1074/mcp.o113.027870;
RA Guo A., Gu H., Zhou J., Mulhern D., Wang Y., Lee K.A., Yang V., Aguiar M.,
RA Kornhauser J., Jia X., Ren J., Beausoleil S.A., Silva J.C., Vemulapalli V.,
RA Bedford M.T., Comb M.J.;
RT "Immunoaffinity enrichment and mass spectrometry analysis of protein
RT methylation.";
RL Mol. Cell. Proteomics 13:372-387(2014).
CC -!- FUNCTION: Essential for both cleavage and polyadenylation of pre-mRNA
CC 3' ends. {ECO:0000250}.
CC -!- SUBUNIT: Component of the cleavage and polyadenylation specificity
CC factor (CPSF) module of the pre-mRNA 3'-end processing complex.
CC Interacts with CPSF3/CPSF73 (By similarity). {ECO:0000250}.
CC -!- SUBCELLULAR LOCATION: Nucleus {ECO:0000269|PubMed:11162572}.
CC -!- TISSUE SPECIFICITY: Most highly expressed in testis.
CC {ECO:0000269|PubMed:11162572}.
CC -!- SIMILARITY: Belongs to the WD repeat WDR33 family. {ECO:0000305}.
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR EMBL; AB086191; BAC00776.1; -; mRNA.
DR EMBL; AK031786; BAC27549.1; -; mRNA.
DR EMBL; AK050653; BAC34364.1; -; mRNA.
DR EMBL; AC124393; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR EMBL; AC131761; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR EMBL; AC161511; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR CCDS; CCDS29112.1; -.
DR RefSeq; NP_083142.2; NM_028866.3.
DR AlphaFoldDB; Q8K4P0; -.
DR SMR; Q8K4P0; -.
DR BioGRID; 216664; 4.
DR IntAct; Q8K4P0; 2.
DR MINT; Q8K4P0; -.
DR STRING; 10090.ENSMUSP00000025264; -.
DR iPTMnet; Q8K4P0; -.
DR PhosphoSitePlus; Q8K4P0; -.
DR EPD; Q8K4P0; -.
DR jPOST; Q8K4P0; -.
DR MaxQB; Q8K4P0; -.
DR PaxDb; Q8K4P0; -.
DR PeptideAtlas; Q8K4P0; -.
DR PRIDE; Q8K4P0; -.
DR ProteomicsDB; 297844; -.
DR Antibodypedia; 18487; 102 antibodies from 20 providers.
DR DNASU; 74320; -.
DR Ensembl; ENSMUST00000025264; ENSMUSP00000025264; ENSMUSG00000024400.
DR GeneID; 74320; -.
DR KEGG; mmu:74320; -.
DR UCSC; uc008eis.2; mouse.
DR CTD; 55339; -.
DR MGI; MGI:1921570; Wdr33.
DR VEuPathDB; HostDB:ENSMUSG00000024400; -.
DR eggNOG; KOG0284; Eukaryota.
DR GeneTree; ENSGT00730000111130; -.
DR HOGENOM; CLU_000288_77_3_1; -.
DR InParanoid; Q8K4P0; -.
DR OMA; DHREMEA; -.
DR PhylomeDB; Q8K4P0; -.
DR TreeFam; TF317659; -.
DR Reactome; R-MMU-159231; Transport of Mature mRNA Derived from an Intronless Transcript.
DR Reactome; R-MMU-72163; mRNA Splicing - Major Pathway.
DR Reactome; R-MMU-72187; mRNA 3'-end processing.
DR Reactome; R-MMU-73856; RNA Polymerase II Transcription Termination.
DR Reactome; R-MMU-77595; Processing of Intronless Pre-mRNAs.
DR BioGRID-ORCS; 74320; 27 hits in 70 CRISPR screens.
DR ChiTaRS; Wdr33; mouse.
DR PRO; PR:Q8K4P0; -.
DR Proteomes; UP000000589; Chromosome 18.
DR RNAct; Q8K4P0; protein.
DR Bgee; ENSMUSG00000024400; Expressed in ear vesicle and 255 other tissues.
DR ExpressionAtlas; Q8K4P0; baseline and differential.
DR Genevisible; Q8K4P0; MM.
DR GO; GO:0005581; C:collagen trimer; IEA:UniProtKB-KW.
DR GO; GO:0001650; C:fibrillar center; ISO:MGI.
DR GO; GO:0005847; C:mRNA cleavage and polyadenylation specificity factor complex; IBA:GO_Central.
DR GO; GO:0005654; C:nucleoplasm; ISO:MGI.
DR GO; GO:0005634; C:nucleus; IDA:MGI.
DR GO; GO:0006378; P:mRNA polyadenylation; IBA:GO_Central.
DR Gene3D; 2.130.10.10; -; 3.
DR InterPro; IPR045245; Pfs2-like.
DR InterPro; IPR015943; WD40/YVTN_repeat-like_dom_sf.
DR InterPro; IPR001680; WD40_repeat.
DR InterPro; IPR036322; WD40_repeat_dom_sf.
DR PANTHER; PTHR22836; PTHR22836; 2.
DR Pfam; PF00400; WD40; 5.
DR SMART; SM00320; WD40; 7.
DR SUPFAM; SSF50978; SSF50978; 1.
DR PROSITE; PS50082; WD_REPEATS_2; 6.
DR PROSITE; PS50294; WD_REPEATS_REGION; 1.
PE 1: Evidence at protein level;
KW Acetylation; Collagen; Isopeptide bond; Methylation; mRNA processing;
KW Nucleus; Phosphoprotein; Reference proteome; Repeat; Ubl conjugation;
KW WD repeat.
FT INIT_MET 1
FT /note="Removed"
FT /evidence="ECO:0000250|UniProtKB:Q9C0J8"
FT CHAIN 2..1330
FT /note="pre-mRNA 3' end processing protein WDR33"
FT /id="PRO_0000415291"
FT REPEAT 117..156
FT /note="WD 1"
FT REPEAT 159..198
FT /note="WD 2"
FT REPEAT 200..239
FT /note="WD 3"
FT REPEAT 242..283
FT /note="WD 4"
FT REPEAT 286..325
FT /note="WD 5"
FT REPEAT 329..369
FT /note="WD 6"
FT REPEAT 373..412
FT /note="WD 7"
FT DOMAIN 617..769
FT /note="Collagen-like"
FT REGION 566..1330
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 570..610
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 611..625
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 736..750
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 874..897
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 961..976
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 990..1035
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1071..1112
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1130..1147
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1165..1207
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1230..1249
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1271..1286
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1291..1321
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT MOD_RES 2
FT /note="N-acetylalanine"
FT /evidence="ECO:0000250|UniProtKB:Q9C0J8"
FT MOD_RES 7
FT /note="Phosphoserine"
FT /evidence="ECO:0000250|UniProtKB:Q9C0J8"
FT MOD_RES 46
FT /note="N6-acetyllysine"
FT /evidence="ECO:0007744|PubMed:23806337"
FT MOD_RES 776
FT /note="Omega-N-methylarginine"
FT /evidence="ECO:0000250|UniProtKB:Q9C0J8"
FT MOD_RES 909
FT /note="Asymmetric dimethylarginine"
FT /evidence="ECO:0007744|PubMed:24129315"
FT MOD_RES 981
FT /note="Omega-N-methylarginine"
FT /evidence="ECO:0007744|PubMed:24129315"
FT MOD_RES 1028
FT /note="Omega-N-methylarginine"
FT /evidence="ECO:0007744|PubMed:24129315"
FT MOD_RES 1204
FT /note="Phosphoserine"
FT /evidence="ECO:0000250|UniProtKB:Q9C0J8"
FT MOD_RES 1256
FT /note="Omega-N-methylarginine"
FT /evidence="ECO:0007744|PubMed:24129315"
FT MOD_RES 1309
FT /note="Asymmetric dimethylarginine; alternate"
FT /evidence="ECO:0007744|PubMed:24129315"
FT MOD_RES 1309
FT /note="Omega-N-methylarginine; alternate"
FT /evidence="ECO:0007744|PubMed:24129315"
FT CROSSLNK 526
FT /note="Glycyl lysine isopeptide (Lys-Gly) (interchain with
FT G-Cter in SUMO2)"
FT /evidence="ECO:0000250|UniProtKB:Q9C0J8"
FT CROSSLNK 530
FT /note="Glycyl lysine isopeptide (Lys-Gly) (interchain with
FT G-Cter in SUMO2)"
FT /evidence="ECO:0000250|UniProtKB:Q9C0J8"
FT CROSSLNK 560
FT /note="Glycyl lysine isopeptide (Lys-Gly) (interchain with
FT G-Cter in SUMO2)"
FT /evidence="ECO:0000250|UniProtKB:Q9C0J8"
SQ SEQUENCE 1330 AA; 145267 MW; 5175B5DEB49F9A03 CRC64;
MATEIGSPPR FFHMPRFQHQ APRQLFYKRP DFAQQQAMQQ LTFDGKRMRK AVNRKTIDYN
PSVIKYLENR IWQRDQRDMR AIQPDAGYYN DLVPPIGMLN NPMNAVTTKF VRTSTNKVKC
PVFVVRWTPE GRRLVTGASS GEFTLWNGLT FNFETILQAH DSPVRAMTWS HNDMWMLTAD
HGGYVKYWQS NMNNVKMFQA HKEAIREASF SPTDNKFATC SDDGTVRIWD FLRCHEERIL
RGHGADVKCV DWHPTKGLVV SGSKDSQQPI KFWDPKTGQS LATLHAHKNT VMEVKLNLNG
NWLLTASRDH LCKLFDIRNL KEELQVFRGH KKEATAVAWH PVHEGLFASG GSDGSLLFWH
VGVEKEVGGM EMAHEGMIWS LAWHPLGHIL CSGSNDHTSK FWTRNRPGDK MRDRYNLNLL
PGMSEDGVEY DDLEPNSLAV IPGMGIPEQL KLAMEQEQMG KDESSEIEMT IPGLDWGMEE
VMQKDQKKVP QKKVPYAKPI PAQFQQAWMQ NKVPIPAPNE VLNDRKEDIK LEEKKKTQAE
IEQEMATLQY TNPQLLEQLK IERLAQKQAD QIQPPPSSGT PLLGPQPFSG QGPISQIPQG
FQQPHPSQQM PLVPQMGPPG PQGQFRAPGP QGQMGPQGPP MHQGGGGPQG FMGPQGPQGP
PQGLPRPQDM HGPQGMQRHP GPHGPLGPQG PPGPQGSSGP QGHMGPQGPP GPQGHIGPQG
PPASQGHMGP QGPPGTQGMQ GPPGPRGMQG PPHPHGIQGG PASQGIQGPL MGLNPRGMQG
PPGPRENQGP APQGLMIGHP PQEMRGPHPP SGLLGHGPQE MRGPQEMRGM QGPPPQGSML
GPPQELRGPS GSQGQQGPPQ GSLGPPPQGG MQGPPGPQGQ QNPARGPHPS QGPIPFQQQK
APLLGDGPRA PFNQEGQSTG PPPLIPGLGQ QGAQGRIPPL NPGQGPGPNK GDTRGPPNHH
LGPMSERRHE QSGGPEHGPD RGPFRGGQDC RGPPDRRGSH PDFPDDFRPD DFHPDKRFGH
RLREFEGRGG PLPQEEKWRR GGPGPPFPPD HREFNEGDGR GAARGPPGAW EGRRPGDDRF
PRDPDDPRFR GRREESFRRG APPRHEGRAP PRGRDNFPGP DDFGPEEGFD ASDEAARGRD
LRGRGRGTPR GGSRKCLLPT PDEFPRFEGG RKPDSWDGNR EPGPGHEHFR DAPRPDHPPH
DGHSPASRER SSSLQGMDMA SLPPRKRPWH DGSGTSEHRE MEAQGGPSED RGSKGRGGPG
PSQRVPKSGR SSSLDGDHHD GYHRDEPFGG PPGSSSSSRG ARSGSNWGRG SNMNSGPPRR
GTSRGSGRGR