ID THOC2_MOUSE Reviewed; 1594 AA.
AC B1AZI6;
DT 22-SEP-2009, integrated into UniProtKB/Swiss-Prot.
DT 08-APR-2008, sequence version 1.
DT 03-AUG-2022, entry version 97.
DE RecName: Full=THO complex subunit 2;
DE Short=Tho2;
GN Name=Thoc2;
OS Mus musculus (Mouse).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia;
OC Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae;
OC Murinae; Mus; Mus.
OX NCBI_TaxID=10090;
RN [1]
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RC STRAIN=C57BL/6J;
RX PubMed=19468303; DOI=10.1371/journal.pbio.1000112;
RA Church D.M., Goodstadt L., Hillier L.W., Zody M.C., Goldstein S., She X.,
RA Bult C.J., Agarwala R., Cherry J.L., DiCuccio M., Hlavina W., Kapustin Y.,
RA Meric P., Maglott D., Birtle Z., Marques A.C., Graves T., Zhou S.,
RA Teague B., Potamousis K., Churas C., Place M., Herschleb J., Runnheim R.,
RA Forrest D., Amos-Landgraf J., Schwartz D.C., Cheng Z., Lindblad-Toh K.,
RA Eichler E.E., Ponting C.P.;
RT "Lineage-specific biology revealed by a finished genome assembly of the
RT mouse.";
RL PLoS Biol. 7:E1000112-E1000112(2009).
RN [2]
RP IDENTIFICATION BY MASS SPECTROMETRY [LARGE SCALE ANALYSIS].
RC TISSUE=Liver;
RX PubMed=17242355; DOI=10.1073/pnas.0609836104;
RA Villen J., Beausoleil S.A., Gerber S.A., Gygi S.P.;
RT "Large-scale phosphorylation analysis of mouse liver.";
RL Proc. Natl. Acad. Sci. U.S.A. 104:1488-1493(2007).
RN [3]
RP PHOSPHORYLATION [LARGE SCALE ANALYSIS] AT SER-1222; SER-1393 AND SER-1417,
RP AND IDENTIFICATION BY MASS SPECTROMETRY [LARGE SCALE ANALYSIS].
RC TISSUE=Brown adipose tissue, Kidney, Lung, Spleen, and Testis;
RX PubMed=21183079; DOI=10.1016/j.cell.2010.12.001;
RA Huttlin E.L., Jedrychowski M.P., Elias J.E., Goswami T., Rad R.,
RA Beausoleil S.A., Villen J., Haas W., Sowa M.E., Gygi S.P.;
RT "A tissue-specific atlas of mouse protein phosphorylation and expression.";
RL Cell 143:1174-1189(2010).
RN [4]
RP TISSUE SPECIFICITY.
RX PubMed=26166480; DOI=10.1016/j.ajhg.2015.05.021;
RA Kumar R., Corbett M.A., van Bon B.W., Woenig J.A., Weir L., Douglas E.,
RA Friend K.L., Gardner A., Shaw M., Jolly L.A., Tan C., Hunter M.F.,
RA Hackett A., Field M., Palmer E.E., Leffler M., Rogers C., Boyle J.,
RA Bienek M., Jensen C., Van Buggenhout G., Van Esch H., Hoffmann K.,
RA Raynaud M., Zhao H., Reed R., Hu H., Haas S.A., Haan E., Kalscheuer V.M.,
RA Gecz J.;
RT "THOC2 Mutations Implicate mRNA-Export Pathway in X-Linked Intellectual
RT Disability.";
RL Am. J. Hum. Genet. 97:302-310(2015).
CC -!- FUNCTION: Required for efficient export of polyadenylated RNA and
CC spliced mRNA. Acts as component of the THO subcomplex of the TREX
CC complex which is thought to couple mRNA transcription, processing and
CC nuclear export, and which specifically associates with spliced mRNA and
CC not with unspliced pre-mRNA. TREX is recruited to spliced mRNAs by a
CC transcription-independent mechanism, binds to mRNA upstream of the
CC exon-junction complex (EJC) and is recruited in a splicing- and cap-
CC dependent manner to a region near the 5' end of the mRNA where it
CC functions in mRNA export to the cytoplasm via the TAP/NFX1 pathway.
CC Plays a role in proper neuronal development.
CC {ECO:0000250|UniProtKB:Q8NI27}.
CC -!- SUBUNIT: Component of the THO complex, which is composed of THOC1,
CC THOC2, THOC3, THOC5, THOC6 and THOC7; together with at least
CC ALYREF/THOC4, DDX39B, SARNP/CIP29 and CHTOP, THO forms the
CC transcription/export (TREX) complex which seems to have a dynamic
CC structure involving ATP-dependent remodeling. Interacts with THOC1,
CC POLDIP3 and ZC3H11A (By similarity). {ECO:0000250}.
CC -!- SUBCELLULAR LOCATION: Nucleus {ECO:0000305}. Nucleus speckle
CC {ECO:0000250}.
CC -!- TISSUE SPECIFICITY: Expressed in the hippocampus and in cortical
CC neurons. {ECO:0000269|PubMed:26166480}.
CC -!- SIMILARITY: Belongs to the THOC2 family. {ECO:0000305}.
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR EMBL; AL954355; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR EMBL; BX005253; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR CCDS; CCDS40951.1; -.
DR RefSeq; NP_001028594.1; NM_001033422.1.
DR AlphaFoldDB; B1AZI6; -.
DR SMR; B1AZI6; -.
DR BioGRID; 237091; 28.
DR IntAct; B1AZI6; 1.
DR STRING; 10090.ENSMUSP00000044677; -.
DR iPTMnet; B1AZI6; -.
DR PhosphoSitePlus; B1AZI6; -.
DR EPD; B1AZI6; -.
DR jPOST; B1AZI6; -.
DR MaxQB; B1AZI6; -.
DR PaxDb; B1AZI6; -.
DR PeptideAtlas; B1AZI6; -.
DR PRIDE; B1AZI6; -.
DR ProteomicsDB; 259385; -.
DR Ensembl; ENSMUST00000047037; ENSMUSP00000044677; ENSMUSG00000037475.
DR GeneID; 331401; -.
DR KEGG; mmu:331401; -.
DR UCSC; uc009tao.1; mouse.
DR CTD; 57187; -.
DR MGI; MGI:2442413; Thoc2.
DR VEuPathDB; HostDB:ENSMUSG00000037475; -.
DR eggNOG; KOG1874; Eukaryota.
DR GeneTree; ENSGT00710000106792; -.
DR HOGENOM; CLU_000511_5_0_1; -.
DR InParanoid; B1AZI6; -.
DR OMA; PEIAFWI; -.
DR OrthoDB; 979205at2759; -.
DR PhylomeDB; B1AZI6; -.
DR TreeFam; TF313127; -.
DR Reactome; R-MMU-159236; Transport of Mature mRNA derived from an Intron-Containing Transcript.
DR Reactome; R-MMU-72187; mRNA 3'-end processing.
DR Reactome; R-MMU-73856; RNA Polymerase II Transcription Termination.
DR BioGRID-ORCS; 331401; 27 hits in 77 CRISPR screens.
DR ChiTaRS; Thoc2; mouse.
DR PRO; PR:B1AZI6; -.
DR Proteomes; UP000000589; Chromosome X.
DR RNAct; B1AZI6; protein.
DR Bgee; ENSMUSG00000037475; Expressed in rostral migratory stream and 253 other tissues.
DR ExpressionAtlas; B1AZI6; baseline and differential.
DR Genevisible; B1AZI6; MM.
DR GO; GO:0000781; C:chromosome, telomeric region; ISO:MGI.
DR GO; GO:0016607; C:nuclear speck; IEA:UniProtKB-SubCell.
DR GO; GO:0005634; C:nucleus; IDA:MGI.
DR GO; GO:0000347; C:THO complex; ISO:MGI.
DR GO; GO:0000445; C:THO complex part of transcription export complex; ISO:MGI.
DR GO; GO:0000346; C:transcription export complex; ISO:MGI.
DR GO; GO:0003729; F:mRNA binding; IMP:MGI.
DR GO; GO:0001824; P:blastocyst development; IMP:MGI.
DR GO; GO:0000902; P:cell morphogenesis; IMP:MGI.
DR GO; GO:0048699; P:generation of neurons; ISS:UniProtKB.
DR GO; GO:0006406; P:mRNA export from nucleus; ISO:MGI.
DR GO; GO:0006397; P:mRNA processing; IEA:UniProtKB-KW.
DR GO; GO:0010977; P:negative regulation of neuron projection development; ISO:MGI.
DR GO; GO:0048666; P:neuron development; ISS:UniProtKB.
DR GO; GO:0016973; P:poly(A)+ mRNA export from nucleus; ISO:MGI.
DR GO; GO:0010468; P:regulation of gene expression; IMP:MGI.
DR GO; GO:0010793; P:regulation of mRNA export from nucleus; IMP:MGI.
DR GO; GO:0008380; P:RNA splicing; IEA:UniProtKB-KW.
DR GO; GO:0017145; P:stem cell division; IMP:MGI.
DR GO; GO:0046784; P:viral mRNA export from host cell nucleus; ISO:MGI.
DR InterPro; IPR040007; Tho2.
DR InterPro; IPR021418; THO_THOC2_C.
DR InterPro; IPR021726; THO_THOC2_N.
DR InterPro; IPR032302; THOC2_N.
DR PANTHER; PTHR21597; PTHR21597; 1.
DR Pfam; PF11262; Tho2; 1.
DR Pfam; PF11732; Thoc2; 1.
DR Pfam; PF16134; THOC2_N; 2.
PE 1: Evidence at protein level;
KW Coiled coil; mRNA processing; mRNA splicing; mRNA transport; Nucleus;
KW Phosphoprotein; Reference proteome; RNA-binding; Transport.
FT CHAIN 1..1594
FT /note="THO complex subunit 2"
FT /id="PRO_0000384399"
FT REGION 1183..1594
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COILED 896..965
FT /evidence="ECO:0000255"
FT COILED 1464..1491
FT /evidence="ECO:0000255"
FT MOTIF 923..928
FT /note="Nuclear localization signal"
FT /evidence="ECO:0000255"
FT COMPBIAS 1201..1220
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1221..1236
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1237..1264
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1265..1381
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1398..1416
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1447..1509
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1518..1586
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT MOD_RES 1222
FT /note="Phosphoserine"
FT /evidence="ECO:0007744|PubMed:21183079"
FT MOD_RES 1385
FT /note="Phosphothreonine"
FT /evidence="ECO:0000250|UniProtKB:Q8NI27"
FT MOD_RES 1390
FT /note="Phosphoserine"
FT /evidence="ECO:0000250|UniProtKB:Q8NI27"
FT MOD_RES 1393
FT /note="Phosphoserine"
FT /evidence="ECO:0007744|PubMed:21183079"
FT MOD_RES 1417
FT /note="Phosphoserine"
FT /evidence="ECO:0007744|PubMed:21183079"
FT MOD_RES 1450
FT /note="Phosphoserine"
FT /evidence="ECO:0000250|UniProtKB:Q8NI27"
FT MOD_RES 1486
FT /note="Phosphoserine"
FT /evidence="ECO:0000250|UniProtKB:Q8NI27"
FT MOD_RES 1516
FT /note="Phosphoserine"
FT /evidence="ECO:0000250|UniProtKB:Q8NI27"
SQ SEQUENCE 1594 AA; 182773 MW; 9732F53F9C506CC8 CRC64;
MAAAAVVVPA EWIKNWEKSG RGEFLHLCRI LSENKSHDSS TYRDFQQALY ELSYHVIKGN
LKHEQASSVL NDISEFREDM PSILADVFCI LDIETNCLEE KSKRDYFTQL VLACLYLVSD
TVLKERLDPE TLESLGLIKQ SQQFNQKSVK IKTKLFYKQQ KFNLLREENE GYAKLIAELG
QDLSGNITSD LILENIKSLI GCFNLDPNRV LDVILEVFEC RPEHDDFFIS LLESYMSMCE
PQTLCHILGF KFKFYQEPSG ETPSSLYRVA AVLLQFNLID LDDLYVHLLP ADNCIMDEYK
REIVEAKQIV RKLTMVVLSS EKLDERDKEK DKDDEKVEKP PDNQKLGLLE ALLKVGDWQH
AQNIMDQMPP YYAASHKLIA LAICKLIHIT VEPLYRRVGV PKGAKGSPVS ALQNKRAPKQ
VESFEDLRRD VFNMFCYLGP HLSHDPILFA KVVRIGKSFM KEFQSDGSKQ EDKEKTEVIL
SCLLSITDQV LLPSLSLMDC NACMSEELWG MFKTFPYQHR YRLYGQWKNE TYNGHPLLVK
VKAQTIDRAK YIMKRLTKEN VKPSGRQIGK LSHSNPTILF DYILSQIQKY DNLITPVVDS
LKYLTSLNYD VLAYCIIEAL ANPEKERMKH DDTTISSWLQ SLASFCGAVF RKYPIDLAGL
LQYVANQLKA GKSFDLLILK EVVQKMAGIE ITEEMTMEQL EAMTGGEQLK AEGGYFGQIR
NTKKSSQRLK DALLDHDLAL PLCLLMAQQR NGVIFQEGGE KHLKLVGKLY DQCHDTLVQF
GGFLASNLST EDYIKRVPSI DVLCNEFHTP HDAAFFLSRP MYAHHISSKY DELKKSEKGS
KQQHKVHKYI TSCEMVMAPV HEAVVSLHVS KVWDDISPQF YATFWSLTMY DLAVPHTSYE
REVNKLKVQM KAIDDNQEMP PNKKKKEKER CTALQDKLLE EEKKQMEHVQ RVLQRLKLEK
DNWLLAKSTK NETITKFLQL CIFPRCIFSA IDAVYCARFV ELVHQQKTPN FSTLLCYDRV
FSDIIYTVAS CTENEASRYG RFLCCMLETV TRWHSDRATY EKECGNYPGF LTILRATGFD
GGNKADQLDY ENFRHVVHKW HYKLTKASVH CLETGEYTHI RNILIVLTKI LPWYPKVLNL
GQALERRVNK ICQEEKEKRP DLYALAMGYS GQLKSRKSHM IPENEFHHKD PPPRNAVASV
QNGPGGGTSS SSIGNASKSD ESGAEETDKS RERSQCGTKA VNKASSTTPK GNSSNGNSGS
NSNKAVKEND KEKVKEKEKE KKEKTPATTP EARALGKDSK EKPKEERPNK EDKARETKER
TPKSDKEKEK FKKEEKAKDE KFKTTVPIVE SKSTQERERE KEPSRERDVA KEMKSKENVK
GGEKTPVSGS LKSPVPRSDI SEPDREQKRR KIDSHPSPSH SSTVKDSLID LKDSSAKLYI
NHNPPPLSKS KEREMDKKDL DKSRERSRER EKKDEKDRKE RKRDHSNNDR EVPPDITKRR
KEENGTMGVS KHKSESPCES QYPNEKDKEK NKSKSSGKEK SSSDSFKSEK MDKISSGGKK
ESRHDKEKIE KKEKRDSSGG KEEKKHHKSS DKHR