NANOG_MACFA

ID   NANOG_MACFA             Reviewed;         305 AA.
AC   Q5TM84;
DT   28-NOV-2006, integrated into UniProtKB/Swiss-Prot.
DT   21-DEC-2004, sequence version 1.
DT   03-AUG-2022, entry version 89.
DE   RecName: Full=Homeobox protein NANOG;
DE   AltName: Full=Homeobox transcription factor Nanog;
GN   Name=NANOG; Synonyms=STM1;
OS   Macaca fascicularis (Crab-eating macaque) (Cynomolgus monkey).
OC   Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia;
OC   Eutheria; Euarchontoglires; Primates; Haplorrhini; Catarrhini;
OC   Cercopithecidae; Cercopithecinae; Macaca.
OX   NCBI_TaxID=9541;
RN   [1]
RP   NUCLEOTIDE SEQUENCE [MRNA], AND DEVELOPMENTAL STAGE.
RC   TISSUE=Embryonic stem cell;
RX   PubMed=15582778; DOI=10.1016/j.mod.2004.08.008;
RA   Hatano S.Y., Tada M., Kimura H., Yamaguchi S., Kono T., Nakano T.,
RA   Suemori H., Nakatsuji N., Tada T.;
RT   "Pluripotential competence of cells associated with Nanog activity.";
RL   Mech. Dev. 122:67-79(2005).
CC   -!- FUNCTION: Transcription regulator involved in inner cell mass and
CC       embryonic stem (ES) cell proliferation and self-renewal. Imposes
CC       pluripotency on ES cells and prevents their differentiation towards
CC       extraembryonic endoderm and trophectoderm lineages. Blocks bone
CC       morphogenetic protein-induced mesoderm differentiation of ES cells by
CC       physically interacting with SMAD1 and interfering with the recruitment
CC       of coactivators to the active SMAD transcriptional complexes. Acts as a
CC       transcriptional activator or repressor. Binds optimally to the DNA
CC       consensus sequence 5'-TAAT[GT][GT]-3' or 5'-[CG][GA][CG]C[GC]ATTAN[GC]-
CC       3'. Binds to the POU5F1/OCT4 promoter. Able to autorepress its
CC       expression in differentiating ES cells: binds to its own promoter
CC       following interaction with ZNF281/ZFP281, leading to recruitment of the
CC       NuRD complex and subsequent repression of expression. When
CC       overexpressed, promotes entry of cells into S phase and proliferation
CC       (By similarity). {ECO:0000250, ECO:0000250|UniProtKB:Q80Z64,
CC       ECO:0000250|UniProtKB:Q9H9S0}.
CC   -!- SUBUNIT: Interacts with SMAD1. Interacts with SALL4. Interacts with
CC       ZNF281/ZFP281 (By similarity). Interacts with PCGF1 (By similarity).
CC       Interacts with ESRRB; reciprocally modulates their transcriptional
CC       activities. Interacts with NSD2 (By similarity).
CC       {ECO:0000250|UniProtKB:Q80Z64, ECO:0000250|UniProtKB:Q9H9S0}.
CC   -!- SUBCELLULAR LOCATION: Nucleus {ECO:0000255|PROSITE-ProRule:PRU00108}.
CC   -!- DEVELOPMENTAL STAGE: Expressed in embryonic stem (ES) cells.
CC       {ECO:0000269|PubMed:15582778}.
CC   -!- SIMILARITY: Belongs to the Nanog homeobox family. {ECO:0000305}.
CC   ---------------------------------------------------------------------------
CC   Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC   Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC   ---------------------------------------------------------------------------
DR   EMBL; AB126938; BAD72891.1; -; mRNA.
DR   RefSeq; NP_001274577.1; NM_001287648.1.
DR   AlphaFoldDB; Q5TM84; -.
DR   SMR; Q5TM84; -.
DR   STRING; 9541.XP_005570081.1; -.
DR   Ensembl; ENSMFAT00000035085; ENSMFAP00000020930; ENSMFAG00000033399.
DR   GeneID; 102114964; -.
DR   CTD; 79923; -.
DR   eggNOG; KOG0491; Eukaryota.
DR   GeneTree; ENSGT00670000098076; -.
DR   OrthoDB; 1141558at2759; -.
DR   Proteomes; UP000233100; Chromosome 11.
DR   GO; GO:0005634; C:nucleus; IEA:UniProtKB-SubCell.
DR   GO; GO:0003700; F:DNA-binding transcription factor activity; ISS:UniProtKB.
DR   GO; GO:0001227; F:DNA-binding transcription repressor activity, RNA polymerase II-specific; ISS:UniProtKB.
DR   GO; GO:0000976; F:transcription cis-regulatory region binding; ISS:UniProtKB.
DR   GO; GO:0019827; P:stem cell population maintenance; ISS:UniProtKB.
DR   CDD; cd00086; homeodomain; 1.
DR   InterPro; IPR009057; Homeobox-like_sf.
DR   InterPro; IPR017970; Homeobox_CS.
DR   InterPro; IPR001356; Homeobox_dom.
DR   Pfam; PF00046; Homeodomain; 1.
DR   SMART; SM00389; HOX; 1.
DR   SUPFAM; SSF46689; SSF46689; 1.
DR   PROSITE; PS00027; HOMEOBOX_1; 1.
DR   PROSITE; PS50071; HOMEOBOX_2; 1.
PE   2: Evidence at transcript level;
KW   Activator; Developmental protein; DNA-binding; Homeobox; Nucleus;
KW   Reference proteome; Repeat; Repressor; Transcription;
KW   Transcription regulation.
FT   CHAIN           1..305
FT                   /note="Homeobox protein NANOG"
FT                   /id="PRO_0000261421"
FT   REPEAT          196..200
FT                   /note="1"
FT   REPEAT          201..205
FT                   /note="2"
FT   REPEAT          206..210
FT                   /note="3"
FT   REPEAT          216..220
FT                   /note="4"
FT   REPEAT          221..225
FT                   /note="5"
FT   REPEAT          226..230
FT                   /note="6"
FT   REPEAT          231..235
FT                   /note="7"
FT   REPEAT          236..240
FT                   /note="8"
FT   REGION          1..95
FT                   /note="Disordered"
FT                   /evidence="ECO:0000256|SAM:MobiDB-lite"
FT   REGION          122..151
FT                   /note="Required for DNA-binding"
FT                   /evidence="ECO:0000250|UniProtKB:Q9H9S0"
FT   REGION          196..240
FT                   /note="8 X repeats starting with a Trp in each unit"
FT   REGION          196..240
FT                   /note="Sufficient for transactivation activity"
FT                   /evidence="ECO:0000250"
FT   REGION          241..305
FT                   /note="Sufficient for strong transactivation activity"
FT                   /evidence="ECO:0000250"
FT   COMPBIAS        35..81
FT                   /note="Polar residues"
FT                   /evidence="ECO:0000256|SAM:MobiDB-lite"
SQ   SEQUENCE   305 AA;  34153 MW;  3FC6FD226DF5E998 CRC64;
     MSVDPACPQS LPCLEASDSK ESSPMPVICG PEENYPSLQM SSAEMPHTET VSPLPSSMDL
     LIQDSPDSST SPKGKQPTAA ENSATKKEDK VPVKKQKART VFSSAQLCVL NDRFQRQKYL
     SLQQMQELSN ILNLSYKQVK TWFQNQRMKS KRWQKNNWPK NSNGVTQKAS APTYPSLYSS
     CHQGCLVNPT GNLPMWSNQT WNNSSWSNQT QNIQSWSNHS WNAQTWCTQS WNNQAWNSPF
     SNCGEESLQS CLQFQPNSPA SDLEAALEAA GEGLNVIQQT TRYLSTPQTV DLLLNYSTNM
     QPEDV
//