CFP1_CAEEL
ID CFP1_CAEEL Reviewed; 508 AA.
AC Q9XUE7; D3YT77; E3W754; E3W755; E3W756; E3W757; Q86D13;
DT 02-DEC-2020, integrated into UniProtKB/Swiss-Prot.
DT 01-JUN-2003, sequence version 3.
DT 03-AUG-2022, entry version 134.
DE RecName: Full=CXXC-type zinc finger protein 1 {ECO:0000250|UniProtKB:Q9P0U4, ECO:0000303|PubMed:24653213};
GN Name=cfp-1 {ECO:0000312|WormBase:F52B11.1a};
GN ORFNames=F52B11.1 {ECO:0000312|WormBase:F52B11.1a};
OS Caenorhabditis elegans.
OC Eukaryota; Metazoa; Ecdysozoa; Nematoda; Chromadorea; Rhabditida;
OC Rhabditina; Rhabditomorpha; Rhabditoidea; Rhabditidae; Peloderinae;
OC Caenorhabditis.
OX NCBI_TaxID=6239 {ECO:0000312|Proteomes:UP000001940};
RN [1] {ECO:0000312|Proteomes:UP000001940}
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RC STRAIN=Bristol N2 {ECO:0000312|Proteomes:UP000001940};
RX PubMed=9851916; DOI=10.1126/science.282.5396.2012;
RG The C. elegans sequencing consortium;
RT "Genome sequence of the nematode C. elegans: a platform for investigating
RT biology.";
RL Science 282:2012-2018(1998).
RN [2] {ECO:0000305}
RP FUNCTION.
RX PubMed=24653213; DOI=10.1101/gr.161992.113;
RA Chen R.A., Stempor P., Down T.A., Zeiser E., Feuer S.K., Ahringer J.;
RT "Extreme HOT regions are CpG-dense promoters in C. elegans and humans.";
RL Genome Res. 24:1138-1146(2014).
RN [3] {ECO:0000305}
RP FUNCTION.
RX PubMed=30941832; DOI=10.1111/febs.14833;
RA Pokhrel B., Chen Y., Biro J.J.;
RT "CFP-1 interacts with HDAC1/2 complexes in C. elegans development.";
RL FEBS J. 286:2490-2504(2019).
RN [4] {ECO:0000305}
RP FUNCTION, IDENTIFICATION IN THE SET2 COMPLEX, INTERACTION WITH WDR-5.1;
RP ASH-2; DPY-30; HDA-1; SIN-3 AND MRG-1, AND ASSOCIATION WITH THE SIN3S
RP COMPLEX.
RX PubMed=31602465; DOI=10.1093/nar/gkz880;
RA Beurton F., Stempor P., Caron M., Appert A., Dong Y., Chen R.A., Cluet D.,
RA Coute Y., Herbette M., Huang N., Polveche H., Spichty M., Bedet C.,
RA Ahringer J., Palladino F.;
RT "Physical and functional interaction between SET1/COMPASS complex component
RT CFP-1 and a Sin3S HDAC complex in C. elegans.";
RL Nucleic Acids Res. 47:11164-11180(2019).
CC -!- FUNCTION: Transcriptional activator that exhibits a unique DNA binding
CC specificity for CpG motifs; enriched at promoters containing the
CC trimethylation mark on histone H3 'Lys-4' (H3K4me3) (PubMed:24653213).
CC Forms part of the SET2 complex and interacts with the SIN3S HDAC
CC complex at promoters (PubMed:31602465). Required for H3K4
CC trimethylation and plays a repressive role in the expression of heat
CC shock and salt-inducible genes (PubMed:30941832). Required for
CC fertility, in cooperation with class I histone deacetylases (HDACs)
CC (PubMed:24653213, PubMed:30941832, PubMed:31602465).
CC {ECO:0000269|PubMed:24653213, ECO:0000269|PubMed:30941832,
CC ECO:0000269|PubMed:31602465}.
CC -!- SUBUNIT: Component of the SET2 complex (also known as the SET1/COMPASS
CC complex), which contains at least set-2, swd-2.1, cfp-1, rbbp-5,
CC wdr-5.1, dpy-30 and ash-2 (PubMed:31602465). Within the complex, interacts
CC with wdr-5.1, ash-2 and dpy-30 (PubMed:31602465). Also interacts with
CC the SIN3S complex, which contains at least sin-3, hda-1, athp-1 and
CC mrg-1 (PubMed:31602465). Interacts with sin-3, hda-1 and mrg-1
CC (PubMed:31602465). {ECO:0000269|PubMed:31602465}.
CC -!- SUBCELLULAR LOCATION: Nucleus {ECO:0000305|PubMed:24653213}.
CC -!- ALTERNATIVE PRODUCTS:
CC Event=Alternative splicing; Named isoforms=7;
CC Name=a {ECO:0000312|WormBase:F52B11.1a};
CC IsoId=Q9XUE7-1; Sequence=Displayed;
CC Name=b {ECO:0000312|WormBase:F52B11.1b};
CC IsoId=Q9XUE7-2; Sequence=VSP_060819;
CC Name=c {ECO:0000312|WormBase:F52B11.1c};
CC IsoId=Q9XUE7-3; Sequence=VSP_060818;
CC Name=d {ECO:0000312|WormBase:F52B11.1d};
CC IsoId=Q9XUE7-4; Sequence=VSP_060820;
CC Name=e {ECO:0000312|WormBase:F52B11.1e};
CC IsoId=Q9XUE7-5; Sequence=VSP_060817;
CC Name=f {ECO:0000312|WormBase:F52B11.1f};
CC IsoId=Q9XUE7-6; Sequence=VSP_060816;
CC Name=g {ECO:0000312|WormBase:F52B11.1g};
CC IsoId=Q9XUE7-7; Sequence=VSP_060815;
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR EMBL; BX284604; CAB05197.3; -; Genomic_DNA.
DR EMBL; BX284604; CAD89737.1; -; Genomic_DNA.
DR EMBL; BX284604; CBK19450.1; -; Genomic_DNA.
DR EMBL; BX284604; CBX53323.1; -; Genomic_DNA.
DR EMBL; BX284604; CBX53324.1; -; Genomic_DNA.
DR EMBL; BX284604; CBX53325.1; -; Genomic_DNA.
DR EMBL; BX284604; CBX53326.1; -; Genomic_DNA.
DR PIR; T22484; T22484.
DR RefSeq; NP_001023214.1; NM_001028043.3.
DR RefSeq; NP_001023215.1; NM_001028044.3.
DR RefSeq; NP_001255781.1; NM_001268852.1.
DR RefSeq; NP_001255782.1; NM_001268853.1.
DR RefSeq; NP_001255783.1; NM_001268854.1.
DR RefSeq; NP_001255784.1; NM_001268855.1. [Q9XUE7-6]
DR RefSeq; NP_001255785.1; NM_001268856.1. [Q9XUE7-7]
DR AlphaFoldDB; Q9XUE7; -.
DR SMR; Q9XUE7; -.
DR DIP; DIP-26307N; -.
DR STRING; 6239.F52B11.1a.1; -.
DR EPD; Q9XUE7; -.
DR PaxDb; Q9XUE7; -.
DR PeptideAtlas; Q9XUE7; -.
DR EnsemblMetazoa; F52B11.1a.1; F52B11.1a.1; WBGene00009924. [Q9XUE7-1]
DR EnsemblMetazoa; F52B11.1a.2; F52B11.1a.2; WBGene00009924. [Q9XUE7-1]
DR EnsemblMetazoa; F52B11.1b.1; F52B11.1b.1; WBGene00009924. [Q9XUE7-2]
DR EnsemblMetazoa; F52B11.1c.1; F52B11.1c.1; WBGene00009924. [Q9XUE7-3]
DR EnsemblMetazoa; F52B11.1d.1; F52B11.1d.1; WBGene00009924. [Q9XUE7-4]
DR EnsemblMetazoa; F52B11.1e.1; F52B11.1e.1; WBGene00009924. [Q9XUE7-5]
DR EnsemblMetazoa; F52B11.1f.1; F52B11.1f.1; WBGene00009924. [Q9XUE7-6]
DR EnsemblMetazoa; F52B11.1g.1; F52B11.1g.1; WBGene00009924. [Q9XUE7-7]
DR GeneID; 178363; -.
DR UCSC; F52B11.1a.3; c. elegans.
DR CTD; 178363; -.
DR WormBase; F52B11.1a; CE33791; WBGene00009924; cfp-1.
DR WormBase; F52B11.1b; CE33792; WBGene00009924; cfp-1.
DR WormBase; F52B11.1c; CE44575; WBGene00009924; cfp-1.
DR WormBase; F52B11.1d; CE45513; WBGene00009924; cfp-1.
DR WormBase; F52B11.1e; CE45491; WBGene00009924; cfp-1.
DR WormBase; F52B11.1f; CE45418; WBGene00009924; cfp-1.
DR WormBase; F52B11.1g; CE45430; WBGene00009924; cfp-1.
DR eggNOG; KOG1632; Eukaryota.
DR GeneTree; ENSGT00940000169358; -.
DR HOGENOM; CLU_041345_0_0_1; -.
DR InParanoid; Q9XUE7; -.
DR OMA; CGMETNG; -.
DR OrthoDB; 1072172at2759; -.
DR PhylomeDB; Q9XUE7; -.
DR Proteomes; UP000001940; Chromosome IV.
DR Bgee; WBGene00009924; Expressed in embryo and 4 other tissues.
DR ExpressionAtlas; Q9XUE7; baseline and differential.
DR GO; GO:0048188; C:Set1C/COMPASS complex; IBA:GO_Central.
DR GO; GO:0003677; F:DNA binding; IEA:UniProtKB-KW.
DR GO; GO:0046872; F:metal ion binding; IEA:UniProtKB-KW.
DR GO; GO:0035064; F:methylated histone binding; IBA:GO_Central.
DR GO; GO:0051568; P:histone H3-K4 methylation; IBA:GO_Central.
DR GO; GO:0045893; P:positive regulation of transcription, DNA-templated; IBA:GO_Central.
DR GO; GO:0060290; P:transdifferentiation; IMP:WormBase.
DR InterPro; IPR022056; CpG-bd_C.
DR InterPro; IPR037869; Spp1/CFP1.
DR PANTHER; PTHR46174; PTHR46174; 1.
DR Pfam; PF12269; CpG_bind_C; 1.
PE 1: Evidence at protein level;
KW Alternative splicing; DNA-binding; Metal-binding; Nucleus;
KW Reference proteome; Transcription; Transcription regulation; Zinc;
KW Zinc-finger.
FT CHAIN 1..508
FT /note="CXXC-type zinc finger protein 1"
FT /id="PRO_0000451595"
FT ZN_FING 10..47
FT /note="CXXC-type"
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00509"
FT REGION 95..156
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 453..508
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 139..154
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 453..481
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT VAR_SEQ 1..447
FT /note="Missing (in isoform g)"
FT /evidence="ECO:0000305"
FT /id="VSP_060815"
FT VAR_SEQ 1..410
FT /note="Missing (in isoform f)"
FT /evidence="ECO:0000305"
FT /id="VSP_060816"
FT VAR_SEQ 1..310
FT /note="Missing (in isoform e)"
FT /evidence="ECO:0000305"
FT /id="VSP_060817"
FT VAR_SEQ 1..187
FT /note="Missing (in isoform c)"
FT /evidence="ECO:0000305"
FT /id="VSP_060818"
FT VAR_SEQ 1..151
FT /note="MSNKEIEDNEDVWKERCMNCIRCNDEKNCGTCWPCRNGKTCDMRKCFSAKRL
FT YNEKVKRQTDENLKAIMAKTAQREAAHQAATTTAPSAPVVIEQQVEKKKRGRKKGSGNG
FT GAAAAAQQRKANIINERDYVPNRPTRQQSADLRRKRTQLN -> MKRIVEHVGRVEMEK
FT LVICGNVSQRKDYIMRK (in isoform b)"
FT /evidence="ECO:0000305"
FT /id="VSP_060819"
FT VAR_SEQ 1..68
FT /note="Missing (in isoform d)"
FT /evidence="ECO:0000305"
FT /id="VSP_060820"
SQ SEQUENCE 508 AA; 58082 MW; 2D3B11A0C3F0CEE6 CRC64;
MSNKEIEDNE DVWKERCMNC IRCNDEKNCG TCWPCRNGKT CDMRKCFSAK RLYNEKVKRQ
TDENLKAIMA KTAQREAAHQ AATTTAPSAP VVIEQQVEKK KRGRKKGSGN GGAAAAAQQR
KANIINERDY VPNRPTRQQS ADLRRKRTQL NAEPDKHPRQ CLNPNCIYES RIDSKYCSDE
CGKELARMRL TEILPNRCKQ YFFEGPSGGP RSLEDEIKPK RAKINREVQK LTESEKNMMA
FLNKLVEFIK TQLKLQPLGT EERYDDNLYE GCIVCGLPDI PLLKYTKHIE LCWARSEKAI
SFGAPEKNND MFYCEKYDSR TNSFCKRLKS LCPEHRKLGD EQHLKVCGYP KKWEDGMIET
AKTVSELIEM EDPFGEEGCR TKKDACHKHH KWIPSLRGTI ELEQACLFQK MYELCHEMHK
LNAHAEWTTN ALSIMMHKQP NIIDSEQMSL FNKSQSTSSS ASAHGATTPI SSTSSSSSSS
SKNDDEMEDT AEFLANLAVQ KEEETQNN