ID SET1_CANAL Reviewed; 1040 AA.
AC Q5ABG1; A0A1D8PCE5;
DT 09-JAN-2007, integrated into UniProtKB/Swiss-Prot.
DT 26-APR-2005, sequence version 1.
DT 03-AUG-2022, entry version 116.
DE RecName: Full=Histone-lysine N-methyltransferase, H3 lysine-4 specific;
DE EC=2.1.1.354 {ECO:0000250|UniProtKB:P38827};
DE AltName: Full=COMPASS component SET1;
DE AltName: Full=SET domain-containing protein 1;
GN Name=SET1; OrderedLocusNames=CAALFM_C100960CA;
GN ORFNames=CaO19.13430, CaO19.6009;
OS Candida albicans (strain SC5314 / ATCC MYA-2876) (Yeast).
OC Eukaryota; Fungi; Dikarya; Ascomycota; Saccharomycotina; Saccharomycetes;
OC Saccharomycetales; Debaryomycetaceae; Candida/Lodderomyces clade; Candida.
OX NCBI_TaxID=237561;
RN [1]
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RC STRAIN=SC5314 / ATCC MYA-2876;
RX PubMed=15123810; DOI=10.1073/pnas.0401648101;
RA Jones T., Federspiel N.A., Chibana H., Dungan J., Kalman S., Magee B.B.,
RA Newport G., Thorstenson Y.R., Agabian N., Magee P.T., Davis R.W.,
RA Scherer S.;
RT "The diploid genome sequence of Candida albicans.";
RL Proc. Natl. Acad. Sci. U.S.A. 101:7329-7334(2004).
RN [2]
RP GENOME REANNOTATION.
RC STRAIN=SC5314 / ATCC MYA-2876;
RX PubMed=17419877; DOI=10.1186/gb-2007-8-4-r52;
RA van het Hoog M., Rast T.J., Martchenko M., Grindle S., Dignard D.,
RA Hogues H., Cuomo C., Berriman M., Scherer S., Magee B.B., Whiteway M.,
RA Chibana H., Nantel A., Magee P.T.;
RT "Assembly of the Candida albicans genome into sixteen supercontigs aligned
RT on the eight chromosomes.";
RL Genome Biol. 8:RESEARCH52.1-RESEARCH52.12(2007).
RN [3]
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA], AND GENOME REANNOTATION.
RC STRAIN=SC5314 / ATCC MYA-2876;
RX PubMed=24025428; DOI=10.1186/gb-2013-14-9-r97;
RA Muzzey D., Schwartz K., Weissman J.S., Sherlock G.;
RT "Assembly of a phased diploid Candida albicans genome facilitates allele-
RT specific measurements and provides a simple model for repeat and indel
RT structure.";
RL Genome Biol. 14:RESEARCH97.1-RESEARCH97.14(2013).
RN [4]
RP FUNCTION.
RX PubMed=16629671; DOI=10.1111/j.1365-2958.2006.05121.x;
RA Raman S.B., Nguyen M.H., Zhang Z., Cheng S., Jia H.Y., Weisner N.,
RA Iczkowski K., Clancy C.J.;
RT "Candida albicans SET1 encodes a histone 3 lysine 4 methyltransferase that
RT contributes to the pathogenesis of invasive candidiasis.";
RL Mol. Microbiol. 60:697-709(2006).
CC -!- FUNCTION: Catalytic component of the COMPASS (Set1C) complex that
CC specifically mono-, di- and trimethylates histone H3 to form
CC H3K4me1/2/3, which subsequently plays a role in telomere length
CC maintenance, transcription elongation regulation and pathogenesis of
CC invasive candidiasis. {ECO:0000269|PubMed:16629671}.
CC -!- CATALYTIC ACTIVITY:
CC Reaction=L-lysyl(4)-[histone H3] + 3 S-adenosyl-L-methionine = 3 H(+) +
CC N(6),N(6),N(6)-trimethyl-L-lysyl(4)-[histone H3] + 3 S-adenosyl-L-
CC homocysteine; Xref=Rhea:RHEA:60260, Rhea:RHEA-COMP:15537, Rhea:RHEA-
CC COMP:15547, ChEBI:CHEBI:15378, ChEBI:CHEBI:29969, ChEBI:CHEBI:57856,
CC ChEBI:CHEBI:59789, ChEBI:CHEBI:61961; EC=2.1.1.354;
CC Evidence={ECO:0000250|UniProtKB:P38827};
CC -!- SUBUNIT: Component of the COMPASS (Set1C) complex. {ECO:0000250}.
CC -!- SUBCELLULAR LOCATION: Nucleus {ECO:0000305}. Chromosome {ECO:0000305}.
CC -!- SIMILARITY: Belongs to the class V-like SAM-binding methyltransferase
CC superfamily. {ECO:0000255|PROSITE-ProRule:PRU00190}.
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR EMBL; CP017623; AOW25793.1; -; Genomic_DNA.
DR RefSeq; XP_718971.1; XM_713878.1.
DR AlphaFoldDB; Q5ABG1; -.
DR SMR; Q5ABG1; -.
DR ELM; Q5ABG1; -.
DR STRING; 237561.Q5ABG1; -.
DR PRIDE; Q5ABG1; -.
DR GeneID; 3639280; -.
DR KEGG; cal:CAALFM_C100960CA; -.
DR CGD; CAL0000198993; SET1.
DR VEuPathDB; FungiDB:C1_00960C_A; -.
DR eggNOG; KOG1080; Eukaryota.
DR HOGENOM; CLU_004391_1_0_1; -.
DR InParanoid; Q5ABG1; -.
DR OMA; LHQPLNT; -.
DR OrthoDB; 1017537at2759; -.
DR PHI-base; PHI:2825; -.
DR PRO; PR:Q5ABG1; -.
DR Proteomes; UP000000559; Chromosome 1.
DR GO; GO:0005694; C:chromosome; IEA:UniProtKB-SubCell.
DR GO; GO:0048188; C:Set1C/COMPASS complex; IBA:GO_Central.
DR GO; GO:0042800; F:histone methyltransferase activity (H3-K4 specific); IMP:CGD.
DR GO; GO:0003676; F:nucleic acid binding; IEA:InterPro.
DR GO; GO:0048869; P:cellular developmental process; IMP:CGD.
DR GO; GO:0006325; P:chromatin organization; IEA:UniProtKB-KW.
DR GO; GO:0030447; P:filamentous growth; IMP:CGD.
DR GO; GO:0051568; P:histone H3-K4 methylation; IMP:CGD.
DR GO; GO:0044416; P:induction by symbiont of host defense response; IDA:CGD.
DR GO; GO:0036166; P:phenotypic switching; IMP:CGD.
DR Gene3D; 2.170.270.10; -; 1.
DR Gene3D; 3.30.70.330; -; 1.
DR InterPro; IPR024657; COMPASS_Set1_N-SET.
DR InterPro; IPR012677; Nucleotide-bd_a/b_plait_sf.
DR InterPro; IPR003616; Post-SET_dom.
DR InterPro; IPR035979; RBD_domain_sf.
DR InterPro; IPR044570; Set1-like.
DR InterPro; IPR017111; Set1_fungi.
DR InterPro; IPR024636; SET_assoc.
DR InterPro; IPR001214; SET_dom.
DR InterPro; IPR046341; SET_dom_sf.
DR PANTHER; PTHR45814; PTHR45814; 2.
DR Pfam; PF11764; N-SET; 1.
DR Pfam; PF00856; SET; 1.
DR Pfam; PF11767; SET_assoc; 1.
DR PIRSF; PIRSF037104; Histone_H3-K4_mtfrase_Set1_fun; 1.
DR SMART; SM01291; N-SET; 1.
DR SMART; SM00508; PostSET; 1.
DR SMART; SM00317; SET; 1.
DR SUPFAM; SSF54928; SSF54928; 1.
DR SUPFAM; SSF82199; SSF82199; 1.
DR PROSITE; PS50868; POST_SET; 1.
DR PROSITE; PS51572; SAM_MT43_1; 1.
DR PROSITE; PS50280; SET; 1.
PE 3: Inferred from homology;
KW Chromatin regulator; Chromosome; Methyltransferase; Nucleus;
KW Reference proteome; S-adenosyl-L-methionine; Transferase.
FT CHAIN 1..1040
FT /note="Histone-lysine N-methyltransferase, H3 lysine-4
FT specific"
FT /id="PRO_0000269767"
FT DOMAIN 898..1015
FT /note="SET"
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00190"
FT DOMAIN 1024..1040
FT /note="Post-SET"
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00155"
FT REGION 1..129
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 175..202
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 353..374
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 591..664
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 692..737
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 50..69
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 82..122
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 181..196
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 628..644
FT /note="Acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 648..662
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 696..713
FT /note="Acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
SQ SEQUENCE 1040 AA; 119161 MW; 30A4796C4C7B0160 CRC64;
MSYNNRSGGG ASGGYSRRGY HGSHRGGYRT GRSKYPEDRY LVGGMLSLNK GSHYESSDNR
YIPNEIGSKS PENRSHRSST KDGRTPSGLS TPLSSSDKVS TPISIESING SDRNTGVNNK
DSEFPKLSHH SDFTSTIPFS RSINPQKNFM VINDSHTPKT DKGIQSKKIR YNGEGVNHVS
DPRIAQSNSN LQKPTKKTKK TPYKQLPQPK FVYNSDSLGP APMSTIIIWD LPISTSEPFL
RNFVSRYGNP LEEMTFITDP TTAVPLGIVT FKFQGNPQKA SELAKNFIKT VRQDELKIDG
ATLKIALNDN ENQLLNRKLE SAKKKMLQQR LQREQEEEKR RQKLVEEQKK QELLKKKEKE
HQESVKKEKS VEHESTIVST RDKNLVYKPN STVLSMRHNH KIISSVILPK DLEKYIKSRP
YILIRDKYVP TKKISSHDIK RALKKYDWTR VLSDKSGFFI VFNSLNECER CFLNEDNKKF
FEYKLVMEMA IPEGFTNNIR ENESKSTNDV LDEATNILIK EFQTFLAKDI RERIIAPNIL
DLLAHDKYPE LVEELKSREQ AAKPKVLVTN NQLKENALSI LEKQRQLFQQ RLPSFRMSHD
RTQQHKPKRR NSIIPMQHAL NFDDDEDSES HSQSESEDED EDETTASRPL TPVVSTMKRE
RSSTITSIED DIELEEREIK KQKVKVPAIE AEIAPESSPE EGEEEEKEEV EIKQEAEEVD
IKFQPTEESP RTVYPEIPFS GDFDLNALQH TIKDSEDLLL AQEVLSETTP SGLSNIEYWS
WKSKNRKDVQ EISQEEEYIE ELPESLQSTT GSFKSEGVRK IPEIEKIGYL PHRKRTNKPI
KTIQYEDEDE EKPNENTNAV QSSRVNRANN RRFAADITAQ IGSESDVLSL NALTKRKKPV
TFARSAIHNW GLYAMEPIAA KEMIIEYVGE RIRQQVAEHR EKSYLKTGIG SSYLFRIDDN
TVIDATKKGG IARFINHCCS PSCTAKIIKV EGKKRIVIYA LRDIEANEEL TYDYKFERET
NDEERIRCLC GAPGCKGYLN