H31_ENCCU

ID   H31_ENCCU               Reviewed;         144 AA.
AC   Q8SS77;
DT   31-AUG-2004, integrated into UniProtKB/Swiss-Prot.
DT   23-JAN-2007, sequence version 3.
DT   03-AUG-2022, entry version 104.
DE   RecName: Full=Histone H3.1;
GN   Name=HHT1; OrderedLocusNames=ECU03_1460;
OS   Encephalitozoon cuniculi (strain GB-M1) (Microsporidian parasite).
OC   Eukaryota; Fungi; Fungi incertae sedis; Microsporidia; Unikaryonidae;
OC   Encephalitozoon.
OX   NCBI_TaxID=284813;
RN   [1]
RP   NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RC   STRAIN=GB-M1;
RX   PubMed=11719806; DOI=10.1038/35106579;
RA   Katinka M.D., Duprat S., Cornillot E., Metenier G., Thomarat F.,
RA   Prensier G., Barbe V., Peyretaillade E., Brottier P., Wincker P.,
RA   Delbac F., El Alaoui H., Peyret P., Saurin W., Gouy M., Weissenbach J.,
RA   Vivares C.P.;
RT   "Genome sequence and gene compaction of the eukaryote parasite
RT   Encephalitozoon cuniculi.";
RL   Nature 414:450-453(2001).
CC   -!- FUNCTION: Core component of nucleosome. Nucleosomes wrap and compact
CC       DNA into chromatin, limiting DNA accessibility to the cellular
CC       machineries which require DNA as a template. Histones thereby play a
CC       central role in transcription regulation, DNA repair, DNA replication
CC       and chromosomal stability. DNA accessibility is regulated via a complex
CC       set of post-translational modifications of histones, also called
CC       histone code, and nucleosome remodeling.
CC   -!- SUBUNIT: The nucleosome is a histone octamer containing two molecules
CC       each of H2A, H2B, H3 and H4 assembled in one H3-H4 heterotetramer and
CC       two H2A-H2B heterodimers. The octamer wraps approximately 147 bp of
CC       DNA.
CC   -!- SUBCELLULAR LOCATION: Nucleus {ECO:0000250}. Chromosome {ECO:0000250}.
CC   -!- PTM: Mono-, di- and trimethylated to form H3K4me1/2/3. H3K4me activates
CC       gene expression by regulating transcription elongation and plays a role
CC       in telomere length maintenance. H3K4me enrichment correlates with
CC       transcription levels, and occurs in a 5' to 3' gradient with H3K4me3
CC       enrichment at the 5'-end of genes, shifting to H3K4me2 and then
CC       H3K4me1. H3K36me represses gene expression. {ECO:0000250}.
CC   -!- PTM: Acetylation of histone H3 leads to transcriptional activation.
CC       {ECO:0000250}.
CC   -!- SIMILARITY: Belongs to the histone H3 family. {ECO:0000305}.
CC   -!- CAUTION: To ensure consistency between histone entries, we follow the
CC       'Brno' nomenclature for histone modifications, with positions referring
CC       to those used in the literature for the 'closest' model organism. Due
CC       to slight variations in histone sequences between organisms and to the
CC       presence of initiator methionine in UniProtKB/Swiss-Prot sequences, the
CC       actual positions of modified amino acids in the sequence generally
CC       differ. In this entry the following conventions are used: H3K4me1/2/3 =
CC       mono-, di- and trimethylated Lys-5; H3K9ac = acetylated Lys-10; H3K9me1
CC       = monomethylated Lys-10; H3K14ac = acetylated Lys-15; H3K14me2 =
CC       dimethylated Lys-15; H3K18ac = acetylated Lys-19; H3K18me1 =
CC       monomethylated Lys-19; H3K23ac = acetylated Lys-24; H3K23me1 =
CC       monomethylated Lys-24; H3K27ac = acetylated Lys-28; H3K27me1/2/3 =
CC       mono-, di- and trimethylated Lys-28; H3K36ac = acetylated Lys-39;
CC       H3K36me1/2/3 = mono-, di- and trimethylated Lys-39; H3K56ac =
CC       acetylated Lys-58. {ECO:0000305}.
CC   ---------------------------------------------------------------------------
CC   Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC   Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC   ---------------------------------------------------------------------------
DR   EMBL; AL590443; CAD26289.1; -; Genomic_DNA.
DR   RefSeq; NP_597654.1; NM_001041018.1.
DR   AlphaFoldDB; Q8SS77; -.
DR   SMR; Q8SS77; -.
DR   STRING; 284813.Q8SS77; -.
DR   GeneID; 858816; -.
DR   KEGG; ecu:ECU03_1460; -.
DR   VEuPathDB; MicrosporidiaDB:ECU03_1460; -.
DR   HOGENOM; CLU_078295_4_1_1; -.
DR   InParanoid; Q8SS77; -.
DR   OMA; MPRDINL; -.
DR   OrthoDB; 1564596at2759; -.
DR   Proteomes; UP000000819; Chromosome III.
DR   GO; GO:0000786; C:nucleosome; IEA:UniProtKB-KW.
DR   GO; GO:0005634; C:nucleus; IEA:UniProtKB-SubCell.
DR   GO; GO:0003677; F:DNA binding; IEA:UniProtKB-KW.
DR   GO; GO:0046982; F:protein heterodimerization activity; IEA:InterPro.
DR   GO; GO:0030527; F:structural constituent of chromatin; IEA:InterPro.
DR   Gene3D; 1.10.20.10; -; 1.
DR   InterPro; IPR009072; Histone-fold.
DR   InterPro; IPR007125; Histone_H2A/H2B/H3.
DR   InterPro; IPR000164; Histone_H3/CENP-A.
DR   PANTHER; PTHR11426; PTHR11426; 1.
DR   Pfam; PF00125; Histone; 1.
DR   PRINTS; PR00622; HISTONEH3.
DR   SMART; SM00428; H3; 1.
DR   SUPFAM; SSF47113; SSF47113; 1.
DR   PROSITE; PS00322; HISTONE_H3_1; 1.
PE   3: Inferred from homology;
KW   Acetylation; Chromosome; DNA-binding; Methylation; Nucleosome core;
KW   Nucleus; Reference proteome.
FT   INIT_MET        1
FT                   /note="Removed"
FT                   /evidence="ECO:0000250"
FT   CHAIN           2..144
FT                   /note="Histone H3.1"
FT                   /id="PRO_0000221360"
FT   REGION          1..45
FT                   /note="Disordered"
FT                   /evidence="ECO:0000256|SAM:MobiDB-lite"
FT   MOD_RES         5
FT                   /note="N6,N6,N6-trimethyllysine; alternate"
FT                   /evidence="ECO:0000250"
FT   MOD_RES         5
FT                   /note="N6,N6-dimethyllysine; alternate"
FT                   /evidence="ECO:0000250"
FT   MOD_RES         5
FT                   /note="N6-methyllysine; alternate"
FT                   /evidence="ECO:0000250"
FT   MOD_RES         10
FT                   /note="N6-acetyllysine; alternate"
FT                   /evidence="ECO:0000250"
FT   MOD_RES         10
FT                   /note="N6-methyllysine; alternate"
FT                   /evidence="ECO:0000250"
FT   MOD_RES         15
FT                   /note="N6,N6-dimethyllysine; alternate"
FT                   /evidence="ECO:0000250"
FT   MOD_RES         15
FT                   /note="N6-acetyllysine; alternate"
FT                   /evidence="ECO:0000250"
FT   MOD_RES         19
FT                   /note="N6-acetyllysine; alternate"
FT                   /evidence="ECO:0000250"
FT   MOD_RES         19
FT                   /note="N6-methyllysine; alternate"
FT                   /evidence="ECO:0000250"
FT   MOD_RES         24
FT                   /note="N6-acetyllysine; alternate"
FT                   /evidence="ECO:0000250"
FT   MOD_RES         24
FT                   /note="N6-methyllysine; alternate"
FT                   /evidence="ECO:0000250"
FT   MOD_RES         28
FT                   /note="N6,N6,N6-trimethyllysine; alternate"
FT                   /evidence="ECO:0000250"
FT   MOD_RES         28
FT                   /note="N6,N6-dimethyllysine; alternate"
FT                   /evidence="ECO:0000250"
FT   MOD_RES         28
FT                   /note="N6-acetyllysine; alternate"
FT                   /evidence="ECO:0000250"
FT   MOD_RES         28
FT                   /note="N6-methyllysine; alternate"
FT                   /evidence="ECO:0000250"
FT   MOD_RES         39
FT                   /note="N6,N6,N6-trimethyllysine; alternate"
FT                   /evidence="ECO:0000250"
FT   MOD_RES         39
FT                   /note="N6,N6-dimethyllysine; alternate"
FT                   /evidence="ECO:0000250"
FT   MOD_RES         39
FT                   /note="N6-acetyllysine; alternate"
FT                   /evidence="ECO:0000250"
FT   MOD_RES         39
FT                   /note="N6-methyllysine; alternate"
FT                   /evidence="ECO:0000250"
FT   MOD_RES         58
FT                   /note="N6-acetyllysine"
FT                   /evidence="ECO:0000250"
SQ   SEQUENCE   144 AA;  15997 MW;  F19A4864D89D0B5F CRC64;
     MARTKQSARK TTGGKAPRKQ LSAKSARKGV SPASSAGAKK SRYRPGSVAL KEIRRYQKST
     DFLIRRLPFQ RACRSVVKEC SNATDIRFQG PALASIQEAL EVYLVGLFED AMLCAYHAKR
     VTVFPKDISL VLKLRSRHVK SISD
//