ID HCFC1_MESAU Reviewed; 2090 AA.
AC P51611;
DT 01-OCT-1996, integrated into UniProtKB/Swiss-Prot.
DT 01-OCT-1996, sequence version 1.
DT 03-AUG-2022, entry version 131.
DE RecName: Full=Host cell factor 1 {ECO:0000250|UniProtKB:P51610};
DE Short=HCF {ECO:0000303|PubMed:9087427};
DE Short=HCF-1 {ECO:0000250|UniProtKB:P51610};
DE AltName: Full=C1 factor {ECO:0000250|UniProtKB:P51610};
DE AltName: Full=VCAF {ECO:0000250|UniProtKB:P51610};
DE AltName: Full=VP16 accessory protein {ECO:0000250|UniProtKB:P51610};
DE Contains:
DE RecName: Full=HCF N-terminal chain 1 {ECO:0000250|UniProtKB:P51610};
DE Contains:
DE RecName: Full=HCF N-terminal chain 2 {ECO:0000250|UniProtKB:P51610};
DE Contains:
DE RecName: Full=HCF N-terminal chain 3 {ECO:0000250|UniProtKB:P51610};
DE Contains:
DE RecName: Full=HCF N-terminal chain 4 {ECO:0000250|UniProtKB:P51610};
DE Contains:
DE RecName: Full=HCF N-terminal chain 5 {ECO:0000250|UniProtKB:P51610};
DE Contains:
DE RecName: Full=HCF N-terminal chain 6 {ECO:0000250|UniProtKB:P51610};
DE Contains:
DE RecName: Full=HCF C-terminal chain 1 {ECO:0000250|UniProtKB:P51610};
DE Contains:
DE RecName: Full=HCF C-terminal chain 2 {ECO:0000250|UniProtKB:P51610};
DE Contains:
DE RecName: Full=HCF C-terminal chain 3 {ECO:0000250|UniProtKB:P51610};
DE Contains:
DE RecName: Full=HCF C-terminal chain 4 {ECO:0000250|UniProtKB:P51610};
DE Contains:
DE RecName: Full=HCF C-terminal chain 5 {ECO:0000250|UniProtKB:P51610};
DE Contains:
DE RecName: Full=HCF C-terminal chain 6 {ECO:0000250|UniProtKB:P51610};
GN Name=HCFC1 {ECO:0000250|UniProtKB:P51610};
OS Mesocricetus auratus (Golden hamster).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia;
OC Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea;
OC Cricetidae; Cricetinae; Mesocricetus.
OX NCBI_TaxID=10036;
RN [1]
RP NUCLEOTIDE SEQUENCE [MRNA], FUNCTION, AND MUTAGENESIS OF PRO-134.
RX PubMed=9087427; DOI=10.1101/gad.11.6.726;
RA Goto H., Motomura S., Wilson A.C., Freiman R.N., Nakabeppu Y.,
RA Fukushima K., Fujishima M., Herr W., Nishimoto T.;
RT "A single-point mutation in HCF causes temperature-sensitive cell-cycle
RT arrest and disrupts VP16 function.";
RL Genes Dev. 11:726-737(1997).
CC -!- FUNCTION: Transcriptional coregulator (By similarity). Involved in the
CC control of the cell cycle (PubMed:9087427). Also antagonizes
CC transactivation by ZBTB17 and GABP2; represses ZBTB17 activation of the
CC p15(INK4b) promoter and inhibits its ability to recruit p300 (By
CC similarity). Coactivator for EGR2 and GABP2 (By similarity). Tethers
CC the chromatin modifying Set1/Ash2 histone H3 'Lys-4' methyltransferase
CC (H3K4me) and Sin3 histone deacetylase (HDAC) complexes (involved in the
CC activation and repression of transcription respectively) together (By
CC similarity). As part of the NSL complex it may be involved in
CC acetylation of nucleosomal histone H4 on several lysine residues (By
CC similarity). Recruits KMT2E to E2F1-responsive promoters, promoting
CC transcriptional activation and thereby facilitates G1 to S phase
CC transition (By similarity). Modulates expression of homeobox protein
CC PDX1, perhaps acting in concert with transcription factor E2F1, thereby
CC regulating pancreatic beta-cell growth and glucose-stimulated insulin
CC secretion (By similarity). May negatively modulate transcriptional
CC activity of FOXO3 (By similarity). {ECO:0000250|UniProtKB:D3ZN95,
CC ECO:0000250|UniProtKB:P51610, ECO:0000269|PubMed:9087427}.
CC -!- SUBUNIT: Composed predominantly of six polypeptides ranging from 110 to
CC 150 kDa and a minor 300 kDa polypeptide. The majority of N- and C-
CC terminal cleavage products remain tightly, albeit non-covalently,
CC associated. Interacts with POU2F1, CREB3, ZBTB17, EGR2, E2F4, CREBZF,
CC SP1, GABP2, Sin3 HDAC complex (SIN3A, HDAC1, HDAC2, SUDS3), SAP30,
CC SIN3B and FHL2. Component of an MLL1 complex, composed of at least the
CC core components KMT2A/MLL1, ASH2L, HCFC1, WDR5 and RBBP5, as well as
CC the facultative components BAP18, CHD8, DPY30, E2F6, HCFC2, HSP70,
CC INO80C, KANSL1, LAS1L, MAX, MCRS1, MEN1, MGA, KAT8, PELP1, PHF20,
CC PRP31, RING2, RUVBL1, RUVBL2, SENP3, TAF1, TAF4, TAF6, TAF7, TAF9 and
CC TEX10. Component of a THAP1/THAP3-HCFC1-OGT complex that is required
CC for the regulation of the transcriptional activity of RRM1. Interacts
CC directly with THAP3 (via its HBM). Interacts (via the Kelch-repeat
CC domain) with THAP1 (via the HBM); the interaction recruits HCFC1 to the
CC RRM1 promoter. Interacts directly with OGT; the interaction, which
CC requires the HCFC1 cleavage site domain, glycosylates and promotes the
CC proteolytic processing of HCFC1 and retains OGT in the nucleus.
CC Component of the
CC SET1 complex, at least composed of the catalytic subunit (SETD1A or
CC SETD1B), WDR5, WDR82, RBBP5, ASH2L, CXXC1, HCFC1 and DPY30. Component
CC of the NSL complex at least composed of MOF/KAT8, KANSL1, KANSL2,
CC KANSL3, MCRS1, PHF20, OGT1/OGT, WDR5 and HCFC1. Component of a complex
CC at least composed of ZNF335, HCFC1, CCAR2, EMSY, MKI67, RBBP5, ASH2L
CC and WDR5; the complex is formed as a result of interactions between
CC components of a nuclear receptor-mediated transcription complex and a
CC histone methylation complex (By similarity). Within the complex
CC interacts with ZNF335 (By similarity). Interacts with TET2 and TET3.
CC Interacts with HCFC1R1. Interacts with THAP11. Interacts (via Kelch
CC domain) with KMT2E (via HBM motif). Interacts with E2F1.
CC {ECO:0000250|UniProtKB:P51610, ECO:0000250|UniProtKB:Q61191}.
CC -!- SUBCELLULAR LOCATION: Cytoplasm {ECO:0000250|UniProtKB:P51610}. Nucleus
CC {ECO:0000250|UniProtKB:P51610}. Note=HCFC1R1 modulates its subcellular
CC localization and overexpression of HCFC1R1 leads to accumulation of
CC HCFC1 in the cytoplasm. Non-processed HCFC1 associates with chromatin.
CC Colocalizes with CREB3 and CANX in the ER.
CC {ECO:0000250|UniProtKB:P51610}.
CC -!- DOMAIN: The HCF repeat is a highly specific proteolytic cleavage
CC signal. {ECO:0000250|UniProtKB:P51610}.
CC -!- DOMAIN: The kelch repeats fold into a 6-bladed kelch beta-propeller
CC called the beta-propeller domain which mediates interaction with
CC HCFC1R1. {ECO:0000250|UniProtKB:P51610}.
CC -!- PTM: Proteolytically cleaved at one or several PPCE--THET sites within
CC the HCF repeats. Cleavage is promoted by O-glycosylation (By
CC similarity). Further cleavage of the primary N- and C-terminal chains
CC results in a 'trimming' and accumulation of the smaller chains (By
CC similarity). {ECO:0000250|UniProtKB:P51610}.
CC -!- PTM: O-glycosylated. GlcNAcylation by OGT promotes proteolytic
CC processing. {ECO:0000250|UniProtKB:P51610}.
CC -!- PTM: Ubiquitinated. Lys-1862 and Lys-1863 are ubiquitinated both via
CC 'Lys-48'- and 'Lys-63'-linked polyubiquitin chains. BAP1 mediates
CC deubiquitination of 'Lys-48'-linked polyubiquitin chains;
CC deubiquitination by BAP1 does not seem to stabilize the protein.
CC {ECO:0000250|UniProtKB:P51610}.
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR EMBL; D45419; BAA08258.1; -; mRNA.
DR RefSeq; NP_001268286.1; NM_001281357.1.
DR AlphaFoldDB; P51611; -.
DR BMRB; P51611; -.
DR SMR; P51611; -.
DR STRING; 10036.XP_005086983.1; -.
DR PRIDE; P51611; -.
DR GeneID; 101830931; -.
DR CTD; 3054; -.
DR eggNOG; KOG4152; Eukaryota.
DR OrthoDB; 416932at2759; -.
DR Proteomes; UP000189706; Unplaced.
DR GO; GO:0005737; C:cytoplasm; ISS:UniProtKB.
DR GO; GO:0000123; C:histone acetyltransferase complex; ISS:UniProtKB.
DR GO; GO:0071339; C:MLL1 complex; ISS:UniProtKB.
DR GO; GO:0043025; C:neuronal cell body; ISS:UniProtKB.
DR GO; GO:0005634; C:nucleus; ISS:UniProtKB.
DR GO; GO:0048188; C:Set1C/COMPASS complex; ISS:UniProtKB.
DR GO; GO:0003682; F:chromatin binding; ISS:UniProtKB.
DR GO; GO:0003713; F:transcription coactivator activity; IMP:UniProtKB.
DR GO; GO:0007049; P:cell cycle; IEA:UniProtKB-KW.
DR GO; GO:0006325; P:chromatin organization; IEA:UniProtKB-KW.
DR GO; GO:0043984; P:histone H4-K16 acetylation; ISS:UniProtKB.
DR GO; GO:0043981; P:histone H4-K5 acetylation; ISS:UniProtKB.
DR GO; GO:0043982; P:histone H4-K8 acetylation; ISS:UniProtKB.
DR GO; GO:0000122; P:negative regulation of transcription by RNA polymerase II; ISS:UniProtKB.
DR GO; GO:0045931; P:positive regulation of mitotic cell cycle; IMP:UniProtKB.
DR GO; GO:0050821; P:protein stabilization; ISS:UniProtKB.
DR CDD; cd00063; FN3; 2.
DR Gene3D; 2.120.10.80; -; 2.
DR Gene3D; 2.60.40.10; -; 2.
DR InterPro; IPR003961; FN3_dom.
DR InterPro; IPR036116; FN3_sf.
DR InterPro; IPR037854; HCF1.
DR InterPro; IPR043536; HCF1/2.
DR InterPro; IPR013783; Ig-like_fold.
DR InterPro; IPR015915; Kelch-typ_b-propeller.
DR InterPro; IPR006652; Kelch_1.
DR PANTHER; PTHR46003; PTHR46003; 1.
DR PANTHER; PTHR46003:SF3; PTHR46003:SF3; 1.
DR Pfam; PF01344; Kelch_1; 1.
DR SMART; SM00060; FN3; 3.
DR SUPFAM; SSF117281; SSF117281; 1.
DR SUPFAM; SSF49265; SSF49265; 1.
DR PROSITE; PS50853; FN3; 3.
PE 1: Evidence at protein level;
KW Acetylation; Autocatalytic cleavage; Cell cycle; Chromatin regulator;
KW Cytoplasm; Glycoprotein; Isopeptide bond; Kelch repeat; Methylation;
KW Nucleus; Phosphoprotein; Reference proteome; Repeat; Ubl conjugation.
FT INIT_MET 1
FT /note="Removed"
FT /evidence="ECO:0000250|UniProtKB:P51610"
FT CHAIN 2..1432
FT /note="HCF N-terminal chain 6"
FT /evidence="ECO:0000250|UniProtKB:P51610"
FT /id="PRO_0000016623"
FT CHAIN 2..1332
FT /note="HCF N-terminal chain 5"
FT /evidence="ECO:0000250|UniProtKB:P51610"
FT /id="PRO_0000016624"
FT CHAIN 2..1304
FT /note="HCF N-terminal chain 4"
FT /evidence="ECO:0000250|UniProtKB:P51610"
FT /id="PRO_0000016625"
FT CHAIN 2..1110
FT /note="HCF N-terminal chain 3"
FT /evidence="ECO:0000250|UniProtKB:P51610"
FT /id="PRO_0000016626"
FT CHAIN 2..1081
FT /note="HCF N-terminal chain 2"
FT /evidence="ECO:0000250|UniProtKB:P51610"
FT /id="PRO_0000016627"
FT CHAIN 2..1019
FT /note="HCF N-terminal chain 1"
FT /evidence="ECO:0000250|UniProtKB:P51610"
FT /id="PRO_0000016628"
FT CHAIN 1020..2090
FT /note="HCF C-terminal chain 1"
FT /evidence="ECO:0000250|UniProtKB:P51610"
FT /id="PRO_0000016629"
FT CHAIN 1082..2090
FT /note="HCF C-terminal chain 2"
FT /evidence="ECO:0000250|UniProtKB:P51610"
FT /id="PRO_0000016630"
FT CHAIN 1111..2090
FT /note="HCF C-terminal chain 3"
FT /evidence="ECO:0000250|UniProtKB:P51610"
FT /id="PRO_0000016631"
FT CHAIN 1305..2090
FT /note="HCF C-terminal chain 4"
FT /evidence="ECO:0000250|UniProtKB:P51610"
FT /id="PRO_0000016632"
FT CHAIN 1333..2090
FT /note="HCF C-terminal chain 5"
FT /evidence="ECO:0000250|UniProtKB:P51610"
FT /id="PRO_0000016633"
FT CHAIN 1433..2090
FT /note="HCF C-terminal chain 6"
FT /evidence="ECO:0000250|UniProtKB:P51610"
FT /id="PRO_0000016634"
FT REPEAT 44..89
FT /note="Kelch 1"
FT /evidence="ECO:0000255"
FT REPEAT 93..140
FT /note="Kelch 2"
FT /evidence="ECO:0000255"
FT REPEAT 148..194
FT /note="Kelch 3"
FT /evidence="ECO:0000255"
FT REPEAT 217..265
FT /note="Kelch 4"
FT /evidence="ECO:0000255"
FT REPEAT 266..313
FT /note="Kelch 5"
FT /evidence="ECO:0000255"
FT DOMAIN 366..469
FT /note="Fibronectin type-III 1"
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00316"
FT REPEAT 1010..1035
FT /note="HCF repeat 1"
FT /evidence="ECO:0000250|UniProtKB:P51610"
FT REPEAT 1072..1097
FT /note="HCF repeat 2"
FT /evidence="ECO:0000250|UniProtKB:P51610"
FT REPEAT 1101..1126
FT /note="HCF repeat 3"
FT /evidence="ECO:0000250|UniProtKB:P51610"
FT REPEAT 1157..1182
FT /note="HCF repeat 4; degenerate"
FT /evidence="ECO:0000250|UniProtKB:P51610"
FT REPEAT 1295..1320
FT /note="HCF repeat 5"
FT /evidence="ECO:0000250|UniProtKB:P51610"
FT REPEAT 1323..1348
FT /note="HCF repeat 6"
FT /evidence="ECO:0000250|UniProtKB:P51610"
FT REPEAT 1358..1383
FT /note="HCF repeat 7; degenerate"
FT /evidence="ECO:0000250|UniProtKB:P51610"
FT REPEAT 1423..1448
FT /note="HCF repeat 8"
FT /evidence="ECO:0000250|UniProtKB:P51610"
FT DOMAIN 1853..1943
FT /note="Fibronectin type-III 2"
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00316"
FT DOMAIN 1945..2061
FT /note="Fibronectin type-III 3"
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00316"
FT REGION 407..434
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 500..550
FT /note="Required for interaction with OGT"
FT /evidence="ECO:0000250|UniProtKB:P51610"
FT REGION 610..722
FT /note="Interaction with SIN3A"
FT /evidence="ECO:0000250|UniProtKB:P51610"
FT REGION 750..902
FT /note="Interaction with ZBTB17"
FT /evidence="ECO:0000250|UniProtKB:P51610"
FT REGION 813..912
FT /note="Interaction with GABP2"
FT /evidence="ECO:0000250|UniProtKB:P51610"
FT REGION 1098..1140
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 1302..1374
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 1444..1486
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 2049..2090
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 414..432
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1302..1354
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 2052..2070
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT SITE 1019..1020
FT /note="Cleavage; by autolysis"
FT /evidence="ECO:0000250|UniProtKB:P51610"
FT SITE 1081..1082
FT /note="Cleavage; by autolysis"
FT /evidence="ECO:0000250|UniProtKB:P51610"
FT SITE 1110..1111
FT /note="Cleavage; by autolysis"
FT /evidence="ECO:0000250|UniProtKB:P51610"
FT SITE 1304..1305
FT /note="Cleavage; by autolysis"
FT /evidence="ECO:0000250|UniProtKB:P51610"
FT SITE 1332..1333
FT /note="Cleavage; by autolysis"
FT /evidence="ECO:0000250|UniProtKB:P51610"
FT SITE 1432..1433
FT /note="Cleavage; by autolysis"
FT /evidence="ECO:0000250|UniProtKB:P51610"
FT MOD_RES 2
FT /note="N-acetylalanine"
FT /evidence="ECO:0000250|UniProtKB:P51610"
FT MOD_RES 6
FT /note="Phosphoserine"
FT /evidence="ECO:0000250|UniProtKB:P51610"
FT MOD_RES 288
FT /note="N6-acetyllysine"
FT /evidence="ECO:0000250|UniProtKB:P51610"
FT MOD_RES 411
FT /note="Phosphoserine"
FT /evidence="ECO:0000250|UniProtKB:P51610"
FT MOD_RES 504
FT /note="Omega-N-methylarginine"
FT /evidence="ECO:0000250|UniProtKB:P51610"
FT MOD_RES 524
FT /note="Omega-N-methylarginine"
FT /evidence="ECO:0000250|UniProtKB:P51610"
FT MOD_RES 598
FT /note="Phosphoserine"
FT /evidence="ECO:0000250|UniProtKB:P51610"
FT MOD_RES 666
FT /note="Phosphoserine"
FT /evidence="ECO:0000250|UniProtKB:P51610"
FT MOD_RES 669
FT /note="Phosphoserine"
FT /evidence="ECO:0000250|UniProtKB:P51610"
FT MOD_RES 813
FT /note="N6-acetyllysine"
FT /evidence="ECO:0000250|UniProtKB:P51610"
FT MOD_RES 1204
FT /note="Phosphoserine"
FT /evidence="ECO:0000250|UniProtKB:P51610"
FT MOD_RES 1216
FT /note="Asymmetric dimethylarginine"
FT /evidence="ECO:0000250|UniProtKB:Q61191"
FT MOD_RES 1223
FT /note="Phosphoserine"
FT /evidence="ECO:0000250|UniProtKB:P51610"
FT MOD_RES 1500
FT /note="Phosphothreonine"
FT /evidence="ECO:0000250|UniProtKB:P51610"
FT MOD_RES 1506
FT /note="Phosphoserine"
FT /evidence="ECO:0000250|UniProtKB:P51610"
FT MOD_RES 1559
FT /note="Phosphoserine"
FT /evidence="ECO:0000250|UniProtKB:P51610"
FT MOD_RES 1826
FT /note="Phosphoserine"
FT /evidence="ECO:0000250|UniProtKB:P51610"
FT MOD_RES 1893
FT /note="Phosphoserine"
FT /evidence="ECO:0000250|UniProtKB:Q61191"
FT MOD_RES 2060
FT /note="N6-acetyllysine"
FT /evidence="ECO:0000250|UniProtKB:P51610"
FT CROSSLNK 105
FT /note="Glycyl lysine isopeptide (Lys-Gly) (interchain with
FT G-Cter in ubiquitin)"
FT /evidence="ECO:0000250|UniProtKB:P51610"
FT CROSSLNK 163
FT /note="Glycyl lysine isopeptide (Lys-Gly) (interchain with
FT G-Cter in ubiquitin)"
FT /evidence="ECO:0000250|UniProtKB:P51610"
FT CROSSLNK 244
FT /note="Glycyl lysine isopeptide (Lys-Gly) (interchain with
FT G-Cter in ubiquitin)"
FT /evidence="ECO:0000250|UniProtKB:P51610"
FT CROSSLNK 282
FT /note="Glycyl lysine isopeptide (Lys-Gly) (interchain with
FT G-Cter in SUMO2)"
FT /evidence="ECO:0000250|UniProtKB:P51610"
FT CROSSLNK 363
FT /note="Glycyl lysine isopeptide (Lys-Gly) (interchain with
FT G-Cter in ubiquitin)"
FT /evidence="ECO:0000250|UniProtKB:P51610"
FT CROSSLNK 1862
FT /note="Glycyl lysine isopeptide (Lys-Gly) (interchain with
FT G-Cter in ubiquitin)"
FT /evidence="ECO:0000250|UniProtKB:P51610"
FT CROSSLNK 1863
FT /note="Glycyl lysine isopeptide (Lys-Gly) (interchain with
FT G-Cter in ubiquitin)"
FT /evidence="ECO:0000250|UniProtKB:P51610"
FT CROSSLNK 2079
FT /note="Glycyl lysine isopeptide (Lys-Gly) (interchain with
FT G-Cter in SUMO2)"
FT /evidence="ECO:0000250|UniProtKB:P51610"
FT MUTAGEN 134
FT /note="P->S: Causes temperature-sensitive cell cycle arrest
FT in a G0-like state."
FT /evidence="ECO:0000269|PubMed:9087427"
SQ SEQUENCE 2090 AA; 214942 MW; E495E8B1F2385E17 CRC64;
MASAVSPANL PAVLLQPRWK RVVGWSGPVP RPRHGHRAVA IKELIVVFGG GNEGIVDELH
VYNTATNQWF IPAVRGDIPP GCAAYGFVCD GTRLLVFGGM VEYGKYSNDL YELQASRWEW
KRLKAKTPKN GPPPCPRLGH SFSLVGNKCY LFGGLANDSE DPKNNIPRYL NDLYILELRP
GSGVVAWDIP ITYGVLPPPR ESHTAVVYTE KDNKKSKLVI YGGMSGCRLG DLWTLDIETL
TWNKPSLSGV APLPRSLHSA TTIGNKMYVF GGWVPLVMDD VKVATHEKEW KCTNTLACLN
LDTMAWETIL MDTLEDNIPR ARAGHCAVAI NTRLYIWSGR DGYRKAWNNQ VCCKDLWYLE
TEKPPPPARV QLVRANTNSL EVSWGAVATA DSYLLQLQKY DIPATAATAT SPTPNPVPSV
PANPPKSPAP AAAAPAVQPL TQVGITLVPQ AAAAPPSTTT IQVLPTVPGS SISVPTAARA
QGVPAVLKVT GPQATTGTPL VTMRPAGQAG KAPVTVTSLP ASVRMVVPTQ SAQGTVIGSN
PQMSGMAALA AAAAATQKIP PSSAPTVLSV PAGTTIVKTV AVTPGTTTLP ATVKVASSPV
MVSNPATRML KTAAAQVGTS VSSAANTSTR PIITVHKSGT VTVAQQAQVV TTVVGGVTKT
ITLVKSPISV PGGSALISNL GKVMSVVQTK PVQTSAVTGQ ASTGPVTQII QTKGPLPAGT
ILKLVTSADG KPTTIITTTQ ASGAGSKPTI LGISSVSPST TKPGTTTIIK TIPMSAIITQ
AGATGVTSTP GIKSPITIIT TKVMTSGTGA PAKIITAVPK IATGHGQQGV TQVVLKGAPG
QPGAILRTVP MSGVRLVTPV TVSAVKPAVT TLVVKGTTGV TTLGTVTGTV STSLAGAGAH
STSASLATPI TTLGTIATLS SQVINPTAIT VSAAQTTLTA AGGLTTPTIT MQPVSQPTQV
TLITAPSGVE AQPVHDLPVS ILASPTTEQP TATVTIADSG QGDVQPGTVT LVCSNPPCET
HETGTTNTAT TTVVANLGGH PQPTQVQFVC DRQEAAASLV TSAVGQQNGN VVRVCSNPPC
ETHETGTTNT ATTATSNMAG QHGCSNPPCE THETGTTSTA TTAMSSMGTG QQRDTRHTSS
NPTVVRITVA PGALERTQGT VKPQCQTQQA NMTNTTMTVQ ATRSPCPAGP LLRPSVALEA
GNHSPAFVQL ALPSVRVGLS GPSNKDMPTG HQLETYHTYT TNTPTTALSI MGAGELGTAR
LIPTSTYESL QASSPSSTMT MTALEALLCP SATVTQVCSN PPCETHETGT TNTATTSNAG
SAQRVCSNPP CETHETGTTH TATTATSNGG AGQPEGGQQP AGGRPCETHQ TTSTGTTMSV
SVGALLPDAT PSHGTLESGL EVVAVSTVTS QAGATLLASF PTQRVCSNPP CETHETGTTH
TATTVTSNMS SNQDPPPAAS DQGEVVSTQG DSANITSSSG ITTTVSSTLP RAVTTVTQST
PVPGPSVPNI SSLTETPGAL TSEVPIPATI TVTIANTETS DMPFSAVDIL QPPEELQVSP
GPRQQLPPRQ LLQSASTPLM GESSEVLSAS QTPELQAAVD LSSTGDPSSG QEPTSSAVVA
TVVVQPPPPT QSEVDQLSLP QELMAEAQAG TTTLMVTGLT PEELAVTAAA EAAAQAAATE
EAQALAIQAV LQAAQQAVMA GTGEPMDTSE AAAAVTQAEL GHLSAEGQEG QATTIPIVLT
QQELAALVQQ QQQLQEVQAQ AQQQHHLPTE ALAPADSLND PSIESNCLNE LASAVPSTVA
LLPSTATESL APSNTFVAPQ PVVVASPAKM QAAATLTEVD NGIESLGVKP DLPPPPSKAP
VKKENQWFDV GVIKGTSVMV THYFLPPDDA VQSDDDSGTI PDYNQLKKQE LQPGTAYKFR
VAGINACGRG PFSEISAFKT CLPGFPGAPC AIKISKSPDG AHLTWEPPSV TSGKIIEYSV
YLAIQSSQAG GEPKSSTPAQ LAFMRVYCGP SPSCLVQSSS LSNAHIDYTT KPAIIFRIAA
RNEKGYGPAT QVRWLQETSK DSSGTKPASK RPMSSPEMKS APKKSKADGQ