HCF_DROME
ID HCF_DROME Reviewed; 1500 AA.
AC Q9V4C8; A4V109; H9XVN5; Q8IGU2; Q95S26; Q95ZF3; Q9BKH1;
DT 01-MAR-2005, integrated into UniProtKB/Swiss-Prot.
DT 01-OCT-2002, sequence version 2.
DT 03-AUG-2022, entry version 156.
DE RecName: Full=Host cell factor;
DE Short=dHcf;
DE Contains:
DE RecName: Full=HCF N-terminal chain;
DE Contains:
DE RecName: Full=HCF C-terminal chain;
GN Name=Hcf {ECO:0000312|EMBL:AAF59349.2}; ORFNames=CG1710;
OS Drosophila melanogaster (Fruit fly).
OC Eukaryota; Metazoa; Ecdysozoa; Arthropoda; Hexapoda; Insecta; Pterygota;
OC Neoptera; Endopterygota; Diptera; Brachycera; Muscomorpha; Ephydroidea;
OC Drosophilidae; Drosophila; Sophophora.
OX NCBI_TaxID=7227;
RN [1] {ECO:0000305, ECO:0000312|EMBL:CAC44472.1}
RP NUCLEOTIDE SEQUENCE [MRNA] (ISOFORM A), SUBCELLULAR LOCATION, NUCLEAR
RP LOCALIZATION SIGNAL, AND MUTAGENESIS OF 1470-LYS-ARG-1471 AND
RP 1492-LYS--ARG-1495.
RC TISSUE=Embryo {ECO:0000269|PubMed:12609738};
RX PubMed=12609738; DOI=10.1016/s0378-1119(03)00380-9;
RA Izeta A., Malcomber S., O'Hare P.;
RT "Primary structure and compartmentalization of Drosophila melanogaster host
RT cell factor.";
RL Gene 305:175-183(2003).
RN [2] {ECO:0000305, ECO:0000312|EMBL:AAK28427.1}
RP NUCLEOTIDE SEQUENCE [MRNA] (ISOFORMS A; D AND G), FUNCTION, DEVELOPMENTAL
RP STAGE, AND PROTEOLYTIC CLEAVAGE.
RC STRAIN=Berkeley; TISSUE=Embryo {ECO:0000269|PubMed:12494450};
RX PubMed=12494450; DOI=10.1002/jcp.10193;
RA Mahajan S.S., Johnson K.M., Wilson A.C.;
RT "Molecular cloning of Drosophila HCF reveals proteolytic processing and
RT self-association of the encoded protein.";
RL J. Cell. Physiol. 194:117-126(2003).
RN [3] {ECO:0000305, ECO:0000312|EMBL:AAF59349.2}
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RC STRAIN=Berkeley {ECO:0000269|PubMed:10731132};
RX PubMed=10731132; DOI=10.1126/science.287.5461.2185;
RA Adams M.D., Celniker S.E., Holt R.A., Evans C.A., Gocayne J.D.,
RA Amanatides P.G., Scherer S.E., Li P.W., Hoskins R.A., Galle R.F.,
RA George R.A., Lewis S.E., Richards S., Ashburner M., Henderson S.N.,
RA Sutton G.G., Wortman J.R., Yandell M.D., Zhang Q., Chen L.X., Brandon R.C.,
RA Rogers Y.-H.C., Blazej R.G., Champe M., Pfeiffer B.D., Wan K.H., Doyle C.,
RA Baxter E.G., Helt G., Nelson C.R., Miklos G.L.G., Abril J.F., Agbayani A.,
RA An H.-J., Andrews-Pfannkoch C., Baldwin D., Ballew R.M., Basu A.,
RA Baxendale J., Bayraktaroglu L., Beasley E.M., Beeson K.Y., Benos P.V.,
RA Berman B.P., Bhandari D., Bolshakov S., Borkova D., Botchan M.R., Bouck J.,
RA Brokstein P., Brottier P., Burtis K.C., Busam D.A., Butler H., Cadieu E.,
RA Center A., Chandra I., Cherry J.M., Cawley S., Dahlke C., Davenport L.B.,
RA Davies P., de Pablos B., Delcher A., Deng Z., Mays A.D., Dew I.,
RA Dietz S.M., Dodson K., Doup L.E., Downes M., Dugan-Rocha S., Dunkov B.C.,
RA Dunn P., Durbin K.J., Evangelista C.C., Ferraz C., Ferriera S.,
RA Fleischmann W., Fosler C., Gabrielian A.E., Garg N.S., Gelbart W.M.,
RA Glasser K., Glodek A., Gong F., Gorrell J.H., Gu Z., Guan P., Harris M.,
RA Harris N.L., Harvey D.A., Heiman T.J., Hernandez J.R., Houck J., Hostin D.,
RA Houston K.A., Howland T.J., Wei M.-H., Ibegwam C., Jalali M., Kalush F.,
RA Karpen G.H., Ke Z., Kennison J.A., Ketchum K.A., Kimmel B.E., Kodira C.D.,
RA Kraft C.L., Kravitz S., Kulp D., Lai Z., Lasko P., Lei Y., Levitsky A.A.,
RA Li J.H., Li Z., Liang Y., Lin X., Liu X., Mattei B., McIntosh T.C.,
RA McLeod M.P., McPherson D., Merkulov G., Milshina N.V., Mobarry C.,
RA Morris J., Moshrefi A., Mount S.M., Moy M., Murphy B., Murphy L.,
RA Muzny D.M., Nelson D.L., Nelson D.R., Nelson K.A., Nixon K., Nusskern D.R.,
RA Pacleb J.M., Palazzolo M., Pittman G.S., Pan S., Pollard J., Puri V.,
RA Reese M.G., Reinert K., Remington K., Saunders R.D.C., Scheeler F.,
RA Shen H., Shue B.C., Siden-Kiamos I., Simpson M., Skupski M.P., Smith T.J.,
RA Spier E., Spradling A.C., Stapleton M., Strong R., Sun E., Svirskas R.,
RA Tector C., Turner R., Venter E., Wang A.H., Wang X., Wang Z.-Y.,
RA Wassarman D.A., Weinstock G.M., Weissenbach J., Williams S.M., Woodage T.,
RA Worley K.C., Wu D., Yang S., Yao Q.A., Ye J., Yeh R.-F., Zaveri J.S.,
RA Zhan M., Zhang G., Zhao Q., Zheng L., Zheng X.H., Zhong F.N., Zhong W.,
RA Zhou X., Zhu S.C., Zhu X., Smith H.O., Gibbs R.A., Myers E.W., Rubin G.M.,
RA Venter J.C.;
RT "The genome sequence of Drosophila melanogaster.";
RL Science 287:2185-2195(2000).
RN [4] {ECO:0000305, ECO:0000312|EMBL:AAF59349.2}
RP GENOME REANNOTATION, AND ALTERNATIVE SPLICING.
RC STRAIN=Berkeley;
RX PubMed=12537572; DOI=10.1186/gb-2002-3-12-research0083;
RA Misra S., Crosby M.A., Mungall C.J., Matthews B.B., Campbell K.S.,
RA Hradecky P., Huang Y., Kaminker J.S., Millburn G.H., Prochnik S.E.,
RA Smith C.D., Tupy J.L., Whitfield E.J., Bayraktaroglu L., Berman B.P.,
RA Bettencourt B.R., Celniker S.E., de Grey A.D.N.J., Drysdale R.A.,
RA Harris N.L., Richter J., Russo S., Schroeder A.J., Shu S.Q., Stapleton M.,
RA Yamada C., Ashburner M., Gelbart W.M., Rubin G.M., Lewis S.E.;
RT "Annotation of the Drosophila melanogaster euchromatic genome: a systematic
RT review.";
RL Genome Biol. 3:RESEARCH0083.1-RESEARCH0083.22(2002).
RN [5] {ECO:0000305, ECO:0000312|EMBL:AAL28531.1}
RP NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA] (ISOFORM A).
RC STRAIN=Berkeley {ECO:0000312|EMBL:AAL28531.1};
RC TISSUE=Embryo {ECO:0000269|PubMed:12537569}, and
RC Ovary {ECO:0000269|PubMed:12537569};
RX PubMed=12537569; DOI=10.1186/gb-2002-3-12-research0080;
RA Stapleton M., Carlson J.W., Brokstein P., Yu C., Champe M., George R.A.,
RA Guarin H., Kronmiller B., Pacleb J.M., Park S., Wan K.H., Rubin G.M.,
RA Celniker S.E.;
RT "A Drosophila full-length cDNA resource.";
RL Genome Biol. 3:RESEARCH0080.1-RESEARCH0080.8(2002).
RN [6] {ECO:0000305, ECO:0000312|EMBL:AAT94497.1}
RP NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA] (ISOFORM A).
RC STRAIN=Berkeley {ECO:0000312|EMBL:AAT94497.1}; TISSUE=Embryo;
RA Stapleton M., Carlson J.W., Chavez C., Frise E., George R.A., Pacleb J.M.,
RA Park S., Wan K.H., Yu C., Rubin G.M., Celniker S.E.;
RL Submitted (AUG-2004) to the EMBL/GenBank/DDBJ databases.
RN [7]
RP IDENTIFICATION BY MASS SPECTROMETRY, IDENTIFICATION IN THE ATAC COMPLEX,
RP AND SUBCELLULAR LOCATION.
RX PubMed=18327268; DOI=10.1038/nsmb.1397;
RA Suganuma T., Gutierrez J.L., Li B., Florens L., Swanson S.K.,
RA Washburn M.P., Abmayr S.M., Workman J.L.;
RT "ATAC is a double histone acetyltransferase complex that stimulates
RT nucleosome sliding.";
RL Nat. Struct. Mol. Biol. 15:364-372(2008).
RN [8]
RP PHOSPHORYLATION [LARGE SCALE ANALYSIS] AT SER-477; SER-958; SER-966;
RP THR-1126 AND SER-1489, AND IDENTIFICATION BY MASS SPECTROMETRY.
RC TISSUE=Embryo;
RX PubMed=18327897; DOI=10.1021/pr700696a;
RA Zhai B., Villen J., Beausoleil S.A., Mintseris J., Gygi S.P.;
RT "Phosphoproteome analysis of Drosophila melanogaster embryos.";
RL J. Proteome Res. 7:1675-1682(2008).
RN [9]
RP IDENTIFICATION IN THE SET1 COMPLEX, AND SUBCELLULAR LOCATION.
RX PubMed=21694722; DOI=10.1038/emboj.2011.194;
RA Ardehali M.B., Mei A., Zobeck K.L., Caron M., Lis J.T., Kusch T.;
RT "Drosophila Set1 is the major histone H3 lysine 4 trimethyltransferase with
RT role in transcription.";
RL EMBO J. 30:2817-2828(2011).
RN [10]
RP IDENTIFICATION IN THE SET1 AND MLL3/4 COMPLEXES.
RX PubMed=21875999; DOI=10.1128/mcb.06092-11;
RA Mohan M., Herz H.M., Smith E.R., Zhang Y., Jackson J., Washburn M.P.,
RA Florens L., Eissenberg J.C., Shilatifard A.;
RT "The COMPASS family of H3K4 methylases in Drosophila.";
RL Mol. Cell. Biol. 31:4310-4318(2011).
CC -!- FUNCTION: May be involved in control of the cell cycle.
CC {ECO:0000269|PubMed:12494450}.
CC -!- BIOPHYSICOCHEMICAL PROPERTIES:
CC Temperature dependence:
CC Optimum temperature is 20 degrees Celsius for complex formation
CC activity of the N-terminus and 33.5 degrees Celsius for nuclear
CC localization of the protein in vitro. {ECO:0000269|PubMed:12494450,
CC ECO:0000269|PubMed:12609738};
CC -!- SUBUNIT: Core component of several methyltransferase-containing
CC complexes. Component of the SET1 complex, composed at least of the
CC catalytic subunit Set1, wds/WDR5, Wdr82, Rbbp5, ash2, Cfp1/CXXC1, hcf
CC and Dpy-30L1. Component of the MLL3/4 complex composed at least of the
CC catalytic subunit trr, ash2, Rbbp5, Dpy-30L1, wds, hcf, ptip, Pa1, Utx,
CC Lpt and Ncoa6. Component of the Ada2a-containing (ATAC) complex
CC composed of at least Ada2a, Atac1, Hcf, Ada3, Gcn5, Mocs2B, Charac-14,
CC Atac3, Atac2, NC2beta and wds (PubMed:18327268).
CC {ECO:0000269|PubMed:18327268, ECO:0000269|PubMed:21694722,
CC ECO:0000269|PubMed:21875999}.
CC -!- INTERACTION:
CC Q9V4C8; Q5LJZ2: Set1; NbExp=3; IntAct=EBI-2912878, EBI-3405171;
CC -!- SUBCELLULAR LOCATION: Nucleus {ECO:0000269|PubMed:12609738,
CC ECO:0000269|PubMed:18327268, ECO:0000269|PubMed:21694722}.
CC -!- ALTERNATIVE PRODUCTS:
CC Event=Alternative splicing; Named isoforms=5;
CC Name=A {ECO:0000269|PubMed:12494450}; Synonyms=B
CC {ECO:0000303|PubMed:10731132};
CC IsoId=Q9V4C8-1; Sequence=Displayed;
CC Name=E;
CC IsoId=Q9V4C8-6; Sequence=VSP_047937, VSP_047938;
CC Name=F;
CC IsoId=Q9V4C8-5; Sequence=VSP_047712;
CC Name=D {ECO:0000269|PubMed:12494450}; Synonyms=8-11
CC {ECO:0000303|PubMed:12494450};
CC IsoId=Q9V4C8-3; Sequence=Not described;
CC Name=G; Synonyms=11-13 {ECO:0000303|PubMed:12494450};
CC IsoId=Q9V4C8-4; Sequence=Not described;
CC -!- DEVELOPMENTAL STAGE: Expressed throughout development and in adults.
CC {ECO:0000269|PubMed:12494450}.
CC -!- PTM: Proteolytic cleavage occurs between amino acids 900 and 1100
CC within the non-conserved central region, giving rise to two independent
CC but tightly associated N- and C-terminal subunits.
CC {ECO:0000269|PubMed:12494450}.
CC -!- MISCELLANEOUS: Due to lack of HCF repeats, the cleavage process occurs
CC via a different mechanism to that in the mammalian HCFC1.
CC {ECO:0000305|PubMed:12494450}.
CC -!- MISCELLANEOUS: [Isoform D]: Exons 9 and 10 deleted.
CC {ECO:0000269|PubMed:12494450}.
CC -!- MISCELLANEOUS: [Isoform G]: Exon 12 deleted. {ECO:0000305}.
CC -!- SEQUENCE CAUTION:
CC Sequence=AAL28531.1; Type=Erroneous initiation; Note=Truncated N-terminus.; Evidence={ECO:0000305};
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR EMBL; AJ320236; CAC44472.1; -; mRNA.
DR EMBL; AF251006; AAK28427.1; -; mRNA.
DR EMBL; AE014135; AAF59349.2; -; Genomic_DNA.
DR EMBL; AE014135; AAN06529.1; -; Genomic_DNA.
DR EMBL; AE014135; AAN06530.2; -; Genomic_DNA.
DR EMBL; AE014135; AFH06780.1; -; Genomic_DNA.
DR EMBL; AY060983; AAL28531.1; ALT_INIT; mRNA.
DR EMBL; BT001602; AAN71357.1; -; mRNA.
DR EMBL; BT015268; AAT94497.1; -; mRNA.
DR RefSeq; NP_001245420.1; NM_001258491.3. [Q9V4C8-6]
DR RefSeq; NP_524621.2; NM_079882.4. [Q9V4C8-1]
DR RefSeq; NP_726566.1; NM_166756.3. [Q9V4C8-1]
DR RefSeq; NP_726567.2; NM_166757.3. [Q9V4C8-5]
DR RefSeq; NP_995595.1; NM_205873.3. [Q9V4C8-1]
DR AlphaFoldDB; Q9V4C8; -.
DR SMR; Q9V4C8; -.
DR BioGRID; 68622; 36.
DR IntAct; Q9V4C8; 9.
DR MINT; Q9V4C8; -.
DR STRING; 7227.FBpp0088193; -.
DR iPTMnet; Q9V4C8; -.
DR PaxDb; Q9V4C8; -.
DR PRIDE; Q9V4C8; -.
DR EnsemblMetazoa; FBtr0089125; FBpp0088194; FBgn0039904. [Q9V4C8-1]
DR EnsemblMetazoa; FBtr0089126; FBpp0088195; FBgn0039904. [Q9V4C8-1]
DR EnsemblMetazoa; FBtr0307379; FBpp0298368; FBgn0039904. [Q9V4C8-6]
DR EnsemblMetazoa; FBtr0334479; FBpp0306551; FBgn0039904. [Q9V4C8-5]
DR EnsemblMetazoa; FBtr0345284; FBpp0311451; FBgn0039904. [Q9V4C8-1]
DR GeneID; 43788; -.
DR KEGG; dme:Dmel_CG1710; -.
DR CTD; 43788; -.
DR FlyBase; FBgn0039904; Hcf.
DR VEuPathDB; VectorBase:FBgn0039904; -.
DR eggNOG; KOG4152; Eukaryota.
DR GeneTree; ENSGT00940000166952; -.
DR InParanoid; Q9V4C8; -.
DR OMA; NQVCSNP; -.
DR PhylomeDB; Q9V4C8; -.
DR Reactome; R-DME-5689603; UCH proteinases.
DR SignaLink; Q9V4C8; -.
DR BioGRID-ORCS; 43788; 1 hit in 3 CRISPR screens.
DR GenomeRNAi; 43788; -.
DR PRO; PR:Q9V4C8; -.
DR Proteomes; UP000000803; Chromosome 4.
DR Bgee; FBgn0039904; Expressed in central nervous system and 19 other tissues.
DR Genevisible; Q9V4C8; DM.
DR GO; GO:0140672; C:ATAC complex; IDA:FlyBase.
DR GO; GO:0035097; C:histone methyltransferase complex; IBA:GO_Central.
DR GO; GO:0044665; C:MLL1/2 complex; ISS:FlyBase.
DR GO; GO:0044666; C:MLL3/4 complex; IDA:FlyBase.
DR GO; GO:0005634; C:nucleus; IDA:UniProtKB.
DR GO; GO:0048188; C:Set1C/COMPASS complex; IDA:FlyBase.
DR GO; GO:0003682; F:chromatin binding; IDA:FlyBase.
DR GO; GO:0003713; F:transcription coactivator activity; IBA:GO_Central.
DR GO; GO:0007049; P:cell cycle; IEA:UniProtKB-KW.
DR GO; GO:0006338; P:chromatin remodeling; IDA:FlyBase.
DR GO; GO:0016573; P:histone acetylation; IDA:FlyBase.
DR GO; GO:0043966; P:histone H3 acetylation; IDA:FlyBase.
DR GO; GO:0051568; P:histone H3-K4 methylation; IC:FlyBase.
DR GO; GO:0043967; P:histone H4 acetylation; IDA:FlyBase.
DR GO; GO:0045927; P:positive regulation of growth; IGI:FlyBase.
DR GO; GO:0045893; P:positive regulation of transcription, DNA-templated; IDA:FlyBase.
DR GO; GO:0006355; P:regulation of transcription, DNA-templated; IBA:GO_Central.
DR CDD; cd00063; FN3; 2.
DR Gene3D; 2.120.10.80; -; 2.
DR Gene3D; 2.60.40.10; -; 2.
DR InterPro; IPR003961; FN3_dom.
DR InterPro; IPR036116; FN3_sf.
DR InterPro; IPR043536; HCF1/2.
DR InterPro; IPR013783; Ig-like_fold.
DR InterPro; IPR015915; Kelch-typ_b-propeller.
DR PANTHER; PTHR46003; PTHR46003; 2.
DR SMART; SM00060; FN3; 2.
DR SUPFAM; SSF117281; SSF117281; 1.
DR SUPFAM; SSF49265; SSF49265; 1.
DR PROSITE; PS50853; FN3; 2.
PE 1: Evidence at protein level;
KW Alternative splicing; Autocatalytic cleavage; Cell cycle; Kelch repeat;
KW Nucleus; Phosphoprotein; Reference proteome; Repeat.
FT CHAIN 1..?
FT /note="HCF N-terminal chain"
FT /id="PRO_0000016647"
FT CHAIN ?..1500
FT /note="HCF C-terminal chain"
FT /id="PRO_0000016648"
FT REPEAT 85..133
FT /note="Kelch 1"
FT /evidence="ECO:0000255"
FT REPEAT 135..181
FT /note="Kelch 2"
FT /evidence="ECO:0000255"
FT REPEAT 189..237
FT /note="Kelch 3"
FT /evidence="ECO:0000255"
FT REPEAT 259..307
FT /note="Kelch 4"
FT /evidence="ECO:0000255"
FT REPEAT 308..373
FT /note="Kelch 5"
FT /evidence="ECO:0000255"
FT DOMAIN 1244..1341
FT /note="Fibronectin type-III 1"
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00316"
FT DOMAIN 1346..1457
FT /note="Fibronectin type-III 2"
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00316"
FT REGION 517..543
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 1024..1061
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 1161..1185
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 1458..1500
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT MOTIF 1470..1495
FT /note="Bipartite nuclear localization signal"
FT COMPBIAS 1033..1057
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1161..1177
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1459..1489
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT MOD_RES 477
FT /note="Phosphoserine"
FT /evidence="ECO:0000269|PubMed:18327897"
FT MOD_RES 958
FT /note="Phosphoserine"
FT /evidence="ECO:0000269|PubMed:18327897"
FT MOD_RES 966
FT /note="Phosphoserine"
FT /evidence="ECO:0000269|PubMed:18327897"
FT MOD_RES 1126
FT /note="Phosphothreonine"
FT /evidence="ECO:0000269|PubMed:18327897"
FT MOD_RES 1489
FT /note="Phosphoserine"
FT /evidence="ECO:0000269|PubMed:18327897"
FT VAR_SEQ 392..443
FT /note="Missing (in isoform F)"
FT /evidence="ECO:0000305"
FT /id="VSP_047712"
FT VAR_SEQ 392
FT /note="V -> VRV (in isoform E)"
FT /evidence="ECO:0000305"
FT /id="VSP_047937"
FT VAR_SEQ 794..811
FT /note="Missing (in isoform E)"
FT /evidence="ECO:0000305"
FT /id="VSP_047938"
FT MUTAGEN 1470..1471
FT /note="Missing: Causes accumulation exclusively in the
FT cytoplasm."
FT /evidence="ECO:0000269|PubMed:12609738"
FT MUTAGEN 1492..1495
FT /note="Missing: Causes accumulation exclusively in the
FT cytoplasm."
FT /evidence="ECO:0000269|PubMed:12609738"
FT CONFLICT 694
FT /note="K -> T (in Ref. 2; AAK28427)"
FT /evidence="ECO:0000305"
FT CONFLICT 956
FT /note="R -> K (in Ref. 2; AAK28427)"
FT /evidence="ECO:0000305"
SQ SEQUENCE 1500 AA; 160185 MW; 1275EC71E7D8A44F CRC64;
MEGSDFVDPA FSSGERISAS DLNSEHIIQA ENHSFANRIS MDMDVPDGHQ LDSNLTGFRW
KRVLNPTGPQ PRPRHGHRAI NIKELMVVFG GGNEGIVDEL HVYNTVTNQW YVPVLKGDVP
NGCAAYGFVV EGTRMFVFGG MIEYGKYSNE LYELQATKWE WRKMYPESPD SGLSPCPRLG
HSFTMVGEKI FLFGGLANES DDPKNNIPKY LNDLYILDTR GVHSHNGKWI VPKTYGDSPP
PRESHTGISF ATKSNGNLNL LIYGGMSGCR LGDLWLLETD SMTWSKPKTS GEAPLPRSLH
SSTMIGNKMY VFGGWVPLVI NDSKSTTERE WKCTNTLAVL DLETMTWENV TLDTVEENVP
RARAGHCAVG IQSRLYVWSG RDGYRKAWNN QVCCKDLWYL EVSKPLYAVK VALVRASTHA
LELSWTATTF AAAYVLQIQK IEQPLNTSSK LLSNNIVQQG TPTSAETSGI NISANRSGSA
LGLGVEATST VLKLEKESLQ LSGCQPETNV QPSVNDLLQS MSQPSSPASR ADKDPLSSGG
GTTFNLSTSV ASVHPQISVI SSTAAVTGND TASPSGAINS ILQKFRPVVT AVRTSTTTAV
SIATSTSDPL SVRVPSTMSA NVVLSSSSST LRIVPSVTAS HSLRIASSQA SGNNCRSSSA
INILKTALPN VAVQSQPTSS TTTSIGGKQY FIQKPLTLAP NVQLQFVKTS GGMTVQTLPK
VNFTASKGTP PHGISIANPH LASGITQIQG STVPGSQIQK PIVSGNVLKL VSPHTMAGGK
LIMKNSNILQ MGKVTPNVMG GKPAFVITNK QGTPLGNQQI IIVTTGGNVR SVPTSTVMTS
AGGSASGTNI VSIVNSTSTT PSPLQALSGQ KTLISNQSGV KMLRNISSVQ ASSSMAFGQK
QSGTPIHQKT ALYIGGKAVT VMSTNTSMAA SGNKVMVLPG TSSNNSPATT TALSARKSFV
FNAGGSPRTV TLATKSINAK SIPQSQPVTE TNNHSVATIK DTDPMDDIIE QLDGAGDLLK
LSESEGQHGS EENENNGENA TSSSASALFT GGDTAGPSRA QNPIVMEHPV DIIEDVSGVS
STTDVNETAI VSGDTIESLK MSEKENDDVK SMGEKSILSD DCHQPTTSET EAATILTTIK
SAEALVLETA EIRKDHTGCT IGSLKENQDE NKKFKQRQES SPSQNIHQFQ NVDGSQLEAL
ASAALLQAAT SDTTALALKE LIERPESETN TRSSNIAEIQ QNNVQSTLAV VVPNTSQNEN
QKWHTVGVFK DLSHTVTSYI DSNCISDSFF DGIDVDNLPD FSKFPRTNLE PGTAYRFRLS
AINSCGRGEW GEISSFKTCL PGFPGAPSAI KISKDVKEGA HLTWEPPPAQ KTKEIIEYSV
YLAVKPTAKD KALSTPQLAF VRVYVGAANQ CTVPNASLSN AHVDCSNKPA IIFRIAARNQ
KGYGPATQVR WLQDPAAAKQ HTPTVTPNLK RGPEKSTIGS SNIANTFCSP HKRGRNGLHD