HMEN_ANOGA
ID HMEN_ANOGA Reviewed; 594 AA.
AC O02491; Q7PJ43;
DT 01-NOV-1997, integrated into UniProtKB/Swiss-Prot.
DT 10-JUL-2007, sequence version 3.
DT 25-MAY-2022, entry version 128.
DE RecName: Full=Segmentation polarity homeobox protein engrailed;
GN Name=en; ORFNames=AGAP008023;
OS Anopheles gambiae (African malaria mosquito).
OC Eukaryota; Metazoa; Ecdysozoa; Arthropoda; Hexapoda; Insecta; Pterygota;
OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; Culicidae;
OC Anophelinae; Anopheles.
OX NCBI_TaxID=7165;
RN [1]
RP NUCLEOTIDE SEQUENCE [GENOMIC DNA / MRNA].
RX PubMed=9108369; DOI=10.1242/dev.124.8.1531;
RA Whiteley M., Kassis J.A.;
RT "Rescue of Drosophila engrailed mutants with a highly divergent mosquito
RT engrailed cDNA using a homing, enhancer-trapping transposon.";
RL Development 124:1531-1541(1997).
RN [2]
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RC STRAIN=PEST;
RX PubMed=12364791; DOI=10.1126/science.1076181;
RA Holt R.A., Subramanian G.M., Halpern A., Sutton G.G., Charlab R.,
RA Nusskern D.R., Wincker P., Clark A.G., Ribeiro J.M.C., Wides R.,
RA Salzberg S.L., Loftus B.J., Yandell M.D., Majoros W.H., Rusch D.B., Lai Z.,
RA Kraft C.L., Abril J.F., Anthouard V., Arensburger P., Atkinson P.W.,
RA Baden H., de Berardinis V., Baldwin D., Benes V., Biedler J., Blass C.,
RA Bolanos R., Boscus D., Barnstead M., Cai S., Center A., Chaturverdi K.,
RA Christophides G.K., Chrystal M.A.M., Clamp M., Cravchik A., Curwen V.,
RA Dana A., Delcher A., Dew I., Evans C.A., Flanigan M.,
RA Grundschober-Freimoser A., Friedli L., Gu Z., Guan P., Guigo R.,
RA Hillenmeyer M.E., Hladun S.L., Hogan J.R., Hong Y.S., Hoover J.,
RA Jaillon O., Ke Z., Kodira C.D., Kokoza E., Koutsos A., Letunic I.,
RA Levitsky A.A., Liang Y., Lin J.-J., Lobo N.F., Lopez J.R., Malek J.A.,
RA McIntosh T.C., Meister S., Miller J.R., Mobarry C., Mongin E., Murphy S.D.,
RA O'Brochta D.A., Pfannkoch C., Qi R., Regier M.A., Remington K., Shao H.,
RA Sharakhova M.V., Sitter C.D., Shetty J., Smith T.J., Strong R., Sun J.,
RA Thomasova D., Ton L.Q., Topalis P., Tu Z.J., Unger M.F., Walenz B.,
RA Wang A.H., Wang J., Wang M., Wang X., Woodford K.J., Wortman J.R., Wu M.,
RA Yao A., Zdobnov E.M., Zhang H., Zhao Q., Zhao S., Zhu S.C., Zhimulev I.,
RA Coluzzi M., della Torre A., Roth C.W., Louis C., Kalush F., Mural R.J.,
RA Myers E.W., Adams M.D., Smith H.O., Broder S., Gardner M.J., Fraser C.M.,
RA Birney E., Bork P., Brey P.T., Venter J.C., Weissenbach J., Kafatos F.C.,
RA Collins F.H., Hoffman S.L.;
RT "The genome sequence of the malaria mosquito Anopheles gambiae.";
RL Science 298:129-149(2002).
CC -!- FUNCTION: This protein specifies the body segmentation pattern. It is
CC required for the development of the central nervous system.
CC Transcriptional regulator that represses activated promoters (By
CC similarity). {ECO:0000250}.
CC -!- SUBCELLULAR LOCATION: Nucleus {ECO:0000255|PROSITE-ProRule:PRU00108}.
CC -!- SIMILARITY: Belongs to the engrailed homeobox family. {ECO:0000305}.
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR EMBL; U42214; AAB58461.1; -; Genomic_DNA.
DR EMBL; U42429; AAB54088.1; -; mRNA.
DR EMBL; AAAB01008964; EAA43906.3; -; Genomic_DNA.
DR RefSeq; XP_317438.3; XM_317438.4.
DR AlphaFoldDB; O02491; -.
DR SMR; O02491; -.
DR STRING; 7165.AGAP008023-PA; -.
DR PaxDb; O02491; -.
DR GeneID; 1277926; -.
DR KEGG; aga:AgaP_AGAP008023; -.
DR CTD; 1277926; -.
DR VEuPathDB; VectorBase:AGAP008023; -.
DR eggNOG; KOG0493; Eukaryota.
DR HOGENOM; CLU_021330_0_0_1; -.
DR InParanoid; O02491; -.
DR OrthoDB; 858478at2759; -.
DR PhylomeDB; O02491; -.
DR Proteomes; UP000007062; Chromosome 3R.
DR GO; GO:0005634; C:nucleus; IBA:GO_Central.
DR GO; GO:0000981; F:DNA-binding transcription factor activity, RNA polymerase II-specific; IBA:GO_Central.
DR GO; GO:0000978; F:RNA polymerase II cis-regulatory region sequence-specific DNA binding; IBA:GO_Central.
DR GO; GO:0030182; P:neuron differentiation; IBA:GO_Central.
DR GO; GO:0006357; P:regulation of transcription by RNA polymerase II; IBA:GO_Central.
DR GO; GO:0007367; P:segment polarity determination; IEA:UniProtKB-KW.
DR CDD; cd00086; homeodomain; 1.
DR InterPro; IPR019549; Homeobox-engrailed_C-terminal.
DR InterPro; IPR009057; Homeobox-like_sf.
DR InterPro; IPR017970; Homeobox_CS.
DR InterPro; IPR001356; Homeobox_dom.
DR InterPro; IPR000747; Homeobox_engrailed.
DR InterPro; IPR020479; Homeobox_metazoa.
DR InterPro; IPR019737; Homoebox-engrailed_CS.
DR InterPro; IPR000047; HTH_motif.
DR Pfam; PF10525; Engrail_1_C_sig; 1.
DR Pfam; PF00046; Homeodomain; 1.
DR PRINTS; PR00026; ENGRAILED.
DR PRINTS; PR00024; HOMEOBOX.
DR PRINTS; PR00031; HTHREPRESSR.
DR SMART; SM00389; HOX; 1.
DR SUPFAM; SSF46689; SSF46689; 1.
DR PROSITE; PS00033; ENGRAILED; 1.
DR PROSITE; PS00027; HOMEOBOX_1; 1.
DR PROSITE; PS50071; HOMEOBOX_2; 1.
PE 2: Evidence at transcript level;
KW Developmental protein; DNA-binding; Homeobox; Nucleus; Reference proteome;
KW Segmentation polarity protein.
FT CHAIN 1..594
FT /note="Segmentation polarity homeobox protein engrailed"
FT /id="PRO_0000196075"
FT DNA_BIND 496..555
FT /note="Homeobox"
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00108"
FT REGION 1..64
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 76..127
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 141..164
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 198..217
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 231..299
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 387..458
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 474..501
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 14..62
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 93..112
FT /note="Basic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 113..127
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 234..299
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 387..410
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 425..440
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 477..500
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT CONFLICT 47
FT /note="T -> S (in Ref. 1; AAB54088/AAB58461)"
FT /evidence="ECO:0000305"
FT CONFLICT 67
FT /note="L -> V (in Ref. 1; AAB54088/AAB58461)"
FT /evidence="ECO:0000305"
FT CONFLICT 154
FT /note="P -> S (in Ref. 1; AAB54088/AAB58461)"
FT /evidence="ECO:0000305"
FT CONFLICT 291
FT /note="V -> A (in Ref. 1; AAB54088/AAB58461)"
FT /evidence="ECO:0000305"
FT CONFLICT 335
FT /note="N -> NR (in Ref. 1; AAB54088/AAB58461)"
FT /evidence="ECO:0000305"
FT CONFLICT 343
FT /note="A -> R (in Ref. 1; AAB54088/AAB58461)"
FT /evidence="ECO:0000305"
FT CONFLICT 356
FT /note="G -> S (in Ref. 1; AAB54088/AAB58461)"
FT /evidence="ECO:0000305"
FT CONFLICT 362
FT /note="G -> GA (in Ref. 1; AAB54088/AAB58461)"
FT /evidence="ECO:0000305"
FT CONFLICT 399
FT /note="A -> T (in Ref. 1; AAB54088/AAB58461)"
FT /evidence="ECO:0000305"
FT CONFLICT 436..437
FT /note="DA -> ER (in Ref. 1; AAB54088/AAB58461)"
FT /evidence="ECO:0000305"
FT CONFLICT 490..492
FT /note="EKG -> KRA (in Ref. 1; AAB54088/AAB58461)"
FT /evidence="ECO:0000305"
SQ SEQUENCE 594 AA; 63524 MW; D5F5F8191C7EEB89 CRC64;
MALEDRCSPQ SAPSPPHHHH SSQSPTSTTT VTMATASPVP ACTTTTTTTS TSGASAASSP
TRDEMSLVVP ISPLHIKQEP LGSDGPMPAQ PPHHHQHPHH HQLPHHPHHQ HHPQQQPSPQ
TSPPASISFS ITNILSDRFG KATAEQQQQP HPQPPAIREP ISPGPIHPAV LLPYPQHVLH
PAHHPALLHP AYHTGLHHYY QPSPSHPQPI VPQPQRASLE RRDSLFRPYD ISKSPRLCSS
NGSSSATPLP LHPYHTDSDC STQDSTSAPS PATYGDIASP SSASSAMTTP VTTSSPTGSV
YDYSRKASAL DHRAALLNGF SAAASYPKLH EEIINPPQVP GEADRIANEG GTGCGGHGCC
GGSATPHNMP PLGSLCKTVS QIGQHVAGTG SLNGSGSAAN GASNGGSGAP ATAKPTPKPI
PKPAPSSETN GSSSQDAGME SSDDAKSETS STKDGSENGS NLWPAWVYCT RYSDRPSSGP
RYRRTKQPKE KGDSEEKRPR TAFSNAQLQR LKNEFNENRY LTEKRRQTLS AELGLNEAQI
KIWFQNKRAK IKKSSSEKNP LALQLMAQGL YNHSTVPLTK EEEELEMRMN GQIP