FIBH_BOMMO
ID FIBH_BOMMO Reviewed; 5263 AA.
AC P05790; Q17220; Q26379;
DT 01-NOV-1988, integrated into UniProtKB/Swiss-Prot.
DT 01-DEC-2000, sequence version 4.
DT 25-MAY-2022, entry version 91.
DE RecName: Full=Fibroin heavy chain;
DE Short=Fib-H;
DE AltName: Full=H-fibroin;
DE Flags: Precursor;
GN Name=FIBH;
OS Bombyx mori (Silk moth).
OC Eukaryota; Metazoa; Ecdysozoa; Arthropoda; Hexapoda; Insecta; Pterygota;
OC Neoptera; Endopterygota; Lepidoptera; Glossata; Ditrysia; Bombycoidea;
OC Bombycidae; Bombycinae; Bombyx.
OX NCBI_TaxID=7091;
RN [1]
RP NUCLEOTIDE SEQUENCE [GENOMIC DNA].
RX PubMed=10871375; DOI=10.1093/nar/28.12.2413;
RA Zhou C.-Z., Confalonieri F., Medina N., Zivanovic Y., Esnault C., Yang T.,
RA Jacquet M., Janin J., Duguet M., Perasso R., Li Z.-G.;
RT "Fine organization of Bombyx mori fibroin heavy chain gene.";
RL Nucleic Acids Res. 28:2413-2419(2000).
RN [2]
RP NUCLEOTIDE SEQUENCE [GENOMIC DNA] OF 1-168.
RX PubMed=498286; DOI=10.1016/0092-8674(79)90075-8;
RA Tsujimoto Y., Suzuki Y.;
RT "The DNA sequence of Bombyx mori fibroin gene including the 5' flanking,
RT mRNA coding, entire intervening and fibroin protein coding regions.";
RL Cell 18:591-600(1979).
RN [3]
RP PROTEIN SEQUENCE OF 22-31.
RX PubMed=16466694; DOI=10.1016/j.bbrc.2006.01.081;
RA Wang S.P., Guo T.Q., Guo X.Y., Huang J.T., Lu C.D.;
RT "In vivo analysis of fibroin heavy chain signal peptide of silkworm Bombyx
RT mori using recombinant baculovirus as vector.";
RL Biochem. Biophys. Res. Commun. 341:1203-1210(2006).
RN [4]
RP PARTIAL NUCLEOTIDE SEQUENCE [GENOMIC DNA].
RX PubMed=455439; DOI=10.1016/0092-8674(79)90018-7;
RA Tsujimoto Y., Suzuki Y.;
RT "Structural analysis of the fibroin gene at the 5' end and its surrounding
RT regions.";
RL Cell 16:425-436(1979).
RN [5]
RP PARTIAL NUCLEOTIDE SEQUENCE [MRNA].
RC STRAIN=Kinshu X Showa;
RX PubMed=3210244; DOI=10.1016/0022-2836(88)90117-9;
RA Mita K., Ichimura S., Zama M., James T.C.;
RT "Specific codon usage pattern and its implications on the secondary
RT structure of silk fibroin mRNA.";
RL J. Mol. Biol. 203:917-925(1988).
RN [6]
RP PARTIAL NUCLEOTIDE SEQUENCE [MRNA].
RX PubMed=7916056; DOI=10.1007/bf00175878;
RA Mita K., Ichimura S., James T.C.;
RT "Highly repetitive structure and its organization of the silk fibroin
RT gene.";
RL J. Mol. Evol. 38:583-592(1994).
RN [7]
RP PROTEIN SEQUENCE OF 22-38; 46-63; 81-171; 191-213; 274-311; 377-409;
RP 414-438; 542-561; 632-691; 695-726; 930-951; 1032-1051; 1116-1137;
RP 1198-1308; 1364-1382; 1441-1462; 1512-1530; 1589-1682; 1685-1703;
RP 1741-1873; 1903-1914; 2024-2084; 2105-2124; 2213-2232; 2271-2369;
RP 2506-2527; 2532-2547; 2553-2580; 2590-2663; 2673-2694; 2797-2816;
RP 2836-2900; 2914-2941; 2957-2976; 3014-3069; 3111-3126; 3165-3192;
RP 3229-3302; 3400-3427; 3488-3507; 3527-3591; 3605-3632; 3648-3667;
RP 3705-3732; 3748-3866; 3909-3936; 3989-4016; 4026-4055; 4095-4106;
RP 4160-4189; 4197-4251; 4269-4296; 4324-4345; 4350-4365; 4383-4410;
RP 4414-4441; 4506-4598; 4648-4682; 4755-4782; 4898-4925; 4947-4958;
RP 5033-5069; 5079-5094; 5115-5148; 5151-5233 AND 5256-5263, AND
RP IDENTIFICATION BY MASS SPECTROMETRY.
RA Lubec G., Chen W.-Q.;
RL Submitted (AUG-2009) to UniProtKB.
RN [8]
RP NUCLEOTIDE SEQUENCE [GENOMIC DNA] OF 5179-5263, AND DISULFIDE BONDS.
RC STRAIN=J-139;
RX PubMed=10366732; DOI=10.1016/s0167-4838(99)00088-6;
RA Tanaka K., Kajiyama N., Ishikura K., Waga S., Kikuchi A., Ohtomo K.,
RA Takagi T., Mizuno S.;
RT "Determination of the site of disulfide linkage between heavy and light
RT chains of silk fibroin produced by Bombyx mori.";
RL Biochim. Biophys. Acta 1432:92-103(1999).
RN [9]
RP PARTIAL NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RC STRAIN=p50T;
RX PubMed=15591204; DOI=10.1126/science.1102210;
RA Xia Q., Zhou Z., Lu C., Cheng D., Dai F., Li B., Zhao P., Zha X., Cheng T.,
RA Chai C., Pan G., Xu J., Liu C., Lin Y., Qian J., Hou Y., Wu Z., Li G.,
RA Pan M., Li C., Shen Y., Lan X., Yuan L., Li T., Xu H., Yang G., Wan Y.,
RA Zhu Y., Yu M., Shen W., Wu D., Xiang Z., Yu J., Wang J., Li R., Shi J.,
RA Li H., Li G., Su J., Wang X., Li G., Zhang Z., Wu Q., Li J., Zhang Q.,
RA Wei N., Xu J., Sun H., Dong L., Liu D., Zhao S., Zhao X., Meng Q., Lan F.,
RA Huang X., Li Y., Fang L., Li C., Li D., Sun Y., Zhang Z., Yang Z.,
RA Huang Y., Xi Y., Qi Q., He D., Huang H., Zhang X., Wang Z., Li W., Cao Y.,
RA Yu Y., Yu H., Li J., Ye J., Chen H., Zhou Y., Liu B., Wang J., Ye J.,
RA Ji H., Li S., Ni P., Zhang J., Zhang Y., Zheng H., Mao B., Wang W., Ye C.,
RA Li S., Wang J., Wong G.K.-S., Yang H.;
RT "A draft sequence for the genome of the domesticated silkworm (Bombyx
RT mori).";
RL Science 306:1937-1940(2004).
RN [10]
RP SUBUNIT.
RX PubMed=10986287; DOI=10.1074/jbc.m006897200;
RA Inoue S., Tanaka K., Arisaka F., Kimura S., Ohtomo K., Mizuno S.;
RT "Silk fibroin of Bombyx mori is secreted, assembling a high molecular mass
RT elementary unit consisting of H-chain, L-chain, and p25, with a 6:6:1 molar
RT ratio.";
RL J. Biol. Chem. 275:40517-40528(2000).
CC -!- FUNCTION: Core component of the silk filament; a strong, insoluble and
CC chemically inert fiber.
CC -!- SUBUNIT: The silk fibroin elementary unit consists of disulfide-linked
CC heavy and light chains and a p25 glycoprotein in a molar ratio of 6:6:1.
CC This results in a complex of approximately 2.3 MDa.
CC {ECO:0000269|PubMed:10366732, ECO:0000269|PubMed:10986287}.
CC -!- TISSUE SPECIFICITY: Produced exclusively in the posterior (PSG) section
CC of silk glands, which are essentially modified salivary glands.
CC -!- DOMAIN: Composed of antiparallel beta sheets. The strands of the beta
CC sheets run parallel to the fiber axis. Long stretches of silk fibroin
CC are composed of microcrystalline arrays of (-Gly-Ser-Gly-Ala-Gly-Ala-)n
CC interrupted by regions containing bulkier residues. The fiber is
CC composed of microcrystalline arrays alternating with amorphous regions.
CC -!- PTM: The interchain disulfide bridge is essential for the intracellular
CC transport and secretion of fibroin.
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR EMBL; AF226688; AAF76983.1; -; Genomic_DNA.
DR EMBL; V00094; CAA23432.1; -; Genomic_DNA.
DR EMBL; V00097; CAA23433.1; -; Genomic_DNA.
DR EMBL; S74439; AAB31861.1; -; mRNA.
DR EMBL; X13869; CAA32076.1; -; mRNA.
DR EMBL; M35378; AAA27839.1; -; mRNA.
DR EMBL; AB017362; BAA33147.1; -; Genomic_DNA.
DR EMBL; CK538369; -; NOT_ANNOTATED_CDS; mRNA.
DR EMBL; AADK01000575; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR PIR; S01844; S01844.
DR RefSeq; NP_001106733.1; NM_001113262.1.
DR PDB; 3UA0; X-ray; 3.00 A; A/B=1-126.
DR PDBsum; 3UA0; -.
DR SMR; P05790; -.
DR STRING; 7091.BGIBMGA005111-TA; -.
DR GeneID; 693030; -.
DR KEGG; bmor:693030; -.
DR CTD; 693030; -.
DR OrthoDB; 777540at2759; -.
DR Proteomes; UP000005204; Unassembled WGS sequence.
PE 1: Evidence at protein level;
KW 3D-structure; Direct protein sequencing; Disulfide bond;
KW Reference proteome; Repeat; Signal; Silk protein.
FT SIGNAL 1..21
FT /evidence="ECO:0000269|PubMed:16466694, ECO:0000269|Ref.7"
FT CHAIN 22..5263
FT /note="Fibroin heavy chain"
FT /id="PRO_0000021255"
FT REGION 149..5206
FT /note="Highly repetitive"
FT DISULFID 5244
FT /note="Interchain (with C-190 in light chain)"
FT /evidence="ECO:0000269|PubMed:10366732"
FT DISULFID 5260..5263
FT /evidence="ECO:0000269|PubMed:10366732"
FT CONFLICT 10
FT /note="C -> V (in Ref. 2)"
FT /evidence="ECO:0000305"
FT STRAND 31..39
FT /evidence="ECO:0007829|PDB:3UA0"
FT STRAND 42..48
FT /evidence="ECO:0007829|PDB:3UA0"
FT STRAND 54..65
FT /evidence="ECO:0007829|PDB:3UA0"
FT STRAND 79..92
FT /evidence="ECO:0007829|PDB:3UA0"
FT STRAND 97..107
FT /evidence="ECO:0007829|PDB:3UA0"
SQ SEQUENCE 5263 AA; 391593 MW; 8EE11D3A0A47440E CRC64;
MRVKTFVILC CALQYVAYTN ANINDFDEDY FGSDVTVQSS NTTDEIIRDA SGAVIEEQIT
TKKMQRKNKN HGILGKNEKM IKTFVITTDS DGNESIVEED VLMKTLSDGT VAQSYVAADA
GAYSQSGPYV SNSGYSTHQG YTSDFSTSAA VGAGAGAGAA AGSGAGAGAG YGAASGAGAG
AGAGAGAGYG TGAGAGAGAG YGAGAGAGAG AGYGAGAGAG AGAGYGAGAG AGAGAGYGAG
AGAGAGAGYG AGAGAGAGAG YGAASGAGAG AGYGQGVGSG AASGAGAGAG AGSAAGSGAG
AGAGTGAGAG YGAGAGAGAG AGYGAASGTG AGYGAGAGAG YGGASGAGAG AGAGAGAGAG
AGYGTGAGYG AGAGAGAGAG AGAGYGAGAG AGYGAGYGVG AGAGYGAGYG AGAGSGAASG
AGSGAGAGSG AGAGSGAGAG SGAGAGSGAG AGSGAGAGSG AGAGSGAGAG SGTGAGSGAG
AGYGAGAGAG YGAGAGSGAA SGAGAGSGAG AGSGAGAGSG AGAGSGAGAG SGAGAGYGAG
AGAGYGAGAG AGYGAGAGVG YGAGAGSGAA SGAGAGSGAG AGSGAGAGSG AGAGSGAGAG
SGAGAGSGAG AGSGAGAGSG AGAGSGAGVG YGAGVGAGYG AGYGAGAGAG YGAGAGSGAA
SGAGAGAGAG AGTGSSGFGP YVANGGYSRS DGYEYAWSSD FGTGSGAGAG SGAGAGSGAG
AGSGAGAGSG AGAGSGAGAG YGAGVGVGYG AGYGAGAGAG YGAGAGSGAA SGAGAGSGAG
AGSGAGAGSG AGAGSGAGAG SGAGAGSGAG AGSGAGAGSG AGAGSGAGAG SGAGVGSGAG
AGSGAGAGVG YGAGAGVGYG AGAGSGAASG AGAGSGAGAG SGAGAGSGAG AGSGAGAGSG
AGAGSGAGAG SGAGAGSGAG AGSGAGVGYG AGVGAGYGAG YGAGAGAGYG AGAGSGAASG
AGAGSGAGAG SGAGAGSGAG AGSGAGAGSG AGAGSGAGAG SGAGAGSGAG AGSGAGAGSG
AGAGSGAGAG YGAGAGAGYG AGYGAGAGAG YGAGAGSGAA SGAGSGAGAG SGAGAGAGSG
AGAGSGAGAG SGAGAGSGAG AGSGAGAGSG AGAGYGAGVG AGYGAGYGAG AGAGYGAGAG
SGAASGAGAG SGAGAGSGAG AGSGAGAGSG AGAGSGAGAG SGAGAGSGAG VGYGAGYGAG
AGAGYGAGAG SGAASGAGAG AGAGAGTGSS GFGPYVAHGG YSGYEYAWSS ESDFGTGSGA
GAGSGAGAGS GAGAGSGAGA GSGAGYGAGV GAGYGAGYGA GAGAGYGAGA GSGAGSGAGA
GSGAGAGSGA GAGSGAGAGS GAGAGSGAGA GSGAGAGSGA GAGSGAGAGY GAGYGAGAGA
GYGAGAGSGA GSGAGAGSGA GAGSGAGAGS GAGAGSGAGA GSGAGAGSGA GAGSGAGAGY
GAGVGAGYGA GYGAGAGAGY GAGAGSGAGS GAGAGSGAGA GSGAGAGSGA GVGSGAGAGS
GAGAGSGAGA GSGAGAGYGA GYGAGAGAGY GAGAGSGAGS GAGAGSGAGA GSGAGAGSGA
GAGSGAGAGS GAGAGSGAGA GSGAGVGYGA GVGAGYGAGY GAGAGAGYGA GAGSGAASGA
GAGAGAGAGT GSSGFGPYVA NGGYSGYEYA WSSESDFGTG SGAGAGSGAG AGSGAGAGSG
AGAGSGAGAG YGAGYGAGAG AGYGAGAGSG AGSGAGAGSG AGAGSGAGAG SGAGAGSGAG
AGSGAGAGSG AGAGSGAGSG SGAGAGSGAG AGSGAGAGYG AGVGAGYGVG YGAGAGAGYG
AGAGSGAASG AGAGAGAGAG TGSSGFGPYV AHGGYSGYEY AWSSESDFGT GSGAGAGSGA
GAGSGAGAGS GAGAGSGAGA GSGAGAGSGA GAGYGAGVGA GYGAAYGAGA GAGYGAGAGS
GAASGAGAGS GAGAGSGAGA GSGAGAGSGA GAGSGAGAGS GAGAGSGAGA GSGAGAGSGA
GAGSGAGAGY GAGAGAGYGA GAGSGAGSGA GAGSGAGAGS GAGAGSGAGA GSGAGAGSGA
GSGSGAGAGS GAGAGSGAGA GYGAGVGAGY GAGYGAGAGA GYGAGAGSGA GSGAGAGSGA
GAGYGAGAGA GYGAGYGAGA GAGYGAGAGT GAGSGAGAGS GAGAGSGAGA GSGAGAGSGA
GAGSGAGAGS GAGSGSGAGA GSGAGAGSGA GAGSGAGAGS GAGAGSGAGA GYGAGAGAGY
GAGYGAGAGA GYGAGAGSGA GSGAGAGSGA GAGSGAGAGS GAGAGYGAGY GAGAGSGAAS
GAGAGAGAGA GTGSSGFGPY VAHGGYSGYE YAWSSESDFG TGSGAGAGSG AGAGAGAGAG
SGAGAGYGAG VGAGYGAGYG AGAGAGYGAG AGSGTGSGAG AGSGAGAGYG AGVGAGYGAG
AGSGAAFGAG AGAGAGSGAG AGSGAGAGSG AGAGSGAGAG SGAGAGYGAG YGAGVGAGYG
AGAGSGAASG AGAGSGAGAG SGAGAGSGAG AGSGAGAGSG AGAGYGAGVG AGYGAGYGAG
AGAGYGAGAG SGAASGAGAG SGAGAGAGSG AGAGSGAGAG SGAGAGSGAG SGAGAGSGAG
AGSGAGAGYG AGAGSGAASG AGAGAGAGAG TGSSGFGPYV ANGGYSGYEY AWSSESDFGT
GSGAGAGSGA GAGSGAGAGS GAGAGSGAGA GYGAGVGAGY GAGYGAGAGA GYGAGAGSGA
GSGAGAGSGA GAGSGAGAGS GAGAGSGAGA GSGAGAGSGA GAGYGAGAGS GAASGAGAGS
GAGAGSGAGA GSGAGAGSGA GAGSGAGAGS GAGAGYGAGV GAGYGVGYGA GAGAGYGAGA
GSGAGSGAGA GSGAGAGSGA GAGSGAGAGS GAGSGAGAGS GAGAGSGAGA GSGAGSGAGA
GSGAGAGYGV GYGAGAGAGY GAGAGSGAGS GAGAGSGAGA GSGAGAGSGA GSGAGAGSGA
GAGSGAGAGS GAGAGYGAGV GAGYGVGYGA GAGAGYGAGA GSGAGSGAGA GSGAGAGSGA
GAGSGAGAGS GAGAGSGAGA GSGAGAGSGA GSGAGAGSGA GAGSGAGAGS GAGAGSGAGS
GAGAGSGAGA GSGAGAGSGA GAGYGAGVGA GYGVGYGAGV GAGYGAGAGS GAASGAGAGS
GAGAGAGSGA GAGSGAGAGS GAGAGSGAGA GSGAGAGSGA GAGYGAGYGA GVGAGYGAGA
GVGYGAGAGA GYGAGAGSGA ASGAGAGAGS GAGAGTGAGA GSGAGAGYGA GAGSGAASGA
GAGAGAGAGT GSSGFGPYVA NGGYSGYEYA WSSESDFGTG SGAGAGSGAG AGSGAGAGSG
AGAGSGAGAG YGAGVGAGYG AGAGSGAGSG AGAGSGAGAG SGAGAGSGAG AGSGAGAGYG
AGAGSGTGSG AGAGSGAGAG SGAGAGSGAG AGSGAGAGSG AGAGSGVGAG YGVGYGAGAG
AGYGVGYGAG AGAGYGAGAG SGTGSGAGAG SGAGAGSGAG AGSGAGAGSG AGAGSGAGAG
SGAGAGYGAG VGAGYGVGYG AGAGAGYGAG AGSGAGSGAG AGSGAGAGSG AGAGSGAGAG
SGAGSGAGAG SGAGAGSGAG AGSGAGSGAG AGSGAGAGYG VGYGAGAGAG YGAGAGSGAG
SGAGAGSGAG AGSGAGAGSG AGSGAGAGSG AGAGSGAGAG SGAGAGYGAG VGAGYGVGYG
AGAGAGYGAG AGSGAGSGAG AGSGAGAGSG AGAGSGAGAG SGAGAGSGAG AGSGAGAGSG
AGSGAGAGSG AGAGSGAGAG SGAGAGYGAG VGAGYGVGYG AGAGAGYGAG AGSGAASGAG
AGAGAGAGTG SSGFGPYVAN GGYSGYEYAW SSESDFGTGS GAGAGSGAGA GSGAGAGYGA
GYGAGVGAGY GAGAGVGYGA GAGAGYGAGA GSGAASGAGA GAGAGAGSGA GAGSGAGAGA
GSGAGAGYGA GYGIGVGAGY GAGAGVGYGA GAGAGYGAGA GSGAASGAGA GSGAGAGSGA
GAGSGAGAGS GAGAGSGAGA GSGAGAGYGA GYGAGVGAGY GAGAGVGYGA GAGAGYGAGA
GSGAASGAGA GAGAGAGAGS GAGAGSGAGA GSGAGAGSGA GAGSGAGAGS GAGAGSGAGA
GSGAGAGSGA GAGYGAGVGA GYGAGYGGAG AGYGAGAGSG AASGAGAGSG AGAGSGAGAG
SGAGAGSGAG AGSGAGAGYG AGAGSGAASG AGAGAGAGAG TGSSGFGPYV NGGYSGYEYA
WSSESDFGTG SGAGAGSGAG AGSGAGAGYG AGVGAGYGAG YGAGAGAGYG AGAGSGAASG
AGAGSGAGAG SGAGAGSGAG AGSGAGSGAG AGSGAGAGSG AGAGSGAGAG SGAGAGSGAG
AGYGAGVGAG YGAGYGAGAG AGYGAGAGSG AASGAGAGSG AGAGAGSGAG AGSGAGAGSG
AGAGSGAGAG SGAGAGSGAG SGAGAGSGAG AGYGAGYGAG VGAGYGAGAG VGYGAGAGAG
YGAGAGSGAA SGAGAGSGSG AGSGAGAGSG AGAGSGAGAG AGSGAGAGSG AGAGSGAGAG
YGAGYGAGAG SGAASGAGAG AGAGAGTGSS GFGPYVANGG YSGYEYAWSS ESDFGTGSGA
GAGSGAGAGS GAGAGYGAGV GAGYGAGYGA GAGAGYGAGA GSGAGSGAGA GSGAGAGSGA
GAGSGAGAGS GAGAGSGAGA GSGAGAGSGA GAGYGAGYGA GAGAGYGAGA GVGYGAGAGA
GYGAGAGSGA GSGAGAGSGS GAGAGSGSGA GSGAGAGSGA GAGSGAGAGS GAGAGSGAGA
GSGAGAGSGA GAGYGAGYGI GVGAGYGAGA GVGYGAGAGA GYGAGAGSGA ASGAGAGSGA
GAGSGAGAGS GAGAGSGAGA GSGAGAGSGA GAGSGAGAGS GAGAGSGAGA GYGAGAGVGY
GAGAGSGAAS GAGAGSGAGA GSGAGAGSGA GAGSGAGAGS GAGAGSGAGA GSGAGSGAGA
GSGAGAGYGA GYGAGVGAGY GAGAGYGAGY GVGAGAGYGA GAGSGAGSGA GAGSGAGAGS
GAGAGSGAGA GSGAGAGSGA GSGAGAGYGA GAGAGYGAGA GAGYGAGAGS GAASGAGAGA
GAGSGAGAGS GAGAGSGAGS GAGAGSGAGA GYGAGAGSGA ASGAGAGSGA GAGAGAGAGA
GSGAGAGSGA GAGYGAGAGS GAASGAGAGA GAGTGSSGFG PYVANGGYSR REGYEYAWSS
KSDFETGSGA ASGAGAGAGS GAGAGSGAGA GSGAGAGSGA GAGGSVSYGA GRGYGQGAGS
AASSVSSASS RSYDYSRRNV RKNCGIPRRQ LVVKFRALPC VNC
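
The SQ line above declares a length (5263 AA) that the sequence block should match once the spaces between 10-residue groups are removed. As a minimal sketch of that consistency check, assuming the record is available as a list of text lines, the Python below reads the declared length and molecular weight from the SQ header and concatenates the sequence lines that follow. The function name `parse_sq_block` and the toy record are hypothetical and are not part of this entry; the toy molecular weight and CRC64 values are placeholders, not computed values.

```python
import re

# Matches an SQ header such as: "SQ SEQUENCE 5263 AA; 391593 MW; ... CRC64;"
SQ_HEADER = re.compile(r"^SQ\s+SEQUENCE\s+(\d+)\s+AA;\s+(\d+)\s+MW;")


def parse_sq_block(lines):
    """Return (declared_length, declared_mw, sequence) from flat-file lines.

    Hypothetical helper: it only understands the SQ header and the
    sequence lines after it, and stops at the "//" terminator if present.
    """
    declared_len = None
    declared_mw = None
    chunks = []
    in_sequence = False
    for line in lines:
        if not in_sequence:
            match = SQ_HEADER.match(line)
            if match:
                declared_len = int(match.group(1))
                declared_mw = int(match.group(2))
                in_sequence = True
        else:
            if line.startswith("//"):
                break
            # Drop the spaces that separate the 10-residue groups.
            chunks.append("".join(line.split()))
    if declared_len is None:
        raise ValueError("no SQ header line found")
    return declared_len, declared_mw, "".join(chunks)


# Toy record (NOT this entry): placeholder MW and CRC64 values.
toy_record = [
    "ID TOY_EXAMPLE Reviewed; 13 AA.",
    "SQ SEQUENCE 13 AA; 1000 MW; 0000000000000000 CRC64;",
    "GSGAGAGSGA GAG",
    "//",
]

length, mw, sequence = parse_sq_block(toy_record)
print(length, len(sequence), length == len(sequence))
```

Applied to the lines of this record, the same comparison of `length` against `len(sequence)` would flag a truncated or damaged sequence block. For production use, an established parser such as Biopython's `Bio.SwissProt` module is likely a better choice than this sketch.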