MUCAP_PIG
ID MUCAP_PIG Reviewed; 1150 AA.
AC P12021;
DT 01-OCT-1989, integrated into UniProtKB/Swiss-Prot.
DT 01-DEC-1992, sequence version 2.
DT 25-MAY-2022, entry version 105.
DE RecName: Full=Apomucin;
DE AltName: Full=Mucin core protein;
DE Flags: Fragment;
OS Sus scrofa (Pig).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia;
OC Eutheria; Laurasiatheria; Artiodactyla; Suina; Suidae; Sus.
OX NCBI_TaxID=9823;
RN [1]
RP NUCLEOTIDE SEQUENCE [MRNA].
RC TISSUE=Submandibular gland;
RX PubMed=2033060; DOI=10.1016/s0021-9258(18)92874-7;
RA Eckhardt A.E., Timpte C.S., Abernethy J.L., Zhao Y., Hill R.L.;
RT "Porcine submaxillary mucin contains a cystine-rich, carboxyl-terminal
RT domain in addition to a highly repetitive, glycosylated domain.";
RL J. Biol. Chem. 266:9678-9686(1991).
RN [2]
RP NUCLEOTIDE SEQUENCE [MRNA] OF 1-503.
RC TISSUE=Submandibular gland;
RX PubMed=2826455; DOI=10.1016/s0021-9258(19)35463-8;
RA Timpte C.S., Eckhardt A.E., Abernethy J.L., Hill R.L.;
RT "Porcine submaxillary gland apomucin contains tandemly repeated, identical
RT sequences of 81 residues.";
RL J. Biol. Chem. 263:1081-1088(1988).
RN [3]
RP PROTEIN SEQUENCE OF 45-80.
RC TISSUE=Submandibular gland;
RX PubMed=3611111; DOI=10.1016/s0021-9258(18)60964-0;
RA Eckhardt A.E., Timpte C.S., Abernethy J.L., Toumadje A., Johnson W.C. Jr.,
RA Hill R.L.;
RT "Structural properties of porcine submaxillary gland apomucin.";
RL J. Biol. Chem. 262:11339-11344(1987).
RN [4]
RP PROTEIN SEQUENCE OF 45-125, AND GLYCOSYLATION AT SER-46; SER-50; SER-51;
RP SER-57; SER-58; SER-61; THR-66; SER-67; THR-73; THR-74; SER-76; SER-77;
RP THR-81; THR-83; SER-87; SER-91; THR-93; THR-94; THR-96; SER-98; SER-101;
RP SER-103; THR-104; SER-106; SER-107; SER-108; SER-110; THR-114; SER-117;
RP THR-123 AND SER-124.
RC TISSUE=Submandibular gland;
RX PubMed=9092502; DOI=10.1074/jbc.272.15.9709;
RA Gerken T.A., Owens C.L., Pasumarthy M.;
RT "Determination of the site-specific O-glycosylation pattern of the porcine
RT submaxillary mucin tandem repeat glycopeptide. Model proposed for the
RT polypeptide:GalNAc transferase peptide binding site.";
RL J. Biol. Chem. 272:9709-9719(1997).
CC -!- FUNCTION: Apomucin is part of mucin, the major glycoprotein synthesized
CC and secreted by mucous cells of the submaxillary gland. Its highly
CC viscous aqueous solutions serve to lubricate the oral cavity and to
CC protect it from the external environment.
CC -!- SUBUNIT: Intermolecular disulfide bonds could help maintain a
CC multimeric mucin structure.
CC -!- SUBCELLULAR LOCATION: Secreted.
CC -!- TISSUE SPECIFICITY: Submaxillary mucosae.
CC -!- DOMAIN: Contains tandemly repeated, identical sequences of 81 residues.
CC -!- PTM: Extensively O-glycosylated on most but not all Ser and Thr
CC residues of the repeat units. Highest glycosylation appears to occur on
CC Ser residues that have Gly at position +2 or -2 from the
CC glycosylation site, or where Gly is the penultimate residue. The
CC presence of proline (usually at position +3 or -3) also appears to
CC enhance glycosylation. {ECO:0000269|PubMed:9092502}.
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR EMBL; M61883; AAA30998.1; -; mRNA.
DR EMBL; M21174; AAA30990.1; -; mRNA.
DR AlphaFoldDB; P12021; -.
DR SMR; P12021; -.
DR iPTMnet; P12021; -.
DR PaxDb; P12021; -.
DR PRIDE; P12021; -.
DR eggNOG; KOG1216; Eukaryota.
DR Proteomes; UP000008227; Unplaced.
DR Proteomes; UP000314985; Unplaced.
DR GO; GO:0005576; C:extracellular region; IEA:UniProtKB-SubCell.
DR InterPro; IPR006207; Cys_knot_C.
DR InterPro; IPR006208; Glyco_hormone_CN.
DR InterPro; IPR001007; VWF_dom.
DR Pfam; PF00007; Cys_knot; 1.
DR SMART; SM00041; CT; 1.
DR SMART; SM00214; VWC; 2.
DR PROSITE; PS01185; CTCK_1; 1.
DR PROSITE; PS01225; CTCK_2; 1.
DR PROSITE; PS01208; VWFC_1; 1.
DR PROSITE; PS50184; VWFC_2; 1.
PE 1: Evidence at protein level;
KW Direct protein sequencing; Disulfide bond; Glycoprotein;
KW Reference proteome; Repeat; Secreted.
FT CHAIN <1..1150
FT /note="Apomucin"
FT /id="PRO_0000158958"
FT REPEAT <1..44
FT /note="1"
FT REPEAT 45..125
FT /note="2"
FT REPEAT 126..206
FT /note="3"
FT REPEAT 207..287
FT /note="4"
FT REPEAT 288..368
FT /note="5"
FT REPEAT 369..391
FT /note="6; truncated"
FT DOMAIN 929..995
FT /note="VWFC"
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00220"
FT DOMAIN 1062..1146
FT /note="CTCK"
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00039"
FT REGION <1..368
FT /note="6 X 81 AA tandem repeats"
FT REGION 1..730
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 776..925
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 776..792
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 799..917
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT CARBOHYD 46
FT /note="O-linked (GalNAc...) serine; partial"
FT /evidence="ECO:0000269|PubMed:9092502"
FT CARBOHYD 50
FT /note="O-linked (GalNAc...) serine; partial"
FT /evidence="ECO:0000269|PubMed:9092502"
FT CARBOHYD 51
FT /note="O-linked (GalNAc...) serine; partial"
FT /evidence="ECO:0000269|PubMed:9092502"
FT CARBOHYD 57
FT /note="O-linked (GalNAc...) serine; partial"
FT /evidence="ECO:0000269|PubMed:9092502"
FT CARBOHYD 58
FT /note="O-linked (GalNAc...) serine; partial"
FT /evidence="ECO:0000269|PubMed:9092502"
FT CARBOHYD 61
FT /note="O-linked (GalNAc...) serine; partial"
FT /evidence="ECO:0000269|PubMed:9092502"
FT CARBOHYD 66
FT /note="O-linked (GalNAc...) threonine; partial"
FT /evidence="ECO:0000269|PubMed:9092502"
FT CARBOHYD 67
FT /note="O-linked (GalNAc...) serine; partial"
FT /evidence="ECO:0000269|PubMed:9092502"
FT CARBOHYD 73
FT /note="O-linked (GalNAc...) threonine; partial"
FT /evidence="ECO:0000269|PubMed:9092502"
FT CARBOHYD 74
FT /note="O-linked (GalNAc...) threonine; partial"
FT /evidence="ECO:0000269|PubMed:9092502"
FT CARBOHYD 76
FT /note="O-linked (GalNAc...) serine; partial"
FT /evidence="ECO:0000269|PubMed:9092502"
FT CARBOHYD 77
FT /note="O-linked (GalNAc...) serine; partial"
FT /evidence="ECO:0000269|PubMed:9092502"
FT CARBOHYD 81
FT /note="O-linked (GalNAc...) threonine; partial"
FT /evidence="ECO:0000269|PubMed:9092502"
FT CARBOHYD 83
FT /note="O-linked (GalNAc...) threonine; partial"
FT /evidence="ECO:0000269|PubMed:9092502"
FT CARBOHYD 87
FT /note="O-linked (GalNAc...) serine; partial"
FT /evidence="ECO:0000269|PubMed:9092502"
FT CARBOHYD 91
FT /note="O-linked (GalNAc...) serine; partial"
FT /evidence="ECO:0000269|PubMed:9092502"
FT CARBOHYD 93
FT /note="O-linked (GalNAc...) threonine; partial"
FT /evidence="ECO:0000269|PubMed:9092502"
FT CARBOHYD 94
FT /note="O-linked (GalNAc...) threonine; partial"
FT /evidence="ECO:0000269|PubMed:9092502"
FT CARBOHYD 96
FT /note="O-linked (GalNAc...) threonine; partial"
FT /evidence="ECO:0000269|PubMed:9092502"
FT CARBOHYD 98
FT /note="O-linked (GalNAc...) serine; partial"
FT /evidence="ECO:0000269|PubMed:9092502"
FT CARBOHYD 101
FT /note="O-linked (GalNAc...) serine; partial"
FT /evidence="ECO:0000269|PubMed:9092502"
FT CARBOHYD 103
FT /note="O-linked (GalNAc...) serine; partial"
FT /evidence="ECO:0000269|PubMed:9092502"
FT CARBOHYD 104
FT /note="O-linked (GalNAc...) threonine; partial"
FT /evidence="ECO:0000269|PubMed:9092502"
FT CARBOHYD 106
FT /note="O-linked (GalNAc...) serine; partial"
FT /evidence="ECO:0000269|PubMed:9092502"
FT CARBOHYD 107
FT /note="O-linked (GalNAc...) serine; partial"
FT /evidence="ECO:0000269|PubMed:9092502"
FT CARBOHYD 108
FT /note="O-linked (GalNAc...) serine; partial"
FT /evidence="ECO:0000269|PubMed:9092502"
FT CARBOHYD 110
FT /note="O-linked (GalNAc...) serine; partial"
FT /evidence="ECO:0000269|PubMed:9092502"
FT CARBOHYD 114
FT /note="O-linked (GalNAc...) threonine; partial"
FT /evidence="ECO:0000269|PubMed:9092502"
FT CARBOHYD 117
FT /note="O-linked (GalNAc...) serine; partial"
FT /evidence="ECO:0000269|PubMed:9092502"
FT CARBOHYD 123
FT /note="O-linked (GalNAc...) threonine; partial"
FT /evidence="ECO:0000269|PubMed:9092502"
FT CARBOHYD 124
FT /note="O-linked (GalNAc...) serine; partial"
FT /evidence="ECO:0000269|PubMed:9092502"
FT CARBOHYD 418
FT /note="N-linked (GlcNAc...) asparagine"
FT /evidence="ECO:0000255"
FT CARBOHYD 547
FT /note="N-linked (GlcNAc...) asparagine"
FT /evidence="ECO:0000255"
FT CARBOHYD 917
FT /note="N-linked (GlcNAc...) asparagine"
FT /evidence="ECO:0000255"
FT CARBOHYD 985
FT /note="N-linked (GlcNAc...) asparagine"
FT /evidence="ECO:0000255"
FT CARBOHYD 1002
FT /note="N-linked (GlcNAc...) asparagine"
FT /evidence="ECO:0000255"
FT CARBOHYD 1068
FT /note="N-linked (GlcNAc...) asparagine"
FT /evidence="ECO:0000255"
FT DISULFID 1062..1109
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00039"
FT DISULFID 1076..1123
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00039"
FT DISULFID 1085..1139
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00039"
FT DISULFID 1089..1141
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00039"
FT DISULFID ?..1145
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00039"
FT NON_TER 1
SQ SEQUENCE 1150 AA; 109616 MW; 3CB68B5D29DD7F5A CRC64;
ETARPSVAGS GTTGTVSGAS GSTGSSSGST GATGASIGQP ETSRISVAGS SGAPAVSSGA
SQAAGTSGAG PGTTASSVGV TETARPSVAG SGTTGTVSGA SGSTGSSSGS PGATGASIGQ
PETSRISVAG SSGAPAVSSG ASQAAGTSGA GPGTTASSVG VTETARPSVA GSGTTGTVSG
ASGSTGSSSG SPGATGASIG QPETSRISVA GSSGAPAVSS GASQAAGTSG AGPGTTASSV
GVTETARPSV AGSGTTGTVS GASGSTGSSS GSPGATGASI GQPETSRISV AGSSGAPAVS
SGASQAAGTS GAGPGTTASS VGVTETARPS VAGSGTTGTV SGASGSTGSS SGSPGATGAS
IGQPETSRIS VAGSSGAPAV SSGASQAAGT SEATTSIEGA GTSGVGFKTE ATTFPGENET
TRVGIATGTT GIVSRKTLEP GSYNTEATTS IGRSGTTHTD LPGGTTIVLP GFSHSSQSSK
PGSSVTTPGS PESGSETGTS GEFSTTVISG SSHTEATTFI GGSGSPGTGS RPGTTGELSG
TTIASGNATT EATTSTETRI GPQTGAQTTV PGSQVSGSET GTSEAVSNPA IASGSSSTGT
TSGASDSQVT GSRTGTTGVV LGTTVAPGSS STGATTGVLI NEGTRSTSLG TTRVASGTTY
ESGTSNSVPS GGSGTPGSGI NTGGSSTQVT GIQTGTTAVG FGSTLLPGSS NTGATTSPSE
RTSPGSKTGI TRVVSGTTVA SGSSNTGATT SLGRGETTQG GIKIVITGVT VGTTVAPGSF
NTKATTPTEV RAATGAGTAV GATSRSTGIS TGPENSTPGT TETGSGTTSS PGGVKTEATT
FKGVGTTEAG ISSGNSPGSG GVTSSQEGTS REASETTTAP RISATGSTSV SKEITASPKV
SSPETTAGAT EDQENENKTG CPAPLPPPPV CHGPLGEEKS PGDVWTANCH KCTCTEAKTV
DCKPKECPSP PTCKTGERLI KFKANDTCCE IGHCEKRTCL FNNTDYEVGS SFDDPNNPCV
TYSCQNTGFT AVVQNCPKQT WCAEEDRVYD SKQCCYTCKS SCKPSPVNVT VRYNGCTIKV
EMARCVGECK KTVTYDYDIF QLKNSCLCCQ EEDYEFRDIV LDCPDGSTLP YRYRHITACS
CLDPCQQSMT