ID COCA1_CHICK Reviewed; 3124 AA.
AC P13944; Q04509;
DT 01-JAN-1990, integrated into UniProtKB/Swiss-Prot.
DT 01-NOV-1997, sequence version 3.
DT 03-AUG-2022, entry version 168.
DE RecName: Full=Collagen alpha-1(XII) chain;
DE AltName: Full=Fibrochimerin;
DE Flags: Precursor;
GN Name=COL12A1;
OS Gallus gallus (Chicken).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Archelosauria; Archosauria; Dinosauria; Saurischia; Theropoda;
OC Coelurosauria; Aves; Neognathae; Galloanserae; Galliformes; Phasianidae;
OC Phasianinae; Gallus.
OX NCBI_TaxID=9031;
RN [1]
RP NUCLEOTIDE SEQUENCE [MRNA] (ISOFORM LONG).
RC STRAIN=White leghorn; TISSUE=Embryonic fibroblast;
RX PubMed=1918137; DOI=10.1083/jcb.115.1.209;
RA Yamagata M., Yamada K.M., Yamada S.S., Shinomura T., Tanaka H., Nishida Y.,
RA Obara M., Kimata K.;
RT "The complete primary structure of type XII collagen shows a chimeric
RT molecule with reiterated fibronectin type III motifs, von Willebrand factor
RT A motifs, a domain homologous to a noncollagenous region of type IX
RT collagen, and short collagenous domains with an Arg-Gly-Asp site.";
RL J. Cell Biol. 115:209-221(1991).
RN [2]
RP NUCLEOTIDE SEQUENCE [MRNA] OF 2456-3124, AND PROTEIN SEQUENCE OF 2772-2794
RP AND 2846-2873.
RX PubMed=2584192; DOI=10.1016/s0021-9258(19)47179-2;
RA Gordon M.K., Gerecke D.R., Dublet B., van der Rest M., Olsen B.R.;
RT "Type XII collagen. A large multidomain molecule with partial homology to
RT type IX collagen.";
RL J. Biol. Chem. 264:19772-19778(1989).
RN [3]
RP NUCLEOTIDE SEQUENCE [MRNA] OF 2960-3076.
RX PubMed=3476925; DOI=10.1073/pnas.84.17.6040;
RA Gordon M.K., Gerecke D.R., Olsen B.R.;
RT "Type XII collagen: distinct extracellular matrix component discovered by
RT cDNA cloning.";
RL Proc. Natl. Acad. Sci. U.S.A. 84:6040-6044(1987).
RN [4]
RP NUCLEOTIDE SEQUENCE [MRNA] OF 1-1283 (ISOFORM SHORT), AND ALTERNATIVE
RP SPLICING.
RC TISSUE=Embryo;
RX PubMed=1420368; DOI=10.1016/0167-4781(92)90145-p;
RA Trueb J., Trueb B.;
RT "The two splice variants of collagen XII share a common 5' end.";
RL Biochim. Biophys. Acta 1171:97-98(1992).
RN [5]
RP ALTERNATIVE SPLICING.
RX PubMed=7642694; DOI=10.1083/jcb.130.4.1005;
RA Koch M., Bohrmann B., Matthison M., Hagios C., Trueb B., Chiquet M.;
RT "Large and small splice variants of collagen XII: differential expression
RT and ligand binding.";
RL J. Cell Biol. 130:1005-1014(1995).
CC -!- FUNCTION: Type XII collagen interacts with type I collagen-containing
CC fibrils; the COL1 domain could be associated with the surface of the
CC fibrils, and the COL2 and NC3 domains may be localized in the
CC perifibrillar matrix.
CC -!- SUBUNIT: Trimer of identical chains each containing 190 kDa of non-
CC triple-helical sequences.
CC -!- SUBCELLULAR LOCATION: Secreted, extracellular space, extracellular
CC matrix {ECO:0000250}.
CC -!- ALTERNATIVE PRODUCTS:
CC Event=Alternative splicing; Named isoforms=2;
CC Comment=The final tissue form of collagen XII may contain homotrimers
CC of either isoform Long or isoform Short or any combination of isoform
CC Long and isoform Short. Only isoform Long is a proteoglycan. Isoform
CC Long has more restricted expression in embryonic tissue than isoform
CC Short.;
CC Name=Long;
CC IsoId=P13944-1; Sequence=Displayed;
CC Name=Short;
CC IsoId=P13944-2; Sequence=VSP_001148;
CC -!- TISSUE SPECIFICITY: Type XII collagen is present in tendons, ligaments,
CC perichondrium, and periosteum, all dense connective tissues containing
CC type I collagen.
CC -!- DOMAIN: This sequence defines five distinct domains, two triple-helical
CC domains (COL1 and COL2) and three non-triple-helical domains (NC1, NC2,
CC and NC3).
CC -!- PTM: The triple-helical tail is stabilized by disulfide bonds at each
CC end.
CC -!- PTM: Prolines at the third position of the tripeptide repeating unit
CC (G-X-Y) are hydroxylated in some or all of the chains.
CC -!- PTM: O-glycosylated; glycosaminoglycan of chondroitin-sulfate type.
CC {ECO:0000250}.
CC -!- SIMILARITY: Belongs to the fibril-associated collagens with interrupted
CC helices (FACIT) family. {ECO:0000305}.
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR EMBL; D00824; BAA00701.1; -; mRNA.
DR EMBL; X61024; CAA43358.1; -; mRNA.
DR EMBL; J05137; AAA48635.1; -; mRNA.
DR EMBL; M17375; AAA48718.1; -; mRNA.
DR EMBL; X67327; CAA47744.1; -; mRNA.
DR PIR; A40020; A40020.
DR SMR; P13944; -.
DR ComplexPortal; CPX-3109; Collagen type XII trimer.
DR STRING; 9031.ENSGALP00000025593; -.
DR PaxDb; P13944; -.
DR PRIDE; P13944; -.
DR VEuPathDB; HostDB:geneid_395875; -.
DR eggNOG; KOG3544; Eukaryota.
DR InParanoid; P13944; -.
DR PhylomeDB; P13944; -.
DR Proteomes; UP000000539; Unplaced.
DR GO; GO:0005581; C:collagen trimer; IEA:UniProtKB-KW.
DR GO; GO:0062023; C:collagen-containing extracellular matrix; IBA:GO_Central.
DR GO; GO:0005615; C:extracellular space; IBA:GO_Central.
DR GO; GO:0007155; P:cell adhesion; IEA:UniProtKB-KW.
DR GO; GO:0035987; P:endodermal cell differentiation; IBA:GO_Central.
DR CDD; cd00063; FN3; 18.
DR Gene3D; 2.60.40.10; -; 18.
DR Gene3D; 3.40.50.410; -; 4.
DR InterPro; IPR008160; Collagen.
DR InterPro; IPR013320; ConA-like_dom_sf.
DR InterPro; IPR003961; FN3_dom.
DR InterPro; IPR036116; FN3_sf.
DR InterPro; IPR013783; Ig-like_fold.
DR InterPro; IPR001791; Laminin_G.
DR InterPro; IPR002035; VWF_A.
DR InterPro; IPR036465; vWFA_dom_sf.
DR Pfam; PF01391; Collagen; 4.
DR Pfam; PF00041; fn3; 17.
DR Pfam; PF00092; VWA; 4.
DR SMART; SM00060; FN3; 18.
DR SMART; SM00210; TSPN; 1.
DR SMART; SM00327; VWA; 4.
DR SUPFAM; SSF49265; SSF49265; 11.
DR SUPFAM; SSF49899; SSF49899; 1.
DR SUPFAM; SSF53300; SSF53300; 4.
DR PROSITE; PS50853; FN3; 18.
DR PROSITE; PS50234; VWFA; 4.
PE 1: Evidence at protein level;
KW Alternative splicing; Cell adhesion; Collagen; Direct protein sequencing;
KW Disulfide bond; Extracellular matrix; Glycoprotein; Hydroxylation;
KW Proteoglycan; Reference proteome; Repeat; Secreted; Signal.
FT SIGNAL 1..23
FT /evidence="ECO:0000255"
FT CHAIN 24..3124
FT /note="Collagen alpha-1(XII) chain"
FT /id="PRO_0000005782"
FT DOMAIN 27..117
FT /note="Fibronectin type-III 1"
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00316"
FT DOMAIN 139..311
FT /note="VWFA 1"
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00219"
FT DOMAIN 335..424
FT /note="Fibronectin type-III 2"
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00316"
FT DOMAIN 439..615
FT /note="VWFA 2"
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00219"
FT DOMAIN 633..722
FT /note="Fibronectin type-III 3"
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00316"
FT DOMAIN 724..815
FT /note="Fibronectin type-III 4"
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00316"
FT DOMAIN 816..906
FT /note="Fibronectin type-III 5"
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00316"
FT DOMAIN 908..998
FT /note="Fibronectin type-III 6"
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00316"
FT DOMAIN 999..1087
FT /note="Fibronectin type-III 7"
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00316"
FT DOMAIN 1089..1179
FT /note="Fibronectin type-III 8"
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00316"
FT DOMAIN 1199..1371
FT /note="VWFA 3"
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00219"
FT DOMAIN 1387..1476
FT /note="Fibronectin type-III 9"
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00316"
FT DOMAIN 1477..1568
FT /note="Fibronectin type-III 10"
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00316"
FT DOMAIN 1569..1659
FT /note="Fibronectin type-III 11"
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00316"
FT DOMAIN 1660..1756
FT /note="Fibronectin type-III 12"
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00316"
FT DOMAIN 1759..1853
FT /note="Fibronectin type-III 13"
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00316"
FT DOMAIN 1854..1939
FT /note="Fibronectin type-III 14"
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00316"
FT DOMAIN 1940..2030
FT /note="Fibronectin type-III 15"
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00316"
FT DOMAIN 2031..2121
FT /note="Fibronectin type-III 16"
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00316"
FT DOMAIN 2122..2210
FT /note="Fibronectin type-III 17"
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00316"
FT DOMAIN 2211..2299
FT /note="Fibronectin type-III 18"
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00316"
FT DOMAIN 2327..2500
FT /note="VWFA 4"
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00219"
FT DOMAIN 2524..2716
FT /note="Laminin G-like"
FT DOMAIN 2751..2802
FT /note="Collagen-like 1"
FT DOMAIN 2807..2858
FT /note="Collagen-like 2"
FT DOMAIN 2859..2900
FT /note="Collagen-like 3"
FT DOMAIN 2945..2994
FT /note="Collagen-like 4"
FT REGION 1075..1100
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 2455..2750
FT /note="Nonhelical region (NC3)"
FT REGION 2749..2900
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 2751..2902
FT /note="Triple-helical region (COL2) with 1 imperfection"
FT REGION 2903..2945
FT /note="Nonhelical region (NC2)"
FT REGION 2935..3080
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 2946..3048
FT /note="Triple-helical region (COL1) with 2 imperfections"
FT REGION 3049..3124
FT /note="Nonhelical region (NC1)"
FT MOTIF 2899..2901
FT /note="Cell attachment site"
FT /evidence="ECO:0000255"
FT COMPBIAS 2782..2802
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 2828..2842
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 2944..2958
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 2967..2981
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 3022..3040
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT CARBOHYD 32
FT /note="N-linked (GlcNAc...) asparagine"
FT /evidence="ECO:0000255"
FT CARBOHYD 797
FT /note="O-linked (Xyl...) (chondroitin sulfate) serine"
FT /evidence="ECO:0000255"
FT CARBOHYD 890
FT /note="O-linked (Xyl...) (chondroitin sulfate) serine"
FT /evidence="ECO:0000255"
FT CARBOHYD 981
FT /note="O-linked (Xyl...) (chondroitin sulfate) serine"
FT /evidence="ECO:0000255"
FT CARBOHYD 1006
FT /note="N-linked (GlcNAc...) asparagine"
FT /evidence="ECO:0000255"
FT CARBOHYD 1032
FT /note="N-linked (GlcNAc...) asparagine"
FT /evidence="ECO:0000255"
FT CARBOHYD 1044
FT /note="N-linked (GlcNAc...) asparagine"
FT /evidence="ECO:0000255"
FT CARBOHYD 1512
FT /note="N-linked (GlcNAc...) asparagine"
FT /evidence="ECO:0000255"
FT CARBOHYD 1767
FT /note="N-linked (GlcNAc...) asparagine"
FT /evidence="ECO:0000255"
FT CARBOHYD 2210
FT /note="N-linked (GlcNAc...) asparagine"
FT /evidence="ECO:0000255"
FT CARBOHYD 2273
FT /note="N-linked (GlcNAc...) asparagine"
FT /evidence="ECO:0000255"
FT CARBOHYD 2532
FT /note="N-linked (GlcNAc...) asparagine"
FT /evidence="ECO:0000255"
FT CARBOHYD 2683
FT /note="N-linked (GlcNAc...) asparagine"
FT /evidence="ECO:0000255"
FT VAR_SEQ 25..1188
FT /note="Missing (in isoform Short)"
FT /evidence="ECO:0000303|PubMed:1420368"
FT /id="VSP_001148"
FT CONFLICT 1258
FT /note="T -> S (in Ref. 4; CAA47744)"
FT /evidence="ECO:0000305"
FT CONFLICT 1264
FT /note="D -> E (in Ref. 4; CAA47744)"
FT /evidence="ECO:0000305"
FT CONFLICT 2759
FT /note="P -> A (in Ref. 2; AAA48635)"
FT /evidence="ECO:0000305"
FT CONFLICT 2803
FT /note="L -> F (in Ref. 2; AAA48635)"
FT /evidence="ECO:0000305"
FT CONFLICT 2977
FT /note="V -> F (in Ref. 2; AAA48635 and 3; AAA48718)"
FT /evidence="ECO:0000305"
FT CONFLICT 3075..3076
FT /note="QP -> AG (in Ref. 3; AAA48718)"
FT /evidence="ECO:0000305"
SQ SEQUENCE 3124 AA; 340582 MW; 094285AFE7F346CF CRC64;
MRTALCSAVA ALCAAALLSS IEAEVNPPSD LNFTIIDEHN VQMSWKRPPD AIVGYRITVV
PTNDGPTKEF TLSPSTTQTV LSDLIPEIEY VVSIASYDEV EESLPVFGQL TIQTGGPGIP
EEKKVEAQIQ KCSISAMTDL VFLVDGSWSV GRNNFRYILD FMVALVSAFD IGEEKTRVGV
VQYSSDTRTE FNLNQYFRRS DLLDAIKRIP YKGGNTMTGE AIDYLVKNTF TESAGARKGF
PKVAIVITDG KAQDEVEIPA RELRNIGVEV FSLGIKAADA KELKLIASQP SLKHVFNVAN
FDGIVDIQNE IILQVCSGVD EQLGELVSGE EVVEPASNLV ATQISSKSVR ITWDPSTSQI
TGYRVQFIPM IAGGKQHVLS VGPQTTALNV KDLSPDTEYQ INVYAMKGLT PSEPITIMEK
TQQVKVQVEC SRGVDVKADV VFLVDGSYSI GIANFVKVRA FLEVLVKSFE ISPRKVQISL
VQYSRDPHME FSLNRYNRVK DIIQAINTFP YRGGSTNTGK AMTYVREKVF VTSKGSRPNV
PRVMILITDG KSSDAFKEPA IKLRDADVEI FAVGVKDAVR TELEAIASPP AETHVYTVED
FDAFQRISFE LTQSVCLRIE QELAAIRKKS YVPAKNMVFS DVTSDSFKVS WSAAGSEEKS
YLIKYKVAIG GDEFIVSVPA SSTSSVLTNL LPETTYAVSV IAEYEDGDGP PLDGEETTLE
VKGAPRNLRI TDETTDSFIV GWTPAPGNVL RYRLVYRPLT GGERRQVTVS ANERSTTLRN
LIPDTRYEVS VIAEYQSGPG NALNGYAKTD EVRGNPRNLR VSDATTSTTM KLSWSAAPGK
VQHVLYNLHT RYAGVETKEL TVKGDTTSKE LKGLDEATRY ALTVSALYAS GAGEALSGEG
ETLEERGSPR NLITTDITDT TVGLSWTPAP GTVNNYRIVW KSLYDDTMGE KRVPGNTVDA
VLDGLEPETK YRISIYAAYS SGEGDPVEGE AFTDVSQSAR TVTVDNETEN TMRVSVAALT
WEGLVLARVL PNRSGGRQMF GKVNASATSI VLKRLKPRTT YDLSVVPIYD FGQGKSRKAE
GTTASPFKPP RNLRTSDSTM SSFRVTWEPA PGRVKGYKVT FHPTEDDRNL GELVVGPYDS
TVVLEELRAG TTYKVNVFGM FDGGESNPLV GQEMTTLSDT TTEPFLSRGL ECRTRAEADI
VLLVDGSWSI GRPNFKTVRN FISRIVEVFD IGPDKVQIGL AQYSGDPRTE WNLNAYRTKE
ALLDAVTNLP YKGGNTLTGM ALDFILKNNF KQEAGLRPRA RKIGVLITDG KSQDDVVTPS
RRLRDEGVEL YAIGIKNADE NELKQIATDP DDIHAYNVAD FSFLASIGED VTTNLCNSVK
GPGDLPPPSN LVISEVTPHS FRLRWSPPPE SVDRYRVEYY PTTGGPPKQF YVSRMETTTV
LKDLTPETEY IVNVFSVVED ESSEPLIGRE ITYPLSSVRN LNVYDIGSTS MRVRWEPVNG
ATGYLLTYEP VNATVPTTEK EMRVGPSVNE VQLVDLIPNT EYTLTAYVLY GDITSDPLTS
QEVTLPLPGP RGVTIRDVTH STMNVLWDPA PGKVRKYIIR YKIADEADVK EVEIDRLKTS
TTLTDLSSQR LYNVKVVAVY DEGESLPVVA SCYSAVPSPV NLRITEITKN SFRGTWDHGA
PDVSLYRITW GPYGRSEKAE SIVNGDVNSL LFENLNPDTL YEVSVTAIYP DESETVDDLI
GSERTLPLVP ITTPAPKSGP RNLQVYNATS HSLTVKWDPA SGRVQRYKII YQPINGDGPE
QSTMVGGRQN SVVIQKLQPD TPYAITVSSM YADGEGGRMT GRGRTKPLTT VKNMLVYDPT
TSTLNVRWDH AEGNPRQYKV FYRPTAGGAE EMTTVPGNTN YVILRSLEPN TPYTVTVVPV
FPEGDGGRTT DTGRTLERGT PRNIQVYNPT PNSMNVRWEP APGPVQQYRV NYSPLSGPRP
SESIVVPANT RDVMLERLTP DTAYSINVIA LYADGEGNPS QAQGRTLPRS GPRNLRVFDE
TTNSLSVQWD HADGPVQQYR IIYSPTVGDP IDEYTTVPGI RNNVILQPLQ SDTPYKITVV
AVYEDGDGGQ LTGNGRTVGL LPPQNIYITD EWYTRFRVSW DPSPSPVLGY KIVYKPVGSN
EPMEVFVGEV TSYTLHNLSP STTYDVNVYA QYDSGMSIPL TDQGTTLYLN VTDLTTYKIG
WDTFCIRWSP HRSATSYRLK LNPADGSRGQ EITVRGSETS HCFTGLSPDT EYNATVFVQT
PNLEGPPVSV REHTVLKPTE APTPPPTPPP PPTIPPARDV CRGAKADIVF LTDASWSIGD
DNFNKVVKFV FNTVGAFDLI NPAGIQVSLV QYSDEAQSEF KLNTFDDKAQ ALGALQNVQY
RGGNTRTGKA LTFIKEKVLT WESGMRRGVP KVLVVVTDGR SQDEVRKAAT VIQHSGFSVF
VVGVADVDYN ELAKIASKPS ERHVFIVDDF DAFEKIQDNL VTFVCETATS TCPLIYLEGY
TSPGFKMLES YNLTEKHFAS VQGVSLESGS FPSYVAYRLH KNAFVSQPIR EIHPEGLPQA
YTIIMLFRLL PESPSEPFAI WQITDRDYKP QVGVVLDPGS KVLSFFNKDT RGEVQTVTFD
NDEVKKIFYG SFHKVHIVVT SSNVKIYIDC SEILEKPIKE AGNITTDGYE ILGKLLKGDR
RSATLEIQNF DIVCSPVWTS RDRCCDLPSM RDEAKCPALP NACTCTQDSV GPPGPPGPPG
GPGAKGPRGE RGLTGSSGPP GPRGETGPPG PQGPPGPQGP NGLQIPGEPG RQGMKGDAGQ
PGLPGRSGTP GLPGPPGPVG PPGERGFTGK DGPTGPRGPP GPAGAPGVPG VAGPSGKPGK
PGDRGTPGTP GMKGEKGDRG DIASQNMMRA VARQVCEQLI NGQMSRFNQM LNQIPNDYYS
NRNQPGPPGP PGPPGAAGTR GEPGPGGRPG FPGPPGVQGP PGERGMPGEK GERGTGSQGP
RGLPGPPGPQ GESRTGPPGS TGSRGPPGPP GRPGNAGIRG PPGPPGYCDS SQCASIPYNG
QGFPEPYVPE SGPYQPEGEP FIVPMESERR EDEYEDYGVE MHSPEYPEHM RWKRSLSRKA
KRKP