ID COKA1_HUMAN Reviewed; 1284 AA.
AC Q9P218; Q4VXQ4; Q6PI59; Q8WUT2; Q96CY9; Q9BQU6; Q9BQU7;
DT 10-OCT-2002, integrated into UniProtKB/Swiss-Prot.
DT 23-MAR-2010, sequence version 4.
DT 03-AUG-2022, entry version 191.
DE RecName: Full=Collagen alpha-1(XX) chain;
DE Flags: Precursor;
GN Name=COL20A1; Synonyms=KIAA1510;
OS Homo sapiens (Human).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia;
OC Eutheria; Euarchontoglires; Primates; Haplorrhini; Catarrhini; Hominidae;
OC Homo.
OX NCBI_TaxID=9606;
RN [1]
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RX PubMed=11780052; DOI=10.1038/414865a;
RA Deloukas P., Matthews L.H., Ashurst J.L., Burton J., Gilbert J.G.R.,
RA Jones M., Stavrides G., Almeida J.P., Babbage A.K., Bagguley C.L.,
RA Bailey J., Barlow K.F., Bates K.N., Beard L.M., Beare D.M., Beasley O.P.,
RA Bird C.P., Blakey S.E., Bridgeman A.M., Brown A.J., Buck D., Burrill W.D.,
RA Butler A.P., Carder C., Carter N.P., Chapman J.C., Clamp M., Clark G.,
RA Clark L.N., Clark S.Y., Clee C.M., Clegg S., Cobley V.E., Collier R.E.,
RA Connor R.E., Corby N.R., Coulson A., Coville G.J., Deadman R., Dhami P.D.,
RA Dunn M., Ellington A.G., Frankland J.A., Fraser A., French L., Garner P.,
RA Grafham D.V., Griffiths C., Griffiths M.N.D., Gwilliam R., Hall R.E.,
RA Hammond S., Harley J.L., Heath P.D., Ho S., Holden J.L., Howden P.J.,
RA Huckle E., Hunt A.R., Hunt S.E., Jekosch K., Johnson C.M., Johnson D.,
RA Kay M.P., Kimberley A.M., King A., Knights A., Laird G.K., Lawlor S.,
RA Lehvaeslaiho M.H., Leversha M.A., Lloyd C., Lloyd D.M., Lovell J.D.,
RA Marsh V.L., Martin S.L., McConnachie L.J., McLay K., McMurray A.A.,
RA Milne S.A., Mistry D., Moore M.J.F., Mullikin J.C., Nickerson T.,
RA Oliver K., Parker A., Patel R., Pearce T.A.V., Peck A.I.,
RA Phillimore B.J.C.T., Prathalingam S.R., Plumb R.W., Ramsay H., Rice C.M.,
RA Ross M.T., Scott C.E., Sehra H.K., Shownkeen R., Sims S., Skuce C.D.,
RA Smith M.L., Soderlund C., Steward C.A., Sulston J.E., Swann R.M.,
RA Sycamore N., Taylor R., Tee L., Thomas D.W., Thorpe A., Tracey A.,
RA Tromans A.C., Vaudin M., Wall M., Wallis J.M., Whitehead S.L.,
RA Whittaker P., Willey D.L., Williams L., Williams S.A., Wilming L.,
RA Wray P.W., Hubbard T., Durbin R.M., Bentley D.R., Beck S., Rogers J.;
RT "The DNA sequence and comparative analysis of human chromosome 20.";
RL Nature 414:865-871(2001).
RN [2]
RP NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA] (ISOFORM 1), AND NUCLEOTIDE SEQUENCE
RP [LARGE SCALE MRNA] OF 1013-1284 (ISOFORMS 2/3).
RC TISSUE=Brain;
RX PubMed=15489334; DOI=10.1101/gr.2596504;
RG The MGC Project Team;
RT "The status, quality, and expansion of the NIH full-length cDNA project:
RT the Mammalian Gene Collection (MGC).";
RL Genome Res. 14:2121-2127(2004).
RN [3]
RP NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA] OF 158-1284 (ISOFORM 2).
RC TISSUE=Brain;
RX PubMed=10819331; DOI=10.1093/dnares/7.2.143;
RA Nagase T., Kikuno R., Ishikawa K., Hirosawa M., Ohara O.;
RT "Prediction of the coding sequences of unidentified human genes. XVII. The
RT complete sequences of 100 new cDNA clones from brain which code for large
RT proteins in vitro.";
RL DNA Res. 7:143-150(2000).
RN [4]
RP NUCLEOTIDE SEQUENCE [MRNA] OF 1053-1160 (ISOFORM 3).
RA Hillier L., Allen M., Bowles L., Dubuque T., Geisel G., Jost S.,
RA Krizman D., Kucaba T., Lacy M., Le N., Lennon G., Marra M., Martin J.,
RA Moore B., Schellenberg K., Steptoe M., Tan F., Theising B., White Y.,
RA Wylie T., Waterston R., Wilson R.;
RT "WashU-NCI human EST project.";
RL Submitted (JUN-2002) to the EMBL/GenBank/DDBJ databases.
RN [5]
RP STRUCTURE BY NMR OF 466-554.
RG RIKEN structural genomics initiative (RSGI);
RT "Solution structures of the FN3 domains of human collagen alpha-1(xx)
RT chain.";
RL Submitted (OCT-2006) to the PDB data bank.
CC -!- FUNCTION: Probable collagen protein.
CC -!- INTERACTION:
CC Q9P218-2; Q8IZU0: FAM9B; NbExp=3; IntAct=EBI-10318410, EBI-10175124;
CC Q9P218-2; Q04864: REL; NbExp=3; IntAct=EBI-10318410, EBI-307352;
CC Q9P218-2; Q8IYF3: TEX11; NbExp=3; IntAct=EBI-10318410, EBI-742397;
CC -!- SUBCELLULAR LOCATION: Secreted, extracellular space {ECO:0000305}.
CC -!- ALTERNATIVE PRODUCTS:
CC Event=Alternative splicing; Named isoforms=3;
CC Name=1;
CC IsoId=Q9P218-1; Sequence=Displayed;
CC Name=2;
CC IsoId=Q9P218-2; Sequence=VSP_038876, VSP_038878;
CC Name=3;
CC IsoId=Q9P218-3; Sequence=VSP_038877;
CC -!- TISSUE SPECIFICITY: High expression in heart, lung, liver, skeletal
CC muscle, kidney, pancreas, spleen, testis, ovary, subthalamic nucleus
CC and fetal liver. Weak expression in other tissues tested.
CC -!- SEQUENCE CAUTION:
CC Sequence=AAH19637.1; Type=Erroneous initiation; Note=Truncated N-terminus.; Evidence={ECO:0000305};
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR EMBL; AL121827; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR EMBL; BC013658; AAH13658.1; -; mRNA.
DR EMBL; BC019637; AAH19637.1; ALT_INIT; mRNA.
DR EMBL; BC043183; AAH43183.1; -; mRNA.
DR EMBL; AB040943; BAA96034.1; -; mRNA.
DR EMBL; AI272270; -; NOT_ANNOTATED_CDS; mRNA.
DR CCDS; CCDS46628.1; -. [Q9P218-1]
DR RefSeq; NP_065933.2; NM_020882.2. [Q9P218-1]
DR RefSeq; XP_011527242.1; XM_011528940.1. [Q9P218-2]
DR PDB; 2DKM; NMR; -; A=466-556.
DR PDB; 2EE3; NMR; -; A=557-651.
DR PDB; 2EKJ; NMR; -; A=741-832.
DR PDB; 5KF4; X-ray; 2.50 A; A/B/C/D=368-466.
DR PDBsum; 2DKM; -.
DR PDBsum; 2EE3; -.
DR PDBsum; 2EKJ; -.
DR PDBsum; 5KF4; -.
DR AlphaFoldDB; Q9P218; -.
DR SMR; Q9P218; -.
DR BioGRID; 121679; 39.
DR ComplexPortal; CPX-1761; Collagen type XX trimer.
DR IntAct; Q9P218; 10.
DR STRING; 9606.ENSP00000351767; -.
DR GlyGen; Q9P218; 3 sites, 1 O-linked glycan (2 sites).
DR iPTMnet; Q9P218; -.
DR PhosphoSitePlus; Q9P218; -.
DR BioMuta; COL20A1; -.
DR DMDM; 292495087; -.
DR jPOST; Q9P218; -.
DR MassIVE; Q9P218; -.
DR PaxDb; Q9P218; -.
DR PeptideAtlas; Q9P218; -.
DR PRIDE; Q9P218; -.
DR ProteomicsDB; 83709; -. [Q9P218-1]
DR ProteomicsDB; 83710; -. [Q9P218-2]
DR ProteomicsDB; 83711; -. [Q9P218-3]
DR Antibodypedia; 29690; 207 antibodies from 27 providers.
DR DNASU; 57642; -.
DR Ensembl; ENST00000358894.11; ENSP00000351767.6; ENSG00000101203.17. [Q9P218-1]
DR Ensembl; ENST00000422202.5; ENSP00000414753.1; ENSG00000101203.17. [Q9P218-2]
DR GeneID; 57642; -.
DR KEGG; hsa:57642; -.
DR MANE-Select; ENST00000358894.11; ENSP00000351767.6; NM_020882.4; NP_065933.2.
DR UCSC; uc011aau.2; human. [Q9P218-1]
DR CTD; 57642; -.
DR DisGeNET; 57642; -.
DR GeneCards; COL20A1; -.
DR HGNC; HGNC:14670; COL20A1.
DR HPA; ENSG00000101203; Tissue enhanced (brain, testis).
DR MIM; 619390; gene.
DR neXtProt; NX_Q9P218; -.
DR OpenTargets; ENSG00000101203; -.
DR PharmGKB; PA142672086; -.
DR VEuPathDB; HostDB:ENSG00000101203; -.
DR eggNOG; KOG3544; Eukaryota.
DR GeneTree; ENSGT00940000163709; -.
DR HOGENOM; CLU_002527_0_0_1; -.
DR InParanoid; Q9P218; -.
DR OMA; VKPVQQY; -.
DR OrthoDB; 67372at2759; -.
DR PhylomeDB; Q9P218; -.
DR TreeFam; TF329914; -.
DR PathwayCommons; Q9P218; -.
DR Reactome; R-HSA-1650814; Collagen biosynthesis and modifying enzymes.
DR Reactome; R-HSA-8948216; Collagen chain trimerization.
DR SignaLink; Q9P218; -.
DR BioGRID-ORCS; 57642; 12 hits in 1062 CRISPR screens.
DR ChiTaRS; COL20A1; human.
DR EvolutionaryTrace; Q9P218; -.
DR GenomeRNAi; 57642; -.
DR Pharos; Q9P218; Tdark.
DR PRO; PR:Q9P218; -.
DR Proteomes; UP000005640; Chromosome 20.
DR RNAct; Q9P218; protein.
DR Bgee; ENSG00000101203; Expressed in right testis and 108 other tissues.
DR ExpressionAtlas; Q9P218; baseline and differential.
DR Genevisible; Q9P218; HS.
DR GO; GO:0005581; C:collagen trimer; IEA:UniProtKB-KW.
DR GO; GO:0062023; C:collagen-containing extracellular matrix; IBA:GO_Central.
DR GO; GO:0005788; C:endoplasmic reticulum lumen; TAS:Reactome.
DR GO; GO:0005576; C:extracellular region; TAS:Reactome.
DR GO; GO:0005615; C:extracellular space; IBA:GO_Central.
DR CDD; cd00063; FN3; 6.
DR Gene3D; 2.60.40.10; -; 6.
DR Gene3D; 3.40.50.410; -; 1.
DR InterPro; IPR008160; Collagen.
DR InterPro; IPR013320; ConA-like_dom_sf.
DR InterPro; IPR003961; FN3_dom.
DR InterPro; IPR036116; FN3_sf.
DR InterPro; IPR013783; Ig-like_fold.
DR InterPro; IPR001791; Laminin_G.
DR InterPro; IPR002035; VWF_A.
DR InterPro; IPR036465; vWFA_dom_sf.
DR Pfam; PF01391; Collagen; 2.
DR Pfam; PF00041; fn3; 5.
DR Pfam; PF00092; VWA; 1.
DR SMART; SM00060; FN3; 6.
DR SMART; SM00210; TSPN; 1.
DR SMART; SM00327; VWA; 1.
DR SUPFAM; SSF49265; SSF49265; 4.
DR SUPFAM; SSF49899; SSF49899; 1.
DR SUPFAM; SSF53300; SSF53300; 1.
DR PROSITE; PS50853; FN3; 6.
DR PROSITE; PS50234; VWFA; 1.
PE 1: Evidence at protein level;
KW 3D-structure; Alternative splicing; Collagen; Glycoprotein;
KW Reference proteome; Repeat; Secreted; Signal.
FT SIGNAL 1..22
FT /evidence="ECO:0000255"
FT CHAIN 23..1284
FT /note="Collagen alpha-1(XX) chain"
FT /id="PRO_0000013986"
FT DOMAIN 28..119
FT /note="Fibronectin type-III 1"
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00316"
FT DOMAIN 179..354
FT /note="VWFA"
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00219"
FT DOMAIN 379..468
FT /note="Fibronectin type-III 2"
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00316"
FT DOMAIN 469..559
FT /note="Fibronectin type-III 3"
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00316"
FT DOMAIN 560..647
FT /note="Fibronectin type-III 4"
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00316"
FT DOMAIN 649..738
FT /note="Fibronectin type-III 5"
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00316"
FT DOMAIN 743..833
FT /note="Fibronectin type-III 6"
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00316"
FT DOMAIN 842..1037
FT /note="Laminin G-like"
FT DOMAIN 1071..1127
FT /note="Collagen-like 1"
FT DOMAIN 1133..1190
FT /note="Collagen-like 2"
FT REGION 122..171
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 1065..1190
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 1212..1284
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 122..164
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT CARBOHYD 607
FT /note="N-linked (GlcNAc...) asparagine"
FT /evidence="ECO:0000255"
FT VAR_SEQ 165..166
FT /note="PA -> PGGSEWRET (in isoform 2)"
FT /evidence="ECO:0000303|PubMed:10819331"
FT /id="VSP_038876"
FT VAR_SEQ 1098
FT /note="R -> RGEPGPPGQMGPEGPGGQQGSPGTQGRAVQGPV (in isoform
FT 3)"
FT /evidence="ECO:0000303|Ref.4"
FT /id="VSP_038877"
FT VAR_SEQ 1204..1205
FT /note="AS -> ACESAIQT (in isoform 2)"
FT /evidence="ECO:0000303|PubMed:10819331"
FT /id="VSP_038878"
FT VARIANT 134
FT /note="P -> L (in dbSNP:rs753686)"
FT /id="VAR_055671"
FT CONFLICT 13
FT /note="L -> V (in Ref. 2; AAH43183)"
FT /evidence="ECO:0000305"
FT CONFLICT 78
FT /note="P -> H (in Ref. 2; AAH43183)"
FT /evidence="ECO:0000305"
FT CONFLICT 612
FT /note="T -> M (in Ref. 3; BAA96034)"
FT /evidence="ECO:0000305"
FT CONFLICT 924
FT /note="E -> K (in Ref. 2; AAH43183)"
FT /evidence="ECO:0000305"
FT CONFLICT 1230
FT /note="P -> Q (in Ref. 2; AAH43183)"
FT /evidence="ECO:0000305"
FT STRAND 374..377
FT /evidence="ECO:0007829|PDB:5KF4"
FT STRAND 381..386
FT /evidence="ECO:0007829|PDB:5KF4"
FT STRAND 393..398
FT /evidence="ECO:0007829|PDB:5KF4"
FT STRAND 405..413
FT /evidence="ECO:0007829|PDB:5KF4"
FT STRAND 420..425
FT /evidence="ECO:0007829|PDB:5KF4"
FT STRAND 430..433
FT /evidence="ECO:0007829|PDB:5KF4"
FT STRAND 441..450
FT /evidence="ECO:0007829|PDB:5KF4"
FT STRAND 453..455
FT /evidence="ECO:0007829|PDB:5KF4"
FT STRAND 458..463
FT /evidence="ECO:0007829|PDB:5KF4"
FT STRAND 474..478
FT /evidence="ECO:0007829|PDB:2DKM"
FT STRAND 480..486
FT /evidence="ECO:0007829|PDB:2DKM"
FT STRAND 494..505
FT /evidence="ECO:0007829|PDB:2DKM"
FT STRAND 512..525
FT /evidence="ECO:0007829|PDB:2DKM"
FT STRAND 532..539
FT /evidence="ECO:0007829|PDB:2DKM"
FT STRAND 541..544
FT /evidence="ECO:0007829|PDB:2DKM"
FT STRAND 548..551
FT /evidence="ECO:0007829|PDB:2DKM"
FT STRAND 563..567
FT /evidence="ECO:0007829|PDB:2EE3"
FT STRAND 573..578
FT /evidence="ECO:0007829|PDB:2EE3"
FT STRAND 585..593
FT /evidence="ECO:0007829|PDB:2EE3"
FT TURN 594..596
FT /evidence="ECO:0007829|PDB:2EE3"
FT STRAND 602..605
FT /evidence="ECO:0007829|PDB:2EE3"
FT STRAND 610..613
FT /evidence="ECO:0007829|PDB:2EE3"
FT STRAND 621..629
FT /evidence="ECO:0007829|PDB:2EE3"
FT STRAND 635..643
FT /evidence="ECO:0007829|PDB:2EE3"
FT STRAND 745..753
FT /evidence="ECO:0007829|PDB:2EKJ"
FT STRAND 756..762
FT /evidence="ECO:0007829|PDB:2EKJ"
FT STRAND 769..771
FT /evidence="ECO:0007829|PDB:2EKJ"
FT STRAND 793..798
FT /evidence="ECO:0007829|PDB:2EKJ"
SQ SEQUENCE 1284 AA; 135830 MW; E84D522A932D89C6 CRC64;
MSSGDPAHLG LCLWLWLGAT LGREQVQASG LLRLAVLPED RLQMKWRESE GSGLGYLVQV
KPMAGDSEQE VILTTKTPKA TVGGLSPSKG YTLQIFELTG SGRFLLARRE FVIEDLKSSS
LDRSSQRPLG SGAPEPTPSH TGSPDPEQAS EPQVAFTPSQ DPRTPAGPQF RCLPPVPADM
VFLVDGSWSI GHSHFQQVKD FLASVIAPFE IGPDKVQVGL TQYSGDAQTE WDLNSLSTKE
QVLAAVRRLR YKGGNTFTGL ALTHVLGQNL QPAAGLRPEA AKVVILVTDG KSQDDVHTAA
RVLKDLGVNV FAVGVKNADE AELRLLASPP RDITVHSVLD FLQLGALAGL LSRLICQRLQ
GGSPRQGPAA APALDTLPAP TSLVLSQVTS SSIRLSWTPA PRHPLKYLIV WRASRGGTPR
EVVVEGPAAS TELHNLASRT EYLVSVFPIY EGGVGEGLRG LVTTAPLPPP RALTLAAVTP
RTVHLTWQPS AGATHYLVRC SPASPKGEEE EREVQVGRPE VLLDGLEPGR DYEVSVQSLR
GPEGSEARGI RARTPTLAPP RHLGFSDVSH DAARVFWEGA PRPVRLVRVT YVSSEGGHSG
QTEAPGNATS ATLGPLSSST TYTVRVTCLY PGGGSSTLTG RVTTKKAPSP SQLSMTELPG
DAVQLAWVAA APSGVLVYQI TWTPLGEGKA HEISVPGNLG TAVLPGLGRH TEYDVTILAY
YRDGARSDPV SLRYTPSTVS RSPPSNLALA SETPDSLQVS WTPPLGRVLH YWLTYAPASG
LGPEKSVSVP GARSHVTLPD LQAATKYRVL VSAIYAAGRS EAVSATGQTA CPALRPDGSL
PGFDLMVAFS LVEKAYASIR GVAMEPSAFG GTPTFTLFKD AQLTRRVSDV YPAPLPPEHT
IVFLVRLLPE TPREAFALWQ MTAEDFQPLL GVLLDAGKKS LTYFHRDPRA ALQEATFDPQ
EVRKIFFGSF HKVHVAVGRS KVRLYVDCRK VAERPLGEMG SPPAAGFVTL GRLAKARGPR
SSSAAFQLQM LQIVCSDTWA DEDRCCELPA SRDGETCPAF VSACSCSSET PGPPGPQGPP
GLPGRNGTPG EQGFPGPRGP PGVKGEKGDH GLPGLQGHPG HQGIPGRVGL QGPKGMRGLE
GTAGLPGPPG PRGFQGMAGA RGTSGERGPP GTVGPTGLPG PKGERGEKGE PQSLATLYQL
VSQASHVSKF DSFHENTRPP MPILEQKLEP GTEPLGSPGT RSKALVPGEW GRGGRHLEGR
GEPGAVGQMG SPGQQGASTQ GLWE