CO4A1_BOVIN
ID CO4A1_BOVIN Reviewed; 1669 AA.
AC Q7SIB2; G1K238;
DT 05-JUL-2005, integrated into UniProtKB/Swiss-Prot.
DT 25-OCT-2017, sequence version 2.
DT 03-AUG-2022, entry version 115.
DE RecName: Full=Collagen alpha-1(IV) chain {ECO:0000250|UniProtKB:P02462};
DE Contains:
DE RecName: Full=Arresten;
DE Flags: Precursor;
GN Name=COL4A1 {ECO:0000250|UniProtKB:P02462};
OS Bos taurus (Bovine).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia;
OC Eutheria; Laurasiatheria; Artiodactyla; Ruminantia; Pecora; Bovidae;
OC Bovinae; Bos.
OX NCBI_TaxID=9913;
RN [1]
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RC STRAIN=Hereford;
RX PubMed=19393038; DOI=10.1186/gb-2009-10-4-r42;
RA Zimin A.V., Delcher A.L., Florea L., Kelley D.R., Schatz M.C., Puiu D.,
RA Hanrahan F., Pertea G., Van Tassell C.P., Sonstegard T.S., Marcais G.,
RA Roberts M., Subramanian P., Yorke J.A., Salzberg S.L.;
RT "A whole-genome assembly of the domestic cow, Bos taurus.";
RL Genome Biol. 10:R42.01-R42.10(2009).
RN [2]
RP PARTIAL PROTEIN SEQUENCE, AND PROLINE HYDROXYLATION.
RX PubMed=6430279; DOI=10.1042/bj2200227;
RA Schuppan D., Glanville R.W., Timpl R., Dixit S.N., Kang A.H.;
RT "Sequence comparison of pepsin-resistant segments of basement-membrane
RT collagen alpha 1(IV) chains from bovine lens capsule and mouse tumour.";
RL Biochem. J. 220:227-233(1984).
RN [3]
RP PARTIAL PROTEIN SEQUENCE, HYDROXYLATION AT PRO-204; PRO-207; PRO-210;
RP PRO-587; PRO-602; PRO-605; PRO-647; PRO-1214 AND PRO-1424, AND
RP IDENTIFICATION BY MASS SPECTROMETRY.
RX PubMed=24368846; DOI=10.1073/pnas.1307597111;
RA Pokidysheva E., Boudko S., Vranka J., Zientek K., Maddox K., Moser M.,
RA Faessler R., Ware J., Baechinger H.P.;
RT "Biological role of prolyl 3-hydroxylation in type IV collagen.";
RL Proc. Natl. Acad. Sci. U.S.A. 111:161-166(2014).
RN [4]
RP INTERCHAIN SULFILIMINE BONDS, AND IDENTIFICATION BY MASS SPECTROMETRY.
RX PubMed=19729652; DOI=10.1126/science.1176811;
RA Vanacore R., Ham A.-J.L., Voehler M., Sanders C.R., Conrads T.P.,
RA Veenstra T.D., Sharpless K.B., Dawson P.E., Hudson B.G.;
RT "A sulfilimine bond identified in collagen IV.";
RL Science 325:1230-1234(2009).
RN [5]
RP X-RAY CRYSTALLOGRAPHY (2.0 ANGSTROMS) OF 1441-1669, DISULFIDE BONDS, AND
RP SUBUNIT.
RX PubMed=11970952; DOI=10.1074/jbc.m201740200;
RA Sundaramoorthy M., Meiyappan M., Todd P., Hudson B.G.;
RT "Crystal structure of NC1 domains. Structural basis for type IV collagen
RT assembly in basement membranes.";
RL J. Biol. Chem. 277:31142-31153(2002).
CC -!- FUNCTION: Type IV collagen is the major structural component of
CC glomerular basement membranes (GBM), forming a 'chicken-wire' meshwork
CC together with laminins, proteoglycans and entactin/nidogen.
CC {ECO:0000250|UniProtKB:P02463}.
CC -!- FUNCTION: Arresten, comprising the C-terminal NC1 domain, inhibits
CC angiogenesis and tumor formation. The anti-angiogenic activity resides
CC in the C-terminal half. Specifically inhibits endothelial cell
CC proliferation, migration and tube formation.
CC {ECO:0000250|UniProtKB:P02462}.
CC -!- SUBUNIT: There are six type IV collagen isoforms, alpha 1(IV) to
CC alpha 6(IV), each of which can form a triple-helical structure with two
CC other chains to generate the type IV collagen network. Interacts with
CC EFEMP2 (By similarity). {ECO:0000250|UniProtKB:P02463,
CC ECO:0000305|PubMed:11970952}.
CC -!- SUBCELLULAR LOCATION: Secreted, extracellular space, extracellular
CC matrix, basement membrane {ECO:0000250|UniProtKB:P02463}.
CC -!- DOMAIN: Alpha chains of type IV collagen have a non-collagenous domain
CC (NC1) at their C-terminus, frequent interruptions of the G-X-Y repeats
CC in the long central triple-helical domain (which may cause flexibility
CC in the triple helix), and a short N-terminal triple-helical 7S domain.
CC NC1 domain mediates hexamerization of alpha chains of type IV collagen
CC (By similarity). {ECO:0000250|UniProtKB:P02463, ECO:0000305}.
CC -!- PTM: Lysines at the third position of the tripeptide repeating unit
CC (G-X-Y) are hydroxylated in all cases. The modified lysines can be
CC O-glycosylated. {ECO:0000250|UniProtKB:P02462}.
CC -!- PTM: Contains 4-hydroxyproline. Prolines at the third position of the
CC tripeptide repeating unit (G-X-Y) are hydroxylated in some or all of
CC the chains. {ECO:0000250|UniProtKB:P02462}.
CC -!- PTM: Contains 3-hydroxyproline (PubMed:24368846). This modification
CC occurs on the first proline residue in the sequence motif Gly-Pro-Hyp,
CC where Hyp is 4-hydroxyproline (By similarity).
CC {ECO:0000250|UniProtKB:P02463, ECO:0000269|PubMed:24368846}.
CC -!- PTM: Type IV collagens contain numerous cysteine residues which are
CC involved in inter- and intramolecular disulfide bonding. 12 of these,
CC located in the NC1 domain, are conserved in all known type IV
CC collagens. {ECO:0000250|UniProtKB:P02462}.
CC -!- PTM: The trimeric structure of the NC1 domains is stabilized by
CC covalent bonds (sulfilimine cross-links) between Lys and Met residues
CC (PubMed:19729652). These cross-links are important for the mechanical
CC stability of the basement membrane (By similarity). Sulfilimine
CC cross-link formation is catalyzed by PXDN (By similarity).
CC {ECO:0000250|UniProtKB:P02463, ECO:0000269|PubMed:19729652}.
CC -!- PTM: Proteolytic processing produces the C-terminal NC1 peptide,
CC arresten. {ECO:0000250|UniProtKB:P02462}.
CC -!- SIMILARITY: Belongs to the type IV collagen family.
CC {ECO:0000255|PROSITE-ProRule:PRU00736}.
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR EMBL; DAAA02034907; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR EMBL; DAAA02034908; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR EMBL; DAAA02034909; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR PDB; 1M3D; X-ray; 2.00 A; A/B/D/E/G/H/J/K=1441-1669.
DR PDB; 1T60; X-ray; 1.50 A; A/B/D/E/G/H/J/K/M/N/P/Q/S/T/V/W=1441-1669.
DR PDB; 1T61; X-ray; 1.50 A; A/B/D/E=1441-1669.
DR PDBsum; 1M3D; -.
DR PDBsum; 1T60; -.
DR PDBsum; 1T61; -.
DR AlphaFoldDB; Q7SIB2; -.
DR SMR; Q7SIB2; -.
DR ComplexPortal; CPX-3107; Collagen type IV trimer variant 1.
DR STRING; 9913.ENSBTAP00000035211; -.
DR GlyConnect; 108; 20 N-Linked glycans.
DR PaxDb; Q7SIB2; -.
DR PRIDE; Q7SIB2; -.
DR Ensembl; ENSBTAT00000035335; ENSBTAP00000035211; ENSBTAG00000012849.
DR VEuPathDB; HostDB:ENSBTAG00000012849; -.
DR VGNC; VGNC:50081; COL4A1.
DR eggNOG; KOG3544; Eukaryota.
DR GeneTree; ENSGT00940000157678; -.
DR HOGENOM; CLU_002023_1_0_1; -.
DR InParanoid; Q7SIB2; -.
DR OMA; ETEDMFT; -.
DR OrthoDB; 63831at2759; -.
DR TreeFam; TF316865; -.
DR EvolutionaryTrace; Q7SIB2; -.
DR Proteomes; UP000009136; Chromosome 12.
DR Bgee; ENSBTAG00000012849; Expressed in theca cell and 104 other tissues.
DR ExpressionAtlas; Q7SIB2; baseline and differential.
DR GO; GO:0005604; C:basement membrane; IBA:GO_Central.
DR GO; GO:0005587; C:collagen type IV trimer; IEA:Ensembl.
DR GO; GO:0031012; C:extracellular matrix; IBA:GO_Central.
DR GO; GO:0005615; C:extracellular space; IBA:GO_Central.
DR GO; GO:0005201; F:extracellular matrix structural constituent; IBA:GO_Central.
DR GO; GO:0030020; F:extracellular matrix structural constituent conferring tensile strength; IEA:Ensembl.
DR GO; GO:0048407; F:platelet-derived growth factor binding; IEA:Ensembl.
DR GO; GO:0071711; P:basement membrane organization; IEA:Ensembl.
DR GO; GO:0007420; P:brain development; IEA:Ensembl.
DR GO; GO:0001569; P:branching involved in blood vessel morphogenesis; IEA:Ensembl.
DR GO; GO:0071230; P:cellular response to amino acid stimulus; IEA:Ensembl.
DR GO; GO:0038063; P:collagen-activated tyrosine kinase receptor signaling pathway; IEA:Ensembl.
DR GO; GO:0030198; P:extracellular matrix organization; IBA:GO_Central.
DR GO; GO:0007528; P:neuromuscular junction development; IEA:Ensembl.
DR GO; GO:0061333; P:renal tubule morphogenesis; IEA:Ensembl.
DR GO; GO:0061304; P:retinal blood vessel morphogenesis; IEA:Ensembl.
DR Gene3D; 2.170.240.10; -; 1.
DR InterPro; IPR008160; Collagen.
DR InterPro; IPR001442; Collagen_IV_NC.
DR InterPro; IPR036954; Collagen_IV_NC_sf.
DR InterPro; IPR016187; CTDL_fold.
DR Pfam; PF01413; C4; 2.
DR Pfam; PF01391; Collagen; 17.
DR SMART; SM00111; C4; 2.
DR SUPFAM; SSF56436; SSF56436; 2.
DR PROSITE; PS51403; NC1_IV; 1.
PE 1: Evidence at protein level;
KW 3D-structure; Basement membrane; Collagen; Direct protein sequencing;
KW Disulfide bond; Extracellular matrix; Hydroxylation; Reference proteome;
KW Repeat; Secreted; Signal.
FT SIGNAL 1..27
FT /evidence="ECO:0000250|UniProtKB:P02462"
FT CHAIN 28..1669
FT /note="Collagen alpha-1(IV) chain"
FT /id="PRO_0000059399"
FT PROPEP 28..172
FT /note="N-terminal propeptide (7S domain)"
FT /evidence="ECO:0000250|UniProtKB:P02462"
FT /id="PRO_0000441824"
FT CHAIN 1445..1669
FT /note="Arresten"
FT /id="PRO_0000441825"
FT DOMAIN 1445..1669
FT /note="Collagen IV NC1"
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00736"
FT REGION 50..1445
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 173..1440
FT /note="Triple-helical region"
FT /evidence="ECO:0000305"
FT COMPBIAS 101..123
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 140..154
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 191..215
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 281..299
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 365..380
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 409..425
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 789..817
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 850..864
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1341..1355
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1415..1434
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT MOD_RES 204
FT /note="3-hydroxyproline"
FT /evidence="ECO:0000269|PubMed:24368846"
FT MOD_RES 207
FT /note="3-hydroxyproline"
FT /evidence="ECO:0000269|PubMed:24368846"
FT MOD_RES 210
FT /note="3-hydroxyproline"
FT /evidence="ECO:0000269|PubMed:24368846"
FT MOD_RES 587
FT /note="3-hydroxyproline"
FT /evidence="ECO:0000269|PubMed:24368846"
FT MOD_RES 602
FT /note="3-hydroxyproline"
FT /evidence="ECO:0000269|PubMed:24368846"
FT MOD_RES 603
FT /note="4-hydroxyproline"
FT /evidence="ECO:0000250|UniProtKB:P02463"
FT MOD_RES 605
FT /note="3-hydroxyproline"
FT /evidence="ECO:0000269|PubMed:24368846"
FT MOD_RES 606
FT /note="4-hydroxyproline"
FT /evidence="ECO:0000250|UniProtKB:P02463"
FT MOD_RES 623
FT /note="4-hydroxyproline"
FT /evidence="ECO:0000250|UniProtKB:P02463"
FT MOD_RES 626
FT /note="4-hydroxyproline"
FT /evidence="ECO:0000250|UniProtKB:P02463"
FT MOD_RES 629
FT /note="4-hydroxyproline"
FT /evidence="ECO:0000250|UniProtKB:P02463"
FT MOD_RES 632
FT /note="4-hydroxyproline"
FT /evidence="ECO:0000250|UniProtKB:P02463"
FT MOD_RES 647
FT /note="3-hydroxyproline"
FT /evidence="ECO:0000269|PubMed:24368846"
FT MOD_RES 1214
FT /note="3-hydroxyproline"
FT /evidence="ECO:0000269|PubMed:24368846"
FT MOD_RES 1424
FT /note="3-hydroxyproline"
FT /evidence="ECO:0000269|PubMed:24368846"
FT DISULFID 1460..1551
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00736,
FT ECO:0000269|PubMed:11970952"
FT DISULFID 1493..1548
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00736,
FT ECO:0000269|PubMed:11970952"
FT DISULFID 1505..1511
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00736,
FT ECO:0000269|PubMed:11970952"
FT DISULFID 1570..1665
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00736,
FT ECO:0000269|PubMed:11970952"
FT DISULFID 1604..1662
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00736,
FT ECO:0000269|PubMed:11970952"
FT DISULFID 1616..1622
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00736,
FT ECO:0000269|PubMed:11970952"
FT CROSSLNK 1533
FT /note="S-Lysyl-methionine sulfilimine (Met-Lys) (interchain
FT with K-1651)"
FT /evidence="ECO:0000269|PubMed:19729652"
FT CROSSLNK 1651
FT /note="S-Lysyl-methionine sulfilimine (Lys-Met) (interchain
FT with M-1533)"
FT /evidence="ECO:0000269|PubMed:19729652"
FT STRAND 1446..1451
FT /evidence="ECO:0007829|PDB:1T60"
FT STRAND 1453..1456
FT /evidence="ECO:0007829|PDB:1T60"
FT STRAND 1465..1478
FT /evidence="ECO:0007829|PDB:1T60"
FT STRAND 1481..1484
FT /evidence="ECO:0007829|PDB:1T60"
FT HELIX 1490..1492
FT /evidence="ECO:0007829|PDB:1T60"
FT STRAND 1493..1496
FT /evidence="ECO:0007829|PDB:1T60"
FT STRAND 1502..1505
FT /evidence="ECO:0007829|PDB:1T60"
FT STRAND 1511..1514
FT /evidence="ECO:0007829|PDB:1T60"
FT STRAND 1519..1524
FT /evidence="ECO:0007829|PDB:1T60"
FT HELIX 1538..1544
FT /evidence="ECO:0007829|PDB:1T60"
FT STRAND 1547..1555
FT /evidence="ECO:0007829|PDB:1T60"
FT STRAND 1557..1561
FT /evidence="ECO:0007829|PDB:1T60"
FT STRAND 1563..1566
FT /evidence="ECO:0007829|PDB:1T60"
FT STRAND 1575..1587
FT /evidence="ECO:0007829|PDB:1T60"
FT HELIX 1589..1591
FT /evidence="ECO:0007829|PDB:1T60"
FT STRAND 1593..1595
FT /evidence="ECO:0007829|PDB:1T60"
FT HELIX 1601..1603
FT /evidence="ECO:0007829|PDB:1T60"
FT STRAND 1604..1607
FT /evidence="ECO:0007829|PDB:1T60"
FT STRAND 1613..1617
FT /evidence="ECO:0007829|PDB:1T60"
FT STRAND 1620..1623
FT /evidence="ECO:0007829|PDB:1T61"
FT STRAND 1629..1634
FT /evidence="ECO:0007829|PDB:1T60"
FT HELIX 1638..1640
FT /evidence="ECO:0007829|PDB:1T60"
FT STRAND 1648..1650
FT /evidence="ECO:0007829|PDB:1T60"
FT HELIX 1656..1658
FT /evidence="ECO:0007829|PDB:1T60"
FT STRAND 1661..1666
FT /evidence="ECO:0007829|PDB:1T60"
SQ SEQUENCE 1669 AA; 160412 MW; AFC2A53A7BE484B1 CRC64;
MGPRLGVWLL LLLAALLLHE ESSRAAAKGG CAGSGCGKCD CHGVKGQKGE RGLPGLQGVI
GFPGMQGPEG PQGPPGQKGD TGEPGLPGTK GTRGPSGVPG YPGNPGLPGI PGQDGPPGPP
GIPGCNGTKG ERGPVGPPGL PGFAGNPGPP GLPGMKGDPG EILGHIPGTL LKGERGYPGQ
PGAPGSPGLP GLQGPVGPPG FTGPPGPPGP PGPPGEKGQM GLSFQGPKGE KGDQGVSGPP
GLPGQAQVIT KGDTAMRGEK GQKGEPGFPG LPGFGEKGEP GKPGPRGKPG KDGEKGEKGS
PGFPGDSGYP GQPGQDGLKG EKGEAGPPGL PGTVIGTGPL GEKGEPGYPG GPGAKGETGP
KGFPGIPGQP GPPGFPTPGL IGAPGFPGDR GEKGEPGLPG VSLPGPSGRD GLPGPPGPPG
PPGQPGHTNG IVECQPGPPG DQGPPGIPGQ PGLTGEVGEK GQKGDSCLVC DTAELRGPPG
PQGPPGEIGF PGQPGAKGDR GLPGRDGLEG LPGPQGAPGL MGQPGAKGEP GEIYFDIRLK
GDKGDPGFPG QPGMPGRAGS PGRDGQPGLP GPRGSPGSVG LKGERGPPGG VGFPGSRGDI
GPPGPPGFGP IGPIGDKGQI GFPGTPGAPG QPGPKGEAGK VVPLPGPPGA EGLPGSPGFQ
GPQGDRGFPG SPGRPGLPGE KGAIGQPGIG FPGPPGPKGV DGLPGDAGPP GNPGRQGFNG
LPGNPGPPGQ KGEPGVGLPG LKGLPGIPGI PGTPGEKGNV GGPGIPGEHG AIGPPGLQGL
RGDPGPPGFQ GPKGAPGVPG IGPPGAMGPP GGQGPPGSSG PPGVKGEKGF PGFPGLDMPG
PKGDKGSQGL PGLTGQSGLP GLPGQQGTPG QPGIPGPKGE MGVMGTPGQP GSPGPAGVPG
LPGAKGDHGF PGSSGPRGDP GFKGDKGDVG LPGKPGSMDK VDMGSMKGEK GDQGEKGQTG
PTGDKGSRGD PGTPGVPGKD GQAGHPGQPG PKGDPGVSGI PGAPGLPGPK GSAGGMGLPG
MPGPKGVAGI PGPQGIPGLP GDKGAKGEKG QAGLPGIGIP GRPGDKGDQG LAGFPGSPGE
KGEKGSTGIP GMPGSPGPKG SPGSVGYPGS PGLPGEKGDK GLPGLDGIPG IKGEAGLPGK
PGPTGPAGQK GEPGSDGIPG SVGEKGESGL PGRGFPGFPG SKGDKGSKGD VGFPGLSGSP
GIPGSKGEQG FMGPPGPQGQ PGLPGTPGHA VEGPKGDRGP QGQPGLPGRP GPMGPPGLPG
LEGLKGERGN PGWPGTPGAP GPKGDPGFQG MPGIGGSPGI TGAKGDVGPP GVPGFHGQKG
APGLQGVKGD QGDQGFPGTK GLPGPPGPPG PFSIIKGEPG LPGPEGPAGL KGLQGPPGPK
GQQGVTGSVG LPGPPGEPGF DGAPGQKGET GPFGPPGPRG FPGPPGPDGL PGSMGPPGTP
SVDHGFLVTR HSQTTDDPQC PPGTKILYHG YSLLYVQGNE RAHGQDLGTA GSCLRKFSTM
PFLFCNINNV CNFASRNDYS YWLSTPEPMP MSMAPITGEN IRPFISRCAV CEAPAMVMAV
HSQTIQIPQC PTGWSSLWIG YSFVMHTSAG AEGSGQALAS PGSCLEEFRS APFIECHGRG
TCNYYANAYS FWLATIERSE MFKKPTPSTL KAGELRTHVS RCQVCMRRT
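
The FT lines above give each feature as a key followed by a single position or a start..end range. As a rough illustration of how those coordinates could be pulled out of the record, here is a minimal Python sketch. The function name `parse_features` and the regular expression are assumptions made for this example, not part of any UniProt tooling, and the sketch only handles plain integer positions as they appear in this entry (no fuzzy positions such as `<1` or `?`).

```python
import re

# Matches feature header lines such as "FT DISULFID 1460..1551" or
# "FT MOD_RES 204". Qualifier lines ("FT /note=...") do not match.
FT_LINE = re.compile(r"^FT\s+([A-Z_]+)\s+(\d+)(?:\.\.(\d+))?\s*$")


def parse_features(text):
    """Return a list of (key, start, end) tuples, one per feature line.

    Single-position features are reported with start == end.
    """
    features = []
    for line in text.splitlines():
        match = FT_LINE.match(line)
        if match:
            key = match.group(1)
            start = int(match.group(2))
            end = int(match.group(3)) if match.group(3) else start
            features.append((key, start, end))
    return features


# A few lines copied from the feature table of this entry.
sample = """FT DISULFID 1460..1551
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00736,
FT ECO:0000269|PubMed:11970952"
FT CROSSLNK 1533
FT /note="S-Lysyl-methionine sulfilimine (Met-Lys) (interchain
FT with K-1651)"
FT MOD_RES 204
FT /note="3-hydroxyproline"
"""

for feature in parse_features(sample):
    print(feature)
```

Because the pattern uses `\s+` between fields, it should also accept the fixed-column spacing of an unmodified flat file. Qualifier and continuation lines are skipped rather than attached to their feature, which keeps the sketch short but discards the notes and evidence codes.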