CO4A2_ASCSU
ID CO4A2_ASCSU Reviewed; 1763 AA.
AC P27393;
DT 01-AUG-1992, integrated into UniProtKB/Swiss-Prot.
DT 01-AUG-1992, sequence version 1.
DT 03-AUG-2022, entry version 108.
DE RecName: Full=Collagen alpha-2(IV) chain;
DE Flags: Precursor;
OS Ascaris suum (Pig roundworm) (Ascaris lumbricoides).
OC Eukaryota; Metazoa; Ecdysozoa; Nematoda; Chromadorea; Rhabditida;
OC Spirurina; Ascaridomorpha; Ascaridoidea; Ascarididae; Ascaris.
OX NCBI_TaxID=6253;
RN [1]
RP NUCLEOTIDE SEQUENCE [MRNA] (ISOFORMS I AND II).
RX PubMed=1714907; DOI=10.1016/s0021-9258(18)98528-5;
RA Pettitt J., Kingston I.B.;
RT "The complete primary structure of a nematode alpha 2(IV) collagen and the
RT partial structural organization of its gene.";
RL J. Biol. Chem. 266:16149-16156(1991).
CC -!- FUNCTION: Collagen type IV is specific for basement membranes.
CC -!- SUBUNIT: Trimers of two alpha 1(IV) and one alpha 2(IV) chain. Type IV
CC collagen forms a mesh-like network linked through intermolecular
CC interactions between 7S domains and between NC1 domains.
CC -!- SUBCELLULAR LOCATION: Secreted, extracellular space, extracellular
CC matrix, basement membrane.
CC -!- ALTERNATIVE PRODUCTS:
CC Event=Alternative splicing; Named isoforms=2;
CC Name=I;
CC IsoId=P27393-1; Sequence=Displayed;
CC Name=II;
CC IsoId=P27393-2; Sequence=VSP_001159;
CC -!- DOMAIN: Alpha chains of type IV collagen have a non-collagenous domain
CC (NC1) at their C-terminus, frequent interruptions of the G-X-Y repeats
CC in the long central triple-helical domain (which may cause flexibility
CC in the triple helix), and a short N-terminal triple-helical 7S domain.
CC -!- PTM: Prolines at the third position of the tripeptide repeating unit
CC (G-X-Y) are hydroxylated in some or all of the chains.
CC -!- PTM: Type IV collagens contain numerous cysteine residues which are
CC involved in inter- and intramolecular disulfide bonding. 12 of these,
CC located in the NC1 domain, are conserved in all known type IV
CC collagens.
CC -!- PTM: The trimeric structure of the NC1 domains is stabilized by
CC covalent bonds between Lys and Met residues. {ECO:0000250}.
CC -!- SIMILARITY: Belongs to the type IV collagen family.
CC {ECO:0000255|PROSITE-ProRule:PRU00736}.
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR EMBL; M67507; AAA18014.1; -; mRNA.
DR PIR; S16366; S16366.
DR AlphaFoldDB; P27393; -.
DR GO; GO:0005604; C:basement membrane; IEA:UniProtKB-SubCell.
DR GO; GO:0005581; C:collagen trimer; IEA:UniProtKB-KW.
DR GO; GO:0005576; C:extracellular region; IEA:UniProtKB-KW.
DR GO; GO:0005201; F:extracellular matrix structural constituent; IEA:InterPro.
DR GO; GO:0048856; P:anatomical structure development; IEA:UniProt.
DR Gene3D; 2.170.240.10; -; 1.
DR InterPro; IPR008160; Collagen.
DR InterPro; IPR001442; Collagen_IV_NC.
DR InterPro; IPR036954; Collagen_IV_NC_sf.
DR InterPro; IPR016187; CTDL_fold.
DR Pfam; PF01413; C4; 2.
DR Pfam; PF01391; Collagen; 22.
DR SMART; SM00111; C4; 2.
DR SUPFAM; SSF56436; SSF56436; 2.
DR PROSITE; PS51403; NC1_IV; 1.
PE 2: Evidence at transcript level;
KW Alternative splicing; Basement membrane; Collagen; Disulfide bond;
KW Extracellular matrix; Glycoprotein; Hydroxylation; Proteoglycan; Repeat;
KW Secreted; Signal.
FT SIGNAL 1..26
FT /evidence="ECO:0000255"
FT CHAIN 27..1763
FT /note="Collagen alpha-2(IV) chain"
FT /id="PRO_0000005828"
FT DOMAIN 1533..1756
FT /note="Collagen IV NC1"
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00736"
FT REGION 27..42
FT /note="7S domain"
FT REGION 43..1529
FT /note="Triple-helical region"
FT REGION 51..529
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 550..1529
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 134..154
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 361..405
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 581..596
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 645..659
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 710..735
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 812..833
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 891..905
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 1506..1529
FT /note="Pro residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT CARBOHYD 126
FT /note="N-linked (GlcNAc...) asparagine"
FT /evidence="ECO:0000255"
FT DISULFID 1548..1637
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00736"
FT DISULFID 1581..1634
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00736"
FT DISULFID 1593..1599
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00736"
FT DISULFID 1656..1752
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00736"
FT DISULFID 1690..1749
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00736"
FT DISULFID 1702..1709
FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00736"
FT VAR_SEQ 230..266
FT /note="GEQGPRGPQGPPGPVPSTGAKGTIIGPEGAPGMKGEK -> GDIGPAGPPGP
FT PGPREFTGSGSIVGPRGHSGDKGVK (in isoform II)"
FT /evidence="ECO:0000303|PubMed:1714907"
FT /id="VSP_001159"
FT CARBOHYD P27393-2:249
FT /note="O-linked (Xyl...) (glycosaminoglycan) serine"
FT /evidence="ECO:0000255"
SQ SEQUENCE 1763 AA; 168527 MW; 304F528BC06AAE0D CRC64;
MSSRLRIPLW LLLPTTALVY FVTTVSTQIT CRDCTNRGCF CVGEKGSMGI PGPQGPPGAQ
GIRGFPGPEG LPGPKGQKGS QGPPGPQGIK GDRGIIGVPG FPGNDGANGR PGEPGPPGAP
GWDGCNGTDG APGVPGLPGP PGMPGFPGPP GVPGMKGEPA IGYAGAPGEK GDAGMPGMPG
LPGPPGRDGF PGEKGDRGDV GQAGPRGPPG EAGPPGNPGI GSIGPKGDPG EQGPRGPQGP
PGPVPSTGAK GTIIGPEGAP GMKGEKGDPG EAGPRGFPGT PGVAGQPGLP GMKGEKGLSG
PAGPRGKEGR PGLPGPPGFK GDRGLDGLPG VPGLPGQKGE AGFPGRDGAK GARGPPGPPG
GGEFSDGPPG PPGLPGREGQ PGPPGADGYP GPPGPQGPQG LPGGPGLPGL PGLEGLPGPK
GEKGDSGIPG APGVQGPPGL AGPPGAKGEP GPRGVDGQSI PGLPGKDGRP GLDGLPGRKG
EMGLPGVRGP PGDSLNGLPG PPGPRGPQGP KGYDGRDGAP GLPGIPGPKG DRGGTCAFCA
HGAKGEKGDA GYAGLPGPQG ERGLPGIPGA TGAPGDDGLP GAPGRPGPPG PPGQDGLPGL
PGQKGEPTQL TLRPGPPGYP GQKGETGFPG PRGQEGLPGK PGIVGAPGLP GPPGPKGEPG
LTGLPEKPGK DGIPGLPGLK GEPGYGQPGM PGLPGMKGDA GLPGLPGLPG AVGPMGPPVP
ESQLRPGPPG KDGLPGLPGP KGEAGFPGAP GLQGPAGLPG LPGMKGNPGL PGAPGLAGLP
GIPGEKGIAG KPGLPGLTGA KGEAGYPGQP GLPGPKGEPG PSTTGPPGPP GFPGLKGKDG
IPGAPGLPGL EGQRGLPGVP GQKGEIGLPG LAGAPGFPGA KGEPGLPGLP GKEGPQGPPG
QPGAPGFPGQ KGDEGLPGLP GVSGMKGDTG LPGVPGLAGP PGQPGFPGQK GQPGFPGVAG
AKGEAGLPGL PGAPGQKGEQ GLAGLPGIPG MKGAPGIPGA PGQDGLPGLP GVKGDRGFNG
LPGEKGEPGP AARDGEKGEP GLPGQPGLRG PQGPPGLPGL PGLKGDEGQP GYGAPGLMGE
KGLPGLPGKP GRPGAPGPKG LDGAPGFPGL KGEAGLPGAP GLPGQDGLPG LPGQKGESGF
PGQPGLVGPP GLPGKMGAPG IRGEKGDAGL PGLPGERGLD GLPGQKGEAG FPGAPGLPGP
VGPKGSAGAP GFPGLKGEPG LPGLEGQPGP RGMKGEAGLP GAPGRDGLPG LPGMKGEAGL
PGLPGQPGKS ITGPKGNAGL PGLPGKDGLP GLPGLKGEPG KPGYAGAAGI KGEPGLPGIP
GAKGEPGLSG IPGKRGNDGI PGKPGPAGLP GLPGMKGESG LPGPQGPAGL PGLPGLKGEP
GLPGFPGQKG ETGFPGQPGI PGLPGMKGDS GYPGAPGRDG APGKQGEPGP MGPPGAQPIV
QRGEKGEMGP MGAPGIRGEK GLPGLDGLPG PSGPPGFAGA KGRDGFPGQP GMPGEKGAPG
LPGFPGIEGI PGPPGLPGPS GPPGPPGPSY KDGFLLVKHS QTSEVPQCPP GMVKLWDGYS
LLYIEGNEKS HNQDLGHAGS CLSRFSTMPF LFCDVNNVCN YASRNDKSYW LSTTAPIPMM
PVSEGGIEPY ISRCAVCEAP ANVIAVHSQT IQIPNCPNGW NSLWIGYSFA MHTGAGAEGG
GQSLSSPGSC LEDFRATPFI ECNGARGTCH YFANKFSFWL TTIEDDQQFR IPESETLKAG
SLRTRVSRCQ VCIRSPDVQP YRG
//