COLL3_MIMIV
ID COLL3_MIMIV Reviewed; 939 AA.
AC Q5UPX3;
DT 13-SEP-2005, integrated into UniProtKB/Swiss-Prot.
DT 07-DEC-2004, sequence version 1.
DT 29-SEP-2021, entry version 59.
DE RecName: Full=Collagen-like protein 3;
GN OrderedLocusNames=MIMI_R239;
OS Acanthamoeba polyphaga mimivirus (APMV).
OC Viruses; Varidnaviria; Bamfordvirae; Nucleocytoviricota; Megaviricetes;
OC Imitervirales; Mimiviridae; Mimivirus.
OX NCBI_TaxID=212035;
OH NCBI_TaxID=5757; Acanthamoeba polyphaga (Amoeba).
RN [1]
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RC STRAIN=Rowbotham-Bradford;
RX PubMed=15486256; DOI=10.1126/science.1101485;
RA Raoult D., Audic S., Robert C., Abergel C., Renesto P., Ogata H.,
RA La Scola B., Suzan M., Claverie J.-M.;
RT "The 1.2-megabase genome sequence of Mimivirus.";
RL Science 306:1344-1350(2004).
CC -!- FUNCTION: May participate in the formation of a layer of cross-linked
CC glycosylated fibrils at the viral surface, giving it a hairy
CC appearance. {ECO:0000305}.
CC -!- SUBCELLULAR LOCATION: Virion.
CC -!- PTM: May be hydroxylated on lysine by the viral-encoded procollagen-
CC lysine,2-oxoglutarate 5-dioxygenase. {ECO:0000305}.
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR EMBL; AY653733; AAV50512.1; -; Genomic_DNA.
DR RefSeq; YP_003986735.1; NC_014649.1.
DR GeneID; 9924846; -.
DR KEGG; vg:9924846; -.
DR Proteomes; UP000001134; Genome.
DR InterPro; IPR008160; Collagen.
DR Pfam; PF01391; Collagen; 7.
PE 4: Predicted;
KW Collagen; Glycoprotein; Hydroxylation; Reference proteome; Repeat; Virion.
FT CHAIN 1..939
FT /note="Collagen-like protein 3"
FT /id="PRO_0000059418"
FT DOMAIN 88..147
FT /note="Collagen-like 1"
FT DOMAIN 148..207
FT /note="Collagen-like 2"
FT DOMAIN 211..330
FT /note="Collagen-like 3"
FT DOMAIN 364..423
FT /note="Collagen-like 4"
FT DOMAIN 427..486
FT /note="Collagen-like 5"
FT DOMAIN 493..552
FT /note="Collagen-like 6"
FT DOMAIN 564..622
FT /note="Collagen-like 7"
FT DOMAIN 638..697
FT /note="Collagen-like 8"
FT REGION 84..332
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 358..697
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 896..923
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 121..288
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 295..312
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 358..525
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 539..579
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 587..689
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 896..910
FT /note="Polar residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT CARBOHYD 15
FT /note="N-linked (GlcNAc...) asparagine; by host"
FT /evidence="ECO:0000255"
FT CARBOHYD 35
FT /note="N-linked (GlcNAc...) asparagine; by host"
FT /evidence="ECO:0000255"
FT CARBOHYD 39
FT /note="N-linked (GlcNAc...) asparagine; by host"
FT /evidence="ECO:0000255"
FT CARBOHYD 82
FT /note="N-linked (GlcNAc...) asparagine; by host"
FT /evidence="ECO:0000255"
FT CARBOHYD 788
FT /note="N-linked (GlcNAc...) asparagine; by host"
FT /evidence="ECO:0000255"
FT CARBOHYD 820
FT /note="N-linked (GlcNAc...) asparagine; by host"
FT /evidence="ECO:0000255"
FT CARBOHYD 858
FT /note="N-linked (GlcNAc...) asparagine; by host"
FT /evidence="ECO:0000255"
FT CARBOHYD 919
FT /note="N-linked (GlcNAc...) asparagine; by host"
FT /evidence="ECO:0000255"
FT CARBOHYD 925
FT /note="N-linked (GlcNAc...) asparagine; by host"
FT /evidence="ECO:0000255"
SQ SEQUENCE 939 AA; 90971 MW; 567203610B9FAC27 CRC64;
MFGNCQKQNC CTQRNNTGIC YSVCPPQTVI TVPGNVTCNA SRIYVNIGRP NNCFGNDGDL
YLDTNTNNLY YKRDGVWLLV GNLSGSSGPS GPQGPKGEKG SNGDKGDKGE IGIQGLKGES
GADADKGDKG DKGDKGSKGT KGENGDKGNK GDKGDPGIKG SKGEKGSKGD KGSKGDKGDP
GIKGESGADA DKGDKGDKGS KGDKGDKGID GNKGEKGSKG DKGDKGDIGL KGESGADADK
GDKGDKGSKG DKGDKGDIGP KGESGADADK GDKGDKGSKG DKGDKGTKGE SGLIGTKGDK
GDKGDKGIKG DKGEAGTSIL EGSGVPSPDL GNNGDLYIDG MTGLLYAKIN DEWVPVTSIK
GDKGDKGDTG LKGESGADAD KGEKGDPGNK GDKGNKGDKG SKGDKGDKGD KGDTGLKGES
GADADKGDKG GKGEKGDNGE KGSKGEKGEK GEKGDNGEKG DKGDNGEKGE KGEKGEKGDN
GEKGEKGDVG IKGESGADAD KGDKGEKGDK GVNGDKGDKG SKGDTGIKGE AGTAANKGDK
GSKGDKGDKG SKGDIGISIK GDKGDKGDKG SKGDKGDIGI KGESGLSIKG DKGDKGGKGD
KGDLGSKGDI GLKGDKGDKG DVGLKGDKGD KGDVGSKGDK GDKGSKGDKG SKGDTGSKGD
KGDKGSKGDK GDKGDKGSKG DKGDIGSIGP KGEKGEAGST NSLYIGSAQL GNAGDFNNEI
IPVTGIAYIT AVGAGGSGYS GTTGGYGGGG AGGAFINYPV YVSSSQTYSA HVGFGGAQVA
GNSNKGENTT ITIGNLVLLA GGGEGGTATT GGTGGLVSIN GTQITAGAPG GTSGNSGVSS
IYINPLVIGG AGGGGGSNGT NAGNGGNYAG FTGGIASPGV NGGGGGGASL FSNGFNGETG
APTTDSGTNY GAGGGGGGNG TQGGNGSLGY VRIDFYSAP