COLL7_MIMIV
ID COLL7_MIMIV Reviewed; 1937 AA.
AC Q5UNS9;
DT 13-SEP-2005, integrated into UniProtKB/Swiss-Prot.
DT 07-DEC-2004, sequence version 1.
DT 29-SEP-2021, entry version 66.
DE RecName: Full=Collagen-like protein 7;
GN OrderedLocusNames=MIMI_L669;
OS Acanthamoeba polyphaga mimivirus (APMV).
OC Viruses; Varidnaviria; Bamfordvirae; Nucleocytoviricota; Megaviricetes;
OC Imitervirales; Mimiviridae; Mimivirus.
OX NCBI_TaxID=212035;
OH NCBI_TaxID=5757; Acanthamoeba polyphaga (Amoeba).
RN [1]
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RC STRAIN=Rowbotham-Bradford;
RX PubMed=15486256; DOI=10.1126/science.1101485;
RA Raoult D., Audic S., Robert C., Abergel C., Renesto P., Ogata H.,
RA La Scola B., Susan M., Claverie J.-M.;
RT "The 1.2-megabase genome sequence of Mimivirus.";
RL Science 306:1344-1350(2004).
CC -!- FUNCTION: May participate in the formation of a layer of cross-linked
CC glycosylated fibrils at the viral surface, giving the virion a hairy
CC appearance. {ECO:0000305}.
CC -!- SUBCELLULAR LOCATION: Virion.
CC -!- PTM: May be hydroxylated on lysine by the viral-encoded procollagen-
CC lysine,2-oxoglutarate 5-dioxygenase. {ECO:0000305}.
CC ---------------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC ---------------------------------------------------------------------------
DR EMBL; AY653733; AAV50930.1; -; Genomic_DNA.
DR RefSeq; YP_003987191.1; NC_014649.1.
DR PRIDE; Q5UNS9; -.
DR GeneID; 9925315; -.
DR KEGG; vg:9925315; -.
DR Proteomes; UP000001134; Genome.
DR InterPro; IPR008160; Collagen.
DR Pfam; PF01391; Collagen; 8.
PE 4: Predicted;
KW Collagen; Glycoprotein; Hydroxylation; Reference proteome; Repeat; Virion.
FT CHAIN 1..1937
FT /note="Collagen-like protein 7"
FT /id="PRO_0000059422"
FT DOMAIN 102..161
FT /note="Collagen-like 1"
FT DOMAIN 168..227
FT /note="Collagen-like 2"
FT DOMAIN 297..356
FT /note="Collagen-like 3"
FT DOMAIN 363..422
FT /note="Collagen-like 4"
FT DOMAIN 453..512
FT /note="Collagen-like 5"
FT DOMAIN 672..731
FT /note="Collagen-like 6"
FT DOMAIN 735..854
FT /note="Collagen-like 7"
FT DOMAIN 867..926
FT /note="Collagen-like 8"
FT DOMAIN 936..995
FT /note="Collagen-like 9"
FT DOMAIN 1023..1142
FT /note="Collagen-like 10"
FT REGION 88..248
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 294..531
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 583..643
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT REGION 670..1144
FT /note="Disordered"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 298..513
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 589..605
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT COMPBIAS 670..1142
FT /note="Basic and acidic residues"
FT /evidence="ECO:0000256|SAM:MobiDB-lite"
FT CARBOHYD 6
FT /note="N-linked (GlcNAc...) asparagine; by host"
FT /evidence="ECO:0000255"
FT CARBOHYD 21
FT /note="N-linked (GlcNAc...) asparagine; by host"
FT /evidence="ECO:0000255"
FT CARBOHYD 515
FT /note="N-linked (GlcNAc...) asparagine; by host"
FT /evidence="ECO:0000255"
FT CARBOHYD 902
FT /note="N-linked (GlcNAc...) asparagine; by host"
FT /evidence="ECO:0000255"
FT CARBOHYD 1178
FT /note="N-linked (GlcNAc...) asparagine; by host"
FT /evidence="ECO:0000255"
FT CARBOHYD 1192
FT /note="N-linked (GlcNAc...) asparagine; by host"
FT /evidence="ECO:0000255"
FT CARBOHYD 1212
FT /note="N-linked (GlcNAc...) asparagine; by host"
FT /evidence="ECO:0000255"
FT CARBOHYD 1217
FT /note="N-linked (GlcNAc...) asparagine; by host"
FT /evidence="ECO:0000255"
FT CARBOHYD 1245
FT /note="N-linked (GlcNAc...) asparagine; by host"
FT /evidence="ECO:0000255"
FT CARBOHYD 1246
FT /note="N-linked (GlcNAc...) asparagine; by host"
FT /evidence="ECO:0000255"
FT CARBOHYD 1255
FT /note="N-linked (GlcNAc...) asparagine; by host"
FT /evidence="ECO:0000255"
FT CARBOHYD 1317
FT /note="N-linked (GlcNAc...) asparagine; by host"
FT /evidence="ECO:0000255"
FT CARBOHYD 1422
FT /note="N-linked (GlcNAc...) asparagine; by host"
FT /evidence="ECO:0000255"
FT CARBOHYD 1427
FT /note="N-linked (GlcNAc...) asparagine; by host"
FT /evidence="ECO:0000255"
FT CARBOHYD 1432
FT /note="N-linked (GlcNAc...) asparagine; by host"
FT /evidence="ECO:0000255"
FT CARBOHYD 1443
FT /note="N-linked (GlcNAc...) asparagine; by host"
FT /evidence="ECO:0000255"
FT CARBOHYD 1452
FT /note="N-linked (GlcNAc...) asparagine; by host"
FT /evidence="ECO:0000255"
FT CARBOHYD 1477
FT /note="N-linked (GlcNAc...) asparagine; by host"
FT /evidence="ECO:0000255"
FT CARBOHYD 1494
FT /note="N-linked (GlcNAc...) asparagine; by host"
FT /evidence="ECO:0000255"
FT CARBOHYD 1506
FT /note="N-linked (GlcNAc...) asparagine; by host"
FT /evidence="ECO:0000255"
FT CARBOHYD 1513
FT /note="N-linked (GlcNAc...) asparagine; by host"
FT /evidence="ECO:0000255"
FT CARBOHYD 1533
FT /note="N-linked (GlcNAc...) asparagine; by host"
FT /evidence="ECO:0000255"
FT CARBOHYD 1598
FT /note="N-linked (GlcNAc...) asparagine; by host"
FT /evidence="ECO:0000255"
FT CARBOHYD 1619
FT /note="N-linked (GlcNAc...) asparagine; by host"
FT /evidence="ECO:0000255"
FT CARBOHYD 1620
FT /note="N-linked (GlcNAc...) asparagine; by host"
FT /evidence="ECO:0000255"
FT CARBOHYD 1632
FT /note="N-linked (GlcNAc...) asparagine; by host"
FT /evidence="ECO:0000255"
FT CARBOHYD 1641
FT /note="N-linked (GlcNAc...) asparagine; by host"
FT /evidence="ECO:0000255"
FT CARBOHYD 1663
FT /note="N-linked (GlcNAc...) asparagine; by host"
FT /evidence="ECO:0000255"
FT CARBOHYD 1664
FT /note="N-linked (GlcNAc...) asparagine; by host"
FT /evidence="ECO:0000255"
FT CARBOHYD 1672
FT /note="N-linked (GlcNAc...) asparagine; by host"
FT /evidence="ECO:0000255"
FT CARBOHYD 1682
FT /note="N-linked (GlcNAc...) asparagine; by host"
FT /evidence="ECO:0000255"
FT CARBOHYD 1683
FT /note="N-linked (GlcNAc...) asparagine; by host"
FT /evidence="ECO:0000255"
FT CARBOHYD 1732
FT /note="N-linked (GlcNAc...) asparagine; by host"
FT /evidence="ECO:0000255"
FT CARBOHYD 1735
FT /note="N-linked (GlcNAc...) asparagine; by host"
FT /evidence="ECO:0000255"
FT CARBOHYD 1746
FT /note="N-linked (GlcNAc...) asparagine; by host"
FT /evidence="ECO:0000255"
FT CARBOHYD 1756
FT /note="N-linked (GlcNAc...) asparagine; by host"
FT /evidence="ECO:0000255"
FT CARBOHYD 1784
FT /note="N-linked (GlcNAc...) asparagine; by host"
FT /evidence="ECO:0000255"
FT CARBOHYD 1842
FT /note="N-linked (GlcNAc...) asparagine; by host"
FT /evidence="ECO:0000255"
FT CARBOHYD 1934
FT /note="N-linked (GlcNAc...) asparagine; by host"
FT /evidence="ECO:0000255"
SQ SEQUENCE 1937 AA; 194319 MW; 337FABD175726144 CRC64;
MLDMMNNSLS YNRPECINFQ NNSVQKNVIR VCENEPNTFV GSGIPSQIIG KQGDIYLDRI
TRIYYKKING VWVKNVCNNH RCCYPKECKG NPHTKGEKGE TGPKGIKGEK GDRGLKGEKG
NNGDPGEKGE KGAKGDKGES GEKGAKGDKG DKGDIGEKGE KGDKGDIGEK GEKGDKGDIG
EKGDKGDLGE KGEKGDPGQK GEKGDKGDFG DKGDKGDIGE KGDKGDIGDK GEIGNKGDVG
EKGSKGDKGI DGTSILFGFG IPSPDLGVDG DLYLDANTDE LYGKVNGQWI PITNLKGEKG
DKGNKGIDGE KGNKGDTGDK GIDGSKGDKG DTGNKGDIGD KGDQGIKGDI GDKGEKGDIG
EKGDKGEKGI KGDKGDIGEK GNKGDIGDKG EKGDKGIDGD KGIKGDKGDI GEKGDKGDIG
EKGNKGEKGD KGDKGDIGEK GDKGDTGSKG DKGDKGEKGD KGDKGEKGDI GDKGEKGDKG
DKGDKGDKGD KGDKGDTGDK GDKGDKGDKG DKGDNGTSIL FGSGPPSPDL GMVGDLYIDV
TTDELYGKVN AKMNDNIRVS AKVNVNKQIT LQATGQWIPL TNLKGDKGDK GINGNKGDKG
EKGDKGNPGT NAGKGEKGDK GDKGDAGTSI LFGQGAPDPN QGVDGDIYID TLTGELYRKV
NGLWVPEIDI KGDKGEKGDK GNAGDKGTSG EKGDLGSKGE KGDTGEKGDK GNKGDRGDKG
IKGDIGSKGD KGDIGNKGDK GDRGDKGIKG DAGLKGDKGD IGQKGDKGTK GDRGDKGEKG
DAGLKGNKGD IGLKGDKGTK GDRGDKGTKG DRGDKGDIGN KGDKGDKGTK GDRGDKGVKG
DKGDKGNKGD KGNIGIKGDK GDRSDKGLKG DKGDKGDTGD IGLKGDKGDI GEKGIKGDKG
INGSKGYKGD KGDKGSKGDK GNKGDKGSKG DKGDIGIKGS KGDKGDKGDK GSKGDKGDIG
SKGDKGDKGD IGTKGDKGTK GDKGIKGDIG SKGDKGSKGD KGSKGDKGDI GSKGDKGDKG
DKGSKGDKGS KGIKGDKGDK GTKGDKGIKG DKGDKGIKGD KGDKGDKGDK GIKGDKGDKG
DKGDKGDKGT KGDKGDKGDK GSKGDKGDKG DKGDKGEKGS KGDKGDKGDK GDKGDKGDKG
DTATCEIVNT DGQTRVSACN TGFVKIEASG YEITTAPNGS SDLAPDPFDP SNVSGMNTKL
FGIQNGAFRS GNFTATNLSD IGQYSAVFGY QTTATGSGSI VYGINNSSGI SSGSNGSLVG
GLANTGGIIR SFSGAEGSIA FGSSEINGII TTGFGAQGSI VGGYSSGGTI ATGSGANASE
AFGQATTGSL ITTGPGAIGS STRGYSVGNS VITTGLASWA SQITGYASNS GIIFSGSGTS
TSDIIASVTD SSTLTIGDGS IGSSIRGYCS SGSLISLGPN SNGSIINATV NNFSTLITSS
GVNGSSIVAR ANNSGLISIS GNCFGSQFVF NSINGGNITI GSTHAGSQIV GSANTSGIIT
VVGGLNGTLV AANSSSTSGV RISTSFGSLL AGNATTGGWF RPGQAQGSVI AGQASASGVI
LTAFNAGCNV IGYANRGSTL GINQNAFGYS VMGYADLNST ITSNALGANA CRIMGYALNN
STIFMDSLVA ANATDLFGYA NNSGVIQAAD QCAGALARGV AQNNSSIVTR GNGSMTFGFS
NNNSTIFTGQ FSDGCFVGGY ANSGGTISTG STSPASHVFG FSNSGSFITT GNNTNASTVF
GYATANSTIT TGANSNGSLA VGYTGSSGEL LGALGQTSFA IGRNNTASGA YSGAIGISSY
ANMEGSFAHS SFQDTVNPVS AGRSQNIKVM GRKVGTQIVL ANGSYATLPY DGYGDVFARL
IGSSGTAVGL TFQVTRTAGT YTVAPPITPS GGILWLSPSG VAAPPLAITA LGTSGFTITL
TDAQNYNCYF DIVNYSS