I have the following file:
chr11_pilon3.g3568.t1 transcript:OIT01734 transcript:OIT01734 1.1e-107 389.8 1000 218 992 1 216 130 345 MDALTRHIQGDVPWCMLFADDIILIDETRAGVSERLEIWRQTLESKGFKISRSKTEYLECKFGDEPSGVGREVMLGSQAIAKRDSVRYLGSVIQGDGEIDGDVTHRIGAGWSKWRLASGVLCDKKIPHKLKGKFFRAMVRPAMFYEAECWPVKNSHIQRMKVAEMRMLRWMCGHTRLDKIKNEVIRQKVGVAPVDKKMGEARLRWFGHVRRRGPDA MDALTRHIQGDVPWCMLFADDIVLIDETRVGVNERLEVWRQTLESKGFKLSRSKTEYLECKFSAESSEVGRDVKLGSQVIAKRDSFRYLGSVIQGEGEIDGDVTHRIGAGWSKWRLASGVLCDKKVPQKLKGKFYRAVVRPAMLYGAECWPVKNSHVQRMKVAEMRMLRWMRGLTRLDRIRNEVIREKVGVALVDEKMREARLRWYGHVRRRRPDA MDALTRHIQGDVPWCMLFADDIILIDETRAGVSERLEIWRQTLESKGFKISRSKTEYLECKFGDEPSGVGREVMLGSQAIAKRDSVRYLGSVIQGDGEIDGDVTHRIGAGWSKWRLASGVLCDKKIPHKLKGKFFRAMVRPAMFYEAECWPVKNSHIQRMKVAEMRMLRWMCGHTRLDKIKNEVIRQKVGVAPVDKKMGEARLRWFGHVRRRGPDAR* MKVWERVVEARVREMTSISVNQFGFMPGRSTTEAIHLVRRLVEHFRDKKKDLHMVFIDLENAYDKVPREVLWRCLEAKSVPEAYIRVIKDMYDGAKTRVRTVGGDSDHFPVVMGLHQGSALSPLLFALVMDALTRHIQGDVPWCMLFADDIVLIDETRVGVNERLEVWRQTLESKGFKLSRSKTEYLECKFSAESSEVGRDVKLGSQVIAKRDSFRYLGSVIQGEGEIDGDVTHRIGAGWSKWRLASGVLCDKKVPQKLKGKFYRAVVRPAMLYGAECWPVKNSHVQRMKVAEMRMLRWMRGLTRLDRIRNEVIREKVGVALVDEKMREARLRWYGHVRRRRPDAPVRIYKSAILGHLNSHGSQNALAGPVEAEENRQKTKKEVMEEIIQKSKFFKAQKAKDREENDELTEQLDKDFTSLVESKALLSLTQPDKINALKALVNKNISVGNVKKDEVADVPRKASIGKEKPDTYEMLVSEMALDMRARPSDRTKTPEEIAQEEKERLELLEQEXXXXXXXXXXXXXXDGNASDDNSKLVKDPRTVSGDDLGDDLEEVPRTKLGWIGEILRRKENELESEDAASSGDSDDGEDEGXXXXXXXXXXXXXXXXXXXXDEEQGKTQTIKDWEQSDDDIIDTELEDDDEGFGDDAKKVVKIKDHKEENLSITVAAENKKKMQVFYGVLLQYFAVLANKKPLNSKLLNLLVKPLMEMSAVSPYFAAICARQRLQRTRAQFCEDLKNTGKSSWPSLKTIFLLRLWSMIFPCSDFRHCVMTPAILLMCEYLMRCTIISGRDIAIASFLCSLLLSVIKQSQKFCPEAIVFIQTLLMAALDRKQRSNSQLDNLMEIKELGPLLCIRSSKVEMDSLDFLTLMDLPEDSQYFHSDNYRTSMLVTVLETLQGFVNVYKELISFPEIFMLISKLLCKMAGENHIPDALREKIKDVSQLIDTKAQEHHMLRQPLKMRKKKPVPIRMLNPKFEENFVKGRDYDPDRERA 389.8 1000 216 85.6 185 31 200 0 0 92.6 0 22IV6AV2SN4IV11IL12GSDA1PS1GE3ED1MK4AV6VF9DE29IV1HQ6FY2MV5FL1EG10IV14CR1HL4KR1KR5QE5PL2KE2GR6FY6GR3 85.6 1.1e-107 99.1
gene.9403.0.4.p1 transcript:OIT35479 transcript:OIT35479 8.5e-191 667.5 1721 690 406 1 378 1 378 MLSAPRVSPPAVAVAAPARFKFPNVCVNPVNLLLLHRNVGSSCKRVVVSTKAAYSRMPMDTPGAYQLIDKESGDKFIIWGGTEDDDSSIPSKEVLSWKPLASTPXXXXXXXXXXXXXDEASTRGLTGNFGRLKFRRMRDLVRKSYTKNKERDVIDHNKHNIADASSRSSFSSYNEPDQLKEQQTLSLPRGRAKIQQLDDKKNFQKLIRVEDEDRGIAIENVSKHFAGYSIDSHAQSARVVHPGSKASASPLRGWGGGSSHYSLKRDEIFRERQNLGDENNFFSRKSFQELGCSDYMIESLRNQHFVRPSHIQAMTFGPIIAGKSCIISDQSGSGKTLAYLLPLIQRLRQEELQGLSKPSSQSPRVVVLAPTAELASQV MLSAPRAPPPAVAVAAPARFKFQNVCGNPVNLLLLHRNVGSSCKRVVVSTKAAYSRMPMDTPGAYQLIDKESGDKFIVWGGTEDDDSSIPSKEVLSWKPLASTSPDNNHPPPTQSSSNEASTRGLTGNFGRLKFRRMRDLVRKSYTKNKERDVIDHDKHNTTDASSRSSFSSYNEPGQLKEQQTLSLPRGRAKIQQLEDRKNSQKLIRVEDEDRDIAIENVSKHFAGYSSDSHAHSARVVHPGSKASASPLRGWGGGSSHYSLKREEIFRQRRNLDDENNFFSRKSFQELGCSDYMIESLRNQHFVRPSHIQAMTFGPIIAGKSCIISDQSGSGKTLAYLLPLIQRLRQEELQGLSKPSSQSPRVVVLAPTAELASQV MLSAPRVSPPAVAVAAPARFKFPNVCVNPVNLLLLHRNVGSSCKRVVVSTKAAYSRMPMDTPGAYQLIDKESGDKFIIWGGTEDDDSSIPSKEVLSWKPLASTPXXXXXXXXXXXXXDEASTRGLTGNFGRLKFRRMRDLVRKSYTKNKERDVIDHNKHNIADASSRSSFSSYNEPDQLKEQQTLSLPRGRAKIQQLDDKKNFQKLIRVEDEDRGIAIENVSKHFAGYSIDSHAQSARVVHPGSKASASPLRGWGGGSSHYSLKRDEIFRERQNLGDENNFFSRKSFQELGCSDYMIESLRNQHFVRPSHIQAMTFGPIIAGKSCIISDQSGSGKTLAYLLPLIQRLRQEELQGLSKPSSQSPRVVVLAPTAELASQVLSTCRSFSKSGVPFHSMVVTGGFCQRTQLENLRQELDILIATPGRFMFLIKEGYLQLTNLKCAVLDEVDILFSDEDFETAFQCLINSSPITTQYLFVTATLPMDIYNKLVESFPDCELVSGPGMHRTSPGLEEFLVDCSGDETAEKSPDTAFINKKNALLHLVEDSPVPKTIVFCNKIDSCRKVENALKRFDRKGFSIKILPFHAALDQRRRLANMEEFRRSKMENVSLFLVCTDRASRGIDFEGVDHVVLFDYPRDPSEYVRRVGRTARGAGGKGKAFIFAVGKQVSLARRIMERNKKGHPVHDVPSILT* MLSAPRAPPPAVAVAAPARFKFQNVCGNPVNLLLLHRNVGSSCKRVVVSTKAAYSRMPMDTPGAYQLIDKESGDKFIVWGGTEDDDSSIPSKEVLSWKPLASTSPDNNHPPPTQSSSNEASTRGLTGNFGRLKFRRMRDLVRKSYTKNKERDVIDHDKHNTTDASSRSSFSSYNEPGQLKEQQTLSLPRGRAKIQQLEDRKNSQKLIRVEDEDRDIAIENVSKHFAGYSSDSHAHSARVVHPGSKASASPLRGWGGGSSHYSLKREEIFRQRRNLDDENNFFSRKSFQELGCSDYMIESLRNQHFVRPSHIQAMTFGPIIAGKSCIISDQSGSGKTLAYLLPLIQRLRQEELQGLSKPSSQSPRVVVLAPTAELASQVCQISSSIKGTFATYSPYCSATTHTKRKK 667.5 1721 378 91.0 344 34 352 0 0 93.1 0 
6VASP14PQ3VG50IV25PSXPXDXNXNXHXPXPXPXTXQXSXSXSDN38ND3ITAT14DG20DE1KR2FS11GD14IS4QH30DE4EQ1QR2GD102 91.0 8.5e-191 54.8
gene.9403.0.5.p1 transcript:OIT35479 transcript:OIT35479 8.5e-191 667.5 1721 690 406 1 378 1 378 MLSAPRVSPPAVAVAAPARFKFPNVCVNPVNLLLLHRNVGSSCKRVVVSTKAAYSRMPMDTPGAYQLIDKESGDKFIIWGGTEDDDSSIPSKEVLSWKPLASTPXXXXXXXXXXXXXDEASTRGLTGNFGRLKFRRMRDLVRKSYTKNKERDVIDHNKHNIADASSRSSFSSYNEPDQLKEQQTLSLPRGRAKIQQLDDKKNFQKLIRVEDEDRGIAIENVSKHFAGYSIDSHAQSARVVHPGSKASASPLRGWGGGSSHYSLKRDEIFRERQNLGDENNFFSRKSFQELGCSDYMIESLRNQHFVRPSHIQAMTFGPIIAGKSCIISDQSGSGKTLAYLLPLIQRLRQEELQGLSKPSSQSPRVVVLAPTAELASQV MLSAPRAPPPAVAVAAPARFKFQNVCGNPVNLLLLHRNVGSSCKRVVVSTKAAYSRMPMDTPGAYQLIDKESGDKFIVWGGTEDDDSSIPSKEVLSWKPLASTSPDNNHPPPTQSSSNEASTRGLTGNFGRLKFRRMRDLVRKSYTKNKERDVIDHDKHNTTDASSRSSFSSYNEPGQLKEQQTLSLPRGRAKIQQLEDRKNSQKLIRVEDEDRDIAIENVSKHFAGYSSDSHAHSARVVHPGSKASASPLRGWGGGSSHYSLKREEIFRQRRNLDDENNFFSRKSFQELGCSDYMIESLRNQHFVRPSHIQAMTFGPIIAGKSCIISDQSGSGKTLAYLLPLIQRLRQEELQGLSKPSSQSPRVVVLAPTAELASQV MLSAPRVSPPAVAVAAPARFKFPNVCVNPVNLLLLHRNVGSSCKRVVVSTKAAYSRMPMDTPGAYQLIDKESGDKFIIWGGTEDDDSSIPSKEVLSWKPLASTPXXXXXXXXXXXXXDEASTRGLTGNFGRLKFRRMRDLVRKSYTKNKERDVIDHNKHNIADASSRSSFSSYNEPDQLKEQQTLSLPRGRAKIQQLDDKKNFQKLIRVEDEDRGIAIENVSKHFAGYSIDSHAQSARVVHPGSKASASPLRGWGGGSSHYSLKRDEIFRERQNLGDENNFFSRKSFQELGCSDYMIESLRNQHFVRPSHIQAMTFGPIIAGKSCIISDQSGSGKTLAYLLPLIQRLRQEELQGLSKPSSQSPRVVVLAPTAELASQVLSTCRSFSKSGVPFHSMVVTGGFCQRTQLENLRQELDILIATPGRFMFLIKEGYLQLTNLKCAVLDEVDILFSDEDFETAFQCLINSSPITTQYLFVTATLPMDIYNKLVESFPDCELVSGPGMHRTSPGLEEFLVDCSGDETAEKSPDTAFINKKNALLHLVEDSPVPKTIVFCNKIDSCRKVENALKRFDRKGFSIKILPFHAALDQRRRLANMEEFRRSKMENVSLFLVCTDRASRGIDFEGVDHVVLFDYPRDPSEYVRRVGRTARGAGGKGKAFIFAVGKQVSLARRIMERNKKGHPVHDVPSILT* MLSAPRAPPPAVAVAAPARFKFQNVCGNPVNLLLLHRNVGSSCKRVVVSTKAAYSRMPMDTPGAYQLIDKESGDKFIVWGGTEDDDSSIPSKEVLSWKPLASTSPDNNHPPPTQSSSNEASTRGLTGNFGRLKFRRMRDLVRKSYTKNKERDVIDHDKHNTTDASSRSSFSSYNEPGQLKEQQTLSLPRGRAKIQQLEDRKNSQKLIRVEDEDRDIAIENVSKHFAGYSSDSHAHSARVVHPGSKASASPLRGWGGGSSHYSLKREEIFRQRRNLDDENNFFSRKSFQELGCSDYMIESLRNQHFVRPSHIQAMTFGPIIAGKSCIISDQSGSGKTLAYLLPLIQRLRQEELQGLSKPSSQSPRVVVLAPTAELASQVCQISSSIKGTFATYSPYCSATTHTKRKK 667.5 1721 378 91.0 344 34 352 0 0 93.1 0 
6VASP14PQ3VG50IV25PSXPXDXNXNXHXPXPXPXTXQXSXSXSDN38ND3ITAT14DG20DE1KR2FS11GD14IS4QH30DE4EQ1QR2GD102 91.0 8.5e-191 54.8
gene.69001.9.9.p1 NisylKD955766g0010.1 NisylKD955766g0010.1 1.4e-294 1011.9 2615 531 530 1 530 1 530 MKEMCLAVAPLPFRLGNNLIFHNPLSIGSSSHMDVTRLNSMGGTTTSLYAESAEKDLSDTVSSSRSEGVPLLHMISENESNNWISGDAVVRESEDDEILSLDGDQMSCSLSVVSDSSSLCGDDFIGFEVASEIFGQNFVDAEKSICSVELIAKPGDLVESGVEDDNVSKPFAVKIEEQITDGSSSKSSQVVVQLPLNKGLSAAVSRSVFEVDYIPLWGFTSVCGRRPEMEDALATVPRFLRIPLQMLVGHRVPDGVSRCLSHLTAHFFGVYDGHGGSQVANYCRDRVHAVLAEELEKFMANLNDESIRQNCQEQWKKAFTNCFLMVDDEVGGTGNHEAVAAETVGSTAVVAIVCSSHIIVANCGDSRAVLCRGKEPTALSVDHKPNREDEYARIEAAGGKVIQWNGHRVFGVLAMSRSIGDRYLKPWIIPDPEVMFIPRTKDDECLILASDGLWDVMSNEEACELARKRILLWHKKNGVTLTLERGQGIDPAAQAAAECLSNRAIQKGSKDNITVIVVDLKAQRKFKSKT MKEMCLAVAPLPFRLGNNLIFRNPPSIGSSSHMDATRLNSMGDTTTSLYAESAEKDLSDTVSSSRSEGVPLLPMISENDRNNWIAGDAVVRESEDDEILSLDGDQVSCSLSVVSDSSSLCGDDFIGFEVASDIYGQNFVDAEKSICSVELIAKPGDLVESGVEDDNVSKPFAVKLEEQITDGSSSKSSQVVVQLPLNKGLSAAVSRSVFEVDYIPLWGFTSVCGRRPEMEDALATVPRFLRIPLQMLVGDRVPDGVSRCLSHLTAHFFGVYDGHGGSQVANYCRDRVHAVLAEELEKFMANLNDESIRQNCQDQWKKAFTNCFLKVDDEVGGTGNREAVAAETVGSTAVVAIVCSSHIIVANCGDSRAVLCRGKEPMALSVDHKPNREDEYARIEAAGGKVIQWNGHRVFGVLAMSRSIGDRYLKPWIIPDPEVMFIPRTKDDECLILASDGLWDVMSNEEACELARKRILLWHKKNGVTLTLERGQGIDPAAQAAAECLSNRATQKGSKDNITVIVVDLKAQRKFKSKT MKEMCLAVAPLPFRLGNNLIFHNPLSIGSSSHMDVTRLNSMGGTTTSLYAESAEKDLSDTVSSSRSEGVPLLHMISENESNNWISGDAVVRESEDDEILSLDGDQMSCSLSVVSDSSSLCGDDFIGFEVASEIFGQNFVDAEKSICSVELIAKPGDLVESGVEDDNVSKPFAVKIEEQITDGSSSKSSQVVVQLPLNKGLSAAVSRSVFEVDYIPLWGFTSVCGRRPEMEDALATVPRFLRIPLQMLVGHRVPDGVSRCLSHLTAHFFGVYDGHGGSQVANYCRDRVHAVLAEELEKFMANLNDESIRQNCQEQWKKAFTNCFLMVDDEVGGTGNHEAVAAETVGSTAVVAIVCSSHIIVANCGDSRAVLCRGKEPTALSVDHKPNREDEYARIEAAGGKVIQWNGHRVFGVLAMSRSIGDRYLKPWIIPDPEVMFIPRTKDDECLILASDGLWDVMSNEEACELARKRILLWHKKNGVTLTLERGQGIDPAAQAAAECLSNRAIQKGSKDNITVIVVDLKAQRKFKSKT* 
MKEMCLAVAPLPFRLGNNLIFRNPPSIGSSSHMDATRLNSMGDTTTSLYAESAEKDLSDTVSSSRSEGVPLLPMISENDRNNWIAGDAVVRESEDDEILSLDGDQVSCSLSVVSDSSSLCGDDFIGFEVASDIYGQNFVDAEKSICSVELIAKPGDLVESGVEDDNVSKPFAVKLEEQITDGSSSKSSQVVVQLPLNKGLSAAVSRSVFEVDYIPLWGFTSVCGRRPEMEDALATVPRFLRIPLQMLVGDRVPDGVSRCLSHLTAHFFGVYDGHGGSQVANYCRDRVHAVLAEELEKFMANLNDESIRQNCQDQWKKAFTNCFLKVDDEVGGTGNREAVAAETVGSTAVVAIVCSSHIIVANCGDSRAVLCRGKEPMALSVDHKPNREDEYARIEAAGGKVIQWNGHRVFGVLAMSRSIGDRYLKPWIIPDPEVMFIPRTKDDECLILASDGLWDVMSNEEACELARKRILLWHKKNGVTLTLERGQGIDPAAQAAAECLSNRATQKGSKDNITVIVVDLKAQRKFKSKT 1011.9 2615 530 96.6 512 18 519 0 0 97.9 0 21HR2LP9VA7GD29HP5EDSR4SA20MV25ED1FY40IL74HD62ED11MK10HR40TM127IT25 96.6 1.4e-294 99.8
gene.9403.9.5.p1 transcript:OIT35479 transcript:OIT35479 8.5e-191 667.5 1721 690 406 1 378 1 378 MLSAPRVSPPAVAVAAPARFKFPNVCVNPVNLLLLHRNVGSSCKRVVVSTKAAYSRMPMDTPGAYQLIDKESGDKFIIWGGTEDDDSSIPSKEVLSWKPLASTPXXXXXXXXXXXXXDEASTRGLTGNFGRLKFRRMRDLVRKSYTKNKERDVIDHNKHNIADASSRSSFSSYNEPDQLKEQQTLSLPRGRAKIQQLDDKKNFQKLIRVEDEDRGIAIENVSKHFAGYSIDSHAQSARVVHPGSKASASPLRGWGGGSSHYSLKRDEIFRERQNLGDENNFFSRKSFQELGCSDYMIESLRNQHFVRPSHIQAMTFGPIIAGKSCIISDQSGSGKTLAYLLPLIQRLRQEELQGLSKPSSQSPRVVVLAPTAELASQV MLSAPRAPPPAVAVAAPARFKFQNVCGNPVNLLLLHRNVGSSCKRVVVSTKAAYSRMPMDTPGAYQLIDKESGDKFIVWGGTEDDDSSIPSKEVLSWKPLASTSPDNNHPPPTQSSSNEASTRGLTGNFGRLKFRRMRDLVRKSYTKNKERDVIDHDKHNTTDASSRSSFSSYNEPGQLKEQQTLSLPRGRAKIQQLEDRKNSQKLIRVEDEDRDIAIENVSKHFAGYSSDSHAHSARVVHPGSKASASPLRGWGGGSSHYSLKREEIFRQRRNLDDENNFFSRKSFQELGCSDYMIESLRNQHFVRPSHIQAMTFGPIIAGKSCIISDQSGSGKTLAYLLPLIQRLRQEELQGLSKPSSQSPRVVVLAPTAELASQV MLSAPRVSPPAVAVAAPARFKFPNVCVNPVNLLLLHRNVGSSCKRVVVSTKAAYSRMPMDTPGAYQLIDKESGDKFIIWGGTEDDDSSIPSKEVLSWKPLASTPXXXXXXXXXXXXXDEASTRGLTGNFGRLKFRRMRDLVRKSYTKNKERDVIDHNKHNIADASSRSSFSSYNEPDQLKEQQTLSLPRGRAKIQQLDDKKNFQKLIRVEDEDRGIAIENVSKHFAGYSIDSHAQSARVVHPGSKASASPLRGWGGGSSHYSLKRDEIFRERQNLGDENNFFSRKSFQELGCSDYMIESLRNQHFVRPSHIQAMTFGPIIAGKSCIISDQSGSGKTLAYLLPLIQRLRQEELQGLSKPSSQSPRVVVLAPTAELASQVLSTCRSFSKSGVPFHSMVVTGGFCQRTQLENLRQELDILIATPGRFMFLIKEGYLQLTNLKCAVLDEVDILFSDEDFETAFQCLINSSPITTQYLFVTATLPMDIYNKLVESFPDCELVSGPGMHRTSPGLEEFLVDCSGDETAEKSPDTAFINKKNALLHLVEDSPVPKTIVFCNKIDSCRKVENALKRFDRKGFSIKILPFHAALDQRRRLANMEEFRRSKMENVSLFLVCTDRASRGIDFEGVDHVVLFDYPRDPSEYVRRVGRTARGAGGKGKAFIFAVGKQVSLARRIMERNKKGHPVHDVPSILT* MLSAPRAPPPAVAVAAPARFKFQNVCGNPVNLLLLHRNVGSSCKRVVVSTKAAYSRMPMDTPGAYQLIDKESGDKFIVWGGTEDDDSSIPSKEVLSWKPLASTSPDNNHPPPTQSSSNEASTRGLTGNFGRLKFRRMRDLVRKSYTKNKERDVIDHDKHNTTDASSRSSFSSYNEPGQLKEQQTLSLPRGRAKIQQLEDRKNSQKLIRVEDEDRDIAIENVSKHFAGYSSDSHAHSARVVHPGSKASASPLRGWGGGSSHYSLKREEIFRQRRNLDDENNFFSRKSFQELGCSDYMIESLRNQHFVRPSHIQAMTFGPIIAGKSCIISDQSGSGKTLAYLLPLIQRLRQEELQGLSKPSSQSPRVVVLAPTAELASQVCQISSSIKGTFATYSPYCSATTHTKRKK 667.5 1721 378 91.0 344 34 352 0 0 93.1 0 
6VASP14PQ3VG50IV25PSXPXDXNXNXHXPXPXPXTXQXSXSXSDN38ND3ITAT14DG20DE1KR2FS11GD14IS4QH30DE4EQ1QR2GD102 91.0 8.5e-191 54.8
The file above contains similar IDs:
gene.9403.0.4.p1
gene.9403.0.5.p1
gene.9403.9.5.p1
As you can see, these IDs share only the gene.9403 part. The remaining columns of the gene.9403 rows are identical, so I would like to remove the duplicate entries.
I used this command:
awk -F"\t" '!seen[$2, $3, $4, $5, $6, $7,$8, $9,$10,$11,$12, $13,$14,$15,$16,$17,$18,$19,$20,$21,$22,$23,$24,$25,$26,$27,$28,$29,$30,$31]++' select-results2.txt
and got the correct result for the example above:
chr11_pilon3.g3568.t1 transcript:OIT01734 transcript:OIT01734 1.1e-107 389.8 1000 218 992 1 216 130 345 MDALTRHIQGDVPWCMLFADDIILIDETRAGVSERLEIWRQTLESKGFKISRSKTEYLECKFGDEPSGVGREVMLGSQAIAKRDSVRYLGSVIQGDGEIDGDVTHRIGAGWSKWRLASGVLCDKKIPHKLKGKFFRAMVRPAMFYEAECWPVKNSHIQRMKVAEMRMLRWMCGHTRLDKIKNEVIRQKVGVAPVDKKMGEARLRWFGHVRRRGPDA MDALTRHIQGDVPWCMLFADDIVLIDETRVGVNERLEVWRQTLESKGFKLSRSKTEYLECKFSAESSEVGRDVKLGSQVIAKRDSFRYLGSVIQGEGEIDGDVTHRIGAGWSKWRLASGVLCDKKVPQKLKGKFYRAVVRPAMLYGAECWPVKNSHVQRMKVAEMRMLRWMRGLTRLDRIRNEVIREKVGVALVDEKMREARLRWYGHVRRRRPDA MDALTRHIQGDVPWCMLFADDIILIDETRAGVSERLEIWRQTLESKGFKISRSKTEYLECKFGDEPSGVGREVMLGSQAIAKRDSVRYLGSVIQGDGEIDGDVTHRIGAGWSKWRLASGVLCDKKIPHKLKGKFFRAMVRPAMFYEAECWPVKNSHIQRMKVAEMRMLRWMCGHTRLDKIKNEVIRQKVGVAPVDKKMGEARLRWFGHVRRRGPDAR* MKVWERVVEARVREMTSISVNQFGFMPGRSTTEAIHLVRRLVEHFRDKKKDLHMVFIDLENAYDKVPREVLWRCLEAKSVPEAYIRVIKDMYDGAKTRVRTVGGDSDHFPVVMGLHQGSALSPLLFALVMDALTRHIQGDVPWCMLFADDIVLIDETRVGVNERLEVWRQTLESKGFKLSRSKTEYLECKFSAESSEVGRDVKLGSQVIAKRDSFRYLGSVIQGEGEIDGDVTHRIGAGWSKWRLASGVLCDKKVPQKLKGKFYRAVVRPAMLYGAECWPVKNSHVQRMKVAEMRMLRWMRGLTRLDRIRNEVIREKVGVALVDEKMREARLRWYGHVRRRRPDAPVRIYKSAILGHLNSHGSQNALAGPVEAEENRQKTKKEVMEEIIQKSKFFKAQKAKDREENDELTEQLDKDFTSLVESKALLSLTQPDKINALKALVNKNISVGNVKKDEVADVPRKASIGKEKPDTYEMLVSEMALDMRARPSDRTKTPEEIAQEEKERLELLEQEXXXXXXXXXXXXXXDGNASDDNSKLVKDPRTVSGDDLGDDLEEVPRTKLGWIGEILRRKENELESEDAASSGDSDDGEDEGXXXXXXXXXXXXXXXXXXXXDEEQGKTQTIKDWEQSDDDIIDTELEDDDEGFGDDAKKVVKIKDHKEENLSITVAAENKKKMQVFYGVLLQYFAVLANKKPLNSKLLNLLVKPLMEMSAVSPYFAAICARQRLQRTRAQFCEDLKNTGKSSWPSLKTIFLLRLWSMIFPCSDFRHCVMTPAILLMCEYLMRCTIISGRDIAIASFLCSLLLSVIKQSQKFCPEAIVFIQTLLMAALDRKQRSNSQLDNLMEIKELGPLLCIRSSKVEMDSLDFLTLMDLPEDSQYFHSDNYRTSMLVTVLETLQGFVNVYKELISFPEIFMLISKLLCKMAGENHIPDALREKIKDVSQLIDTKAQEHHMLRQPLKMRKKKPVPIRMLNPKFEENFVKGRDYDPDRERA 389.8 1000 216 85.6 185 31 200 0 0 92.6 0 22IV6AV2SN4IV11IL12GSDA1PS1GE3ED1MK4AV6VF9DE29IV1HQ6FY2MV5FL1EG10IV14CR1HL4KR1KR5QE5PL2KE2GR6FY6GR3 85.6 1.1e-107 99.1
gene.9403.0.4.p1 transcript:OIT35479 transcript:OIT35479 8.5e-191 667.5 1721 690 406 1 378 1 378 MLSAPRVSPPAVAVAAPARFKFPNVCVNPVNLLLLHRNVGSSCKRVVVSTKAAYSRMPMDTPGAYQLIDKESGDKFIIWGGTEDDDSSIPSKEVLSWKPLASTPXXXXXXXXXXXXXDEASTRGLTGNFGRLKFRRMRDLVRKSYTKNKERDVIDHNKHNIADASSRSSFSSYNEPDQLKEQQTLSLPRGRAKIQQLDDKKNFQKLIRVEDEDRGIAIENVSKHFAGYSIDSHAQSARVVHPGSKASASPLRGWGGGSSHYSLKRDEIFRERQNLGDENNFFSRKSFQELGCSDYMIESLRNQHFVRPSHIQAMTFGPIIAGKSCIISDQSGSGKTLAYLLPLIQRLRQEELQGLSKPSSQSPRVVVLAPTAELASQV MLSAPRAPPPAVAVAAPARFKFQNVCGNPVNLLLLHRNVGSSCKRVVVSTKAAYSRMPMDTPGAYQLIDKESGDKFIVWGGTEDDDSSIPSKEVLSWKPLASTSPDNNHPPPTQSSSNEASTRGLTGNFGRLKFRRMRDLVRKSYTKNKERDVIDHDKHNTTDASSRSSFSSYNEPGQLKEQQTLSLPRGRAKIQQLEDRKNSQKLIRVEDEDRDIAIENVSKHFAGYSSDSHAHSARVVHPGSKASASPLRGWGGGSSHYSLKREEIFRQRRNLDDENNFFSRKSFQELGCSDYMIESLRNQHFVRPSHIQAMTFGPIIAGKSCIISDQSGSGKTLAYLLPLIQRLRQEELQGLSKPSSQSPRVVVLAPTAELASQV MLSAPRVSPPAVAVAAPARFKFPNVCVNPVNLLLLHRNVGSSCKRVVVSTKAAYSRMPMDTPGAYQLIDKESGDKFIIWGGTEDDDSSIPSKEVLSWKPLASTPXXXXXXXXXXXXXDEASTRGLTGNFGRLKFRRMRDLVRKSYTKNKERDVIDHNKHNIADASSRSSFSSYNEPDQLKEQQTLSLPRGRAKIQQLDDKKNFQKLIRVEDEDRGIAIENVSKHFAGYSIDSHAQSARVVHPGSKASASPLRGWGGGSSHYSLKRDEIFRERQNLGDENNFFSRKSFQELGCSDYMIESLRNQHFVRPSHIQAMTFGPIIAGKSCIISDQSGSGKTLAYLLPLIQRLRQEELQGLSKPSSQSPRVVVLAPTAELASQVLSTCRSFSKSGVPFHSMVVTGGFCQRTQLENLRQELDILIATPGRFMFLIKEGYLQLTNLKCAVLDEVDILFSDEDFETAFQCLINSSPITTQYLFVTATLPMDIYNKLVESFPDCELVSGPGMHRTSPGLEEFLVDCSGDETAEKSPDTAFINKKNALLHLVEDSPVPKTIVFCNKIDSCRKVENALKRFDRKGFSIKILPFHAALDQRRRLANMEEFRRSKMENVSLFLVCTDRASRGIDFEGVDHVVLFDYPRDPSEYVRRVGRTARGAGGKGKAFIFAVGKQVSLARRIMERNKKGHPVHDVPSILT* MLSAPRAPPPAVAVAAPARFKFQNVCGNPVNLLLLHRNVGSSCKRVVVSTKAAYSRMPMDTPGAYQLIDKESGDKFIVWGGTEDDDSSIPSKEVLSWKPLASTSPDNNHPPPTQSSSNEASTRGLTGNFGRLKFRRMRDLVRKSYTKNKERDVIDHDKHNTTDASSRSSFSSYNEPGQLKEQQTLSLPRGRAKIQQLEDRKNSQKLIRVEDEDRDIAIENVSKHFAGYSSDSHAHSARVVHPGSKASASPLRGWGGGSSHYSLKREEIFRQRRNLDDENNFFSRKSFQELGCSDYMIESLRNQHFVRPSHIQAMTFGPIIAGKSCIISDQSGSGKTLAYLLPLIQRLRQEELQGLSKPSSQSPRVVVLAPTAELASQVCQISSSIKGTFATYSPYCSATTHTKRKK 667.5 1721 378 91.0 344 34 352 0 0 93.1 0 
6VASP14PQ3VG50IV25PSXPXDXNXNXHXPXPXPXTXQXSXSXSDN38ND3ITAT14DG20DE1KR2FS11GD14IS4QH30DE4EQ1QR2GD102 91.0 8.5e-191 54.8
gene.69001.9.9.p1 NisylKD955766g0010.1 NisylKD955766g0010.1 1.4e-294 1011.9 2615 531 530 1 530 1 530 MKEMCLAVAPLPFRLGNNLIFHNPLSIGSSSHMDVTRLNSMGGTTTSLYAESAEKDLSDTVSSSRSEGVPLLHMISENESNNWISGDAVVRESEDDEILSLDGDQMSCSLSVVSDSSSLCGDDFIGFEVASEIFGQNFVDAEKSICSVELIAKPGDLVESGVEDDNVSKPFAVKIEEQITDGSSSKSSQVVVQLPLNKGLSAAVSRSVFEVDYIPLWGFTSVCGRRPEMEDALATVPRFLRIPLQMLVGHRVPDGVSRCLSHLTAHFFGVYDGHGGSQVANYCRDRVHAVLAEELEKFMANLNDESIRQNCQEQWKKAFTNCFLMVDDEVGGTGNHEAVAAETVGSTAVVAIVCSSHIIVANCGDSRAVLCRGKEPTALSVDHKPNREDEYARIEAAGGKVIQWNGHRVFGVLAMSRSIGDRYLKPWIIPDPEVMFIPRTKDDECLILASDGLWDVMSNEEACELARKRILLWHKKNGVTLTLERGQGIDPAAQAAAECLSNRAIQKGSKDNITVIVVDLKAQRKFKSKT MKEMCLAVAPLPFRLGNNLIFRNPPSIGSSSHMDATRLNSMGDTTTSLYAESAEKDLSDTVSSSRSEGVPLLPMISENDRNNWIAGDAVVRESEDDEILSLDGDQVSCSLSVVSDSSSLCGDDFIGFEVASDIYGQNFVDAEKSICSVELIAKPGDLVESGVEDDNVSKPFAVKLEEQITDGSSSKSSQVVVQLPLNKGLSAAVSRSVFEVDYIPLWGFTSVCGRRPEMEDALATVPRFLRIPLQMLVGDRVPDGVSRCLSHLTAHFFGVYDGHGGSQVANYCRDRVHAVLAEELEKFMANLNDESIRQNCQDQWKKAFTNCFLKVDDEVGGTGNREAVAAETVGSTAVVAIVCSSHIIVANCGDSRAVLCRGKEPMALSVDHKPNREDEYARIEAAGGKVIQWNGHRVFGVLAMSRSIGDRYLKPWIIPDPEVMFIPRTKDDECLILASDGLWDVMSNEEACELARKRILLWHKKNGVTLTLERGQGIDPAAQAAAECLSNRATQKGSKDNITVIVVDLKAQRKFKSKT MKEMCLAVAPLPFRLGNNLIFHNPLSIGSSSHMDVTRLNSMGGTTTSLYAESAEKDLSDTVSSSRSEGVPLLHMISENESNNWISGDAVVRESEDDEILSLDGDQMSCSLSVVSDSSSLCGDDFIGFEVASEIFGQNFVDAEKSICSVELIAKPGDLVESGVEDDNVSKPFAVKIEEQITDGSSSKSSQVVVQLPLNKGLSAAVSRSVFEVDYIPLWGFTSVCGRRPEMEDALATVPRFLRIPLQMLVGHRVPDGVSRCLSHLTAHFFGVYDGHGGSQVANYCRDRVHAVLAEELEKFMANLNDESIRQNCQEQWKKAFTNCFLMVDDEVGGTGNHEAVAAETVGSTAVVAIVCSSHIIVANCGDSRAVLCRGKEPTALSVDHKPNREDEYARIEAAGGKVIQWNGHRVFGVLAMSRSIGDRYLKPWIIPDPEVMFIPRTKDDECLILASDGLWDVMSNEEACELARKRILLWHKKNGVTLTLERGQGIDPAAQAAAECLSNRAIQKGSKDNITVIVVDLKAQRKFKSKT* 
MKEMCLAVAPLPFRLGNNLIFRNPPSIGSSSHMDATRLNSMGDTTTSLYAESAEKDLSDTVSSSRSEGVPLLPMISENDRNNWIAGDAVVRESEDDEILSLDGDQVSCSLSVVSDSSSLCGDDFIGFEVASDIYGQNFVDAEKSICSVELIAKPGDLVESGVEDDNVSKPFAVKLEEQITDGSSSKSSQVVVQLPLNKGLSAAVSRSVFEVDYIPLWGFTSVCGRRPEMEDALATVPRFLRIPLQMLVGDRVPDGVSRCLSHLTAHFFGVYDGHGGSQVANYCRDRVHAVLAEELEKFMANLNDESIRQNCQDQWKKAFTNCFLKVDDEVGGTGNREAVAAETVGSTAVVAIVCSSHIIVANCGDSRAVLCRGKEPMALSVDHKPNREDEYARIEAAGGKVIQWNGHRVFGVLAMSRSIGDRYLKPWIIPDPEVMFIPRTKDDECLILASDGLWDVMSNEEACELARKRILLWHKKNGVTLTLERGQGIDPAAQAAAECLSNRATQKGSKDNITVIVVDLKAQRKFKSKT 1011.9 2615 530 96.6 512 18 519 0 0 97.9 0 21HR2LP9VA7GD29HP5EDSR4SA20MV25ED1FY40IL74HD62ED11MK10HR40TM127IT25 96.6 1.4e-294 99.8
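However, I am worried that, because this command does not take the gene.9403 part into account, it might delete the wrong rows. Is there a way to take the first column into account as well?
Thank you.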
Answer 1
Try this:
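awk '
# Shorten the first field to its first two dot-separated parts (gene.9403.0.4.p1 -> gene.9403); gensub() needs GNU awk
{line = gensub(/^([^.]+\.[^.]+)[^[:blank:]]*/, "\\1", 1, $0)}
# Print a line only the first time its shortened form is seen
!seen[line]++
' file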
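If GNU awk is not available, the same idea can probably be written in portable awk without gensub(). The following is only a sketch under some assumptions: the shared part of the ID is taken to be the first two dot-separated components of column 1, every record sits on a single line, and the fields are separated by whitespace (tabs or spaces). The function name dedup_by_gene is made up for this example.

```shell
# Hypothetical helper: the name dedup_by_gene is invented for this sketch.
dedup_by_gene() {
  awk '
  {
    # Split the first field on dots and keep only the first two parts,
    # e.g. gene.9403.0.4.p1 -> gene.9403
    n = split($1, part, ".")
    key = (n >= 2) ? part[1] "." part[2] : $1
    # Append every remaining field, so rows collapse only when the
    # shortened ID and all other columns agree.
    for (i = 2; i <= NF; i++) key = key SUBSEP $i
  }
  # Print a row only the first time its key is seen.
  !seen[key]++
  ' "$@"
}
```

It could then be called as `dedup_by_gene select-results2.txt`. Because the shortened ID is part of the key, two rows with identical remaining columns but different gene numbers are both kept, which is the case the question was worried about.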