dc.contributor.author | Zhou, Naihui | |
dc.contributor.author | Jiang, Yuxiang | |
dc.contributor.author | Bergquist, Timothy R. | |
dc.contributor.author | Lee, Alexandra J. | |
dc.contributor.author | Kacsoh, Balint Z. | |
dc.contributor.author | Crocker, Alex W. | |
dc.contributor.author | Friedberg, Iddo | |
dc.contributor.author | Rifaioğlu, Ahmet Süreyya | en_US |
dc.date.accessioned | 2020-05-24T15:31:57Z | |
dc.date.available | 2020-05-24T15:31:57Z | |
dc.date.issued | 2019 | |
dc.identifier.citation | Zhou N, Jiang Y, Bergquist TR, et al. The CAFA challenge reports improved protein function prediction and new functional annotations for hundreds of genes through experimental screens. Genome Biol. 2019;20(1):244. Published 2019 Nov 19. doi:10.1186/s13059-019-1835-8 | en_US |
dc.identifier.issn | 1474-760X | |
dc.identifier.uri | https://doi.org/10.1186/s13059-019-1835-8 | |
dc.identifier.uri | https://hdl.handle.net/20.500.12508/1163 | |
dc.description | Tosatto, Silvio/0000-0003-4525-7793; Zhang, Feng/0000-0003-3447-897X; Gonzalez, Jose Maria Fernandez/0000-0002-4806-5140; Devignes, Marie-Dominique/0000-0002-0399-8713; Wass, Mark/0000-0001-5428-6479; Falda, Marco/0000-0003-2642-519X; Thurlby, Natalie/0000-0002-1007-0286; Zosa, Elaine/0000-0003-2482-0663; Dessimoz, Christophe/0000-0002-2170-853X; Yunes, Jeffrey/0000-0003-1869-3231; Hamid, Md Nafiz/0000-0001-8681-6526; Hoehndorf, Robert/0000-0001-8149-5890; Dogan, Tunca/0000-0002-1298-9763; Notaro, Marco/0000-0003-4309-2200; Cozzetto, Domenico/0000-0001-6752-5432; Lewis, Kimberley/0000-0003-3010-8453; Roche, Daniel/0000-0002-9204-1840; Martin, Maria-Jesus/0000-0001-5454-2815; Tress, Michael/0000-0001-9046-6370; Tolvanen, Martti/0000-0003-3434-7646; Cheng, Jianlin/0000-0003-0305-2853; Rose, Peter/0000-0001-9981-9750; Renaux, Alexandre/0000-0002-4339-2791; Kacsoh, Balint/0000-0001-9171-0611; O'Donovan, Claire/0000-0001-8051-7429; Kulmanov, Maxat/0000-0003-1710-1820; Friedberg, Iddo/0000-0002-1789-8000; Zhou, Naihui/0000-0001-6268-6149 | en_US |
dc.description | WOS: 000498615000001 | en_US |
dc.description | PubMed ID: 31744546 | en_US |
dc.description.abstract | Background: The Critical Assessment of Functional Annotation (CAFA) is an ongoing, global, community-driven effort to evaluate and improve the computational annotation of protein function. Results: Here, we report on the results of the third CAFA challenge, CAFA3, that featured an expanded analysis over the previous CAFA rounds, both in terms of volume of data analyzed and the types of analysis performed. In a novel and major new development, computational predictions and assessment goals drove some of the experimental assays, resulting in new functional annotations for more than 1000 genes. Specifically, we performed experimental whole-genome mutation screening in Candida albicans and Pseudomonas aeruginosa genomes, which provided us with genome-wide experimental data for genes associated with biofilm formation and motility. We further performed targeted assays on selected genes in Drosophila melanogaster, which we suspected of being involved in long-term memory. Conclusion: We conclude that while predictions of the molecular function and biological process annotations have slightly improved over time, those of the cellular component have not. Term-centric prediction of experimental annotations remains equally challenging; although the performance of the top methods is significantly better than the expectations set by baseline methods in C. albicans and D. melanogaster, it leaves considerable room and need for improvement. Finally, we report that the CAFA community now involves a broad range of participants with expertise in bioinformatics, biological experimentation, biocuration, and bio-ontologies, working together to improve functional annotation, computational function prediction, and our ability to manage big data in the era of large experimental screens. | en_US |
dc.description.sponsorship | National Science Foundation (NSF) [DBI1564756, DBI-1458359, DBI-1458390, DMS1614777, CMMI1825941, NSF 1458390]; Gordon and Betty Moore Foundation [GBMF 4552]; National Institutes of Health NIGMS / United States Department of Health & Human Services / National Institutes of Health (NIH) - USA / NIH National Institute of General Medical Sciences (NIGMS) [P20 GM113132]; Cystic Fibrosis Foundation [CFRDP STANTO19R0]; Biotechnology and Biological Sciences Research Council (BBSRC) [BB/K004131/1, BB/F00964X/1, BB/M025047/1, BB/M015009/1]; Consejo Nacional de Ciencia y Tecnologia Paraguay (CONACyT) [14-INV-088, PINV15-315]; National Science Foundation (NSF) [1660648, DBI 1759934, IIS1763246, DBI-1458477, 0965768, DMR-1420073, DBI-1458443]; NIH / United States Department of Health & Human Services / National Institutes of Health (NIH) - USA [R01GM093123, DP1MH110234, UL1 TR002319, U24 TR002306]; Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under Germany's Excellence Strategy - EXC 2155 "RESIST" [39087428]; National Institutes of Health / United States Department of Health & Human Services / National Institutes of Health (NIH) - USA [R01GM123055, R01GM60595, R15GM120650, GM083107, GM116960, AI134678, NIH R35-GM128637, R00-GM097033]; European Research Council (ERC) [StG 757700]; Spanish Ministry of Science, Innovation and Universities [BFU2017-89833-P]; Severo Ochoa award; Centre of Excellence project "BioProspecting of Adriatic Sea"; Croatian Government; European Regional Development Fund / European Union (EU) [KK.01.1.1.01.0002]; ATT Tieto kayttoon grant; Academy of Finland; University of Turku; CSC-IT Center for Science Ltd.; University of Miami; National Cancer Institute of the National Institutes of Health / United States Department of Health & Human Services / National Institutes of Health (NIH) - USA / NIH National Cancer Institute (NCI) [U01CA198942]; Helsinki Institute for Life Sciences; Academy of Finland [292589]; National Natural Science Foundation of China [31671367, 31471245, 91631301, 61872094, 61572139]; National Key Research and Development Program of China [2016YFC1000505, 2017YFC0908402]; Italian Ministry of Education, University and Research (MIUR) PRIN 2017 project / Ministry of Education, Universities and Research (MIUR) [2017483NH8]; Shanghai Municipal Science and Technology Major Project [2017SHZDZX01, 2018SHZDZX01]; UK Biotechnology and Biological Sciences Research Council (BBSRC) [BB/N019431/1, BB/L020505/1, BB/L002817/1]; Elsevier; Extreme Science and Engineering Discovery Environment (XSEDE) award [MCB160101, MCB160124]; Ministry of Education, Science and Technological Development of the Republic of Serbia [173001]; Taiwan Ministry of Science and Technology [106-2221-E-004-011-MY2]; Montana State University; Bavarian Ministry for Education; Simons Foundation; NIH NINDS / United States Department of Health & Human Services / National Institutes of Health (NIH) - USA / NIH National Institute of Neurological Disorders & Stroke (NINDS) [1R21NS103831-01]; University of Illinois at Chicago (UIC) Cancer Center award; UIC College of Liberal Arts and Sciences Faculty Award; UIC International Development Award; Yad Hanadiv [9660/2019]; National Institute of General Medical Science of the National Institute of Health [GM066099, GM079656]; Research Supporting Plan (PSR) of University of Milan [PSR2018-DIP-010-MFRAS]; Swiss National Science Foundation (SNSF) [150654]; EMBL-European Bioinformatics Institute core funds; CAFA BBSRC [BB/N004876/1]; European Union's Horizon 2020 research and innovation program under the Marie Sklodowska-Curie grant / European Union (EU) [778247]; COST Action / European Cooperation in Science and Technology (COST) [BM1405]; NIH/NIGMS / United States Department of Health & Human Services / National Institutes of Health (NIH) - USA / NIH National Institute of General Medical Sciences (NIGMS) [R01 GM071749]; National Human Genome Research Institute of the National Institutes of Health [U41 HG007234]; INB Grant (ISCIII-SGEFI/ERDF) [PT17/0009/0001]; Turkiye Bilimsel ve Teknolojik Arastirma Kurumu (TUBITAK) [EEEAG-116E930]; KanSil [2016K121540]; Universita degli Studi di Milano; Ministry of Education, China - 111 Project [B18015]; key project of Shanghai Science Technology [16JC1420402]; ZJLab; project Ribes Network POR-FESR 3S4H [TOPP-ALFREVE18-01]; PRID/SID of University of Padova [TOPP-SID19-01]; NIGMS / United States Department of Health & Human Services / National Institutes of Health (NIH) - USA / NIH National Institute of General Medical Sciences (NIGMS) [R15GM120650]; King Abdullah University of Science and Technology (KAUST) Office of Sponsored Research (OSR) [URF/1/3454-01-01, URF/1/3790-01-01]; "the Human Project from Mind, Brain and Learning" of the NCCU Higher Education Sprout Project by the Taiwan Ministry of Education; National Center for High-performance Computing / Istanbul Technical University | en_US |
dc.description.sponsorship | The work of IF was funded, in part, by the National Science Foundation award DBI-1458359. The work of CSG and AJL was funded, in part, by the National Science Foundation award DBI-1458390 and GBMF 4552 from the Gordon and Betty Moore Foundation. The work of DAH and KAL was funded, in part, by the National Science Foundation award DBI-1458390, National Institutes of Health NIGMS P20 GM113132, and the Cystic Fibrosis Foundation CFRDP STANTO19R0. The work of AP, HY, AR, and MT was funded by BBSRC grants BB/K004131/1, BB/F00964X/1 and BB/M025047/1, Consejo Nacional de Ciencia y Tecnologia Paraguay (CONACyT) grants 14-INV-088 and PINV15-315, and NSF Advances in BioInformatics grant 1660648. The work of JC was partially supported by an NIH grant (R01GM093123) and two NSF grants (DBI 1759934 and IIS1763246). ACM acknowledges the support by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under Germany's Excellence Strategy - EXC 2155 "RESIST" - Project ID 39087428. DK acknowledges the support from the National Institutes of Health (R01GM123055) and the National Science Foundation (DMS1614777, CMMI1825941). PB acknowledges the support from the National Institutes of Health (R01GM60595). GB and BZK acknowledge the support from the National Science Foundation (NSF 1458390) and NIH DP1MH110234. FS was funded by the ERC StG 757700 "HYPER-INSIGHT" and by the Spanish Ministry of Science, Innovation and Universities grant BFU2017-89833-P. FS further acknowledges the funding from the Severo Ochoa award to the IRB Barcelona. TS was funded by the Centre of Excellence project "BioProspecting of Adriatic Sea", co-financed by the Croatian Government and the European Regional Development Fund (KK.01.1.1.01.0002). The work of SK was funded by ATT Tieto kayttoon grant and Academy of Finland. JB and HM acknowledge the support of the University of Turku, the Academy of Finland and CSC - IT Center for Science Ltd. 
TB and SM were funded by the NIH awards UL1 TR002319 and U24 TR002306. The work of CZ and ZW was funded by the National Institutes of Health R15GM120650 to ZW and start-up funding from the University of Miami to ZW. The work of PWR was supported by the National Cancer Institute of the National Institutes of Health under Award Number U01CA198942. PR acknowledges NSF grant DBI-1458477. PT acknowledges the support from Helsinki Institute for Life Sciences. The work of AJM was funded by the Academy of Finland (No. 292589). The work of FZ and WT was funded by the National Natural Science Foundation of China (31671367, 31471245, 91631301) and the National Key Research and Development Program of China (2016YFC1000505, 2017YFC0908402). CS acknowledges the support by the Italian Ministry of Education, University and Research (MIUR) PRIN 2017 project 2017483NH8. SZ is supported by the National Natural Science Foundation of China (No. 61872094 and No. 61572139) and Shanghai Municipal Science and Technology Major Project (No. 2017SHZDZX01). PLF and RLH were supported by the National Institutes of Health NIH R35-GM128637 and R00-GM097033. JG, DTJ, CW, DC, and RF were supported by the UK Biotechnology and Biological Sciences Research Council (BB/N019431/1, BB/L020505/1, and BB/L002817/1) and Elsevier. The work of YZ and CZ was funded in part by the National Institutes of Health awards GM083107, GM116960, and AI134678; the National Science Foundation award DBI1564756; and the Extreme Science and Engineering Discovery Environment (XSEDE) awards MCB160101 and MCB160124. The work of BG, VP, RD, NS, and NV was funded by the Ministry of Education, Science and Technological Development of the Republic of Serbia, Project No. 173001. The work of YWL, WHL, and JMC was funded by the Taiwan Ministry of Science and Technology (106-2221-E-004-011-MY2). 
YWL, WHL, and JMC further acknowledge the support from "the Human Project from Mind, Brain and Learning" of the NCCU Higher Education Sprout Project by the Taiwan Ministry of Education and the National Center for High-performance Computing for computer time and facilities. The work of IK and AB was funded by Montana State University and NSF Advances in Biological Informatics program through grant number 0965768. BR, TG, and JR are supported by the Bavarian Ministry for Education through funding to the TUM. The work of RB, VG, MB, and DCEK was supported by the Simons Foundation, NIH NINDS grant number 1R21NS103831-01 and NSF award number DMR-1420073. CJJ acknowledges the funding from a University of Illinois at Chicago (UIC) Cancer Center award, a UIC College of Liberal Arts and Sciences Faculty Award, and a UIC International Development Award. The work of ML was funded by Yad Hanadiv (grant number 9660/2019). The work of OL and IN was funded by the National Institute of General Medical Sciences of the National Institutes of Health through GM066099 and GM079656. Research Supporting Plan (PSR) of University of Milan number PSR2018-DIP-010-MFRAS. AWV acknowledges the funding from the BBSRC (CASE studentship BB/M015009/1). CD acknowledges the support from the Swiss National Science Foundation (150654). CO and MJM are supported by the EMBL-European Bioinformatics Institute core funds and the CAFA BBSRC BB/N004876/1. GG is supported by CAFA BBSRC BB/N004876/1. SCET acknowledges funding from the European Union's Horizon 2020 research and innovation program under the Marie Sklodowska-Curie grant agreement No 778247 (IDPfun) and from COST Action BM1405 (NGP-net). SEB was supported by NIH/NIGMS grant R01 GM071749. The work of MLT, JMR, and JMF was supported by the National Human Genome Research Institute of the National Institutes of Health, grant number U41 HG007234. The work of JMF and JMR was also supported by INB Grant (PT17/0009/0001 - ISCIII-SGEFI/ERDF). 
VA acknowledges the funding from TUBITAK EEEAG-116E930. RCA acknowledges the funding from KanSil 2016K121540. GV acknowledges the funding from Universita degli Studi di Milano - Project "Discovering Patterns in Multi-Dimensional Data" and Project "Machine Learning and Big Data Analysis for Bioinformatics". RY and SY are supported by the 111 Project (NO. B18015), the key project of Shanghai Science & Technology (No. 16JC1420402), Shanghai Municipal Science and Technology Major Project (No. 2018SHZDZX01), and ZJLab. ST was supported by project Ribes Network POR-FESR 3S4H (No. TOPP-ALFREVE18-01) and PRID/SID of University of Padova (No. TOPP-SID19-01). CZ and ZW were supported by the NIGMS grant R15GM120650 to ZW and start-up funding from the University of Miami to ZW. The work of MK and RH was supported by the funding from King Abdullah University of Science and Technology (KAUST) Office of Sponsored Research (OSR) under Award No. URF/1/3454-01-01 and URF/1/3790-01-01. The work of SDM is funded, in part, by NSF award DBI-1458443. | en_US |
dc.language.iso | eng | en_US |
dc.publisher | BMC | en_US |
dc.relation.isversionof | 10.1186/s13059-019-1835-8 | en_US |
dc.rights | info:eu-repo/semantics/openAccess | en_US |
dc.subject | Protein function prediction | en_US |
dc.subject | Long-term memory | en_US |
dc.subject | Biofilm | en_US |
dc.subject | Critical assessment | en_US |
dc.subject | Community challenge | en_US |
dc.subject.classification | Biotechnology & applied microbiology | en_US |
dc.subject.classification | Genetics & heredity | en_US |
dc.subject.classification | Proteins | Genes | Protein functions | en_US |
dc.subject.other | Candida-albicans | en_US |
dc.subject.other | Ontology | en_US |
dc.subject.other | Identification | en_US |
dc.subject.other | Generation | en_US |
dc.subject.other | Library | en_US |
dc.subject.other | Adult | en_US |
dc.subject.other | Article | en_US |
dc.subject.other | Big data | en_US |
dc.subject.other | Biofilm | en_US |
dc.subject.other | Bioinformatics | en_US |
dc.subject.other | Biological ontology | en_US |
dc.subject.other | Candida albicans | en_US |
dc.subject.other | Drosophila melanogaster | en_US |
dc.subject.other | Expectation | en_US |
dc.subject.other | Female | en_US |
dc.subject.other | Human | en_US |
dc.subject.other | Human experiment | en_US |
dc.subject.other | Long term memory | en_US |
dc.subject.other | Male | en_US |
dc.subject.other | Nonhuman | en_US |
dc.subject.other | Plant leaf | en_US |
dc.subject.other | Pseudomonas | en_US |
dc.subject.other | Animal | en_US |
dc.subject.other | Bacterial genome | en_US |
dc.subject.other | Fungal genome | en_US |
dc.subject.other | Genetics | en_US |
dc.subject.other | Locomotion | en_US |
dc.subject.other | Molecular genetics | en_US |
dc.subject.other | Procedures | en_US |
dc.subject.other | Pseudomonas aeruginosa | en_US |
dc.subject.other | Molecular Sequence Annotation | en_US |
dc.title | The CAFA challenge reports improved protein function prediction and new functional annotations for hundreds of genes through experimental screens | en_US |
dc.type | article | en_US |
dc.relation.journal | Genome Biology | en_US |
dc.contributor.department | Faculty of Engineering and Natural Sciences -- Department of Computer Engineering | en_US |
dc.identifier.volume | 20 | en_US |
dc.identifier.issue | 1 | en_US |
dc.relation.publicationcategory | Article - International Refereed Journal - Institutional Faculty Member | en_US |
dc.contributor.isteauthor | Rifaioğlu, Ahmet Süreyya | |
dc.relation.index | Web of Science Core Collection - Science Citation Index Expanded | en_US |