{"id":237,"date":"2024-03-27T10:23:12","date_gmt":"2024-03-27T10:23:12","guid":{"rendered":"https:\/\/wp.lancs.ac.uk\/caiss\/?p=237"},"modified":"2024-03-27T10:23:12","modified_gmt":"2024-03-27T10:23:12","slug":"in-the-artificial-intelligence-ai-science-boom-beware-your-results-are-only-as-good-as-your-data","status":"publish","type":"post","link":"https:\/\/wp.lancs.ac.uk\/caiss\/2024\/03\/27\/in-the-artificial-intelligence-ai-science-boom-beware-your-results-are-only-as-good-as-your-data\/","title":{"rendered":"\u201cIn the Artificial Intelligence (AI) Science boom, beware: your results are only as good as your data.&#8221;"},"content":{"rendered":"\n<p>Hunter Moseley shines a light on how we can make our experimental results more trustworthy.\u00a0 Thorough vetting before and after publication helps ensure that huge, complex data sets are both accurate and valid.\u00a0 We need to question results and papers: just because a paper has been published does not mean it is accurate, or even correct, whatever the author\u2019s credentials.<\/p>\n<p>The key to ensuring the accuracy of these results is reproducibility: careful examination of the data, with peers and other research groups investigating the outcomes.\u00a0 This is vitally important when a data set is used in new applications.\u00a0 Moseley and his colleagues found something unexpected when they investigated some recent research papers.\u00a0 Duplicates appeared in the data sets used in three papers, meaning the data were corrupt.<\/p>\n<p>In machine learning it is usual to split a data set in two, using one subset to train a model and the other to evaluate its performance.\u00a0 With no overlap between the training and testing subsets, performance in the testing phase will reflect how well the model learns and 
performs.\u00a0 However, in their examination they found what they described as a \u201ccatastrophic data leakage\u201d problem: the two subsets were cross-contaminated, undermining the ideal separation.\u00a0 About one quarter of the data set in question was represented more than once, corrupting the cross-validation steps.\u00a0 After cleaning up the data sets and applying the published methods again, the observed performance was far less impressive: the accuracy score dropped from 0.94 to 0.82.\u00a0 A score of 0.94 is reasonably high and \u201cindicates that the algorithm is usable in many scientific applications\u201d; at 0.82 the algorithm is still useful, but with limitations, and then \u201conly if handled appropriately\u201d.<\/p>\n<p><b>So what?<\/b><\/p>\n<p>Studies that are published with flawed results obviously call research into question.\u00a0 If researchers do not make their code and methods fully available, this type of error can go undetected.\u00a0 Reports of high performance may also discourage other researchers from attempting to improve on the results, feeling that \u201ctheir algorithms are lacking in comparison.\u201d\u00a0 Because some journals prefer to publish successful results, this could stall progress: genuine improvements may be dismissed as invalid, or not even worth publishing.<\/p>\n<p><b>Encouraging reproducibility:<\/b><\/p>\n<p>Moseley argues that a measured approach is needed.\u00a0 Where transparency is demonstrated, with data, code and full results made available, thorough evaluation and identification of the problematic data set would allow an author to correct their work. 
Another of his solutions is to retract studies with highly flawed results and little or no support for reproducible research.\u00a0 Scientific reproducibility should not be optional.<\/p>\n<p>Researchers at all levels will need to learn to treat published data with a degree of scepticism; the research community does not want to repeat others\u2019 mistakes.\u00a0 But data sets are complex, especially when using AI.\u00a0 Making these data sets, and the code used to analyse them, available will benefit the original authors, help validate the research and ensure rigour in the research community.<\/p>\n<p>Link to full article in <a href=\"https:\/\/www.nature.com\/articles\/d41586-024-00306-2?utm_source=Live+Audience&amp;utm_campaign=5093011eda-briefing-dy-20240202&amp;utm_medium=email&amp;utm_term=0_b27a691814-5093011eda-51936156\">Nature<\/a>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Hunter Moseley shines a light on how we can make our experimental results more trustworthy.\u00a0 Thorough vetting before and after publication helps ensure that huge, complex data sets are both&hellip; <a href=\"https:\/\/wp.lancs.ac.uk\/caiss\/2024\/03\/27\/in-the-artificial-intelligence-ai-science-boom-beware-your-results-are-only-as-good-as-your-data\/\" class=\"more-link\">Continue reading <span class=\"screen-reader-text\">\u201cIn the Artificial Intelligence (AI) Science boom, beware: your results are only as good as your 
data.&#8221;<\/span><\/a><\/p>\n","protected":false},"author":1669,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[3,4],"tags":[],"class_list":["post-237","post","type-post","status-publish","format-standard","hentry","category-articles","category-litreview","without-featured-image"],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/wp.lancs.ac.uk\/caiss\/wp-json\/wp\/v2\/posts\/237","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/wp.lancs.ac.uk\/caiss\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/wp.lancs.ac.uk\/caiss\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/wp.lancs.ac.uk\/caiss\/wp-json\/wp\/v2\/users\/1669"}],"replies":[{"embeddable":true,"href":"https:\/\/wp.lancs.ac.uk\/caiss\/wp-json\/wp\/v2\/comments?post=237"}],"version-history":[{"count":1,"href":"https:\/\/wp.lancs.ac.uk\/caiss\/wp-json\/wp\/v2\/posts\/237\/revisions"}],"predecessor-version":[{"id":238,"href":"https:\/\/wp.lancs.ac.uk\/caiss\/wp-json\/wp\/v2\/posts\/237\/revisions\/238"}],"wp:attachment":[{"href":"https:\/\/wp.lancs.ac.uk\/caiss\/wp-json\/wp\/v2\/media?parent=237"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/wp.lancs.ac.uk\/caiss\/wp-json\/wp\/v2\/categories?post=237"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/wp.lancs.ac.uk\/caiss\/wp-json\/wp\/v2\/tags?post=237"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}