{"id":1608,"date":"2025-11-14T14:26:20","date_gmt":"2025-11-14T14:26:20","guid":{"rendered":"https:\/\/wp.lancs.ac.uk\/factor\/?p=1608"},"modified":"2025-11-15T10:43:56","modified_gmt":"2025-11-15T10:43:56","slug":"vanroose-peer-the-risk-of-data-contamination-in-forensic-voice-identification-a-perception-experiment-using-voice-mixed-speech-samples","status":"publish","type":"post","link":"https:\/\/wp.lancs.ac.uk\/factor\/2025\/11\/14\/vanroose-peer-the-risk-of-data-contamination-in-forensic-voice-identification-a-perception-experiment-using-voice-mixed-speech-samples\/","title":{"rendered":"Vanroose-Peer &#8211; The Risk of Data Contamination in (Forensic) Voice Identification: a Perception Experiment using Voice-mixed Speech Samples"},"content":{"rendered":"<p>FACTOR is pleased to announce our next talk by Scarlett Vanroose-Peer (FACTOR, <a href=\"https:\/\/www.lancaster.ac.uk\/linguistics\">LAEL<\/a>), in partnership with PhonLab:<\/p>\n<blockquote><p><strong>TITLE<\/strong><\/p>\n<p>The Risk of Data Contamination in (Forensic) Voice Identification: a Perception Experiment using Voice-mixed Speech Samples<\/p>\n<p><strong>ABSTRACT<\/strong><\/p>\n<div>Data contamination is a widely acknowledged phenomenon across many forensic disciplines. However, in an apparent single-speaker sample obtained from a multi-speaker recording, how could we establish whether that sample has been \u2018mixed\u2019 with other voices present in that original recording? This talk introduces \u2018voice-mixing\u2019 \u2013 a form of data contamination that specifically impacts voice identification analyses. In doing so, this talk details the design and results of a perception experiment in which 121 lay listeners were exposed to 21 audio samples created using the West Yorkshire Regional English Dataset (WYRED; Gold, Ross &amp; Earnshaw, 2018): 6 were \u2018authentic\u2019 (i.e., non-voice-mixed) and 15 were \u2018voice-mixed\u2019. 
The voice-mixed samples ranged from low- to high-similarity speaker pairs. Using a 5-point Likert scale, listeners reported the extent to which they believed each sample to be voice-mixed\/authentic. The results revealed no significant relationship between actual exposure to a voice-mixed sample and perceiving it as voice-mixed. However, modelling the degree of similarity within the voice-mixed samples revealed a significant relationship: listeners were more likely to rate high-similarity voice-mixed samples as \u2018authentic\u2019 than lower-similarity ones. The results also revealed that several participant-intrinsic and participant-extrinsic factors impacted the perception of voice-mixed versus authentic samples. Taken together, I argue that data contamination, and specifically voice-mixing, poses a very real risk in (forensic) voice identification. In a climate of striving for trust in intelligence operations and the criminal justice system, it is time we start to address these risks.<\/div>\n<p><strong>TIME &amp; PLACE<\/strong><\/p>\n<p>W07, 1500-1550, Tue 18th Nov 2025<\/p>\n<p>In-person: <a href=\"https:\/\/www.lancaster.ac.uk\/itpi\/web\/services\/room\/14710\">County South B89<\/a><\/p>\n<p>Online: <a href=\"https:\/\/teams.microsoft.com\/l\/meetup-join\/19%3ameeting_NzcxYWZmMmYtOTlkMi00OTA4LTkwYjctM2RjNWJkYTYzYjE3%40thread.v2\/0?context=%7b%22Tid%22%3a%229c9bcd11-977a-4e9c-a9a0-bc734090164a%22%2c%22Oid%22%3a%220595e637-9946-47a0-b5a8-8902e7cf9f1c%22%7d\">Teams<\/a><\/p>\n<p>Find information on how to get to campus <a href=\"https:\/\/www.lancaster.ac.uk\/about-us\/maps-and-travel\/\">here<\/a>, and how to navigate campus buildings <a href=\"https:\/\/use.mazemap.com\/#v=1&amp;config=lancaster&amp;campusid=341&amp;zlevel=NaN&amp;center=-2.780714,54.009646&amp;zoom=14.3\">here<\/a>.<\/p><\/blockquote>\n","protected":false},"excerpt":{"rendered":"FACTOR is pleased to announce our next talk by Scarlett Vanroose-Peer 
(FACTOR, LAEL), in partnership with PhonLab: TITLE The Risk of Data Contamination in (Forensic) Voice Identification: a Perception Experiment using Voice-mixed Speech Samples ABSTRACT Data contamination is a widely acknowledged phenomenon across many forensic disciplines. However, in an apparent single-speaker sample obtained from a [&hellip;]","protected":false},"author":77,"featured_media":1613,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"episode_type":"audio","audio_file":"","podmotor_file_id":"","podmotor_episode_id":"","cover_image":"","cover_image_id":"","duration":"","filesize":"","filesize_raw":"","date_recorded":"","explicit":"","block":"","itunes_episode_number":"","itunes_title":"","itunes_season_number":"","itunes_episode_type":"","_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_post_was_ever_published":false},"categories":[1],"tags":[],"series":[],"class_list":["post-1608","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-uncategorized"],"episode_featured_image":"https:\/\/wp.lancs.ac.uk\/factor\/files\/2025\/11\/20251114_1437_Forensic-Voice-Fusion_simple_compose_01ka1cnv5afdc929h7c32w2b29.png","episode_player_image":"https:\/\/i0.wp.com\/wp.lancs.ac.uk\/factor\/files\/2018\/11\/cover_2000.png?fit=2000%2C2000&ssl=1","download_link":"","player_link":"","audio_player":false,"episode_data":{"playerMode":"dark","subscribeUrls":[],"rssFeedUrl":"https:\/\/wp.lancs.ac.uk\/factor\/feed\/podcast\/the-forge","embedCode":"<blockquote class=\"wp-embedded-content\" data-secret=\"cfqxEUQN6I\"><a 
href=\"https:\/\/wp.lancs.ac.uk\/factor\/2025\/11\/14\/vanroose-peer-the-risk-of-data-contamination-in-forensic-voice-identification-a-perception-experiment-using-voice-mixed-speech-samples\/\">Vanroose-Peer &#8211; The Risk of Data Contamination in (Forensic) Voice Identification: a Perception Experiment using Voice-mixed Speech Samples<\/a><\/blockquote><iframe sandbox=\"allow-scripts\" security=\"restricted\" src=\"https:\/\/wp.lancs.ac.uk\/factor\/2025\/11\/14\/vanroose-peer-the-risk-of-data-contamination-in-forensic-voice-identification-a-perception-experiment-using-voice-mixed-speech-samples\/embed\/#?secret=cfqxEUQN6I\" width=\"500\" height=\"350\" title=\"&#8220;Vanroose-Peer &#8211; The Risk of Data Contamination in (Forensic) Voice Identification: a Perception Experiment using Voice-mixed Speech Samples&#8221; &#8212; FACTOR\" data-secret=\"cfqxEUQN6I\" frameborder=\"0\" marginwidth=\"0\" marginheight=\"0\" scrolling=\"no\" class=\"wp-embedded-content\"><\/iframe><script type=\"text\/javascript\">\n\/* <![CDATA[ *\/\n\/*! 
This file is auto-generated *\/\n!function(d,l){\"use strict\";l.querySelector&&d.addEventListener&&\"undefined\"!=typeof URL&&(d.wp=d.wp||{},d.wp.receiveEmbedMessage||(d.wp.receiveEmbedMessage=function(e){var t=e.data;if((t||t.secret||t.message||t.value)&&!\/[^a-zA-Z0-9]\/.test(t.secret)){for(var s,r,n,a=l.querySelectorAll('iframe[data-secret=\"'+t.secret+'\"]'),o=l.querySelectorAll('blockquote[data-secret=\"'+t.secret+'\"]'),c=new RegExp(\"^https?:$\",\"i\"),i=0;i<o.length;i++)o[i].style.display=\"none\";for(i=0;i<a.length;i++)s=a[i],e.source===s.contentWindow&&(s.removeAttribute(\"style\"),\"height\"===t.message?(1e3<(r=parseInt(t.value,10))?r=1e3:~~r<200&&(r=200),s.height=r):\"link\"===t.message&&(r=new URL(s.getAttribute(\"src\")),n=new URL(t.value),c.test(n.protocol))&&n.host===r.host&&l.activeElement===s&&(d.top.location.href=t.value))}},d.addEventListener(\"message\",d.wp.receiveEmbedMessage,!1),l.addEventListener(\"DOMContentLoaded\",function(){for(var e,t,s=l.querySelectorAll(\"iframe.wp-embedded-content\"),r=0;r<s.length;r++)(t=(e=s[r]).getAttribute(\"data-secret\"))||(t=Math.random().toString(36).substring(2,12),e.src+=\"#?secret=\"+t,e.setAttribute(\"data-secret\",t)),e.contentWindow.postMessage({message:\"ready\",secret:t},\"*\")},!1)))}(window,document);\n\/\/# sourceURL=https:\/\/wp.lancs.ac.uk\/factor\/wp-includes\/js\/wp-embed.min.js\n\/* ]]> 
*\/\n<\/script>\n"},"jetpack_featured_media_url":"https:\/\/i0.wp.com\/wp.lancs.ac.uk\/factor\/files\/2025\/11\/20251114_1437_Forensic-Voice-Fusion_simple_compose_01ka1cnv5afdc929h7c32w2b29.png?fit=1024%2C1024&ssl=1","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/pfyJPS-pW","jetpack-related-posts":[],"_links":{"self":[{"href":"https:\/\/wp.lancs.ac.uk\/factor\/wp-json\/wp\/v2\/posts\/1608","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/wp.lancs.ac.uk\/factor\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/wp.lancs.ac.uk\/factor\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/wp.lancs.ac.uk\/factor\/wp-json\/wp\/v2\/users\/77"}],"replies":[{"embeddable":true,"href":"https:\/\/wp.lancs.ac.uk\/factor\/wp-json\/wp\/v2\/comments?post=1608"}],"version-history":[{"count":2,"href":"https:\/\/wp.lancs.ac.uk\/factor\/wp-json\/wp\/v2\/posts\/1608\/revisions"}],"predecessor-version":[{"id":1611,"href":"https:\/\/wp.lancs.ac.uk\/factor\/wp-json\/wp\/v2\/posts\/1608\/revisions\/1611"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/wp.lancs.ac.uk\/factor\/wp-json\/wp\/v2\/media\/1613"}],"wp:attachment":[{"href":"https:\/\/wp.lancs.ac.uk\/factor\/wp-json\/wp\/v2\/media?parent=1608"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/wp.lancs.ac.uk\/factor\/wp-json\/wp\/v2\/categories?post=1608"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/wp.lancs.ac.uk\/factor\/wp-json\/wp\/v2\/tags?post=1608"},{"taxonomy":"series","embeddable":true,"href":"https:\/\/wp.lancs.ac.uk\/factor\/wp-json\/wp\/v2\/series?post=1608"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}