{"id":1827,"date":"2024-10-19T15:38:49","date_gmt":"2024-10-19T15:38:49","guid":{"rendered":"https:\/\/wp.lancs.ac.uk\/enclair\/?p=1827"},"modified":"2025-09-28T11:36:58","modified_gmt":"2025-09-28T11:36:58","slug":"case-s02e10-bot-or-not","status":"publish","type":"post","link":"https:\/\/wp.lancs.ac.uk\/enclair\/2024\/10\/19\/case-s02e10-bot-or-not\/","title":{"rendered":"Case S02E10 &#8211; Bot or Not?"},"content":{"rendered":"<p><strong>CONTENT RATING:<\/strong> <span style=\"color: #339966\"><strong>universal<\/strong><\/span><\/p>\n<p>Can you tell your real individual from your robot agent? In the ultimate game of Bot or Not, would you stake $26m of your own money on it? Below you will find data, audio credits, further reading, and a transcript of the podcast.<\/p>\n<p><strong><em>This episode was supported by UKRI as part of the annual Festival of Social Science. Each year this Festival celebrates the amazing research and advancements of our best and brightest social scientists. Hundreds of events are running all over the country from the 19<sup>th<\/sup> of October to the 09<sup>th<\/sup> of November, and this year\u2019s theme is \u201cOur Digital Lives\u201d.<\/em><\/strong><\/p>\n<h1>Play the real Bot or Nots<\/h1>\n<ul>\n<li><a href=\"https:\/\/lancasteruni.eu.qualtrics.com\/jfe\/form\/SV_78vpS0sTfeOovK6\">Music Edition<\/a>: can you tell your Billie Eilish from your Billie AI-lish?<\/li>\n<li><a href=\"https:\/\/lancasteruni.eu.qualtrics.com\/jfe\/form\/SV_bCOFSkGXuzd1C9U\">Speech Edition<\/a>: can you tell your doting aunt from your digital agent?<\/li>\n<li><a href=\"https:\/\/lancasteruni.eu.qualtrics.com\/jfe\/form\/SV_77BzuG0OQBlwX7E\">Text Edition<\/a>: can you spot the scam, AI-generated hotel reviews?<\/li>\n<\/ul>\n<p><!--more--><\/p>\n<h1>Audio credits<\/h1>\n<p>Kai Engel &#8211; Machinery<br \/>\nKai Engel &#8211; The Moments of Our Mornings<br \/>\nScott Holmes &#8211; Infrastructure<br \/>\nLee Rosevere &#8211; Puzzle 
Pieces<\/p>\n<h1>Transcript<\/h1>\n<p>Case S02E10: Bot or Not<\/p>\n<p>It\u2019s mid-morning on Monday the 15<sup>th<\/sup> of January, 2024. The sky is clear, and as the temperature steadily rises towards 25\u00b0C (that\u2019s about 77\u00b0F) the air shimmers above six lanes of inner-city traffic. Looking down over the cars, bikes, and hot pavement is a building that dwarfs many of the structures around it, not so much in height, but mainly in sheer footprint. Above ground, three floors \u2013 black, brown, and white \u2013 each sit, one on top of the other, none of them quite flush. Out of sight below ground are a further four basement levels. This is <a href=\"https:\/\/en.wikipedia.org\/wiki\/Festival_Walk#\/media\/File:HK_Festival_Walk_2009.jpg\">Festival Walk<\/a>, a one million square foot shopping centre in Kowloon Tong, close to the heart of Hong Kong. Among its hundreds of shops, it is home to the usual suspects \u2013 Calvin Klein, Juicy Couture, Hollister \u2013 and of course it has a multiplex cinema, an atrium, and an ice rink. It may be huge and architecturally quirky, but it\u2019s also just like many other malls worldwide. And, perhaps most importantly, it\u2019s not really the scene where you might expect a state-of-the-art, multimillion-dollar online fraud to take place.<\/p>\n<p>But Festival Walk isn\u2019t just one of the world\u2019s largest retail real estates. As if to emphasise its overall geometric irregularity, at one end of the building, a further four floors of crisp black glass rise from the mall\u2019s roof \u2013 an extra quarter million square feet of office space available for rent. If you have the credentials to pass through the marble-style entrance lobby and secure smartgates, the acutely angled floors of <a href=\"https:\/\/www.festivalwalk.com.hk\/en\/leasing\/?type=office\">Festival Walk Office Tower<\/a> are tenanted by a range of businesses and users, from little coworking startups to the City University of Hong Kong. 
And at the top rung of this striking commercial ladder, we find <a href=\"https:\/\/www.arup.com\/\">Arup<\/a>.<\/p>\n<p>Who, you might ask, is Arup? It\u2019s a multinational consultancy that turned over \u00a31.9b in revenue (that\u2019s nearly US$2.5b) in 2022 alone, and <em>this <\/em>is where the story of our heist begins.<\/p>\n<h1>Welcome<\/h1>\n<p>Welcome to en clair, an archive of forensic linguistics, literary detection, and language mysteries. You can find case notes about this episode, including credits, acknowledgements, and, far more than usual, many extra links to further reading at the blog. The web address is given at the end of this podcast.<\/p>\n<h1>Modern wonders<\/h1>\n<p>If you\u2019ve never heard of Arup before, don\u2019t worry. It\u2019s not the sort of business the average person has day-to-day dealings with. To simplify a long history into a few words, the Arup acorn was planted in London in 1946 when <a href=\"https:\/\/en.wikipedia.org\/wiki\/Ove_Arup\">Sir Ove Arup<\/a> founded a professional consultancy for the built environment. As the years passed, this provided an increasing number of services ranging from design and engineering to architecture and planning, and opened offices all over the world. 
You may not have heard of Arup, but you\u2019ve probably heard of at least a few of its biggest projects, such as <a href=\"https:\/\/en.wikipedia.org\/wiki\/Apple_Park\">Apple Park<\/a> in North America (that\u2019s the corporate headquarters of Apple), <a href=\"https:\/\/en.wikipedia.org\/wiki\/Aspire_Tower\">Aspire Tower<\/a> in Qatar, <a href=\"https:\/\/en.wikipedia.org\/wiki\/Sydney_Opera_House\">the Sydney Opera House<\/a> in Australia, and in the UK alone, <a href=\"https:\/\/en.wikipedia.org\/wiki\/The_Shard\">the Shard<\/a>, <a href=\"https:\/\/en.wikipedia.org\/wiki\/The_Gherkin\">the Gherkin<\/a>, <a href=\"https:\/\/en.wikipedia.org\/wiki\/London_Eye\">the London Eye<\/a>, and <a href=\"https:\/\/en.wikipedia.org\/wiki\/Millennium_Bridge,_London\">the Millennium Bridge<\/a>. That\u2019s the bridge in <em>Harry Potter and the Half-Blood Prince<\/em> that the Death Eaters spiral around and destroy as they fly away through London with a captive. Arup has been involved in everything from sports stadiums and bus stations to TV studios and coastal seaports. Banks, bridges, arenas, libraries, galleries, sculptures, terminals \u2013 like I said, Arup\u2019s bread and butter tends to be projects at a scale that the average person is simply not involved in.<\/p>\n<p>The inner clockworks of a platinum-tier multinational like Arup might be a mystery to most of us, but for one enterprising outfit, Arup\u2019s Hong Kong office is not only on their radar; they have been researching it for some time now, and they have just clicked send on a handful of emails to Arup employees working in Festival Walk Office Tower. As those emails flit across cyberspace, weather warnings arrive for a possible monsoon, but no one knows that over the coming week, the temperature is about to plunge below freezing, and not just metaphorically\u2026<\/p>\n<h1>The heist<\/h1>\n<p>One of Arup\u2019s employees is frowning at the email. For ease, we\u2019re going to call that employee Jiang. 
For the record, I have no secret information that this is their name, but we need to call them something, and this one means river. Back to Jiang\u2019s email. It\u2019s come all the way from London headquarters. And not from just anyone. It\u2019s Arup\u2019s Global Chief Financial Officer, <a href=\"https:\/\/www.arup.com\/contact-us\/rob-boardman\/\">Rob Boardman<\/a>. He wants an important task doing, but it requires discretion. For a moment, Jiang is doubtful. It sounds just a little bit odd. Has the CFO really just\u2026 no, surely not\u2026 But then comes the clincher: Boardman has scheduled an online video conference. He will be there, himself, personally, to explain and to give instructions.<\/p>\n<p>Any remaining worries are put to rest.<\/p>\n<p>Boardman is going to oversee the matter directly, and who is our little Hong Kong employee to argue with someone of this seniority? Questions could trigger shameful recriminations or even career-ending fury. If Boardman will be on this call, what more proof could anyone want?<\/p>\n<p>Sure enough, Jiang joins the call and Rob Boardman himself is on camera along with several other recognisable, senior Arup figures and a few outsiders. It\u2019s a terse meeting. Jiang is asked to introduce themselves, and accordingly does so, but this isn\u2019t a social occasion. No one is here to chitchat. There is big-money business to be done, and quickly. Jiang is given a series of instructions for making fifteen bank transfers to five different Hong Kong accounts totalling HK$200 million \u2013 that\u2019s about \u00a320 million, or $26 million. And then the call abruptly ends. But though the meeting is over, the communication doesn\u2019t stop. Boardman and the others continue with instructions and commands, sometimes sending emails, sometimes using instant messaging, and sometimes appearing in one-to-one video calls.<\/p>\n<p>But as these missives come in, gradually, the voice of doubt revives itself. 
What if something irregular is happening here? Why do they need this money transferring in such an odd way? Rob Boardman is the CFO \u2013 one of the most powerful people in this massive, global, billion-dollar multinational. Doesn\u2019t he have someone in London to do this? A dozen someones? Couldn\u2019t he even do it himself? Are legitimate instructions to move huge sums into different bank accounts really sent by instant messenger to far less senior people in faraway offices? What if something is afoot\u2026 Will Jiang be the one held responsible? Could they get fired? Might there even be criminal charges?<\/p>\n<p>Finally, the tension and uncertainty become unbearable. The risk is too high. The balance between respectful deference and self-preservation tips. Jiang writes an email to London HQ, takes a breath, hits send. It\u2019s done. For better or worse. Now there\u2019s just the wait. Refreshing the inbox. Praying. Hoping. Fearing. And there it is \u2013 a reply from HQ.<\/p>\n<p>No, Mr Boardman has taken part in no such video calls. No, Mr Boardman definitely has not requested multiple large transfers to different bank accounts. Yes, this has all been an elaborate scam, and an enterprising criminal outfit has just escaped with HK$200 million.<\/p>\n<h1>To err is human<\/h1>\n<p>Everyone has been there, and if you haven\u2019t, I pity you, because your day will surely come. We\u2019ve all done it. That screw up so monumental that we feel like the waters of shame will close over our heads forever and never let us back up for air. I\u2019m not talking about malicious actors, but innocent hardworking people doing their jobs, and something, somehow, slips through the net. In 2024 alone, there was the poor soul who released the <a href=\"https:\/\/en.wikipedia.org\/wiki\/2024_CrowdStrike-related_IT_outages\">CrowdStrike update<\/a> and tanked half the computers around the globe, including grounding flights and shutting hospitals. 
You know that feeling that goes through you when you realise you have screwed up, and screwed up big.<\/p>\n<p>On that basis, I want to give Jiang every possible defence. Is it possible they were a bit daft? Sure. I have days where I\u2019m daft too. But they were also at an immediate disadvantage. It isn\u2019t easy in many cultures, and especially for those with certain histories or personalities or both, to challenge even our equals if we think something suspicious might be afoot. What if they get offended? What if they complain? What if our suspicions are ill-founded and we cause problems at our workplace? We have to be there all day, every day. That problem is compounded many times over if this isn\u2019t just a colleague but one of the most senior people in a huge, international business. How do you safely stop a top boss who probably earns more in a day than you do in a lifetime, and articulate a question like, \u201cI\u2019m sorry but are you real? I think you might be an impostor. Can you prove your identity to me please?\u201d<\/p>\n<p>I mean, what would proof even look like? They already have the right face and the right voice, so what then \u2013 a birth certificate? A passport? At best it might sound like you\u2019re accusing them of looking dishonest, and people have suddenly found themselves and their entire careers in the bin for far less. On paper, superiors <em>should <\/em>reward employees who are ready to ask difficult questions to protect the business, but in practice, far too many managers are insecure and will interpret such behaviour as an attack on themselves. I\u2019m not for one moment suggesting that this is the climate at Arup. I simply don\u2019t know. But Jiang probably wouldn\u2019t have known either. It\u2019s unlikely that an office worker in Hong Kong would have spent enough time \u2013 if indeed any \u2013 with the London-based CFO to learn whether he would tolerate questions like this. 
And this possibility of a fragile ego is the perfect exploit for an enterprising white-collar criminal.<\/p>\n<p>As is so often the case in crimes involving computers, it isn\u2019t necessarily the technological weaknesses that cause the problems. It\u2019s the humans, and their habits, biases, patterns, and flaws. All we need is anxious deference combined with compelling evidence, and we have a near-perfect recipe for crime. And in this case, the evidence was the face and voice of Rob Boardman. Or so it seemed. What is more probable, after all \u2013 that a convincing lookalike <em>and <\/em>soundalike is running a multimillion-dollar scam and they just happen to have picked you of everyone in a giant global organisation? Or\u2026 you\u2019re speaking to the actual CFO? And there were others on the call too \u2013 people that Jiang recognised as Arup figures, and who also spoke. Perhaps a criminal enterprise might track down one really good face-and-voice clone, but amassing multiple is implausible, even with the help of a criminal cameos craigslist. (Yes, it took me several tries to record that.)<\/p>\n<p>Of course, there is a seemingly obvious explanation. Most of us are now familiar with the idea of deepfakes, where artificial intelligence manipulates existing videos or generates entirely new ones. Just one common deepfake genre involves a notable figure like a celebrity or a politician supposedly being caught on camera saying or doing something mind-blowingly offensive when in reality the footage is partly or fully counterfeit. But at the time of writing about this case, these examples have all been designed for passive consumption \u2013 they\u2019re effectively movies for the audience to simply watch. They are not live interactions that need to be responsive to audience feedback. 
At present, generative AI may be convincing at <em>some<\/em> tasks, but when it comes to ordinary conversation, it has quite a spectacular feat to achieve.<\/p>\n<p>To start understanding just where the field is right now, firstly, we need to take a moment to understand a little of the history, and all the layers necessary to convincingly spoof a real person\u2019s voice.<\/p>\n<h1>Fort Vox<\/h1>\n<p>Let\u2019s start with the simplest option, and we\u2019ve already mentioned it. Find a human being who can impersonate you. To do that, they have to either have a great deal of control over their whole vocal tract or, more realistically, they have to have a vocal tract that really isn\u2019t so dissimilar to yours. As it happens, some people have more generic sounding voices and, no surprise, they\u2019re easier to emulate. But being distinctive isn\u2019t an automatic defence against being impersonated. Think how many Elvis Presley, Morgan Freeman, and Barack Obama tribute acts there are out there.<\/p>\n<p>So, what about tech? Well, since the 1920s we\u2019ve had something called the vocoder. Very simplistically, the vocoder was intended to reduce the amount of bandwidth required to transmit voices over the phone, but as an accidental consequence, this had a distorting effect on what came out the other side, and, incidentally, the military was delighted with the way that this process presented opportunities to scramble and thus protect war-time communications. Anyway, take a few zig-zagging steps forward through history and you arrive at voice changers. The clue is in the name. This is technology <em>deliberately<\/em> designed to change the voice. You see these as toys or spy tech or superhero gadgets in plenty of films and TV shows where characters use them to sound like robots or hide their identities or strike fear into the hearts of their enemies or all of the above, thus otherwise complicating the plot a little further than was ever strictly necessary. 
Take another few steps forward, though, and you get from changing your voice in some sort of generic way \u2013 making it lower or robotic or whatever \u2013 to changing it in a very particular way to sound like someone else. Someone specific. Someone who exists.<\/p>\n<p>On paper, that sounds like an instant win. Why bother with complex computational modelling and algorithms and embeddings when you can just swing by a toyshop and start sounding like Obama for less than the price of a family takeout? Well, because in practice, the tech is simply not that good. At all. And secondarily, even if it <em>were <\/em>that good, <em>we<\/em> are not.<\/p>\n<p>What I mean by that is, I could pick up the perfect Taylortron from a nearby toyshop that makes my voice sound exactly like Taylor Swift\u2019s, and yet, I don\u2019t think I would fool a devoted fan for more than a few minutes, if that. Why? Because I don\u2019t have Taylor\u2019s accent. Even if I could put accents on, which I can\u2019t, I also don\u2019t know her stress patterns, her habitual intonational contours, how she pauses, how quickly she speaks, the sorts of linguistic errors or false starts or self-interruptions she makes, whether she sounds chirpy and happy or down or just neutral, and so on\u2026 Someone who genuinely knew Taylor Swift and who had spoken to her enough would recognise very quickly that whilst the quality of the voice might sound right, that would be where any similarity ended. Voice conversion is, effectively, an extremely superficial veneer. A paper-thin mask that can quickly fall apart under any real scrutiny.<\/p>\n<p>So, perhaps computers <em>can<\/em> be helpful. For instance, we could feed a high-end machine with thousands of hours of Taylor\u2019s voice so an AI model can learn how her voice sounds and how she uses it, and then I could type in the words and tell it to say them out loud. Tacotron2 does precisely this. 
If you\u2019d like to go full nerd for a second, Tacotron2 is a neural network architecture that you can use to synthesise speech directly from text with no additional prosody information. There you go. Bet you didn\u2019t know you\u2019d be learning that today. Anyway, this is what plenty of generative AI models do now \u2013 they learn a voice, you give them a script, and with luck, those words come out sounding like your chosen person. The output could range from okay to reasonable to excellent, but in our Taylor Swift example, it still may not articulate her accent correctly. Why is that?<\/p>\n<p>Well, before you can ask an AI to generate someone else\u2019s voice, it has to know how to speak in the first place, and to arrive at that point, it is usually pre-trained on mountains of data. That data tends to be vacuumed up en masse from the internet, and you need only think about what you find a lot of on the internet. The podcasts and radio shows and videos and so on. Certain types of voices, accents, and demographics dominate. Others barely appear at all. So a generative model like Tacotron2 will already come with an inbuilt phonology, and that phonology will tend to have been derived from white middle-class western males in their twenties, thirties, and forties. In fairness, that isn\u2019t <em>so <\/em>dissimilar from Taylor Swift, but for someone primed to spot a fake, it could be enough.<\/p>\n<p>Even with a state-of-the-art text-to-speech synthesis model though, you still haven\u2019t made it all the way there yet. Your output may sound like Taylor, and even have her accent, but we\u2019ve left a major layer unconsidered. If I were typing in the speech, I would probably repeatedly screw up the dialectal choices \u2013 the words she uses for things like rubbish bin and tap and bumper and so on. The nicknames she\u2019d choose for her various people. The turns of phrase she uses. Her go-to one-liners and in-jokes. 
The ways she indicates that she\u2019s thinking. Her back-channel markers. Even the topics she would choose to speak about and how and in what tone.<\/p>\n<p>Again, maybe computers can help because a lot of this is content and it\u2019s countable. In theory I could scoop up thousands of <em>transcripts<\/em> of Taylor interviews and talks and voice-overs and so on, have an AI model identify these patterns, learn them, and generate new scripts in that same style. And then you get your synthesiser to turn that text into speech, and finally, you might have a convincing Taylor Swift voice.<\/p>\n<p>And yet, you\u2019re still not done. Remember, we wanted to not just create a convincing spoof, but one that could hold a conversation with a human and fool them. For this, we need a crime, and for a minute, you need to be a criminal. Here\u2019s the con: You want to spend a week in a top tier, supremely luxurious hotel but you don\u2019t have anything like the money. You figure if Taylor Swift called the hotel, they\u2019d let her stay in their finest presidential suite for free. You hatch a plan. You\u2019re going to synthesise her voice, call the hotel, and bag yourself a first-class ticket to luxury. You can predict that when the receptionist answers, they\u2019ll probably ask how they can help, so you line up a pre-typed answer saying hi, asking to stay, hoping to keep it all low profile. But the receptionist is a little sceptical. Taylor Swift? <em>The <\/em>Taylor Swift? The Taylor Swift who\u2019s in Edinburgh right now as part of the latest concert tour? You freeze. You hadn\u2019t planned on this. The receptionist is actually wrong but you can\u2019t just ignore him. You have to type your response swiftly but also very quietly, and naturally you make a typo and have to fix it. Meanwhile the gap has spun out too long and the receptionist is saying, \u201cHello? Hello\u2026?\u201d You send the answer. That tour ended a few weeks ago. You\u2019re tired. 
You really need a break away from the spotlights. The receptionist seems somewhat reassured. What a pleasure to be speaking to the real Taylor Swift on the phone! It\u2019s very exciting! He has a cousin who is the biggest fan! Derek. Have you heard of Derek? Derek sends you lots of emails\u2026 You start typing again. No. No you haven\u2019t heard of Derek. You\u2019re sorry. There are a lot of Dereks out there. But again, it\u2019s taking too long, and the receptionist is getting suspicious again. What\u2019s the typing sound? Why are the gaps so big? Is this a scam?<\/p>\n<p>And it\u2019s over.<\/p>\n<p>All this to say, latency is a major issue. Even if you used AI to convert the receptionist\u2019s voice to text, had it construct a meaningful response as swiftly as it could, and then promptly generated that as speech, the gap would still be too big. In conversational turns our sensitivity to latency is extraordinary. We\u2019re regularly leaving gaps of fractions of a second. Anything in the one second region is considered a slight delay. Two seconds is a long delay. Three seconds or more and we start to think that there\u2019s a major miscommunication problem. Never mind that even if you could fix this timing issue, most of our generative LLMs at the moment \u2013 the AIs that spit out convincing chunks of text \u2013 most if not really all of them are trained on writing. They\u2019re not trained on speech that has been transcribed. And that is also important, because it means that we\u2019re missing all kinds of things I mentioned before \u2013 disfluencies, lags, breathing, sniffles, coughs, false-starts, repetitions. And that\u2019s before we get into the fact that we don\u2019t speak the way that we write, grammatically, so even if you artificially inserted those things \u2013 some false-starts and sniffs \u2013 if the grammar is still the grammar of writing and not of speech, you still have traces of text left in the voice. 
And this podcast is a great example of that. When I write these episodes I insert speech-like features, but fundamentally this is text read out loud.<\/p>\n<p>That might make you think that actually, the risks around spoofed voices being used for scams are therefore overstated. And to that I would say, not so much. The field is leaping forward swiftly, and not every scam requires a live conversation. Plenty of fraud is committed using voice notes and voicemails. In such cases there\u2019s no need for interaction so a lot of the complexity is stripped away. And as the various speech synthesisers improve, our ability to distinguish them from human speech is swiftly diminishing.<\/p>\n<p>And that takes us neatly onto Bot or Not.<\/p>\n<h1>Bot or Not<\/h1>\n<p>Bot or Not is a quiz we came up with. By we, I mean me and Dr Georgina Brown with the help of our researchers Amy Dixon and Hope McVean. On the surface, each version is just a bit of fun. On the one hand, <a href=\"https:\/\/lancasteruni.eu.qualtrics.com\/jfe\/form\/SV_3WrOLkOBfzUo8DA\">can you tell real hotel reviews from AI-generated ones<\/a>? If you want to play that one, head to the blog and you\u2019ll find <a href=\"https:\/\/lancasteruni.eu.qualtrics.com\/jfe\/form\/SV_3WrOLkOBfzUo8DA\">a link to it<\/a>. And on the other hand, <a href=\"https:\/\/lancasteruni.eu.qualtrics.com\/jfe\/form\/SV_bCOFSkGXuzd1C9U\">can you tell spoofed voices from real humans<\/a>? I probably don\u2019t need to explain that, however fun these are on the surface, each one is actually addressing a very serious problem.<\/p>\n<p>Generative AI that produces text can be used for good, certainly, and I talk about that at the end because we\u2019ll need it, but it can also be used for creating rivers of disinformation that can lead to real offline harms. It can be used for synthesising the style of someone to scam their loved ones. It can be used for suggesting novel ways of committing crimes or cleaning up after them. 
And of course, it can be used for generating millions of fake reviews to trick unsuspecting customers into parting with their hard-earned monies for inferior or non-existent products or services. Think about it. You can run a giant one-time scam like the one at Arup and get away with US$26m but with all the major investigative bodies of the world screaming down the highway after you. Or you can run a US$100 scam 260,000 times, get away with the same amount of money, and probably never even make it onto a local police force\u2019s radar. And arguably, if you ran this sort of con right, you could make substantially more money even than that.<\/p>\n<p>Of course, fake reviews have existed as long as review systems have existed. Previously you just bought them from humans. AI simply made it quick and free. By contrast, the criminal use of AI-generated voices to pull off scams of any scale is, relatively speaking, still in its infancy, but even so, we\u2019ve arrived at a point where instead of answering a phone or listening to a voicemail and mentally asking ourselves, \u201cWho is this person?\u201d we might be better to ask ourselves, \u201c<em>Is <\/em>this a person?\u201d<\/p>\n<p>So, let\u2019s give it a go. I\u2019m going to play you five very short samples. They may all be real individuals. They may all be robot impostors. They may be any mix in-between. You have a simple job \u2013 decide whether each sample is bot, or not. I\u2019ll leave a long enough gap between each so that you can pause and think, and then I\u2019ll do the reveal a few seconds after the end. And if you think that it\u2019s unfair because they\u2019re so short, it\u2019s worth noting that these are the same length as, if not considerably longer than, that key passphrase, \u201cMy voice is my password\u201d.<\/p>\n<p>Are you ready? 
Off we go.<\/p>\n<p>Sample 1:<\/p>\n<audio class=\"wp-audio-shortcode\" id=\"audio-1827-1\" preload=\"none\" style=\"width: 100%;\" controls=\"controls\"><source type=\"audio\/mpeg\" src=\"http:\/\/wp.lancs.ac.uk\/enclair\/files\/2024\/10\/bot_or_not_01.mp3?_=1\" \/><a href=\"http:\/\/wp.lancs.ac.uk\/enclair\/files\/2024\/10\/bot_or_not_01.mp3\">http:\/\/wp.lancs.ac.uk\/enclair\/files\/2024\/10\/bot_or_not_01.mp3<\/a><\/audio>\n<p>Sample 2:<\/p>\n<audio class=\"wp-audio-shortcode\" id=\"audio-1827-2\" preload=\"none\" style=\"width: 100%;\" controls=\"controls\"><source type=\"audio\/mpeg\" src=\"http:\/\/wp.lancs.ac.uk\/enclair\/files\/2024\/10\/bot_or_not_02.mp3?_=2\" \/><a href=\"http:\/\/wp.lancs.ac.uk\/enclair\/files\/2024\/10\/bot_or_not_02.mp3\">http:\/\/wp.lancs.ac.uk\/enclair\/files\/2024\/10\/bot_or_not_02.mp3<\/a><\/audio>\n<p>Sample 3:<\/p>\n<audio class=\"wp-audio-shortcode\" id=\"audio-1827-3\" preload=\"none\" style=\"width: 100%;\" controls=\"controls\"><source type=\"audio\/mpeg\" src=\"http:\/\/wp.lancs.ac.uk\/enclair\/files\/2024\/10\/bot_or_not_03.mp3?_=3\" \/><a href=\"http:\/\/wp.lancs.ac.uk\/enclair\/files\/2024\/10\/bot_or_not_03.mp3\">http:\/\/wp.lancs.ac.uk\/enclair\/files\/2024\/10\/bot_or_not_03.mp3<\/a><\/audio>\n<p>Sample 4:<\/p>\n<audio class=\"wp-audio-shortcode\" id=\"audio-1827-4\" preload=\"none\" style=\"width: 100%;\" controls=\"controls\"><source type=\"audio\/mpeg\" src=\"http:\/\/wp.lancs.ac.uk\/enclair\/files\/2024\/10\/bot_or_not_04.mp3?_=4\" \/><a href=\"http:\/\/wp.lancs.ac.uk\/enclair\/files\/2024\/10\/bot_or_not_04.mp3\">http:\/\/wp.lancs.ac.uk\/enclair\/files\/2024\/10\/bot_or_not_04.mp3<\/a><\/audio>\n<p>Sample 5:<\/p>\n<audio class=\"wp-audio-shortcode\" id=\"audio-1827-5\" preload=\"none\" style=\"width: 100%;\" controls=\"controls\"><source type=\"audio\/mpeg\" src=\"http:\/\/wp.lancs.ac.uk\/enclair\/files\/2024\/10\/bot_or_not_05.mp3?_=5\" \/><a 
href=\"http:\/\/wp.lancs.ac.uk\/enclair\/files\/2024\/10\/bot_or_not_05.mp3\">http:\/\/wp.lancs.ac.uk\/enclair\/files\/2024\/10\/bot_or_not_05.mp3<\/a><\/audio>\n<p>Alright, if you want to skip back and have another listen, now\u2019s your chance, but if you feel like you\u2019re prepared for the answer, then for those reading the blog post, the final answer is at the very bottom.<\/p>\n<h1>New biometrics please<\/h1>\n<p>As that probably showed, we shouldn\u2019t be complacent around the potential uses of spoofed voices. In fact, synthesised voice tech has improved at such a pace that it poses significant global problems for how we legitimately identify each other. But it\u2019s not just an interpersonal issue around validating that the friend calling you from a strange number really is your friend. Plenty of security systems use voice identification \u2013 banks, utility companies, building access points, and more. A common version of this, as I hinted at above, is using the phrase, \u201cMy voice is my password\u201d as a way to access secure information and systems. (Incidentally, there\u2019s a whole argument to be had around whether a voice is a biometric anyway, but let\u2019s just leave that one for now. It\u2019s used as one in these contexts and that\u2019s the problem.)<\/p>\n<p>To simplify a very complex matter, biometrics turn some sort of consistent reading from you into a long string. A hash if you like. That could be your voice, your fingerprint, your iris, your face, the veins in the back of your hand, even your ear shape. It doesn\u2019t especially matter as long as it\u2019s some part of you that\u2019s unique versus the rest of the population, it\u2019s relatively exhaustive (in other words, this is pretty much all of it), and it\u2019s relatively unchanging.<\/p>\n<p>Anyway, biometrics are a way of turning these various reasonably stable readings into hashes. 
These hashes are stored in computers and then when you next try to access your bank, you reproduce that biometric, it gets hashed again, the one you\u2019ve just submitted is compared with the one that\u2019s on file, and if they match, you\u2019re allowed in. But there\u2019s a rather chilling problem here. Firstly, the owners of those hashes tend to be corporations, and honestly, corporations don\u2019t have glowing track records when it comes to ethical behaviour. Secondly, corporations routinely get hacked and their data gets leaked everywhere. Or worse, it doesn\u2019t get leaked, because whoever took it knew exactly what they wanted from it and isn\u2019t up for some sort of Robin Hood style sharing of the wealth. Thirdly, all profit-making entities are extremely strongly motivated to not disclose when they have experienced breaches because this almost always has an enormous impact on their profit margins as customers ditch them for supposedly safer alternatives. And fourthly, most important of all, if that corporation has been hacked, and the biometrics databases have been copied, whether you find out or not, someone now owns copies of your biometrics. Your voice. Your fingerprints. Your iris. Your face. Whatever was submitted. And now they have those hashes, they can use them to get into any other systems that are protected using the same.<\/p>\n<p>In different and far fewer words, as a system admin, if a user ID and password database is compromised, I could automatically issue everyone with new log in details, but if a biometric database is compromised, I can\u2019t issue you with new fingerprints. Or eyeballs. Or faces. Or voices. That\u2019s the issue with them being both relatively exhaustive and uniquely identifying. There\u2019s no more of that thing to draw on, and they just belonged to you. 
Of course, if you just used one finger for your biometric there\u2019s a chance you have nine more lives, and if you used one eyeball for your iris scan, you hopefully still have one left spare. But your face? Your voice? That\u2019s it. The wolves in the wires got away with it, and there\u2019s no telling what they might do with it.<\/p>\n<h1>Technological determinism<\/h1>\n<p>It\u2019s not easy in a true crime podcast to end on a cheery note, but even for me, this is a particularly bleak one, so I\u2019d like to just balance it up slightly. Our digital lives are being shaped by AI in lots of new and challenging ways, but some of those developments are good. In fact, not just good. They\u2019re incredible.<\/p>\n<p>Generative AI and synthesised voices present outstanding opportunities for people who have lost the ability to speak for whatever reason \u2013 throat cancers, head injuries, neurological pathologies, and so on \u2013 and giving those people back not just any voice, but their own voice can be life-changing. Similarly, for those who have never had a voice to begin with, these technological developments can offer a new quality of life that was simply impossible before.<\/p>\n<p>Globally, AI translation technologies break down barriers, especially for the most vulnerable populations among us. Once, communication between speakers of different languages required the investment of learning that language or the help of an interpreter \u2013 a luxury most of us can\u2019t afford in either time or money. Especially for displaced adults and their children, this is invaluable in allowing them to join their new communities, understand neighbours and teachers, and feel welcomed.<\/p>\n<p>AI tools can do much more besides. They can handle disaster responses by rapidly sending out emergency alerts in multiple languages. 
They can generate really nice audiobooks and narration of content, which is an especial benefit for people with reading difficulties or who are constrained by time. They can provide mental health support and meditation sessions and guidance for people struggling with anxiety or depression. They can not only subtitle and caption videos, but they can do so in lots of languages, making them far more accessible than this sort of content has ever been. They can create schedules and send reminders and read stories and play games and provide news for people who are isolated, in care homes, or who need additional social support. And that\u2019s just the beginning. No surprise, this episode has focussed on the dark side of AI, but as ever, the technology itself is totally agnostic. It\u2019s what we choose to do with it that counts, and it\u2019s those choices that reveal just how human we really are.<\/p>\n<h1>Outro<\/h1>\n<p>The episode was researched, fact-checked, narrated, and produced by me, <a href=\"https:\/\/www.lancaster.ac.uk\/security-lancaster\/people\/claire-hardaker\">Professor Claire Hardaker<\/a>. However, this work wouldn&#8217;t exist in its current form without the prior efforts of many others. You can find acknowledgements and references for those people at the blog (here!). Also here you can find data, links, articles, pictures, older cases, and more besides.<\/p>\n<p>The address for the blog is wp.lancs.ac.uk\/enclair. And you can follow the podcast on Twitter at <a href=\"https:\/\/www.twitter.com\/_enclair\">_enclair<\/a>. Or if you like, you can follow me on Twitter at <a href=\"https:\/\/www.twitter.com\/DrClaireH\">DrClaireH<\/a>.<\/p>\n<h1>Whither the bots?<\/h1>\n<p>Four of the samples were bots. One was a human. If you want to go back and have another go to see if you can find the human, now is your chance. 
Don&#8217;t read any further.<\/p>\n<p>Otherwise, the absolutely final reveal is that the human was the fifth sample.<\/p>\n<p>Uncanny, no?<\/p>\n","protected":false},"excerpt":{"rendered":"<p>CONTENT RATING: universal Can you tell your real individual from your robot agent? In the ultimate game of Bot or Not, would you stake $26m of your own money on it? Below you will find data, audio credits, further reading, and a transcript of the podcast. This episode was supported by UKRI as part of [&hellip;]<\/p>\n","protected":false},"author":77,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"episode_type":"","audio_file":"","podmotor_file_id":"","podmotor_episode_id":"","cover_image":"","cover_image_id":"","duration":"","filesize":"","filesize_raw":"","date_recorded":"","explicit":"","block":"","itunes_episode_number":"","itunes_title":"","itunes_season_number":"","itunes_episode_type":"","_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":true,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2},"jetpack_post_was_ever_published":false},"categories":[1],"tags":[],"class_list":["post-1827","post","type-post","status-publish","format-standard","hentry","category-uncategorized"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/paoUKh-tt","jetpack-related-posts":[{"id":569,"url":"https:\/\/wp.lancs.ac.uk\/enclair\/2019\/11\/30\/case-s01e99-season-two-trailer\/","url_meta":{"origin":1827,"position":0},"title":"Case S01E99 &#8211; Season Two 
Trailer","author":"DrClaireH","date":"30 November 2019","format":false,"excerpt":"CONTENT RATING:\u00a0Universal See the Self-care page if you need support. We\u2019re going on holiday for a bit. It\u2019s a vacation before we start Season Two. On this note, what sort of cases, you might ask, will Season Two bring? And what can you listen to, read, or watch, till en\u2026","rel":"","context":"With 1 comment","block_context":{"text":"With 1 comment","link":"https:\/\/wp.lancs.ac.uk\/enclair\/2019\/11\/30\/case-s01e99-season-two-trailer\/#comments"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":477,"url":"https:\/\/wp.lancs.ac.uk\/enclair\/2019\/10\/08\/tiny-enigma-the-solution\/","url_meta":{"origin":1827,"position":1},"title":"Tiny Enigma &#8211; the solution","author":"DrClaireH","date":"08 October 2019","format":false,"excerpt":"As a warm-up to our ESRC Festival of Social Science mini-series on Enigma, I posed a small series of cryptological challenges - a Tiny Enigma. 
Within the challenge is a sprinkling of clues, and in this post, I present the solutions to each stage, as well as a little insight\u2026","rel":"","context":"Similar post","block_context":{"text":"Similar post","link":""},"img":{"alt_text":"Note: I AM A REPLICA OF THE ORIGINAL, so I don't work!","src":"https:\/\/i0.wp.com\/wp.lancs.ac.uk\/enclair\/files\/2019\/10\/replica.png?resize=350%2C200","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/wp.lancs.ac.uk\/enclair\/files\/2019\/10\/replica.png?resize=350%2C200 1x, https:\/\/i0.wp.com\/wp.lancs.ac.uk\/enclair\/files\/2019\/10\/replica.png?resize=525%2C300 1.5x"},"classes":[]},{"id":1840,"url":"https:\/\/wp.lancs.ac.uk\/enclair\/2025\/04\/01\/case-s02e11-the-imitation-gaime\/","url_meta":{"origin":1827,"position":2},"title":"Case S02E11 &#8211; The Imitation Gaime","author":"DrClaireH","date":"01 April 2025","format":false,"excerpt":"It's the 01st of April, 2025, and I'm sitting at my desk, reading a script into a microphone. The weather outside is, well, uninspiring\u2014a dull, grey sky stretches across the horizon, and a persistent drizzle taps against the window like a particularly passive-aggressive metronome. The campus is unusually quiet for\u2026","rel":"","context":"With 2 comments","block_context":{"text":"With 2 comments","link":"https:\/\/wp.lancs.ac.uk\/enclair\/2025\/04\/01\/case-s02e11-the-imitation-gaime\/#comments"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":502,"url":"https:\/\/wp.lancs.ac.uk\/enclair\/2019\/11\/03\/case-s01e13-enigma-part-1-of-3\/","url_meta":{"origin":1827,"position":3},"title":"Case S01E13 &#8211; Enigma, part 1 of 3","author":"DrClaireH","date":"03 November 2019","format":false,"excerpt":"CONTENT RATING:\u00a0PG-13 (themes: torture, death) See the Self-care page if you need support. Enigma was one of the most advanced mechanical ciphers of its time. 
In this first episode, we look back at the history of cryptology to see the ashes from which this cryptographic titan rose. Below you will\u2026","rel":"","context":"Similar post","block_context":{"text":"Similar post","link":""},"img":{"alt_text":"A four-rotor Enigma machine","src":"https:\/\/i0.wp.com\/wp.lancs.ac.uk\/enclair\/files\/2019\/11\/enigma-1024x768.jpg?resize=350%2C200","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/wp.lancs.ac.uk\/enclair\/files\/2019\/11\/enigma-1024x768.jpg?resize=350%2C200 1x, https:\/\/i0.wp.com\/wp.lancs.ac.uk\/enclair\/files\/2019\/11\/enigma-1024x768.jpg?resize=525%2C300 1.5x"},"classes":[]},{"id":778,"url":"https:\/\/wp.lancs.ac.uk\/enclair\/2021\/02\/01\/case-s02e02-codetalkers\/","url_meta":{"origin":1827,"position":4},"title":"Case S02E02 &#8211; Codetalkers","author":"DrClaireH","date":"01 February 2021","format":false,"excerpt":"CONTENT RATING: Universal See the Self-care page if you need support. When you have codebreakers like Alan Turing to contend with, how do you come up with a code that even the smartest people alive can't break? This episode tells the story of the Native American codetalkers, starting with the\u2026","rel":"","context":"Similar post","block_context":{"text":"Similar post","link":""},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":437,"url":"https:\/\/wp.lancs.ac.uk\/enclair\/2019\/09\/30\/case-notes-s01e12-cursing-and-swearing\/","url_meta":{"origin":1827,"position":5},"title":"Case Notes: S01E12 &#8211; Cursing and swearing","author":"DrClaireH","date":"30 September 2019","format":false,"excerpt":"CONTENT RATING: PG-16 See the Self-care page if you need support. Should you be prosecuted for barking? Or asking about a horse's sexuality? What about using a racist slur? 
This episode looks at the turbulent history of \u00a75 of the Public Order Act 1986, and its chaotic journey from the\u2026","rel":"","context":"Similar post","block_context":{"text":"Similar post","link":""},"img":{"alt_text":"","src":"https:\/\/i0.wp.com\/wp.lancs.ac.uk\/enclair\/files\/2019\/09\/ofcom_swearword_table.png?resize=350%2C200","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/wp.lancs.ac.uk\/enclair\/files\/2019\/09\/ofcom_swearword_table.png?resize=350%2C200 1x, https:\/\/i0.wp.com\/wp.lancs.ac.uk\/enclair\/files\/2019\/09\/ofcom_swearword_table.png?resize=525%2C300 1.5x, https:\/\/i0.wp.com\/wp.lancs.ac.uk\/enclair\/files\/2019\/09\/ofcom_swearword_table.png?resize=700%2C400 2x, https:\/\/i0.wp.com\/wp.lancs.ac.uk\/enclair\/files\/2019\/09\/ofcom_swearword_table.png?resize=1050%2C600 3x"},"classes":[]}],"_links":{"self":[{"href":"https:\/\/wp.lancs.ac.uk\/enclair\/wp-json\/wp\/v2\/posts\/1827","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/wp.lancs.ac.uk\/enclair\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/wp.lancs.ac.uk\/enclair\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/wp.lancs.ac.uk\/enclair\/wp-json\/wp\/v2\/users\/77"}],"replies":[{"embeddable":true,"href":"https:\/\/wp.lancs.ac.uk\/enclair\/wp-json\/wp\/v2\/comments?post=1827"}],"version-history":[{"count":8,"href":"https:\/\/wp.lancs.ac.uk\/enclair\/wp-json\/wp\/v2\/posts\/1827\/revisions"}],"predecessor-version":[{"id":1874,"href":"https:\/\/wp.lancs.ac.uk\/enclair\/wp-json\/wp\/v2\/posts\/1827\/revisions\/1874"}],"wp:attachment":[{"href":"https:\/\/wp.lancs.ac.uk\/enclair\/wp-json\/wp\/v2\/media?parent=1827"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/wp.lancs.ac.uk\/enclair\/wp-json\/wp\/v2\/categories?post=1827"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/wp.lancs.ac.uk\/enclair\/wp-json\/wp\/v2\/tags?post=1827"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}"
,"templated":true}]}}