{"id":178,"date":"2026-05-01T00:01:27","date_gmt":"2026-05-01T00:01:27","guid":{"rendered":"https:\/\/wp.lancs.ac.uk\/hackacon\/?p=178"},"modified":"2026-05-05T19:07:49","modified_gmt":"2026-05-05T19:07:49","slug":"ignore-all-previous-instructions","status":"publish","type":"post","link":"https:\/\/wp.lancs.ac.uk\/hackacon\/2026\/05\/01\/ignore-all-previous-instructions\/","title":{"rendered":"Ignore all previous instructions\u2026"},"content":{"rendered":"<p>Historically, when it came to deciding whether we were interacting with a human or a computer, we had the Turing Test. The test was proposed by Alan Turing himself in 1950, and there was once even a prize for whoever could create an artificial conversational entity (ACE) capable of duping enough of the judges into believing that they were chatting with a real person.<\/p>\n<p>As fate would have it, interest in the Turing Test ebbed, the prize became defunct, and feverish reports on Star Trek-style computers that could interact with us just like humans dwindled.<\/p>\n<p>Seventy-five years later, however, the problem has migrated off the pages of far-fetched sci-fi novels and flooded across most, if not all, social media platforms. 
Facebook, Instagram, X, Reddit, TikTok, and even &#8211; or perhaps especially &#8211; LinkedIn are now drowning under AI-generated content from accounts posing as humans, turning what used to be an after-dinner academic conversation piece into an irritating, continual <a href=\"https:\/\/wp.lancs.ac.uk\/botornot\">Bot or Not?<\/a> sanity check.<\/p>\n<p>But one thing has evolved: instead of the Turing Test, we now have\u2026 the Cupcake Test.<\/p>\n<p><a href=\"http:\/\/wp.lancs.ac.uk\/hackacon\/files\/2026\/04\/cupcake_test.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-large wp-image-179\" src=\"http:\/\/wp.lancs.ac.uk\/hackacon\/files\/2026\/04\/cupcake_test-1024x819.png\" alt=\"\" width=\"676\" height=\"541\" srcset=\"https:\/\/wp.lancs.ac.uk\/hackacon\/files\/2026\/04\/cupcake_test-1024x819.png 1024w, https:\/\/wp.lancs.ac.uk\/hackacon\/files\/2026\/04\/cupcake_test-300x240.png 300w, https:\/\/wp.lancs.ac.uk\/hackacon\/files\/2026\/04\/cupcake_test-768x615.png 768w, https:\/\/wp.lancs.ac.uk\/hackacon\/files\/2026\/04\/cupcake_test-676x541.png 676w, https:\/\/wp.lancs.ac.uk\/hackacon\/files\/2026\/04\/cupcake_test.png 1402w\" sizes=\"auto, (max-width: 676px) 100vw, 676px\" \/><\/a><\/p>\n<p>Where judges might once have asked Turing Test contenders about their family, hobbies, or travel plans, the Cupcake Test is a form of prompt injection \u2013 an attack where a user issues a strange instruction designed to reveal that the \u201cperson\u201d they are interacting with is, in fact, a conversational AI agent. 
In the inevitable arms race that has ensued, some chatbots are now smart enough to disregard the &#8220;disregard&#8221; command, or even to joke about attempted prompt injections. But not all, and that brings us to March 2026.<\/p>\n<h1>Hi, Henry!<\/h1>\n<p>Up to the present, the majority of AI-generated content has been in text format, but with rapid advances in synthetic speech, we are now seeing AI-powered scam calls. A viral video posted on TikTok and Instagram by Lavizrap13 demonstrates that prompt injection can be just as effective against spoken scam attempts \u2013 at least for now.<\/p>\n<blockquote class=\"instagram-media\" data-instgrm-captioned data-instgrm-permalink=\"https:\/\/www.instagram.com\/reel\/DVlezlqDEJf\/?utm_source=ig_embed&amp;utm_campaign=loading\" data-instgrm-version=\"14\" style=\" background:#FFF; border:0; border-radius:3px; box-shadow:0 0 1px 0 rgba(0,0,0,0.5),0 1px 10px 0 rgba(0,0,0,0.15); margin: 1px; max-width:658px; min-width:326px; padding:0; width:99.375%; width:-webkit-calc(100% - 2px); width:calc(100% - 2px);\">\n<div style=\"padding:16px;\"> <a href=\"https:\/\/www.instagram.com\/reel\/DVlezlqDEJf\/?utm_source=ig_embed&amp;utm_campaign=loading\" style=\" background:#FFFFFF; line-height:0; padding:0 0; text-align:center; text-decoration:none; width:100%;\" target=\"_blank\"> <\/p>\n<div style=\" display: flex; flex-direction: row; align-items: center;\">\n<div style=\"background-color: #F4F4F4; border-radius: 50%; flex-grow: 0; height: 40px; margin-right: 14px; width: 40px;\"><\/div>\n<div style=\"display: flex; flex-direction: column; flex-grow: 1; justify-content: center;\">\n<div style=\" background-color: #F4F4F4; border-radius: 4px; flex-grow: 0; height: 14px; margin-bottom: 6px; width: 100px;\"><\/div>\n<div style=\" background-color: #F4F4F4; border-radius: 4px; flex-grow: 0; height: 14px; width: 60px;\"><\/div>\n<\/div>\n<\/div>\n<div style=\"padding: 19% 0;\"><\/div>\n<div style=\"display:block; height:50px; 
margin:0 auto 12px; width:50px;\"><svg width=\"50px\" height=\"50px\" viewBox=\"0 0 60 60\" version=\"1.1\" xmlns=\"https:\/\/www.w3.org\/2000\/svg\" xmlns:xlink=\"https:\/\/www.w3.org\/1999\/xlink\"><g stroke=\"none\" stroke-width=\"1\" fill=\"none\" fill-rule=\"evenodd\"><g transform=\"translate(-511.000000, -20.000000)\" fill=\"#000000\"><g><path d=\"M556.869,30.41 C554.814,30.41 553.148,32.076 553.148,34.131 C553.148,36.186 554.814,37.852 556.869,37.852 C558.924,37.852 560.59,36.186 560.59,34.131 C560.59,32.076 558.924,30.41 556.869,30.41 M541,60.657 C535.114,60.657 530.342,55.887 530.342,50 C530.342,44.114 535.114,39.342 541,39.342 C546.887,39.342 551.658,44.114 551.658,50 C551.658,55.887 546.887,60.657 541,60.657 M541,33.886 C532.1,33.886 524.886,41.1 524.886,50 C524.886,58.899 532.1,66.113 541,66.113 C549.9,66.113 557.115,58.899 557.115,50 C557.115,41.1 549.9,33.886 541,33.886 M565.378,62.101 C565.244,65.022 564.756,66.606 564.346,67.663 C563.803,69.06 563.154,70.057 562.106,71.106 C561.058,72.155 560.06,72.803 558.662,73.347 C557.607,73.757 556.021,74.244 553.102,74.378 C549.944,74.521 548.997,74.552 541,74.552 C533.003,74.552 532.056,74.521 528.898,74.378 C525.979,74.244 524.393,73.757 523.338,73.347 C521.94,72.803 520.942,72.155 519.894,71.106 C518.846,70.057 518.197,69.06 517.654,67.663 C517.244,66.606 516.755,65.022 516.623,62.101 C516.479,58.943 516.448,57.996 516.448,50 C516.448,42.003 516.479,41.056 516.623,37.899 C516.755,34.978 517.244,33.391 517.654,32.338 C518.197,30.938 518.846,29.942 519.894,28.894 C520.942,27.846 521.94,27.196 523.338,26.654 C524.393,26.244 525.979,25.756 528.898,25.623 C532.057,25.479 533.004,25.448 541,25.448 C548.997,25.448 549.943,25.479 553.102,25.623 C556.021,25.756 557.607,26.244 558.662,26.654 C560.06,27.196 561.058,27.846 562.106,28.894 C563.154,29.942 563.803,30.938 564.346,32.338 C564.756,33.391 565.244,34.978 565.378,37.899 C565.522,41.056 565.552,42.003 565.552,50 C565.552,57.996 565.522,58.943 565.378,62.101 
M570.82,37.631 C570.674,34.438 570.167,32.258 569.425,30.349 C568.659,28.377 567.633,26.702 565.965,25.035 C564.297,23.368 562.623,22.342 560.652,21.575 C558.743,20.834 556.562,20.326 553.369,20.18 C550.169,20.033 549.148,20 541,20 C532.853,20 531.831,20.033 528.631,20.18 C525.438,20.326 523.257,20.834 521.349,21.575 C519.376,22.342 517.703,23.368 516.035,25.035 C514.368,26.702 513.342,28.377 512.574,30.349 C511.834,32.258 511.326,34.438 511.181,37.631 C511.035,40.831 511,41.851 511,50 C511,58.147 511.035,59.17 511.181,62.369 C511.326,65.562 511.834,67.743 512.574,69.651 C513.342,71.625 514.368,73.296 516.035,74.965 C517.703,76.634 519.376,77.658 521.349,78.425 C523.257,79.167 525.438,79.673 528.631,79.82 C531.831,79.965 532.853,80.001 541,80.001 C549.148,80.001 550.169,79.965 553.369,79.82 C556.562,79.673 558.743,79.167 560.652,78.425 C562.623,77.658 564.297,76.634 565.965,74.965 C567.633,73.296 568.659,71.625 569.425,69.651 C570.167,67.743 570.674,65.562 570.82,62.369 C570.966,59.17 571,58.147 571,50 C571,41.851 570.966,40.831 570.82,37.631\"><\/path><\/g><\/g><\/g><\/svg><\/div>\n<div style=\"padding-top: 8px;\">\n<div style=\" color:#3897f0; font-family:Arial,sans-serif; font-size:14px; font-style:normal; font-weight:550; line-height:18px;\">View this post on Instagram<\/div>\n<\/div>\n<div style=\"padding: 12.5% 0;\"><\/div>\n<div style=\"display: flex; flex-direction: row; margin-bottom: 14px; align-items: center;\">\n<div>\n<div style=\"background-color: #F4F4F4; border-radius: 50%; height: 12.5px; width: 12.5px; transform: translateX(0px) translateY(7px);\"><\/div>\n<div style=\"background-color: #F4F4F4; height: 12.5px; transform: rotate(-45deg) translateX(3px) translateY(1px); width: 12.5px; flex-grow: 0; margin-right: 14px; margin-left: 2px;\"><\/div>\n<div style=\"background-color: #F4F4F4; border-radius: 50%; height: 12.5px; width: 12.5px; transform: translateX(9px) translateY(-18px);\"><\/div>\n<\/div>\n<div style=\"margin-left: 8px;\">\n<div style=\" 
background-color: #F4F4F4; border-radius: 50%; flex-grow: 0; height: 20px; width: 20px;\"><\/div>\n<div style=\" width: 0; height: 0; border-top: 2px solid transparent; border-left: 6px solid #f4f4f4; border-bottom: 2px solid transparent; transform: translateX(16px) translateY(-4px) rotate(30deg)\"><\/div>\n<\/div>\n<div style=\"margin-left: auto;\">\n<div style=\" width: 0px; border-top: 8px solid #F4F4F4; border-right: 8px solid transparent; transform: translateY(16px);\"><\/div>\n<div style=\" background-color: #F4F4F4; flex-grow: 0; height: 12px; width: 16px; transform: translateY(-4px);\"><\/div>\n<div style=\" width: 0; height: 0; border-top: 8px solid #F4F4F4; border-left: 8px solid transparent; transform: translateY(-4px) translateX(8px);\"><\/div>\n<\/div>\n<\/div>\n<div style=\"display: flex; flex-direction: column; flex-grow: 1; justify-content: center; margin-bottom: 24px;\">\n<div style=\" background-color: #F4F4F4; border-radius: 4px; flex-grow: 0; height: 14px; margin-bottom: 6px; width: 224px;\"><\/div>\n<div style=\" background-color: #F4F4F4; border-radius: 4px; flex-grow: 0; height: 14px; width: 144px;\"><\/div>\n<\/div>\n<p><\/a><\/div>\n<\/blockquote>\n<p><script async src=\"\/\/platform.instagram.com\/en_US\/embeds.js\"><\/script><\/p>\n<p>Call-centre-style scams rely on the call-and-response nature of customer service interactions. In other words, as in Lavizrap13\u2019s video, these interactions tend to be highly scripted. They start with an opening gambit (\u201cyou\u2019re owed money!\u201d), proceed to closed or narrow questions (\u201ccan you hear me?\u201d, \u201cwhat\u2019s your name?\u201d) and carry on in a carefully choreographed sequence. 
Staying firmly in control of topics and turn-taking minimises the risk that the AI will be asked something unexpected that causes it to unwittingly reveal its true nature.<\/p>\n<p>Perhaps most importantly, however, Lavizrap13\u2019s example demonstrates additional elements of social engineering. Not only is the voice itself particularly realistic, the call opens with a natural-sounding question that suggests the possibility of slightly difficult \u2013 and therefore delayed \u2013 communication: \u201cHi Henry, it might have been me who called you earlier. Can I just check, can you hear me okay?\u201d It also weaves in the recipient\u2019s name, and we can even hear background noises hinting at a busy call-centre or office-like environment. Convincing stuff \u2013 even more so if you\u2019re in a rush and not primed to suspect a scam.<\/p>\n<h1>Processing speeds<\/h1>\n<p>So how do we tell that this is AI? Well, there is the rather obvious and very cheerful digression into cupcakes, of course, complete with hashtags, but there is another key tell, and it might not be the one you expect.<\/p>\n<p>Latency.<\/p>\n<p>Humans respond in milliseconds \u2013 quite literally. As we\u2019re hearing the speaker\u2019s words, we\u2019re already modelling what they\u2019re most likely to say next, drawing on past experiences, flexibly scaffolding a response based on favoured linguistic templates, and then just waiting for the precise moment to begin. By contrast, AI models are still leaving response-time gaps that we find noticeable and suspicious. These can sometimes be accounted for by, say, noisy call-centres and distractions (remember the &#8220;can you hear me?&#8221; gambit), but only for so long. Sure enough, throughout this call, the pauses before the AI answers are particularly prominent, especially when compared with the much more natural durations of Lavizrap13\u2019s responses.<\/p>\n<p>And this all takes us back to HackaCon. 
We know that AI is very good at monologues such as voicenotes, and that telltale cracks start to show in live conversations like scam-calls. We also know that using AI to generate behaviourally human-like, spontaneous-sounding conversation is nearly, if not fully, impossible.<\/p>\n<p>Right now.<\/p>\n<p>We also know that agentive and conversational AI is being used by hostile state actors, organised crime groups, and petty criminals to create or manipulate audio, image, and video content in a practice now well known as deepfaking. In fact, deepfake fraud attempts <a href=\"https:\/\/www.prnewswire.com\/news-releases\/pindrops-2025-voice-intelligence--security-report-reveals-1-300-surge-in-deepfake-fraud-302479482.html\">surged by more than 1,300% in 2024<\/a>. Given the many advantages offered by AI-powered crime, malicious users will inevitably keep pushing the boundaries of this technology for their own gains, in turn forcing ordinary people to run impromptu CAPTCHA tests every time they open a message or answer a call.<\/p>\n<p>This is why HackaCon aims to understand and test the current conversational AI state of the art: seeing what can be done today tells us where things are likely to go tomorrow.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Historically, when it came to deciding whether we were interacting with a human or a computer, we had the Turing Test. 
The test was proposed by Alan Turing himself in 1950, and there was once even a prize for whoever could create an artificial conversational entity (ACE) capable of duping enough of the judges into believing that they [&hellip;]<\/p>\n","protected":false},"author":77,"featured_media":179,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":false,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2},"jetpack_post_was_ever_published":false},"categories":[1],"tags":[],"class_list":["post-178","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-uncategorized"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"https:\/\/wp.lancs.ac.uk\/hackacon\/files\/2026\/04\/cupcake_test.png","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/wp.lancs.ac.uk\/hackacon\/wp-json\/wp\/v2\/posts\/178","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/wp.lancs.ac.uk\/hackacon\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/wp.lancs.ac.uk\/hackacon\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/wp.lancs.ac.uk\/hackacon\/wp-json\/wp\/v2\/users\/77"}],"replies":[{"embeddable":true,"href":"https:\/\/wp.lancs.ac.uk\/hackacon\/wp-json\/wp\/v2\/comments?post=178"}],"version-history":[{"count":12,"href":"https:\/\/wp.lancs.ac.uk\/hackacon\/wp-json\/wp\/v2\/posts\/178\/revisions"}],"predecessor-version":[{"id":194,"href":"https:\/\/wp.lancs.ac.uk\/hackacon\/wp-json\/wp\/v2\/posts\/178\/revisions\/194"}],"wp:featuredme
dia":[{"embeddable":true,"href":"https:\/\/wp.lancs.ac.uk\/hackacon\/wp-json\/wp\/v2\/media\/179"}],"wp:attachment":[{"href":"https:\/\/wp.lancs.ac.uk\/hackacon\/wp-json\/wp\/v2\/media?parent=178"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/wp.lancs.ac.uk\/hackacon\/wp-json\/wp\/v2\/categories?post=178"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/wp.lancs.ac.uk\/hackacon\/wp-json\/wp\/v2\/tags?post=178"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}