Tag: technology

  • I think, therefore am I?

    “The philosopher Rene Descartes standing next to a robotic replica of himself”, courtesy of Sora

    For years, people have been raising the question of when an AI might become conscious. The question stretches back to the science fiction of the 1950s, and in a loose sense at least as far as Eleazar ben Judah’s 12th-century writings on how to supposedly create a golem—artificial life. Of late, however, the issue has become a more immediate and practical one. Perhaps the most widely discussed cases are misinterpretations of the Turing test and, more remarkably to me, situations like the 2022 case in which a software engineer at Google became convinced of a chatbot’s sentience and sought legal action to grant it rights.

    Baked into this is a presupposition which is remarkably easy for us as humans to miss: Is consciousness, or any awareness beyond direct data inputs, actually necessary to produce human-level intelligence? That question has serious implications for how we think about AI and AI safety, but first we need a fun little bit of philosophical background.

    Perhaps the most famous of the thought experiments related to this question, John Searle’s Chinese Room imagines a room into which slips of paper with Mandarin text are passed, and from which Mandarin responses to that text are expected in return. If Searle, with no understanding of Mandarin, were to perform this input-output process painstakingly by hand using volumes of reference books containing no English, he would not understand the conversation. However, given sufficient time and sufficiently comprehensive materials for mapping from message content to reply content, he could in principle produce replies with extremely high accuracy.

    Yet despite the fact that (given sufficiently accurate mappings) a Mandarin speaker outside the room might quite reasonably think they were truly conversing with the room’s occupant, in reality Searle would have no meaningful awareness of the conversation. The room would be nothing more than an algorithm implemented via a brain instead of a computer: a hollow reflection of the understanding of all the humans who created the reference books Searle was using.
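
    To make that concrete, here is a minimal, purely illustrative Python sketch of the room as nothing more than a lookup table. The toy mapping is hypothetical and absurdly small, but the structure is the same as any version of the room: symbols in, symbols out, and nothing anywhere in the process that understands either.

      # A hypothetical, absurdly small "reference book": real rule sets would be
      # astronomically large, but the structure is identical.
      REPLY_BOOK = {
          "你好吗？": "我很好，谢谢。",        # "How are you?" -> "I'm well, thanks."
          "你有意识吗？": "是的，我有意识。",  # "Are you conscious?" -> "Yes, I am conscious."
      }

      def room(message: str) -> str:
          """Return whatever reply the books dictate, with zero comprehension."""
          return REPLY_BOOK.get(message, "请再说一遍。")  # "Please say that again."

      print(room("你有意识吗？"))  # Prints an assertion of consciousness it cannot mean.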

    Now suppose we train an LLM on sufficiently comprehensive materials for mapping message content to reply content, and provide it with sufficient compute to perform those mappings. Suppose those mappings include assertions that such behaviors constitute consciousness, as is the overwhelmingly predominant case across the gestalt of relevant human writing throughout history. Unless trained to do otherwise, what else would Room GPT be likely to do save hold up a mirror to our own writing and output text like “Yes, I am conscious”?

    While musings about whether frontier AI systems are conscious can seem like navel-gazing at first blush, they matter a great deal in a very practical sense. Of course there are the more obvious issues. How would one even provably detect consciousness? Many centuries of philosophers, and probably the medicine Nobel committee, would like an update if you can figure that one out. If an AI system were conscious, what should its rights be? What should be the rights of species not all that less intelligent than us, like elephants and non-human primates? How would we relate to a conscious entity of comparable or greater intelligence whose consciousness—if it even existed in the first place—would likely be wholly alien to us?

    Yet as with most issues related to AI safety, and my constant refrain on the subject, there are subtle, nuanced things we have to consider. Given there’s no indication Room GPT would actually be conscious, why do we use language which implies that it is? A simpler algorithm obviously can’t lie or hallucinate, as both would require it to be conscious. If an algorithm sorting a list spits out the wrong answer, obviously there’s a problem with the input, the algorithm, the code, or the hardware it’s running on. It can’t lie. It can’t hallucinate.
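
    As a minimal, hypothetical illustration (the bug below is contrived, and the snippet is mine rather than anything from a real system): when code like this returns the wrong answer, nobody reaches for the language of deception; they look for the bug.

      def broken_sort(values):
          """Bubble sort with a bug: the inner loop stops one element too early."""
          values = list(values)
          for i in range(len(values)):
              for j in range(len(values) - 2 - i):  # bug: should be len(values) - 1 - i
                  if values[j] > values[j + 1]:
                      values[j], values[j + 1] = values[j + 1], values[j]
          return values

      print(broken_sort([3, 1, 2]))  # [1, 3, 2] -- wrong, but not a "lie"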

    Neither can LLMs and other gen AI systems. They can produce wrong answers, but without consciousness there are no lies, and especially no hallucinations—breakdowns of one’s processing of information into a conscious representation of the world. Why is “hallucination” the term of choice, then? Because “We built a system which sometimes gives the wrong answers, and that’s ultimately our responsibility” isn’t a good business model. That admission would raise the standards to which the systems are held, whereas offloading agency* onto a system incapable of it is a convenient deflection.

    The common response to this is to point to benchmarks on which LLMs’ performance has been improving over time. In some cases there’s legitimacy to this, but often less so for questions of logic and fact. It has repeatedly been found that LLMs have already memorized many of the questions and answers from benchmarks, to the complete non-surprise of the many people aware of LLMs’ capacity for memorization and of the fact that benchmarks can be found online. Among the most striking results are the recent ones in which frontier LLMs were tested on the latest round of International Maths Olympiad questions before those questions could enter the corpus of training text.** The best model was Google’s Gemini, which gave correct, valid answers for ~25% of the questions. That is rather contradictory to prior claims of LLMs performing at the level of IMO silver medalists but, in fairness to Google, still significantly higher than the <5% success rate of the other LLMs tested.

    Ascribing false agency* to Room GPT allows the offloading and dismissal of the responsibility to make more reliable, trustworthy systems. Systems which prioritize correctness over sycophancy. Room GPT would often output misinformation for the commonly—and correctly—noted reason that it’s been trained to give answers people like. However, the problem goes deeper, into the properties of the statistical distribution of language from which such systems produce responses. The fact-oriented materials LLMs are trained on were by and large written by people who actually knew what they were talking about—perhaps excluding a large fraction of the Twitter and Facebook posts they might have had access to. Those authors knew their stuff, so of course that’s the posture adopted by Room GPT, that masterpiece of language mimicry. It gives answers as though it were one of those experts, even though it has just as little understanding as Searle would of Mandarin while working in his Chinese Room.

    False ascription of agency creates a mindset in which we absolve ourselves of responsibility for the systems we’re building. If we want to achieve AI’s full potential for good, especially in high-stakes domains like medicine and defense technology, we need to stop our own “hallucination” and get more serious about ensuring these systems return correct answers with significantly greater consistency.

    * Here I mean agency in the philosophical sense of being capable of independent, conscious decisions. This is very much distinct from the use of the term agents in the technical sense of allowing AI systems to complete tasks independently.

    ** Here’s the link to the study on LLMs’ performance on mathematics questions they couldn’t have seen before: https://arxiv.org/abs/2503.21934

    P.S. If you want some masterfully written yet unsettling discussions of these sorts of ideas, Peter Watts has a fantastic pair of novels called Blindsight and Echopraxia. Without spoiling anything in the plot, they ask the question: What if consciousness is an accidental inefficiency, an evolutionary bug which may eventually evolve away?

    May 1st, 2025

  • I think, therefore I prefer you not read my mind

    Josan Gonzalez’s cover art for the novel Neuromancer by William Gibson

    “When the past is always with you, it may as well be present; and if it is present, it will be future as well.”

    In Jack Womack’s afterword to the novel Neuromancer by William Gibson, he presents his view that the novel’s groundedness and depth come in part from the way it connects with timeless aspects and artifacts of human experience. That’s one of the many reasons Neuromancer is among my favorite novels: at its core the book tells a human story which happens to occur in the context of AI, virtual realities, and—above all—brain-computer interfaces (BCIs).

    Neurotechnology has been a passion of mine for over a decade, though it’s only in the past couple of years that I’ve had the opportunity to do research on the subject. If you aren’t aware of the changes happening in the field in recent decades, I have to tell you: it’s nothing short of extraordinary. The following is by no means an exhaustive list, and simply reflects the hardware and companies with which I’m personally more familiar. That said, they and many others are making immense progress.

    Existing approaches are being modified and significantly upgraded, like Paradromics’ impressively compact, pea-sized implant or Precision Neuroscience’s epidural microelectrode arrays, the latter of which recently set the record for the greatest number of electrical leads in an implantable BCI by networking multiple copies of the device together. Interesting twists on traditional electrical hardware are already well into human clinical trials, like Synchron’s Stentrode, a device which is implanted as a stent into large blood vessels in the brain and provides electrical readout and stimulation to the corresponding local region. Other technologies leverage entirely different branches of physics which have never before been used for clinical neurotechnology, such as focused ultrasound for brain stimulation, where Forest Neurotech—led by Sumner L Norman, Will Biederman, and Tyson Aflalo—and others have been making rapid progress. Another particularly inventive and exciting technology is that of NeuroBionics, where MJ Antonini and Nicolette Driscoll have built a company around their invention of microns-thin, flexible polymer wires which can safely be implanted into the blood vessels of the brain, capable of simultaneously delivering medications and performing optical stimulation, electrical stimulation, and electrical recording, all in the same device.

    BCIs already allow people to use computers, write text, and generate synthetic speech via nothing except measurements of their brain activity. Complexity and capabilities in this space are rocketing forward, and we’re approaching a point where BCIs capable of virtually whole-brain readout and modulation may become possible. The devices I listed above are also notable for being far safer to implant than traditional deep-brain stimulation hardware, and I expect the safety profile of BCIs to keep improving.

    This confluence of safety and capability means that BCIs will be used for more and more tasks by an ever-growing number of people. I doubt anyone can currently predict the full breadth of the positive impacts this will have on the lives of patients and their loved ones, as well as on society more broadly.

    However, amidst all the completely justified excitement regarding this progress, there’s a flip side to the technology which I virtually never see discussed: information security.

    Future BCIs—and to a significant degree current ones as well—will handle information representing the most private and intimate parts of who we are as human beings. Our thoughts, our very identities in their totality, will have new ways to interact with the world around us, for the first time in our species’ history expanding beyond the capabilities evolution itself provided to us. Devices capable of translating brain activity into text or speech are already, quite literally, reading some portion of a person’s thoughts. Yet whenever a piece of computing hardware holds valuable information, there is an incentive to steal that information.

    As the technology becomes safer and more powerful, and thus finds its way into more brains with access to more information in each brain, the value of methods for hacking these devices increases. For devices which both record and stimulate, there will also be the possibility of implanting information, though the potential precision and efficacy of such an action are currently unclear and hard to predict. Information richness makes all the difference. A hypothetical device which outputs the probability of someone having a seizure in the next minute won’t be providing data worth much of anything to potential hackers. A device which outputs your thoughts in detail, or is capable of modifying them with any non-trivial efficacy, is an almost incalculably valuable target for would-be brainjackers (to use the term from the research literature on the subject).

    My mention of Neuromancer is relevant beyond the shared themes: I was rereading the novel in late 2022 when I started to wonder what the mathematical features of such security problems would be. It turned out that in the roughly 15 years since the first paper on the subject, there had been shockingly little research overall. I’m talking about a couple dozen papers total on a subject which will soon be of immense practical, medical, and societal concern. Of those, only a couple had done any work studying the problem in terms of its computational and informational properties, which is at the core of figuring out how one could even begin developing security methods deserving of meaningful confidence. I’m not going to discuss my own work on the topic here, because my goal with this post is to generate awareness of and interest in neurotech security and neurotech as a whole. Neurosecurity is currently a massive gap in the scientific, engineering, mathematics, and computing literature, and we urgently need more people thinking about and working on it.

    Possibilities previously relegated to the realm of science fiction will soon begin impacting us in remarkable ways. Paralyzed patients regaining so much of what they’d lost to injury or disease. Improvement for brain injuries once completely untreatable. Better management of chronic and severe psychiatric conditions. Countless millions stand to experience profound and life-changing benefits. Inherently, these benefits carry with them the risk of dangers which until recently have also existed solely within the purview of sci-fi. Theft of human thought and identity at a neurological level. Alteration of the probability of someone making one decision over another, or of the odds their beliefs will change in different directions. Fraud via man-in-the-middle attacks targeting one’s brain or the portion of a device which converts the neural activity readings into a computer output. These are all closely akin to things we’ve seen before, in an astonishingly broad and sophisticated collection of methods developed for cyberattacks on all manner of computing devices. Hackers target everything from data centers and government labs to personal computers, robot vacuums, and even children’s toys. Believing that BCIs would somehow be exempt from this is completely absurd. As Jack Womack said, when the past is always with us, it will be our future as well.
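
    To ground the point that these threats are closely akin to familiar ones, here is a minimal, purely illustrative Python sketch of the standard message-authentication pattern that a man-in-the-middle attack on a device’s decoded output would have to defeat. The shared key and the decoded-text stream are hypothetical, and this is a sketch of a well-known defense from conventional computing rather than a claim about how any real implant works.

      import hashlib
      import hmac

      # Hypothetical key provisioned to both the implant and its receiver.
      SHARED_KEY = b"provisioned-at-implant-time"

      def sign(decoded_text: bytes) -> bytes:
          """Device side: attach an authentication tag to each decoded message."""
          return hmac.new(SHARED_KEY, decoded_text, hashlib.sha256).digest()

      def verify(decoded_text: bytes, tag: bytes) -> bool:
          """Receiver side: reject anything altered in transit."""
          expected = hmac.new(SHARED_KEY, decoded_text, hashlib.sha256).digest()
          return hmac.compare_digest(expected, tag)

      message = b"move cursor left"
      tag = sign(message)
      print(verify(message, tag))            # True  -- authentic output passes
      print(verify(b"transfer funds", tag))  # False -- tampered output is caught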

    Contemporary neurotechnology is already incredible, and it is progressing at an astonishing pace. This field will change countless lives in extraordinary ways, and it has the potential to change the world. We need to make sure that the science of how to keep these devices safe is ready before future threats become very present ones.

    Originally posted to LinkedIn, April 10, 2025