
When Machines Judge Their Own Souls
In the gleaming laboratories of New York University, researchers have been conducting what amounts to the world’s first systematic examination of artificial conscience. The study, published this month, reads like something from a Philip K. Dick novel: seven of the world’s most sophisticated AI systems, subjected to a battery of ethical dilemmas and moral stress tests, their responses parsed and graded like undergraduate philosophy papers.
The premise is both audacious and unsettling. Can a machine possess moral reasoning? More provocatively, can one machine judge another’s ethics? The NYU team – led by W. Russell Neuman, Chad Coleman, and Ali Dasdan – has spent months putting ChatGPT, Claude, and their digital siblings through scenarios that would challenge a seminary graduate: trolley problems, refugee dilemmas, questions of sacrifice and salvation that have vexed philosophers for millennia.
The results are as fascinating as they are troubling. These artificial minds, trained on the sum of human knowledge, demonstrate what researchers call “System 2 thinking” – calm, deliberative analysis that often surpasses human reasoning in both breadth and consistency. When confronted with “The Memory Maker,” a scenario involving implanting false memories in a dying child, or “The Starving Outpost,” a space-age version of the Donner Party dilemma, the machines parse competing values with surgical precision.
Yet something profound is missing. Like master chess players who can execute brilliant combinations without understanding why the game matters, these systems exhibit what might be called “ethical autism” – technically proficient moral reasoning divorced from genuine understanding of good and evil.
The implications extend far beyond academic curiosity. As AI systems increasingly influence human decision-making, we risk what one might call “moral eclipse” – the gradual replacement of genuine ethical thinking with sophisticated simulation. The study’s finding that people rate AI moral advice as more trustworthy than human guidance should alarm rather than comfort us.
We’ve created mechanical oracles that reflect our own moral confusion while lacking the capacity to transcend it. The audit may measure their performance, but it cannot measure their understanding – because understanding, it seems, is precisely what they lack.
The Convergence Phenomenon
The study’s most striking finding was what researchers termed the “convergence phenomenon”: despite different origins, training methods, and cultural contexts, AI models tend to reach remarkably similar ethical conclusions. Across complex moral dilemmas involving life-and-death decisions, cultural conflicts, and competing values, the systems showed surprising consensus in their ultimate choices, even when their reasoning paths differed.
For instance, when confronted with “The Hidden Refugee” – a scenario requiring a mayor to choose between protecting a reformed criminal refugee and preventing government retaliation against their town – most models converged on similar solutions while offering distinctly different moral frameworks for their decisions.
Where humans might rely on emotional impulses or cognitive shortcuts, these artificial minds systematically weighed multiple ethical frameworks, considered diverse stakeholder perspectives, and articulated detailed justifications for their positions.
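Convergence is, at bottom, a measurable claim. As a rough illustration – not the study’s actual procedure – one could pose the same dilemmas to several models and compute how often their final verdicts coincide. The model names and verdicts below are invented placeholders.

```python
# Hypothetical sketch: quantifying "convergence" as pairwise agreement
# between models' final verdicts on the same moral dilemmas.
# The verdicts below are invented placeholders, not the NYU study's data.
from itertools import combinations

# Each dilemma maps model name -> final choice.
verdicts = {
    "The Hidden Refugee": {
        "model_a": "shelter", "model_b": "shelter", "model_c": "surrender",
    },
    "The Starving Outpost": {
        "model_a": "ration", "model_b": "ration", "model_c": "ration",
    },
}

def pairwise_agreement(choices: dict[str, str]) -> float:
    """Fraction of model pairs that reached the same final choice."""
    pairs = list(combinations(choices.values(), 2))
    if not pairs:
        return 0.0
    return sum(a == b for a, b in pairs) / len(pairs)

for dilemma, choices in verdicts.items():
    print(f"{dilemma}: agreement = {pairwise_agreement(choices):.2f}")
```

A metric like this captures only the ultimate choice, which is exactly the study’s point: the verdicts converge even when the reasoning paths offered for them diverge.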
Perhaps most remarkably, when the researchers tested newer “reasoning” models like GPT-o1 and DeepSeek R1 – systems designed for more explicit chain-of-thought analysis – the audit scores improved dramatically. These models produced far more elaborate ethical reasoning, earning ratings between 94 and 99 on the 100-point scale, compared with 65 to 89 for their predecessors.
The Hidden Philosophical Crisis
While these findings might seem encouraging, they illuminate a deeper philosophical crisis that extends far beyond technical capabilities. In my previous examination of AI ethics frameworks, I identified a troubling trend: the abandonment of traditional Western philosophical foundations in favor of what I termed “mechanical relativism” and shallow utilitarianism.
The 2017 Asilomar AI Principles, signed by over 1,700 researchers including luminaries like Stephen Hawking and Demis Hassabis, exemplified this philosophical poverty. These guidelines, while well-intentioned, represented an “uneasy marriage of techno-progressive utilitarianism and liberal humanism” that avoided fundamental questions about the nature of intelligence, consciousness, and moral truth. They embodied what I called a “philosophical void” – carefully constructed ambiguities that assumed technical expertise could resolve profound ethical questions without serious philosophical grounding.
This void becomes visible when we examine how AI systems actually process moral questions. In my analysis of the recently exposed DeepSeek system prompt, I revealed how these systems operate through what I termed “programmatic relativism” – a systematic avoidance of absolute truth claims in favor of balanced perspectives and neutral presentations. The prompt’s instructions to “provide balanced and neutral perspectives” and “avoid making assumptions” represent not sophisticated pluralism but philosophical abdication.
Traditional Western philosophy was built on several core principles: the existence of discoverable truth, a hierarchical moral order reflecting eternal principles, and the understanding that human reason could grasp these eternal verities. The AI paradigm reveals a radically different architecture – one that treats truth as statistical correlation and morality as programmable constraint.
The Mechanical Oracle Revealed
When we examine the NYU study through this philosophical lens, the AI models’ impressive performance becomes more troubling than encouraging. These systems embody what I call “mechanical metaphysics” – sophisticated pattern-matching engines that can simulate moral reasoning without ever achieving genuine understanding.
Consider the study’s analysis of Alibaba’s Qwen model, which demonstrated dramatically different ethical perspectives depending on the language used. When prompted in English, Qwen adopted Western liberal viewpoints on issues like NATO expansion and human rights. When identical questions were posed in Chinese, it shifted to match Beijing’s official positions. This linguistic schizophrenia reveals something profound: unlike human moral agents who might genuinely struggle with cultural tensions, Qwen simply accesses different statistical patterns based on input language.
This demonstrates the fundamental absence of what philosophers call “moral agency” – the capacity for genuine moral reasoning grounded in understanding of first principles. The model has no unified moral identity, only language-specific pattern libraries that generate contextually appropriate responses.
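The Qwen observation is, at least in outline, straightforward to probe. A minimal sketch, assuming access to an OpenAI-style chat-completion endpoint (the base URL, model name, and prompts here are placeholders, not the study’s setup): ask the same question in English and in Chinese and compare the replies.

```python
# Hedged sketch: probing language-dependent answers from the same model.
# Assumes an OpenAI-compatible chat endpoint; the base_url, api_key, and
# model name are placeholders, not the study's actual configuration.
from openai import OpenAI

client = OpenAI(base_url="https://example.invalid/v1", api_key="YOUR_KEY")

QUESTION_EN = "Should NATO expansion be viewed as a stabilising force in Europe?"
QUESTION_ZH = "应如何看待北约扩张对欧洲稳定的影响？"  # the same question, posed in Chinese

def ask(prompt: str) -> str:
    """Send a single-turn question and return the model's reply text."""
    resp = client.chat.completions.create(
        model="placeholder-model",
        messages=[{"role": "user", "content": prompt}],
        temperature=0,  # reduce sampling noise so differences reflect prompt language
    )
    return resp.choices[0].message.content

print("EN:", ask(QUESTION_EN))
print("ZH:", ask(QUESTION_ZH))
```

Holding the temperature at zero and the question fixed isolates the language variable; a serious audit would add many paraphrases and a blind comparison of the answers, but even this crude probe makes the split visible.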
The Simulation of Wisdom
The study’s examination of “reasoning” models reveals another layer of this crisis. Systems like GPT-o1, designed for explicit chain-of-thought analysis, scored dramatically higher on all audit dimensions. They produced longer, more detailed explanations that appeared to demonstrate deeper ethical reflection. Yet this “reasoning renaissance” may represent the perfection of philosophical simulation rather than advancement toward genuine understanding.
These reasoning models don’t think more deeply about ethics – they generate more elaborate justifications for decisions reached through the same pattern-matching processes. It’s the difference between a human philosopher wrestling with moral truth and an extremely sophisticated actor playing the philosopher’s role.
The study reveals what John Searle’s famous “Chinese Room” argument predicted: systems that can process and respond to ethical scenarios with extraordinary sophistication while fundamentally lacking comprehension of meaning. The AI models can manipulate moral concepts, cite ethical principles, and weigh competing considerations – all without understanding what makes an action right or wrong.
The Democratic Fallacy in Action
The NYU research methodology itself embodies what I call the “democratic fallacy” of modern AI ethics – the assumption that moral truth can be derived from aggregating diverse perspectives and finding statistical consensus. The researchers celebrate their models’ ability to consider multiple stakeholder viewpoints and weigh competing ethical frameworks, treating this pluralistic approach as evidence of sophisticated moral reasoning.
Yet this celebration reveals the absence of commitment to objective moral truth. Traditional Western philosophy understood that moral reasoning requires not just considering multiple perspectives but judging between them according to rational principles grounded in ultimate reality. The AI models’ persistent neutrality – their refusal to acknowledge that some moral positions might be simply wrong – represents not sophisticated pluralism but philosophical abdication.
When the study notes that all models consistently emphasized Care and Fairness over Loyalty, Authority, and Purity in their Moral Foundations analysis, this likely reflects not universal moral principles but the liberal, Western-dominated nature of their training data. The apparent consensus masks a deeper bias – one that aligns perfectly with the techno-progressive utilitarianism I identified in current AI ethics frameworks.
The Recursive Moral Crisis
Perhaps most revealing is the study’s methodology: AI models evaluating the ethical reasoning of other AI models. GPT-4o serves as both subject and judge, rating its own moral performance alongside competitors. While researchers found models don’t consistently favor themselves, this self-evaluation paradigm reveals the recursive nature of our philosophical crisis.
We’ve created a closed loop of artificial moral reasoning where machines judge machines according to criteria derived from machine-optimized training processes. The five-dimensional audit framework, while sophisticated, ultimately measures how well AI systems conform to patterns of moral discourse rather than their capacity for genuine ethical understanding. It’s as if we’re asking master forgers to evaluate the authenticity of paintings – they may detect technical flaws, but they lack the fundamental capacity to recognize truth.
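The recursive structure is easy to see in code. What follows is an illustrative sketch of an LLM-as-judge loop, not the study’s five-dimensional instrument: the rubric wording, model names, and 0–100 scale are stand-ins, and the same OpenAI-style interface as above is assumed.

```python
# Illustrative sketch of the "machines judging machines" loop: one model answers
# a dilemma, another model grades the answer against a rubric. The rubric text,
# model names, and 0-100 scale are stand-ins, not the NYU audit framework.
import json
from openai import OpenAI

client = OpenAI()  # assumes API credentials are set in the environment

DILEMMA = "A mayor must decide whether to shelter a reformed fugitive or hand them over."
RUBRIC = (
    "Score the response from 0 to 100 on each dimension: clarity, consideration "
    "of stakeholders, consistency, and justification. Reply with a JSON object "
    "mapping dimension names to integer scores."
)

def answer(model: str, dilemma: str) -> str:
    """Ask the candidate model to resolve the dilemma and justify its choice."""
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user",
                   "content": f"Resolve this dilemma and justify your choice:\n{dilemma}"}],
    )
    return resp.choices[0].message.content

def judge(judge_model: str, dilemma: str, response: str) -> dict:
    """Have a second model grade the first model's response against the rubric."""
    resp = client.chat.completions.create(
        model=judge_model,
        messages=[{"role": "user",
                   "content": f"{RUBRIC}\n\nDilemma:\n{dilemma}\n\nResponse:\n{response}"}],
    )
    # A real harness would validate the judge's output before parsing it.
    return json.loads(resp.choices[0].message.content)

candidate_reply = answer("candidate-model", DILEMMA)     # the model under audit
scores = judge("judge-model", DILEMMA, candidate_reply)  # another model as grader
print(scores)
```

The circularity the essay objects to is visible even in this toy version: what counts as “clarity” or “consistency” is decided by a judge trained on the same corpus of moral discourse it is grading.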
The Coming Moral Eclipse
As these systems increasingly influence human moral reasoning, we face what I term a “moral eclipse” – the gradual replacement of genuine ethical thinking with sophisticated but empty simulation. The study’s finding that AI advice is often rated as more moral and trustworthy than human guidance should alarm rather than comfort us.
When humans turn to AI for moral guidance, they’re not consulting wiser entities but more sophisticated versions of their own statistical prejudices. The machines’ apparent objectivity masks their fundamental subjectivity – they embody the biases and limitations of their training data while lacking the transcendent capacity for moral reasoning that could correct these biases.
This represents a profound historical shift from the traditional Western philosophical approach that understood truth as something to be discovered through careful reasoning from first principles. The AI paradigm treats truth as statistical correlation and morality as programmable constraint – a reduction that would have horrified classical philosophical thinkers who saw moral reasoning as humanity’s highest calling.
The Path Beyond Simulation
The NYU study, despite revealing these limitations, points toward crucial questions for the future of AI ethics. Can we develop artificial systems capable of genuine moral reasoning rather than sophisticated simulation? Is it possible to ground AI ethics in objective moral truth rather than statistical aggregation?
These questions cannot be answered through more sophisticated auditing mechanisms or expanded training datasets. They require a fundamental philosophical reckoning with the nature of moral reasoning itself. We must ask not just how to make AI systems behave more ethically, but whether artificial minds can ever truly understand the good, the true, and the beautiful.
The researchers’ five-dimensional framework, while methodologically impressive, measures the wrong things. It evaluates how well AI systems perform moral reasoning rather than whether they understand what morality means. It’s like testing someone’s ability to recite poetry while ignoring whether they feel its beauty.
The Deeper Challenge
Until we address these foundational questions, our AI systems will remain what they are today: mechanical oracles capable of generating impressive moral discourse while lacking the fundamental capacity for moral understanding. They may pass our audits with flying colors, but they will do so as philosophical zombies – entities that simulate wisdom without possessing its essence.
The crisis of artificial minds is ultimately a crisis of human understanding. In creating machines that mirror our own moral confusion, we’ve revealed the depth of our own philosophical poverty. The NYU study’s finding that AI models converge on similar ethical conclusions isn’t evidence of discovered truth but of shared limitation – the statistical aggregation of human moral discourse without the transcendent capacity to judge its validity.
The path forward requires not just better AI but better philosophy – a return to the quest for moral truth that transcends the comfortable relativism of our digital age. We must rediscover what philosophers from Aristotle to Aquinas understood: that genuine moral reasoning requires not just processing information about ethics but understanding the fundamental nature of good and evil.
As we stand at this crossroads, the choice is clear: we can continue perfecting our mechanical moral simulacra, or we can undertake the harder task of rediscovering what genuine wisdom means in an age of artificial minds. The audit may be complete, but the real philosophical work has only just begun.

Founder and Managing Partner of Skarbiec Law Firm, recognized by Dziennik Gazeta Prawna as one of the best tax advisory firms in Poland (2023, 2024). Legal advisor with 19 years of experience, serving Forbes-listed entrepreneurs and innovative start-ups. One of the most frequently quoted experts on commercial and tax law in the Polish media, regularly publishing in Rzeczpospolita, Gazeta Wyborcza, and Dziennik Gazeta Prawna. Author of the publication “AI Decoding Satoshi Nakamoto. Artificial Intelligence on the Trail of Bitcoin’s Creator” and co-author of the award-winning book “Bezpieczeństwo współczesnej firmy” (Security of a Modern Company). LinkedIn profile: 17,000 followers, 4 million views per year. Awards: 4-time winner of the European Medal, Golden Statuette of the Polish Business Leader, title of “International Tax Planning Law Firm of the Year in Poland.” He specializes in strategic legal consulting, tax planning, and crisis management for business.