Today, most AI is being built on blind faith inside black boxes. It requires users to place unquestioning belief in something neither transparent nor understandable.
The industry is moving at warp speed, employing deep learning to tackle every problem, training on datasets that few people can trace, and hoping no one gets sued. The most popular AI models are developed behind closed doors, with unclear documentation, vague licensing, and limited visibility into the provenance of training data. It’s a mess—we all know it—and it’s only going to get messier if we don’t take a different approach.
This “train now, apologize later” mindset is unsustainable. It undermines trust, heightens legal risk, and slows meaningful innovation. We don’t need more hype. We need systems where ethical design is foundational.
The only way we will get there is by adopting the true spirit of open source and making the underlying code, model parameters, and training data available for anyone to use, study, modify, and distribute. Increasing transparency in AI model development will foster innovation and lay a stronger foundation for civic discourse around AI policy and ethics.
Open source transparency empowers users
Bias is a technical inevitability in the architecture of current large language models (LLMs). To some extent, the entire process of “training” is nothing but computing the billions of micro-biases that align with the contents of the training dataset.
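To make that point concrete, here is a minimal, hypothetical sketch (not any vendor’s actual training code) of the smallest possible “language model.” Its learned parameters are literally counts taken from the training corpus, so every bias in its output traces directly back to the data it was shown.

```python
# Toy illustration: a model whose parameters are nothing more than
# statistics of its training corpus.
from collections import defaultdict

corpus = "to be or not to be that is the question".split()

# "Training": count how often each word follows another.
# The resulting table *is* the model; its biases mirror the dataset exactly.
counts = defaultdict(lambda: defaultdict(int))
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def predict(word):
    """Return the most likely next word according to the training data."""
    followers = counts.get(word)
    if not followers:
        return None
    return max(followers, key=followers.get)

print(predict("to"))  # -> "be", because that is what the corpus contains
```

Real LLMs replace the counting with gradient descent over billions of parameters, but the principle is the same: change the dataset and you change the model’s “values.”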
If we want to align AI with human values, instead of fixating on the red herring of “bias,” we must have transparency around training. The source datasets, fine-tuning prompts and responses, and evaluation metrics will reveal precisely the values and assumptions of the engineers who create the AI model.
Consider a high school English teacher using an AI tool to summarize Shakespeare for literary discussion guides. If the AI developer sanitizes the Bard for modern sensibilities, filtering out language they personally deem inappropriate or controversial, they’re not just tweaking output—they’re rewriting history.
It is impossible to make an AI system tailored for every single user. Attempting to do so has led to the recent backlash against ChatGPT for being too “sycophantic.” Values cannot be unilaterally determined at a low technical level, and certainly not by just a few AI engineers. Instead, AI developers should provide transparency into their systems so that users, communities, and governments can make informed decisions about how best to align the AI with societal values.
Open source will foster AI innovation
Research firm Forrester has stated that open source can help firms “accelerate AI initiatives, reduce costs, and increase architectural openness,” ultimately leading to a more dynamic, inclusive tech ecosystem.
AI models consist of more than just software code. In fact, most models’ code is very similar. What uniquely differentiates them are the input datasets and the training regimen. Thus, an intellectually honest application of the concept of “open source” to AI requires disclosure of the training regimen as well as the model source code.
The open-source software movement has always been about more than just its tech ingredients. It’s about how people come together to form distributed communities of innovation and collective stewardship. The Python programming language—a foundation for modern AI—is a great example. Python evolved from a simple scripting language into a rich ecosystem that forms the backbone of modern data processing and AI. It did this through countless contributions from researchers, developers, and innovators—not corporate mandates.
Open source gives everyone permission to innovate, without installing any single company as gatekeeper. This same spirit of open innovation continues today, with tools like Lumen AI, which democratizes advanced AI capabilities, allowing teams to transform data through natural language without requiring deep technical expertise.
The AI systems we’re building are too consequential to stay hidden behind closed doors and too complex to govern without collaboration. However, we will need more than open code if we want AI to be trustworthy. We need open dialogue among the enterprises, maintainers, and communities these tools serve because transparency without ongoing conversation risks becoming mere performance. Real trust emerges when those building the technology actively engage with those deploying it and those whose lives it affects, creating feedback loops that ensure AI systems remain aligned with evolving human values and societal needs.
Open source AI is inevitable and necessary for trust
Previous technology revolutions like personal computers and the Internet started with a few proprietary vendors but ultimately succeeded based on open protocols and massively democratized innovation. This benefited both users and for-profit corporations, although the latter often fought to keep things proprietary for as long as possible. Corporations even tried to give away closed technologies “for free,” under the mistaken impression that cost is the primary driver of open source adoption.
A similar dynamic is happening today. There are many free AI models available, but users are left to wrestle with questions of ethics and alignment around these opaque, black-box models. For societies to trust AI technology, transparency is not optional. These powerful systems are too consequential to stay hidden behind closed doors, and the innovation space around them will ultimately prove too complex to be governed by a few centralized actors.
If proprietary companies insist on opacity, then it falls upon the open source community to create the alternative.
AI technology can and will follow the same commoditization trajectory as previous technologies. Despite all the hyperbolic press about artificial general intelligence, there is a simple, profound truth about LLMs: the algorithm that turns a digitized corpus into a thought machine is straightforward and freely available. Anyone can do this, given compute time. There are very few secrets in AI today.
Open communities of innovation can be built around the foundational elements of modern AI: the source code, the computing infrastructure, and, most importantly, the data. It falls upon us, as practitioners, to insist on open approaches to AI, and to not be distracted by merely “free” facsimiles.
Peter Wang is chief AI and innovation officer at Anaconda.