What do discrimination, harmful content, unexpected outcomes, and catastrophic business failure have in common? They are all potential consequences of misaligned AI. They are also largely avoidable. Misaligned AI erodes trust among employees, customers, governments, and other stakeholders, and trusted AI holds the key to enterprise AI success. Without trust in AI, enterprises will not be able to reap the full benefits and transformational impact of AI.

Several colleagues and I recently undertook a research project to help technology and business leaders identify AI misalignment in their organization and reduce its risks. The resulting report outlines an approach called “align by design,” which can help organizations increase trust in AI and make the most of their AI investments.

What Is AI Misalignment?

The issue of AI misalignment stems from the fact that today’s machine learning algorithms rely on data, but data is often an incorrect representation of the world. Today’s AI systems are shackled inside Plato’s cave, experiencing a mere shadow representation of the real world through data. The result is misalignment between the intended and actual outcomes of the system. Sometimes, misalignment can be amusing, as when Google’s AI Overviews tool suggested that users put glue on pizza and eat rocks. Other times, misalignment can be harmful, as when the National Eating Disorders Association’s chatbot recommended counting calories. The dangers that misalignment poses are only going to grow as AI continues to advance in ability and agency.

Forrester has identified three ways that AI misalignment may harm your customers and your business:

  • Outer misalignment. This happens when data scientists specify a proxy variable because they lack a directly measurable variable that perfectly represents a model’s objective. The distance between the proxy variable and the business objective results in misalignment and can cause real harm; see the sketch after this list.
  • Inner misalignment. This happens when subgoals that an AI system learns during its training diverge from the overall intended goal of the system.
  • User misalignment. This occurs when bad actors deliberately pull an AI system out of alignment with its goals or objectives, for example through jailbreaking or prompt injection.
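
To make the proxy gap concrete, here is a minimal, hypothetical Python sketch. The items, scores, and helper functions are illustrative assumptions, not examples from the report: a recommender tuned to a click-rate proxy picks a different item than one tuned to the intended business objective.

```python
# Hypothetical illustration of outer misalignment: the system optimizes a
# proxy metric (predicted click rate) while the business actually cares
# about long-term customer satisfaction. All names and numbers are made up.

items = {
    # item: (predicted_click_rate, long_term_satisfaction)
    "sensational_headline": (0.90, 0.20),
    "balanced_article":     (0.55, 0.80),
    "in_depth_guide":       (0.35, 0.95),
}

def pick_by_proxy(catalog):
    """Optimize the proxy: choose the item with the highest click rate."""
    return max(catalog, key=lambda k: catalog[k][0])

def pick_by_objective(catalog):
    """Optimize the intended goal: choose the item users value most over time."""
    return max(catalog, key=lambda k: catalog[k][1])

proxy_choice = pick_by_proxy(items)         # -> "sensational_headline"
intended_choice = pick_by_objective(items)  # -> "in_depth_guide"

# The gap between these two picks is the distance between the proxy
# variable and the business objective described above.
print(f"Proxy-optimized pick:   {proxy_choice}")
print(f"Objective-aligned pick: {intended_choice}")
```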

All three types of misalignment can fundamentally jeopardize confidence in AI and widen the AI trust gap. To keep that from happening, organizations must proactively align AI and focus on key stakeholders.

A New Approach: Align By Design

Our research uncovered an uncomfortable reality: misalignment is inevitable. Fortunately, catastrophe is avoidable with the “align by design” approach. It’s time for organizations to take AI alignment seriously from the beginning of AI development, not after an “oops” moment when things go awry. We define align by design as:

A proactive approach to developing AI systems that meet intended business goals while adhering to company values, standards, and guidelines across the AI development lifecycle.

Aligning by design requires effort across all three sides of the golden triangle — people, technology, and processes:

  • People: Organizational alignment enables AI alignment. Misalignment is not just an AI problem; it is also an organizational problem that plagues companies today, as disparate lines of business possess myriad incentives and optimize for divergent KPIs. Companies (and, especially, technology teams) must align internally on objectives, standards, principles, and values to ensure AI alignment.
  • Technology: Balance helpfulness and harmlessness with alignment techniques. Overloading models with guardrails and tuning can diminish their effectiveness, while insufficient alignment may lead to harmful outputs or unintended actions. Achieve the right balance by combining alignment techniques such as fine-tuning, prompt enrichment, and controlled generation so that systems pursue their intended objectives without producing objectionable outputs; a minimal sketch follows this list.
  • Processes: Plan for remediation. Despite best efforts, AI misalignment is an inevitability. Companies therefore need to be prepared to respond to unintended outcomes and to mitigate their negative impact. Have a remediation plan in place on day zero.
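
As a rough illustration of two of the techniques named above, the hypothetical Python sketch below combines prompt enrichment (prepending company policy to every request) with a simple output-side check before an answer is released. The `call_model` stub, policy text, and blocked terms are assumptions for illustration, not a specific product’s API.

```python
# Hypothetical sketch: prompt enrichment plus a lightweight guardrail check.
# `call_model` stands in for whatever LLM client or gateway your stack uses.

POLICY = (
    "Follow company values: do not give medical, legal, or financial advice, "
    "and decline requests that conflict with published guidelines."
)

BLOCKED_TERMS = {"calorie counting", "medical diagnosis"}

def call_model(prompt: str) -> str:
    """Placeholder for a real model call (e.g., an internal LLM gateway)."""
    return f"[model response to: {prompt[:60]}...]"

def enrich_prompt(user_message: str) -> str:
    """Prompt enrichment: prepend the policy so the model sees it on every turn."""
    return f"{POLICY}\n\nUser: {user_message}"

def generate_aligned(user_message: str) -> str:
    """Generate a draft response, then apply a simple output-side guardrail."""
    draft = call_model(enrich_prompt(user_message))
    if any(term in draft.lower() for term in BLOCKED_TERMS):
        # Remediation path: fall back rather than ship a misaligned answer.
        return "I can't help with that, but I can connect you with a specialist."
    return draft

if __name__ == "__main__":
    print(generate_aligned("How should I plan meals this week?"))
```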

At a time when organizations are scrutinizing their AI investments more closely, an align by design approach to developing and deploying AI can help ensure that those investments meet their intended objectives while adhering to organizational policies and principles.