Executive summary
The Hiroshima AI Process Reporting Framework represents a multilateral effort to promote transparency and accountability in the development of advanced AI systems. Launched in February 2025 as part of the Group of Seven (G7) Hiroshima AI Process (HAIP), the framework provides a mechanism for organizations developing advanced AI systems to voluntarily report on their risk mitigation practices, aiming to facilitate cross-organizational comparability and to identify and disseminate good practices. The multilateral agreement on the HAIP framework and the wide participation by companies suggest that the framework can be an important advance in international AI governance. It can propel improvements in risk mitigation by giving AI companies insights into each other’s governance practices, allowing enterprise customers to compare AI offerings, and establishing benchmarks of good practice for researchers and, eventually, policymakers.
This report examines the framework’s first reporting cycle through analysis of submissions, stakeholder surveys, and targeted interviews. The analysis reveals that the framework’s flexibility enables diverse participation and provides internal benefits to submitters. At the same time, its current format limits comparability across submissions and creates challenges around clarity of purpose. We also find that while HAIP’s positioning in international governance efforts makes it a uniquely appealing forum for building consensus around risk mitigation practices, the framework’s potential impact can be enhanced by raising awareness, clarifying the target audience for reports, and strengthening information verification. Based on these findings, we provide recommendations for enhancing future iterations of the framework.
Brief history of the Hiroshima AI Process
Launched at the May 2023 Group of Seven (G7) Hiroshima Summit in response to the rapid proliferation of generative AI models, the Hiroshima AI Process (HAIP) aims to promote transparency around the opportunities and risks presented by advanced AI systems and serve as a forum for collaboration on governance challenges. The process reflects core G7 commitments to the rule of law, privacy, human rights, accountability, and transparency, supporting discussions on critical issues including intellectual property protection, countermeasures against foreign information manipulation, and responsible AI use.
In a series of convenings following the Hiroshima Summit, G7 digital and tech ministers adopted HAIP’s Comprehensive Policy Framework in December 2023. This framework, which was subsequently endorsed by G7 leaders, comprises several complementary elements, including:
- The International Guiding Principles, which apply to all AI actors and cover the design, development, deployment, and use of advanced AI systems. They are designed to be consistent with and build upon pre-existing international principles.
- The International Code of Conduct, which provides more detailed, voluntary guidance for developers of advanced AI systems. It encourages organizations to take 11 actions across the AI lifecycle, including risk identification and mitigation, public reporting, content authentication, and the development of technical standards.
The Trento Declaration of March 2024 marked a step toward implementing and operationalizing HAIP’s Comprehensive Policy Framework, committing the G7 to developing monitoring mechanisms for the voluntary application of the International Code of Conduct with the support of the Organisation for Economic Co-operation and Development (OECD). In response, the OECD developed the Reporting Framework for the Code of Conduct, which was launched at the Paris AI Action Summit in February 2025, following a multistakeholder development process that included a pilot phase in summer 2024 with feedback from industry, academia, and civil society.
The reporting framework enables organizations to publish public-facing reports on their AI governance and risk management practices aligned with the Code of Conduct. The OECD, which serves as the framework’s primary administrative engine, receives and publishes the reports and analyzes the aggregated data to inform future iterations of the framework and related policy discussions.1
In parallel with the OECD’s launch of the reporting framework, G7 countries have worked to broaden the reach of HAIP through the HAIP Friends Group, which has grown to include 57 countries and one region that support the principles of safe, secure, and trustworthy AI. This effort was later extended through the HAIP Partners Community, comprising 27 international organizations and private companies that support Friends Group governments in implementing HAIP and encourage AI developer participation in the reporting framework.
What is included in the HAIP Reporting Framework
The HAIP Reporting Framework was launched to “facilitate transparency and comparability of risk mitigation measures and contribute to identifying and disseminating good practices” among organizations developing advanced AI systems, with the goal of supporting the development of safe, secure, and trustworthy AI systems. Structured around seven thematic sections, the framework addresses responsible AI development practices through 39 questions (two to eight per section) across the following areas:
- Risk identification and evaluation: How do organizations identify, assess, and evaluate risk through testing, incident reporting, and stakeholder collaboration?
- Risk management and information security: What steps do organizations take to address security vulnerabilities, privacy incidents, and related risks?
- Transparency reporting on advanced AI systems: Do organizations publish clear reports or documentation on capabilities, domains, and limitations of AI systems for downstream users?
- Organizational governance, incident management, and transparency: What do organizational governance policies, risk management practices, and public communication about risk management look like within organizations?
- Content authentication and provenance mechanisms: How do organizations inform users that they are interacting with AI-generated content?
- Research and investment to advance AI safety and mitigate societal risks: How do organizations advance, collaborate, and invest in research related to AI safety and risk mitigation?
- Advancing human and global interests: How do organizations support efforts to maximize the benefits of AI and improve user understanding of AI systems?
Participation is voluntary and open to organizations developing or deploying2 advanced AI systems, which the OECD defines as machine-based systems that infer how to generate outputs (such as predictions, content, recommendations, or decisions that influence physical or virtual environments) from the inputs they receive. Eligible organizations must be based in OECD member countries, Global Partnership on AI (GPAI) member countries, or jurisdictions adhering to the OECD Recommendation on Artificial Intelligence. Responses to the framework questions are designed to go beyond yes-or-no answers, with organizations expected to explain their methodologies, describe their governance structures, detail their collaborative approaches, and provide concrete information about their policies and practices.
The first reporting period deadline was mid-April 2025, with rolling acceptance thereafter. At the beginning of May 2025, the OECD published the first 19 reports submitted under the framework. By late November 2025, five additional organizations had submitted reports, bringing the total to 24 submissions.
The OECD has encouraged participating organizations to update their reports annually. These submissions serve as an important transparency mechanism, allowing organizations to publicly demonstrate their alignment with the HAIP Code of Conduct. As the number and diversity of submissions increase, the framework has the potential to support more systematic comparison across organizations and over time.
How the HAIP Reporting Framework complements other international governance, documentation, and disclosure efforts
The HAIP Reporting Framework exists alongside many other documentation and reporting frameworks, each intended to encourage responsible conduct and shed light on different dimensions of AI development and deployment. Different governance processes operate at different levels of the stack and address different focus areas, together forming a multi-layered governance structure that supports responsible AI deployment and risk disclosure. While many risk management and documentation frameworks have focused on standardizing the process of capturing and communicating information about specific AI artifacts such as datasets, models, systems, and governance processes,3 the HAIP framework aims to provide a bird’s-eye view of AI development and deployment behavior at an institutional level.
Institutional reporting can include high-level information about how organizations source and annotate training or evaluation data, how they identify and prioritize risks, and general details about the safety, privacy, legal, or ethical assessments conducted. It can also describe particular or general risk mitigation strategies adopted, how the institution engages with stakeholders and third parties, and the contours of organizational governance structures. These insights can reveal information not immediately evident from documentation about specific AI systems or their individual components because they link individual technical artifacts to the broader governance processes that shape how these systems are designed, deployed, and overseen. They highlight some of the decisions that may have informed the organization’s development or adoption of AI systems and help observers ascertain an organization’s overall approach to AI governance and its maturity in this area. Finally, they facilitate the observation of important cross-industry patterns that might otherwise require substantial effort to reveal.
The HAIP framework was intentionally designed to prevent duplication of effort for global organizations by encouraging alignment with multiple risk management systems and prioritizing consistency with other international governance mechanisms. It advances a theory of change for global AI governance that ties the reporting process to desired outcomes around safe, secure, and trustworthy AI, as articulated in the HAIP Code of Conduct. The framework serves as a link between high-level governance principles and operational practices and exists within a dynamic landscape of efforts that encourage companies to disclose and document their risk management. These efforts each target different audiences and goals.
Importantly, internal documentation of risk management processes and public disclosures about those processes should be understood as distinct activities, each with different utility. While documentation is essential for effective AI governance, its success depends on how well organizations tailor their documentation approaches to meet the diverse needs of stakeholders, including technical teams, organizational leaders, policymakers, users, and other downstream consumers of the documentation. Internal documentation, reflecting on the development of data, models, or systems, can facilitate the identification of potential risks and impacts, the adoption of mitigation methods, and the promotion of a risk-aware organizational culture. Public disclosure of such processes typically includes less detail. On its own, public reporting is unlikely to assure thorough and effective adoption of risk management practices. Yet it can offer informative signals about the risks an organization has identified and prioritized, reveal its general orientation toward mitigating those risks, and facilitate comparison of how organizations are navigating important questions raised by AI development and deployment. As a result, it can help organizations learn from each other and help customers, researchers, and policymakers gain insights into industry-wide approaches as well as the unique approaches of individual organizations. Public reporting can also help organizations converge on shared terminology, definitions, and evaluations.
Methodology and contribution
The completion of the first reporting cycle provides an opportunity to review the framework’s strengths and explore areas for improvement. Several studies have examined the HAIP reporting framework from different angles, each contributing valuable insights while leaving gaps that our work aims to address.
The OECD published a comprehensive analysis in September 2025, synthesizing practices across the seven thematic areas based on the first 20 submissions. This work provided important descriptive insights into how organizations address risk identification, transparency, and governance. In parallel, researchers at the University of Tokyo conducted interviews with 11 participating organizations (primarily Japanese companies, along with four U.S. firms), followed by a stakeholder consultation meeting. Their research revealed diverse motivations for participation. It also identified concerns about the flexibility-standardization trade-off and objections to a system the authors proposed for ranking or scoring reports. Both studies were limited to the perspectives of organizations that chose to participate in and complete the process; as a result, they primarily focused on the experiences and challenges of submitters rather than the broader landscape of potential participants. Additional blog posts and commentary pieces have provided accessible summaries of these findings for broader audiences.
Our report builds on and extends prior work by soliciting feedback and providing policy recommendations to inform ongoing efforts to strengthen the reporting framework. To do so, it uses multiple methods to assess the framework’s impact thus far and to offer recommendations for future iterations.
As a first step, we compiled and reviewed the 24 submissions from participating organizations and synthesized findings from existing analyses of these submissions.4 To complement this review, we conducted a survey of the AI community, seeking to understand awareness of the HAIP framework among the AI community, identify what information stakeholders find most valuable, and learn what they feel is missing.
Where earlier analyses primarily drew on the experiences and insights of submitters, we reached out to policymakers, civil society organizations, and researchers in addition to AI developers and deployers. This broader involvement includes the voices of stakeholders who shape, implement, and monitor AI governance frameworks. Their insights can reveal how HAIP and its reporting framework function within the broader governance ecosystem.
We also deliberately sought input from individuals who were less familiar with HAIP, organizations that observed or advised on the framework’s development, and organizations familiar with HAIP that did not submit reports. This provided context for understanding the broader community’s awareness of HAIP, barriers to participation, and strategies to minimize these barriers in future iterations of the framework.5
Depending on respondents’ level of familiarity with HAIP and the framework and whether the organization they represent submitted a report, the survey included between 8 and 20 questions covering the report submission process, including their organizations’ thoughts on the logistical aspects of submitting to the framework; the content requested in the report, including incentives to submit and the ease of completing each section; and the framework’s perceived impact, including how organizations use the questions and responses in the framework to drive decisionmaking. The survey questions were primarily open-ended, allowing respondents to share their feedback broadly rather than be limited to our assumptions about each area of inquiry.
We received 110 total responses to the survey. Among respondents who shared information on their organizational affiliation, researchers were the most represented group (Figure 1). We also received responses from public interest actors, deployers, developers, and policymakers.
Finally, we supplemented our survey findings with targeted interviews, including representatives involved in designing the framework, civil society representatives, and practitioners from major developers and deployers, in order to understand the value of the framework, current submission challenges, and unique considerations for different stakeholder populations. The findings presented below reflect a synthesis of these three distinct inputs: (1) individual company reports and existing HAIP framework syntheses; (2) survey results; and (3) supplementary interviews. Taken together, these sources provide evidence to inform the findings and recommendations that follow.
Findings
The HAIP Reporting Framework’s flexibility is both an asset and an obstacle
The HAIP Reporting Framework was designed to balance flexibility and uniformity. For example, it prescribes no page limits or structured format. It leaves questions open-ended for respondents to provide as much or as little detail as they are able to share. Organizations determine their reporting scope, whether to focus on specific models, company-wide practices, or a hybrid approach, and they can choose their level of technical detail based on their position in the AI value chain and target audience.
This open reporting framework can accommodate the diversity of the AI ecosystem, including differences in organizational size, geography, maturity, and business model, allowing each prospective submitter to participate in a meaningful way, rather than forcing a one-size-fits-all approach that may be irrelevant, impossible, or misleading. The first round of reports submitted to the framework reflected this diversity: Of the 24 organizations that submitted, nine are from Japan, seven from the United States, two each from Canada and Germany, and one each from Denmark, Israel, Romania, and South Korea. Using the OECD’s classification system for enterprises by business size, submitters included 18 large enterprises (250 or more employees), three medium-sized enterprises (50 to 249 employees), and three micro enterprises (fewer than 10 employees).6 Submitters also spanned the AI value chain, from foundation model developers and cloud platform providers to telecommunications operators, IT services and systems integrators, AI-specialized startups and research organizations, and data service providers.
These distinct organizations took different approaches in their responses to the framework. Submissions varied in length from 9 to 60 pages, with one-third making extensive use of hyperlinks to external sources. Not all organizations found every question relevant to their operations: Several marked multiple questions as “not applicable,” and others provided limited responses to sections focused on model training, research investments, or technical safeguards that fell outside their scope. Approximately one-third of submissions relied on cross-references to their own earlier responses within the report rather than providing standalone answers to each question.
Although more companies, and different types of companies, may be incentivized to respond to a reporting framework with built-in flexibility, this flexibility can limit the comparability of responses across submissions. According to feedback provided to the OECD, companies reported uncertainty about whether to focus on organizational practices or specific AI systems, and whether to write from a developer or deployer perspective. Organizations split between reporting company-wide approaches (13) and hybrid strategies that combined specific model documentation with broader policies (10), with one taking a more model-specific approach. Eight wrote primarily from a developer perspective, six from a deployer perspective, and the remainder adopted hybrid or other approaches. Even the goals of submitting to the framework may differ by the size of an organization and the resources it has: While submitting may encourage organizations with less mature processes to improve their internal risk management, it may provide a helpful forcing function for organizations with more sophisticated risk management processes to reflect and coordinate internally.
These divergent motivations and approaches, and the resulting lack of comparability, may blunt the utility of the reporting framework and make it difficult to delineate a theory of change (i.e., a mapping of how reporting activities lead to desired outcomes). Observers who could theoretically use the reports, for example, to understand the risk controls an organization has adopted and the tests it has conducted when deciding with whom to work, or to inform modifications to their own internal governance, may find it difficult to compare practices company by company, given that some reports cover specific models while others focus on organization-wide policies.
Organizations that aim to build reputational trust through their HAIP reports may be less motivated to participate, given that the reputational benefits of having stronger practices when compared to their peers may be difficult for external observers to discern. Frameworks can initially help early movers signal responsible practices to regulators, partners, and end users, but their value expands when there is enough baseline comparability to distinguish “best in class” performers from those with significant room to improve. At the same time, companies may be reluctant to support stronger comparability mechanisms or requirements if that would expose gaps in their practices relative to peers.
We also heard from stakeholders that the flexibility built into the submission process could disincentivize engagement among companies that might otherwise be interested in submitting, by increasing the time required to decide what to include and how to measure and present their governance and risk management practices. SMEs in particular may lack the resources and policy knowledge to navigate the process, even if they wanted to participate.
SMEs and service providers face capacity and information challenges in the report submission process
The resource constraints and strategic positioning of SMEs and service providers shape their reporting choices in distinct ways. Actors operating in narrow market segments or at specific points in the AI value chain (for example, niche application providers or advisory firms) often have limited visibility into upstream providers. For these stakeholders, questions about data quality, risk mitigation, research, and investment can be difficult to answer: Relevant information is often distributed across clients, partners, or parts of the organization that interact infrequently, and staff capacity to dedicate to reporting is constrained.
Smaller developers exhibited distinct reporting patterns. Many provided detailed responses in areas related to their technical strengths, such as risk assessment and testing, but offered sparse coverage of governance structures, potentially reflecting resource constraints and less mature organizational processes. Their strategic approaches varied: Some produced detailed, audit-style submissions with explicit references to multiple standards and frameworks, using compliance breadth as a competitive differentiator to signal maturity beyond their size. Others mixed substantive content with promotional framing, treating the submission as an opportunity to showcase compliance expertise and build credibility while gaining visibility through the HAIP platform.
Questions about relevance varied significantly across the AI value chain. TELUS Digital, for example, marked multiple sections as not applicable because it does not deploy advanced AI systems. Consulting firms such as Data Privacy and AI used their submissions to outline advisory approaches for clients, reflecting their role in supporting rather than conducting AI development. B2B software providers whose customers control data inputs found questions about training data curation and model development less relevant to their position in the value chain. Some specialized actors who might have benefited from participation viewed the framework primarily as a forum for foundation model developers, suggesting uncertainty about whether the framework is designed for their type of contribution and where their value proposition might lie.
Responses, while informative, remain high-level and difficult to verify
Where an organization sits within the AI ecosystem affects the content of its submission and its interpretation of specific questions. Foundation model developers demonstrated different transparency approaches, likely shaped by their competitive positioning and regulatory exposure. Some offered expansive, governance-focused submissions that emphasized institutional frameworks and procedural thoroughness, yet relied heavily on discretionary, self-published materials (such as preparedness frameworks, system cards, and policy documents) and remained vague about how those policies operate in practice. Others used their submission as a reference document with uniform, concise responses that pointed to external links rather than directly providing relevant information. Taken together, these patterns suggest that when a public-facing framework is open to interpretation, companies that face intense public scrutiny but have the capacity to respond in greater detail may paradoxically provide less granular information, instead choosing to curate transparency through managed documentation ecosystems.
Enterprise technology companies responded to the framework’s procedural questions with comprehensive documentation of their practices. These companies produced detailed, technical submissions demonstrating sophisticated governance architectures with dedicated AI safety teams, automated testing pipelines, and frontier model evaluation frameworks, combined with extensive cross-standard alignment and ecosystem participation.
Telecommunications operators and IT services firms demonstrated a pattern of governance integration without AI-specific differentiation. These companies produced comprehensive, methodical submissions that embedded AI governance within existing enterprise risk management frameworks, including ethics committees, security protocols, and legal review processes. Their responses emphasized hierarchical oversight, cross-departmental coordination, and compliance with national regulations, reflecting institutional discipline but limited engagement with AI-specific challenges, such as frontier model risks.
Across the submissions, the use of external references illustrated choices about what to include directly versus what to reference externally. Organizations fell into three categories: those using external links as supplements offering deeper detail on information provided in their responses, those linking to external information as a replacement for substantive answers, and those providing primarily inline responses with minimal external references. This pattern correlated with both organizational size and competitive sensitivity: Larger firms with extensive public documentation infrastructures tended toward heavy external referencing, while smaller organizations with less existing documentation provided more direct, contextually integrated responses. On one hand, collating information into durable transparency resources (which were in some cases linked in submissions) reflects established practices for sharing information with external stakeholders and can result from attempts to align with a variety of disclosure frameworks. At the same time, extensive linking within HAIP submissions hinders comparison across organizations and tracking of changes over time, particularly when the referenced webpages are dynamic and regularly updated.
Although the lack of detail in some responses partly reflected respondent choices, it also stemmed from how questions were framed. Many questions in the reporting framework asked some form of “Does your organization do X?” but did not elicit detail on what companies specifically do. In the case of privacy practices, for example, companies tended to provide general answers about having privacy policies and practices but little detail on the nature of those practices (e.g., measures to obfuscate personal details in training data or to guard against disclosure of personal information learned via prompts). This was even more evident in questions on key issues around accountability and risk, which elicited little about what is done to either assign and assure responsibility or implement risk mitigations.
When companies did report their practices, most responses, even when relatively detailed in comparison to peers, remained high-level and difficult to verify. Most organizations described governance processes without clearly distinguishing between implemented practices and planned or piloted measures, making it difficult to assess current versus aspirational states. The reports also provided limited quantitative evidence to substantiate qualitative claims: Organizations described red teaming programs and safety evaluations but rarely included metrics such as error rates or benchmark performance that would allow observers to assess the effectiveness of these practices. Because the framework was designed to document and compare organizational approaches rather than evaluate their effectiveness, it naturally encouraged broad descriptions of governance arrangements rather than detailed self-assessment, which can make it difficult to understand the implementation challenges and emerging risks organizations face, as well as limit its utility for accountability.
The HAIP Reporting Framework provides significant value, but its distinguishing features and audience could be clarified further
In a crowded landscape of AI governance efforts, the HAIP reporting framework offers several useful features. Its connection to the G7 and G20 and its global reach contribute to its prominence; in addition, the framework’s multistakeholder development process, openness to any interested entity, and efforts to complement existing national frameworks help situate it within broader global AI governance discussions.
Yet we also found limitations in knowledge about the framework. Approximately half of the AI community respondents to our survey indicated familiarity with the framework but not detailed knowledge of it, suggesting that better awareness of the reporting process, and of how it differs from other related efforts, could increase engagement with the framework and amplify its impact. This may be particularly true for deployers, including those who refine frontier models, who either do not view the HAIP submission process as relevant to their work or are unaware of its potential to inform business choices. Some observers have described the framework as a “quiet revolution” in AI governance, noting its potential to influence transparency reporting behaviors if awareness and uptake increase.
Benefits vary for different entities in the ecosystem
Organizations that participate in the reporting framework may reap both internal and external benefits. On the internal side, organizations may already seek to strengthen their AI governance and find that the reporting process provides a useful structure for clarifying responsibilities, identifying gaps, consolidating existing policies, and aligning stakeholders around expectations and shared language. Externally, incentives can include increased public trust, improved accountability, and the opportunity to shape the future of governance frameworks. The process of documenting and reporting can provide value for companies seeking to reinforce their reputation for responsible AI practices with consumers, civil society, or prospective talent. The reporting framework’s voluntary participation model can encourage organizations to demonstrate social responsibility and build trust rather than simply meeting regulatory requirements.
At the same time, it is also possible that, without an understanding of how companies actually implement the practices they document, the reports may be viewed by both observers and participating organizations as one more compliance effort or public relations exercise.
The public-facing nature of reports, often written in accessible language, can provide value to a range of different audiences outside the participating organization. Access to public responses can help organizations assess the governance and management practices of potential partners, along with their AI safety initiatives. The reports provide information on how organizations characterize, identify, evaluate, and mitigate risk; the structure of their risk management teams; and the seriousness of their risk management practices. For business-facing AI providers, the reports can help inform business-to-business clients and partners about risk controls and responsible AI measures, which can clarify contractual expectations and accountability. For downstream deployers and users, actionable information related to risk management measures (e.g., red-teaming or how they handle bugs and incidents), safety testing and benchmark results, system limitations and appropriate use cases, evidence of external audits or validations, and data provenance and treatment practices can help them make better-informed decisions about which AI model best aligns with their values and goals, and which provider offers comparatively stronger safety, security, and accountability practices. Encouraging transparency through reporting lays the foundation for collaboration across organizations.
Recommendations
- Build on documented success and resolve ambiguities in the HAIP Reporting Framework’s role in the AI governance ecosystem: Although the reporting framework has demonstrated value and achieved participation from companies of different sizes and with different roles in the AI value chain, reaching the framework’s full potential depends on addressing concerns about costs, duplication, and strategic positioning. This necessitates deeper analysis of how the framework fits into existing governance mechanisms and a clear vision for its evolution. Better articulating where the HAIP reporting framework falls within the AI governance ecosystem could also help trace how these different processes are linked, clarify concepts, allow for better interoperability, and enable stakeholders to coordinate more effectively to identify opportunities to build on existing work.
- Revisit the balance between flexibility and uniformity: Flexibility functions simultaneously as an asset for and a barrier to participation in the reporting framework. The framework should provide clearer reporting guidance to companies of different sizes and to organizations occupying different positions in the AI value chain, with differentiated guidance for model providers, application deployers (including integrators), and advisory or research organizations, a clear specification of which questions apply to which actors, and clarity about the intended audience for each report. As more organizations across the AI value chain submit to the framework, a mix of structured and open-ended questions would help identify where consensus among organizations is emerging. The addition of quantitative measurements could also improve comparison across organizations, models, and systems, as well as provide a clearer sense of return on investment for internal stakeholders.
- Provide more explicit guidance for responding to open-ended questions: The OECD has already recognized that some companies reported uncertainty about their reporting approaches during the initial submission round. This uncertainty contributed to the high-level (and occasionally repetitive) nature of some responses, suggesting that the reporting framework would benefit from both more consistent, streamlined, and interpretable questions and additional guidance on how to approach open-ended responses. Companies should be encouraged to provide details, explain when and how particular interventions are applied, and answer questions directly within the framework (even if this involves copying and pasting from existing sources). Even if reported practices and data are evolving, providing information directly within responses recognizes that submissions reflect periodic snapshots that can enable high-level awareness of trends across actors and over time. When external links are provided, they should include brief explanations of their content, particularly given that links create stability risks if content is updated or deleted, and make trend analysis within organizations difficult.
To better guide responses, the framework could provide working examples of what “good reporting” looks like (short examples, by topic area), so that responses become more consistent across submitters and jurisdictions. It could also identify a subset of minimum reporting expectations or provide subquestions that elicit more relevant details for questions that currently invite overly vague responses. For example, a response on risk identification and evaluation might address the following questions, with submitters choosing how much information to provide without revealing sensitive or security-related details:
○ Is there an explanation of the risk prioritization methodology (not just a claim that one exists)?
○ Does the report indicate the types of harms addressed by governance mechanisms (i.e., both frontier and current/everyday harms)?
○ Does the report discuss the independence of safety teams from product teams and commercial pressures?
○ Does the report discuss decisionmaking authority and escalation paths for high-risk situations?
○ Does the report explain where in the product or model lifecycle risk identification, documentation, and approval steps occur (for example, at design, pre-deployment testing, and post-deployment monitoring), and which roles are responsible at each stage?
○ Does the report discuss oversight by the board?
○ Does it provide documentation of measures that exceed minimum regulatory requirements?
This level of detail helps relevant stakeholders understand which approaches appear to be fit for purpose and most effective for their particular use case without having to look beyond the reports to gather relevant details. It is also important to strike the right balance, as eliciting too many details may discourage participation, especially among smaller companies.
- Improve processes for comparing across submissions: Organizations view the HAIP reporting framework as a resource for understanding how other organizations are approaching internal governance, risk management, and decisionmaking. The reports also allow companies to compare practices when making business-to-business decisions. Making these comparisons more user-friendly, for example, by enabling users to filter submissions by organization type (deployers or developers), company size, or specific question topics (illustrated in the sketch following this list), would allow stakeholders to more efficiently identify relevant best practices, determine emerging areas of consensus that could inform internal policies and improve broader AI governance, and benchmark their own risk management approaches against those of similar organizations. Being able to compare organizations of a similar size, sector, or business model could help encourage participation and justify the time and resource investment needed to submit reports.
- Enhance audit transparency: To better understand how companies implement the practices they document, stakeholders would benefit from additional insights into whether parts of submissions have been independently audited, including who conducted the reviews, when they occurred, and what findings emerged. This would distinguish components of a report that were externally validated from those that solely rely on self-reporting, which is particularly important for audiences who may use or incorporate the information in submissions. Clearer documentation of audit practices would help stakeholders assess the reliability of reported information, understand the level of scrutiny to apply to different submissions or sections of submissions, and strengthen trust in the reporting framework.
- Clarify audiences: A clearer articulation of its target audiences can enhance the HAIP framework’s reach and impact across the AI value chain. While developers have engaged with the process, deployers and users represent new audiences who could benefit significantly but remain largely unfamiliar with the framework. Deployers themselves represent distinct categories with different needs: traditional deployers (such as end-user organizations adopting AI systems) primarily serve as audiences seeking information to make adoption decisions, while integrators and application developers who build on foundation models may serve as both contributors and consumers of the framework. Currently, deployers may struggle with AI adoption if they cannot access adequate information from model developers beyond publicly available model and system cards. Identification of end users could clarify the value proposition for conveying this information and the level of detail or technical sophistication required. In practice, this may point toward layered reporting, with accessible summaries and more technical annexes aimed at different readers. Rather than requiring separate reports for developers, deployers, or end users, the framework could encourage organizations to specify their primary and secondary target audiences and offer guidance on how to structure content—for example, by providing high-level explanations of governance practices for non-technical audiences alongside more detailed information and metrics for expert readers.
- Better delineate a theory of change: HAIP’s impact as a step toward accountability can be strengthened if the framework’s theory of change is further developed, making clearer how the sharing of best practices through public reporting can encourage self-examination, allow companies to compare their practices against one another, gradually improve internal processes, and open pathways for moving the international AI governance community toward shared standards. Companies could be encouraged to clarify how their governance processes have changed over time, whether as a result of the HAIP Reporting Framework or other initiatives, and to distinguish between implemented and planned measures.
- Strengthen participation incentives: While most developer and deployer organizations in the HAIP Partners Community submitted reports, fewer than half of the 17 organizations that participated in the task force informing the development of the reporting framework did so. Incentives to participate can be strengthened, particularly for organizations that have deep knowledge of the framework and were involved in its development. To that end, governments participating in the HAIP Friends Group, as well as large deployers, can consider treating meaningful HAIP reports as a positive factor in procurement, partnership, or funding decisions. Developers may benefit from a better understanding of whether and how policymakers may use their responses to shape governance conversations. Civil society could play a role by developing mechanisms to recognize companies that provide sufficient reporting and have robust governance in place.
- Expand the reach of the reporting framework: The OECD’s continued efforts to enlist governments and stakeholders from the HAIP Friends Group to promote its use, particularly with SMEs, can expand the reach of the reporting framework. Currently, organizations from 13 HAIP Friends Group countries are excluded7 from the reporting framework despite their governments’ commitment to HAIP principles. Expanding eligibility criteria to include all Friends Group members could broaden participation to capture significant AI development activity in countries like Indonesia, the UAE, Thailand, and Nigeria. Forthcoming Friends of HAIP meetings offer concrete venues to encourage participating governments to champion the framework domestically and to signal its importance in shaping responsible AI practices. It may also be useful to develop a separate, “HAIP-lite” reporting framework that broadens the process to stakeholders with limited capacity or capabilities to respond to certain questions, rather than lowering expectations for the main framework.
- Make the framework iterative: Consistency and stability in report formats and content will enable comparisons over time, but flexibility to adapt to changes in markets and government policy, and to incorporate feedback, can increase the framework’s utility. A format combining a stable core based on the HAIP principles with more modular, updatable components may help reconcile these objectives. The HAIP reporting framework could also tie updates to major system changes (a new deployment context, a significant capability jump, or a major incident) rather than rigid reporting cycles.
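To make the comparison recommendation above concrete, the minimal sketch below (hypothetical Python; the metadata schema, field names, and organization records are illustrative assumptions, not drawn from actual HAIP submissions) shows how a public catalog of reports might support filtering by role, size, and topic:

```python
from dataclasses import dataclass, field

@dataclass
class Submission:
    """Hypothetical metadata record for one HAIP report."""
    organization: str
    role: str              # e.g., "developer", "deployer", "hybrid"
    size: str              # e.g., "large", "medium", "micro"
    topics: set = field(default_factory=set)  # framework sections answered

def filter_submissions(subs, role=None, size=None, topic=None):
    """Return submissions matching every filter the user supplies."""
    return [
        s for s in subs
        if (role is None or s.role == role)
        and (size is None or s.size == size)
        and (topic is None or topic in s.topics)
    ]

# Illustrative records only; not taken from actual submissions.
catalog = [
    Submission("Org A", "developer", "large", {"risk identification", "governance"}),
    Submission("Org B", "deployer", "medium", {"governance"}),
    Submission("Org C", "hybrid", "micro", {"content provenance"}),
]

peers = filter_submissions(catalog, role="deployer")
print([s.organization for s in peers])  # prints ['Org B']
```

Even a simple interface along these lines would let a stakeholder surface peer submissions by role, size, or topic before reading full reports.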
Download the appendix: Table of submissions and survey questions (PDF)
Acknowledgements and disclosures
The authors would like to thank Amanda Craig, Sophie Fallaha, Samir Jain, Emily McReynolds, and Shaundra Watson for their feedback and Esther Lee Rosen and Derek Belle for publication assistance.
Google, Microsoft, and Amazon are general, unrestricted donors to the Brookings Institution. The findings, interpretations, and conclusions posted in this piece are solely those of the authors and are not influenced by any donation.
Josh Meltzer contributed to this report while a Senior Fellow at The Brookings Institution. Meltzer has since left Brookings and as of publication is a Principal, Global AI Policy at Amazon Web Services.
Footnotes
- Previously, the OECD had established AI principles in 2019 and updated them in 2024 to reflect generative AI models like ChatGPT.
- While the HAIP Code of Conduct is titled “for Organizations Developing Advanced AI Systems,” the foundational G7 documents explicitly include deployers and other actors in their eligibility language, stating organizations “may include, among others, entities from academia, civil society, the private sector, and/or the public sector.” The OECD’s HAIP Reporting Framework, which monitors application of the Code of Conduct, reflects this broader scope.
- To understand emerging trends in research on documentation, the Center for Democracy & Technology has identified several existing approaches in academic literature to documenting AI at each level. Data-level documentation, such as Data Cards and Datasheets for Datasets, can help identify potential issues or biases present in datasets, reducing the likelihood that models are deployed in unintended contexts. Model-level documentation, such as model cards, aims to offer visibility into the model’s development process, performance metrics, and associated risks. System-level documentation, such as system cards, calls for documenting a system overview with goals, inputs, outputs, component interactions, and evaluation methods to enable surfacing potential risks and identifying unsuitable usage contexts. They aim to make systems explainable, and provide deployers, policymakers, and individuals with an understanding of how systems operate at scale and affect users. Some of these approaches have informed disclosures and risk management practices referenced by the National Institute of Standards and Technology (NIST) Risk Management Framework and the European Union’s AI Act, in particular, Section II and Annex IV and the General-Purpose AI Code of Practice, Safety and Security Chapter. These efforts aim to improve AI documentation’s accuracy, utility, and interoperability.
- We include in the Appendix a table of the organizations that submitted to the reporting framework, along with links to their submissions and other relevant details.
- Respondents were recruited using convenience sampling, including LinkedIn, organizational and topical listservs, and relevant Slack channels. We include the questionnaire we used in the Appendix available via the sidebar download.
- Small and medium-sized enterprises (SMEs) are defined as organizations with fewer than 250 employees.
- The 13 countries are: Brunei, Bulgaria, Cambodia, Croatia, Cyprus, Indonesia, Kenya, Laos, Malaysia, Nigeria, Thailand, United Arab Emirates, and Vietnam. Bulgaria, Croatia, and Cyprus are EU member states where EU institutional adherence to the OECD AI Recommendation may or may not qualify organizations for participation; Indonesia and Thailand are in OECD accession processes.