
A framework for evaluating rapidly developing digital and related technologies: AI, large language models and beyond

This discussion paper provides the outline of an initial framework to inform the multiple global and national discussions taking place related to AI.

It has been proposed by a number of academics and policy experts that the International Science Council – with its pluralistic membership from the social and natural sciences – establish a process to produce and maintain an annotated framework/checklist of the risks, benefits, threats and opportunities associated with rapidly moving digital technologies, including – but not limited to – AI. The purpose of the checklist would be to inform all stakeholders – including governments, trade negotiators, regulators, civil society and industry – of potential future scenarios, and would frame how they might consider the opportunities, benefits, risks and other issues.

The ISC is pleased to present this discussion paper on evaluating rapidly developing digital and related technology. Artificial Intelligence, synthetic biology and quantum technologies are prime examples of innovation, informed by science, emerging at an unprecedented pace. It can be challenging to systematically anticipate not only their applications, but also their implications.

Evaluating the societal aspects of generative AI such as large language models, which predictably represent the bulk of this discussion paper, is a needed bridge between the current discourse (sometimes panic-driven, other times not profound enough in thought) and the necessary courses of action we can take. The ISC is convinced that an analytical framework between the social acceptance of such new technologies and their possible regulation is required to facilitate the multistakeholder discussions that are needed to take informed and responsible decisions on how to optimize the social benefits of this rapidly emerging technology.

The ISC is open to reactions from our community through this discussion paper in order to assess how best to continue being part of and contributing to the debate around technology.
Salvatore Aricò, CEO

An ISC discussion paper


New! Read the 2024 version for policy-makers with a downloadable framework for your organization.

A guide for policy-makers: Evaluating rapidly developing technologies including AI, large language models and beyond


Read the 2023 ISC discussion paper online or in your preferred language

The rapid emergence of a technology with the complexity and implications of AI is driving many claims of great benefits. However, it also provokes fears of significant risks, from individual to geo-strategic level. Much of the discussion tends to take place at the extreme ends of the spectrum of views, and a more pragmatic approach is needed. AI technology will continue to evolve and history shows that virtually every technology has both beneficial and harmful uses. The question is, therefore: how can we achieve beneficial outcomes from this technology, while reducing the risk of harmful consequences, some of which could be existential in magnitude?

The future is always uncertain, but there are sufficient credible and expert voices regarding AI and generative AI to encourage a relatively precautionary approach. In addition, a systems approach is needed, because AI is a class of technologies with broad use and application by multiple types of users. This means that the full context must be taken into account when considering the implications of AI for individuals, social life, civic life, societal life and in the global context.

Unlike most past technologies, digital and related technologies have a very short period of time from development to release, largely driven by the interests of the production companies or agencies. AI is rapidly becoming pervasive; some properties may only become apparent after release, and the technology could have both malevolent and benevolent applications. Important values dimensions will influence how any use is perceived. Furthermore, there may be geo-strategic interests at play.

To date, the regulation of a virtual technology has largely been seen through the lens of “principles” and voluntary compliance. More recently, however, the discussion has turned to issues of national and multilateral governance, including the use of regulatory and other policy tools. The claims made for or against AI are often hyperbolic and – given the nature of the technology – difficult to assess. Establishing an effective global or national technology regulation system will be challenging, and multiple layers of risk-informed decision-making will be needed along the chain, from inventor to producer, to user, to government and to the multilateral system.

While high-level principles have been promulgated by UNESCO, OECD and the European Commission, amongst others, and various high-level discussions are underway regarding issues of potential regulation, there is a large ontological gap between such principles and a governance or regulatory framework. What is the taxonomy of considerations that a regulator might need to consider? A narrowly focused framing would be unwise, given the broad implications of these technologies. This potential has been the subject of much commentary, both positive and negative.

The ISC is the primary global NGO integrating natural and social sciences. Its global and disciplinary reach means it is well placed to generate independent and globally relevant advice to inform the complex choices ahead, particularly as the current voices in this arena are largely from industry or from the major technological powers. Following extensive discussion over recent months, including the consideration of a non-governmental assessment process, the ISC concluded that its most useful contribution would be to produce and maintain an adaptive analytic framework that can be used as the basis for discourse and decision-making by all stakeholders, including during any formal assessment process that emerges.

This framework would take the form of an overarching checklist that could be used by both government and non-governmental institutions. The framework identifies and explores the potential of a technology such as AI and its derivatives through a wide lens that encompasses human and societal wellbeing, as well as external factors, such as economics, politics, the environment and security. Some aspects of the checklist may be more relevant than others, depending on the context, but better decisions are more likely if all domains are considered. This is the inherent value of a checklist approach.

The proposed framework is derived from previous work and thinking, including the International Network for Governmental Science Advice’s (INGSA) digital wellbeing report¹ and the OECD AI Classification Framework², to present the totality of the potential opportunities, risks and impacts of AI. Because these previous products were more restricted in their intent given their time and context, there is a need for an overarching framework that presents the full range of issues in both the short and longer term.

While developed for the consideration of AI, this analytical framework could be applied to any rapidly emerging technology. The issues are broadly grouped into the following categories for further examination:

Wellbeing (including that of individuals or self, society and social life, and civic life)
Trade and economy
Environmental
Geo-strategic and geo-political
Technological (system characteristics, design and use)

A list of considerations for each of the above categories is included along with their respective opportunities and consequences. Some are relevant for specific instances or applications of AI while others are generic and agnostic of platform or use. No single consideration included here should be treated as a priority and, as such, all should be examined.
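For organizations that want to track their use of the checklist, the structure described above (categories, each with criteria and guiding questions, none ranked above another) could be recorded along the following lines. This is only a minimal sketch under stated assumptions: the `Criterion` class, the `unexamined` helper and the placeholder note are hypothetical names introduced for illustration and are not part of the ISC framework; the three criteria shown are a small subset taken from the table that follows.

```python
from dataclasses import dataclass


@dataclass
class Criterion:
    """One row of the checklist (hypothetical representation)."""
    dimension: str   # e.g. "Environmental"
    name: str        # e.g. "Energy and resource consumption"
    question: str    # guiding question for the assessor
    notes: str = ""  # assessor's finding; empty means not yet examined

    @property
    def examined(self) -> bool:
        return bool(self.notes.strip())


def unexamined(checklist):
    """Return the criteria that still lack notes, grouped by dimension."""
    gaps = {}
    for criterion in checklist:
        if not criterion.examined:
            gaps.setdefault(criterion.dimension, []).append(criterion.name)
    return gaps


# A small illustrative subset of the criteria; no weights or priorities
# are stored, since every consideration is meant to be examined.
checklist = [
    Criterion("Individual/self", "Optionality",
              "Are users given the option to opt out of the system?"),
    Criterion("Society/social life", "Equity",
              "Is the technology likely to reduce or enhance inequalities?"),
    Criterion("Environmental", "Energy and resource consumption",
              "Does consumption exceed the efficiency gains obtained?"),
]

# Placeholder text standing in for a real assessment finding.
checklist[1].notes = "placeholder finding recorded by the assessor"

print(unexamined(checklist))
```

The sketch deliberately stores no scores or rankings: it only reports which considerations have not yet been looked at, which mirrors the point that all domains should be examined.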

Initial draft of the dimensions that might need to be considered when evaluating a new technology
Dimensions of impact	Criteria	Examples of how this may be reflected in analysis
Individual/ self	Users’ AI competency	How competent and aware of the system’s properties are the likely users who will interact with the system? How will they be provided with the relevant user information and cautions?
	Impacted stakeholders	Who are the primary stakeholders that will be impacted by the system (e.g., individuals, communities, vulnerable groups, sectoral workers, children, policy-makers, professionals)?
	Optionality	Are users given the option to opt out of the system? Should they be given opportunities to challenge or correct the output?
	Risks to human rights and democratic values	Could the system have an impact (and in what direction) on human rights, including, but not limited to, privacy, freedom of expression, fairness and risk of discrimination?
	Potential effects on people’s wellbeing	Could the system have an impact (and in what direction) on the individual user’s wellbeing (e.g., job quality, education, social interactions, mental health, identity, environment)?
	Potential for human labour displacement	Is there a potential for the system to automate tasks or functions that were being executed by humans? If so, what are the downstream consequences?
	Potential for identity, values or knowledge manipulation	Is the system designed to or potentially able to manipulate the user’s identity or values set, or spread disinformation? Is there a potential for false or unverifiable claims of expertise?
	Measures of self-worth	Is there pressure to portray an idealized self? Could automation replace a sense of personal fulfilment? Is there pressure to compete with the system in the workplace? Is individual reputation made harder to protect against disinformation?
	Privacy	Are there diffused responsibilities for safeguarding privacy and are there any assumptions being made on how personal data is utilized?
	Autonomy	Could the system affect human autonomy by generating over-reliance on the technology by end-users?
	Human development	Is there an impact on acquisition of key skills for human development such as executive functions, interpersonal skills, changes in attention time affecting learning, personality development, mental health concerns, etc.?
	Personal health care	Are there claims of personalized health care solutions? If so, are they validated to regulatory standards?
	Mental health	Is there a risk of increased anxiety, loneliness or other mental health issues, or can the technology mitigate such impacts?
	Human evolution	Could the technology lead to changes in human evolution?
Society/social life	Societal values	Does the system fundamentally change the nature of society or enable the normalization of ideas previously considered anti-social, or does it breach the societal values of the culture in which it is being applied?
	Social interaction	Is there an effect on meaningful human contact, including emotional relationships?
	Equity	Is the application/technology likely to reduce or enhance inequalities (e.g., economic, social, educational, geographical)?
	Population health	Is there a potential for the system to advance or undermine population health intentions?
	Cultural expression	Is an increase in cultural appropriation or discrimination likely or more difficult to address? Does reliance on the system for decision-making potentially exclude or marginalize sections of society?
	Public education	Is there an effect on teacher roles or education institutions? Does the system emphasize or reduce inequity among students and the digital divide? Is the intrinsic value of knowledge or critical understanding advanced or undermined?
	Distorted realities	Are the methods we use to discern what is true still applicable? Is the perception of reality compromised?
Economic context (trade)	Industrial sector	Which industrial sector is the system deployed in (e.g., finance, agriculture, health care, education, defence)?
	Business model	In which business function is the system employed, and in what capacity? Where is the system used (private, public, non-profit)?
	Impact on critical activities	Would a disruption of the system’s function or activity affect essential services or critical infrastructures?
	Breadth of deployment	How is the system deployed (narrowly within an organization vs widespread nationally/internationally)?
	Technical maturity (TRL)	How technically mature is the system?
	Technological sovereignty	Does the technology drive greater concentration of technological sovereignty?
	Income redistribution and national fiscal levers	Could the core roles of the sovereign state be compromised (e.g., Reserve Banks)? Will the state’s ability to meet citizens’ expectations and implications (e.g., social, economic, political) be advanced or reduced?
Civic life	Governance and public service	Could governance mechanisms and global governance systems be affected positively or negatively?
	News media	Is public discourse likely to become more or less polarized and entrenched at a population level? Will there be an effect on the levels of trust in the media? Will conventional journalism ethics and integrity standards be further affected?
	Rule of law	Will there be an effect on the ability to identify individuals or organizations to hold accountable (e.g., what kind of accountability to assign to an algorithm for adverse outcomes)? Does this create a loss of sovereignty (e.g., environmental, fiscal, social policy, ethics)?
	Politics and social cohesion	Is there a possibility of more entrenched political views and less opportunity for consensus building? Is there the possibility of further marginalizing groups? Are adversarial styles of politics made more or less likely?
Geo-strategic/ geo-political context	Precision surveillance	Are the systems trained on individual behavioural and biological data, and if so, could they be used to exploit individuals or groups?
	Digital colonization	Are state or non-state actors able to harness systems and data to understand and control other countries’ populations and ecosystems, or to undermine jurisdictional control?
	Geo-political competition	Does the system affect competition between nations and technology platforms for access to individual and collective data for economic or strategic purposes?
	Trade and trade agreements	Does the system have implications for international trade agreements?
	Shift in global powers	Is the status of nation-states as the world’s primary geo-political actors under threat? Will technology companies wield power once reserved for nation-states and are they becoming independent sovereign actors?
	Disinformation	Is it easier for state and non-state actors to produce and disseminate disinformation that impacts social cohesion, trust and democracy?
Environmental	Energy and resource consumption (carbon footprint)	Do the system and its requirements increase energy and resource consumption over and above the efficiency gains obtained through the application?
Data and input	Detection and collection	Are the data and input collected by humans, automated sensors or both?
	Provenance of the data	Are the data provided, observed, synthetic or derived? Are there watermark protections to confirm provenance?
	Dynamic nature of the data	Are the data dynamic, static, updated from time to time or updated in real-time?
	Rights	Are data proprietary, public or personal (i.e., related to identifiable individuals)?
	Identifiability of personal data	If personal data, are they anonymized or pseudonymized?
	Structure of the data	Are the data structured, semi-structured, complex structured or unstructured?
	Format of the data	Is the format of the data and metadata standardized or non-standardized?
	Scale of the data	What is the dataset’s scale?
	Appropriateness and quality of the data	Is the dataset fit for purpose? Is the sample size adequate? Is it representative and complete enough? How noisy are the data? Are they error-prone?
Model	Information availability	Is information about the system’s model available?
	Type of AI model	Is the model symbolic (human-generated rules), statistical (uses data) or hybrid?
	Rights associated with model	Is the model open source, or proprietary, self- or third-party managed?
	Single or multiple models	Is the system composed of one model or several interlinked models?
	Generative or discriminative	Is the model generative, discriminative or both?
	Model building	Does the system learn based on human-written rules, from data, through supervised learning or through reinforcement learning?
	Model evolution (AI drift)	Does the model evolve and/or acquire abilities from interacting with data in the field?
	Federated or central learning	Is the model trained centrally or in several local servers or “edge” devices?
	Development and maintenance	Is the model universal, customizable or tailored to the AI actor’s data?
	Deterministic or probabilistic	Is the model used in a deterministic or probabilistic manner?
	Model transparency	Is information available to users to allow them to understand model outputs and limitations or use constraints?
	Computational limitation	Are there computational limitations to the system? Can we predict capability jumps or scaling laws?
Task and output	Task(s) performed by system	What tasks does the system perform (e.g., recognition, event detection, forecasting)?
	Combining tasks and actions	Does the system combine several tasks and actions (e.g., content generation systems, autonomous systems, control systems)?
	System’s level of autonomy	How autonomous are the system’s actions and what role do humans play?
	Degree of human involvement	Is there some human involvement to oversee the overall activity of the AI system and the ability to decide when and how to use the system in any situation?
	Core application	Does the system belong to a core application area such as human language technologies, computer vision, automation and/or optimization, or robotics?
	Evaluation	Are standards or methods available to evaluate system output or deal with unforeseen emergent properties?

Key to sources of the descriptors

Plain text:
Gluckman, P. and Allen, K. 2018. Understanding wellbeing in the context of rapid digital and associated transformations. INGSA. https://ingsa.org/wp-content/uploads/2023/01/INGSA-Digital-Wellbeing-Sept18.pdf

Bold text:
OECD. 2022. OECD Framework for the Classification of AI systems. OECD Digital Economy Papers, No. 323, OECD Publishing, Paris. https://oecd.ai/en/classification

Italic text:
New descriptors (from multiple sources)

