AI in Collaborative Environments: a Double-Edged Sword

VISITING EXPERT. Cybersecurity issues are becoming increasingly complex.

The excitement for AI is very high. These super assistants can produce reports and executive summaries, make suggestions and even propose strategies for your business. We can say that the advent of generative artificial intelligence will transform the way businesses create and use information. However, this technological revolution also brings increased exposure to several risks induced by this technology. Cybersecurity specialists and risk managers face an increasing complexity of the issues that require resolution.

Explosion in the volume of unstructured information

The use of generative AI, such as Microsoft Copilot, in collaborative environments leads to a rapid increase in the volume of unstructured data. Presently, about 90% of the data generated is unstructured (text, video, images, etc.), and this proportion increases further with generative AI content creation capabilities. This proliferation is linked to the ongoing production of new content, such as reports, documents or automated real-time conversations.

Experts estimate that this volume of unstructured data could grow by 28% yearly, doubling every two to three years in some sectors. This creates challenges for data management, including its storage, security, and efficient operation in advanced parsing systems.

Companies need robust strategies to manage these increasing volumes of potentially sensitive data; cybersecurity solutions tailored to this reality must be implemented quickly to ensure optimal use and avoid costly cybersecurity and privacy incidents.

Some definitions

Before we go into more detail on the risks to monitor when implementing a solution like Microsoft Copilot or Google Gemini, let us look at the definition of some important AI concepts.

An artificial intelligence model is a mathematical structure that enables predictions or content generation based on input data. In the case of LLM (large language model), such as ChatGPT, the model is a network of neurons designed to understand and generate text coherently. The model is trained based on huge amounts of text to learn the relationships between words and sentences.

Parameters are internal model variables adjusted during training (or learning) to optimize predictions. LLMs have billions of parameters that determine how the model interprets a text input and generates an output. Parameters are adjusted during model training to minimize prediction errors regarding training data.

Training refers to the process by which a model adjusts its parameters in response to training data to improve its predictions. In LLM, the model is exposed to examples and learns to make more accurate predictions based on the errors it makes (through methods such as gradient descent). This learning process usually occurs on powerful computing infrastructures.

A concrete example

Imagine a merger or acquisition project where a team has to produce a report that includes intellectual property, sensitive personal information and critical financial information.

Today, the team can automate many tasks with the help of AI. For example, Copilot can automatically generate meeting transcripts, synthesize key points and integrate this information into the report directly. AI can also analyze complex datasets, provide relevant tables, and write summaries and recommendations based on the project’s sensitive information and strategic objectives.

When a user queries Copilot, the command is enriched with information regarding the user’s context before questioning the model (LLM). This pre-processing process is called “grounding.” It anchors model responses into organization-specific data. Before generating a response, the AI performs a contextual search in documents, emails, and other resources the user can access. This allows the AI to understand the question in a specific context and provide relevant responses based on internal company information rather than generic responses. This phase ensures results are linked to organizational realities. This phase allows you to invisibly inject company-specific information into the system prompt, which could be returned to the user later.

After the AI generates an initial response, the “post-processing” phase refines the output to make it clearer and more concise and format it according to the user’s needs. Post-processing filters sensitive information (if properly configured) and adjusts tone and language to match professional style.

Risks to manage

Security teams have increasing difficulty managing discretionary access, that is, access granted by an individual because they share a document or grant access to an MS-TEAMS team to a colleague or business partner. AI exacerbates this challenge by facilitating the creation, sharing, and access to information in a way that goes beyond traditional access management capabilities. The impact of accesses managed with laxity will be even greater because, in pre-processing, the AI can access all the information the user can access in their environment. However, the AI will be significantly more effective in finding documents relevant to the research context. This suggests that sensitive data exfiltration by malicious subjects will be even simpler to execute. Here is a table that lists other potentially problematic situations.

Description of the riskExample of a mergers and acquisitions project with Copilot
Prompt injection: Manipulate AI through commands to generate harmful or inappropriate responses.A malicious employee could inject negative information into Copilot to influence responses, suggestions, etc.
Insecure output handling: The AI generates potentially risky results without validation.Copilot could generate a summary containing sensitive merger data, which is shared without prior human review.
Supply chain vulnerabilities: Use of unsecured plugins or components.A malicious plugin built into Microsoft 365 could access sensitive discussions about the acquisition project using Copilot.
Sensitive information disclosure: The AI reveals confidential data without adequate protection.Copilot could accidentally reveal confidential acquisition details to an employee who does not have the necessary permissions but has been granted discretionary access.
Excessive autonomy: Giving the AI too much decision-making power.If Copilot is too autonomous, it may propose acquisition decisions that do not align with the leadership team’s objectives.

(From OWASP Top 10 for LLM)

The impact of cybersecurity legislation

At the same time, recent legislation such as Law 25 in Quebec and Bills C-26 and C-27 in Canada impose strict data protection requirements. These laws reinforce the need to frame the management of unstructured information

to comply with security standards and avoid severe penalties. Managing unstructured information is no longer just a good practice, it is a legal obligation that must be taken seriously by all business leaders. The ability of LLMs to use available information to provide a contextualized response significantly increases the risk of disclosure.

The urgency of action

Lax discretionary access management can be exploited on a large scale, particularly with the rise of AI agents such as Copilot. These AI tools can quickly identify and access documents available within the organization, increasing the risk of leaks of sensitive information. In this context, it becomes imperative to implement robust cybersecurity mechanisms to manage access and protect unstructured information.

Access management should no longer be left to chance or handled ad hoc. On the contrary, it must be systematized, automated and integrated into a comprehensive cybersecurity strategy.

Costly setbacks

Decision-makers must capture the duality between AI’s opportunity and the risks it creates. On the one hand, generative AI offers a unique opportunity to improve competitiveness by automating tasks, accelerating content creation and optimizing collaboration. On the other hand, this same technology increases vulnerabilities, particularly regarding the cybersecurity of unstructured information. Ignoring these risks can lead to costly financial and regulatory challenges. Therefore, it is crucial to manage access and protect sensitive data proactively and rigorously, while leveraging AI capabilities to remain competitive. Finding this balance between innovation and protection is key to ensuring long-term resilience.