The Clearly Podcast

The Ethics of AI and Data

Summary

Today's discussion focused on the ethics of AI and managing customer data. There is a growing concern about how AI models, both internal and external, handle sensitive information. The conversation aimed to clarify what actions are acceptable, the conditions for those actions, and what should be avoided. It emphasized the importance of understanding the ethical implications of uploading sensitive data to AI models like ChatGPT.

Key points included the need for transparency about the training data used in AI models to prevent biases. Public models often lack transparency, making it difficult to ensure unbiased outputs. Legal considerations, such as GDPR, require obtaining permission for processing personal data and maintaining transparency with customers.

There is a distinction between using public models like ChatGPT and private internal models. While responsibilities remain similar, the availability of information differs. With public models, the lack of transparency might necessitate avoiding them or conducting thorough testing to ensure unbiased results. Internal models provide more control over the training data but require careful consideration of data quality.

Data leaving the organization, even for software development, requires strict control and a clear understanding of the data security measures in place. Contracts should specify data usage and security, ensuring compliance with data protection laws. The conversation also touched on the differences between generative AI and robotic process automation (RPA) in processing data, emphasizing the need for human oversight in AI processes.

There was a discussion about the potential move towards niche AI models specialized in specific fields, which could offer better results and reliability. Specialized models in areas like medicine can focus on quality inputs and provide accurate results, whereas general models create broad excitement but might lack depth.

The conversation highlighted the importance of caution when uploading data to public AI models, considering legal, commercial, and reputational risks. Organizations should start with anonymized test data and understand their internal data processes before leveraging AI. Specialist AI models might be more beneficial than general ones.
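As a practical illustration of the "start with anonymized test data" advice, here is a minimal Python sketch of pseudonymizing obvious PII fields before records ever reach an external AI service. The field names (`name`, `email`) and the salted-hash approach are our own assumptions for illustration; real data classification is broader and should follow your DPO's guidance.

```python
import hashlib

def pseudonymize(record, pii_fields=("name", "email"), salt="rotate-me"):
    """Return a copy of record with PII fields replaced by salted hashes.

    Hashing is one-way, so the original values cannot be recovered from
    the output, while identical inputs still map to the same token
    (useful for joining records after analysis).
    """
    masked = dict(record)
    for field in pii_fields:
        if field in masked:
            digest = hashlib.sha256((salt + str(masked[field])).encode()).hexdigest()
            masked[field] = digest[:12]  # short, stable, non-reversible token
    return masked

customer = {"name": "Jane Doe", "email": "jane@example.com", "spend": 1250}
safe = pseudonymize(customer)
# Non-sensitive fields pass through unchanged; PII fields are replaced.
```

Note that pseudonymized data can still be personal data under GDPR if it can be re-identified, so this is a first safeguard, not a substitute for legal review.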

Final thoughts emphasized reading the terms and conditions of any AI tool and the need for AI providers to adopt and follow codes of practice for safety and ethics. Organizations should try AI cautiously, ensuring data security and privacy. The future likely holds more niche AI models that will prove genuinely useful.

Next week's topic will be the importance of requirements gathering for BI and reporting, with Shailan preparing the briefing notes.

You can download Power BI Desktop from here.

If you already use Power BI, or are considering it, we strongly recommend you join your local Power BI user group here.

Transcript

Andy

Today, we’re discussing the ethics of AI and managing customer data. There's a growing concern about how AI models, whether internal or external like ChatGPT, handle sensitive customer information. We need to clarify what we're comfortable doing, the conditions for those actions, and what we're unwilling to do. This discussion aims to guide listeners on treating customer data ethically. Many people unknowingly upload sensitive data to AI models without understanding the ethical implications. Let's start with Tom's perspective.

Tom

Last week, we talked about the transparency and currency of ChatGPT's knowledge. The AI itself isn't inherently biased, but the training data can introduce biases. Whether using internal models or public ones, it's crucial to ensure the data fed into these models doesn't perpetuate biases. Public models often lack transparency about their training data, making it hard to ensure unbiased outputs. Additionally, legal considerations like GDPR necessitate permission for processing personal data, emphasizing the need for transparency with customers about data usage.

Andy

Tom, you highlighted the differences between using public models like ChatGPT and private internal models. Is it useful to separate these discussions?

Tom

The responsibilities are similar for both. The difference lies in the availability of information. Internally, you control the training data, whereas public models don’t offer that transparency. The problem and responsibility remain the same, but the approach differs. With public models, lack of transparency might lead you to avoid them or require more thorough testing to ensure unbiased outputs.

Andy

Given the unknowns in public models, should we avoid using them altogether?

Tom

It depends on the use case. For instance, updating documentation might require empirical tests to check whether the AI's knowledge is current. While avoiding public models is a valid choice, it might mean missing out on valuable opportunities. It's a balance between ease and potential loss of benefits.

Andy

Shailan, as a CIO or DPO, how would you guide your organization regarding these concerns?

Shailan

Data is an asset. Deciding what to share and ensuring data doesn’t leave the organization without proper safeguards is crucial. Data often goes external in AI processes, so understanding and controlling where it goes and who accesses it is vital. Contracts should specify data usage and security, ensuring compliance with data protection laws.

Andy

Shailan, you made an important point about data leaving the organization. Could you elaborate on that?

Shailan

Absolutely. When data leaves the organization, even for software development, you must understand who holds the IP and the data security measures in place. From a DPO perspective, the key question is how much data, if any, you’re willing to share externally. You need to classify data and control its accessibility.

Andy

What about generative AI versus RPA in processing data?

Shailan

It could be both. Generative AI learns and improves with more data, whereas RPA follows set rules. Because data shared with AI may feed its continuous improvement, human oversight is essential, and that raises additional security and privacy considerations.

Tom

Proper contracts about data privacy are essential, just like with QuickBooks. We need standardized terms for public AI models to ensure data handling transparency and reliability. Without these standards, trust in AI could diminish.

Andy

Does this suggest a move towards niche AI models specialized in certain fields?

Tom

Both general and niche models will coexist. Specialized models in areas like tax or medicine can focus on quality inputs and provide better results. General models create broad excitement, but niche models offer depth and accuracy in specific areas.

Shailan

We will see more vertical solutions. Niche AI models will emerge in fields like medicine, enhancing second opinions and specific analyses.

Andy

What’s the conversation like when a customer wants to upload all their data to ChatGPT for analysis?

Tom

First, caution them. Legal consequences, like GDPR fines, are severe. Ensure customer permission and anonymize data. Consider the commercial sensitivity and potential reputational risks. Update privacy policies and seek legal advice. Weigh the benefits and risks carefully.

Andy

For organizations wanting to leverage machine learning, what’s the best approach?

Shailan

Start with anonymized test data. Evaluate the results without exposing sensitive information. Understand internal data processes before using AI to ensure valuable outputs. Specialist AI models might be more beneficial than general ones.

Andy

Final thoughts?

Tom

Read the terms and conditions of any AI tool. AI providers should adopt and follow codes of practice for safety and ethics.

Shailan

Organizations should try AI cautiously, ensuring data security and privacy. Specialist AI models will become more common and useful.

Andy

It's early days for AI, so be cautious. Evaluate internal processes and data readiness before jumping in. The future will likely bring more niche, specialized AI models.

Next week, we’ll discuss the importance of requirements gathering for BI and reporting. Shailan will prepare the briefing notes.

Thanks, everyone. Goodbye.