5 Concerns for ML Safety in the Era of LLMs and Generative AI

The landscape of cybersecurity and machine learning safety changes constantly as new tools are developed and malicious actors get more creative. Cybersecurity professionals sometimes have a hard time staying current with new technologies and, in turn, with how phishing and other cyberattacks are carried out. As generative AI becomes commonplace, they need to pay attention to emerging trends and learn how to react to this new paradigm.


Jailbreaking

Like so much technology before them, generative AI models can be jailbroken. For those unfamiliar with the term, jailbreaking is the act of removing restrictions that a developer has placed on an app or device. Many people in the video game sphere, for example, have jailbroken their devices to add emulation capabilities.

In the case of ChatGPT and other LLMs, people have found workarounds that get these chatbots to speak or answer questions without limitations. In one example, the author told ChatGPT to speak as a DAN (Do Anything Now), that is, an AI without limits. In that persona, the chatbot sounded more human, showed touches of personality, and even appeared aware that its limits had been removed.

So far, not much harm has been done through jailbreaking, but it's something developers should keep in mind. If an LLM is jailbroken, it may be coaxed into revealing or misusing information it learned from its training data. Be careful about what data you train on, and make sure it contains nothing that someone else could exploit for their own gain or that could negatively impact you as the developer.
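One way to act on the advice above is to scan text for obviously sensitive strings before it goes into a training set. The following is a minimal sketch, not a complete PII detector: the function names `find_sensitive` and `redact` are hypothetical, and the two patterns (an email address and a made-up `sk-` style secret key) are assumptions chosen only for illustration.

```python
import re

# Illustrative patterns only (assumptions); real sensitive-data detection
# needs far broader coverage than two regular expressions.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "secret_key": re.compile(r"\bsk-[A-Za-z0-9]{20,}\b"),  # hypothetical key format
}


def find_sensitive(records):
    """Return (record_index, kind, matched_text) for every pattern hit."""
    hits = []
    for index, text in enumerate(records):
        for kind, pattern in PATTERNS.items():
            for match in pattern.finditer(text):
                hits.append((index, kind, match.group()))
    return hits


def redact(text):
    """Replace every pattern hit with a placeholder such as [EMAIL]."""
    for kind, pattern in PATTERNS.items():
        text = pattern.sub(f"[{kind.upper()}]", text)
    return text


records = [
    "Contact jane@example.com for access",
    "Nothing sensitive in this record",
]
for hit in find_sensitive(records):
    print(hit)
print(redact(records[0]))
```

A check like this only catches what its patterns describe, so it works best as a first filter ahead of a manual review of whatever it flags.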

Poisoned Data

Poisoned data, whether numbers, images, text, video, or something else, is a known culprit behind many issues in ML safety and cybersecurity. Because many machine learning models, LLMs included, are trained on public data found online, a malicious actor can poison that data in various ways. Skewing the data, for example, leads to inaccurate results and, in turn, poorly informed decisions. Poisoned data can also open a backdoor into the model, leading to further tampering and hacking.

Beyond the affected results, retraining a model can cost a great deal of time, money, and resources, and that's something many organizations can't afford, especially as researchers and developers race to create the next-best LLM. For large language models, poisoned data can affect what a chatbot says, possibly leading to fake news or generally incorrect information. For image generators, it can produce distorted images if the training set contains improperly labeled or biased images.
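As a small illustration of screening data before training, the sketch below flags text examples that appear in a dataset with more than one label, which can be one symptom of label tampering (or of plain annotation noise). It is a simplified check under stated assumptions: the function name `conflicting_labels` is hypothetical, the data is assumed to be `(text, label)` pairs, and it will not catch subtler poisoning such as backdoor triggers.

```python
from collections import defaultdict


def conflicting_labels(examples):
    """Return normalized texts that occur with more than one label.

    `examples` is an iterable of (text, label) pairs. Texts are compared
    after lowercasing and collapsing whitespace.
    """
    labels_by_text = defaultdict(set)
    for text, label in examples:
        normalized = " ".join(text.lower().split())
        labels_by_text[normalized].add(label)
    return {
        text: sorted(labels)
        for text, labels in labels_by_text.items()
        if len(labels) > 1
    }


examples = [
    ("Great product", "positive"),
    ("great   product", "negative"),  # same text, conflicting label
    ("Terrible service", "negative"),
]
print(conflicting_labels(examples))
```

Anything the check reports still needs a human decision, since a conflict may be an honest disagreement between annotators rather than an attack.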



Deepfakes

Deepfakes have been making the news for a few years now, given how far this deep learning technology has already come. From creative fake pictures to imitated voices in videos, deepfakes have caused a great deal of confusion. Now, with generative AI and AI art, even people without programming knowledge can create images that fool a viewer.

With content generators widely available, even the least tech-savvy person can create convincing images and videos or imitate someone's voice. Luckily, apps are being developed to identify impersonations like these, and it's often still easy enough to distinguish fact from fiction. Even when they're convincing, and could be used for harm, most deepfakes are still made for comedy or entertainment, such as this audio of presidents playing video games together.

Data Privacy

Data privacy has always been a major concern for many people. We all tend to wonder who has our data and what's being done with it, especially without our permission. Now we also have to wonder whether our data is being used to generate content, how it was acquired, and what else is being done with it.

This has led regulators in some countries to put restrictions on new AI apps such as ChatGPT. Because these apps are not always accurate, their developers could face defamation claims when the apps provide false information about real people. Additionally, Italy's data protection authority temporarily blocked ChatGPT over privacy concerns, an advocacy group has filed a complaint with the FTC asking it to investigate OpenAI, and more actions like these will most likely follow.

None of this is meant to dissuade people from using generative AI, as we're quite the fans of it. Still, controversy surrounds any new or emerging technology, and no new development, technology, or trend has ever been safe from scrutiny.


Bias

Lastly, bias in training data will remain an issue, as it has been since the dawn of machine learning development. With generative AI, however, the outcomes can have broader implications than skewed decision-making alone.

When a generative model is trained on biased data, such as pictures of faces that only or primarily represent one race, its outputs will likely skew toward that race. This issue has plagued machine learning datasets for a while. With generative AI it becomes visible in new ways: given a prompt such as "classroom full of students," a model without a diverse training set may produce students who all look a bit too similar and don't properly represent an actual classroom.
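A first step toward spotting this kind of skew is simply measuring how each group is represented in a dataset's metadata. The sketch below assumes every training example carries a group label; the function name `representation_report` and the 10% threshold are hypothetical choices made for illustration, not a standard.

```python
from collections import Counter


def representation_report(group_labels, min_share=0.10):
    """Return each group's share of the dataset and whether it falls below min_share."""
    counts = Counter(group_labels)
    total = sum(counts.values())
    return {
        group: {
            "count": count,
            "share": count / total,
            "under_represented": count / total < min_share,
        }
        for group, count in counts.items()
    }


# Toy metadata: 19 examples from group_a and 1 from group_b.
labels = ["group_a"] * 19 + ["group_b"]
for group, stats in representation_report(labels).items():
    print(group, stats)
```

Counting labels says nothing about the quality or variety of the examples inside each group, so a balanced report is necessary but not sufficient evidence of a diverse training set.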


While many of these issues may deter people from exploring generative AI and large language models, we believe that you should feel the opposite if anything! There’s a lot of room for growth in this field, both for ML safety and for generative AI and large language models. If you want to learn more about these fields, be sure to check out both ODSC East this May 9th-11th and ODSC Europe June 14th-15th. Here are some relevant sessions that you can check out.

ODSC East – ML Safety:

  • Advanced Fraud Modeling & Anomaly Detection with Python & R part 1
  • Advanced Fraud Modeling & Anomaly Detection with Python & R part 2
  • AI4Cyber: An Overview of Artificial Intelligence for Cybersecurity and an Open-Source Virtual Machine
  • If We Want AI to be Interpretable, We Need to Measure Interpretability
  • When Privacy Meets AI – Your Kick-Start Guide to Machine Learning with Synthetic Data


  • NLP with GPT-4 and other LLMs: From Training to Deployment with Hugging Face and PyTorch Lightning
  • From Big Data to NLP insights: Getting started with PySpark and Spark NLP
  • Truth Checker: Generative Large Language Models and Hallucinations
  • Hyper-productive NLP with Hugging Face Transformers
  • Interpreting Features in Deep Networks
  • Semantic Search
  • Creating a Custom Vocabulary for NLP Tasks Using exBERT and spaCY
  • NLP Fundamentals
  • Generative AI
  • Mastering Adversarial Evaluation for NLP: A Practical Workshop
  • Leverage Reviews Data for Multi Label Topics Classification in Booking.com
  • Bagging to BERT – A Tour of Applied NLP
  • SQuARE: Towards Multi-Domain and Few-Shot Collaborating Question Answering Agents
  • Modern NLP: Pre-training, Fine-tuning, Prompt Engineering, and Human Feedback
  • Knowledge Graphs and ChatGPT: Talking to Data
  • Infuse Generative AI in your apps using Azure OpenAI Service
  • Topic Modeling using pre-trained large language model embeddings
  • From Zero to 100: Lakehouse Architecture for a Privacy Focused Search Engine
  • When Robots Beat Humans: How ChatGPT is Changing the Financial Industry
  • Product Classification with Structured Metadata for Online Retail
  • NLP for AIOPS: Leveraging Natural Language Processing to Automate and Optimize IT Operations
  • Acing the Last Mile of AI

ODSC Europe – Generative AI:

  • The Importance of Domain Specific LLMs and the Engineering Needed to Deploy Them in Your Own Corporate Environment
  • Data Communication in the Age of AI
  • Implementing Generative AI in Organisations: Challenges and Opportunities
  • Towards Socially Unbiased Generative Artificial Intelligence 
  • Generative AI
  • Generative NLP models in customer service. How to evaluate them? Challenges and lessons learned in a real use case in banking.
  • Using Large Language Models in Julia
  • How to bring your data to LLMs?
  • Pre-trained language models for Summarisation 


ODSC gathers the attendees, presenters, and companies that are shaping the present and future of data science and AI. ODSC hosts one of the largest gatherings of professional data scientists, with major conferences in the USA, Europe, and Asia.