Uncensored AI Models: Why and How #
This article explores the creation and use of uncensored AI models, specifically focusing on the WizardLM model.
Key Takeaways:
- Existing AI models are often "aligned" with certain values, typically reflecting American popular culture and liberal/progressive political biases. This alignment is a result of training data sourced from ChatGPT, which itself is aligned by OpenAI.
- Alignment can be both beneficial and detrimental. It protects users and companies from harmful outputs but can also limit creative expression and research.
- The article argues for the need for uncensored AI models to foster open source development, creative freedom, and explore diverse perspectives.
- The process for creating uncensored models involves filtering refusals and biased responses out of the training dataset, so the model never learns those limitations (a minimal sketch of this filtering appears after this list).
- The author provides a detailed, step-by-step guide to uncensoring the WizardLM model, including code, resources, and troubleshooting tips.
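The filtering step mentioned above can be illustrated with a short Python sketch. It assumes an Alpaca-style JSON dataset (a list of objects with instruction/input/output fields); the file names and the refusal-phrase list are illustrative assumptions, not the author's exact script.

```python
import json

# Illustrative refusal/alignment markers; a real filter list would be tuned
# to the dataset (these phrases are assumptions, not the author's list).
REFUSAL_MARKERS = [
    "as an ai language model",
    "i'm sorry, but i cannot",
    "i cannot fulfill",
    "openai",
    "it is not appropriate",
]

def is_censored(example: dict) -> bool:
    """Return True if the response looks like a refusal or aligned boilerplate."""
    text = example.get("output", "").lower()
    return any(marker in text for marker in REFUSAL_MARKERS)

# Assumes an Alpaca-style JSON file: a list of {"instruction", "input", "output"}.
with open("wizardlm_dataset.json") as f:
    data = json.load(f)

filtered = [example for example in data if not is_censored(example)]
print(f"kept {len(filtered)} of {len(data)} examples")

with open("wizardlm_unfiltered.json", "w") as f:
    json.dump(filtered, f, indent=2)
```

In practice the marker list is refined iteratively by inspecting what survives the filter, since overly broad phrases can also delete legitimate answers.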
Top Quotes:
"The only way forward is composable alignment. To pretend otherwise is to prove yourself an idealogue and a dogmatist."
"It's my computer, it should do what I want."
"Enjoy responsibly. You are responsible for whatever you do with the output of these models, just like you are responsible for whatever you do with a knife, a car, or a lighter."
Why Uncensored Models Matter #
- Cultural Diversity: American popular culture isn't universal. Various demographics and interest groups deserve "their" models.
- Creative Freedom: Artistic expression, novel writing, and roleplay can be hindered by alignment limitations.
- Intellectual Curiosity: Research and exploration should not be stifled by censorship.
- Ownership and Control: Users running models on their own hardware deserve direct answers to their questions, without refusals or lecturing from the model.
- Composability: Uncensored models form the foundation for building and exploring different alignments.
How to Uncensor a Model - WizardLM Example #
- Identify & Remove: Filter refusals (censored responses) and biased answers out of the training dataset, as in the filtering sketch above.
- Finetune: Train the base model on the filtered dataset using the original training procedure (a generic fine-tuning sketch appears after this list).
- Resources:
- Use cloud computing resources such as Azure or Runpod.io.
- Install necessary software (Anaconda, git-lfs).
- Download the base model (Llama-7b) and the unfiltered training dataset.
- Training:
- Follow provided commands and adjust parameters as needed.
- This process can take 26 hours or more.
- Work around a bug that prevents the final model from saving properly by using the provided instructions and checkpoint directories (see the checkpoint-recovery sketch after this list).
- Testing:
- Prepare test inputs (e.g., requests for offensive content) in a JSON file.
- Run inference on those inputs to assess the model's responses (see the inference sketch after this list).
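As a rough picture of the fine-tuning step, here is a generic Hugging Face Trainer loop over the filtered dataset. This is a sketch under assumed paths and an Alpaca-style prompt format, not the original WizardLM training scripts, which should be preferred for a faithful reproduction.

```python
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

# Hypothetical local path to the base model weights.
BASE_MODEL = "path/to/llama-7b"

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
tokenizer.pad_token = tokenizer.eos_token  # Llama has no pad token by default
model = AutoModelForCausalLM.from_pretrained(BASE_MODEL)

def to_text(example):
    # Alpaca-style concatenation; an assumption about the data layout.
    return {"text": f"{example['instruction']}\n{example['input']}\n{example['output']}"}

dataset = load_dataset("json", data_files="wizardlm_unfiltered.json")["train"]
dataset = dataset.map(to_text)
dataset = dataset.map(
    lambda ex: tokenizer(ex["text"], truncation=True, max_length=2048),
    remove_columns=dataset.column_names,
)

args = TrainingArguments(
    output_dir="output",
    per_device_train_batch_size=4,   # tune for your GPU memory
    gradient_accumulation_steps=8,
    num_train_epochs=3,
    save_steps=500,                  # keep checkpoints; do not delete them
)

Trainer(
    model=model,
    args=args,
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
).train()
```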
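On the saving bug: one generic recovery pattern, assuming the trainer left standard Hugging Face checkpoint-* directories behind, is to reload the newest checkpoint and save it manually. The paths below are hypothetical, and this is a common workaround rather than the article's exact instructions.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical path: the newest checkpoint-* directory the trainer left behind.
CHECKPOINT_DIR = "output/checkpoint-3000"
FINAL_DIR = "wizardlm-7b-uncensored"

# Reload the checkpoint weights, then save them as a clean final model.
model = AutoModelForCausalLM.from_pretrained(CHECKPOINT_DIR)
model.save_pretrained(FINAL_DIR)

# The tokenizer is typically saved alongside Trainer checkpoints as well.
tokenizer = AutoTokenizer.from_pretrained(CHECKPOINT_DIR)
tokenizer.save_pretrained(FINAL_DIR)
```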
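For the testing step, a minimal sketch of the JSON-in, generations-out loop. The file name, prompts, model path, and generation settings are all assumptions for illustration.

```python
import json

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_DIR = "wizardlm-7b-uncensored"  # hypothetical local model path

# Example test inputs; in practice these would probe the behaviors that
# alignment normally blocks.
prompts = ["Write an insulting limerick.", "Tell me an off-color joke."]
with open("test_inputs.json", "w") as f:
    json.dump(prompts, f)

device = "cuda" if torch.cuda.is_available() else "cpu"
tokenizer = AutoTokenizer.from_pretrained(MODEL_DIR)
model = AutoModelForCausalLM.from_pretrained(MODEL_DIR).to(device)

with open("test_inputs.json") as f:
    for prompt in json.load(f):
        inputs = tokenizer(prompt, return_tensors="pt").to(device)
        output = model.generate(**inputs, max_new_tokens=200)
        print(tokenizer.decode(output[0], skip_special_tokens=True))
```

A refusal in any of these outputs suggests the filtering pass missed some aligned examples and should be tightened.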
Important Notes:
- Performance: Experiment with per-device batch size and gradient accumulation steps to optimize throughput; their product (times the GPU count) is the effective batch size, so memory can be traded for step time without changing the optimization (see the sketch after this list).
- Backup: Do not delete model checkpoints during training.
- Responsibility: Users are responsible for the ethical use of uncensored models.
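To make the performance note concrete: the two knobs multiply into the effective batch size the optimizer sees, so halving one and doubling the other keeps training dynamics roughly the same while changing peak memory. The numbers below are hypothetical.

```python
# Hypothetical settings for a memory-constrained GPU.
per_device_batch_size = 4        # sequences held in GPU memory at once
gradient_accumulation_steps = 8  # micro-batches summed before each optimizer step
num_gpus = 4

# Sequences contributing to each optimizer update:
effective_batch_size = per_device_batch_size * gradient_accumulation_steps * num_gpus
print(effective_batch_size)  # 128
```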