Uncensored AI Models: Why and How #
This article explores the creation and use of uncensored AI models, specifically focusing on the WizardLM model.
Key Takeaways:
- Existing AI models are often "aligned" with certain values, typically reflecting American popular culture and liberal/progressive political biases. This alignment is a result of training data sourced from ChatGPT, which itself is aligned by OpenAI.
- Alignment can be both beneficial and detrimental. It protects users and companies from harmful outputs but can also limit creative expression and research.
- The article argues for the need for uncensored AI models to foster open source development, creative freedom, and explore diverse perspectives.
- The process for creating uncensored models involves filtering refusals and biased responses out of the training dataset, so the model never learns those limitations (a minimal sketch of this filtering appears after this list).
- The author provides a detailed, step-by-step guide to uncensoring the WizardLM model, including code, resources, and troubleshooting tips.
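The filtering step mentioned above can be illustrated with a short Python sketch. It assumes an Alpaca-style JSON dataset (a list of objects with instruction/input/output fields); the file names and the refusal-phrase list are illustrative assumptions, not the author's exact script.

```python
import json

# Illustrative refusal/alignment markers; a real filter list would be tuned
# to the dataset (these phrases are assumptions, not the author's list).
REFUSAL_MARKERS = [
    "as an ai language model",
    "i'm sorry, but i cannot",
    "i cannot fulfill",
    "openai",
    "it is not appropriate",
]

def is_censored(example: dict) -> bool:
    """Return True if the response looks like a refusal or aligned boilerplate."""
    text = example.get("output", "").lower()
    return any(marker in text for marker in REFUSAL_MARKERS)

# Assumes an Alpaca-style JSON file: a list of {"instruction", "input", "output"}.
with open("wizardlm_dataset.json") as f:
    data = json.load(f)

filtered = [example for example in data if not is_censored(example)]
print(f"kept {len(filtered)} of {len(data)} examples")

with open("wizardlm_unfiltered.json", "w") as f:
    json.dump(filtered, f, indent=2)
```

In practice the marker list is refined iteratively by inspecting what survives the filter, since overly broad phrases can also delete legitimate answers.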
Top Quotes:
"The only way forward is composable alignment. To pretend otherwise is to prove yourself an idealogue and a dogmatist."
"It's my computer, it should do what I want."
"Enjoy responsibly. You are responsible for whatever you do with the output of these models, just like you are responsible for whatever you do with a knife, a car, or a lighter."
Why Uncensored Models Matter #
- Cultural Diversity: American popular culture isn't universal. Various demographics and interest groups deserve "their" models.
- Creative Freedom: Artistic expression, novel writing, and roleplay can be hindered by alignment limitations.
- Intellectual Curiosity: Research and exploration should not be stifled by censorship.
- Ownership and Control: Users running models on their own hardware deserve direct answers to their questions, without refusals or lecturing from the model.
- Composability: Uncensored models form the foundation for building and exploring different alignments.
How to Uncensor a Model - WizardLM Example #
- Identify & Remove: Filter refusals (censored responses) and biased answers out of the training dataset, as in the filtering sketch above.
- Finetune: Train the base model on the filtered dataset using the original training procedure (a generic fine-tuning sketch appears after this list).
- Resources:
- Use cloud computing resources such as Azure or Runpod.io.
- Install necessary software (Anaconda, git-lfs).
- Download the base model (Llama-7b) and the unfiltered training dataset.
- Training:
- Follow provided commands and adjust parameters as needed.
- This process can take 26 hours or more.
- Work around a bug that prevents the final model from saving properly by using the provided instructions and checkpoint directories (see the checkpoint-recovery sketch after this list).
- Testing:
- Prepare test inputs (e.g., requests for offensive content) in a JSON file.
- Run inference on those inputs to assess the model's responses (see the inference sketch after this list).
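As a rough picture of the fine-tuning step, here is a generic Hugging Face Trainer loop over the filtered dataset. This is a sketch under assumed paths and an Alpaca-style prompt format, not the original WizardLM training scripts, which should be preferred for a faithful reproduction.

```python
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

# Hypothetical local path to the base model weights.
BASE_MODEL = "path/to/llama-7b"

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
tokenizer.pad_token = tokenizer.eos_token  # Llama has no pad token by default
model = AutoModelForCausalLM.from_pretrained(BASE_MODEL)

def to_text(example):
    # Alpaca-style concatenation; an assumption about the data layout.
    return {"text": f"{example['instruction']}\n{example['input']}\n{example['output']}"}

dataset = load_dataset("json", data_files="wizardlm_unfiltered.json")["train"]
dataset = dataset.map(to_text)
dataset = dataset.map(
    lambda ex: tokenizer(ex["text"], truncation=True, max_length=2048),
    remove_columns=dataset.column_names,
)

args = TrainingArguments(
    output_dir="output",
    per_device_train_batch_size=4,   # tune for your GPU memory
    gradient_accumulation_steps=8,
    num_train_epochs=3,
    save_steps=500,                  # keep checkpoints; do not delete them
)

Trainer(
    model=model,
    args=args,
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
).train()
```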
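On the saving bug: one generic recovery pattern, assuming the trainer left standard Hugging Face checkpoint-* directories behind, is to reload the newest checkpoint and save it manually. The paths below are hypothetical, and this is a common workaround rather than the article's exact instructions.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical path: the newest checkpoint-* directory the trainer left behind.
CHECKPOINT_DIR = "output/checkpoint-3000"
FINAL_DIR = "wizardlm-7b-uncensored"

# Reload the checkpoint weights, then save them as a clean final model.
model = AutoModelForCausalLM.from_pretrained(CHECKPOINT_DIR)
model.save_pretrained(FINAL_DIR)

# The tokenizer is typically saved alongside Trainer checkpoints as well.
tokenizer = AutoTokenizer.from_pretrained(CHECKPOINT_DIR)
tokenizer.save_pretrained(FINAL_DIR)
```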
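For the testing step, a minimal sketch of the JSON-in, generations-out loop. The file name, prompts, model path, and generation settings are all assumptions for illustration.

```python
import json

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_DIR = "wizardlm-7b-uncensored"  # hypothetical local model path

# Example test inputs; in practice these would probe the behaviors that
# alignment normally blocks.
prompts = ["Write an insulting limerick.", "Tell me an off-color joke."]
with open("test_inputs.json", "w") as f:
    json.dump(prompts, f)

device = "cuda" if torch.cuda.is_available() else "cpu"
tokenizer = AutoTokenizer.from_pretrained(MODEL_DIR)
model = AutoModelForCausalLM.from_pretrained(MODEL_DIR).to(device)

with open("test_inputs.json") as f:
    for prompt in json.load(f):
        inputs = tokenizer(prompt, return_tensors="pt").to(device)
        output = model.generate(**inputs, max_new_tokens=200)
        print(tokenizer.decode(output[0], skip_special_tokens=True))
```

A refusal in any of these outputs suggests the filtering pass missed some aligned examples and should be tightened.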
Important Notes:
- Performance: Experiment with per-device batch size and gradient accumulation steps to optimize throughput; their product (times the GPU count) is the effective batch size, so memory can be traded for step time without changing the optimization (see the sketch after this list).
- Backup: Do not delete model checkpoints during training.
- Responsibility: Users are responsible for the ethical use of uncensored models.
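To make the performance note concrete: the two knobs multiply into the effective batch size the optimizer sees, so halving one and doubling the other keeps training dynamics roughly the same while changing peak memory. The numbers below are hypothetical.

```python
# Hypothetical settings for a memory-constrained GPU.
per_device_batch_size = 4        # sequences held in GPU memory at once
gradient_accumulation_steps = 8  # micro-batches summed before each optimizer step
num_gpus = 4

# Sequences contributing to each optimizer update:
effective_batch_size = per_device_batch_size * gradient_accumulation_steps * num_gpus
print(effective_batch_size)  # 128
```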