# gpt-2
Code and samples from the paper ["Language Models are Unsupervised Multitask Learners"](https://d4mucfpksywv.cloudfront.net/better-language-models/language-models.pdf).
For now, we have only released a smaller (117M parameter) version of GPT-2.
See more details in our [blog post](https://blog.openai.com/better-language-models/).
## Installation
Clone this repository and `cd` into its directory for the remaining commands:
```
git clone https://github.com/openai/gpt-2.git && cd gpt-2
```
Then follow the instructions for either a native or Docker installation.
### Native Installation
Download the model data:
```
sh download_model.sh 117M
```
The remaining steps can optionally be done in a virtual environment using tools such as `virtualenv` or `conda`.
Install TensorFlow 1.12 (with GPU support if you have a GPU and want everything to run faster):
```
pip3 install tensorflow==1.12.0
```
or
```
pip3 install tensorflow-gpu==1.12.0
```
Install the other Python packages:
```
pip3 install -r requirements.txt
```
### Docker Installation
Build the Dockerfile and tag the created image as `gpt-2`:
```
docker build --tag gpt-2 -f Dockerfile.gpu . # or Dockerfile.cpu
```
Start an interactive bash session from the `gpt-2` Docker image.
You can opt to use the `--runtime=nvidia` flag if you have access to an NVIDIA GPU
and a valid install of [nvidia-docker 2.0](https://github.com/nvidia/nvidia-docker/wiki/Installation-(version-2.0)).
```
docker run --runtime=nvidia -it gpt-2 bash
```
## Usage
| WARNING: Samples are unfiltered and may contain offensive content. |
| --- |
Some of the examples below may include Unicode text characters. Set the environment variable:
```
export PYTHONIOENCODING=UTF-8
```
to force the standard streams into UTF-8 mode.
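A quick way to confirm the override took effect is to check the stream's encoding from Python (a minimal sketch; the exact reported name may vary by platform and Python version):

```python
import sys

# With PYTHONIOENCODING=UTF-8 set, stdout uses UTF-8 regardless of the
# locale's default, so non-ASCII sample text prints without errors.
print(sys.stdout.encoding)
print("naïve café — 日本語")
```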
### Unconditional sample generation
To generate unconditional samples from the small model:
```
python3 src/generate_unconditional_samples.py | tee /tmp/samples
```
There are various flags for controlling the samples:
```
python3 src/generate_unconditional_samples.py --top_k 40 --temperature 0.7 | tee /tmp/samples
```
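To build intuition for what these two flags do, here is a simplified Python sketch of temperature scaling and top_k truncation over a vector of next-token logits (an illustration only, not the repository's sampling code, which lives in `src/sample.py`):

```python
import math
import random

def sample_token(logits, temperature=0.7, top_k=40):
    """Pick one token index from raw logits, mimicking the effect of the
    --temperature and --top_k flags (simplified sketch)."""
    # Lower temperature sharpens the distribution; 1.0 leaves it unchanged.
    scaled = [l / temperature for l in logits]
    # top_k truncation: discard everything outside the k highest logits.
    if top_k > 0:
        cutoff = sorted(scaled, reverse=True)[min(top_k, len(scaled)) - 1]
        scaled = [s if s >= cutoff else float("-inf") for s in scaled]
    # Softmax over the surviving logits, then draw one index.
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]
    return random.choices(range(len(weights)), weights=weights)[0]
```

With `top_k=1` this reduces to greedy decoding (always the highest-scoring token); with `top_k=0` no truncation is applied.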
To check flag descriptions, use:
```
python3 src/generate_unconditional_samples.py -- --help
```
### Conditional sample generation
To give the model custom prompts, you can use:
```
python3 src/interactive_conditional_samples.py --top_k 40
```
To check flag descriptions, use:
```
python3 src/interactive_conditional_samples.py -- --help
```
## GPT-2 samples
| WARNING: Samples are unfiltered and may contain offensive content. |
| --- |
While we have not yet released GPT-2 itself, you can see some samples from it in the `gpt-2-samples` folder.
We show unconditional samples with default settings (temperature 1 and no truncation), with temperature 0.7, and with top_k 40 truncation.
We show conditional samples, with contexts drawn from `WebText`'s test set, with default settings (temperature 1 and no truncation), with temperature 0.7, and with top_k 40 truncation.
## Future work
We may release code for evaluating the models on various benchmarks.
We are still considering releasing the larger models.