Install from Source

Install dependencies
```
pip install -r requirements.txt
```

Configure the environment

Create a .env file in the root directory by copying the provided template
```
cp .env.example .env
```

Set the following environment variables:

# Synthesizer is the model used to construct the knowledge graph (KG) and generate data
SYNTHESIZER_MODEL=your_synthesizer_model_name
SYNTHESIZER_BASE_URL=your_base_url_for_synthesizer_model
SYNTHESIZER_API_KEY=your_api_key_for_synthesizer_model
# Trainee is the model that is trained on the generated data
TRAINEE_MODEL=your_trainee_model_name
TRAINEE_BASE_URL=your_base_url_for_trainee_model
TRAINEE_API_KEY=your_api_key_for_trainee_model
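
Before running the pipeline, it can help to confirm that every variable has been filled in. The following is a minimal sketch and not part of GraphGen: `parse_env`, `find_unset_vars`, and `REQUIRED_VARS` are hypothetical names. It assumes .env contains plain `KEY=value` lines and that any value still starting with `your_` is an unedited placeholder from the template.

```python
from pathlib import Path

# Variable names taken from the list above.
REQUIRED_VARS = [
    "SYNTHESIZER_MODEL",
    "SYNTHESIZER_BASE_URL",
    "SYNTHESIZER_API_KEY",
    "TRAINEE_MODEL",
    "TRAINEE_BASE_URL",
    "TRAINEE_API_KEY",
]


def parse_env(text):
    """Parse simple KEY=value lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def find_unset_vars(text, required=REQUIRED_VARS):
    """Return required variables that are missing, empty, or still placeholders."""
    values = parse_env(text)
    return [
        name
        for name in required
        # Assumption: template placeholders all begin with "your_".
        if not values.get(name) or values[name].startswith("your_")
    ]


if __name__ == "__main__":
    env_path = Path(".env")
    if not env_path.exists():
        print("No .env file found; copy .env.example first.")
    else:
        unset = find_unset_vars(env_path.read_text(encoding="utf-8"))
        print("Unset variables:", unset or "none")
```

Run it from the repository root after editing .env. An empty result means all six variables have non-placeholder values; it does not verify that the keys or URLs actually work.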

(Optional) To change the default generation settings, edit configs/graphgen_config.yaml.

# configs/graphgen_config.yaml
# Example configuration
data_type: "raw"
input_file: "resources/examples/raw_demo.jsonl"
# more configurations...
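
If you point `input_file` at your own data, a quick sanity check can catch malformed lines before a run. The sketch below assumes only the general JSON Lines convention (one JSON value per non-empty line). It does not check which fields GraphGen expects, since those are not described here. `invalid_jsonl_lines` is a hypothetical helper, not a GraphGen function.

```python
import json


def invalid_jsonl_lines(lines):
    """Return 1-based numbers of non-empty lines that are not valid JSON."""
    bad = []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # tolerate blank lines
        try:
            json.loads(line)
        except json.JSONDecodeError:
            bad.append(number)
    return bad


if __name__ == "__main__":
    path = "resources/examples/raw_demo.jsonl"  # path from the example config above
    try:
        with open(path, encoding="utf-8") as handle:
            print("Invalid lines:", invalid_jsonl_lines(handle) or "none")
    except FileNotFoundError:
        print(f"{path} not found; run this from the repository root.")
```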

Run the generation script
```
bash scripts/generate.sh
```
Get the generated data
```
ls cache/data/graphgen
```
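
To inspect the results from a script instead of the shell, a small helper can list whatever the run produced. This is only a sketch: `list_outputs` is a hypothetical name, and the layout and file types inside cache/data/graphgen are not specified here, so the code simply reports every file it finds.

```python
from pathlib import Path


def list_outputs(directory):
    """Return sorted (relative path, size in bytes) pairs for files under directory."""
    root = Path(directory)
    if not root.is_dir():
        return []
    return sorted(
        (p.relative_to(root).as_posix(), p.stat().st_size)
        for p in root.rglob("*")
        if p.is_file()
    )


if __name__ == "__main__":
    for name, size in list_outputs("cache/data/graphgen"):
        print(f"{size:>10}  {name}")
```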
