Commit 15d4520f authored by Thibault Ehrhart

Update README with configuration properties

parent fc5be870
# news-kb
# converter
Tools for building the ASRAEL news KB, including a news collector (converting AFP XML press releases to RDF following the rNews ontology), a named entity annotator (using ADEL), and a source annotator (using the LIMSI source extractor).
@@ -15,6 +15,24 @@ To compile and install the modules, use the following command:
mvn clean install
```
## Configuration
Before running any of the modules, set up the configuration properties.
Copy the file [props/default.yml](props/default.yml) into `props/config.yml` and open it to set up the properties:
| Property | Module | Description | Default Value |
|---|---|---|---|
| `resourcesPath` | all | Path to the resources directory. | `./data/resource/agencefrancepresse` |
| `dumpPath` | all | Path to the dump directory. | `./data/dump/agencefrancepresse` |
| `annotatorPath` | _brat-annotator_ | Path to the src folder of the news-annotations project. | `/path/to/news-annotations/src` |
| `annotatorFile` | _brat-annotator_ | Name of the script file used to annotate articles. | `annotate_article.py` |
| `forbiddenKeywords` | _news-collector_ | List of keywords used to ignore certain news. | see [default.yml](props/default.yml) |
| `forbiddenGenres` | _news-collector_ | List of genres used to ignore certain news. | see [default.yml](props/default.yml) |
| `classificationJsonPath` | _classification-converter_ | Path to the JSON output file generated during classification. | `./data/resource/classification/out.pkl.json` |
| `classificationXmlPath` | _classification-converter_ | Path to the XML directory used during classification. | `./data/resource/classification/xml` |
| `clustersPath` | _clusters-converter_ | Path to the output from the clustering program. | `./data/resource/clustering/out` |
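
As an illustration, a `props/config.yml` that keeps the default values from the table might look like the sketch below. The flat, top-level key layout is an assumption; [props/default.yml](props/default.yml) remains the reference for the actual structure. The contents of `forbiddenKeywords` and `forbiddenGenres` are not reproduced here and should be kept as copied from the default file.

```yaml
# props/config.yml (sketch; the flat top-level layout is an assumption,
# check props/default.yml for the actual structure)

# Used by all modules
resourcesPath: ./data/resource/agencefrancepresse
dumpPath: ./data/dump/agencefrancepresse

# brat-annotator
annotatorPath: /path/to/news-annotations/src   # replace with your local checkout
annotatorFile: annotate_article.py

# news-collector
# forbiddenKeywords and forbiddenGenres: keep the lists copied from props/default.yml

# classification-converter
classificationJsonPath: ./data/resource/classification/out.pkl.json
classificationXmlPath: ./data/resource/classification/xml

# clusters-converter
clustersPath: ./data/resource/clustering/out
```

In most setups only `annotatorPath` needs to change, since its default is a placeholder rather than a usable path.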
## Run
### News Collector
......