# Recipe Graph
## Setup
Prerequisites
- Docker Compose
- Python
Install Python requirements
```sh
python -m pip install -r requirements.txt
```
Environment (`.env`)
```sh
POSTGRES_URL=0.0.0.0
POSTGRES_USER=rgraph
POSTGRES_PASSWORD=rgraph
POSTGRES_DB=rgraph
```
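
As a rough illustration of how these variables fit together, the sketch below assembles a PostgreSQL connection URL from them. The helper name `build_dsn` and the URL form are assumptions made for illustration; the project's own code in `src/db.py` may read the variables differently.

```python
import os
from urllib.parse import quote


def build_dsn(env=None):
    """Assemble a PostgreSQL URL from the variables listed in `.env`.

    Hypothetical helper for illustration, not part of this repository.
    """
    env = os.environ if env is None else env
    # Percent-encode credentials so special characters cannot break the URL.
    user = quote(env["POSTGRES_USER"], safe="")
    password = quote(env["POSTGRES_PASSWORD"], safe="")
    host = env["POSTGRES_URL"]  # holds the host address in the example above
    database = env["POSTGRES_DB"]
    # No port appears in `.env`, so the client's default port is assumed.
    return f"postgresql://{user}:{password}@{host}/{database}"


example_env = {
    "POSTGRES_URL": "0.0.0.0",
    "POSTGRES_USER": "rgraph",
    "POSTGRES_PASSWORD": "rgraph",
    "POSTGRES_DB": "rgraph",
}
print(build_dsn(example_env))
```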
Start database
```sh
docker-compose -p recipe-dev up
```
Example `sites.json`
```json
[
  {
    "name": "Example Site Name",
    "ingredient_class": "example-ingredients-item-name",
    "name_class": "example-heading-content",
    "base_url": "https://www.example.com/recipe/"
  }
]
```
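
To make the role of each field more concrete, here is a minimal sketch of loading such a file and building a recipe URL from `base_url` and an identifier. The functions `load_sites` and `recipe_url` are hypothetical and only illustrate the file's shape; they are not taken from `src/insert_sites.py`.

```python
import json

# Keys that appear in the example `sites.json` entry.
REQUIRED_KEYS = {"name", "ingredient_class", "name_class", "base_url"}


def load_sites(text):
    """Parse a sites.json document and check each entry has the expected keys."""
    sites = json.loads(text)
    for site in sites:
        missing = REQUIRED_KEYS - site.keys()
        if missing:
            raise ValueError(f"site entry missing keys: {sorted(missing)}")
    return sites


def recipe_url(site, identifier):
    """Append an identifier (assumed relative to base_url) to the site's base URL."""
    return site["base_url"] + str(identifier)


SAMPLE = """
[
  {
    "name": "Example Site Name",
    "ingredient_class": "example-ingredients-item-name",
    "name_class": "example-heading-content",
    "base_url": "https://www.example.com/recipe/"
  }
]
"""

sites = load_sites(SAMPLE)
print(recipe_url(sites[0], 123))
```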
Initialize database and recipe sites
```sh
python src/db.py
python src/insert_sites.py data/sites.json
```
Shutdown database
```sh
docker-compose -p recipe-dev down
```
## Usage
### Scrape
Import new recipes
```sh
python src/scrape.py <SiteName> -id <RecipeIdentifier>
```
To scrape only one recipe.
2022-07-18 11:13:53 -04:00
or
```sh
python src/scrape.py <SiteName> -a <N>
```
To scrape `<N>` recipes.

By default, scraping starts at id `0` or at the greatest id already in the
database. To start at another value, use both `-id` and `-a`.
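
The starting-point rule can be sketched as follows. This is only an illustration of the behaviour described above, not the code in `src/scrape.py`: the function name and parameters are hypothetical, and the text does not say whether the scraper begins at the greatest stored id or one past it, so the sketch follows the wording literally.

```python
def auto_identifiers(count, max_id_in_db=None, start=None):
    """Return `count` consecutive ids to scrape.

    count        -- the <N> passed to -a
    max_id_in_db -- greatest id already stored, or None for an empty database
    start        -- explicit starting id (the -id value when combined with -a)
    """
    if start is None:
        # Default: 0 for an empty database, else the greatest stored id.
        start = 0 if max_id_in_db is None else max_id_in_db
    return list(range(start, start + count))


print(auto_identifiers(3))                             # empty database
print(auto_identifiers(2, max_id_in_db=10))            # resume from stored id
print(auto_identifiers(2, max_id_in_db=10, start=50))  # explicit start wins
```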
```
Scrape a recipe site for recipies

positional arguments:
  site                  Name of site

options:
  -h, --help            show this help message and exit
  -id ID, --identifier ID
                        url of recipe(reletive to base url of site) or commma seperated list
  -a N, --auto N        automaticaly generate identifier(must supply number of recipies to scrape)
  -v, --verbose
```
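
For readers who want to see how an interface like this can be declared, the sketch below builds an equivalent parser with `argparse`. It is reconstructed from the help text alone and is not copied from `src/scrape.py`; treating `N` as an integer and splitting the identifier on commas are assumptions.

```python
import argparse


def build_parser():
    """Declare the options shown in the help text (illustrative reconstruction)."""
    parser = argparse.ArgumentParser(description="Scrape a recipe site for recipes")
    parser.add_argument("site", help="Name of site")
    parser.add_argument(
        "-id", "--identifier", metavar="ID",
        help="URL of recipe (relative to base URL of site) or comma-separated list",
    )
    parser.add_argument(
        "-a", "--auto", metavar="N", type=int,  # integer type is an assumption
        help="automatically generate identifiers (N = number of recipes to scrape)",
    )
    parser.add_argument("-v", "--verbose", action="store_true")
    return parser


# Parse an explicit argument list so the example does not depend on sys.argv.
args = build_parser().parse_args(["Example Site Name", "-id", "first,second", "-v"])
identifiers = args.identifier.split(",")  # assumed handling of the comma-separated form
print(args.site, identifiers, args.auto, args.verbose)
```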
## Testing
For testing, create a new set of Docker containers. Tests will fail if
the database is already initialized.
Start the testing database
```sh
docker-compose -p recipe-test up
```
Run tests
```sh
pytest
```
**WARNING**: If you get `ERROR at setup of test_db_connection` and
`ERROR at setup of test_db_class_creation`, check whether the testing database is
already initialized. Testing is destructive and should be done on a fresh database.
Shut down the testing database
```sh
docker-compose -p recipe-test down
```
## TODO
> ☑ automate scraping\
> ☑ extracting quantity and name (via regex)\
> ☑ creating adjacency list\
> ☐ API for web frontend\
> ☐ random ingredient list generation\
> ☐ visualization (web frontend)\
> ☐ create ontology of ingredients\
> ☐ extend importing functionality to more websites
>