match CREATE_MODE

Elijah McMorris 2024-02-06 10:05:58 -08:00
parent 305bf5273b
commit 38fdee5728
Signed by: NexVeridian
SSH key fingerprint: SHA256:bsA1SKZxuEcEVHAy3gY1HUeM5ykRJl0U0kQHQn0hMg8
7 changed files with 87 additions and 60 deletions

@@ -3,7 +3,7 @@ A tool for converting Wikidata dumps to a [SurrealDB](https://surrealdb.com/) da
The surrealdb database is ~2.6GB uncompressed or 0.5GB compressed, while the bz2 file is ~80GB, gzip file is ~130GB, and the uncompressed json file is over 1TB.
Querying the entire database takes ~2 seconds per query. Building the database on a 7600k takes ~55 hours, using a cpu with more cores should be faster.
Building the database on a 7600k takes ~55 hours using ThreadedSingle; a CPU with more cores should be faster.
# Getting The Data
https://www.wikidata.org/wiki/Wikidata:Data_access
@@ -42,20 +42,32 @@ Create data folder next to docker-compose.yml and .env, place data inside, and s
DB_USER=root
DB_PASSWORD=root
WIKIDATA_LANG=en
FILE_FORMAT=bz2
FILE_NAME=data/latest-all.json.bz2
WIKIDATA_FILE_FORMAT=bz2
WIKIDATA_FILE_NAME=data/latest-all.json.bz2
# If not using docker file for Wikidata to SurrealDB, use 0.0.0.0:8000
WIKIDATA_DB_PORT=surrealdb:8000
THREADED_REQUESTS=true
WIKIDATA_BULK_INSERT=true
# true=overwrite existing data, false=skip if already exists
OVERWRITE_DB=false
INDIVIDUAL_WS=true
CREATE_MODE=ThreadedSingle
```
The env string CREATE_MODE must match one of the variants of the enum CreateMode:
```
pub enum CreateMode {
    Single,
    ThreadedSingle,
    ThreadedBulk, // Buggy
}
```
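As a rough illustration of what "must be in the enum" could mean in practice, the sketch below parses the CREATE_MODE string into `CreateMode` with a `match`. It is not taken from this commit: the `FromStr` implementation, the case-sensitive matching, the `String` error type, and the helper `create_mode_from_env` are all assumptions made for the example.

```rust
use std::str::FromStr;

/// Mirrors the enum shown above; the derives are added for the example.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CreateMode {
    Single,
    ThreadedSingle,
    ThreadedBulk, // Buggy
}

impl FromStr for CreateMode {
    type Err = String;

    // Assumption: the env string must equal the variant name exactly
    // (case-sensitive); anything else is rejected with an error message.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "Single" => Ok(CreateMode::Single),
            "ThreadedSingle" => Ok(CreateMode::ThreadedSingle),
            "ThreadedBulk" => Ok(CreateMode::ThreadedBulk),
            other => Err(format!("unknown CREATE_MODE: {other}")),
        }
    }
}

/// Hypothetical helper: read CREATE_MODE from the environment and parse it.
pub fn create_mode_from_env() -> Result<CreateMode, String> {
    let raw = std::env::var("CREATE_MODE")
        .map_err(|e| format!("CREATE_MODE not set: {e}"))?;
    raw.parse()
}
```

With this shape, `CREATE_MODE=ThreadedSingle` in `.env` would parse to `CreateMode::ThreadedSingle`, while a misspelled or lower-cased value would surface as an error at startup rather than silently falling back to a default.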
# [Dev Install](./CONTRIBUTING.md#dev-install)
# How to Query
```
namespace = wikidata
database = wikidata
```
## See [Useful queries.md](./Useful%20queries.md)
# Table Schema