mirror of
https://github.com/NexVeridian/wikidata-to-surrealdb.git
synced 2025-09-02 01:49:13 +00:00
Wikidata to SurrealDB
A tool for converting Wikidata dumps into a SurrealDB database, from either a bz2 or a json file.
Getting The Data
https://www.wikidata.org/wiki/Wikidata:Data_access
From bz2 file ~80GB
Dump: Docs
Download - latest-all.json.bz2
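The Wikidata JSON dump is one large JSON array with each entity on its own line, so it can be processed line by line instead of being parsed as a whole. The following is a minimal sketch of that idea, not this tool's actual implementation. It assumes the bz2 stream has already been decompressed into something that implements `BufRead`; the helper name `entity_lines` is hypothetical.

```rust
use std::io::{BufRead, Cursor};

/// Hypothetical helper: yields the raw JSON text of each entity from a
/// decompressed dump, skipping the enclosing array brackets and stripping
/// the trailing comma that separates entities.
fn entity_lines<R: BufRead>(reader: R) -> impl Iterator<Item = String> {
    reader
        .lines()
        // Stop at the first I/O error instead of looping on it.
        .map_while(Result::ok)
        .map(|line| line.trim().trim_end_matches(',').to_string())
        .filter(|line| !line.is_empty() && line != "[" && line != "]")
}

fn main() {
    // Tiny stand-in for a decompressed dump; real entities are much larger.
    let sample = "[\n{\"id\":\"Q1\"},\n{\"id\":\"P527\"}\n]\n";
    for entity in entity_lines(Cursor::new(sample)) {
        println!("{entity}");
    }
}
```

Each yielded string is a standalone JSON object, which keeps memory use flat even for the ~80GB dump.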
From json file
Linked Data Interface: Docs
https://www.wikidata.org/wiki/Special:EntityData/Q60746544.json
https://www.wikidata.org/wiki/Special:EntityData/P527.json
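Both URLs above follow the same pattern, with the entity or property ID placed before `.json`. A small sketch of building such a URL is shown here; the function name `entity_data_url` is hypothetical and is not part of this tool.

```rust
/// Hypothetical helper: builds the Linked Data Interface URL for a
/// Wikidata ID such as "Q60746544" or "P527".
fn entity_data_url(id: &str) -> String {
    format!("https://www.wikidata.org/wiki/Special:EntityData/{id}.json")
}

fn main() {
    for id in ["Q60746544", "P527"] {
        println!("{}", entity_data_url(id));
    }
}
```

The downloaded file can then be placed in the data folder and loaded with the json file format.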
Install
Copy docker-compose.yml
Create a data folder next to docker-compose.yml and .env, place the data file inside it, and set the file format (FILE_FORMAT) and file name (FILE_NAME) in .env
├── data
│ ├── Entity.json
│ ├── latest-all.json.bz2
│ └── surrealdb
├── docker-compose.yml
└── .env
docker compose up --pull always
Example .env
DB_USER=root
DB_PASSWORD=root
WIKIDATA_LANG=en
FILE_FORMAT=bz2
FILE_NAME=data/latest-all.json.bz2
# If not running Wikidata to SurrealDB in Docker, use 0.0.0.0:8000
WIKIDATA_DB_PORT=surrealdb:8000
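As a rough illustration of how these settings could be consumed, the sketch below reads the same variable names from the process environment. This is an assumption about usage, not the tool's actual code: the `Config` struct and `from_lookup` function are hypothetical, and the fallback values simply mirror the example .env above.

```rust
use std::env;

/// Hypothetical container for the settings in the example .env.
#[derive(Debug, PartialEq)]
struct Config {
    db_user: String,
    db_password: String,
    wikidata_lang: String,
    file_format: String,
    file_name: String,
    wikidata_db_port: String,
}

impl Config {
    /// Builds a Config from any key lookup, so the same code works with
    /// the real environment or with a test double.
    fn from_lookup(get: impl Fn(&str) -> Option<String>) -> Config {
        // Fallbacks are assumptions copied from the example .env.
        let var = |key: &str, default: &str| get(key).unwrap_or_else(|| default.to_string());
        Config {
            db_user: var("DB_USER", "root"),
            db_password: var("DB_PASSWORD", "root"),
            wikidata_lang: var("WIKIDATA_LANG", "en"),
            file_format: var("FILE_FORMAT", "bz2"),
            file_name: var("FILE_NAME", "data/latest-all.json.bz2"),
            wikidata_db_port: var("WIKIDATA_DB_PORT", "surrealdb:8000"),
        }
    }
}

fn main() {
    let config = Config::from_lookup(|key| env::var(key).ok());
    println!("{config:#?}");
}
```

Taking a lookup closure rather than calling `env::var` directly keeps the parsing logic testable without mutating the process environment.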
View Progress
docker attach wikidata-to-surrealdb
Dev Install
How to Query
See Useful queries.md
Table Layout
Thing
pub struct Thing {
pub table: String,
pub id: Id,
}
Table: Entity, Property, Lexeme
pub struct EntityMini {
pub id: Option<Thing>,
pub label: String,
pub claims: Thing,
pub description: String,
}
Table: Claims
pub struct Claims {
pub id: Option<Thing>,
pub claims: Vec<Claim>,
}
Table: Claim
pub struct Claim {
pub id: Thing,
pub value: ClaimData,
}
ClaimData
pub enum ClaimData {
Thing(Thing),
ClaimValueData(ClaimValueData),
}
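To show how these types fit together, here is a self-contained sketch using simplified stand-ins so it compiles without the surrealdb crate. Several parts are assumptions and not taken from this repository: the `Id` enum is reduced to a single variant, `ClaimValueData` is a placeholder because its definition is not shown here, the helper `thing_from_wikidata_id` and its mapping of the `Q`, `P`, and `L` prefixes to the Entity, Property, and Lexeme tables are hypothetical, and all values in `main` are illustrative only.

```rust
// Simplified stand-ins for the types listed above; not the real definitions.
#[derive(Debug, Clone, PartialEq)]
enum Id {
    Number(i64),
}

#[derive(Debug, Clone, PartialEq)]
struct Thing {
    table: String,
    id: Id,
}

// Placeholder: the real ClaimValueData is not shown in this README.
#[derive(Debug, Clone, PartialEq)]
struct ClaimValueData(String);

#[derive(Debug, Clone, PartialEq)]
enum ClaimData {
    Thing(Thing),
    ClaimValueData(ClaimValueData),
}

#[derive(Debug)]
struct Claim {
    id: Thing,
    value: ClaimData,
}

#[derive(Debug)]
struct Claims {
    id: Option<Thing>,
    claims: Vec<Claim>,
}

#[derive(Debug)]
struct EntityMini {
    id: Option<Thing>,
    label: String,
    claims: Thing,
    description: String,
}

/// Hypothetical helper: turns a Wikidata ID such as "Q42" or "P527" into a
/// record link, assuming the prefix selects the table.
fn thing_from_wikidata_id(wikidata_id: &str) -> Option<Thing> {
    let mut chars = wikidata_id.chars();
    let table = match chars.next()? {
        'Q' => "Entity",
        'P' => "Property",
        'L' => "Lexeme",
        _ => return None,
    };
    let number = chars.as_str().parse::<i64>().ok()?;
    Some(Thing { table: table.to_string(), id: Id::Number(number) })
}

fn main() {
    // Illustrative values only; not real Wikidata content.
    let claims_link = Thing { table: "Claims".to_string(), id: Id::Number(1) };
    let claims = Claims {
        id: Some(claims_link.clone()),
        claims: vec![
            Claim {
                id: thing_from_wikidata_id("P527").unwrap(),
                value: ClaimData::Thing(thing_from_wikidata_id("Q2").unwrap()),
            },
            Claim {
                id: thing_from_wikidata_id("P1").unwrap(),
                value: ClaimData::ClaimValueData(ClaimValueData("example".to_string())),
            },
        ],
    };
    let entity = EntityMini {
        id: thing_from_wikidata_id("Q1"),
        label: "example label".to_string(),
        claims: claims_link,
        description: "example description".to_string(),
    };
    println!("{entity:#?}\n{claims:#?}");
}
```

The point of the layout is that an entity record stays small: it holds only a link to its Claims record, and each Claim value is either another record link or a plain data value.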
Similar Projects
License
All code in this repository is dual-licensed under either License-MIT or LICENSE-APACHE at your option. This means you can select the license you prefer. Why dual license.