| title | status | type | priority | created_at | updated_at |
|---|---|---|---|---|---|
| Refactor seeding to use PokeAPI CSV data via git submodule | draft | task | normal | 2026-02-05T18:01:09Z | 2026-02-05T18:06:04Z |
Summary
Replace the current seeding approach (which uses the pokebase Python library to hit the PokeAPI REST API, then writes intermediate JSON files) with reading static JSON data from the PokeAPI/api-data repository, pulled in as a git submodule.
The api-data repo contains a static copy of the full PokeAPI output as JSON files at data/api/v2/{endpoint}/{id}/index.json, mirroring the REST API structure exactly.
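As a minimal sketch of what this layout means for the seeding code, a single resource can be loaded by building its path from the endpoint name and ID. The helper names `resource_path` and `load_resource`, and the `root` argument pointing at the submodule checkout, are assumptions for illustration and do not exist in the current code.

```python
import json
from pathlib import Path


def resource_path(root: Path, endpoint: str, resource_id: int) -> Path:
    """Build the path to one resource inside an api-data checkout (hypothetical helper)."""
    return root / "data" / "api" / "v2" / endpoint / str(resource_id) / "index.json"


def load_resource(root: Path, endpoint: str, resource_id: int) -> dict:
    """Read and parse one resource from the static JSON tree (hypothetical helper)."""
    with resource_path(root, endpoint, resource_id).open(encoding="utf-8") as f:
        return json.load(f)
```

A call such as `load_resource(Path("data/pokeapi"), "pokemon", 1)` would then stand in for the equivalent REST request, assuming the submodule were checked out at `data/pokeapi/`.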
Motivation
- Eliminates network dependency: No more hitting the PokeAPI REST API (or running a local instance) during seed generation
- Faster: Reading local JSON files avoids the hundreds of HTTP requests the current approach makes (even with pokebase caching)
- Minimal code change: The JSON structure matches the API responses, so parsing logic stays similar to the current `fetch_pokeapi.py`
- More data available: The full dataset is available locally, not just what we query for
- Version-pinnable: The git submodule can be pinned to a specific commit for reproducible builds
- Removes `pokebase` dependency: One less runtime/dev dependency to maintain
Current Approach
- `fetch_pokeapi.py` uses the `pokebase` library to query the PokeAPI REST API
- It processes responses and writes intermediate JSON files (`games.json`, `pokemon.json`, `firered.json`, etc.) to `seeds/data/`
- `run.py` reads these JSON files and calls `loader.py` to upsert into the database
- Evolution data is also fetched from the API with an override mechanism (`evolution_overrides.json`)
Proposed Approach
- Add git submodule: Add `https://github.com/PokeAPI/api-data` as a git submodule with `--depth 1` (e.g., at `data/pokeapi/` or `backend/pokeapi-data/`)
- Rewrite `fetch_pokeapi.py`: Replace API calls with local JSON file reads from the submodule. The data lives at `data/api/v2/{endpoint}/{id}/index.json`. Key endpoints:
  - `pokemon/{id}/` and `pokemon-species/{id}/`: Pokemon data and names
  - `type/{id}/`: Type data
  - `region/{id}/`: Region data with location refs
  - `location/{id}/`: Locations with area refs
  - `location-area/{id}/`: Location areas with encounter data
  - `version/{id}/` and `version-group/{id}/`: Game/version data
  - `evolution-chain/{id}/`: Evolution chain data
- Keep the same output format: The rewritten script should still produce the same intermediate JSON files (`games.json`, `pokemon.json`, `firered.json`, etc.) so `run.py` and `loader.py` remain unchanged.
- Keep the override mechanism: `evolution_overrides.json` should still work for manual corrections.
- Remove `pokebase` dependency: Remove from `pyproject.toml`/`requirements.txt`.
- Update documentation: Update any setup/dev docs and the seed run command instructions.
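The rewrite step could look roughly like the sketch below, which walks one endpoint directory in the submodule and writes the collected records to an intermediate file. The function names `iter_resources` and `write_intermediate` are hypothetical, and the sketch assumes each resource lives in a directory named after its numeric ID. The real shape of `pokemon.json` and the other intermediate files is set by the existing script and is not reproduced here.

```python
import json
from pathlib import Path
from typing import Iterator


def iter_resources(root: Path, endpoint: str) -> Iterator[dict]:
    """Yield every resource under data/api/v2/{endpoint}/ in numeric ID order."""
    endpoint_dir = root / "data" / "api" / "v2" / endpoint
    # Keep only numeric subdirectories; anything else (e.g. a listing file) is skipped.
    ids = sorted(
        int(p.name) for p in endpoint_dir.iterdir() if p.is_dir() and p.name.isdigit()
    )
    for resource_id in ids:
        index_file = endpoint_dir / str(resource_id) / "index.json"
        with index_file.open(encoding="utf-8") as f:
            yield json.load(f)


def write_intermediate(records: list, out_path: Path) -> None:
    """Write records as an intermediate JSON file (output shape is an assumption)."""
    out_path.parent.mkdir(parents=True, exist_ok=True)
    out_path.write_text(json.dumps(records, indent=2), encoding="utf-8")
```

Sorting IDs numerically rather than as strings keeps the output order stable (2 before 10), which makes it easier to diff the new files against the ones the current script produces.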
Checklist
- Add `PokeAPI/api-data` repo as a git submodule (shallow clone)
- Rewrite `fetch_pokeapi.py` to read local JSON files from the submodule instead of calling the API
- Verify output JSON files match the current format (so `run.py`/`loader.py` stay unchanged)
- Preserve evolution override mechanism
- Remove `pokebase` dependency
- Test that seeding produces equivalent results
- Update dev setup docs / seed run instructions
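One possible way to check the "equivalent results" items is to keep a copy of the intermediate files produced by the current script and compare them with the output of the rewritten one. The sketch below is an assumption about how that could be done, not an existing tool. The name `diff_seed_outputs` and the two directory arguments are hypothetical.

```python
from __future__ import annotations

import json
from pathlib import Path


def diff_seed_outputs(old_dir: Path, new_dir: Path) -> list[str]:
    """Report JSON files from old_dir that are missing or different in new_dir."""
    mismatches = []
    for old_file in sorted(old_dir.glob("*.json")):
        new_file = new_dir / old_file.name
        if not new_file.exists():
            mismatches.append(f"{old_file.name}: missing in new output")
            continue
        # Compare parsed data so whitespace and key order do not cause false alarms.
        old_data = json.loads(old_file.read_text(encoding="utf-8"))
        new_data = json.loads(new_file.read_text(encoding="utf-8"))
        if old_data != new_data:
            mismatches.append(f"{old_file.name}: content differs")
    return mismatches
```

An empty result means every file in the old directory has an equal counterpart in the new one. List order still counts as a difference, so records would need to be emitted in the same order as before.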