nuzlocke-tracker/.beans/nuzlocke-tracker-ya2a--refactor-seeding-to-use-pokeapi-csv-data-via-git-s.md at 71a8f2e695904961cba6a9fd6a7af1fec1b651f2

Julian Tabel 71a8f2e695 Add bean for refactoring seeding to use PokeAPI CSV data via git submodule

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

2026-02-05 19:01:56 +01:00

title	status	type	created_at	updated_at
Refactor seeding to use PokeAPI CSV data via git submodule	draft	task	2026-02-05T18:01:09Z	2026-02-05T18:01:09Z

Summary

Replace the current seeding approach (which uses the pokebase Python library to hit the PokeAPI REST API, then writes intermediate JSON files) with direct CSV parsing from the PokeAPI/pokeapi repository's data/v2/csv/ directory, pulled in as a git submodule.

Motivation

Eliminates network dependency: No more hitting the PokeAPI REST API (or running a local instance) during seed generation
Faster: Reading local CSVs avoids the hundreds of HTTP requests the current approach makes (even with pokebase caching)
More data available: The CSVs contain the complete dataset, not just what we query for
Version-pinnable: The git submodule can be pinned to a specific commit for reproducible builds
Removes pokebase dependency: One fewer runtime/dev dependency to maintain

Current Approach

fetch_pokeapi.py uses the pokebase library to query the PokeAPI REST API
It processes responses and writes intermediate JSON files (games.json, pokemon.json, firered.json, etc.) to seeds/data/
run.py reads these JSON files and calls loader.py to upsert into the database
Evolution data is also fetched from the API with an override mechanism (evolution_overrides.json)

Proposed Approach

Add git submodule: Add https://github.com/PokeAPI/pokeapi as a git submodule (e.g., at backend/pokeapi-data/ or a top-level data/pokeapi/ directory)
Write CSV parser: Create a new module that reads the relevant CSVs directly. Key CSV files include:
- pokemon.csv, pokemon_species.csv, pokemon_types.csv — Pokemon data
- locations.csv, location_areas.csv, location_names.csv — Location/route data
- encounters.csv, encounter_slots.csv, encounter_methods.csv — Encounter data
- versions.csv, version_groups.csv, version_names.csv — Game/version data
- pokemon_evolution.csv, evolution_chains.csv, evolution_triggers.csv — Evolution data
- types.csv, type_names.csv — Type data
- regions.csv — Region data
Replace fetch_pokeapi.py: The new CSV parser replaces the API-fetching script. It should produce the same (or equivalent) output that loader.py expects, or loader.py should be updated to accept the new data format.
Keep the override mechanism: evolution_overrides.json should still work for manual corrections.
Remove intermediate JSON files: The generated JSON files in seeds/data/ can be removed from version control since data now comes from the submodule.
Remove pokebase dependency: Remove from pyproject.toml / requirements.txt.
Update documentation: Update any setup/dev docs and the seed run command instructions.
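
A minimal sketch of what the CSV parsing step could look like, using only the standard library. The function names (`read_csv`, `load_pokemon`) and the output shape are hypothetical, and the column names (`id`, `identifier`, `pokemon_id`, `type_id`, `slot`) are assumed to match the headers in the pinned PokeAPI commit; they should be checked against the actual files before relying on them.

```python
import csv
from pathlib import Path


def read_csv(csv_dir: Path, name: str) -> list[dict[str, str]]:
    """Read one CSV file into a list of row dicts keyed by header name."""
    with (csv_dir / name).open(newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))


def load_pokemon(csv_dir: Path) -> list[dict]:
    """Join pokemon.csv with pokemon_types.csv and types.csv.

    Column names are assumptions about the PokeAPI CSV headers.
    """
    # Map type id -> type identifier, e.g. "12" -> "grass".
    type_names = {
        row["id"]: row["identifier"] for row in read_csv(csv_dir, "types.csv")
    }

    # Collect (slot, type name) pairs per pokemon id.
    types_by_pokemon: dict[str, list[tuple[int, str]]] = {}
    for row in read_csv(csv_dir, "pokemon_types.csv"):
        types_by_pokemon.setdefault(row["pokemon_id"], []).append(
            (int(row["slot"]), type_names[row["type_id"]])
        )

    result = []
    for row in read_csv(csv_dir, "pokemon.csv"):
        # Sort by slot so the primary type comes first.
        slots = sorted(types_by_pokemon.get(row["id"], []))
        result.append(
            {
                "id": int(row["id"]),
                "name": row["identifier"],
                "types": [name for _, name in slots],
            }
        )
    return result
```

With the submodule checked out at one of the candidate locations, a call might look like `load_pokemon(Path("backend/pokeapi-data/data/v2/csv"))`. Whether the result is written back to JSON or passed straight to loader.py depends on which of the two options above is chosen.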

Checklist

Add PokeAPI repo as a git submodule
Identify and document all needed CSV files from the PokeAPI data
Write CSV parsing module to replace fetch_pokeapi.py
Update run.py and/or loader.py to work with the new data source
Preserve evolution override mechanism
Remove intermediate JSON seed data files from version control
Remove pokebase dependency
Test that seeding produces equivalent results
Update dev setup docs / seed run instructions
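
For the "equivalent results" item, one possible approach is to diff the records produced by the new parser against the existing JSON seed files before they are removed. The sketch below assumes each dataset is a list of objects with a unique `id` field; that shape, and the `diff_records` helper itself, are illustrative assumptions, not the project's actual format.

```python
def diff_records(old: list[dict], new: list[dict], key: str = "id") -> dict[str, list]:
    """Compare two record lists by key and report the differences.

    Returns the keys that are missing from `new`, the keys that only
    appear in `new`, and the keys whose records differ between the two.
    """
    old_by_key = {rec[key]: rec for rec in old}
    new_by_key = {rec[key]: rec for rec in new}

    shared = old_by_key.keys() & new_by_key.keys()
    return {
        "missing": sorted(old_by_key.keys() - new_by_key.keys()),
        "extra": sorted(new_by_key.keys() - old_by_key.keys()),
        "changed": sorted(k for k in shared if old_by_key[k] != new_by_key[k]),
    }
```

The `old` list could come from `json.loads` on a current file in seeds/data/, and the `new` list from the CSV parser. Since the CSVs contain the complete dataset, entries under `extra` are to be expected, while `missing` and `changed` would each need a closer look.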