nuzlocke-tracker/.beans/nuzlocke-tracker-ya2a--refactor-seeding-to-use-pokeapi-csv-data-via-git-s.md

title: Refactor seeding to use PokeAPI CSV data via git submodule
status: draft
type: task
priority: normal
created_at: 2026-02-05T18:01:09Z
updated_at: 2026-02-05T18:06:04Z

Summary

Replace the current seeding approach (which uses the pokebase Python library to query the PokeAPI REST API, then writes intermediate JSON files) with one that reads static JSON data from the PokeAPI/api-data repository, pulled in as a git submodule.

The api-data repo contains a static copy of the full PokeAPI output as JSON files at data/api/v2/{endpoint}/{id}/index.json, mirroring the REST API structure exactly.
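
As a minimal sketch of what reading from that layout could look like, the helper below builds the path to one resource and parses it. The names API_DATA_ROOT, resource_path, and load_resource are hypothetical, and the submodule location data/pokeapi is only one of the candidate paths named in this task.

```python
import json
from pathlib import Path

# Assumed submodule location (hypothetical; the task has not settled on a path).
API_DATA_ROOT = Path("data/pokeapi")


def resource_path(endpoint: str, resource_id: int, root: Path = API_DATA_ROOT) -> Path:
    """Build the path to one resource's index.json inside the api-data checkout."""
    return root / "data" / "api" / "v2" / endpoint / str(resource_id) / "index.json"


def load_resource(endpoint: str, resource_id: int, root: Path = API_DATA_ROOT) -> dict:
    """Read and parse one resource, e.g. load_resource("pokemon", 1)."""
    with resource_path(endpoint, resource_id, root).open(encoding="utf-8") as fh:
        return json.load(fh)
```

Unlike pokebase, which exposes attributes on wrapper objects, this returns plain dictionaries, so any attribute access in the existing parsing code would likely become key access.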

Motivation

  • Eliminates network dependency: No more hitting the PokeAPI REST API (or running a local instance) during seed generation
  • Faster: Reading local JSON files is far quicker than making hundreds of HTTP requests (even with pokebase caching)
  • Minimal code change: The JSON structure matches the API responses, so parsing logic stays similar to the current fetch_pokeapi.py
  • More data available: The full dataset is available locally, not just what we query for
  • Version-pinnable: The git submodule can be pinned to a specific commit for reproducible builds
  • Removes pokebase dependency: One fewer runtime/dev dependency to maintain

Current Approach

  1. fetch_pokeapi.py uses the pokebase library to query the PokeAPI REST API
  2. It processes responses and writes intermediate JSON files (games.json, pokemon.json, firered.json, etc.) to seeds/data/
  3. run.py reads these JSON files and calls loader.py to upsert into the database
  4. Evolution data is also fetched from the API, with manual corrections applied from evolution_overrides.json

Proposed Approach

  1. Add git submodule: Add https://github.com/PokeAPI/api-data as a shallow git submodule (git submodule add --depth 1), e.g., at data/pokeapi/ or backend/pokeapi-data/
  2. Rewrite fetch_pokeapi.py: Replace API calls with local JSON file reads from the submodule. Relative to the submodule root, the data lives at data/api/v2/{endpoint}/{id}/index.json. Key endpoints:
    • pokemon/{id}/ and pokemon-species/{id}/ — Pokemon data & names
    • type/{id}/ — Type data
    • region/{id}/ — Region data with location refs
    • location/{id}/ — Locations with area refs
    • location-area/{id}/ — Location areas with encounter data
    • version/{id}/ and version-group/{id}/ — Game/version data
    • evolution-chain/{id}/ — Evolution chain data
  3. Keep the same output format: The rewritten script should still produce the same intermediate JSON files (games.json, pokemon.json, firered.json, etc.) so run.py and loader.py remain unchanged.
  4. Keep the override mechanism: evolution_overrides.json should still work for manual corrections.
  5. Remove pokebase dependency: Remove it from pyproject.toml / requirements.txt.
  6. Update documentation: Update any setup/dev docs and the seed run command instructions.
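
Two operations the rewritten script would probably need are walking every resource of an endpoint and following references between resources (for example region to location to location-area). The sketch below illustrates both under stated assumptions: the function names are hypothetical, and it assumes each reference is an object with a "url" field ending in /api/v2/{endpoint}/{id}/, as in the REST API responses. This should be checked against the actual files in the submodule.

```python
import json
import re
from pathlib import Path
from typing import Iterator

# Matches the trailing "/api/v2/{endpoint}/{id}/" part of a resource URL,
# whether or not the URL carries a scheme and host in front of it.
_REF_RE = re.compile(r"/api/v2/(?P<endpoint>[a-z-]+)/(?P<id>\d+)/?$")


def ref_to_path(url: str, root: Path) -> Path:
    """Map a resource URL to its index.json under the submodule root."""
    match = _REF_RE.search(url)
    if match is None:
        raise ValueError(f"not a resource URL: {url!r}")
    return root / "data" / "api" / "v2" / match["endpoint"] / match["id"] / "index.json"


def follow(ref: dict, root: Path) -> dict:
    """Load the resource that a {"name": ..., "url": ...} reference points to."""
    with ref_to_path(ref["url"], root).open(encoding="utf-8") as fh:
        return json.load(fh)


def iter_resources(endpoint: str, root: Path) -> Iterator[dict]:
    """Yield every resource of an endpoint in ascending numeric id order."""
    base = root / "data" / "api" / "v2" / endpoint
    ids = sorted(int(p.name) for p in base.iterdir() if p.is_dir() and p.name.isdigit())
    for resource_id in ids:
        with (base / str(resource_id) / "index.json").open(encoding="utf-8") as fh:
            yield json.load(fh)
```

Sorting ids numerically rather than by directory name keeps the iteration order stable (2 before 10), which helps when comparing the regenerated seed files against the existing ones.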

Checklist

  • Add PokeAPI/api-data repo as a git submodule (shallow clone)
  • Rewrite fetch_pokeapi.py to read local JSON files from the submodule instead of calling the API
  • Verify output JSON files match the current format (so run.py/loader.py stay unchanged)
  • Preserve evolution override mechanism
  • Remove pokebase dependency
  • Test that seeding produces equivalent results
  • Update dev setup docs / seed run instructions
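
One possible way to check the "equivalent results" items above is to keep a copy of the current seeds/data/ output and compare it file by file with the regenerated output. The sketch below is an illustration only: diff_seed_dirs and load_json are hypothetical names, and the comparison is on parsed JSON, so object key order is ignored while list order still matters.

```python
import json
from pathlib import Path
from typing import List


def load_json(path: Path):
    """Parse one JSON file."""
    with path.open(encoding="utf-8") as fh:
        return json.load(fh)


def diff_seed_dirs(old_dir: Path, new_dir: Path) -> List[str]:
    """Return human-readable differences between two directories of seed JSON files."""
    problems = []
    old_names = {p.name for p in old_dir.glob("*.json")}
    new_names = {p.name for p in new_dir.glob("*.json")}
    for name in sorted(old_names - new_names):
        problems.append(f"missing in new output: {name}")
    for name in sorted(new_names - old_names):
        problems.append(f"unexpected in new output: {name}")
    for name in sorted(old_names & new_names):
        # Compare parsed values, not raw bytes, so formatting differences are ignored.
        if load_json(old_dir / name) != load_json(new_dir / name):
            problems.append(f"content differs: {name}")
    return problems
```

An empty result means the two directories hold the same files with the same parsed content. If the new script emits list entries in a different order, the lists would need sorting before comparison.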