CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Project Overview

Juicepyter is a Pokémon card generator pipeline that takes a natural language description, cleans it, extracts structured JSON metadata, and generates a card image using a LoRA-finetuned Stable Diffusion model. A Streamlit UI (app.py) ties it all together.

Architecture — Three-Stage Pipeline

The pipeline (prompt_to_card_pipeline.py) orchestrates three stages:

Text cleaning (text-cleaner/text_cleaning_pipeline.py): NLTK-based pipeline — lowercasing, punctuation/slang removal, stopword filtering, POS-aware lemmatization. Entry point: get_clean_text(raw_text) -> str.
Keyword extraction + JSON inference (clean-text-to-keywords/): spaCy + YAKE keyword extraction (keyword_extractor.py) → rule-based JSON inference (json_inference.py) that populates a TCG-style card template. CLI: infer_json_usage.py. No LLM calls — deterministic and rule-based.
Card image generation (card_generator_adapter.py): Loads runwayml/stable-diffusion-v1-5 with a LoRA adapter (PEFT) from pokemon_card_lora/, converts metadata to an SD prompt via metadata_to_conditioning(), and runs inference. The generator module is pluggable via --generator-module.
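
To illustrate the metadata-to-prompt step in stage three, here is a minimal sketch of what a metadata_to_conditioning() function could look like. The field names ("name", "types", "description") and the prompt wording are assumptions made for illustration only; the actual keys come from the JSON template and the real function in card_generator_adapter.py may differ.

```python
def metadata_to_conditioning(meta: dict) -> str:
    """Flatten card metadata into a comma-separated text prompt.

    The keys read here are hypothetical; adapt them to the real template.
    """
    parts = ["pokemon trading card"]  # assumed fixed prefix, not from the repo

    name = meta.get("name")
    if name:
        parts.append(str(name))

    types = meta.get("types") or []
    if types:
        parts.append(" and ".join(str(t) for t in types) + " type")

    description = meta.get("description")
    if description:
        parts.append(str(description))

    return ", ".join(parts)


example = {
    "name": "Emberfox",
    "types": ["Fire"],
    "description": "a small fox with a flaming tail",
}
print(metadata_to_conditioning(example))
```

Missing or empty fields are skipped, so a sparse metadata dict still yields a usable prompt.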

fetch_card.py is a standalone data collection script that downloads real Pokémon TCG card images with embedded metadata using the TCGdex SDK.

Commands

Run the Streamlit app

streamlit run app.py

Run the full pipeline CLI

python prompt_to_card_pipeline.py "description text" \
  --text-cleaner-path text-cleaner/text_cleaning_pipeline.py \
  --infer-script-path clean-text-to-keywords/infer_json_usage.py \
  --checkpoint pokemon_card_lora \
  --template clean-text-to-keywords/json_template_example.json \
  --generator-module card_generator_adapter.py \
  --device cpu \
  --save-path generated_card.png \
  --print-json

Run keyword extraction + JSON inference only

cd clean-text-to-keywords
python infer_json_usage.py --template json_template_example.json "your pokemon description"
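
The inference step fills only the empty fields of the provided template and leaves non-empty ones untouched. The sketch below shows one way that merge rule could be expressed; fill_empty_fields and is_empty are hypothetical helpers, not functions from json_inference.py, and the definition of "empty" (None, empty string, empty list, empty dict) and the example field names are assumptions.

```python
import copy


def is_empty(value) -> bool:
    """Assumed notion of an empty template field."""
    return value is None or value == "" or value == [] or value == {}


def fill_empty_fields(template: dict, inferred: dict) -> dict:
    """Return a copy of template with only its empty fields filled from inferred."""
    result = copy.deepcopy(template)
    for key, value in inferred.items():
        current = result.get(key)  # a key absent from the template counts as empty
        if isinstance(current, dict) and current and isinstance(value, dict):
            # Recurse so nested non-empty fields are preserved as well.
            result[key] = fill_empty_fields(current, value)
        elif is_empty(current):
            result[key] = copy.deepcopy(value)
    return result


template = {"name": "Pikachu", "hp": "", "types": [], "attack": {"name": "", "damage": "30"}}
inferred = {"name": "Raichu", "hp": "60", "types": ["Lightning"], "attack": {"name": "Spark", "damage": "50"}}
print(fill_empty_fields(template, inferred))
```

Here "name" and the nested "damage" keep their template values, while "hp", "types", and the nested attack "name" are filled in from the inferred values.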

Tests

cd clean-text-to-keywords
python -m unittest -q

Dependencies

text-cleaner: nltk (punkt, stopwords, wordnet, averaged_perceptron_tagger)
clean-text-to-keywords: spacy>=3.7.0, yake>=0.4.2, spaCy model en_core_web_sm
card generation: diffusers, torch, peft, transformers, accelerate, safetensors
app: streamlit, Pillow
fetch_card: tcgdexsdk, Pillow

Python 3.13 or lower is recommended (spaCy compatibility).
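
A quick preflight check can confirm which of the dependencies above are importable before running a stage. The script below is a sketch, not part of the repository. The import names are assumptions derived from the package names listed above (for example, Pillow is imported as PIL, and tcgdexsdk is assumed to be the import name of the TCGdex SDK). NLTK data packages such as punkt are not Python modules and are not checked here.

```python
import importlib.util
import sys

# Assumed import names per component; adjust if a package imports differently.
REQUIRED = {
    "text-cleaner": ["nltk"],
    "clean-text-to-keywords": ["spacy", "yake", "en_core_web_sm"],
    "card generation": ["diffusers", "torch", "peft", "transformers", "accelerate", "safetensors"],
    "app": ["streamlit", "PIL"],
    "fetch_card": ["tcgdexsdk", "PIL"],
}


def missing_modules(names):
    """Return the names that cannot be located on the current import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def main() -> None:
    if sys.version_info[:2] > (3, 13):
        print("warning: Python newer than 3.13; spaCy may not be compatible")
    for component, names in REQUIRED.items():
        missing = missing_modules(names)
        status = "ok" if not missing else "missing: " + ", ".join(missing)
        print(f"{component}: {status}")


if __name__ == "__main__":
    main()
```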

Key Design Decisions

The generator module pattern is pluggable: any module with build_pipeline(checkpoint_path, device) and optionally metadata_to_conditioning(meta) can be swapped in via --generator-module.
The JSON inference stage preserves non-empty fields in the provided template — only empty fields get populated.
The LoRA base model is runwayml/stable-diffusion-v1-5 with PEFT adapter weights in pokemon_card_lora/.
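
The pluggable generator contract described above can be sketched with the standard importlib machinery. This is an illustration of one way to load a module from a file path and check the contract, not the actual loading code in prompt_to_card_pipeline.py. load_generator_module, get_conditioning_fn, and default_conditioning are hypothetical names, and the fallback behavior when metadata_to_conditioning is absent is an assumption.

```python
import importlib.util
from pathlib import Path


def load_generator_module(path: str):
    """Import a generator module from a .py file and verify it has build_pipeline."""
    module_path = Path(path)
    spec = importlib.util.spec_from_file_location(module_path.stem, str(module_path))
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot load generator module from {path}")
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # runs the module's top-level code
    if not callable(getattr(module, "build_pipeline", None)):
        raise AttributeError(f"{path} must define build_pipeline(checkpoint_path, device)")
    return module


def default_conditioning(meta: dict) -> str:
    """Assumed fallback when the module has no metadata_to_conditioning."""
    return str(meta)


def get_conditioning_fn(module):
    """Use the module's optional metadata_to_conditioning, else the fallback."""
    return getattr(module, "metadata_to_conditioning", default_conditioning)
```

With this sketch, a caller would run load_generator_module("card_generator_adapter.py"), call build_pipeline("pokemon_card_lora", "cpu") on the result, and obtain the prompt builder through get_conditioning_fn.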
