ppr-ng/backend/README_test_data.md
James Pattinson 023c238cee Test data script
2025-10-25 14:09:43 +00:00

Test Data Population Script

This script generates and inserts 30 random PPR (Prior Permission Required) records into the database for testing purposes.

Features

  • 30 Random PPR Records: Generates diverse test data with various aircraft, airports, and flight details
  • Real Aircraft Data: Uses actual aircraft registration data from the aircraft_data.csv file
  • Real Airport Data: Uses actual airport ICAO codes from the airports_data_clean.csv file
  • Random Status Distribution: Includes NEW, CONFIRMED, LANDED, and DEPARTED statuses
  • Realistic Timestamps: Generates ETA/ETD times with 15-minute intervals
  • Optional Fields: Randomly includes email, phone, notes, and departure details
  • Duplicate Aircraft: Some aircraft registrations appear multiple times for realistic testing

Usage

Prerequisites

  • Database must be running and accessible
  • Python environment with required dependencies installed
  • CSV data files (aircraft_data.csv and airports_data_clean.csv) in the parent directory; if they are not found, the script falls back to built-in data

Running the Script

  1. Using the convenience script (recommended):

    cd /home/jamesp/docker/pprdev/nextgen
    ./populate_test_data.sh
    
  2. From within the Docker container:

    docker exec -it ppr-backend bash
    cd /app
    python populate_test_data.py
    
  3. From host machine (if database is accessible):

    cd /home/jamesp/docker/pprdev/nextgen/backend
    python populate_test_data.py
    

What Gets Generated

Each PPR record includes:

  • Aircraft: Random registration, type, and callsign from real aircraft data
  • Route: Random arrival airport (from Swansea), optional departure airport
  • Times: ETA between 6 AM and 8 PM, ETD 1-4 hours later (if departing)
  • Passengers: 1-4 POB for arrival, optional for departure
  • Contact: Optional email and phone (70% and 50% chance respectively)
  • Fuel: Random fuel type (100LL, JET A1, FULL) or none
  • Notes: Optional flight purpose notes (various scenarios)
  • Status: Random status distribution (NEW/CONFIRMED/LANDED/DEPARTED)
  • Timestamps: Random submission dates within last 30 days
  • Public Token: Auto-generated for edit/cancel functionality
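
The field rules above can be sketched in a few lines of Python. This is a minimal illustration, not the actual populate_test_data.py: the dictionary keys, the 50% departure probability, and the placeholder contact values are all assumptions made for the example.

```python
import random
import secrets
from datetime import date, datetime, time, timedelta

FUEL_CHOICES = ["100LL", "JET A1", "FULL", None]  # None means no fuel requested
STATUSES = ["NEW", "CONFIRMED", "LANDED", "DEPARTED"]


def random_eta(day: date, rng: random.Random) -> datetime:
    """Pick an ETA on a 15-minute boundary between 06:00 and 20:00 inclusive."""
    slots = (20 - 6) * 4  # number of 15-minute steps between 06:00 and 20:00
    offset_minutes = rng.randint(0, slots) * 15
    return datetime.combine(day, time(6, 0)) + timedelta(minutes=offset_minutes)


def make_record(rng: random.Random, today: date) -> dict:
    """Build one illustrative PPR record (key names are hypothetical)."""
    eta = random_eta(today, rng)
    departing = rng.random() < 0.5  # assumed probability; the README does not state one
    midnight = datetime.combine(today, time(0, 0))
    return {
        "eta": eta,
        # Whole hours keep the ETD on a 15-minute boundary as well.
        "etd": eta + timedelta(hours=rng.randint(1, 4)) if departing else None,
        "pob_in": rng.randint(1, 4),
        "email": "pilot@example.com" if rng.random() < 0.7 else None,
        "phone": "07700 900123" if rng.random() < 0.5 else None,
        "fuel": rng.choice(FUEL_CHOICES),
        "status": rng.choice(STATUSES),
        "submitted_dt": midnight - timedelta(days=rng.randint(0, 30)),
        "public_token": secrets.token_urlsafe(16),
    }
```

Passing a seeded `random.Random` instance makes a run reproducible, which helps when a test depends on specific generated values.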

Aircraft Distribution

  • Uses real aircraft registration data from aircraft_data.csv
  • Includes various aircraft types (C172, PA28, BE36, R44, etc.)
  • Some aircraft appear multiple times for realistic duplication
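
One way to get repeated registrations is to draw the records from a pool that is smaller than the number of records. The sketch below assumes that approach; the function name, the `pool_size` parameter, and the placeholder registrations are hypothetical, and the real script may produce duplicates differently.

```python
import random


def pick_registrations(registrations, count, rng, pool_size=20):
    """Choose `count` registrations from a small pool so that some repeat."""
    # Draw a pool without replacement, then pick from it with replacement.
    pool = rng.sample(registrations, min(pool_size, len(registrations)))
    return [rng.choice(pool) for _ in range(count)]


# Placeholder registrations for illustration only.
all_regs = [f"G-T{i:03d}" for i in range(500)]
chosen = pick_registrations(all_regs, count=30, rng=random.Random(1))
```

With 30 records and a pool of 20, at least one registration is guaranteed to appear more than once.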

Airport Distribution

  • Uses real ICAO airport codes from airports_data_clean.csv
  • Arrival airports are distributed globally
  • Departure airports (when included) are different from arrival airports
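
The rule that a departure airport must differ from the arrival airport can be met by sampling without replacement. This is a sketch under assumptions: `pick_route` and `departure_chance` are illustrative names, and the README does not state how often a departure is included.

```python
import random


def pick_route(icao_codes, rng, departure_chance=0.5):
    """Return (arrival, departure); departure is None or differs from arrival."""
    unique_codes = sorted(set(icao_codes))  # drop repeats so sampling cannot collide
    if len(unique_codes) >= 2 and rng.random() < departure_chance:
        # random.sample draws without replacement, so the two codes are distinct.
        arrival, departure = rng.sample(unique_codes, 2)
        return arrival, departure
    return rng.choice(unique_codes), None
```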

Data Quality Notes

  • Realistic Distribution: Aircraft and airports are selected from actual aviation data
  • Time Constraints: All times are within reasonable operating hours (6 AM - 8 PM)
  • Status Balance: Roughly equal distribution across different PPR statuses
  • Contact Info: Realistic email patterns and UK phone numbers
  • Flight Logic: Departures only occur when a departure airport is specified

Assumptions

  • Database schema matches the PPRRecord model in app/models/ppr.py
  • CSV files, when present, are properly formatted
  • Database connection uses settings from app/core/config.py
  • All required dependencies are installed in the Python environment

Sample Output

    Loading aircraft and airport data...
    Loaded 520000 aircraft records
    Loaded 43209 airport records
    Generating and inserting 30 test PPR records...
    Generated 10 records...
    Generated 20 records...
    Generated 30 records...
    ✅ Successfully inserted 30 test PPR records!
    Total PPR records in database: 42

    Status breakdown:
      NEW: 8
      CONFIRMED: 7
      LANDED: 9
      DEPARTED: 6
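
A status breakdown like the one above comes down to a grouped count. The sketch below uses plain SQL through Python's built-in sqlite3 module rather than the SQLAlchemy query the script uses, so that it runs without a database server. The `submitted` table name is taken from the cleanup query in this README; the `status` column name and the in-memory schema are assumptions.

```python
import sqlite3


def status_breakdown(conn):
    """Return a {status: count} mapping for all rows in `submitted`."""
    rows = conn.execute(
        "SELECT status, COUNT(*) FROM submitted GROUP BY status"
    ).fetchall()
    return dict(rows)


# Throwaway in-memory database with a simplified, assumed schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE submitted (id INTEGER PRIMARY KEY, status TEXT)")
conn.executemany(
    "INSERT INTO submitted (status) VALUES (?)",
    [("NEW",), ("NEW",), ("LANDED",)],
)
breakdown = status_breakdown(conn)
```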

Safety Notes

  • Non-destructive: Only adds new records, doesn't modify existing data
  • Test Data Only: All generated data is clearly identifiable as test data
  • Easy Cleanup: Generated records can be removed with a SQL query (see Cleanup)

Current Status

The script runs successfully and inserts test data. As of the latest run:

  • Total PPR records in database: 93
  • Status breakdown:
    • NEW: 19
    • CONFIRMED: 22
    • CANCELED: 1
    • LANDED: 35
    • DEPARTED: 16

Troubleshooting

  • Database Connection: Ensure the database container is running and accessible
  • CSV Files: The script uses fallback data when CSV files aren't found (which is normal in containerized environments)
  • Dependencies: Ensure all Python requirements are installed
  • Permissions: Script needs database write permissions
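
The CSV fallback behaviour can be sketched as follows. This is an illustration only: the `icao` column name, the function name, and the fallback list are assumptions, and the real script may read different columns or use different built-in data.

```python
import csv
from pathlib import Path

FALLBACK_AIRPORTS = ["EGFH", "EGLL", "EGKK"]  # illustrative fallback list


def load_icao_codes(csv_path, column="icao", fallback=FALLBACK_AIRPORTS):
    """Read ICAO codes from a CSV file, or return fallback data if unavailable."""
    path = Path(csv_path)
    if not path.is_file():
        # Typical inside a container where the CSV files are not mounted.
        return list(fallback)
    with path.open(newline="", encoding="utf-8") as handle:
        codes = [
            row[column].strip()
            for row in csv.DictReader(handle)
            if row.get(column)
        ]
    # An empty or header-only file also falls back.
    return codes or list(fallback)
```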

Recent Fixes

  • Fixed SQLAlchemy 2.0 func.count() import issue
  • Script now runs successfully and provides status breakdown
  • Uses fallback aircraft/airport data when CSV files aren't accessible

Cleanup (if needed)

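To remove test data, delete rows by submission date. This statement removes every record submitted after the given date, not only generated ones, so choose the date carefully:

    DELETE FROM submitted WHERE submitted_dt > '2025-01-01'; -- Adjust date as needed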
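
Because a date filter matches every row submitted after the cutoff, it can help to count the affected rows before deleting them. The sketch below shows that dry-run pattern with Python's built-in sqlite3 module. The `submitted` table and `submitted_dt` column come from the query above; the function name and the use of SQLite instead of the project's actual database are assumptions.

```python
import sqlite3


def delete_submitted_after(conn, cutoff, dry_run=True):
    """Count rows newer than `cutoff`; delete them only when dry_run is False."""
    count = conn.execute(
        "SELECT COUNT(*) FROM submitted WHERE submitted_dt > ?", (cutoff,)
    ).fetchone()[0]
    if not dry_run:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "DELETE FROM submitted WHERE submitted_dt > ?", (cutoff,)
            )
    return count


# Throwaway in-memory database; ISO-format text dates compare correctly as strings.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE submitted (id INTEGER PRIMARY KEY, submitted_dt TEXT)")
conn.executemany(
    "INSERT INTO submitted (submitted_dt) VALUES (?)",
    [("2024-12-31 10:00:00",), ("2025-03-01 09:15:00",), ("2025-10-25 14:00:00",)],
)
would_delete = delete_submitted_after(conn, "2025-01-01")
```

Running with the default `dry_run=True` only reports the count, so the cutoff can be checked before any rows are removed.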