Test Data Population Script
This script generates and inserts 30 random PPR (Prior Permission Required) records into the database for testing purposes.
Features
- 30 Random PPR Records: Generates diverse test data with various aircraft, airports, and flight details
- Real Aircraft Data: Uses actual aircraft registration data from the aircraft_data.csv file
- Real Airport Data: Uses actual airport ICAO codes from the airports_data_clean.csv file
- Random Status Distribution: Includes NEW, CONFIRMED, LANDED, and DEPARTED statuses
- Realistic Timestamps: Generates ETA/ETD times with 15-minute intervals
- Optional Fields: Randomly includes email, phone, notes, and departure details
- Duplicate Aircraft: Some aircraft registrations appear multiple times for realistic testing
Usage
Prerequisites
- Database must be running and accessible
- Python environment with required dependencies installed
- CSV data files (aircraft_data.csv and airports_data_clean.csv) in the parent directory
Running the Script
- Using the convenience script (recommended):
    cd /home/jamesp/docker/pprdev/nextgen
    ./populate_test_data.sh
- From within the Docker container:
    docker exec -it ppr-backend bash
    cd /app
    python populate_test_data.py
- From host machine (if database is accessible):
    cd /home/jamesp/docker/pprdev/nextgen/backend
    python populate_test_data.py
What Gets Generated
Each PPR record includes:
- Aircraft: Random registration, type, and callsign from real aircraft data
- Route: Random arrival airport (from Swansea), optional departure airport
- Times: ETA between 6 AM - 8 PM, ETD 1-4 hours later (if departing)
- Passengers: 1-4 POB for arrival, optional for departure
- Contact: Optional email and phone (70% and 50% chance respectively)
- Fuel: Random fuel type (100LL, JET A1, FULL) or none
- Notes: Optional flight purpose notes (various scenarios)
- Status: Random status distribution (NEW/CONFIRMED/LANDED/DEPARTED)
- Timestamps: Random submission dates within last 30 days
- Public Token: Auto-generated for edit/cancel functionality
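The time and contact rules above can be sketched as follows. This is a minimal illustration, not the script's actual code: the function names, the 50% chance of a departure leg, and the placeholder email and phone values are all assumptions.

```python
import random
from datetime import datetime, timedelta


def random_eta_etd(day, rng=random):
    """Return (eta, etd) for `day`; etd is None when no departure is planned."""
    # ETA on a 15-minute boundary between 06:00 and 20:00 (both inclusive).
    slots = (20 - 6) * 4  # number of 15-minute steps after 06:00
    start = datetime(day.year, day.month, day.day, 6, 0)
    eta = start + timedelta(minutes=15 * rng.randint(0, slots))

    # Assumption: half of the records get a departure leg.
    etd = None
    if rng.random() < 0.5:
        # ETD is 1-4 hours after ETA, also in 15-minute steps.
        etd = eta + timedelta(minutes=15 * rng.randint(4, 16))
    return eta, etd


def random_contact(rng=random):
    """Email with 70% probability and phone with 50%, as listed above."""
    # Placeholder values; the real script generates varied patterns.
    email = "pilot@example.com" if rng.random() < 0.7 else None
    phone = "07700 900123" if rng.random() < 0.5 else None
    return email, phone


eta, etd = random_eta_etd(datetime(2025, 6, 1))
print(eta, etd, random_contact())
```

Passing a seeded `random.Random` instance as `rng` makes the generated values reproducible, which can help when debugging a specific record.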
Aircraft Distribution
- Uses real aircraft registration data from aircraft_data.csv
- Includes various aircraft types (C172, PA28, BE36, R44, etc.)
- Some aircraft appear multiple times for realistic duplication
Airport Distribution
- Uses real ICAO airport codes from airports_data_clean.csv
- Arrival airports are distributed globally
- Departure airports (when included) are different from arrival airports
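One way to guarantee that the departure airport differs from the arrival airport is to sample two distinct codes in a single call. The sketch below assumes the airport data has already been reduced to a list of ICAO codes; `pick_route` is a hypothetical helper, not a function from the script.

```python
import random


def pick_route(icao_codes, with_departure, rng=random):
    """Return (arrival, departure); departure is None when not requested."""
    # De-duplicate first so that two sampled entries can never be equal.
    codes = sorted(set(icao_codes))
    if with_departure:
        arrival, departure = rng.sample(codes, 2)
        return arrival, departure
    return rng.choice(codes), None


print(pick_route(["EGFF", "EGLL", "EGKK", "EGFF"], with_departure=True))
```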
Data Quality Notes
- Realistic Distribution: Aircraft and airports are selected from actual aviation data
- Time Constraints: All times are within reasonable operating hours (6 AM - 8 PM)
- Status Balance: Roughly equal distribution across different PPR statuses
- Contact Info: Realistic email patterns and UK phone numbers
- Flight Logic: Departures only occur when a departure airport is specified
Assumptions
- Database schema matches the PPRRecord model in app/models/ppr.py
- CSV files are present and properly formatted
- Database connection uses settings from app/core/config.py
- All required dependencies are installed in the Python environment
Sample Output
Loading aircraft and airport data...
Loaded 520000 aircraft records
Loaded 43209 airport records
Generating and inserting 30 test PPR records...
Generated 10 records...
Generated 20 records...
Generated 30 records...
✅ Successfully inserted 30 test PPR records!
Total PPR records in database: 42
Status breakdown:
NEW: 8
CONFIRMED: 7
LANDED: 9
DEPARTED: 6
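The status breakdown at the end of this output amounts to a grouped count. The script itself uses SQLAlchemy; the sketch below shows an equivalent query in plain SQL against an in-memory SQLite database so that it runs anywhere. The table name `submitted` is taken from the Cleanup section, while the `status` column name is an assumption.

```python
import sqlite3


def status_breakdown(conn):
    """Return {status: count} for all rows in the (assumed) submitted table."""
    rows = conn.execute(
        "SELECT status, COUNT(*) FROM submitted GROUP BY status"
    ).fetchall()
    return dict(rows)


# Throwaway in-memory database with a simplified, assumed schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE submitted (id INTEGER PRIMARY KEY, status TEXT)")
conn.executemany(
    "INSERT INTO submitted (status) VALUES (?)",
    [("NEW",), ("NEW",), ("LANDED",)],
)

breakdown = status_breakdown(conn)
for status, count in sorted(breakdown.items()):
    print(f"{status}: {count}")
```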
Safety Notes
- Non-destructive: Only adds new records, doesn't modify existing data
- Test Data Only: All generated data is clearly identifiable as test data
- Easy Cleanup: Can be easily removed with SQL queries if needed
Current Status ✅
The script is working correctly! It has successfully generated and inserted test data. As of the latest run:
- Total PPR records in database: 93
- Status breakdown:
  - NEW: 19
  - CONFIRMED: 22
  - CANCELED: 1
  - LANDED: 35
  - DEPARTED: 16
Troubleshooting
- Database Connection: Ensure the database container is running and accessible
- CSV Files: The script uses fallback data when CSV files aren't found (which is normal in containerized environments)
- Dependencies: Ensure all Python requirements are installed
- Permissions: Script needs database write permissions
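The CSV fallback behaviour mentioned above might look roughly like this. It is a sketch under assumptions: `load_rows` and `FALLBACK_AIRPORTS` are hypothetical names, and the real script's fallback data and column layout are not shown in this document.

```python
import csv
from pathlib import Path

# Hypothetical fallback rows, one ICAO code per row.
FALLBACK_AIRPORTS = [["EGFH"], ["EGFF"], ["EGLL"]]


def load_rows(csv_path, fallback):
    """Read all non-empty rows from csv_path, or return the fallback rows."""
    path = Path(csv_path)
    if not path.is_file():
        # Typical inside a container where the CSV files are not mounted.
        return list(fallback)
    with path.open(newline="", encoding="utf-8") as handle:
        rows = [row for row in csv.reader(handle) if row]
    # An empty file is treated the same as a missing one.
    return rows or list(fallback)


airports = load_rows("../airports_data_clean.csv", FALLBACK_AIRPORTS)
print(f"Loaded {len(airports)} airport records")
```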
Recent Fixes
- ✅ Fixed SQLAlchemy 2.0 func.count() import issue
- ✅ Script now runs successfully and provides status breakdown
- ✅ Uses fallback aircraft/airport data when CSV files aren't accessible
Cleanup (if needed)
To remove all test data:
DELETE FROM submitted WHERE submitted_dt > '2025-01-01'; -- Adjust date as needed
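Because this statement removes every row newer than the cutoff, it can be worth counting the matching rows before deleting them. The sketch below demonstrates the idea against an in-memory SQLite database with a simplified, assumed schema; `delete_since` is a hypothetical helper, and against the real database the same two queries would be run through its own client or driver.

```python
import sqlite3


def delete_since(conn, cutoff, dry_run=True):
    """Count rows with submitted_dt > cutoff; delete them unless dry_run."""
    count = conn.execute(
        "SELECT COUNT(*) FROM submitted WHERE submitted_dt > ?", (cutoff,)
    ).fetchone()[0]
    if not dry_run:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "DELETE FROM submitted WHERE submitted_dt > ?", (cutoff,)
            )
    return count


# Throwaway in-memory database; timestamps stored as ISO 8601 text.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE submitted (id INTEGER PRIMARY KEY, submitted_dt TEXT)")
conn.executemany(
    "INSERT INTO submitted (submitted_dt) VALUES (?)",
    [("2024-12-31 18:00:00",), ("2025-03-10 09:15:00",)],
)

preview = delete_since(conn, "2025-01-01")  # dry run: nothing is deleted
print(f"{preview} row(s) would be deleted")
```

Calling `delete_since(conn, "2025-01-01", dry_run=False)` performs the deletion once the previewed count looks right.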