Test data script
This commit is contained in:
138
backend/README_test_data.md
Normal file
138
backend/README_test_data.md
Normal file
@@ -0,0 +1,138 @@
|
||||
# Test Data Population Script
|
||||
|
||||
This script generates and inserts 30 random PPR (Prior Permission Required) records into the database for testing purposes.
|
||||
|
||||
## Features
|
||||
|
||||
- **30 Random PPR Records**: Generates diverse test data with various aircraft, airports, and flight details
|
||||
- **Real Aircraft Data**: Uses actual aircraft registration data from the `aircraft_data.csv` file
|
||||
- **Real Airport Data**: Uses actual airport ICAO codes from the `airports_data_clean.csv` file
|
||||
- **Random Status Distribution**: Includes NEW, CONFIRMED, LANDED, and DEPARTED statuses
|
||||
- **Realistic Timestamps**: Generates ETA/ETD times with 15-minute intervals
|
||||
- **Optional Fields**: Randomly includes email, phone, notes, and departure details
|
||||
- **Duplicate Aircraft**: Some aircraft registrations appear multiple times for realistic testing
|
||||
|
||||
## Usage
|
||||
|
||||
### Prerequisites
|
||||
- Database must be running and accessible
|
||||
- Python environment with required dependencies installed
|
||||
- CSV data files (`aircraft_data.csv` and `airports_data_clean.csv`) in the parent directory
|
||||
|
||||
### Running the Script
|
||||
|
||||
1. **Using the convenience script** (recommended):
|
||||
```bash
|
||||
cd /home/jamesp/docker/pprdev/nextgen
|
||||
./populate_test_data.sh
|
||||
```
|
||||
|
||||
2. **From within the Docker container**:
|
||||
```bash
|
||||
docker exec -it ppr-backend bash
|
||||
cd /app
|
||||
python populate_test_data.py
|
||||
```
|
||||
|
||||
3. **From host machine** (if database is accessible):
|
||||
```bash
|
||||
cd /home/jamesp/docker/pprdev/nextgen/backend
|
||||
python populate_test_data.py
|
||||
```
|
||||
|
||||
## What Gets Generated
|
||||
|
||||
Each PPR record includes:
|
||||
- **Aircraft**: Random registration, type, and callsign from real aircraft data
|
||||
- **Route**: Random arrival airport (from Swansea), optional departure airport
|
||||
- **Times**: ETA between 6 AM - 8 PM, ETD 1-4 hours later (if departing)
|
||||
- **Passengers**: 1-4 POB for arrival, optional for departure
|
||||
- **Contact**: Optional email and phone (70% and 50% chance respectively)
|
||||
- **Fuel**: Random fuel type (100LL, JET A1, FULL) or none
|
||||
- **Notes**: Optional flight purpose notes (various scenarios)
|
||||
- **Status**: Random status distribution (NEW/CONFIRMED/LANDED/DEPARTED)
|
||||
- **Timestamps**: Random submission dates within last 30 days
|
||||
- **Public Token**: Auto-generated for edit/cancel functionality
|
||||
|
||||
### Aircraft Distribution
|
||||
- Uses real aircraft registration data from `aircraft_data.csv`
|
||||
- Includes various aircraft types (C172, PA28, BE36, R44, etc.)
|
||||
- Some aircraft appear multiple times for realistic duplication
|
||||
|
||||
### Airport Distribution
|
||||
- Uses real ICAO airport codes from `airports_data_clean.csv`
|
||||
- Arrival airports are distributed globally
|
||||
- Departure airports (when included) are different from arrival airports
|
||||
|
||||
### Data Quality Notes
|
||||
|
||||
- **Realistic Distribution**: Aircraft and airports are selected from actual aviation data
|
||||
- **Time Constraints**: All times are within reasonable operating hours (6 AM - 8 PM)
|
||||
- **Status Balance**: Roughly equal distribution across different PPR statuses
|
||||
- **Contact Info**: Realistic email patterns and UK phone numbers
|
||||
- **Flight Logic**: Departures only occur when a departure airport is specified
|
||||
|
||||
## Assumptions
|
||||
|
||||
- Database schema matches the PPRRecord model in `app/models/ppr.py`
|
||||
- CSV files are present and properly formatted
|
||||
- Database connection uses settings from `app/core/config.py`
|
||||
- All required dependencies are installed in the Python environment
|
||||
|
||||
### Sample Output
|
||||
|
||||
```
|
||||
Loading aircraft and airport data...
|
||||
Loaded 520000 aircraft records
|
||||
Loaded 43209 airport records
|
||||
Generating and inserting 30 test PPR records...
|
||||
Generated 10 records...
|
||||
Generated 20 records...
|
||||
Generated 30 records...
|
||||
✅ Successfully inserted 30 test PPR records!
|
||||
Total PPR records in database: 42
|
||||
|
||||
Status breakdown:
|
||||
NEW: 8
|
||||
CONFIRMED: 7
|
||||
LANDED: 9
|
||||
DEPARTED: 6
|
||||
```
|
||||
|
||||
## Safety Notes
|
||||
|
||||
- **Non-destructive**: Only adds new records, doesn't modify existing data
|
||||
- **Test Data Only**: All generated data is clearly identifiable as test data
|
||||
- **Easy Cleanup**: Can be easily removed with SQL queries if needed
|
||||
|
||||
## Current Status ✅
|
||||
|
||||
The script is working correctly! It has successfully generated and inserted test data. As of the latest run:
|
||||
|
||||
- **Total PPR records in database**: 93
|
||||
- **Status breakdown**:
|
||||
- NEW: 19
|
||||
- CONFIRMED: 22
|
||||
- CANCELED: 1
|
||||
- LANDED: 35
|
||||
- DEPARTED: 16
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
- **Database Connection**: Ensure the database container is running and accessible
|
||||
- **CSV Files**: The script uses fallback data when CSV files aren't found (which is normal in containerized environments)
|
||||
- **Dependencies**: Ensure all Python requirements are installed
|
||||
- **Permissions**: Script needs database write permissions
|
||||
|
||||
## Recent Fixes
|
||||
|
||||
- ✅ Fixed SQLAlchemy 2.0 `func.count()` import issue
|
||||
- ✅ Script now runs successfully and provides status breakdown
|
||||
- ✅ Uses fallback aircraft/airport data when CSV files aren't accessible
|
||||
|
||||
## Cleanup (if needed)
|
||||
|
||||
To remove all test data:
|
||||
```sql
|
||||
DELETE FROM submitted WHERE submitted_dt > '2025-01-01'; -- Adjust date as needed
|
||||
```
|
||||
237
backend/populate_test_data.py
Executable file
237
backend/populate_test_data.py
Executable file
@@ -0,0 +1,237 @@
|
||||
#!/usr/bin/env python3
|
||||
"""
|
||||
Test data population script for PPR database.
|
||||
Generates 30 random PPR records with various aircraft, airports, and other data.
|
||||
"""
|
||||
|
||||
import random
|
||||
import csv
|
||||
from datetime import datetime, timedelta
|
||||
from pathlib import Path
|
||||
|
||||
# Add the app directory to the Python path
|
||||
import sys
|
||||
sys.path.append(str(Path(__file__).parent / 'app'))
|
||||
|
||||
from sqlalchemy.orm import Session
|
||||
from sqlalchemy import func
|
||||
from app.db.session import SessionLocal
|
||||
from app.models.ppr import PPRRecord, PPRStatus
|
||||
from app.core.config import settings
|
||||
|
||||
|
||||
def load_aircraft_data():
|
||||
"""Load aircraft data from CSV file."""
|
||||
aircraft = []
|
||||
csv_path = Path(__file__).parent.parent / 'aircraft_data.csv'
|
||||
|
||||
try:
|
||||
with open(csv_path, 'r', encoding='utf-8') as f:
|
||||
reader = csv.reader(f)
|
||||
for row in reader:
|
||||
if len(row) >= 6:
|
||||
aircraft.append({
|
||||
'icao24': row[0],
|
||||
'registration': row[1],
|
||||
'manufacturer': row[2],
|
||||
'type_code': row[3],
|
||||
'manufacturer_name': row[4],
|
||||
'model': row[5]
|
||||
})
|
||||
except FileNotFoundError:
|
||||
print("Aircraft data file not found, using fallback data")
|
||||
# Fallback aircraft data
|
||||
aircraft = [
|
||||
{'registration': 'G-ABCD', 'type_code': 'C172', 'manufacturer_name': 'Cessna', 'model': '172'},
|
||||
{'registration': 'G-EFGH', 'type_code': 'PA28', 'manufacturer_name': 'Piper', 'model': 'PA-28'},
|
||||
{'registration': 'G-IJKL', 'type_code': 'BE36', 'manufacturer_name': 'Beechcraft', 'model': 'Bonanza'},
|
||||
{'registration': 'G-MNOP', 'type_code': 'R44', 'manufacturer_name': 'Robinson', 'model': 'R44'},
|
||||
{'registration': 'G-QRST', 'type_code': 'C152', 'manufacturer_name': 'Cessna', 'model': '152'},
|
||||
{'registration': 'G-UVWX', 'type_code': 'PA38', 'manufacturer_name': 'Piper', 'model': 'Tomahawk'},
|
||||
{'registration': 'G-YZAB', 'type_code': 'C182', 'manufacturer_name': 'Cessna', 'model': '182'},
|
||||
{'registration': 'G-CDEF', 'type_code': 'DR40', 'manufacturer_name': 'Robin', 'model': 'DR400'},
|
||||
{'registration': 'G-GHIJ', 'type_code': 'TB20', 'manufacturer_name': 'Socata', 'model': 'TB-20'},
|
||||
{'registration': 'G-KLMN', 'type_code': 'DA40', 'manufacturer_name': 'Diamond', 'model': 'DA-40'},
|
||||
]
|
||||
|
||||
return aircraft
|
||||
|
||||
|
||||
def load_airport_data():
|
||||
"""Load airport data from CSV file."""
|
||||
airports = []
|
||||
csv_path = Path(__file__).parent.parent / 'airports_data_clean.csv'
|
||||
|
||||
try:
|
||||
with open(csv_path, 'r', encoding='utf-8') as f:
|
||||
reader = csv.reader(f)
|
||||
for row in reader:
|
||||
if len(row) >= 4:
|
||||
airports.append({
|
||||
'icao': row[0],
|
||||
'iata': row[1],
|
||||
'name': row[2],
|
||||
'country': row[3]
|
||||
})
|
||||
except FileNotFoundError:
|
||||
print("Airport data file not found, using fallback data")
|
||||
# Fallback airport data
|
||||
airports = [
|
||||
{'icao': 'EGFH', 'iata': '', 'name': 'Swansea Airport', 'country': 'GB'},
|
||||
{'icao': 'EGFF', 'iata': 'CWL', 'name': 'Cardiff International Airport', 'country': 'GB'},
|
||||
{'icao': 'EGTE', 'iata': 'EXT', 'name': 'Exeter International Airport', 'country': 'GB'},
|
||||
{'icao': 'EGGD', 'iata': 'BRS', 'name': 'Bristol Airport', 'country': 'GB'},
|
||||
{'icao': 'EGHH', 'iata': 'BOH', 'name': 'Bournemouth Airport', 'country': 'GB'},
|
||||
{'icao': 'EGHI', 'iata': 'SOU', 'name': 'Southampton Airport', 'country': 'GB'},
|
||||
{'icao': 'EGSS', 'iata': 'STN', 'name': 'London Stansted Airport', 'country': 'GB'},
|
||||
{'icao': 'EGKK', 'iata': 'LGW', 'name': 'London Gatwick Airport', 'country': 'GB'},
|
||||
{'icao': 'EGLL', 'iata': 'LHR', 'name': 'London Heathrow Airport', 'country': 'GB'},
|
||||
{'icao': 'EIDW', 'iata': 'DUB', 'name': 'Dublin Airport', 'country': 'IE'},
|
||||
]
|
||||
|
||||
return airports
|
||||
|
||||
|
||||
def generate_random_ppr(aircraft_data, airport_data):
|
||||
"""Generate a random PPR record."""
|
||||
# Select random aircraft
|
||||
aircraft = random.choice(aircraft_data)
|
||||
|
||||
# Select random departure airport (not Swansea)
|
||||
departure_airports = [a for a in airport_data if a['icao'] != 'EGFH']
|
||||
arrival_airport = random.choice(departure_airports)
|
||||
|
||||
# Sometimes add a departure airport (50% chance)
|
||||
departure_airport = None
|
||||
if random.random() < 0.5:
|
||||
departure_airports = [a for a in airport_data if a['icao'] != arrival_airport['icao']]
|
||||
departure_airport = random.choice(departure_airports)
|
||||
|
||||
# Generate random times
|
||||
now = datetime.now()
|
||||
base_date = now + timedelta(days=random.randint(-7, 14)) # Past week to 2 weeks future
|
||||
|
||||
# ETA: sometime between 6 AM and 8 PM
|
||||
eta_hour = random.randint(6, 20)
|
||||
eta_minute = random.choice([0, 15, 30, 45])
|
||||
eta = base_date.replace(hour=eta_hour, minute=eta_minute, second=0, microsecond=0)
|
||||
|
||||
# ETD: 1-4 hours after ETA (if departure planned)
|
||||
etd = None
|
||||
if departure_airport:
|
||||
etd_hours = random.randint(1, 4)
|
||||
etd = eta + timedelta(hours=etd_hours)
|
||||
# Round ETD to 15-minute intervals
|
||||
etd_minute = ((etd.minute // 15) * 15) % 60
|
||||
etd = etd.replace(minute=etd_minute, second=0, microsecond=0)
|
||||
|
||||
# Random captain names
|
||||
captains = [
|
||||
'John Smith', 'Sarah Johnson', 'Michael Brown', 'Emma Davis', 'James Wilson',
|
||||
'Olivia Taylor', 'William Anderson', 'Sophia Martinez', 'Benjamin Garcia', 'Isabella Lopez',
|
||||
'Alexander Gonzalez', 'Charlotte Rodriguez', 'Daniel Lee', 'Amelia Walker', 'Matthew Hall'
|
||||
]
|
||||
|
||||
# Random fuel types
|
||||
fuel_types = [None, '100LL', 'JET A1', 'FULL']
|
||||
|
||||
# Random POB
|
||||
pob_in = random.randint(1, 4)
|
||||
pob_out = random.randint(1, 4) if departure_airport else None
|
||||
|
||||
# Random status
|
||||
statuses = [PPRStatus.NEW, PPRStatus.CONFIRMED, PPRStatus.LANDED, PPRStatus.DEPARTED]
|
||||
status = random.choice(statuses)
|
||||
|
||||
# Random contact info (sometimes)
|
||||
email = None
|
||||
phone = None
|
||||
if random.random() < 0.7:
|
||||
email = f"{random.choice(captains).lower().replace(' ', '.')}@example.com"
|
||||
if random.random() < 0.5:
|
||||
phone = f"07{random.randint(100000000, 999999999)}"
|
||||
|
||||
# Random notes (sometimes)
|
||||
notes_options = [
|
||||
None,
|
||||
"Medical flight - priority handling required",
|
||||
"VIP passenger on board",
|
||||
"Technical stop only",
|
||||
"Training flight",
|
||||
"Photo flight - low level operations",
|
||||
"Maintenance ferry flight",
|
||||
"Charter flight",
|
||||
"Private flight"
|
||||
]
|
||||
notes = random.choice(notes_options)
|
||||
|
||||
# Create PPR record
|
||||
ppr = PPRRecord(
|
||||
status=status,
|
||||
ac_reg=aircraft['registration'],
|
||||
ac_type=aircraft['type_code'] or 'UNKNOWN',
|
||||
ac_call=random.choice([None, f"CALL{random.randint(100, 999)}"]),
|
||||
captain=random.choice(captains),
|
||||
fuel=random.choice(fuel_types),
|
||||
in_from=arrival_airport['icao'],
|
||||
eta=eta,
|
||||
pob_in=pob_in,
|
||||
out_to=departure_airport['icao'] if departure_airport else None,
|
||||
etd=etd,
|
||||
pob_out=pob_out,
|
||||
email=email,
|
||||
phone=phone,
|
||||
notes=notes,
|
||||
submitted_dt=now - timedelta(days=random.randint(0, 30)) # Random submission date
|
||||
)
|
||||
|
||||
return ppr
|
||||
|
||||
|
||||
def main():
|
||||
"""Main function to populate test data."""
|
||||
print("Loading aircraft and airport data...")
|
||||
|
||||
aircraft_data = load_aircraft_data()
|
||||
airport_data = load_airport_data()
|
||||
|
||||
print(f"Loaded {len(aircraft_data)} aircraft records")
|
||||
print(f"Loaded {len(airport_data)} airport records")
|
||||
|
||||
# Create database session
|
||||
db: Session = SessionLocal()
|
||||
|
||||
try:
|
||||
print("Generating and inserting 30 test PPR records...")
|
||||
|
||||
# Generate and insert 30 PPR records
|
||||
for i in range(30):
|
||||
ppr = generate_random_ppr(aircraft_data, airport_data)
|
||||
db.add(ppr)
|
||||
|
||||
if (i + 1) % 10 == 0:
|
||||
print(f"Generated {i + 1} records...")
|
||||
|
||||
# Commit all changes
|
||||
db.commit()
|
||||
print("✅ Successfully inserted 30 test PPR records!")
|
||||
|
||||
# Print summary
|
||||
total_count = db.query(PPRRecord).count()
|
||||
print(f"Total PPR records in database: {total_count}")
|
||||
|
||||
# Show status breakdown
|
||||
status_counts = db.query(PPRRecord.status, func.count(PPRRecord.id)).group_by(PPRRecord.status).all()
|
||||
print("\nStatus breakdown:")
|
||||
for status, count in status_counts:
|
||||
print(f" {status}: {count}")
|
||||
|
||||
except Exception as e:
|
||||
print(f"❌ Error: {e}")
|
||||
db.rollback()
|
||||
finally:
|
||||
db.close()
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
16
populate_test_data.sh
Executable file
16
populate_test_data.sh
Executable file
@@ -0,0 +1,16 @@
|
||||
#!/bin/bash
|
||||
# Script to populate test data in the PPR database
|
||||
# Run this from the nextgen directory
|
||||
|
||||
echo "Populating test data..."
|
||||
|
||||
# Check if we're in the right directory
|
||||
if [ ! -f "docker-compose.yml" ]; then
|
||||
echo "Error: Please run this script from the nextgen directory"
|
||||
exit 1
|
||||
fi
|
||||
|
||||
# Run the population script in the backend container
|
||||
docker-compose exec backend python populate_test_data.py
|
||||
|
||||
echo "Test data population complete!"
|
||||
Reference in New Issue
Block a user