Data Analytics for Air Transport Workshop Overview

Peter Jackson

8/28/2021

Introduction to R (Pre-Work)

  • We expect you to be familiar with the R programming language already

  • If you really are starting from scratch, start with Introduction to R

Introduction to RStudio (Pre-Work)

Introduction to Databases

  • The workshop starts here
  • Most online tutorials assume your data is stored in spreadsheet-style (CSV) files
  • Projects based on CSV files easily become disorganized
  • I prefer to organize all data into relational databases
  • Relational databases can be queried with SQL (Structured Query Language)
  • You can start simply, using nothing more than SQLite databases
  • This session gets you started with SQL and SQLite using R
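
As a minimal sketch of the SQL-from-R workflow described above, the following uses the DBI package with the RSQLite driver (RSQLite is an assumption here; the slides name only DBI and SQLite). The table name, column names, and approximate airport coordinates are made up for illustration.

```r
library(DBI)

# Open an in-memory SQLite database; pass a file path instead to keep the data
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# A small illustrative table (coordinates are approximate)
airports <- data.frame(
  iata = c("JFK", "LHR", "SIN"),
  lat  = c(40.64, 51.47, 1.36),
  lon  = c(-73.78, -0.45, 103.99),
  stringsAsFactors = FALSE
)
dbWriteTable(con, "airports", airports)

# Parameterized SQL query: airports north of a given latitude
north <- dbGetQuery(
  con,
  "SELECT iata, lat FROM airports WHERE lat > ? ORDER BY iata",
  params = list(30)
)
print(north)

dbDisconnect(con)
```

Passing values through `params` rather than pasting them into the SQL string keeps the query safe when the values come from user input or scraped text.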

Scrape the Web

  • Large amounts of data for your project are publicly available on the web
  • R has good tools for web scraping, such as the rvest package
  • This session is to get you started with web-scraping and storing your data in a relational database
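
The scrape-then-store pattern might look like the sketch below. It assumes rvest for parsing and RSQLite as the database driver, and it parses a small inline HTML fragment rather than a real website so that it runs offline; the airline names and codes are invented.

```r
library(rvest)
library(DBI)

# Stand-in for a downloaded page; for a live site you would call read_html(url)
page <- minimal_html('
  <table id="airlines">
    <tr><th>airline</th><th>iata</th></tr>
    <tr><td>Example Air</td><td>EX</td></tr>
    <tr><td>Sample Airways</td><td>SW</td></tr>
  </table>')

# Select the table by its CSS id and convert it to a data frame
airlines <- as.data.frame(html_table(html_element(page, "#airlines")))

# Store the scraped rows in an SQLite table (hypothetical table name)
con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(con, "airlines", airlines)
stored <- dbGetQuery(con, "SELECT airline, iata FROM airlines ORDER BY iata")
print(stored)
dbDisconnect(con)
```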

Visualize Geographical Data

  • A first principle in statistics and data science is to always find some way to visualize your data
  • We are working with geographical data (data annotated with latitude and longitude)
  • R offers multiple ways to view geographical data; we use the leaflet package
  • This session is to get you started with pulling data from your database and displaying it on a map
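
A sketch of pulling rows from a database and placing them on a leaflet map is shown below. The table and its approximate coordinates are illustrative assumptions, and RSQLite is assumed as the driver.

```r
library(DBI)
library(leaflet)

# Build a throwaway database so the example is self-contained
con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(con, "airports", data.frame(
  iata = c("JFK", "LHR", "SIN"),
  lat  = c(40.64, 51.47, 1.36),
  lon  = c(-73.78, -0.45, 103.99)
))

# Pull the geographical data back out with SQL
airports <- dbGetQuery(con, "SELECT iata, lat, lon FROM airports")
dbDisconnect(con)

# Build the map: base tiles plus one circle marker per airport
m <- leaflet(airports)
m <- addTiles(m)  # default OpenStreetMap tiles
m <- addCircleMarkers(m, lng = ~lon, lat = ~lat, label = ~iata, radius = 6)

# Display the map only when running interactively (e.g. in RStudio)
if (interactive()) print(m)
```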

Capture Live Data

  • Many websites offer the ability to access their data using an Application Programming Interface (API)
  • The OpenSky Network collects ADS-B transmissions from aircraft around the world for research purposes
  • The OpenSky REST API is accessible from R using the httr package
  • This session shows you how to poll OpenSky for live data in a continuous loop and archive the results to your local database
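
The capture loop could be sketched roughly as follows. This is not the workshop's own code: the function names, the `positions` table, the bounding box, and the polling interval are all hypothetical, and the field positions follow the OpenSky state-vector layout as I understand it (check the API documentation before relying on them). The loop is bounded by `n_polls` so it terminates; replacing the `for` with `repeat` would give the infinite loop described above.

```r
library(httr)
library(DBI)

# Convert the parsed JSON payload into a data frame of positions.
# Assumed 1-based field positions: 1 = icao24, 2 = callsign,
# 6 = longitude, 7 = latitude.
parse_states <- function(payload) {
  pick <- function(s, i) if (is.null(s[[i]])) NA else s[[i]]
  states <- payload$states
  data.frame(
    time     = rep(payload$time, length(states)),
    icao24   = vapply(states, function(s) as.character(pick(s, 1)), character(1)),
    callsign = vapply(states, function(s) trimws(as.character(pick(s, 2))), character(1)),
    lon      = vapply(states, function(s) as.numeric(pick(s, 6)), numeric(1)),
    lat      = vapply(states, function(s) as.numeric(pick(s, 7)), numeric(1)),
    stringsAsFactors = FALSE
  )
}

# Request all state vectors inside a bounding box (anonymous access assumed)
fetch_states <- function(bbox) {
  resp <- GET("https://opensky-network.org/api/states/all",
              query = as.list(bbox), timeout(30))
  stop_for_status(resp)
  parse_states(content(resp, as = "parsed", type = "application/json"))
}

# Poll n_polls times, appending each batch to the 'positions' table
capture <- function(con, n_polls = 3, interval = 15,
                    bbox = c(lamin = 1.0, lomin = 103.5, lamax = 1.6, lomax = 104.2),
                    fetch = fetch_states) {
  for (i in seq_len(n_polls)) {
    batch <- tryCatch(fetch(bbox), error = function(e) {
      message("Poll ", i, " failed: ", conditionMessage(e))
      NULL
    })
    if (!is.null(batch) && nrow(batch) > 0) {
      dbWriteTable(con, "positions", batch, append = TRUE)
    }
    if (i < n_polls) Sys.sleep(interval)
  }
  invisible(con)
}

# Usage (needs network access):
# con <- dbConnect(RSQLite::SQLite(), "opensky.sqlite")
# capture(con, n_polls = 20, interval = 15)
# dbDisconnect(con)
```

Keeping the parsing step separate from the HTTP request makes it possible to test the database logic with canned data, and wrapping each poll in `tryCatch` lets one failed request pass without ending the whole capture run.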

Animate Geographical Data

  • Having collected several minutes of live aircraft location data, you will want to display an animation of aircraft movements
  • One way to do this is to create an interactive browser-based application consisting of a map and a slider bar
  • As the user moves the slider bar, representing time, the map can interactively update with the aircraft locations for that time
  • R supports browser-based applications through the Shiny package
  • This session introduces you to R Shiny and integrates the previous sessions into a database-driven interactive animation application
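
The map-plus-slider idea can be sketched as a small Shiny app. The track data here is invented and held in a data frame for brevity; in a database-driven version, `positions_at()` (a hypothetical helper) would instead run an SQL query for the selected time.

```r
library(shiny)
library(leaflet)

# Made-up tracks for two aircraft over five time steps
tracks <- data.frame(
  t      = rep(1:5, times = 2),
  icao24 = rep(c("abc123", "def456"), each = 5),
  lon    = c(103.60 + 0.05 * (0:4), 104.10 - 0.05 * (0:4)),
  lat    = c(1.20 + 0.03 * (0:4), 1.50 - 0.02 * (0:4))
)

# Rows for a single time step
positions_at <- function(data, t) data[data$t == t, , drop = FALSE]

ui <- fluidPage(
  sliderInput("t", "Time step",
              min = min(tracks$t), max = max(tracks$t),
              value = min(tracks$t), step = 1,
              animate = animationOptions(interval = 1000)),
  leafletOutput("map")
)

server <- function(input, output, session) {
  # Draw the base map once, zoomed to the extent of the data
  output$map <- renderLeaflet({
    m <- addTiles(leaflet())
    fitBounds(m, min(tracks$lon), min(tracks$lat),
              max(tracks$lon), max(tracks$lat))
  })

  # Whenever the slider moves, redraw only the markers
  observeEvent(input$t, {
    now <- positions_at(tracks, input$t)
    proxy <- clearMarkers(leafletProxy("map"))
    addCircleMarkers(proxy, lng = now$lon, lat = now$lat,
                     label = now$icao24, radius = 5)
  })
}

app <- shinyApp(ui, server)
if (interactive()) runApp(app)
```

Updating markers through `leafletProxy()` avoids re-rendering the whole map on every slider change, which keeps the animation smooth.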

Debugging in R

  • Many students give up on programming simply because they lack debugging skills
  • Many bugs are caused by typos, so learn to scan a line of code character by character
  • Debugging is detective work: proceed methodically
  • When your program crashes, concentrate your effort on finding the line where it crashed
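
One way to narrow down where a program fails, sketched here with base R only. The functions are hypothetical examples; the point is that the error condition carries both a message and the call that raised it.

```r
# Two small functions; the inner one checks its input
parse_altitude <- function(x) {
  stopifnot(is.character(x))
  as.numeric(x)
}
summarise_flight <- function(alt) mean(parse_altitude(alt))

# Bug: numbers are passed where text is expected.
# tryCatch reports what failed and which call raised the error.
result <- tryCatch(
  summarise_flight(c(10000, 11000)),
  error = function(e) {
    message("Error: ", conditionMessage(e))
    message("Raised in: ", paste(deparse(conditionCall(e)), collapse = " "))
    NULL
  }
)

# The corrected call
fixed <- summarise_flight(c("10000", "11000"))
print(fixed)

# In an interactive session, two further tools help:
#   traceback()  right after an error, prints the stack of calls
#   browser()    placed inside a function, pauses there for inspection
```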

Workshop Wrap-Up

  • The goal of the workshop was to open the door to a powerful set of data manipulation and visualization tools connected to the R programming language
  • We have motivated the workshop with air transport research projects
  • Once started, you will find abundant online resources to take you deeper into these toolsets
  • We have demonstrated the following useful R packages:
  • DBI (for database connections)
  • httr (for internet connections)
  • rvest (for HTML document processing)
  • leaflet (for geographical displays)
  • dplyr (for data wrangling)
  • stringr (for string manipulation)
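
As a small illustration of the last two packages, the sketch below tidies callsigns with stringr and summarises them with dplyr. The data and the assumption that the first three characters of a callsign identify the operator are illustrative only.

```r
library(dplyr)
library(stringr)

# Made-up captured positions with padded and missing callsigns
positions <- data.frame(
  callsign = c("AAA321  ", "AAA12 ", "BBB606", NA),
  stringsAsFactors = FALSE
)

by_operator <- positions %>%
  filter(!is.na(callsign)) %>%                # drop rows without a callsign
  mutate(callsign = str_trim(callsign),       # remove padding spaces
         operator = str_sub(callsign, 1, 3)) %>%
  count(operator, name = "n_flights")         # one row per operator prefix

print(by_operator)
```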