Introduction
Airplanes are an important means of transportation for moving people and cargo from one place to another. Flights can, however, be delayed for many reasons, and delays waste valuable time and resources. If the responsible authorities can identify the causes and put enough effort into addressing them, passengers will save a great deal of time. Passengers can also make better-informed decisions about which flights to take, reducing their risk of being delayed. This research identifies the causes and patterns of flight delays so that they can inform practical solutions and decision making. The entire analysis is based on a dataset named “Flight Information for January 2020”. The results may be helpful to a diverse group of people who fly or are otherwise involved with flights.
Data Preparation
The dataset used here is named “Flight Information for January 2020”. It contains 5 sheets, and the main sheet is named Flight Information. The Flight Information sheet contains 26 variables and 607346 rows of observations. The other sheets hold supporting information, such as Day of Week, OP Unique Carrier, and Airport, keyed by the codes used in the Flight Information sheet. The Day of Week, OP Unique Carrier, and Airport values are therefore added to the main dataset with left joins on the relevant key variables. As a result, the Flight Information dataset gains extra variables: Day, Unique Carrier, DestAirport, and OriginAirport.
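The left joins described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the column names (DAY_OF_WEEK, OP_UNIQUE_CARRIER, ORIGIN, DEST), the sample rows, and the helper `left_join` are hypothetical and are not taken from the dataset or from the tool actually used for this analysis.

```python
# Hypothetical sample rows; column names and codes are assumptions for illustration.
flights = [
    {"DAY_OF_WEEK": 5, "OP_UNIQUE_CARRIER": "OO", "ORIGIN": "DFW", "DEST": "ORD"},
    {"DAY_OF_WEEK": 6, "OP_UNIQUE_CARRIER": "AA", "ORIGIN": "ORD", "DEST": "XXX"},
]

# Lookup sheets, modelled as code -> description dictionaries.
day_lookup = {5: "Friday", 6: "Saturday"}
carrier_lookup = {"OO": "SkyWest Airlines Inc.", "AA": "American Airlines Inc."}
airport_lookup = {
    "DFW": "Dallas/Fort Worth International",
    "ORD": "Chicago O'Hare International",
}


def left_join(rows, key, lookup, new_column):
    """Return copies of rows with new_column filled from lookup.

    Every row of the main table is kept; codes with no match get None,
    which is what makes this a left join rather than an inner join.
    """
    return [{**row, new_column: lookup.get(row[key])} for row in rows]


joined = left_join(flights, "DAY_OF_WEEK", day_lookup, "Day")
joined = left_join(joined, "OP_UNIQUE_CARRIER", carrier_lookup, "Unique Carrier")
joined = left_join(joined, "ORIGIN", airport_lookup, "OriginAirport")
joined = left_join(joined, "DEST", airport_lookup, "DestAirport")
```

The Airport lookup is applied twice, once for the origin code and once for the destination code, which is how a single Airport sheet can yield both OriginAirport and DestAirport. A left join keeps the row count of the main sheet unchanged, so comparing row counts before and after the join is a simple sanity check.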
Analysis
Looking at the dataset and its variables from several perspectives shows that a significant amount of delay is concentrated in certain airlines, airports, states, and days. The insights below are drawn from the visualizations and analyses.
Key Insight 1: Airlines differ in the amount of delay they accumulated in January 2020. Figure 1 shows that SkyWest Airlines Inc. has the most departure and arrival delay of any airline. Its total departure delay in January 2020 is 757939 minutes (early departure minutes are deducted from the total), and its total arrival delay is 967129 minutes. American Airlines Inc. has the second most delay.
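A net total of this kind, where early departures count as negative minutes, can be computed with a simple grouped sum. The sketch below uses hypothetical rows and a hypothetical helper name, `net_delay_by_carrier`; treating early arrivals as negative minutes and skipping missing values are assumptions, since the text only states that early departure minutes are deducted.

```python
from collections import defaultdict

# Hypothetical rows: (carrier, departure delay, arrival delay) in minutes.
# Negative values stand for early departures or arrivals.
rows = [
    ("SkyWest Airlines Inc.", 30, 42),
    ("SkyWest Airlines Inc.", -5, -8),
    ("American Airlines Inc.", 12, 20),
    ("American Airlines Inc.", None, None),  # no delay recorded (assumed possible)
]


def net_delay_by_carrier(rows):
    """Sum delay minutes per carrier; early minutes reduce the total."""
    totals = defaultdict(lambda: {"dep": 0, "arr": 0})
    for carrier, dep, arr in rows:
        if dep is not None:
            totals[carrier]["dep"] += dep
        if arr is not None:
            totals[carrier]["arr"] += arr
    return dict(totals)


totals = net_delay_by_carrier(rows)

# Rank carriers by net departure delay, largest first.
ranked = sorted(totals, key=lambda c: totals[c]["dep"], reverse=True)
```

Whether early minutes are deducted or clipped to zero changes the totals noticeably, so the choice should be stated next to any figure that reports them.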
Key Insight 2: Weekdays run from Monday to Friday, and the weekend is Saturday and Sunday. Figure 2 shows the days with the most departure and arrival delay. The most delay occurs on Friday, which is a weekday rather than part of the weekend: the total departure and arrival delays on the Fridays of January 2020 are 800438 minutes (early departure minutes are deducted from the total) and 1138014 minutes, respectively. Saturdays come second, with 758286 and 1005809 minutes, respectively. As Friday is the last weekday and Saturday is the first day of the weekend, this result is plausible.
Key Insight 3: Figure 3 shows that departure and arrival delays vary by state. For readability, the Figure 3 dashboard is limited to the top 10 states by departure and arrival delay. As an origin state, Texas has the most departure delay, 461143 minutes in January 2020. As a destination state, Texas also has the most arrival delay, 684785 minutes in January 2020. California is in 2nd place, Florida in 3rd, and Illinois in 4th. From 5th to 8th place, the rankings for departure and arrival delay differ: North Carolina is 5th for departure delay, whereas New York is 5th for arrival delay.
Key Insight 4: There are several reasons behind those delays. The Flight Information for January 2020 dataset records five of them, each in minutes: Carrier Delay, Weather Delay, National Air System Delay, Security Delay, and Late Aircraft Delay. Figure 4 shows that Carrier Delay is the largest single cause of delay in January 2020. The total delay attributed to each cause in January 2020 is 2032137 minutes for Carrier delay, 1691916 minutes for Late Aircraft delay, 1173609 minutes for National Air System delay, 378095 minutes for Weather delay, and 7493 minutes for Security delay. Security delay contributes the least to departure and arrival delays.
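To put the five totals into proportion, each can be expressed as a share of the minutes attributed to any cause. The short calculation below uses only the totals reported above; the variable names are illustrative.

```python
# Totals in minutes, as reported in Key Insight 4.
cause_minutes = {
    "Carrier": 2032137,
    "Late Aircraft": 1691916,
    "National Air System": 1173609,
    "Weather": 378095,
    "Security": 7493,
}

total = sum(cause_minutes.values())

# Share of all attributed delay minutes per cause.
shares = {cause: minutes / total for cause, minutes in cause_minutes.items()}

# Print causes from the largest share to the smallest.
for cause, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{cause:<20} {share:6.1%}")
```

On these figures, Carrier delay accounts for roughly 38% of the attributed minutes and Security delay for well under 1%.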
Key Insight 5: Figure 4 shows that Weather delay totals 378095 minutes. Weather delay follows different patterns across cities and days of the week. In Figure 5, the destination cities are sorted in descending order by total Weather delay. Chicago, Illinois has its highest Weather delay on Friday (16284 minutes), Dallas/Fort Worth, Texas on Monday (12299 minutes), and Atlanta, Georgia on Saturday (8916 minutes). The peak day therefore differs from city to city, and Weather delays at destination cities show no consistent day-of-week pattern. This is to be expected, because weather conditions differ between destination cities.
The Weather delays of departure cities are shown in Figure 6, where the departure cities are sorted in descending order by total Weather delay. Chicago, Illinois has its highest Weather delay on Friday (13139 minutes), Dallas/Fort Worth, Texas also on Friday (10673 minutes), and Atlanta, Georgia on Saturday (8661 minutes). Again, Weather delays at departure cities show no consistent day-of-week pattern, because weather conditions differ between departure cities. Because weather is seasonal, these results may look entirely different in other months of the year.
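The city-by-day comparison behind these two figures amounts to a two-level grouped sum followed by a per-city maximum. The sketch below shows the idea on hypothetical rows; the city labels, the minutes, and the helper name `weather_by_city_and_day` are illustrative and not taken from the dataset.

```python
from collections import defaultdict

# Hypothetical rows: (city, day of week, weather delay in minutes).
rows = [
    ("Chicago, IL", "Friday", 120),
    ("Chicago, IL", "Monday", 45),
    ("Chicago, IL", "Friday", 60),
    ("Atlanta, GA", "Saturday", 90),
    ("Atlanta, GA", "Friday", 30),
]


def weather_by_city_and_day(rows):
    """Build a city -> day -> total weather delay minutes table."""
    table = defaultdict(lambda: defaultdict(int))
    for city, day, minutes in rows:
        table[city][day] += minutes
    return table


table = weather_by_city_and_day(rows)

# Sort cities by total weather delay, largest first, as in the figures.
cities = sorted(table, key=lambda c: sum(table[c].values()), reverse=True)

# For each city, find the day of the week with the most weather delay.
peak_day = {city: max(table[city], key=table[city].get) for city in cities}
```

If the peak days in `peak_day` differ from city to city, as they do in Figures 5 and 6, no single day of the week can be singled out as the worst for weather.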
Key Insight 6: Security delays vary from airport to airport. Figure 4 shows that Security delay totals 7493 minutes in January 2020. The dashboard in Figure 6 shows the top 10 Departure Airports and Destination Airports by total Security delay in January 2020. Charlotte Douglas International Airport is the Departure Airport with the most Security delay, at 779 minutes in January. Dallas/Fort Worth International Airport is the Destination Airport with the most Security delay, at 376 minutes in January. Comparing the two bar charts of the dashboard shows that more Security delay occurred at Departure Airports than at Destination Airports.
Key Insight 7: Figure 6 showed the Security delays at Departure and Destination Airports. Figure 7 shows that some airports have more Departure delay than others. The treemap in Figure 7 shows higher Departure delays in red and lower Departure delays in green. In January 2020, the most Departure delay occurred at Dallas/Fort Worth International Airport, at 244664 minutes. The second most occurred at Chicago O’Hare International Airport, at 203476 minutes.
Key Insight 8: As with Departure delays, some airports have more Arrival delay than others. The treemap in Figure 8 shows higher Arrival delays in red and lower Arrival delays in green. In January 2020, the most Arrival delay occurred at Dallas/Fort Worth International Airport, at 351637 minutes. The second most occurred at Chicago O’Hare International Airport, at 327296 minutes.
Key Insight 9: Because the dataset has a flight date variable, Departure delays and Arrival delays can be tracked across the days of January 2020. Figure 9 is a dual-line diagram comparing Departure delays and Arrival delays over time. The highest Departure and Arrival delays occurred on January 16, 2020, at 279002 and 339664 minutes, respectively. The lowest Departure and Arrival delays occurred on January 29, 2020, at 3643 and 96184 minutes, respectively.
The dual-line diagram of Departure delays and Arrival delays in January 2020 shows some sharp movements. Figure 9 shows large increases in total delay on Jan 4, Jan 11, Jan 16, Jan 23, Jan 27, and Jan 30. These increases follow a pattern until January 16, after which the fluctuations appear random. The decreases in Departure delays and Arrival delays behave similarly. A seasonal pattern might emerge if data for the following months were available. A dual-line diagram of this kind, drawn on a daily basis, can be used to identify the daily pattern.
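The series plotted in such a dual-line diagram is a pair of daily totals. A minimal sketch of that aggregation follows; the rows and the helper name `daily_totals` are hypothetical, and only the two dates are borrowed from the text above.

```python
from collections import defaultdict
from datetime import date

# Hypothetical rows: (flight date, departure delay, arrival delay) in minutes.
rows = [
    (date(2020, 1, 16), 40, 55),
    (date(2020, 1, 16), 10, 5),
    (date(2020, 1, 29), -3, 8),
]


def daily_totals(rows):
    """Return {date: [departure total, arrival total]} in date order."""
    totals = defaultdict(lambda: [0, 0])
    for day, dep, arr in rows:
        totals[day][0] += dep
        totals[day][1] += arr
    return dict(sorted(totals.items()))


totals = daily_totals(rows)

# One line per day: date, weekday name, departure total, arrival total.
for day, (dep, arr) in totals.items():
    print(day.isoformat(), day.strftime("%A"), dep, arr)

# Days with the highest and lowest departure delay totals.
peak = max(totals, key=lambda d: totals[d][0])
trough = min(totals, key=lambda d: totals[d][0])
```

Printing the weekday next to each date makes it easier to judge whether spikes recur on the same day of the week or are scattered.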
Key Insight 10: Figure 10 splits the dual-line diagram of Figure 9 by day of the week, in order to check the pattern for each day. For every day of the week, Departure delays and Arrival delays rise and fall over the month. Some days, however, show higher delays overall than others: Friday, Saturday, and Thursday. In contrast, Tuesday and Wednesday show comparatively low Departure delays and Arrival delays.
Key Insight 11: Figure 11 compares the individual delay causes of the top 10 departure Airports with the Departure delays of those Airports. Figure 7 showed that the most Departure delay occurred at Dallas/Fort Worth International Airport, at 244664 minutes. Its delays can be broken down into Carrier Delay, Weather Delay, National Air System Delay, Security Delay, and Late Aircraft Delay. Figure 11 shows that Late Aircraft delay is the largest of these at Dallas/Fort Worth International Airport, at 115180 minutes. Late Aircraft delay is therefore the main contributor to Departure delays from Dallas/Fort Worth International Airport. The second most Departure delay occurred at Chicago O’Hare International Airport, at 203476 minutes. There, however, the main contributor to Departure delays is Carrier delay, at 92917 minutes.
Key Insight 12: Figure 12 compares the individual delay causes of the top 10 destination Airports with the Arrival delays of those Airports. Figure 8 showed that the most Arrival delay occurred at Dallas/Fort Worth International Airport, at 351637 minutes. Its delays can be broken down into Carrier Delay, Weather Delay, National Air System Delay, Security Delay, and Late Aircraft Delay. Figure 12 shows that Late Aircraft delay is the largest of these at Dallas/Fort Worth International Airport, at 118974 minutes. Late Aircraft delay is therefore the main contributor to Arrival delays at Dallas/Fort Worth International Airport. The second most Arrival delay occurred at Chicago O’Hare International Airport, at 327296 minutes. There, however, the main contributor to Arrival delays is Carrier delay, at 114535 minutes.
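Identifying the main contributor per airport, as in Key Insights 11 and 12, is a per-airport maximum over the five cause totals. The sketch below illustrates this on hypothetical airports and minutes; the cause keys and the helper name `dominant_cause` are assumptions made for the example.

```python
# Cause keys are illustrative short names for the five recorded delay causes.
CAUSES = ("carrier", "weather", "nas", "security", "late_aircraft")

# Hypothetical per-airport totals in minutes, keyed by cause.
airport_causes = {
    "Airport A": {"carrier": 500, "weather": 120, "nas": 300,
                  "security": 2, "late_aircraft": 650},
    "Airport B": {"carrier": 700, "weather": 200, "nas": 400,
                  "security": 0, "late_aircraft": 550},
}


def dominant_cause(cause_totals):
    """Return the cause with the most delay minutes; missing causes count as 0."""
    return max(CAUSES, key=lambda cause: cause_totals.get(cause, 0))


dominant = {
    airport: dominant_cause(cause_totals)
    for airport, cause_totals in airport_causes.items()
}
```

In this toy data the two airports have different leading causes, mirroring the contrast reported above between Dallas/Fort Worth (Late Aircraft delay) and Chicago O’Hare (Carrier delay).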
Conclusion
The analysis yields several key insights, which can be used from a variety of perspectives. Certain airlines, days of the week, and states account for a significant amount of the total delay: SkyWest Airlines Inc., Friday, and Texas show the highest amounts. There is, however, no consistent pattern in the delays of each city across the days of the week. Carrier delays cause the most delay, followed by Late Aircraft delays. Anyone who wants to avoid Security delays should be wary of Charlotte Douglas International and Dallas/Fort Worth International Airports. As the most Arrival and Departure delay occurred at Dallas/Fort Worth International Airport, passengers using this Airport should keep that in mind, and the local authorities in Texas should try to address its Late Aircraft delay problem.
Research Provided by Andrey Fateev
Appendix
Figure 1
Figure 2
Figure 3
Figure 4
Figure 5
Figure 6
Figure 7
Figure 8
Figure 9
Figure 10
Figure 11
Figure 12