As the Senior Director of IT Operations & Reliability, Sean Kelly has been involved in the design & reliability of FlightAware's infrastructure and the adoption of Site Reliability Engineering.
We are excited to share the second project in our series showcasing the work from our class of Summer 2023 interns. Our summer internships aren't limited to just software engineering. This week, we will take a look at a project done on our Site Reliability Engineering team. This is the second year that we've had an SRE intern. Mentoring someone on the ways of Site Reliability Engineering, something not really covered in coursework, is gratifying and rewarding. Especially when they bring the same excitement and interest to the table. With that, let's take a look at Will's project.
Hello! My Name is William Burns, and I am a Site Reliability Engineering Intern on the Systems team at Flight Aware. I am a Senior at Arizona State University studying Information Technology and plan on graduating with my bachelor’s degree in December 2023. Having had an interest in aviation and aerospace, FlightAware has provided me with the perfect opportunity to apply my technical skills toward a field I have a lot of passion for.
During my time here, I have been given the opportunity to work alongside talented engineers and has brought me experience that will surely benefit me for the rest of my career. Post-graduation, I would like to work as a Site Reliability Engineer and hope to return to FlightAware.
The primary objective of my project this summer was to centralize and automate FlightAware's management of firewall policies across its on-premise network infrastructure. A current focus for the Systems Team and FlightAware is having the ability to automate our network configurations. This project lays the groundwork for this goal by providing an easily extensible solution to firewall management.
I devised a solution in Python3 that extracts IPv4 and IPv6 network prefixes or IP addresses from NetBox, our data center infrastructure management software. Subsequently, these are formatted to be compatible with an Access Control List (ACL) generation tool, Capirca. This allows for the automatic generation of ACLs suitable for our Juniper SRX devices. Once produced, these ACLs are deployed directly onto the SRX devices. This automation alleviates the previously manual process of writing and applying ACLs, especially when introducing new network prefixes to the SRX devices drastically reducing the manual overhead.
For anyone not familiar with Access Control Lists, they are critical security components for network devices such as routers and switches. ACLs, contain sets of rules used to control and manage access to network resources based on specific criteria. These criteria can include factors like source and destination IP addresses, port numbers, and the protocol in use. Typically implemented on routers and switches, ACLs are an integral part of network security, providing a mechanism to explicitly allow or deny traffic based on predefined conditions. Whether it's to prevent unauthorized access, segment internal networks, or filter incoming and outgoing traffic, ACLs offer granular control at various layers of the Open Systems Interconnection (OSI) model, most notably the Network (Layer 3) and Transport (Layer 4) layers.
At FlightAware the nature of the services we offer and the vast amount of data we handle make network security a crucial part of our operations. ACLs allow us to finely control data traffic, ensuring that only authorized requests access the right data. This granularity is crucial given the volume and diversity of data we handle. This ensures not just security, but also optimal data traffic flow, helping maintain the responsiveness and accuracy of FlightAware's services. In essence, for the seamless operation of our technical infrastructure, ACLs serve as critical tools in FlightAware's network management arsenal.
The project is organized into several distinct components, each with its specific role: NetBox, Capirca, Docker, and the deployment of the ACLs to a chosen Juniper SRX device.
The NetBox components are tasked with pulling network prefixes from FlightAware's NetBox instance using the pynetbox library, searching for matches based on the "Tenant" or "Description" values. Once identified, these prefixes are recorded in a file named NETWORK.net, setting them up for being ingested Capirca.
Capirca's role is crucial in generating the ACLs. It utilizes the network prefixes from NetBox, along with a services object called SERVICES.svc that enumerates lists of ports and protocols, and policy objects (.pol) which tell Capirca how to generate the final security policy or ACL configuration.
Following the preparation and generation of the ACLs, the final but crucial phase is deploying them to the Juniper SRX device. For this, we utilize Juniper's PyEZ library that facilitates automation of network devices through the NETCONF protocol. Once connected to the targeted device, the prepared ACLs are then pushed to the device.
The Docker component provides the necessary environment for the entire process to run seamlessly. Using Docker Compose, we encapsulate all the dependencies and configurations into a consistent environment. Docker acts the cohesive bridge, integrating NetBox, Capirca, and the Juniper SRX device operations, guaranteeing that the system dependencies remain consistent irrespective of where the script is run. This not only simplifies deployment but also ensures reproducibility across various platforms and systems.
Overall, this project while fairly small in scope provided lots of exposure to new technologies and provided its fair share of challenges. One significant aspect was being introduced to Agile methodologies and using JIRA to track progress and issues. Both tools proved invaluable in streamlining our processes and staying organized. While an integral part of this internship was ultimately to complete our project, I think the takeaway extends far beyond that. A critical lesson learned was recognizing when I was going down an unproductive path and the need to pivot quickly. Being able to assess and redirect efforts efficiently was a standout skill I developed during this project. Being part of the Systems team provided firsthand experience of a professional work environment. Beyond just completing tasks, I gained insights into teamwork, process management, and the real-world application of engineering principles.
Thank you FlightAware!