The FluxNinja team had the opportunity to demo Aperture open source at the November 2022 edition of the Kubernetes Pune meetup, organized at Slack (Salesforce) India’s office.
Kubernetes Pune is a group for all who want to learn and share experiences about Kubernetes. This meetup group is for all skill levels, from beginners to experienced professionals. Every month, we meet and discuss various aspects of the Kubernetes ecosystem, such as service discovery, load balancing, networking, storage, and more.
As part of the Aperture talk, we explored several topics in reliability management:
- Cascading failures and how they affect modern cloud native applications
- Available solutions to handle cascading failures and their shortcomings
- Capabilities of Aperture as an all-in-one open source solution
- Demo of Aperture using Playground
Below are the key themes from our discussion on using Aperture for reliability management. If you are just getting started with reliability management using FluxNinja Aperture, we recommend first reading the following blog post and watching this intro video.
Discussion on reliability management with Aperture
Managing crawler traffic
Top of mind for several attendees was the problem of crawler traffic. Crawlers made up nearly two-thirds of all internet traffic in 2021, with malicious crawlers accounting for 40% of all traffic, according to research published by Barracuda Networks. So it isn't surprising that crawler traffic is one of the top use cases application owners want to deploy Aperture against.
Aperture provides powerful rate limiting tools for crawler traffic, such as static rate limiting or Rate Limiting Escalation. These policies allow you to specify how and when you want to drop requests coming from crawlers. More broadly, Aperture also lets you dynamically reconfigure the allow rates for different types of service traffic, using signals received from the application.
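To make the idea of a static rate limit on crawler traffic concrete, here is a minimal sketch in Python. It is not Aperture's policy language or API: the `TokenBucket` class, the `is_crawler` check, and the rate and burst values are all assumptions made for illustration. It uses a token bucket, one common way to implement a fixed rate limit.

```python
from dataclasses import dataclass, field


@dataclass
class TokenBucket:
    """Hypothetical static rate limiter: `rate` tokens per second, up to `burst`."""
    rate: float
    burst: float
    tokens: float = field(init=False)
    last: float = field(init=False, default=0.0)

    def __post_init__(self) -> None:
        self.tokens = self.burst  # start with a full bucket

    def allow(self, now: float) -> bool:
        # Refill in proportion to elapsed time, capped at the burst size.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


def is_crawler(user_agent: str) -> bool:
    # Naive illustrative check; real crawler detection is far more involved.
    lowered = user_agent.lower()
    return any(marker in lowered for marker in ("bot", "crawler", "spider"))


# Assumed limit for illustration: 1 crawler request per second, burst of 2.
crawler_bucket = TokenBucket(rate=1.0, burst=2.0)


def admit(user_agent: str, now: float) -> bool:
    """Admit all non-crawler requests; rate limit crawler requests."""
    if not is_crawler(user_agent):
        return True
    return crawler_bucket.allow(now)
```

With these assumed values, a crawler can send two requests back to back and is then held to one request per second, while other traffic is unaffected. The timestamp is passed in explicitly so the behavior is easy to reason about and test.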
Internal compared to user-facing services
The next discussion topic was the complex nature of modern application architecture, which often involves a mix of user-facing and internal microservices. In large applications, often only a few services interact directly with end-user traffic, while the rest are internal-facing only. If an internal service that doesn’t directly face customer traffic is getting overloaded, what kinds of reliability management interventions should be deployed?
If an internal service is getting overloaded, Aperture Agents notify end-user-facing services to prioritize traffic and start load shedding. Aperture does this using reliability policies through a component called the FluxMeter. FluxMeters collect signals from services, analyze them, and make recommendations to end-user-facing services on how to drop traffic based on assigned user priorities.
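The prioritized load shedding described above can be sketched as follows. This is an illustration only, not how Aperture implements it: the `Request` type, the convention that a higher number means higher priority, and the fixed `capacity` value are all assumptions. In practice the capacity would come from signals observed on the overloaded service.

```python
from typing import NamedTuple


class Request(NamedTuple):
    user: str
    priority: int  # assumption: higher number means more important


def shed_load(
    requests: list[Request], capacity: int
) -> tuple[list[Request], list[Request]]:
    """Admit up to `capacity` requests, highest priority first; drop the rest."""
    capacity = max(0, capacity)
    # sorted() is stable, so requests with equal priority keep arrival order.
    ranked = sorted(requests, key=lambda r: r.priority, reverse=True)
    return ranked[:capacity], ranked[capacity:]


incoming = [
    Request("guest", 1),
    Request("paid", 3),
    Request("trial", 2),
    Request("paid-2", 3),
]
admitted, dropped = shed_load(incoming, capacity=2)
```

Here the two priority-3 requests are admitted and the lower-priority ones are dropped, so the overload is absorbed by the traffic the application owner cares about least.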
Prioritizing types of user traffic
Not only is application architecture complex, but application traffic also comes from a variety of sources. A popular growth technique for many applications today is to use redirects. But in a service failure or overload scenario, can an application prioritize traffic from users who visited the application directly over traffic from those who came through redirects?
Aperture includes a powerful classification engine which, by default, extracts labels from request headers. In this scenario, Aperture can dynamically extract a label identifying the website the user traffic is coming from. You can then use this label to write a policy that decides how to prioritize these users compared to users who visit your application directly.
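As a rough sketch of header-based classification, the Python below derives a source label from the `Referer` header and maps it to a priority. None of this is Aperture's classifier syntax: the label names, the `OWN_HOST` value, and the priority numbers are hypothetical and chosen only to show the shape of the idea.

```python
from urllib.parse import urlparse

OWN_HOST = "app.example.com"  # hypothetical hostname of the application


def classify(headers: dict[str, str]) -> dict[str, str]:
    """Derive flow labels from request headers (illustrative only)."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    host = urlparse(normalized.get("referer", "")).hostname or ""
    if not host or host == OWN_HOST:
        return {"source": "direct"}
    return {"source": "redirect", "referrer": host}


def priority(labels: dict[str, str]) -> int:
    # Assumed policy: direct visitors rank above redirected visitors.
    return 100 if labels["source"] == "direct" else 50


direct = classify({})
redirected = classify({"Referer": "https://partner.example.org/deal"})
```

A policy could then feed `priority(labels)` into whatever scheduler decides which requests to keep during an overload.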
Finally, attendees at the meetup discussed the importance of cross-platform compatibility in a reliability management solution. While this was a Kubernetes meetup, not all applications operated by attendees were using Kubernetes yet. Folks wanted to know: could they use FluxNinja Aperture if their application was not deployed on Kubernetes?
Aperture has SDKs for Golang, Java, and Node.js. These SDKs provide an insertion mechanism for Flow Control similar to AuthAPI-based insertion with Envoy. The SDKs need to connect to Aperture Agents and the Controller to function. Agents can run anywhere, not just in Kubernetes. We will soon be launching Debian and RPM packages for running Aperture Agents in non-containerized environments. To install the Aperture Controller, you will need to spin up a small Kubernetes cluster. Learn more about installing Aperture Agents and Controller here.
Overall, we were thrilled to get feedback from the Kubernetes Pune community on Aperture and feel encouraged by all the ways companies can use this technology. A common theme during the meetup discussion was using Aperture to automate reliability management, in contrast to the status quo of manual interventions during service failures. This automation has advantages in both reducing costs and improving end-user experience.
Aperture is open source under the GNU General Public License (“GPL”) and is under active development and maintenance by the FluxNinja team. We would love to get your feedback and contributions to Aperture. You can sign up to become a contributor to the Aperture repository on GitHub here and read our documentation here.
If you are looking to get started with FluxNinja Aperture or have any questions, join our Slack community for best practices, questions, and discussions on reliability management.