
· 7 min read
Sudhanshu Prajapati

Graceful degradation and managing failures in complex microservices are critical topics in modern application architecture. Failures are inevitable and can cause chaos and disruption. However, prioritized load shedding can help preserve critical user experiences and keep services healthy and responsive. This approach can prevent cascading failures and allow for critical services to remain functional, even when resources are scarce.
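To make the idea of prioritized load shedding concrete, here is a minimal sketch. It is not taken from the talk or from any FluxNinja product; the `Request` type, the `shed_load` function, the priority values, and the "higher number means more critical" convention are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    # Hypothetical request model: `priority` is an assumed convention
    # where a higher number means a more critical user experience.
    name: str
    priority: int


def shed_load(requests, capacity):
    """Admit at most `capacity` requests, most critical first; shed the rest."""
    capacity = max(capacity, 0)
    # sorted() is stable, so requests with equal priority keep arrival order.
    ranked = sorted(requests, key=lambda r: r.priority, reverse=True)
    return ranked[:capacity], ranked[capacity:]


incoming = [
    Request("recommendations", 1),
    Request("checkout", 10),
    Request("search", 5),
    Request("payment", 10),
]

# With room for only two requests, the low-priority traffic is shed.
admitted, shed = shed_load(incoming, capacity=2)
print("admitted:", [r.name for r in admitted])
print("shed:", [r.name for r in shed])
```

When capacity is scarce, the critical `checkout` and `payment` requests are still served, while `search` and `recommendations` are the ones dropped.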

To help navigate this complex topic, Tanveer Gill, the CTO of FluxNinja, gave a talk at Chaos Carnival 2023 (March 15-16). The event was held virtually and the sessions were pre-recorded, but attendees could still interact with the speakers, who were present throughout their sessions.

· 14 min read
Sudhanshu Prajapati


Service meshes are becoming increasingly popular in cloud-native applications as they provide a way to manage network traffic between microservices. Istio, one of the most popular service meshes, uses Envoy as its data plane. However, to maintain the stability and reliability of modern web-scale applications, organizations need more advanced load management capabilities. This is where Aperture comes in. It offers several features, including:

· 19 min read
Sudhanshu Prajapati

Graceful Degradation

In today's world of rapidly evolving technology, it is more important than ever for businesses to have systems that are reliable, scalable, and capable of handling increasing levels of traffic and demand. Yet even the most well-designed microservices systems can experience failures or outages. Companies such as Uber, Amazon, Netflix, and Zalando have all faced massive traffic surges and outages. In the case of Zalando (a shoes and fashion company), the whole cluster went down; one contributing factor was high latency, which caused critical payment methods to stop working and affected both customers and the company. The outage cost the company money. Since then, companies have started adopting the graceful degradation paradigm.

· 7 min read
Tanveer Gill
Charu Jangid

A robust reliability automation strategy is essential for the successful management of cloud applications. It not only sets top-performing apps apart from the rest but also establishes trust with end customers and drives business success. Whether you are a small or large organization, investing in reliability management is crucial for ensuring the availability, performance, and consistency of your services.

In this blog post, we will introduce you to the fundamental principles of reliability automation, known as the Reliability Spectrum. Built on three key pillars (prevention, protection, and escalation & recovery), the Reliability Spectrum provides a comprehensive framework for maintaining a reliable cloud application. Join us as we delve into the details of each pillar and explore the essential components of a successful reliability automation strategy.

· 10 min read
Hasit Mistry

Rate limiting is a common requirement for any API service that needs to protect itself from malicious or accidental abuse. Aperture provides a powerful policy engine that can be used to implement rate limiting using the Rate Limiter component and Flow Classifier Rego rules.

In this blog post, we will specifically look at how to implement rate limiting for GraphQL queries. Let us begin by discussing what GraphQL is and why rate limiting on GraphQL queries is required.

· 5 min read
Hardik Shingala
Sudhanshu Prajapati

FluxNinja at Kubernetes Pune

The FluxNinja team had the opportunity to demo Aperture open source at the November 2022 edition of the Kubernetes Pune meetup, organized at Slack (Salesforce) India’s office.

Kubernetes Pune is a group for all Kubernauts who want to learn and share experiences about Kubernetes. This meetup group is for all skill levels, from beginners to experienced professionals. Every month, we meet and discuss various aspects of the Kubernetes ecosystem, such as service discovery, load balancing, networking, storage, and more.

· 10 min read
Sudhanshu Prajapati
Tanveer Gill

Highly available and reliable services are a hallmark of any thriving business in today’s digital economy. As a service owner, it is important to ensure that your services stay within their SLAs. But when bugs make it into production or user traffic surges unexpectedly, services can slow down under a large volume of requests and fail. If not addressed in time, such failures tend to cascade across your infrastructure, sometimes resulting in a complete outage.

At FluxNinja, we believe that adaptive concurrency limits are the most effective way to ensure services are protected and continue to perform within SLAs.
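As a rough illustration of what an adaptive concurrency limit can look like, the sketch below uses a generic additive-increase/multiplicative-decrease (AIMD) rule driven by observed latency. It is not Aperture's algorithm; the `AIMDLimit` class, its parameters, and the 100 ms latency target are assumptions made for the example.

```python
class AIMDLimit:
    """Hypothetical AIMD concurrency limit (illustrative, not Aperture's logic)."""

    def __init__(self, initial=10, min_limit=1, max_limit=100,
                 backoff=0.5, latency_target_ms=100.0):
        self.limit = initial
        self.min_limit = min_limit
        self.max_limit = max_limit
        self.backoff = backoff
        self.latency_target_ms = latency_target_ms

    def on_sample(self, latency_ms):
        """Update the limit from one observed request latency and return it."""
        if latency_ms > self.latency_target_ms:
            # Service looks overloaded: cut the limit multiplicatively.
            self.limit = max(self.min_limit, int(self.limit * self.backoff))
        else:
            # Service looks healthy: probe for more capacity one slot at a time.
            self.limit = min(self.max_limit, self.limit + 1)
        return self.limit


limiter = AIMDLimit(initial=10)
history = [limiter.on_sample(ms) for ms in (50, 250, 50)]
print(history)
```

The limit creeps upward while latency stays under the target and drops sharply as soon as it is exceeded, so the service sheds excess concurrency before slowdowns can cascade.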