Autonomous CloudOps at Scale: MontyCloud's DAY2 Platform and AWS Serverless Technologies

At MontyCloud we envisioned a future where organizations could focus on innovating in the cloud rather than being encumbered by the complexities of operating it. We predicted that a growing skills and tooling gap would create a new market category called CloudOps.

We purpose-built a modern, cloud-native, serverless Autonomous CloudOps platform that powers our flagship product, MontyCloud DAY2 Cloud Manager. Our product makes the public cloud more accessible and accelerates its benefits by simplifying operations, security, compliance, and cost. Our customers and partners automate and accomplish complex operational tasks in a few clicks without investing in specialized teams.

In this blog I am going to share how we built our event-driven, microservices-based SaaS platform, which has won multiple accolades and is a design win for the serverless paradigm. We were also a launch partner for the AWS Cloud Operations Competency, which demonstrates our deep technology bets and the value our modern platform delivers to our customers. AWS has also featured us as one of its top innovative startups.

I hope this blog provides value to you as a technologist and innovator. At MontyCloud we are always hiring the best. For example, we are looking for a Principal Platform Engineer who is passionate about building world-class platforms. If you are interested in that position or other opportunities, please check out our careers page. I am also eager to hear your feedback on our approach and will be happy to collaborate on technology topics in our focus areas.

 
The public Cloud is a powerful and complex machine that needs to be simplified

Companies across various industries and of all sizes are increasingly adopting cloud computing, recognizing the public cloud’s immense potential for innovation and scalability. Despite the challenges that accompany cloud adoption, its growth shows no signs of slowing down. The journey to cloud adoption is often fraught with difficulties, as traditional tools and practices employed by enterprises frequently prove insufficient or entirely irrelevant in this new landscape. Moreover, managing operations in the cloud presents even more significant challenges.

Customers managing and operating an ever-expanding array of cloud resources often confront compliance challenges, unplanned cost overages driven by underutilized or orphaned resources, and difficulties in upholding their part of the cloud's shared responsibility model for security. In addition, the total cost of cloud operations, which encompasses infrastructure expenses and the required skilled personnel, continues to grow. The cost of skilled personnel, in particular, is rising rapidly, as IT teams must now maintain the same level of compliance, security, and cost governance while adapting to a world where physical infrastructure is abstracted by APIs.

By simplifying the public cloud for our customers, we help them innovate rapidly to fully unlock the potential of the cloud. This helps our customers create value faster without investing time and resources in undifferentiated heavy lifting that comes with operating the cloud.

 
Introducing MontyCloud and the MontyCloud DAY2 Platform

MontyCloud is built on the singular vision of helping teams become cloud powerhouses. The MontyCloud DAY2 platform empowers IT teams to be more responsive to user needs by enabling self-service consumption and reducing the total cost of cloud operations, all while delivering a no-code approach to cloud operations excellence. MontyCloud DAY2 accomplishes this by automating every layer of cloud consumption. Customers gain instant, granular visibility into their cloud footprint across multiple accounts and regions, enforce best practices in security and compliance, and manage servers across on-premises, AWS, and other multi-cloud environments.

Additionally, the platform enables self-service provisioning of “Operations-Ready” cloud resources and environments from a catalog of well-architected Infrastructure-as-Code services. This allows teams to collaborate in the context of their business, streamlining cloud operations with just a few clicks.

 
AWS Serverless powers MontyCloud DAY2

As a startup, we understand the importance of focusing on delivering value to our customers and addressing their problems without being bogged down by infrastructure management. Our decision to go with serverless technology stems from the realization that, with limited resources, we need to maximize our potential to serve hundreds of customers and their infrastructure without facing infrastructure woes ourselves.

The public cloud operates as a highly programmable, event-driven machine, and our talented platform engineers recognized that autonomous cloud operations demand a cloud-native solution. This solution should empower customers to observe, collect, and react to cloud events at the scale at which they occur.

Our engineers designed MontyCloud DAY2 as an AWS-native platform that delivers well-managed cloud operations by serving as an always-on event observability layer, tracking every change in a customer’s cloud infrastructure. They understood that building such a system requires vigilance in monitoring potential bottlenecks and ensuring high availability. Delayed or lost events are simply not acceptable. To realize the full potential of autonomous CloudOps, MontyCloud DAY2 must be both a persistent observability layer and a system capable of reacting to incidents and anomalies in real time within customers’ accounts.

To achieve these goals, we utilized a range of AWS serverless technologies and hosted services, including AWS Lambda, SNS, SQS, CloudFront, RDS, DynamoDB, Systems Manager, AWS Config, CloudWatch, and more. For example, AWS Lambda offers a native event-driven programming model that aligns perfectly with our design principles. Lambda’s Function as a Service setup provides numerous benefits, most notably the ability to deploy single-function updates, which enables rapid iteration on improvements, failure management, and experimentation through function aliasing. Our engineers’ decision to use AWS serverless technology showcases their commitment to delivering a cutting-edge, reliable, and scalable solution for cloud operations while efficiently utilizing startup resources.
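To make the event-driven model concrete, here is a minimal sketch of a queue-triggered Lambda handler in Python. It is not MontyCloud's actual code: the `process_change_event` function and the `resourceId` field are hypothetical, and the sketch assumes an SQS event source with the `ReportBatchItemFailures` response type enabled, so that only the messages that failed are retried and the rest of the batch is not replayed.

```python
import json
from typing import Any, Dict, List


def process_change_event(event: Dict[str, Any]) -> None:
    """Hypothetical business logic: validate one resource-change event."""
    if "resourceId" not in event:
        raise ValueError("event is missing resourceId")
    # A real handler would update state or trigger a remediation here.


def handler(sqs_event: Dict[str, Any], context: Any = None) -> Dict[str, Any]:
    """Entry point for an SQS-triggered Lambda function.

    Returns the partial-batch response shape, listing only the message IDs
    that failed, so successfully processed messages are not retried.
    """
    failures: List[Dict[str, str]] = []
    for record in sqs_event.get("Records", []):
        try:
            process_change_event(json.loads(record["body"]))
        except Exception:
            # Report the failure instead of raising, so one bad message
            # does not cause the whole batch to be reprocessed.
            failures.append({"itemIdentifier": record["messageId"]})
    return {"batchItemFailures": failures}
```

Failed messages return to the queue after the visibility timeout and, if a dead-letter queue is configured, end up there after repeated failures, which is one way to avoid silently losing events.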

 
Scaling with Serverless

Addressing Unique Cloud Footprints and Scaling Patterns with AWS Lambda

Public cloud adoption involves customers consuming cloud services in diverse patterns, which may include a mix of server-based, serverless, and containerized workloads. This makes each customer’s cloud footprint and scaling patterns unique. Consequently, MontyCloud DAY2 must be built to scale effectively, predict workloads it needs to manage and operate, and adapt to varying scenarios.

AWS Lambda has proven invaluable in this aspect, handling not only scale-ups but also scale-downs, ensuring our functions and operations remain consistent across high and low workload situations. With Lambda, there is no need to manually provision additional compute resources or perform clean-up operations on our infrastructure. By leveraging Lambda and related services, MontyCloud DAY2 can discover, manage, and operate tens of thousands of resources per customer, including server workloads, all without having a single server to manage ourselves. This significantly reduces the operational burden associated with traditional building block services.

The absence of pre-provisioned infrastructure translates into substantial capital expense savings for MontyCloud, which we can pass on to our customers, making our platform even more attractive and cost-effective.

 
Domain-Driven Development

At MontyCloud, we place a strong emphasis on domain-driven design, which is a key design principle behind MontyCloud DAY2. Our platform is modelled using domain-driven design, and our Lambda functions are logically grouped into domains. Each Lambda within a domain shares a common domain data store, allowing for seamless access to domain data. Though our structure emerged naturally, the fact that other technology companies such as Uber have also recognized the merits of this approach further validates our choice and reinforces our confidence in the strategy.

By leveraging AWS Lambda layers, we can share and manage a common layer of utilities across Lambda functions. This approach promotes consistency and maintainability in our platform's design, ensuring that MontyCloud DAY2 remains robust, efficient, and scalable to meet the unique needs of our customers.
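The grouping described above can be sketched in a few lines of Python. Everything here is illustrative rather than MontyCloud's implementation: the "inventory" domain, the `InventoryStore` class, the `normalize_tags` helper, and both handler names are hypothetical. The in-memory store stands in for a real shared data store such as a DynamoDB table, and `normalize_tags` stands in for the kind of common utility that could be packaged in a Lambda layer.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


def normalize_tags(tags: Dict[str, str]) -> Dict[str, str]:
    """Shared utility: the sort of helper that could ship in a Lambda layer."""
    return {key.strip().lower(): value.strip() for key, value in tags.items()}


@dataclass
class InventoryStore:
    """In-memory stand-in for one domain's data store.

    In a deployed system each function would talk to a real shared store;
    a module-level object is NOT shared between separate Lambda functions.
    """
    items: Dict[str, Dict[str, Any]] = field(default_factory=dict)

    def put(self, resource_id: str, record: Dict[str, Any]) -> None:
        self.items[resource_id] = record

    def get(self, resource_id: str) -> Optional[Dict[str, Any]]:
        return self.items.get(resource_id)


STORE = InventoryStore()


def record_resource_handler(event: Dict[str, Any], context: Any = None) -> Dict[str, Any]:
    """Function 1 in the domain: writes a discovered resource to the store."""
    record = {"type": event["type"], "tags": normalize_tags(event.get("tags", {}))}
    STORE.put(event["resourceId"], record)
    return {"stored": event["resourceId"]}


def get_resource_handler(event: Dict[str, Any], context: Any = None) -> Dict[str, Any]:
    """Function 2 in the domain: reads from the same store."""
    record = STORE.get(event["resourceId"])
    return {"found": record is not None, "resource": record}
```

Both handlers depend only on the domain's store and the shared helper, so a function in another domain never needs to know how inventory data is laid out.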

Domain-driven design not only enhances the organization and structure of MontyCloud DAY2, but it also contributes to its adaptability and effectiveness in managing diverse cloud workloads and scaling patterns.

Figure 1: Domain Driven Serverless CloudOps Platform 

 
Continuous innovation without legacy bottlenecks

At MontyCloud, we prioritize continuous innovation without the limitations of legacy bottlenecks. By embracing cutting-edge technologies like AWS serverless services and adhering to robust design principles such as domain-driven development, we deliver scalable and efficient solutions for autonomous cloud operations.

Our diverse teams work in a decentralized environment that fosters collaboration and agility, allowing us to rapidly deploy new features and improvements multiple times each week. This not only guarantees a seamless customer experience but also nurtures a dynamic setting for engineers to experiment and innovate. MontyCloud is the ideal place for creative minds passionate about tackling complex challenges through distributed computing.

As we continue to grow and expand our capabilities, we are seeking talented platform engineers who share our enthusiasm for solving intricate problems in distributed computing. If you're looking to make a meaningful impact in the world of cloud operations, we invite you to join our skilled team at MontyCloud. Together, we'll drive the future of autonomous CloudOps at scale, embracing cutting-edge technology and an unwavering commitment to excellence. Apply today and be a part of our exciting journey.
