Voyager Labs needed the capability to analyze billions of data points and reveal deep, actionable insights from the collected data. On an average day, 10 billion records are collected. At an average of 1 KB per record, the daily volume of collected data amounts to roughly 10 TB. Data ingestion needs to support at least 120K events/second to keep up with that volume.
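The figures above can be checked with some quick arithmetic. The sketch below is only a back-of-the-envelope calculation; it assumes decimal units (1 KB = 1,000 bytes, 1 TB = 10^12 bytes) and a perfectly even arrival rate over the day, neither of which is stated in the text.

```python
# Back-of-the-envelope check of the stated ingestion numbers.
# Assumptions (not from the source): decimal units and a uniform arrival rate.
RECORDS_PER_DAY = 10_000_000_000   # 10 billion records per day (from the text)
AVG_RECORD_BYTES = 1_000           # ~1 KB per record (from the text)
TARGET_EVENTS_PER_SECOND = 120_000 # stated ingestion requirement
SECONDS_PER_DAY = 24 * 60 * 60

daily_bytes = RECORDS_PER_DAY * AVG_RECORD_BYTES
daily_terabytes = daily_bytes / 1e12

# Average rate if records arrived evenly around the clock.
avg_events_per_second = RECORDS_PER_DAY / SECONDS_PER_DAY

# How far the stated target sits above that average.
headroom = TARGET_EVENTS_PER_SECOND / avg_events_per_second

print(f"Daily volume: {daily_terabytes:.1f} TB")
print(f"Average rate: {avg_events_per_second:,.0f} events/second")
print(f"Headroom at 120K events/second: {headroom:.2f}x")
```

Under these assumptions the average rate comes out just under 116K events/second, so the 120K target covers the daily average with only a few percent to spare; real traffic with peaks would need more.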
What we did
After a careful and detailed evaluation of the customer's challenges, the following features were prioritized for the proposed solution and architecture:
- Elastic – the solution has to be easy to scale
- Managed – AWS managed services are preferred, to increase the speed of product development
- Secure – by utilizing AWS Identity and Access Management (IAM), access is centrally managed and governed
The AWS Cloud provides many of the building blocks required to help Voyager Labs implement a secure, flexible, and cost-effective data collection, storage and analytics platform. These include AWS managed services that help ingest, store, find, process, and analyze both structured and unstructured data.
Following the delivery of the solution by DoiT International, Voyager Labs now enjoys the following benefits:
- Fully managed Spark clusters with elastic scaling of the nodes and spot instances
- 300% cost reduction on disk capacity due to migration to Amazon S3
- Reliance on managed services, without the maintenance burden
- Serverless data ingestion using Amazon Kinesis and Kinesis Firehose
- Consistent research experience with Amazon SageMaker
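To give a feel for what the ingestion target means on the Kinesis side, here is a rough sizing sketch. It assumes a provisioned-mode Kinesis data stream and the commonly documented per-shard write limits of 1,000 records/second and 1 MB/second; the `estimate_shards` helper is hypothetical, the actual stream configuration used in this project is not described in the text, and current quotas should be checked before relying on these numbers.

```python
import math


def estimate_shards(events_per_second, avg_record_bytes,
                    records_per_shard=1_000, bytes_per_shard=1_000_000):
    """Estimate the minimum shard count for a given write load.

    The per-shard defaults are assumed write limits for a provisioned
    Kinesis data stream, not values taken from the case study.
    """
    # Shards needed to stay under the records-per-second limit.
    by_count = math.ceil(events_per_second / records_per_shard)
    # Shards needed to stay under the bytes-per-second limit.
    by_bytes = math.ceil(events_per_second * avg_record_bytes / bytes_per_shard)
    # Whichever limit is hit first determines the shard count.
    return max(by_count, by_bytes)


# 120K events/second at ~1 KB per record (figures from the text).
shards = estimate_shards(120_000, 1_000)
print(f"Estimated minimum shards: {shards}")
```

With 1 KB records both limits are reached at the same point, so the estimate is driven equally by record count and by throughput; larger records would make throughput the binding constraint.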