Our software agency developed a Marketing Customer Data Platform (CDP) designed to ingest data at scale from over 40 data providers. The primary objective was to build and maintain a comprehensive identity graph of more than 600 million individuals, each associated with over 400 properties. On top of that graph, we aimed to provide features that make marketing campaigns more targeted and more efficient to run.
"They are the best of the best, and they will meet your expectations and blow them out of the water!"
To handle the data influx from over 40 diverse data providers, we implemented highly scalable data pipelines. Built on distributed computing frameworks, the pipelines processed and transformed data from each source. This approach let us handle large data volumes while keeping the identity graph updated in real time or near real time.
Building and maintaining an identity graph encompassing more than 600 million individuals was a critical requirement for the CDP. Our team devised robust algorithms and implemented advanced data structures to manage this vast network effectively. This involved resolving duplicates, handling unique identifiers, merging data from multiple sources, and guaranteeing data integrity and accuracy. The identity graph formed the foundation for numerous platform features.
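One common way to resolve duplicates across sources is a union-find (disjoint-set) structure, in which any two records that share an identifier collapse into the same person. The sketch below is only an illustration of that general technique, not the platform's actual algorithm; the `IdentityGraph` class and the `email:` / `phone:` / `device:` identifier prefixes are hypothetical names chosen for the example.

```python
from collections import defaultdict


class IdentityGraph:
    """Illustrative union-find over identifiers (hypothetical, not the production design)."""

    def __init__(self):
        self.parent = {}

    def find(self, x):
        # Register unseen identifiers as their own root.
        self.parent.setdefault(x, x)
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Path compression: point every node on the path straight at the root.
        while self.parent[x] != root:
            self.parent[x], x = root, self.parent[x]
        return root

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[rb] = ra

    def add_record(self, identifiers):
        # All identifiers seen on one provider record belong to one person.
        ids = list(identifiers)
        if not ids:
            return
        self.find(ids[0])
        for other in ids[1:]:
            self.union(ids[0], other)

    def clusters(self):
        # Group identifiers by their root, one set per resolved person.
        groups = defaultdict(set)
        for x in list(self.parent):
            groups[self.find(x)].add(x)
        return list(groups.values())


graph = IdentityGraph()
graph.add_record(["email:a@example.com", "phone:555-0100"])
graph.add_record(["phone:555-0100", "device:abc"])
graph.add_record(["email:b@example.com"])
```

Here the first two records share a phone number, so their three identifiers resolve to one person, while the third record stays separate. A structure like this would need to be partitioned or persisted to work at hundreds of millions of individuals; the sketch only shows the merging logic.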
Acknowledging the growing concerns around privacy and the diminishing effectiveness of traditional cookies, we prioritized the development of a cookieless audience creation solution. Using pixel technology, our platform enabled customers to match their website and marketing campaign traffic with real individuals within the identity graph. This approach allowed for privacy-friendly and accurate audience targeting based on online activities, independent of cookies.
We designed and implemented a hassle-free, intuitive interface within the platform to enable customers to purchase audiences based on personally identifiable information (PII), people taxonomy, and data signals available within the CDP. This feature empowered customers to define their target audience using a variety of demographic, behavioral, and contextual attributes. Leveraging the rich data within the CDP, customers could create highly targeted and effective marketing campaigns.
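To make the idea of defining an audience from attributes concrete, the following is a minimal sketch of a composable filter over profile records. It assumes profiles are plain dictionaries; the `AudienceFilter` class and the attribute names (`age`, `interests`) are hypothetical and do not reflect the platform's real taxonomy or query layer.

```python
from dataclasses import dataclass, field
from typing import Any, Callable

Profile = dict[str, Any]


@dataclass
class AudienceFilter:
    """Hypothetical audience definition: a conjunction of attribute predicates."""

    predicates: list[Callable[[Profile], bool]] = field(default_factory=list)

    def where(self, attribute: str, test: Callable[[Any], bool]) -> "AudienceFilter":
        # A profile missing the attribute never matches that predicate.
        self.predicates.append(lambda p: attribute in p and test(p[attribute]))
        return self

    def select(self, profiles: list[Profile]) -> list[Profile]:
        return [p for p in profiles if all(pred(p) for pred in self.predicates)]


profiles = [
    {"id": 1, "age": 31, "interests": ["cycling", "travel"]},
    {"id": 2, "age": 52, "interests": ["cycling"]},
    {"id": 3, "age": 28},
]

audience = (
    AudienceFilter()
    .where("age", lambda a: 25 <= a <= 40)              # demographic attribute
    .where("interests", lambda i: "cycling" in i)       # behavioral attribute
)
selected_ids = [p["id"] for p in audience.select(profiles)]
```

In this example only profile 1 satisfies both conditions. In a platform of this size the same conditions would more plausibly be translated into a search query than evaluated in memory.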
To maximize the value of third-party audiences, we developed a mechanism to enrich them with our comprehensive identity graph. By merging third-party audience data with the CDP's dataset, we provided additional insights and enhanced targeting capabilities for our customers. This enrichment process involved employing sophisticated data matching algorithms, data cleansing techniques, and intelligent data fusion methods to ensure the accuracy and quality of the enriched audiences.
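The enrichment step can be pictured as a keyed join between a third-party audience and the identity graph. The sketch below assumes matching on a SHA-256 hash of the normalized email address, which is a widespread industry convention but is not confirmed as the method used here; `email_key`, `enrich`, and the property names are hypothetical.

```python
import hashlib


def email_key(email: str) -> str:
    """Normalize (trim, lowercase) and hash an email so raw PII is not used as the join key."""
    return hashlib.sha256(email.strip().lower().encode("utf-8")).hexdigest()


def enrich(audience, graph_index):
    """Attach identity-graph properties to each third-party row (assumed matching rule)."""
    enriched = []
    for row in audience:
        match = graph_index.get(email_key(row["email"]), {})
        # Third-party values win on conflicts; graph properties fill the gaps.
        enriched.append({**match, **row, "matched": bool(match)})
    return enriched


# Hypothetical slice of the identity graph, indexed by hashed email.
graph_index = {
    email_key("ada@example.com"): {"age_band": "35-44", "region": "EU"},
}
third_party = [{"email": " Ada@Example.com "}, {"email": "nobody@example.com"}]
result = enrich(third_party, graph_index)
```

The first row matches despite different casing and whitespace because both sides are normalized before hashing; the second row passes through unenriched and is flagged as unmatched.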
To facilitate seamless integration with other marketing platforms, we implemented a robust mechanism to export data from the CDP to third-party platforms. This feature enabled customers to effortlessly activate their audiences and leverage the data within their existing marketing workflows. We ensured compatibility with various data exchange formats and APIs, enabling smooth data transfer and integration with popular advertising and marketing platforms.
To handle the data volume, over 14 billion data points per week totaling approximately 5 TB, we built a data engineering infrastructure on AWS. Providers delivered data to S3 buckets, Kinesis data streams carried it through the pipelines, and Lambda functions and Docker containers running on ECS structured and linked it in a scalable manner. The pipelines comprised multiple stages, including a data lake built on Apache Spark and an Elasticsearch database for high-speed searches.
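As an illustration of one pipeline stage, the sketch below shows the general shape of a Lambda function consuming a batch from a Kinesis data stream. The event layout (`Records[].kinesis.data`, base64-encoded) is the standard one Lambda delivers for Kinesis sources; everything else, including the JSON payload format and the `person_id` and `provider` field names, is an assumption made for the example.

```python
import base64
import json


def handler(event, context=None):
    """Decode a Kinesis batch and keep only well-formed data points (assumed JSON payloads)."""
    accepted, rejected = [], 0
    for record in event.get("Records", []):
        # Kinesis record payloads arrive base64-encoded inside the Lambda event.
        payload = base64.b64decode(record["kinesis"]["data"])
        try:
            point = json.loads(payload)
        except ValueError:  # covers invalid JSON and invalid UTF-8
            rejected += 1
            continue
        # "person_id" is a hypothetical required field.
        if not isinstance(point, dict) or "person_id" not in point:
            rejected += 1
            continue
        accepted.append(point)
    return {"accepted": accepted, "rejected": rejected}


def _record(raw: bytes) -> dict:
    """Build a fake Kinesis record for local testing."""
    return {"kinesis": {"data": base64.b64encode(raw).decode("ascii")}}


sample_event = {
    "Records": [
        _record(json.dumps({"person_id": "p1", "provider": "acme"}).encode()),
        _record(b"not json"),
        _record(json.dumps({"provider": "acme"}).encode()),
    ]
}
summary = handler(sample_event)
```

Running the handler on `sample_event` accepts the first record and rejects the other two, one for malformed JSON and one for the missing field. A real stage would forward accepted points to the next step rather than return them.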
Collaborating closely with the client, we focused on creating a scalable, visually appealing user interface that aligned with their vision while showcasing the technical capabilities of the system. Our team employed a product discovery framework guided by our Product Specialists and Designers, utilizing Figma prototypes. Subsequently, we developed a React Progressive Web App alongside a GraphQL Apollo Server backend to bring the application to life.
After five months of dedicated development, we successfully delivered an MVP encompassing most of the aforementioned aspects. This allowed the company to validate its market fit and obtain valuable user feedback. Moving forward, we remain committed to continuously enhancing the product by adding new features and improving existing functionalities. We aim to ensure that our clients consistently derive maximum value from the platform, empowering them to drive impactful marketing campaigns and achieve their business goals.