Big Data Democratization

For a global company like Robert Half, data is a critical asset. But having a lot of data isn’t the same as being able to use it effectively. When information is scattered across different departments and systems, it’s hard to get a complete picture, and operations become costly and inefficient. That was the situation at Robert Half. Adanto worked with the company to build a better way to manage its data, making it more accessible and useful for everyone. The result was a modern data platform that not only lowered costs but also made it easier for the business to make smarter, faster decisions.


About Robert Half
Robert Half is a global leader in professional staffing and consulting services. With over 345 locations worldwide, the company helps businesses find skilled professionals in finance, technology, and other fields. Their extensive network and long-standing reputation make them a key player in the human resources industry.
For Adanto Software, this was a significant partnership, as it gave us the opportunity to apply our expertise to a complex, global business challenge.
Key Results
$2.6M
Annual cost savings from IT infrastructure optimization, automation & efficiency.
$1M+
Annual productivity gains from time & effort savings and faster decision-making.
$500k+
Business enablement gains from improved SLA compliance & new insights, innovation & agility.
Technologies Used
Data Sources/Silos:
- 60+ data sources
- 200+ GB of new data per day
Data Storage & Infrastructure:
- Amazon S3
- Amazon EC2
- Amazon Redshift
- Amazon RDS
- Apache Sqoop
- HDFS (Parquet) on Amazon EMR
Query Tools & Analytics:
- Apache Hive
- Apache Pig
- Apache Spark
- R
- Mahout/scikit-learn
- QlikView
- Power BI
- SAS
The Challenge
Robert Half’s data was trapped in multiple silos. This made it difficult for different departments to share information, and analysts couldn’t easily access the raw data they needed. The problem wasn’t just about access; it was also about cost. With different servers and software licenses spread across the organization, the IT infrastructure was both complex and expensive to maintain, and it was hard to scale up for new projects. Robert Half needed a solution that would centralize its data, make it accessible, and do so in a way that was affordable and easy to scale.
Key Goals
- Centralized Data Repository: Create a single source for raw data that all departments could use.
- Improved Data Governance: Establish procedures for data loading and management to ensure data quality and reliability.
- Targeted Data Access: Develop data marts for specific departments and business lines to provide them with the information they needed without unnecessary complexity.
- Custom Analytic Applications: Build applications tailored to solve specific business problems and support decision-making.
The Solution
Adanto Software built a scalable and cost-effective big data lake on Amazon Web Services (AWS). This platform consolidates all of Robert Half’s disparate data sources into one central location. Instead of being trapped in silos, data is now stored in its native format in Amazon S3 buckets. This approach not only eliminates the need for expensive, purpose-built data stores but also makes it easy to share data across the entire organization.
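To illustrate how raw files from many sources can share one S3 bucket without colliding, here is a minimal Python sketch of an object-key convention. The layout (a `raw/` prefix with `source=`, `table=`, and `load_date=` segments) and the function name `raw_object_key` are assumptions made for this example, not the naming scheme used in the project.

```python
from datetime import date


def raw_object_key(source: str, table: str, load_date: date, filename: str) -> str:
    """Build an S3 object key for a raw file, kept in its original format.

    The key layout is hypothetical: one prefix per source system, table,
    and load date, so each daily delivery lands in its own "folder".
    """
    return (
        f"raw/source={source}/table={table}/"
        f"load_date={load_date.isoformat()}/{filename}"
    )


if __name__ == "__main__":
    # Hypothetical source system, table, and file name.
    print(raw_object_key("crm", "candidates", date(2020, 1, 15), "export.csv"))
```

A convention along these lines lets query engines and downstream jobs select one source or one day of data by prefix alone.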
The new infrastructure is designed for efficiency and flexibility. We used the Apache Parquet file format with Hadoop/Hive for structured querying, and an autoscaling Hadoop/Spark cluster on AWS handles heavy data processing. For daily data ingestion, we set up incremental data load processes using Apache Sqoop on an EMR cluster. This entire setup gives Robert Half a single, unified view of its data, while also keeping costs in check.
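As a rough sketch of what a daily incremental load with Apache Sqoop can look like, the Python helper below assembles the command line for an append-mode incremental import written as Parquet. The Sqoop options used (`--incremental append`, `--check-column`, `--last-value`, `--target-dir`, `--as-parquetfile`) are standard Sqoop 1 import arguments; the JDBC URL, table, column, and directory are hypothetical placeholders, and the actual jobs built for Robert Half may differ.

```python
import shlex
from typing import List


def build_incremental_import(
    jdbc_url: str,
    table: str,
    check_column: str,
    last_value: int,
    target_dir: str,
) -> List[str]:
    """Return the argv for a Sqoop append-mode incremental import.

    Only rows whose check_column is greater than last_value are imported,
    so each daily run picks up where the previous one stopped.
    """
    return [
        "sqoop", "import",
        "--connect", jdbc_url,
        "--table", table,
        "--incremental", "append",
        "--check-column", check_column,
        "--last-value", str(last_value),
        "--target-dir", target_dir,
        "--as-parquetfile",  # write the imported rows as Parquet files
    ]


if __name__ == "__main__":
    # All values below are placeholders for illustration only.
    args = build_incremental_import(
        jdbc_url="jdbc:mysql://db.example.internal/staffing",
        table="placements",
        check_column="placement_id",
        last_value=120000,
        target_dir="/data/raw/placements",
    )
    print(shlex.join(args))
```

In practice the last imported value has to be persisted between runs, for example with a saved Sqoop job or a small state table, so that the next load starts from the correct watermark.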