Automated ETL Migration from DataStage to Azure Data Factory
A Canadian public research university, as part of modernizing its legacy on-premises Netezza data warehouse on Azure, required a solution to efficiently migrate ETL developed in DataStage to Azure Data Factory (ADF), with the goals of reducing costs and enhancing its EDW, reporting, and analytics applications.
Client Challenges and Requirements
- 143 Parallel Jobs and 12 Sequence Jobs to be converted from DataStage to ADF.
- Strict timeline due to IBM DataStage and Netezza licenses nearing expiration.
- Challenges due to default behavioral differences between source and target technologies.
Bitwise provided migration services using its proprietary automated ETL conversion solution to accelerate the migration timeline within budget requirements.
- Conversion completed in 12 weeks to accelerate time-to-production.
- Automated Data Validation with full and sample validations using Bitwise’s Python-based data validation accelerator (reducing effort by 35%). The detailed reports certified the correctness of the migrated code, giving stakeholders confidence.
- Loaded and Validated data for ~100 tables row by row and column by column.
- Conducted POCs to decide on uniform solution approaches instead of scenario-specific solutions. This helped retain auto-conversion usage and kept the required effort predictable well in advance.
- Developed a reusable error handling and alerting pipeline for rejected records in Synapse.
- Dark Data Element Discovery: identified and cleaned up dormant jobs and dead code.
- Timely solutions to challenges due to differences in source / target technologies.
- Performance Optimization of all 12 Sequence and 9 Singleton Parallel Jobs.
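The row-by-row, column-by-column validation described above can be illustrated with a minimal Python sketch. This is not Bitwise's validation accelerator. The function name `validate_table`, the dictionary-per-row representation, and the `sample_size` parameter are all hypothetical, chosen only to show the idea of a full versus sample comparison between a source and a target table.

```python
import random
from typing import Dict, Iterable, List, Optional, Tuple

Row = Dict[str, object]
Mismatch = Tuple[object, str, object, object]  # (key, column, source value, target value)


def validate_table(
    source_rows: Iterable[Row],
    target_rows: Iterable[Row],
    key: str,
    sample_size: Optional[int] = None,
    seed: int = 0,
) -> List[Mismatch]:
    """Compare source and target rows matched on `key`, column by column.

    With sample_size=None every source row is checked (full validation) and
    rows that exist only in the target are reported as well. Otherwise a
    reproducible random sample of source keys is checked (sample validation).
    """
    source_by_key = {row[key]: row for row in source_rows}
    target_by_key = {row[key]: row for row in target_rows}

    keys = sorted(source_by_key)
    if sample_size is not None and sample_size < len(keys):
        keys = sorted(random.Random(seed).sample(keys, sample_size))

    missing = object()  # sentinel for a column absent from the target row
    mismatches: List[Mismatch] = []
    for k in keys:
        src = source_by_key[k]
        tgt = target_by_key.get(k)
        if tgt is None:
            mismatches.append((k, "<row>", "present", "missing"))
            continue
        for column, src_value in src.items():
            tgt_value = tgt.get(column, missing)
            if tgt_value is missing:
                mismatches.append((k, column, src_value, "<missing column>"))
            elif src_value != tgt_value:
                mismatches.append((k, column, src_value, tgt_value))

    if sample_size is None:
        # Full validation also flags rows that appear only in the target.
        for k in sorted(set(target_by_key) - set(source_by_key)):
            mismatches.append((k, "<row>", "missing", "present"))
    return mismatches
```

An empty result means the checked rows agree in every column. In practice the rows would be read from the legacy warehouse and the Azure target rather than built in memory, and a real report would also cover data types and row counts.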
Tools & Technologies We Used
- Bitwise ETL Converter
- Azure Data Factory (ADF)
- Bitwise data validation utility
Expedited conversion of 63 SCD jobs with customization in auto conversion.
20-30% Cost Savings achieved in Azure.
Avoided DataStage license renewal costs by completing the migration within a tight timeline despite delays on the customer's end.
5/5 Customer Satisfaction Score based on consistent communication and the quality and timeliness of deliverables.