Hi, I'm Raj!

I have 19 years of design, development and ownership experience with enterprise-level data platforms in the FinTech (banking and insurance) domain. I worked for about 4 years in customer-facing roles in international locations (USA, Sweden).

On the business front, I have led the development of data products and analytical applications that are part of the banking business process workflows.

On the technology front, I have led end-to-end data engineering projects across on-premises data warehouse, data lake and public cloud platforms, and have designed and developed data ingestion and distribution frameworks.

On the governance front, I have exposure to data governance policies (based on BCBS, GDPR, CCPA - data ownership, embedded governance, continuous monitoring, governance council), procedures and tooling (data quality, reconciliation, lineage, data provision/masking).

I have led data engineering teams through all the SDLC phases by following waterfall and agile scrum / kanban methodologies.

I have played roles such as Data Platform Leader, Data Engineering Manager, IT System Owner, Data Modeller, Data Architect, Subject Matter Expert and Information Architecture Chapter Lead.

I have experience in descriptive / prescriptive analytics (Pricing, Marketing, Sales, Operational, Business Management, Risk) and have exposure to predictive analytics (AI/ML model life cycle, MLOps, LLM with RAG, NLP, GenAI).

Please get in touch if you think my professional experience can help solve exciting data engineering problems.

rajakumaranarivumani

  • Data Warehouse
    • Teradata
    • Informatica PowerCenter
    • ETL/ELT
    • SQL
    • UNIX
    Expert
  • Data Lake
    • Hive
    • Spark
    • Python
    Intermediate
  • Cloud
    • Snowflake
    • Azure Data Factory
    • Azure Databricks
    • Azure Functions
    • Azure Datalake Storage
    • AWS Glue
    • AWS Lambda
    • AWS S3
    Intermediate
  • Web App
    • Angular 2
    • HTML
    • Bootstrap
    • Flask
    • Spring Boot
    • Docker Containers
    Beginner
  • ML / AI / GenAI
    • Decision Tree
    • NLP NER
    Intermediate

Certifications

...

AWS

Dec 2024

...

Microsoft

Nov 2024

...

Duke University

Oct 2024

...

Microsoft

May 2021

...

Microsoft

March 2021

...

University of Michigan

May 2020

...

UC Berkeley

August 2015

...

Duke University

November 2014

...

IBM

November 2006

Recognitions

2024

  • Rated as Exceeds Expectations (top rating) in the performance appraisal.

2023

  • Rated as Exceeds Expectations (top rating) in the performance appraisal.

2022

  • Rated as Exceeds Expectations (top rating) in the performance appraisal.

2021

  • Rated as Exceeds Expectations (top rating) in the performance appraisal.

2020

  • Rated as Exceeds Expectations (top rating) in the performance appraisal.

2019

  • Received Morgan Stanley certificate of recognition for demonstrating the core value ‘Putting Clients First’ through outstanding delivery in 2019.

  • Rated as Exceeds Expectations (top rating) in the performance appraisal.

2018

  • Rated as Exceeds Expectations (top rating) in the performance appraisal.

2017

  • Received Morgan Stanley certificate of recognition for leading with exceptional ideas in 2017.

  • Received Morgan Stanley certificate of recognition for demonstrating the core value ‘Putting Clients First’ through outstanding delivery in 2017.

  • Rated as Exceeds Expectations (top rating) in the performance appraisal.

2016

  • Received Morgan Stanley certificate of recognition for technical expertise in 2016.

2014

  • Rated as Outstanding Contributor (top rating) in the performance appraisal.

2012

  • Received IBM certificate 'Best of IBM' in 2012.

  • Received IBM certificate 'IBM Top Talent' in 2012.

  • Rated as Outstanding Contributor (top rating) in the performance appraisal.

2011

  • Rated as Outstanding Contributor (top rating) in the performance appraisal.

2009

  • Rated as Outstanding Contributor (top rating) in the performance appraisal.

2008

  • Received IBM certificate 'IBM Top Talent' in 2008.

2007

  • Received IBM certificate 'Best of IBM' in 2007.

Project Portfolio

Data Modernization

Banking Data Lakehouse / Cloud

Leading a bank data modernization program that publishes banking datasets from the on-prem data platforms (Teradata and Hadoop) to Snowflake on the Microsoft Azure public cloud platform.

Learn more
Banking Data Lake

Designed the data lake storage layer and data ingestion framework and led delivery of the data pipelines for a US based investment bank.

Learn more

Data and Reporting

Banking Data Warehouse

Managed the banking data warehouse for a lending product with 10+ data sources and 25+ consumer groups for a US based investment bank.

Learn more
Metrics Data Warehouse

Designed and led delivery of the data warehouse from scratch for a US based Insurance provider.

Learn more
Data Virtualization

Designed and led delivery of the data distribution layer using a data virtualization platform for a US based investment bank.

Learn more
Regulatory Reporting

Designed and led delivery of data pipelines that feed data into regulatory reporting (CCAR) systems for a US based investment bank.

Learn more
Lending Sub-Servicer Onboarding

Designed and led delivery of data pipelines as part of the lending sub-servicer onboarding for a US based investment bank.

Learn more

Data Visualization

Net Balance Flow

Developed data pipelines to calculate the balance flows across multiple products for a US based investment bank. The results were aggregated and presented to stakeholders at various levels of the firm via a dashboard.

Learn more

Data Analytics

Relationship Pricing

Designed and led delivery of an analytics application for a US based investment bank to determine the pricing of new loans based on existing relationships that the client has with the firm.

Learn more
Rate Adjustment Eligibility

Designed and led delivery of an analytics application for a US based investment bank to determine the mortgage loans eligible for rate adjustment. This involves calculating the cash flows and internal rate of return for eligible products.

Learn more
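To illustrate the internal rate of return calculation mentioned above, here is a minimal sketch in Python. It is not the application's actual code: the function names `npv` and `irr`, the bisection search, the search bounds and the sample cash flows are all assumptions made for this example.

```python
from typing import Sequence


def npv(rate: float, cash_flows: Sequence[float]) -> float:
    """Net present value of cash flows one period apart, the first at t=0."""
    return sum(cf / (1.0 + rate) ** t for t, cf in enumerate(cash_flows))


def irr(cash_flows: Sequence[float], low: float = -0.99, high: float = 10.0,
        tol: float = 1e-9, max_iter: int = 200) -> float:
    """Find the rate at which NPV is zero, using bisection.

    Assumes NPV changes sign exactly once between `low` and `high`
    (hypothetical default bounds chosen for this sketch).
    """
    f_low = npv(low, cash_flows)
    f_high = npv(high, cash_flows)
    if f_low * f_high > 0:
        raise ValueError("NPV does not change sign in the search interval")
    for _ in range(max_iter):
        mid = (low + high) / 2.0
        f_mid = npv(mid, cash_flows)
        if abs(f_mid) < tol or (high - low) / 2.0 < tol:
            return mid
        if f_low * f_mid < 0:
            # Root lies in the lower half of the interval.
            high = mid
        else:
            # Root lies in the upper half of the interval.
            low, f_low = mid, f_mid
    return (low + high) / 2.0


if __name__ == "__main__":
    # Hypothetical loan: 1000 disbursed, then three repayments of 400.
    print(round(irr([-1000.0, 400.0, 400.0, 400.0]), 6))
```

Bisection is used here only because it is simple and needs no external packages; a production system would more likely rely on a tested numerical library.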
Risk Models

Designed and led delivery of data pipelines that feed data into the market/credit (Basel) risk models.

Learn more
Anti Money Laundering

Designed and led delivery of the data pipelines to feed customer and transaction data into an Anti Money Laundering solution for a bank in the Nordic region.

Learn more

Data Governance

KYC Data Reconciliation

Designed and led delivery of an automated process to reconcile the KYC data feeding into the compliance processes for a US based investment bank.

Learn more
Data Reconciliation Hub

Designed and developed the Data Reconciliation Hub application for a US based investment bank to reconcile data from various data sources feeding into the data warehouse and data lake.

Learn more
Data Lineage

Designed and developed the Data Lineage web application which is used for impact analysis, identifying sources and consumers of datasets and tracking the process lineage from source to target interfaces.

Learn more
Data Quality Service Open Source

A CLI app / micro service to validate the source data with configurable data quality rules using the open source Great Expectations package.

Learn more See github repository
Data Quality Anomaly Detection Service (ML App) Open Source

A CLI app / micro service (ML application) to detect data quality anomalies in the source data using the open source XGBoost package.

Learn more See github repository
Data Profiling Service (AI App) Open Source

A CLI app / micro service to profile the data. This uses a Natural Language Processing (NLP) Named Entity Recognition (NER) model to classify the data as PII vs Non-PII.

Learn more See github repository
Data Catalog Service (GenAI App) Open Source

A CLI app / micro service to create asset definitions for the datasets that will be published to the data catalog. The catalog can be queried (e.g. for datasets, data elements) using an OpenAI GPT model. The data dictionary, system and business glossaries are used for the RAG context.

Learn more See github repository
Data Reconciliation Service Open Source

A CLI app / micro service to reconcile the source data with the reconciliation control measures received from the source. Reconciliation controls (columns, aggregates) can be configured.

Learn more See github repository
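The idea of reconciling source data against configurable control measures can be sketched as follows. This is an illustration only, not the code in the repository: the `ReconResult` class, the `reconcile` function and the `row_count` / `sum:<column>` control naming are hypothetical conventions invented for this example.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ReconResult:
    measure: str
    expected: float
    actual: float

    @property
    def matched(self) -> bool:
        # Small tolerance to absorb floating point rounding.
        return abs(self.expected - self.actual) < 1e-6


def reconcile(rows: List[Dict[str, float]],
              controls: Dict[str, float]) -> List[ReconResult]:
    """Compare measures computed from the data with the source's controls.

    Control keys are either "row_count" or "sum:<column>"
    (a hypothetical convention for this sketch).
    """
    results = []
    for measure, expected in controls.items():
        if measure == "row_count":
            actual = float(len(rows))
        elif measure.startswith("sum:"):
            column = measure.split(":", 1)[1]
            actual = float(sum(row[column] for row in rows))
        else:
            raise ValueError(f"Unsupported control measure: {measure}")
        results.append(ReconResult(measure, float(expected), actual))
    return results


if __name__ == "__main__":
    # Made-up sample data and control values.
    data = [{"amount": 10.0}, {"amount": 15.5}]
    for result in reconcile(data, {"row_count": 2, "sum:amount": 25.5}):
        print(result.measure, result.matched)
```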
Data Lineage Service Open Source

A CLI app / micro service to capture the lineage at the dataset level.

Learn more See github repository

Data Architecture

Application Calendar Service Open Source

A CLI app / micro service to derive the effective business dates based on the batch schedules and holidays (NYSE, Federal Bank holidays).

Learn more See github repository
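Deriving an effective business date from a run date and a holiday list can be sketched as below. This is a simplified illustration, not the service's actual logic: the function names are hypothetical, weekends are assumed to be Saturday and Sunday, and the holiday set holds a single sample date rather than a full NYSE or federal calendar.

```python
from datetime import date, timedelta
from typing import Set


def is_business_day(day: date, holidays: Set[date]) -> bool:
    """A business day is a weekday (Mon-Fri) that is not a listed holiday."""
    return day.weekday() < 5 and day not in holidays


def previous_business_day(run_date: date, holidays: Set[date]) -> date:
    """Return the latest business day strictly before run_date."""
    day = run_date - timedelta(days=1)
    while not is_business_day(day, holidays):
        day -= timedelta(days=1)
    return day


if __name__ == "__main__":
    # Sample holiday list with one entry (US Independence Day 2024).
    sample_holidays = {date(2024, 7, 4)}
    print(previous_business_day(date(2024, 7, 5), sample_holidays))
```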
Metadata Management Service Open Source

A CLI app / micro service to centrally store, manage and serve the metadata to the data management framework. This covers datasets, data quality rules, data reconciliation rules, schedules and holidays.

Learn more See github repository
Data Ingestion Service Open Source

A CLI app / micro service to ingest delimited text files into data lake tables using the PySpark API.

Learn more See github repository
Data Distribution Service Open Source

A CLI app / micro service to extract data from the data lake tables using the PySpark API.

Learn more See github repository
Serverless ETL Orchestration Framework Open Source

A serverless ETL orchestration framework in the Microsoft Azure public cloud platform. The framework covers data ingestion, curation and delivery aspects while providing configurable data governance controls.

Learn more See github repository