I have 19 years of experience designing, developing and owning enterprise-level data platforms in the FinTech (Banking and Insurance) domain, including about 4 years in customer-facing roles in international locations (USA, Sweden).
On the business front, I have led the development of data products and analytical applications that are part of the banking business process workflows.
On the technology front, I have led end-to-end data engineering projects across on-premise data warehouse, data lake and public cloud platforms and have designed and developed data ingestion and distribution frameworks.
On the governance front, I have exposure to data governance policies (based on BCBS, GDPR, CCPA - data ownership, embedded governance, continuous monitoring, governance council), procedures and tooling (data quality, reconciliation, lineage, data provision/masking).
I have led data engineering teams through all SDLC phases using waterfall and agile (Scrum / Kanban) methodologies.
I have played roles such as Data Platform Leader, Data Engineering Manager, IT System Owner, Data Modeller, Data Architect, Subject Matter Expert and Information Architecture Chapter Lead.
I have experience in descriptive / prescriptive analytics (Pricing, Marketing, Sales, Operational, Business Management, Risk) and have exposure to predictive analytics (AI/ML model life cycle, MLOps, LLM with RAG, NLP, GenAI).
Please get in touch if you think my professional experience can help solve exciting data engineering problems.
AWS
Dec 2024
Microsoft
Nov 2024
Duke University
Oct 2024
Microsoft
May 2021
Microsoft
March 2021
University of Michigan
May 2020
UC Berkeley
August 2015
Duke University
November 2014
IBM
November 2006
Rated as Exceeds Expectations (top rating) in the performance appraisal.
Rated as Exceeds Expectations (top rating) in the performance appraisal.
Rated as Exceeds Expectations (top rating) in the performance appraisal.
Rated as Exceeds Expectations (top rating) in the performance appraisal.
Rated as Exceeds Expectations (top rating) in the performance appraisal.
Received Morgan Stanley certificate of recognition for demonstrating the core value ‘Putting Clients First’ by outstanding delivery in 2019.
Rated as Exceeds Expectations (top rating) in the performance appraisal.
Rated as Exceeds Expectations (top rating) in the performance appraisal.
Received Morgan Stanley certificate of recognition for leading with exceptional ideas in 2017.
Received Morgan Stanley certificate of recognition for demonstrating the core value ‘Putting Clients First’ by outstanding delivery in 2017.
Rated as Exceeds Expectations (top rating) in the performance appraisal.
Received Morgan Stanley certificate of recognition for technical expertise in 2016.
Rated as Outstanding Contributor (top rating) in the performance appraisal.
Received IBM certificate 'Best of IBM' in 2012.
Received IBM certificate 'IBM Top Talent' in 2012.
Rated as Outstanding Contributor (top rating) in the performance appraisal.
Rated as Outstanding Contributor (top rating) in the performance appraisal.
Rated as Outstanding Contributor (top rating) in the performance appraisal.
Received IBM certificate 'IBM Top Talent' in 2008.
Received IBM certificate 'Best of IBM' in 2007.
Leading a bank data modernization program that publishes banking datasets from the on-prem data platforms (Teradata and Hadoop) to Snowflake on the Microsoft Azure public cloud platform.
Designed the data lake storage layer and data ingestion framework and led delivery of the data pipelines for a US based investment bank.
Managed the banking data warehouse for a lending product with 10+ data sources and 25+ consumer groups for a US based investment bank.
Designed and led delivery of the data warehouse from scratch for a US based Insurance provider.
Designed and led delivery of the data distribution layer using a data virtualization platform for a US based investment bank.
Designed and led delivery of data pipelines that feed data into regulatory reporting (CCAR) systems for a US based investment bank.
Designed and led delivery of data pipelines as part of the lending sub-servicer onboarding for a US based investment bank.
Developed data pipelines to calculate the balance flows across multiple products for a US based investment bank. This was aggregated and presented to stakeholders at various levels in the firm via a dashboard.
Designed and led delivery of an analytics application for a US based investment bank to determine the pricing of new loans based on existing relationships that the client has with the firm.
Designed and led delivery of an analytics application for a US based investment bank to determine the mortgage loans eligible for rate adjustment. This involves calculating the cash flows and internal rate of return for eligible products.
Designed and led delivery of data pipelines that feed data into the market/credit (BASEL) risk models.
Designed and led delivery of the data pipelines to feed customer and transaction data into an Anti Money Laundering solution for a bank in the Nordic region.
Designed and led delivery of an automated process to reconcile the KYC data feeding into the compliance processes for a US based investment bank.
Designed and developed the Data Reconciliation Hub application for a US based investment bank to reconcile data from various data sources feeding into the data warehouse and data lake.
Designed and developed the Data Lineage web application which is used for impact analysis, identifying sources and consumers of datasets and tracking the process lineage from source to target interfaces.
A CLI app / micro service to validate the source data with configurable data quality rules using the open source Great Expectations package.
A CLI app / micro service (ML application) to detect data quality anomalies in the source data using the open source XGBoost package.
A CLI app / micro service to profile the data. This uses a Natural Language Processing (NLP) Named Entity Recognition (NER) model to classify the data as PII vs Non-PII.
A CLI app / micro service to create asset definitions for the datasets that will be published to the data catalog. The catalog can be queried (e.g. for datasets, data elements) using the OpenAI GPT LLM. The data dictionary, system and business glossaries are used for the RAG context.
A CLI app / micro service to reconcile the source data with the reconciliation control measures received from the source. Reconciliation controls (columns, aggregates) can be configured.
A CLI app / micro service to capture the lineage at the dataset level.
A CLI app / micro service to derive the effective business dates based on the batch schedules and holidays (NYSE, Federal Bank holidays).
A CLI app / micro service to centrally store, manage and serve the metadata to the data management framework. This covers datasets, data quality rules, data reconciliation rules, schedules and holidays.
A CLI app / micro service to ingest delimited text files into data lake tables using the PySpark API.
A CLI app / micro service to extract data from the data lake tables using the PySpark API.
A serverless ETL orchestration framework in the Microsoft Azure public cloud platform. The framework covers data ingestion, curation and delivery aspects while providing configurable data governance controls.