Dr. Florian Hinzpeter
Collaborative and pragmatic data science consulting, from Big Data and Machine Learning to Responsible AI and beyond.
About me
Hi, I’m Florian! I’m a passionate and experienced data science consultant specializing in machine learning, cloud data platforms, big data technologies, and responsible AI. My mission is to help organizations leverage their data to make better decisions and drive growth.
Over the years, I’ve had the privilege of working on some truly exciting data projects in a variety of industries, from health insurance to automotive. Whether I’m building predictive models, designing data pipelines, or performing exploratory data analysis, I always bring my best to the table and strive to deliver exceptional results.

One of my heartfelt concerns is ensuring that the impact of machine learning on our society is sustainable and positive. That is why I’ve made it my mission to help companies create more reliable, responsible, and ethical machine learning systems. To achieve this, I’ve acquired extensive knowledge in bias & fairness in machine learning, explainable AI, and modern MLOps best practices. I’m truly excited about the cutting-edge technologies that help us make machine learning fair, transparent, and trustworthy. So if you’re looking to audit your machine learning models for discrimination, or want to gain a deeper understanding of how your system arrives at its decisions, I’m here to help! I am also highly experienced in assessing compliance with the upcoming AI Act of the European Union.
Before starting my career as a data science consultant in 2019, I worked as a researcher in theoretical physics at the Technical University of Munich, where I also earned my PhD in 2018. During my research I had the opportunity to study spatial aspects of biochemical reactions. The discoveries I made were truly fascinating and even led to a publication in Nature Physics.
My services
Data Science & Machine Learning
Custom development of data science solutions that use statistical modelling and machine learning to extract actionable insights from complex data and help you make data-driven decisions.
Cloud
Data Platform
Holistic design and implementation of modern data platforms that enable efficient data storage, processing and analysis. This forms the basis for the development of scalable data products.
Big Data
Engineering
Expert advice on your data infrastructure. From building data ingestion and ETL pipelines to ensuring data governance and quality solutions, I can help you orchestrate and manage your data to unlock its full value.
Responsible
Artificial Intelligence
Tailored advice on developing machine learning systems that are robust, transparent and free from discriminatory bias. My expertise ensures that your systems are compliant with the EU AI Act.
Portfolio
Project Description
In this project I leveraged Explainable AI techniques to provide model explanations to end users (internal staff). Local model explanations were visualized on a dashboard, allowing users to investigate the model’s decision making on individual data instances.
Industry
Automotive Industry
Project roles
Explainable AI Expert, Senior Data Scientist
Tasks & Technologies
For model explanations we leveraged the SHAP library together with its rich visualization options. For dashboard development we used Streamlit. Software development was done in Python, and deployment with Docker.
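A minimal sketch of this pattern, with a bundled scikit-learn dataset and a stand-in model in place of the client’s confidential assets:

```python
# Minimal sketch of a local-explanation view in Streamlit. The dataset
# and model are stand-ins; the client's assets are confidential.
import matplotlib.pyplot as plt
import shap
import streamlit as st
from sklearn.datasets import load_diabetes
from sklearn.ensemble import GradientBoostingRegressor

X, y = load_diabetes(return_X_y=True, as_frame=True)
model = GradientBoostingRegressor(random_state=0).fit(X, y)

explainer = shap.Explainer(model)  # resolves to a TreeExplainer here
idx = st.number_input("Row to explain", min_value=0, max_value=len(X) - 1, value=0)

shap_values = explainer(X.iloc[[int(idx)]])  # local explanation for one row
fig = plt.figure()
shap.plots.waterfall(shap_values[0], show=False)
st.pyplot(fig)
```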
Project Description |
In this role, I helped a client build a complete data science platform on Microsoft Azure and Databricks.
Industry |
Automotive Industry |
Project roles |
Cloud Architect, Solution Architect |
Tasks & Technologies |
Deployment of an Azure Databricks data science platform including all required Azure resources (Azure Data Factory, Azure Data Lake, Azure Key Vault, Azure Databricks, MSSQL). Provisioning and management of Azure and Databricks resources via Infrastructure as Code (IaC) using Bicep templates, the Azure CLI, and the Databricks CLI, as well as implementation of a Continuous Deployment pipeline in Azure DevOps. Design of data ownership concepts and implementation of data permission groups using ACLs on Azure Data Lake Storage and Azure Active Directory (AAD). Synchronization of AAD and Databricks identity management via SCIM.
Project Description |
The aim of this project was to assess a company’s data science platform and workflow with respect to responsible AI. The assessment consisted of reviewing the company’s maturity along five dimensions: (1) fairness & bias of machine learning models, (2) transparency & trust of machine learning solutions, (3) technical reliability, (4) data governance and data quality, and (5) ethical and sustainable AI.
Industry |
Wealth Management |
Project roles |
Explainable & Responsible AI Expert |
Tasks & Technologies |
Creation of a self-assessment framework including a survey to determine the maturity level in the area of Responsible AI, tailored to the requirements of the EU AI Act. |
Project Description |
The goal of this project was to integrate on-premises data into the Azure cloud using the Lakehouse paradigm.
Industry |
Automotive Industry |
Project roles |
Explainable AI Expert, Senior Data Scientist |
Tasks & Technologies |
Integration of on-premises data into a cloud-hosted data lake with ETL and ELT using Azure Data Factory and Databricks. Implementation of data pre-processing and aggregation pipelines following the Lakehouse architecture, using the Spark and Photon engines along with Spark SQL and PySpark. Data governance implemented with the Delta Lake engine and the Hive Metastore.
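A minimal sketch of one bronze-to-silver refinement step under this architecture; table and column names are illustrative, and `spark` is the session that Databricks provides:

```python
# Sketch of a bronze-to-silver step; runs on Databricks, where `spark`
# is predefined. Table and column names are illustrative only.
from pyspark.sql import functions as F

# Bronze: raw on-premises extracts landed by Azure Data Factory.
bronze = spark.read.table("bronze.vehicle_telemetry")

# Silver: deduplicated, typed, quality-checked records.
silver = (
    bronze.dropDuplicates(["record_id"])
    .withColumn("event_date", F.to_date("event_ts"))
    .filter(F.col("event_date").isNotNull())
)

silver.write.format("delta").mode("overwrite").saveAsTable("silver.vehicle_telemetry")
```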
Project Description |
In this project, I developed the data science and software components for a machine learning system to determine recommendations for health insurance products for policyholders. |
Industry |
Health Insurance |
Project roles |
Lead Software Developer & Senior Data Scientist |
Tasks & Technologies |
Software development with Python. The suitability of a product for an insured person was determined using probabilistic graphical models (Bayesian networks). The software was unit- and integration-tested with Pytest, GitLab was used for version control and CI/CD pipelines, and Poetry for dependency management and virtual environment organisation.
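The project’s network structure and tooling are client-specific; purely as an illustration of the inference pattern, here is a toy Bayesian network built with pgmpy (the library choice and all variables are assumptions for this sketch):

```python
# Toy Bayesian network for product suitability. The structure, variables,
# and the pgmpy library are illustrative assumptions, not the client's system.
from pgmpy.models import BayesianNetwork
from pgmpy.factors.discrete import TabularCPD
from pgmpy.inference import VariableElimination

# Age group and existing dental coverage influence product suitability.
model = BayesianNetwork([("age_group", "suitable"), ("has_dental", "suitable")])

cpd_age = TabularCPD("age_group", 2, [[0.6], [0.4]])
cpd_dental = TabularCPD("has_dental", 2, [[0.7], [0.3]])
cpd_suitable = TabularCPD(
    "suitable", 2,
    [[0.9, 0.6, 0.5, 0.2],   # P(suitable=0 | age_group, has_dental)
     [0.1, 0.4, 0.5, 0.8]],  # P(suitable=1 | age_group, has_dental)
    evidence=["age_group", "has_dental"], evidence_card=[2, 2],
)
model.add_cpds(cpd_age, cpd_dental, cpd_suitable)

# Query the suitability of a product for one policyholder.
posterior = VariableElimination(model).query(
    ["suitable"], evidence={"age_group": 1, "has_dental": 0}
)
print(posterior)
```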
Project Description |
The aim of this project was to predict the length of stay of insured patients in hospital using machine learning methods. With accurate predictions, the insurance company was able to control the length of stay in a more targeted way. |
Industry |
Health Insurance |
Project roles |
Lead Data Scientist, Senior Software Developer |
Tasks & Technologies |
Software development in Python, feature engineering and machine learning pipeline assembly with scikit-learn, gradient boosted tree regression with CatBoost, oversampling of minority samples (SMOTE-NC) with imbalanced-learn, code version control with GitLab, dependency management and virtual environment organisation with Poetry.
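Since SMOTE-NC requires a discrete target, the sketch below assumes the length of stay was bucketed into classes for the oversampling step; the data, features, and cut-offs are synthetic:

```python
# Sketch of the SMOTE-NC oversampling step with synthetic data. The
# bucketing of length of stay into classes is an assumption for this demo.
import numpy as np
from imblearn.over_sampling import SMOTENC

rng = np.random.default_rng(0)
ward = rng.integers(0, 3, size=500)             # categorical feature (column 0)
age = rng.normal(55, 15, size=500)              # numeric feature
X = np.column_stack([ward, age])
los_days = rng.exponential(scale=5, size=500)   # length of stay in days
los_bucket = np.digitize(los_days, [3, 7, 14])  # long stays are rare classes

# Oversample the rare (long-stay) buckets; column 0 is categorical.
smote = SMOTENC(categorical_features=[0], random_state=42)
X_res, y_res = smote.fit_resample(X, los_bucket)
print(np.bincount(los_bucket), "->", np.bincount(y_res))
```

The regression itself was then fitted with `catboost.CatBoostRegressor` on the engineered features within the scikit-learn pipeline.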
Project Description |
The aim of this project was to modernise the methodology of estimating used car prices. To do this, large amounts of data from real transactions and online exchanges were used to train a machine learning system. The machine learning system was then used to assist the valuation experts. |
Industry |
Automotive Industry |
Project roles |
Lead Data Scientist, Machine Learning Engineer |
Tasks & Technologies |
Software development in Python, gradient boosted tree regression with CatBoost, API design and implementation with FastAPI, data preparation with Spark, model versioning and experiment tracking with MLflow; Databricks was used as the data platform.
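A minimal sketch of the serving layer: a FastAPI endpoint wrapping a model pulled from the MLflow registry. The model name, stage, and request fields are placeholders, not the production schema:

```python
# Sketch of a FastAPI prediction endpoint backed by the MLflow registry.
# "used-car-price", the stage, and the feature fields are placeholders.
import mlflow.pyfunc
import pandas as pd
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()
model = mlflow.pyfunc.load_model("models:/used-car-price/Production")

class Car(BaseModel):
    make: str
    model_name: str
    mileage_km: float
    first_registration_year: int

@app.post("/predict")
def predict(car: Car) -> dict:
    features = pd.DataFrame([car.dict()])
    price = float(model.predict(features)[0])
    return {"estimated_price_eur": price}
```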
Project Description |
In this project I leveraged topic modeling approaches (Latent Dirichlet Allocation) to extract generic car accident scenarios using car repair data that contained information about repaired or replaced parts and working positions. |
Industry |
Automotive Industry |
Project roles |
Lead Data Scientist, Senior Software Developer |
Tasks & Technologies |
Software development in Python, training of Latent Dirichlet Allocation models with SparkML, deployment with Docker, model versioning and orchestration with MLflow.
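A minimal, self-contained sketch of the approach with toy repair records; in the project, each LDA topic corresponded to a generic accident scenario:

```python
# Sketch of LDA topic modeling on repair records with SparkML.
# The records and topic count below are toy examples.
from pyspark.sql import SparkSession
from pyspark.ml.feature import CountVectorizer
from pyspark.ml.clustering import LDA

spark = SparkSession.builder.getOrCreate()

# Each row lists the repaired/replaced parts and labor items of one repair.
repairs = spark.createDataFrame(
    [(["front_bumper", "headlight_left", "paint_labor"],),
     (["rear_bumper", "tailgate", "paint_labor"],),
     (["front_bumper", "radiator", "headlight_right"],)],
    ["parts"],
)

cv_model = CountVectorizer(inputCol="parts", outputCol="features").fit(repairs)
vectors = cv_model.transform(repairs)

# Each LDA topic is interpreted as a generic accident scenario.
lda_model = LDA(k=2, maxIter=20, featuresCol="features").fit(vectors)
lda_model.describeTopics(maxTermsPerTopic=5).show(truncate=False)
```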
Project Description |
The goal of this project was to design and provision a generic pipeline that automates the process of continuous integration and continuous deployment (CI/CD). |
Industry |
Health Insurance
Project roles |
DevOps Expert, Software Developer |
Tasks & Technologies |
The CI/CD pipeline was implemented in GitLab using GitLab Runners. For continuous integration, unit and integration tests, linting, and Docker linting were implemented; for continuous deployment, load, acceptance, and performance tests.
Project Description |
The goal of this project was to use natural language processing techniques to classify emails into different categories of related content. |
Industry |
Automotive Industry |
Project roles |
Data Scientist |
Tasks & Technologies |
For email preprocessing and tokenization we used the NLTK Python library; for modelling we used a support vector machine from scikit-learn.
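A minimal sketch of such a pipeline; the TF-IDF vectorizer is an assumption, since the project only specifies NLTK tokenization and a scikit-learn SVM:

```python
# Sketch of the classification pipeline: NLTK tokenization feeding a
# TF-IDF + linear SVM. TF-IDF is an assumed vectorizer choice.
import nltk
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC

nltk.download("punkt", quiet=True)      # tokenizer models
nltk.download("punkt_tab", quiet=True)  # required by newer NLTK releases

pipeline = Pipeline([
    ("tfidf", TfidfVectorizer(tokenizer=nltk.word_tokenize, lowercase=True)),
    ("svm", LinearSVC()),
])
# pipeline.fit(train_emails, train_labels)
# pipeline.predict(["Hello, I would like to reschedule my service appointment."])
```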
Project Description |
This workshop consisted of three one-hour sessions. The goal of this workshop was to train data science consultants in the area of auditing and mitigating discriminatory bias in machine learning models. Each workshop session consisted of a theoretical part and a hands-on coding part. |
Industry |
Consulting |
Project roles |
Explainable & Responsible AI Expert |
Tasks & Technologies |
In this workshop we presented different bias metrics and ways to quantify discriminatory behavior, as well as various approaches to mitigating those biases. For the coding demos we used the Python packages AI Fairness 360 and Aequitas.
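A minimal sketch of a group-fairness check with AI Fairness 360, in the spirit of the workshop demos; the data and protected attribute are toy examples:

```python
# Sketch of a group-fairness check with AI Fairness 360 (aif360).
# The six-row dataset and the protected attribute are toy examples.
import pandas as pd
from aif360.datasets import BinaryLabelDataset
from aif360.metrics import BinaryLabelDatasetMetric

df = pd.DataFrame({
    "sex":   [0, 0, 0, 1, 1, 1],   # protected attribute (0 = unprivileged)
    "score": [1, 0, 0, 1, 1, 0],   # model decision / label
})
dataset = BinaryLabelDataset(
    df=df, label_names=["score"], protected_attribute_names=["sex"]
)
metric = BinaryLabelDatasetMetric(
    dataset, unprivileged_groups=[{"sex": 0}], privileged_groups=[{"sex": 1}]
)
print("Statistical parity difference:", metric.statistical_parity_difference())
print("Disparate impact:", metric.disparate_impact())
```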
Project Description |
This project consisted of introducing MLOps best practices such as code, data, and model versioning, as well as automated deployment pipelines.
Industry |
Automotive Industry |
Project roles |
Machine Learning Engineer |
Tasks & Technologies |
For code versioning we used Git as a software development best practice; for model versioning we used MLflow and its model registry functionality; for data versioning we used Delta Lake. The deployment pipelines were built using Azure DevOps, and model containerization was done with Docker.
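A minimal sketch of the versioning pattern; names are illustrative, and registering a model assumes an MLflow tracking backend with model-registry support:

```python
# Sketch of model and data versioning. Assumes a registry-enabled MLflow
# tracking backend; model and table names are illustrative.
import mlflow
import mlflow.sklearn
from sklearn.dummy import DummyRegressor

# Log a stand-in model (the project logged real training runs).
with mlflow.start_run() as run:
    model = DummyRegressor().fit([[0.0], [1.0]], [0.0, 1.0])
    mlflow.sklearn.log_model(model, artifact_path="model")

# Model versioning: promote the logged artifact into the registry.
mlflow.register_model(f"runs:/{run.info.run_id}/model", "demo-model")

# Data versioning: Delta Lake time travel (on Spark/Databricks), e.g.
# spark.read.format("delta").option("versionAsOf", 12).table("silver.training_data")
```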
Certifications
Databricks Certified Associate Developer for Apache Spark 3.0
Databricks
Issued Feb 2023
CI/CD YAML Pipelines with Azure DevOps
Udemy
Issued Jan 2023
Databricks Certified Data Engineer Associate
Databricks
Issued Jan 2023
Databricks Certified Machine Learning Associate
Databricks
Issued Dec 2022
Deployment of Machine Learning Models
Udemy
Issued Jan 2022
PyTorch for Deep Learning and Computer Vision
Udemy
Issued Mar 2021
Build Better Generative Adversarial Networks (GANs)
deeplearning.ai
Issued Dec 2020
Fundamentals of Reinforcement Learning
University of Alberta
Issued Oct 2020
Build Basic Generative Adversarial Networks (GANs)
deeplearning.ai
Issued Nov 2020
Applied Plotting, Charting & Data Representation in Python
University of Michigan
Issued Apr 2020
Introduction to Big Data
UC San Diego
Issued Mar 2020
Big Data Modeling and Management Systems
UC San Diego
Issued Mar 2020
Version Control with Git
Atlassian
Issued Feb 2020
Applied Machine Learning in Python
University of Michigan
Issued Feb 2020
Introduction to Data Science in Python
University of Michigan
Issued Nov 2019
SQL Bootcamp
Udemy
Issued Oct 2019
Deep Learning Specialization
deeplearning.ai
Issued Jul 2019
Machine Learning Specialization
deeplearning.ai
Issued Apr 2019