Mikel Añibarro Ortega

R&D Data Scientist, PhD

Bridging the gap between scientific complexity and actionable business insights. Focused on machine learning, advanced data analysis, and AI engineering.

About Me

I'm a PhD Chemist turned Data Scientist, combining 6+ years of rigorous research and experimental design with hands-on expertise in data analysis and machine learning. I am currently deepening my knowledge of AI Engineering and MLOps, with a focus on building systems that don't just model reality but ship to production.

My expertise spans tabular ML, multivariate analysis (PCA, PLS, LDA), experimental design (DoE, RSM), and A/B testing. I bridge the gap between complex technical data and clear business strategy—whether in FoodTech, Pharma, or Industrial optimization.

What I bring: Scientific precision meets scalable solutions. I don't just analyze data; I prescribe next moves.

🎯 Scientific Rigor

Trained in hypothesis testing, experimental design, and statistical validation. Every insight is reproducible and defensible.

🔄 Cross-functional Bridge

5+ years mentoring international teams. I translate between technical complexity and business strategy effortlessly.

📊 Tabular Data Expert

Specialized in ML on structured data. I know when to use simple statistics vs. complex models—and why.

🌍 Domain Deep Dives

Food science, chemistry, pharma background. I understand the domain, not just the algorithms.

Featured Projects

1

Bioactive Extraction Optimizer

Predictive modeling for phytochemical recovery

Problem: Standardizing Aloesin extraction from Aloe vera by-products required manual, inefficient experimental iterations.

💡 Solution

Deployed a Response Surface Methodology (RSM) model that maps solvent/temperature/time relationships to yield optimization. Transformed experimental data into an interactive decision-making tool that reduces time-to-optimal-extraction by ~70% and minimizes solvent consumption.

🔬 Technical Core

  • Analytical: HPLC-DAD-ESI/MSn quantification of Aloesin
  • Design: Central Composite Rotatable Design (CCRD) with Reduced Cubic Models
  • Impact: ~70% shorter time to optimal extraction, lower solvent consumption, reproducible predictions

🛠 Tech Stack

Python, Streamlit, Statsmodels, Plotly, Scikit-learn

2

Food Labeling Audit

Uncovering hidden truths in food industry data

Problem: Food labeling is contradictory and often misleading. Current standards (NutriScore, BIO seals) have blind spots that enable manipulation.

💡 Key Findings

  • The NutriScore Hack: NutriScore is blind to industrial processing; ultra-processed products can earn A/B ratings by adding isolated fiber or protein
  • BIO Label Limitation: Guarantees sustainability but not nutritional quality; organic ≠ nutritious
  • Private Labels Win: Contrary to the stereotype, store brands offer cleaner labels, less sugar, and more fiber than products from big corporations
  • Plant-based Paradox: In meat-culture countries, vegan alternatives prioritize taste mimicry over protein density, adding chemical complexity

📊 Dataset & Scope

  • 10,000+ products from OpenFoodFacts snapshot
  • 20 nutritional & labeling features
  • Cross-segment analysis: categories, brands, regions

🛠 Tech Stack

Python, Pandas, Matplotlib, Seaborn

3

NutriScore 2.0

Transparent, manipulation-proof food classification

Problem: The original NutriScore is incomplete and manipulable. It doesn't account for additives, processing complexity, or "false healthy" products.

💡 Solution

Built a custom additive classification system using EFSA safety data and PubMed API mining, then applied K-means clustering to the enriched feature space. Result: a food classification that detects manipulated products and "false healthy" items that pass the original NutriScore.

🔬 Methodology

  • Data Engineering: EFSA additive database + PubMed API for bioavailability/risk data
  • Feature Creation: Additive toxicity index, processing complexity score, nutrient density
  • Modeling: K-means clustering, cross-checked against the cluster structure of a hierarchical clustering dendrogram
  • Robustness: Detects "false healthy" products (NutriScore A/B but flagged in 2.0)
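The methodology above can be sketched roughly as follows. This is an illustrative example on synthetic data, not the project pipeline: the column names, the two simulated product groups, and the rule for flagging "false healthy" items are assumptions, and the adjusted Rand index stands in here as a simple numeric check of agreement between K-means and hierarchical clustering.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import AgglomerativeClustering, KMeans
from sklearn.metrics import adjusted_rand_score
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the engineered features (NOT OpenFoodFacts data):
# one "clean" group and one heavily processed group of products.
rng = np.random.default_rng(42)
n = 150
features = ["additive_toxicity_index", "processing_complexity", "nutrient_density"]
clean = rng.normal([0.1, 0.2, 0.8], 0.05, size=(n, 3))
processed = rng.normal([0.7, 0.8, 0.4], 0.05, size=(n, 3))
products = pd.DataFrame(np.vstack([clean, processed]), columns=features)

# Random NutriScore grades, independent of the features, to mimic products
# whose label does not reflect additives or processing.
products["nutriscore"] = rng.choice(list("ABCDE"), size=2 * n)

# Cluster on standardized features with two different algorithms.
X = StandardScaler().fit_transform(products[features])
kmeans_labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
ward_labels = AgglomerativeClustering(n_clusters=2, linkage="ward").fit_predict(X)

# Agreement between the two partitions (1.0 means identical up to relabeling).
agreement = adjusted_rand_score(kmeans_labels, ward_labels)

# Assumed flagging rule: graded A or B, yet assigned to the cluster with the
# highest mean processing complexity.
products["cluster"] = kmeans_labels
risky_cluster = products.groupby("cluster")["processing_complexity"].mean().idxmax()
false_healthy = products[
    (products["cluster"] == risky_cluster) & products["nutriscore"].isin(["A", "B"])
]

print(f"agreement={agreement:.2f}, flagged={len(false_healthy)}")
```

On real data the number of clusters would need to be chosen and justified (for example with silhouette scores or the dendrogram itself) rather than fixed at two.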

📊 Dataset

  • Complete OpenFoodFacts database (~800k products)
  • Engineered features: additives, nutritional profiles, processing markers
  • External validation: EFSA + PubMed data integration

🛠 Tech Stack

Python, Scikit-learn, K-means, Streamlit, APIs

Skills & Expertise

📊 Data Science & ML

  • Machine Learning (Supervised & Unsupervised)
  • Tabular Data Analysis
  • K-means, Hierarchical Clustering
  • Classification & Regression
  • Feature Engineering & Selection
  • Model Validation & Evaluation

📈 Statistical Analysis

  • Multivariate Analysis (PCA, PLS, LDA)
  • Design of Experiments (DoE)
  • Response Surface Methodology (RSM)
  • A/B Testing & Hypothesis Testing
  • ANOVA & Statistical Inference
  • Central Composite Designs

💻 Programming & Tools

  • Python (Pandas, NumPy, Scikit-learn)
  • SQL & Database Queries
  • Streamlit & Plotly (Visualization)
  • Jupyter Notebooks & Git
  • APIs & Web Scraping

🌐 Soft Skills

  • Cross-cultural Team Leadership
  • Mentoring & Knowledge Transfer
  • Technical Communication (Technical & Non-technical Audiences)
  • Stakeholder Management
  • Project Coordination
  • Problem-solving & Adaptability

🔬 Domain Expertise

  • Food Science & Technology
  • Analytical Chemistry
  • Bioactive Compounds
  • Functional Foods
  • By-product Valorization
  • Pharmaceutical/Nutraceutical Development

🗣️ Languages

  • Spanish (Native)
  • Basque (Native)
  • English (C1 - Professional)
  • Portuguese (C1 - Professional)

Professional Experience

Research Scientist

Jan 2025 – Oct 2025

Polytechnic Institute of Bragança (IPB) | Portugal

  • Coordinated R&D data acquisition and analysis for cross-functional research teams
  • Mentored 5+ junior researchers; ensured data quality, methodological rigor, and independence
  • Translated complex quantitative findings into actionable research insights

PhD Candidate & Researcher

Sep 2019 – Apr 2025

University of Vigo / IPB | Spain & Portugal

  • Multivariate Analysis: Identified correlations between extraction methods, chemical composition, and biological activities using PCA, PLS, DoE, and RSM
  • Experimental Design: Central Composite Rotatable Design (CCRD) optimizations; A/B testing of nutraceutical formulations
  • Publications: Author of 30+ ISI-indexed scientific papers; 50+ conference communications
  • Impact: PhD awarded Cum Laude with International Distinction; co-inventor of 1 patent (PT117256-A)

Laboratory Attendant

Feb 2017 – Jul 2017

Terai Cosmética | Spain

  • Executed cosmetic formulation processes under GMP protocols

QC Laboratory Technician

Nov 2015 – Jun 2016

FYM - Italcementi Group | Spain

  • Executed standardized analytical tests on cement samples using XRF and chromatographic techniques
  • Investigated data outliers; identified root causes and implemented corrective actions

Education

PhD in Agri-Food Science & Technology

University of Vigo & Polytechnic Institute of Bragança | 2019–2025

🎓 Cum Laude with International Distinction

MSc in Chemistry & Pharmacy of Natural Products

University of Salamanca & Polytechnic Institute of Bragança | 2017–2019

BSc in Chemistry

University of the Basque Country (EHU) | 2010–2016

Professional Certifications

IBM Data Science Professional Certificate (2025) | Google Advanced Data Analytics Specialization (2025) | The Bridge Data Science & AI Bootcamp (2026)

Key Achievements

30+ ISI-indexed publications | 50+ international conference presentations | 1 Patent (PT117256-A) | Young Editorial Board Member for the journals Foods and Antioxidants.