David Zimmermann, PhD

Data Scientist

whoami

I am David Zimmermann, a data scientist from Cologne, Germany, interested in solving real-world problems using machine learning, data science, and other appropriate measures. Previously, I completed a PhD in computational finance, where I simulated financial markets using agent-based models to study the effects of high-frequency trading.

This blog was built with the intention to share different applications, tricks, and also some mild shenanigans with data, R, C++, Python, or other areas. If you have any comments, ideas, issues, or just want to give me a thumbs up, leave a note or write an email to david_j_zimmermann {at] hotmail [dot} com.

The source of this blog can be found on my .

Interests

  • Data Science
  • Machine Learning
  • Programming

Education

  • PhD in Computational Finance, 2019

    Universität Witten/Herdecke, Germany

  • MSc Finance & Investment, 2015

    University of Edinburgh, UK

  • BA in Corporate Management & Economics (minor in Public Administration and International Relations), 2014

    Zeppelin Universität, Germany

Recent Blog Posts

Introducing dataverifyr: A Lightweight, Flexible, and Fast Data Validation Package that Can Handle All Sizes of Data

In every data project, there should be a check that the data actually looks like what you expect it to look like. This can be as simple as stopifnot(all(data$values > 0)), but as with everything “simple”, you typically want to have some additional features, such as cleaner error messages, rules separated from your R script (e.g., in a yaml file), result visualization, and last but not least, a library that does this as fast as possible.

Introducing RITCH: Parsing ITCH Files in R (Finance & Market Microstructure)

Recently I was faced with a file compressed in NASDAQ’s ITCH protocol. As I wasn’t able to find an R package that parses the file and loads it into R for me, I spent (probably) way too much time writing one, so here it is.

The Importance of Out-of-Sample Tests and Lags in Forecasts and Trading Algorithms

I recently had the opportunity to listen to some great minds in the area of high-frequency data and trading. While I won’t go into the details of what was said, I wanted to illustrate a point that was brought up: the importance of proper out-of-sample testing and proper variable lags in potential trading algorithms or arbitrage models.

A Gentle Introduction to Finance using R: Efficient Frontier and CAPM - Part 1

The following entry explains a basic principle of finance, the so-called efficient frontier, and thus serves as a gentle introduction to one area of finance, “portfolio theory”, using R. A second part will then concentrate on the Capital Asset Pricing Model (CAPM) and its assumptions, implications, and drawbacks.

Speeding "Bayesian Power Analysis t-test" up with Snowfall

This is a direct (though minor) answer to Daniel’s blogpost http://daniellakens.blogspot.de/2016/01/power-analysis-for-default-bayesian-t.html, which I found very interesting, as I have been trying to get my head around Bayesian statistics for quite a while now.

Recent & Upcoming Talks

WIP

Work in Progress, will be updated shortly

Projects

RITCH

An R package/interface to the ITCH Protocol for financial message data

SchellingR

An R package to efficiently simulate Schelling's Urban Migration Model

varsExtra

An R package to make handling VAR results easier

colorTable

A package that makes it easy to color tables

Option Valuation

An interactive shiny application to make it easier to understand financial options