Notes

Custom Pandas code can easily perform badly under multithreading, which is the default scheduler for Dask dataframes. In theory, Pandas, like NumPy, releases the GIL under certain circumstances. In our case that does not hold: as soon as you write a custom apply, the Python function holds the GIL while it runs, so the threads cannot execute in parallel and multithreading will typically perform even worse than single-threaded execution. It is therefore preferable to use multiprocessing mode with only one thread per process.