When talking about education and large language models, the general sentiment appears to be that they are simply the end of education. I find this quite bewildering since I have exactly the opposite perception.
I certainly do not aim to disentangle the emotional-cognitive mess of the topic, but rather …
A common critique of machine learning research is that the results of many papers are based on trial-and-error, often representing only minor modifications of existing approaches and without any proper justification of the design decisions. The typical justification for publication is that a model resulted in a good score on …
To detect outliers in data that are not normally distributed, a common approach is to compute modified z-scores.
Instead of the standard deviation, the median absolute deviation (MAD) is used as the measure of variability. It is defined as the median of the absolute deviations from the sample median.
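
As a minimal sketch of the idea, the snippet below computes modified z-scores using only the Python standard library. The function names `modified_z_scores` and `find_outliers` are made up for illustration. The scaling constant 0.6745 and the cutoff of 3.5 are the conventional choices usually attributed to Iglewicz and Hoaglin; they are assumptions here and not something the text above prescribes.

```python
from statistics import median


def modified_z_scores(values, scale=0.6745):
    """Return the modified z-score of each value.

    The score is scale * (x - median) / MAD, where MAD is the median of
    the absolute deviations from the sample median. The default scale of
    0.6745 is the conventional constant (an assumption, see lead-in).
    """
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # More than half of the values equal the median, so the score
        # is undefined; a fallback measure would be needed in that case.
        raise ValueError("MAD is zero; modified z-scores are undefined")
    return [scale * (v - med) / mad for v in values]


def find_outliers(values, threshold=3.5):
    """Return the values whose absolute modified z-score exceeds threshold."""
    scores = modified_z_scores(values)
    return [v for v, s in zip(values, scores) if abs(s) > threshold]


sample = [10, 12, 11, 13, 12, 11, 100]
outliers = find_outliers(sample)
print(outliers)
```

Because both the center and the spread are medians, a single extreme value such as 100 in the sample barely shifts them, which is why this approach tends to be more robust than the ordinary z-score on skewed or heavy-tailed data.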
Inspecting a data set is usually not a linear process but involves iterative refinement of the initial assumptions and questions. One might have an initial question but, after taking a look at the data set, realize that it requires multiple preprocessing steps before it can be used to address the …