# grochmal.org - Ancillary Coding and Math Support

### Mathematics

• PCA from Scratch
Principal Component Analysis (PCA) is a technique for linear data visualization and general dimensionality reduction. The technique is based on the decomposition of a matrix into eigenvectors and eigenvalues. Eigenvectors and eigenvalues can be explained and exemplified with reasonable ease, yet calculating them from a matrix can be quite tricky.
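A minimal sketch of the idea, assuming numpy and a synthetic two-feature dataset: centre the data, decompose its covariance matrix into eigenvectors and eigenvalues, and project onto the leading component.

```python
import numpy as np

rng = np.random.default_rng(0)
# toy data: 200 points with correlated features
x = rng.normal(size=(200, 2)) @ np.array([[2.0, 0.0], [1.2, 0.5]])

# centre the data, then decompose the covariance matrix
centred = x - x.mean(axis=0)
cov = np.cov(centred, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)  # eigh: cov is symmetric

# sort components by decreasing eigenvalue (explained variance)
order = np.argsort(eigvals)[::-1]
components = eigvecs[:, order]

# project onto the first principal component
projected = centred @ components[:, :1]
print(projected.shape)
```

The eigenvector with the largest eigenvalue is the direction of greatest variance; keeping only the first few columns of `components` is the dimensionality reduction.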
• Sample Bias Derivation
Statistics over a population are well defined, but when all the data we have is a sample collected from the population, we need to account for issues with how the sample was collected. The simplest correction is to subtract one from the number of data points when calculating the statistics. That "minus one" seems quite arbitrary at first, but there is a lot more theory behind that value than first appears.
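The effect of that correction (Bessel's correction) can be seen empirically. A sketch, assuming a synthetic normal population: variance estimates from many small samples, averaged, land closer to the population variance when dividing by n - 1 (`ddof=1`) than by n.

```python
import numpy as np

rng = np.random.default_rng(42)
population = rng.normal(loc=0.0, scale=1.0, size=100_000)
true_var = population.var()

# average variance estimates over many small samples
biased, corrected = [], []
for _ in range(5_000):
    sample = rng.choice(population, size=10)
    biased.append(sample.var(ddof=0))     # divide by n
    corrected.append(sample.var(ddof=1))  # divide by n - 1

# the ddof=1 estimate is much closer to the population variance
print(np.mean(biased), np.mean(corrected), true_var)
```

Dividing by n systematically underestimates the variance because each sample's mean is computed from the sample itself, eating one degree of freedom.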
• Least Squares Derivation
Least Squares is quite an old technique, dating to the eighteenth century or perhaps even as early as Newton, but it is still the basis for most regressions. The derivation of the technique in a few dimensions can be done by hand and then extrapolated to higher dimensionality.
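The closed-form result of that derivation is the normal equations. A sketch, assuming numpy and a synthetic noisy line with slope 3 and intercept 1:

```python
import numpy as np

rng = np.random.default_rng(7)
x = np.linspace(0.0, 10.0, 50)
y = 3.0 * x + 1.0 + rng.normal(scale=0.5, size=x.shape)

# design matrix with an intercept column
a = np.column_stack([np.ones_like(x), x])

# normal equations: (A^T A) w = A^T y
w = np.linalg.solve(a.T @ a, a.T @ y)
print(w)
```

Solving the linear system directly is more numerically stable than explicitly inverting `A^T A`; for larger problems one would use a QR or SVD based solver such as `np.linalg.lstsq`.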
• MOOC Confidence Analysis - Sustainable Energy Access
Statistical analysis of MOOC data. We take data from a MOOC (Massive Open Online Course) platform and build statistics on participants of the course, notably on attendance and on participants' confidence in applying the knowledge in the material.
• MOOC Confidence Analysis - SEA on a different MOOC platform
Given a new dataset, one should be able to rewrite only the parts that fit the data model, and the analysis should just work on the new dataset, perhaps with minimal tweaks. This is an example of such a new dataset analysis, which builds the same data model from data collected on a different MOOC platform.
• q-Entropic Forms
Entropy is a measure of information. When faced with a prediction, e.g. in machine learning, one uses known information to make the prediction. In linear problems such predictive information can be well represented by entropy. In non-linear problems, on the other hand, different entropic forms are needed.
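One family of such entropic forms is the q-parametrized (Tsallis) entropy, S_q = (1 - Σ p_i^q) / (q - 1), which recovers the Shannon entropy in the limit q → 1. A minimal sketch, assuming a small discrete distribution:

```python
import numpy as np

def tsallis_entropy(p, q):
    """Non-extensive (Tsallis) entropy; tends to Shannon entropy as q -> 1."""
    p = np.asarray(p, dtype=float)
    if np.isclose(q, 1.0):
        return -np.sum(p * np.log(p))  # Shannon limit
    return (1.0 - np.sum(p ** q)) / (q - 1.0)

p = np.array([0.5, 0.25, 0.25])
print(tsallis_entropy(p, 1.0))  # Shannon entropy
print(tsallis_entropy(p, 2.0))  # q > 1 weights dominant events more
```

Varying q changes how strongly rare versus common events contribute, which is what makes the form useful outside the linear regime.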
• q-parametrized Neurons
Non-extensive entropy is better suited to describing learning systems than plain entropy, notably because plain entropy is a special case of non-extensive entropy. Parallels can be drawn between problem linearity and entropic forms.

### Unix

• Genetic Algorithms on ANNs
Genetic Algorithms (GA) are an optimisation technique rarely used for Neural Network (NN) training, since derivative-based techniques are more memory and time efficient. This does not mean that GAs are not a viable option; for a small problem, training NNs with GAs can be easily coded.
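A minimal sketch of the idea, assuming numpy, a tiny 2-3-1 network, and XOR as the small problem: treat the flat weight vector as the genome, score each genome by (negative) mean squared error, keep the best genomes, and mutate clones of them with Gaussian noise.

```python
import numpy as np

rng = np.random.default_rng(3)

# XOR: the classic small non-linear problem
x = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0.0, 1.0, 1.0, 0.0])

def forward(weights, inputs):
    """Tiny 2-3-1 network; weights is a flat vector of 13 parameters."""
    w1 = weights[:6].reshape(2, 3)
    b1 = weights[6:9]
    w2 = weights[9:12]
    b2 = weights[12]
    hidden = np.tanh(inputs @ w1 + b1)
    return 1.0 / (1.0 + np.exp(-(hidden @ w2 + b2)))  # sigmoid output

def fitness(weights):
    return -np.mean((forward(weights, x) - y) ** 2)  # higher is better

# genetic algorithm: truncation selection plus Gaussian mutation
pop = rng.normal(size=(100, 13))
for generation in range(300):
    scores = np.array([fitness(w) for w in pop])
    elite = pop[np.argsort(scores)[-20:]]           # keep the best 20
    children = elite[rng.integers(0, 20, size=80)]  # clone elites
    children = children + rng.normal(scale=0.3, size=children.shape)
    pop = np.vstack([elite, children])

best = pop[np.argmax([fitness(w) for w in pop])]
print("best MSE:", -fitness(best))
```

No gradients are computed anywhere: the GA only ever evaluates the network forward, which is why it also works for non-differentiable fitness functions.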

### Hacking

• Strange Attractors in Stocks
Taking the dripping faucet analysis to the stock exchange produces a few interesting patterns. Both the differences between the open and close positions for a day, and between the high and low values on a day, show fractal patterns when compared between days. The patterns are even more notable when compared across different numbers of days.
• Simple MNIST Network Weight Evolution