July | Filipe Mendes

Go

Having fundamental tools for the software development process, such as testing, test coverage, and race condition detection, built-in into Go is a very good decision. It helps developers to try to stay within the boundaries of what’s provided out of the box.

$ go test            # Compiles and runs the local tests 
$ go test -cover     # Runs the local tests and generates a basic test coverage report
$ go test -race      # Runs all tests and detects race conditions on go routines

“A chan struct{} is the smallest data type available from a memory perspective so we get no allocation versus a bool.”
“Make use of make when creating a channel, rather than var ch chan struct{}. Otherwhen when you use var the variable will be initialised with the “zero” value of the type. So for string it is “”, int it is 0, etc.”

Machine learning

“When training a (machine learning) model you must always have both a training set (~80%) and a validation set (remaining ~20) and you must measure the accuracy of your model only on the validation set.”
“A classification model is one that attempts to predict a class, or category. A regression model is one that predicts numeric quantities, such as temperatures and locations”.
“If you train a large enough model for a long enough time, it will eventually memorise the label of every item in the dataset! The result will not be a useful model”."
“This suggests a good rule of thumb for converting a dataset into an image representation: if the human eye can recognise categories from the images, then a deep learning model should be able to do so too.”
“It’s important to keep a testing set, apart from the validation set, and the training set, if companies offshore their model trainings, because the offshore company could be training on the validation set (as well) and make the model overfitted. So in this case it would make the model great at predicting all available data but perform very poorly with unseen data.”
“On each epoch, which is one complete pass over all images in the dataset, we focus on a specific part of each image. (…) This is also why we should do several epochs.”
“An image recognizer can, only recognize images. But a lot of things can be represented as images, (…) For instance, a sound can be converted to a spectrogram, (…) A time series can easily be converted into an image by simply plotting the time series on a graph”

Questions

Q1: What does long enough mean? What does large enough model mean? What are the limits?

Long enough means when model stars to memorise specific training examples including noise and irrelevant details. Excellent at predicting training examples but not unseen data. This is overfitting.
To help regularise, you should monitor the models performance on a separate validation set, and when the performance on the validation set stars to worsen, you stop training. This indicates the model is starting to overfit.
A very high learning rate can make the model converge quickly but might jump optimal points or lead to memorisation. So monitor both training loss/accuracy and validation loss/accuracy on each model.

by Gemini 2.5-flash

Quotes

Go - Learn go with tests
Machine Learning - Practical deep learning - Lesson 1
Gemini 2.5-flash