Blog Archive

Fighting the regression

November 13, 2024

This is a journey through the trials, tactics, and surprises of building a reliable time series model despite relentless regression to the mean.

Tokenization

August 29, 2024

“Tokenization is at the heart of much weirdness of LLMs. Do not brush it off.” AK

Applied Ml resource

August 23, 2024

Thought I'd share this awesome repo for applied ML from the prodigous blogger Eugene Yan :) - https://github.com/eugeneyan/applied-ml

Gradient checkpointing

August 19, 2024

Gradient checkpointing enables you to run a more powerful model on your machine - beneficial under training.

Batch processing

August 18, 2024

A crucial technique in training neural network (NN) models. Instead of processing individual data samples one at a time, batch processing groups multiple samples into batches and processes them simultaneously.

Software 2.0

August 18, 2024

In Software 2.0 most often the source code comprises the dataset that defines the desirable behavior....

RAG -> stop hallucinations!

August 18, 2024

Using them in a RAG architecture brings some different constraints to the table. I think the biggest one is the expectation, especially in a corporate setting, for being factual. But this is not the strength of an LLM. In fact many have deemed the ‘hallucination problem’ a feature and not a bug....

Special token injection attacks

August 18, 2024

A special token is one designed to only be relevant to the model - such as <|end_of_text|>. These can cause standard software engineering bugs if handled poorly in the code.

GPU - Model memory

August 18, 2024

How much GPU do you need? State of the art performance requires state of the art machinery. A100's are not cheap!