Series

AI explained

Short notes from the AI explained series: LLMs, agents, multimodal models, and the ideas that made the field move.

Mar 24, 2024

Reducing the Reversal Curse?

A follow-up on the reversal curse and how reverse training may help language models handle reversed relations.

Mar 17, 2024

MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training

Apple's MM1 paper distilled into practical lessons on data mixtures, image resolution, synthetic data, and multimodal architecture.

Feb 18, 2024

Chain-of-Thought: Do LLMs Need Prompts to Think?

A look at CoT decoding, an approach that explores alternative decoding paths to reveal reasoning without explicit chain-of-thought prompts.

Feb 18, 2024

Efficient Exploration for LLMs

A look at active exploration for RLHF-style feedback collection and why choosing better questions can improve LLM training efficiency.

Feb 9, 2024

An Interactive Agent Foundation Model

A short note on embodied agent foundation models, multimodal perception, planning, and what scale could mean for agent capabilities.

Jan 30, 2024

Evaluating Multi-modal Large Language Models

A short overview of how a large benchmark evaluates multimodal language models across generalization, trustworthiness, and causality.

Jan 21, 2024

Self-Rewarding LLMs

An explanation of self-rewarding language models, where models generate prompts, judge responses, and iteratively train from their own feedback.

Jan 14, 2024

Towards Conversational Diagnostic AI

A short explainer on AMIE, Google's diagnostic dialogue system for medical interviews and clinical reasoning.

Jan 9, 2024

AI Content Detectors Are Not Reliable

A follow-up note on why AI text detectors can fail, from perplexity and burstiness limits to false positives and bias against non-native writers.

Jan 7, 2024

Language Models May Not Be Few-Shot Anymore

A look at task contamination and why benchmark gains in zero-shot and few-shot LLM evaluation may be misleading.

Dec 27, 2023

What Could LLM Do with Your Smartphone?

A look at AppAgent, a multimodal agent that learns to operate smartphone apps through tapping, swiping, screenshots, and memory.

Dec 17, 2023

How Susceptible Are LLMs to Persuasive Misinformation?

A study of how persuasive misinformation can change LLM responses, from rejection and uncertainty to acceptance.

Dec 11, 2023

Who Would Learn Faster? A Chicken or Vision Transformer?

A NeurIPS study comparing newborn chicks and Vision Transformers on view-invariant object recognition from limited visual experience.

Dec 8, 2023

Mamba: Selective State Space Model That Outperformed Transformers

An introduction to Mamba, selective state space models, and why linear-time sequence modeling is exciting for language models.

Dec 1, 2023

Scalable Extraction of Training Data from Production Language Models

A look at how researchers extracted memorized training examples from ChatGPT and what that means for privacy and copyright.

Nov 24, 2023

Removing Irrelevant Text for Better Answer Generation

A short explanation of System 2 Attention, a method for regenerating context before answering to reduce distraction and sycophancy.

Nov 24, 2023

The Reversal Curse

An explanation of why LLMs trained on "A is B" often fail to answer the reverse relation, "B is A".

Nov 16, 2023

How to Detect AI-Written Texts?

A practical explanation of Ghostbuster, perplexity features, and why AI text detection remains difficult.

Nov 12, 2023

Three ML Blogs Worth Reading

A short recommendation list of high-quality machine learning blogs by Lilian Weng, Eugene Yan, and Chip Huyen.

Nov 9, 2023

Implicit Reasoning in LLMs

An explanation of implicit chain-of-thought reasoning through hidden states and knowledge distillation.

Nov 8, 2023

The Technology Behind Illusion-Like Generated Images

A quick explanation of ControlNet and how diffusion models can generate QR-code and illusion-like images.

Nov 2, 2023

Large Language Models Understand and Can Be Enhanced by Emotional Stimuli

A short explainer on EmotionPrompt and how emotional stimuli can affect LLM task performance.

Oct 26, 2023

Enhancing LLMs' Reasoning Abilities with Step-Back Prompting

An explanation of step-back prompting and how abstraction can improve LLM reasoning on complex tasks.

Oct 26, 2023

Controlling DALL-E 3 Output with Seeds

A short practical note on using random seeds with DALL-E 3 to make image generation more reproducible and controllable.

Oct 20, 2023

What Are LLM Agents?

A practical explanation of LLM chains, agents, tools, memory, and why autonomous planning changes how LLM apps behave.

Oct 13, 2023

Are LLMs the Future of Image Generation?

A look at why visual tokenizers like MAGVIT-v2 make language-model-based image and video generation more competitive with diffusion models.

Oct 6, 2023

Summarize a Video with LLM: a Tutorial

A short tutorial on downloading YouTube transcripts, restoring punctuation, and using the OpenAI API to summarize or query a video.

Sep 25, 2023

Claude Doesn't Get the Attention It Deserves

A short note on Claude's long-context advantage, why full-document context matters, and Anthropic's analysis of prompting strategies.

Sep 24, 2023

RAIN: Aligning LLMs Without Finetuning

A short note on RAIN, a rewindable inference technique for making pretrained LLM outputs more helpful and harmless without weight updates.

Sep 7, 2023

Transform Your CV into an Interactive Chatbot with LLM, FAISS and LangChain

A step-by-step tutorial for building an interactive CV chatbot with TRURL, Hugging Face embeddings, FAISS, and LangChain.

Aug 28, 2023

Why SentenceBERT Became Useful Again in LLM Pipelines

A short note on why LangChain, FAISS, and RAG made smaller embedding models like SentenceBERT important again.

May 1, 2021

Detect Waste: Project Summary

By Agnieszka Mikołajczyk

DetectWaste Project

A wrap-up of the 5-month non-profit project: roles, outputs, the arXiv paper, the repository, and the blog series.

Mar 5, 2021

Detect Waste: Pseudo-labeling for Waste Classification

By Sylwia Majchrowska, Agnieszka Mikołajczyk

DetectWaste Project

Using pseudo-labeling and OpenLitterMap data to expand a waste classifier beyond the labeled dataset.

Feb 9, 2021

Detect Waste: Comparing Approaches on Extended TACO

By Sylwia Majchrowska, Agnieszka Mikołajczyk

DetectWaste Project

EfficientDet results on 7-class and one-class waste detection, plus the motivation for separating detection and classification.

Dec 8, 2020

Detect Waste: TACO Dataset Analysis

By Maria Ferlin, Agnieszka Mikołajczyk

DetectWaste Project

Exploratory analysis of extended TACO annotations, category mapping, bounding boxes, and dataset imbalance.

Nov 20, 2020

Detect Waste: Project Introduction

By Maria Ferlin, Agnieszka Mikołajczyk

DetectWaste Project

How the Detect Waste story began: motivation, recycling rules, waste categories, and the TACO dataset.

Jan 6, 2020

Sound-Based Bird Classification

By Agnieszka Mikołajczyk, Magdalena Kortas

Bird Song Classification

How a WiMLDS Trójmiasto team used deep learning, acoustics, and ornithology to classify bird species from sound.
