Introducing talkie: a 13B vintage language model from 1930
Summary
Researchers have created talkie, a 13-billion-parameter language model (a neural network with 13 billion adjustable values) trained entirely on English text published before 1931, to study how AI performs on historical knowledge and invention tasks. The base model uses only out-of-copyright data, but the chat version required fine-tuning (additional training to adjust behavior) with help from modern AI systems such as Claude, which introduced some post-1931 knowledge that the researchers are working to eliminate.
Solution / Mitigation
The talkie team states they 'aspire to eventually move beyond this limitation' by using 'vintage base models themselves as judges to enable a fully bootstrapped era-appropriate post-training pipeline,' meaning they plan to use talkie's own historical knowledge rather than modern AI systems for future training adjustments. However, this is described as a future goal, not a solution currently implemented.
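The judge-based pipeline described above can be sketched in miniature. This is an illustrative assumption, not the team's actual implementation: here a toy keyword check stands in for the vintage base model, whereas the real judge would be a pre-1931 model scoring each candidate response for era-appropriateness. The function names and data shapes are hypothetical.

```python
# Hypothetical sketch of an era-appropriate post-training filter.
# A keyword list stands in for the vintage-model judge; the real
# pipeline would query a pre-1931 base model instead.

# Terms that postdate the 1931 cutoff (illustrative only; a real judge
# would use the vintage model's verdict, not a hand-written list).
ANACHRONISMS = {"television broadcast", "transistor", "internet", "nasa"}

def vintage_judge(text: str) -> bool:
    """Return True if the text looks era-appropriate (pre-1931)."""
    lowered = text.lower()
    return not any(term in lowered for term in ANACHRONISMS)

def filter_finetuning_data(examples):
    """Keep only chat examples whose responses the judge accepts."""
    return [ex for ex in examples if vintage_judge(ex["response"])]

examples = [
    {"prompt": "What powers a wireless set?", "response": "Vacuum tubes."},
    {"prompt": "Who explores space?", "response": "NASA sends rockets."},
]
kept = filter_finetuning_data(examples)
print(len(kept))  # 1 — the post-1931 reference is filtered out
```

The design point is that the filter's notion of "era-appropriate" comes from the judge alone, so swapping the toy keyword check for a vintage base model would make the pipeline fully bootstrapped, as the team aspires to.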
Classification
Affected Vendors
Original source: https://simonwillison.net/2026/Apr/28/talkie/#atom-everything
First tracked: April 28, 2026 at 02:00 AM
Classified by LLM (prompt v3) · confidence: 75%