Everyday Thai

Vocabulary Test

Thai vocabulary size

Estimate your Thai vocabulary size

Sample words from seven frequency bands and turn your responses into a rough vocabulary estimate.

You will see one Thai word at a time. Choose whether you know it well enough to understand it without a dictionary.

The public version currently uses the cleaned OpenSubtitles Thai bank. The Thai Web bank is under review because some items still need band cleanup and typo filtering.

FAQ

How is the estimate calculated?

We sample words from seven frequency bands, count how many you say you know in each band, and then scale that hit rate to the full size of the band.

The current scoring model applies this scaling to each band separately and attaches a simple 12% confidence margin to the result.
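The scaling described above can be sketched in a few lines of Python. This is a minimal illustration, not the site's actual code: the function name, the band sizes of 1,000 words each, and the reading of the 12% margin as a symmetric range around the total are all assumptions made for the example.

```python
from typing import Sequence, Tuple


def estimate_vocabulary(
    known_per_band: Sequence[int],
    band_sizes: Sequence[int],
    sampled_per_band: int = 5,
    margin: float = 0.12,
) -> Tuple[int, int, int]:
    """Return (estimate, low, high) from self-reported hits per band."""
    if len(known_per_band) != len(band_sizes):
        raise ValueError("need exactly one hit count per band")
    estimate = 0.0
    for known, size in zip(known_per_band, band_sizes):
        if not 0 <= known <= sampled_per_band:
            raise ValueError("hit count must be between 0 and sampled_per_band")
        # Scale the band's hit rate to the full size of the band.
        estimate += known * size / sampled_per_band
    # Assumption: the 12% margin is a symmetric range around the total.
    low = estimate * (1 - margin)
    high = estimate * (1 + margin)
    return round(estimate), round(low), round(high)


# Hypothetical band sizes: seven bands of 1,000 words each.
band_sizes = [1000] * 7
# Words marked as known out of five shown in each band.
known = [5, 5, 4, 3, 2, 1, 0]
print(estimate_vocabulary(known, band_sizes))
```

With only five words per band, a single changed answer moves that band's contribution by a fifth of the band size, which is one reason the result is best read as a rough estimate.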

What is the methodology?

This MVP uses a self-report recognition format. Each run pulls five random words from each frequency band, for 35 words total.
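One way to picture a run is the sketch below, which draws five words at random from each of seven bands. The function name, the dictionary layout, and the placeholder word labels are hypothetical and stand in for the real word bank.

```python
import random
from typing import Dict, List, Optional, Tuple


def draw_test_words(
    bands: Dict[int, List[str]],
    per_band: int = 5,
    seed: Optional[int] = None,
) -> List[Tuple[int, str]]:
    """Draw per_band distinct words from every band as (band_id, word) pairs."""
    rng = random.Random(seed)
    drawn: List[Tuple[int, str]] = []
    for band_id in sorted(bands):
        words = bands[band_id]
        if len(words) < per_band:
            raise ValueError(f"band {band_id} has fewer than {per_band} words")
        # random.sample picks without replacement, so no word repeats in a run.
        drawn.extend((band_id, word) for word in rng.sample(words, per_band))
    return drawn


# Placeholder bank: seven bands with ten stand-in labels each.
bank = {b: [f"band{b}-word{i}" for i in range(10)] for b in range(1, 8)}
run = draw_test_words(bank, seed=1)
print(len(run))  # 35 items: five from each of seven bands
```

Keeping the band id next to each word makes it straightforward to count known words per band when the run is scored.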

The current public version uses the filtered OpenSubtitles Thai source. It started as a raw ranked list, which we auto-filtered to remove entries containing spaces, Latin script, digits, or punctuation, along with obvious malformed fragments, before sampling evenly within each rank band.
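A filter of this kind might look like the sketch below. It is an illustration under stated assumptions, not the actual cleanup script: the function names are hypothetical, the allowed characters are taken from the Unicode Thai block (U+0E00 to U+0E7F), and "malformed fragment" is approximated as an entry that begins with a combining vowel or tone mark.

```python
import re
from typing import Iterable, List

# Thai consonants, vowels, and tone marks only. Thai digits, the baht sign,
# and the signs U+0E2F, U+0E46, U+0E4F, U+0E5A, U+0E5B are left out, so
# spaces, Latin letters, digits, and punctuation all fail the match.
_THAI_LETTERS = re.compile(r"[\u0E01-\u0E2E\u0E30-\u0E3A\u0E40-\u0E45\u0E47-\u0E4E]+")

# Combining marks that cannot start a well-formed Thai word.
_COMBINING_MARK = re.compile(r"[\u0E31\u0E34-\u0E3A\u0E47-\u0E4E]")


def is_clean_entry(entry: str) -> bool:
    """Return True if the entry looks like a plain Thai word."""
    if not _THAI_LETTERS.fullmatch(entry):
        return False
    # Assumed heuristic: a leading combining mark signals a broken fragment.
    return _COMBINING_MARK.match(entry[0]) is None


def filter_ranked_list(entries: Iterable[str]) -> List[str]:
    """Keep clean entries in their original rank order."""
    return [entry for entry in entries if is_clean_entry(entry)]
```

Because the filter preserves order, the surviving entries keep their relative frequency ranks and can still be split into rank bands afterwards.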

The result is best treated as a rough size estimate, not a certified placement result. The Thai Web bank is currently being cleaned up separately before it returns as a public option.

What is the source corpus?

The public test currently uses a filtered Thai OpenSubtitles frequency list derived from subtitle dialogue. Thai Web 2018 (thTenTen18) is still referenced in our workflow, but it is temporarily removed from the public test while we clean invalid forms and re-band questionable entries.

Thai Web 2018 (thTenTen18) screenshot showing corpus counts and source summary. The public test is currently driven by the cleaned OpenSubtitles Thai sample while the Thai Web bank is under review.