How to Train Your Agent on 100+ Pages in 2 Minutes
You want your agent to know your entire website. But adding pages one by one? Nobody has time for that.
Maybe you've got a docs site with 200 pages. Maybe your help centre keeps growing. Maybe you've been putting this off because it sounds like a nightmare.
Good news: Chat Thing can crawl your entire site and add it to your agent automatically.
The old way vs the new way
The old way:
Copy URL. Paste. Click add. Repeat. Repeat. Repeat. Give up after 15 pages and hope nobody asks about the other 185.
The new way:
Give Chat Thing one URL. Click "Crawl". Go make a coffee. Come back to an agent that knows your entire website.
We'll discover up to 600 pages automatically, let you review them, and sync the lot in one go.
How to crawl your website
- Go to your agent's tab
- Click New data source and select Website
- Click the Crawl button
- Enter your website's URL and hit Crawl
- Wait while we discover your pages (larger sites take a few minutes)
- Review the discovered URLs and toggle off any you don't want
- Click Add URLs and then Synchronise
That's it. Your agent now knows your entire site.
Pro tip: Use content selectors
By default, we grab everything in the page body (minus headers and footers, which would just waste tokens by repeating on every page).
But you can get smarter. If your content lives in a specific container, like main or .article-content, set that as your CSS selector. Your agent gets cleaner data, and you use fewer storage tokens.
Worth spending 5 minutes on before you sync 200 pages.
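Chat Thing applies the selector for you, but if you want to preview what a selector would capture before syncing, here's a rough sketch in Python. It uses the standard library's ElementTree (with its limited XPath support standing in for a CSS selector), and the page markup and class name are made up for illustration:

```python
import xml.etree.ElementTree as ET

# A toy page: the real content lives inside <main class="article-content">,
# surrounded by nav/footer noise that would waste storage tokens.
page = """
<html>
  <body>
    <nav>Home | Docs | Pricing</nav>
    <main class="article-content">
      <h1>Getting started</h1>
      <p>The part your agent should actually learn from.</p>
    </main>
    <footer>Copyright 2024</footer>
  </body>
</html>
"""

root = ET.fromstring(page)

# Grab only the content container -- the equivalent of setting a
# selector like "main" or ".article-content" in Chat Thing.
main = root.find(".//*[@class='article-content']")
content = " ".join(" ".join(main.itertext()).split())
print(content)
```

Running this prints only the heading and paragraph text; the nav links and copyright line never reach your agent.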
What about sites behind a login?
If your site uses Basic Auth (the browser popup asking for a username and password), we've got you covered. Just add your credentials in the scraping settings and we'll access the protected content. Other login types, like form-based logins or OAuth, aren't supported: we can only crawl public pages or ones behind Basic Auth.
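Not sure whether your site uses Basic Auth? It's the scheme where the username and password travel in an Authorization header with every request. A minimal sketch of how that header is built (the credentials here are hypothetical, just to show the shape):

```python
import base64

# Hypothetical credentials -- in practice, these are whatever you
# enter in the scraping settings.
username = "docs-reader"
password = "s3cret"

# Basic Auth is just "username:password", base64-encoded,
# sent as an Authorization header on every request.
token = base64.b64encode(f"{username}:{password}".encode()).decode()
auth_header = f"Authorization: Basic {token}"
print(auth_header)
```

If your login page is a normal HTML form instead of that browser popup, it isn't Basic Auth, and the crawler won't be able to get past it.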
When crawling makes sense
Great for:
- Documentation sites
- Help centres and FAQs
- Marketing sites with lots of pages
- Blogs and content libraries
- Any site where pages link to each other
Maybe not ideal for:
- Single-page apps (crawling won't find much)
- Sites with lots of duplicate content
- Pages behind form logins or OAuth (Basic Auth is fine)
Keep it fresh with auto-sync
Once you've crawled your site, turn on auto-sync to keep it up to date. We'll check for new and changed pages on whatever schedule you set.
Your agent stays current without you lifting a finger.
Go try it
Got a website with more than a handful of pages? Give crawling a try. It takes about 2 minutes to set up, and your agent gets instant access to everything.
Create a free account and add your first website data source. Your future self will thank you.



