{"id":681,"date":"2024-02-23T15:00:00","date_gmt":"2024-02-23T15:00:00","guid":{"rendered":"https:\/\/cherylroll.com\/site-migrations-ai-powered-redirect-mapping-437793\/"},"modified":"2024-02-23T15:00:00","modified_gmt":"2024-02-23T15:00:00","slug":"site-migrations-ai-powered-redirect-mapping-437793","status":"publish","type":"post","link":"https:\/\/cherylroll.com\/site-migrations-ai-powered-redirect-mapping-437793\/","title":{"rendered":"How to speed up site migrations with AI-powered redirect mapping"},"content":{"rendered":"
Migrating a large website is always daunting. A lot of traffic is at stake amid many moving parts, technical challenges and stakeholder management.<\/p>\n
Historically, one of the most onerous tasks in a migration plan has been redirect mapping: the painstaking process of matching URLs on your current site to the equivalent version on the new website.<\/p>\n
Fortunately, this task, which previously could involve teams of people combing through thousands of URLs, can be drastically sped up with modern AI models.<\/p>\n
The term “AI” has become somewhat conflated with “ChatGPT” over the last year, so to be very clear from the outset, we are not talking about using generative AI\/LLM-based systems to do your redirect mapping.<\/p>\n
While there are some tasks that tools like ChatGPT can assist you with, such as writing that tricky regex for the redirect logic, the generative element that can cause hallucinations could potentially create accuracy issues for us.<\/p>\n
The primary advantage of using AI for redirect mapping is the sheer speed at which it can be done. An initial map of 10,000 URLs could be produced within a few minutes and human-reviewed within a few hours. Doing this manually would usually take a single person days of work.<\/p>\n
Using AI to help map redirects is a method you can use on a site with 100 URLs or over 1,000,000. Large sites also tend to be more programmatic or templated, making similarity matching more accurate with these tools.<\/p>\n
For larger sites, a multi-person job can easily be handled by a single person with the correct knowledge, freeing up colleagues to assist with other parts of the migration.<\/p>\n
While the automated method will get some redirects “wrong,” in my experience, the overall accuracy of redirects has been higher, as the output can specify the similarity of the match, giving manual reviewers a guide on where their attention is most needed.<\/p>\n
Using automation tools can make people complacent and over-reliant on the output. With such an important task, a human review is always required.<\/p>\n
The script is pre-written and the process is straightforward. However, it will be new to many people, and environments such as Google Colab can be intimidating.<\/p>\n
While the output is deterministic, the models will perform better on certain sites than others. Sometimes, the output can contain “silly” errors, which are obvious for a human to spot but harder for a machine.<\/p>\n
By the end of this process, we are aiming to produce a spreadsheet that lists “from” and “to” URLs by mapping the origin URLs on our live website to the destination URLs on our staging (new) website.<\/p>\n
For this example, to keep things simple, we will just be mapping our HTML pages, not additional assets such as CSS or images, although this is also possible.<\/p>\n
You’ll need to perform a standard crawl on your website. Depending on how your website is built, this may or may not require a JavaScript crawl. The goal is to produce a list of as many accessible pages on your site as possible.<\/p>\n