Finding broken internal links is fairly straightforward because you created them, so whenever you change your site you know to double-check that they still work.
External links, meaning links to URLs on other websites, are different: you cannot tell when they will break. Your articles may reference these pages, and readers will land on a broken page if the target has been removed or moved.
This leads to a bad reader experience, and to disjointed content if your article relies heavily on the link.
Scanning your content for broken links
If you have existing content on your site, you can scan it for broken links using the Screaming Frog SEO Spider.
- Head over to their website and download a version suitable for your operating system.
- Go ahead and open it up once you are all done with the setup.
- Enter your website URL in the text bar right next to the Screaming Frog logo.
- Click Start next to the text bar to begin crawling your site.
- The tabbed section right underneath the text bar segments the results into different data points. We are interested in the Internal and External tabs, which show all the internal and external links found on your website.
- The Status Code column tells you whether a link is broken. Generally, a 200 (OK) or a 301 (permanent redirect) is fine, while 4xx and 5xx codes, especially 404 (not found), mean the page is broken or unavailable. You can look up any status code you do not recognize, or copy the URL from the Address column and open it in your browser to see whether it loads.
NB: The free version of Screaming Frog crawls only 500 URLs. You will have to upgrade to the paid version to scan your entire website.
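For a handful of links, the status-code check described above can also be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not a replacement for a crawler. The function names (`classify_status`, `fetch_status`, `check_links`) and the `link-check-sketch` user agent are made up for this example, and the verdict labels are one possible reading of the codes.

```python
import http.client
import urllib.error
import urllib.request


def classify_status(code):
    """Map an HTTP status code to a rough verdict for link checking."""
    if 200 <= code < 300:
        return "ok"
    if 300 <= code < 400:
        return "redirect"
    if code in (404, 410):
        return "broken"
    if 400 <= code < 500:
        return "client error"
    if 500 <= code < 600:
        return "server error"
    return "unknown"


def fetch_status(url, timeout=10):
    """Return the HTTP status code for url, or None if it cannot be reached.

    urllib follows redirects on its own, so a 301 that leads to a working
    page is reported here as the final 200.
    """
    request = urllib.request.Request(
        url, method="HEAD", headers={"User-Agent": "link-check-sketch"}
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # The server answered, but with an error status such as 404.
        return error.code
    except (OSError, ValueError, http.client.HTTPException):
        # DNS failure, refused connection, timeout, or malformed URL.
        return None


def check_links(urls, fetch=fetch_status):
    """Return {url: (status_code, verdict)} for every URL given."""
    report = {}
    for url in urls:
        code = fetch(url)
        verdict = "unreachable" if code is None else classify_status(code)
        report[url] = (code, verdict)
    return report
```

Calling `check_links(["https://example.com/"])` returns a dictionary you can print or filter for anything that is not "ok". Some servers reject HEAD requests, so a "client error" verdict is worth confirming in the browser before you remove a link.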
Catching links right when they break
When a link breaks, you will want to know as soon as possible, which means monitoring it constantly. Luckily, there are many tools for this.
We will look at a free one, Cron-job.org. This service visits the URL you provide at a scheduled interval and notifies you by email if it cannot access the URL.
- Go ahead and create an account
- Click on the create cronjob button
- On the Create cronjob page, enter the external link you would like to monitor in the URL box.
- Set how often you would like the service to check the link. A daily interval is a reasonable choice.
- The most important part is the notification. Ideally, you want to be notified when a visit to the link fails and again when the link is back up. See the screenshot above to see which options to toggle.
- Once you are done, click TEST RUN to verify the setup or CREATE to create your cron job.
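The notification behaviour set up above (one alert when the link fails, one when it comes back) can be sketched as a small state machine. This is only an illustration of the idea and does not use or reproduce Cron-job.org's own implementation or API. `probe` and `notify` are hypothetical callables you would supply yourself, for example a function that requests the URL and one that sends an email.

```python
import time


def transition(was_up, is_up):
    """Return the event worth reporting, or None if nothing changed."""
    if was_up and not is_up:
        return "failed"
    if not was_up and is_up:
        return "recovered"
    return None


def monitor(url, probe, notify, interval_seconds=86400, max_checks=None,
            sleep=time.sleep):
    """Check url repeatedly and call notify(url, event) on state changes.

    probe(url) must return True when the link works and False otherwise.
    max_checks=None keeps checking forever; the default interval is one day.
    """
    was_up = True  # Assumption: the link works when monitoring starts.
    checks = 0
    while max_checks is None or checks < max_checks:
        is_up = probe(url)
        event = transition(was_up, is_up)
        if event is not None:
            notify(url, event)
        was_up = is_up
        checks += 1
        # Wait before the next check, but not after the final one.
        if max_checks is None or checks < max_checks:
            sleep(interval_seconds)
```

Reporting only on transitions means a link that stays down for a week produces one alert rather than seven, which mirrors the "notify on failure" and "notify when back up" toggles.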
Provide customers with an archived version
The Wayback Machine is a service that periodically crawls the web and stores copies of websites. You can have an external page archived and link to the archived copy instead, if that works for your case.
Archived pages are more likely to stay up than the live versions. Here is an example of Facebook from 2009.
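If you have many external links, looking up archived copies by hand gets tedious. The sketch below asks the Wayback Machine's availability endpoint for the snapshot closest to a given date. It assumes the endpoint keeps returning JSON with an `archived_snapshots.closest` entry, and the helper names are made up for this example.

```python
import json
import urllib.parse
import urllib.request

AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def availability_query(url, timestamp=None):
    """Build the lookup URL; timestamp is optional, e.g. '20090101'."""
    params = {"url": url}
    if timestamp:
        params["timestamp"] = timestamp
    return AVAILABILITY_ENDPOINT + "?" + urllib.parse.urlencode(params)


def closest_snapshot(payload):
    """Return the archived URL from a decoded response, or None."""
    closest = payload.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


def find_archived_copy(url, timestamp=None, timeout=10):
    """Query the endpoint over the network and return the snapshot URL."""
    query = availability_query(url, timestamp)
    with urllib.request.urlopen(query, timeout=timeout) as response:
        return closest_snapshot(json.load(response))
```

`find_archived_copy("facebook.com", "20090101")` would return the archived address nearest to January 2009 if one exists, or `None` when the page has never been archived.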