Building a Web Crawler? Here Are All the Places Where It Will Probably Fail

  • If your web crawler is stuck, you need to know.
  • If your web crawler is slowing down, you need to know.
  • If you are having internet issues, you need to know.
  • If the data you are getting looks wrong, you need to know.
  • The web pages don't load.
  • The internet connection is down.
  • The content at the URL has moved.
  • You are shown a CAPTCHA challenge.
  • The web page changes its HTML, so your scraping no longer works.
  • Some fields that you scrape are empty some of the time, and there is no handler for that.
  • The web pages take a long time to load.
  • The website has blocked you completely.
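
As a rough illustration of how these failure points might be detected in one place, here is a minimal Python sketch. Everything in it is an assumption made for the example and not taken from any particular crawler: the `FetchResult` record, the `classify` and `missing_fields` helpers, the 5-second slowness threshold, the CAPTCHA marker strings, and the choice to treat HTTP 403 and 429 as "blocked". It works on an already-completed fetch, so it does no network I/O itself.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumed values for illustration; tune them for the sites you crawl.
SLOW_SECONDS = 5.0
CAPTCHA_MARKERS = ("captcha", "are you a robot")


@dataclass
class FetchResult:
    """Hypothetical record of one fetch attempt, filled in by your HTTP client."""
    requested_url: str
    final_url: Optional[str] = None   # URL after any redirects
    status: Optional[int] = None      # None means no HTTP response at all
    body: str = ""
    elapsed: float = 0.0              # seconds taken by the request
    error: Optional[str] = None       # e.g. "timeout" or "connection"


def classify(result: FetchResult) -> List[str]:
    """Return labels for every problem spotted in a single fetch."""
    if result.status is None:
        # No response: either the request timed out or the network is down.
        return ["timeout" if result.error == "timeout" else "network_down"]

    problems = []
    redirected = bool(result.final_url) and result.final_url != result.requested_url
    if result.status in (403, 429):
        problems.append("blocked")
    elif result.status in (301, 302, 307, 308) or redirected:
        problems.append("moved")
    elif result.status >= 400:
        problems.append("load_failed")

    if any(marker in result.body.lower() for marker in CAPTCHA_MARKERS):
        problems.append("captcha")
    if result.elapsed > SLOW_SECONDS:
        problems.append("slow")
    return problems


def missing_fields(record: dict, required: tuple) -> List[str]:
    """List required fields that are absent or empty in a scraped record."""
    return [name for name in required if not record.get(name)]


if __name__ == "__main__":
    page = FetchResult("http://example.com/a", "http://example.com/a", 200,
                       "Please solve the CAPTCHA", 6.0)
    print(classify(page))
    print(missing_fields({"title": "Widget", "price": ""}, ("title", "price")))
```

Logging these labels for every request is one way to "know" when the crawler is stuck, slowing down, or returning odd data. If `missing_fields` starts reporting the same field on every page, that is a reasonable hint that the site has changed its HTML and not that a single page is missing a value.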

Founder @ ProxiesAPI.com

Mohan Ganesan