The author, in my opinion, misrepresents the stance of the NY Times here.
It’s a false belief that reading something (whether by human or machine) somehow implicates copyright.
The Times' issue isn't just that someone or something is reading its materials. The Times takes issue with a group intentionally collecting large amounts of its data en masse (in its case, articles) with the intention of distributing them, packaged into a product, to third parties engaged in commercial activities, without paying a licensing fee. The Times fears that this damages the potential market for its past and future articles.
In essence, the Times fears that Common Crawl is acting as a fence for other groups to infringe on its copyrighted works.
Factors of Fair Use:
- The purpose and character of the use, including whether such use is of a commercial nature or is for nonprofit educational purposes.
- The nature of the copyrighted work.
- The amount and substantiality of the portion used in relation to the copyrighted work as a whole.
- The effect of the use upon the potential market for or value of the copyrighted work.