Nope, to do that we would need to crawl the whole internet to see which files are publicly linked. And we would need a time machine to do that for every day since they started hosting files. And also BBSs, mailing lists, IRC, etc. It is an approach from the wrong side.
We have access to the raw file tree via FTP.
For FilePlanet it is different: we crawled that from the website end, so we know what was public and what was not.
Actually, the FP staff used a metric like that to decide which files to keep (if I understood it correctly): files that did not get any hits in the timespan they checked will be, or may already have been, deleted.
Well, archive first, think later. Maybe we can find a way to make it public in a good way.
I will personally grab a copy of the PlanetQuake stuff and publish a file list. Given a proper request with reasoning (proof), I could then send out files from that.