How can I get all image, PDF, and other file links from a website?

I am working on an application where I have to

1) get all the links on a website, and
2) get the list of all the files and file extensions on each
web page/link.

I am done with the first part of it :slight_smile:
Now I have to get all the files/file extensions on each
page.

Can anybody guide me on how to parse a link/web page and extract the
file extensions on it?
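For what it's worth, a minimal sketch of step 2 using only the Python standard library (the function names here, `LinkExtractor` and `file_extensions`, are just illustrative). It collects `href`/`src` URLs from a page's HTML and groups them by file extension; real-world HTML may be messy enough to warrant a proper parsing library instead:

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
import os

class LinkExtractor(HTMLParser):
    """Collect href/src URLs from tags like <a>, <img>, <link>, <script>."""
    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                # Resolve relative links against the page's base URL.
                self.links.append(urljoin(self.base_url, value))

def file_extensions(html, base_url):
    """Return a dict mapping file extension -> list of absolute URLs."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    result = {}
    for link in parser.links:
        # Take the extension from the URL path, ignoring query strings.
        path = urlparse(link).path
        ext = os.path.splitext(path)[1].lower()
        if ext:
            result.setdefault(ext, []).append(link)
    return result

page = '<a href="/docs/manual.pdf">PDF</a><img src="logo.png">'
print(file_extensions(page, "http://example.com/"))
```

Fetch each page you found in step 1 (e.g. with `urllib.request`), feed its HTML through this, and merge the resulting dicts.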

Is it me, or has this particular homework question turned up a few times already?

Hint: This has been asked and answered quite recently
(yesterday, even), so try reading the mailing list.